Data Colada

Author: Uri Simonsohn

[51] Greg vs. Jamal: Why Didn’t Bertrand and Mullainathan (2004) Replicate?

Posted on September 6, 2016 (updated February 15, 2020) by Uri Simonsohn

Bertrand & Mullainathan (2004, .htm) is one of the best known and most cited American Economic Review (AER) papers [1]. It reports a field experiment in which resumes given typically Black names (e.g., Jamal and Lakisha) received fewer callbacks than those given typically White names (e.g., Greg and Emily). This finding is interpreted as evidence of racial discrimination…

[50] Teenagers in Bikinis: Interpreting Police-Shooting Data

Posted on July 14, 2016 (updated February 15, 2020) by Uri Simonsohn

The New York Times, on Monday, showcased (.htm) an NBER working paper (.pdf) that proposed that “blacks are 23.8 percent less likely to be shot at by police relative to whites.” (p.22) The paper involved a monumental data collection effort to address an important societal question. The analyses are rigorous, clever and transparently reported. Nevertheless, I do…

[48] P-hacked Hypotheses Are Deceivingly Robust

Posted on April 28, 2016 (updated January 30, 2020) by Uri Simonsohn

Sometimes we selectively report the analyses we run to test a hypothesis. Other times we selectively report which hypotheses we tested. One popular way to p-hack hypotheses involves subgroups. Upon realizing analyses of the entire sample do not produce a significant effect, we check whether analyses of various subsamples — women, or the young, or republicans, or…

[47] Evaluating Replications: 40% Full ≠ 60% Empty

Posted on March 3, 2016 (updated February 12, 2020) by Uri Simonsohn

Last October, Science published the paper “Estimating the Reproducibility of Psychological Science” (htm), which reported the results of 100 replication attempts. Today it published a commentary by Gilbert et al. (.htm) as well as a response by the replicators (.htm). The commentary makes two main points. First, because of sampling error, we should not expect all of…

[43] Rain & Happiness: Why Didn’t Schwarz & Clore (1983) ‘Replicate’ ?

Posted on November 16, 2015 (updated February 11, 2020) by Uri Simonsohn

In my “Small Telescopes” paper, I introduced a new approach to evaluate replication results (SSRN). Among other examples, I described two studies as having failed to replicate the famous Schwarz and Clore (1983) finding that people report being happier with their lives when asked on sunny days. Figure and text from Small Telescopes paper (SSRN) I…

[42] Accepting the Null: Where to Draw the Line?

Posted on October 28, 2015 (updated February 11, 2020) by Uri Simonsohn

We typically ask if an effect exists. But sometimes we want to ask if it does not. For example, how many of the “failed” replications in the recent reproducibility project published in Science (.pdf) suggest the absence of an effect? Data have noise, so we can never say ‘the effect is exactly zero.’ We can…

[41] Falsely Reassuring: Analyses of ALL p-values

Posted on August 24, 2015 (updated October 18, 2023) by Uri Simonsohn

It is a neat idea. Get a ton of papers. Extract all p-values. Examine the prevalence of p-hacking by assessing if there are too many p-values near p=.05. Economists have done it [SSRN], as have psychologists [.html], and biologists [.html]. These charts with distributions of p-values come from those papers: The dotted circles highlight the excess of…

[40] Reducing Fraud in Science

Posted on June 29, 2015 (updated February 11, 2020) by Uri Simonsohn

Fraud in science is often attributed to incentives: we reward sexy results, so fraud happens. The solution, the argument goes, is to reward other things. In this post I counter-argue, proposing three alternative solutions. Problems with the Change the Incentives solution. First, even if rewarding sexy results caused fraud, it does not follow that we should stop rewarding sexy results. We…

[39] Power Naps: When do Within-Subject Comparisons Help vs Hurt (yes, hurt) Power?

Posted on June 22, 2015 (updated February 11, 2020) by Uri Simonsohn

A recent Science paper (.html) used a total sample size of N=40 to arrive at the conclusion that implicit racial and gender stereotypes can be reduced while napping. N=40 is a small sample for a between-subject experiment. One needs N=92 to reliably detect that men are heavier than women (SSRN). The study, however, was within-subject; for instance, its dependent…

[36] How to Study Discrimination (or Anything) With Names; If You Must

Posted on April 23, 2015 (updated May 23, 2020) by Uri Simonsohn

Consider these paraphrased famous findings: “Because his name resembles ‘dentist,’ Dennis became one” (JPSP, .pdf) “Because the applicant was black (named Jamal instead of Greg) he was not interviewed” (AER, .pdf) “Because the applicant was female (named Jennifer instead of John), she got a lower offer” (PNAS, .pdf) Everything that matters (income, age, location, religion) correlates with…

© 2021, Uri Simonsohn, Leif Nelson, and Joseph Simmons. For permission to reprint individual blog posts on DataColada please contact us via email.