The paper titled "Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results" (.htm) is currently the most cited 2019 article in the Quarterly Journal of Economics (372 Google cites). It delivers bad news to economists running experiments: their p-values are wrong. To get correct p-values, the article explains, they need to…
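Presumably the cut-off sentence points to the randomization tests named in the paper's title. As a quick, hedged illustration of what such a test involves, here is a minimal permutation-based p-value sketch with made-up data and entirely hypothetical numbers (not the paper's procedure or data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical experimental data: outcomes for treatment and control units
treat = rng.normal(0.5, 1.0, size=50)
control = rng.normal(0.0, 1.0, size=50)

observed_diff = treat.mean() - control.mean()

# Randomization (permutation) test: reassign treatment labels at random and
# recompute the difference in means under the sharp null of no effect.
pooled = np.concatenate([treat, control])
n_treat = len(treat)
n_perm = 10_000
perm_diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[:n_treat].mean() - shuffled[n_treat:].mean()

# Two-sided permutation p-value
p_value = np.mean(np.abs(perm_diffs) >= abs(observed_diff))
print(f"observed diff = {observed_diff:.3f}, permutation p = {p_value:.4f}")
```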
[96] Madam Speaker: Are Female Presenters Treated Worse in Econ Seminars?
A recent NBER paper titled "Gender and the Dynamics of Economics Seminars" (.htm) reports analyses of audience questions asked during 462 economics seminars, concluding that “women are asked more questions . . . and the questions asked of women are more likely to be patronizing or hostile . . . suggest[ing] yet another potential explanation…
[95] Groundhog: Addressing The Threat That R Poses To Reproducible Research
R, the free and open source program for statistical computing, poses a substantial threat to the reproducibility of published research. This post explains the problem and introduces a solution. The problem: packages. R itself has some reproducibility problems (see an example in this footnote [1]), but the big problem is its packages: the add-on scripts that…
[91] p-hacking fast and slow: Evaluating a forthcoming AER paper deeming some econ literatures less trustworthy
The authors of a forthcoming AER article (.pdf), "Methods Matter: P-Hacking and Publication Bias in Causal Analysis in Economics", painstakingly harvested thousands of test results from 25 economics journals to answer an interesting question: Are studies that use some research designs more trustworthy than others? In this post I will explain why I think their…
[88] The Hot-Hand Artifact for Dummies & Behavioral Scientists
A friend recently asked for my take on Miller and Sanjurjo's (2018; .pdf) debunking of the hot hand fallacy. In that paper, the authors provide a brilliant and surprising observation missed by hundreds of people who had thought about the issue before, including the classic Gilovich, Vallone, & Tversky (1985 .htm). In this post:…
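For readers who want to see the artifact directly: the core of Miller and Sanjurjo's observation is that, in a finite sequence of independent 50/50 flips, the within-sequence proportion of hits that immediately follow a hit averages below 50%. A minimal simulation sketch (the sequence length and all numbers are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)

n_flips = 10       # short sequences make the bias easy to see
n_sims = 50_000    # number of simulated sequences

props = []
for _ in range(n_sims):
    flips = rng.integers(0, 2, size=n_flips)   # fair coin: 0 = miss, 1 = hit
    after_hit = flips[1:][flips[:-1] == 1]     # outcomes that follow a hit
    if after_hit.size > 0:                     # skip sequences with no usable hits
        props.append(after_hit.mean())

# Averaging the within-sequence proportions gives a number below 0.5,
# even though every flip is an independent fair coin.
print(f"mean P(hit | previous hit) across sequences: {np.mean(props):.3f}")
```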
[80] Interaction Effects Need Interaction Controls
In a recent referee report I argued something I have argued in several reports before: if the effect of interest in a regression is an interaction, the control variables addressing possible confounds should be interactions as well. In this post I explain that argument using as a working example a 2011 QJE paper (.htm) that…
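To make the referee-report argument concrete, here is a toy sketch (the variable names and data-generating process are hypothetical, not from the QJE paper): when the confound's effect also varies with the moderator, controlling for it only as a main effect can leave a spurious interaction, while interacting the control with the moderator removes it.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 1_000

# Hypothetical data: m is the moderator, x the predictor of interest,
# c a control variable correlated with x.
m = rng.integers(0, 2, size=n)
c = rng.normal(size=n)
x = 0.7 * c + rng.normal(size=n)
y = 0.5 * c * m + rng.normal(size=n)   # the confound's effect, not x's, varies with m

df = pd.DataFrame(dict(y=y, x=x, m=m, c=c))

# Controlling for c only as a main effect leaves a spurious x:m interaction
naive = smf.ols("y ~ x * m + c", data=df).fit()

# Interacting the control with the moderator addresses the confound
interacted = smf.ols("y ~ x * m + c * m", data=df).fit()

print(naive.params[["x:m"]])        # biased away from zero
print(interacted.params[["x:m"]])   # close to zero
```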
[78c] Bayes Factors in Ten Recent Psych Science Papers
For this post, the third in a series on Bayes factors (.htm), I wanted to get a sense for how Bayes factors were being used with real data from real papers, so I looked at the 10 most recent empirical papers in Psychological Science containing the phrase "Bayes factor" (.zip). After browsing them all, I…
[78b] Hyp-Chart, the Missing Link Between P-values and Bayes Factors
Just two steps are needed to go from computing p-values to computing Bayes factors. This post explains both steps and introduces Hyp-Chart, the missing link we arrive at if we take only the first step. Hyp-Chart is a graph that shows how well the data fit the null vs. every possible alternative hypothesis [1]. Hyp-Chart…
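As I read that description, the natural way to sketch such a graph is to plot how likely the observed result is under every candidate true effect, with the null marked. A minimal illustration with entirely hypothetical numbers, and not necessarily the post's exact construction:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical study result: observed effect estimate and its standard error
observed_effect = 0.30
se = 0.15

# How well do the data fit each candidate true effect (including the null, 0)?
candidate_effects = np.linspace(-0.5, 1.0, 301)
likelihood = stats.norm.pdf(observed_effect, loc=candidate_effects, scale=se)

plt.plot(candidate_effects, likelihood)
plt.axvline(0, linestyle="--", label="null hypothesis")
plt.axvline(observed_effect, linestyle=":", label="observed effect")
plt.xlabel("hypothesized true effect")
plt.ylabel("likelihood of the observed data")
plt.legend()
plt.show()
```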
[78a] If you think p-values are problematic, wait until you understand Bayes Factors
Would raising the minimum wage by $4 lead to greater unemployment? Milton, a Chicago economist, has a theory (supply and demand) that says so. Milton believes the causal effect is anywhere between 1% and 10%. After the minimum wage increase of $4, unemployment goes up 1%. Milton feels bad about the unemployed but good about…
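To preview where a Bayes-factor calculation would take Milton's example, one compares the average likelihood of the data over Milton's 1%–10% prior to its likelihood under the null of no effect. The standard error below is assumed purely for illustration and is not from the post:

```python
import numpy as np
from scipy import stats

observed = 1.0   # observed increase in unemployment, in percentage points
se = 1.5         # hypothetical standard error of that estimate (an assumption)

# Marginal likelihood under Milton's theory: effect uniform between 1 and 10
effects = np.linspace(1, 10, 1_000)
lik_theory = stats.norm.pdf(observed, loc=effects, scale=se).mean()

# Likelihood under the null of no effect
lik_null = stats.norm.pdf(observed, loc=0, scale=se)

bayes_factor = lik_theory / lik_null
print(f"Bayes factor (theory vs. null): {bayes_factor:.2f}")
```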
[77] Number-Bunching: A New Tool for Forensic Data Analysis
In this post I show how one can analyze the frequency with which values get repeated within a dataset – what I call “number-bunching” – to statistically identify whether the data were likely tampered with. Unlike Benford’s law (.htm), and its generalizations, this approach examines the entire number at once, not only the first or…
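A simplified sketch of the general idea, not the post's exact procedure (which derives its benchmark differently): measure how often exact values repeat in the observed data, then compare that to the repetition expected in data simulated under a benign benchmark. All data and distributional choices below are hypothetical.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(3)

def avg_frequency(values):
    """Average, across observations, of how many times each observation's
    exact value appears in the data (a bunching measure in the spirit of
    the post's description)."""
    counts = Counter(values)
    return np.mean([counts[v] for v in values])

# Hypothetical observed data (e.g., measurements rounded to one decimal)
observed = np.round(rng.normal(50, 10, size=500), 1)
obs_stat = avg_frequency(observed)

# Benchmark: simulate data with the same sample size, mean, sd, and rounding,
# and ask how much value repetition arises by chance alone.
n_sims = 2_000
sim_stats = np.empty(n_sims)
for i in range(n_sims):
    sim = np.round(rng.normal(observed.mean(), observed.std(), size=observed.size), 1)
    sim_stats[i] = avg_frequency(sim)

p_value = np.mean(sim_stats >= obs_stat)
print(f"observed bunching = {obs_stat:.2f}, simulation p = {p_value:.3f}")
```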