There is a classic statistical test known as the Kolmogorov-Smirnov (KS) test (Wikipedia). This post is about an off-label use of the KS test that I don’t think people know about (not even Kolmogorov or Smirnov), and which seems useful for experimentalists in behavioral science and beyond (most useful, I think, for clinical trials and field…
Author: Uri Simonsohn
[119] A Hidden Confound in a Psych Methods Pre‑registrations Critique
A forthcoming paper in Psych Methods (.pdf) had a set of coders evaluate 300 pre-registrations in terms of how informative they were about several study attributes (e.g., hypotheses, analysis, DVs). The authors analyzed the subjective codings and concluded that many pre-registrations in psychology, especially those relying on the AsPredicted template, provide insufficient information. Central to…
[117] The Impersonator: The Fake Data Were Coming From Inside the Lab
A previous version of this post was supposed to go live in January 2019. But the day before it was scheduled, the Data Colada team (Uri, Leif, and Joe) received an email that we took to be a potential death threat. After discussions with the local police, the FBI, and our families, we decided to…
[115] Preregistration Prevalence
Pre-registration is the best and possibly only solution to p-hacking. Ten years ago, pre-registrations were virtually unheard of in psychology, but they have become increasingly common since then. I was curious just how common they have become, and so I collected some data. This post shares the results. The data: From the Web of Science…
[108] MRAN is Dead, long live GRAN
Microsoft has been making daily copies of the entire CRAN website of R packages since 2014. This archive, named MRAN, allows installing older versions of packages, which is valuable for reproducibility purposes. The 15,000+ R packages on CRAN are incessantly updated. For example, the package tidyverse depends on 109 packages; these packages accumulate 63 updates, just…
[103] Mediation Analysis is Counterintuitively Invalid
Mediation analysis is very common in behavioral science despite suffering from many invalidating shortcomings. While most of the shortcomings are intuitive [1], this post focuses on a counterintuitive one. It is one of those quirky statistical things that can be fun to think about, so it would merit a blog post even if it were…
[102] R on Steroids: Running WAY faster simulations in R
This post shows how to run simulations (loops) in R that can go 50 times faster than the default approach of running code like: for (k in 1:100) on your laptop. Obviously, a bit of a niche post. There are two steps. Step 1 involves running parallel rather than sequential loops [1]. This step can…
[100] Groundhog 2.0: Further addressing the threat R poses to reproducible research
About a year ago I wrote Colada[95], a post on the threat R poses to reproducible research. The core issue is the 'packages'. When using R, you can run library(some_package) and R can all of a sudden scrape a website, cluster standard errors, maybe even help you levitate. The problem is that packages get updated…
[99] Hyping Fisher: The Most Cited 2019 QJE Paper Relied on an Outdated Stata Default to Conclude Regression p-values Are Inadequate
The paper titled "Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results" (.htm) is currently the most cited 2019 article in the Quarterly Journal of Economics (372 Google cites). It delivers bad news to economists running experiments: their p-values are wrong. To get correct p-values, the article explains, they need to…
[96] Madam Speaker: Are Female Presenters Treated Worse in Econ Seminars?
A recent NBER paper titled "Gender and the Dynamics of Economics Seminars" (.htm) reports analyses of audience questions asked during 462 economics seminars, concluding that “women are asked more questions . . . and the questions asked of women are more likely to be patronizing or hostile . . . suggest[ing] yet another potential explanation…