Data Colada

[108] MRAN is Dead, long live GRAN

Posted on April 28, 2023 by Uri Simonsohn

Microsoft has been making daily copies of the entire CRAN website of R packages since 2014. This archive, named MRAN, allows installing older versions of packages, which is valuable for reproducibility purposes. The 15,000+ R packages on CRAN are incessantly updated. For example, the package tidyverse depends on 109 packages; these packages accumulate 63 updates, just…

[107] Meaningless Means #3: The Truth About Lies

Posted on February 28, 2023 by Joe Leif Uri

This is the third post in a series (.htm) in which we argue/show that meta-analytic means are often meaningless, because they often (1) include invalid tests of the hypothesis of interest to the meta-analyst and (2) combine incommensurate results. The meta-analysis we discuss here explores how dishonesty differs across four different experimental paradigms (e.g., coin…

[106] Meaningless Means #2: The Average Effect of Nudging in Academic Publications is 8.7%

Posted on November 29, 2022 by Uri, Joe, & Leif

This post is the second in a series (.htm) in which we argue that meta-analytic means are often meaningless, because these averages (1) include invalid tests of the meta-analytic research question, and (2) aggregate incommensurable results. In each post we showcase examples of (1) and (2) in a different published meta-analysis. We seek out meta-analyses…

[105] Meaningless Means #1: The Average Effect of Nudging Is d = .43

Posted on November 3, 2022 (updated November 29, 2022) by Joe Leif Uri

This post is the second in a series (see its introduction: htm) arguing that meta-analytic means are often meaningless, because (1) they include results from invalid tests of the research question of interest to the meta-analyst, and (2) they average across fundamentally incommensurable results. In this post we focus primarily on problem (2), though problem…

[104] Meaningless Means: Some Fundamental Problems With Meta-Analytic Averages

Posted on November 1, 2022 (updated November 2, 2022) by Uri, Joe, & Leif

This post is an introduction to a series of posts about meta-analysis [1]. We think that many, perhaps most, meta-analyses in the behavioral sciences are invalid. In this introductory post, we make that case with arguments. In subsequent posts, we will make that case by presenting examples taken from published meta-analyses. We have recently written…

[103] Mediation Analysis is Counterintuitively Invalid

Posted on September 26, 2022 (updated September 6, 2023) by Uri Simonsohn

Mediation analysis is very common in behavioral science despite suffering from many invalidating shortcomings. While most of the shortcomings are intuitive [1], this post focuses on a counterintuitive one. It is one of those quirky statistical things that can be fun to think about, so it would merit a blog post even if it were…

[102] R on Steroids: Running WAY faster simulations in R

Posted on September 6, 2022 by Uri Simonsohn

This post shows how to run simulations (loops) in R that can go 50 times faster than the default approach of running code like: for (k in 1:100) on your laptop. Obviously, a bit of a niche post. There are two steps. Step 1 involves running parallel rather than sequential loops [1]. This step can…

[101] Transparency Makes Research Evaluable: Evaluating a Field Experiment on Crime Published in Nature

Posted on April 28, 2022 by Joe & Uri

A recently published Nature paper (.htm) examined an interesting psychological hypothesis and applied it to a policy relevant question. The authors ran an ambitious field experiment and posted all their data, code, and materials. They also were transparent in showing the results of many different analyses, including some that yielded non-significant results. This is in…

[100] Groundhog 2.0: Further addressing the threat R poses to reproducible research

Posted on April 8, 2022 (updated April 9, 2022) by Uri Simonsohn

About a year ago I wrote Colada[95], a post on the threat R poses to reproducible research. The core issue is the 'packages'. When using R, you can run library(some_package) and R can all of a sudden scrape a website, cluster standard errors, maybe even help you levitate. The problem is that packages get updated…

[99] Hyping Fisher: The Most Cited 2019 QJE Paper Relied on an Outdated Stata Default to Conclude Regression p-values Are Inadequate

Posted on October 13, 2021 (updated October 27, 2021) by Uri Simonsohn

The paper titled "Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results" (.htm) is currently the most cited 2019 article in the Quarterly Journal of Economics (372 Google cites). It delivers bad news to economists running experiments: their p-values are wrong. To get correct p-values, the article explains, they need to…