If you are like me, from time to time your papers include links to online references. Because the internet changes so often, by the time readers follow those links, who knows whether the cited content will still be there. This blog post shares a simple way to ensure your links live “forever.” I got the idea…
Author: Uri Simonsohn
[33] "The" Effect Size Does Not Exist
Consider the robust phenomenon of anchoring, where people’s numerical estimates are biased towards arbitrary starting points. What does it mean to say “the” effect size of anchoring? It surely depends on moderators like the domain of the estimate, expertise, and the perceived informativeness of the anchor. Alright, how about “the average” effect size of anchoring? That's simple enough…
[31] Women are taller than men: Misusing Occam’s Razor to lobotomize discussions of alternative explanations
Most scientific studies document a pattern for which the authors provide an explanation. The job of readers and reviewers is to examine whether that pattern is better explained by alternative explanations. When alternative explanations are offered, it is common for authors to acknowledge that although, yes, each study has potential confounds, no single alternative explanation…
[29] Help! Someone Thinks I p-hacked
It has become more common to publicly speculate, upon noticing a paper with unusual analyses, that a reported finding was obtained via p-hacking. This post discusses how authors can persuasively respond to such speculations. Examples of public speculation of p-hacking: Example 1. A Slate.com post by Andrew Gelman suspected p-hacking in a paper that collected…
[28] Confidence Intervals Don't Change How We Think about Data
Some journals are thinking of discouraging authors from reporting p-values and encouraging or even requiring them to report confidence intervals instead. Would our inferences be better, or even just different, if we reported confidence intervals instead of p-values? One possibility is that researchers become less obsessed with the arbitrary significant/not-significant dichotomy. We start paying more…
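To make the title's point concrete, here is a minimal sketch (simulated data, not from the post) showing that, for an ordinary two-sample t-test, the 95% confidence interval for the mean difference excludes zero exactly when the two-sided p-value falls below .05, so the two reports deliver the same yes/no verdict:

```python
# Minimal sketch (illustrative, simulated data): a 95% CI and a two-sided
# p-value from the same pooled-variance t-test always agree on "significant or not".
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 50)   # simulated control group
b = rng.normal(0.4, 1.0, 50)   # simulated treatment group

t, p = stats.ttest_ind(a, b)   # pooled-variance two-sample t-test

# 95% CI for the difference in means, pooled-variance version to match the test
diff = b.mean() - a.mean()
n1, n2 = len(a), len(b)
sp2 = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
tcrit = stats.t.ppf(0.975, df=n1 + n2 - 2)
ci = (diff - tcrit * se, diff + tcrit * se)

print(f"p = {p:.4f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
print("CI excludes 0:", not (ci[0] <= 0 <= ci[1]), "| p < .05:", p < 0.05)
```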
[24] P-curve vs. Excessive Significance Test
In this post I use data from the Many-Labs replication project to contrast the (pointless) inferences one arrives at using the Excessive Significance Test with the (critically important) inferences one arrives at with p-curve. The Many-Labs project is a collaboration of 36 labs around the world, each running a replication of 13 published effects in…
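For readers new to p-curve, here is a minimal sketch of the intuition behind it (simulated studies, not the Many-Labs data or the post's analysis): among results that reach significance, p-values are uniformly distributed when the effect is nonexistent and pile up near zero when it is real.

```python
# Illustrative simulation only (not the post's code): the shape of the
# distribution of significant p-values, i.e. the p-curve, differs sharply
# between a null effect (flat) and a true effect (right-skewed).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def sig_pvalues(true_d, n_per_cell=20, n_studies=20000):
    """Simulate two-cell studies and return the p-values that came out significant."""
    a = rng.normal(0, 1, (n_studies, n_per_cell))
    b = rng.normal(true_d, 1, (n_studies, n_per_cell))
    p = stats.ttest_ind(a, b, axis=1).pvalue
    return p[p < .05]

for d in (0.0, 0.5):
    p = sig_pvalues(d)
    print(f"true d = {d}: {np.mean(p < .025):.0%} of significant p-values are below .025")
# d = 0 gives roughly 50% (flat p-curve); d = 0.5 gives well above 50% (right skew).
```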
[23] Ceiling Effects and Replications
A recent failure to replicate led to an attention-grabbing debate in psychology. As you may expect from university professors, some of it involved data. As you may not expect from university professors, much of it involved saying mean things that would get a child sent to the principal's office (.pdf). The hostility in the debate has obscured an interesting…
[20] We cannot afford to study effect size in the lab
Methods people often say – in textbooks, task forces, papers, editorials, over coffee, in their sleep – that we should focus more on estimating effect sizes rather than testing for significance. I am kind of a methods person, and I am kind of going to say the opposite. Only kind of the opposite because it…
[19] Fake Data: Mendel vs. Stapel
Diederik Stapel, Dirk Smeesters, and Lawrence Sanna published psychology papers with fake data. They each faked in their own idiosyncratic way; nevertheless, their data do share something in common. Real data are noisy. Theirs aren't. Gregor Mendel's data also lack noise (yes, the famous pea-experimenter Mendel). Moreover, in a mathematical sense, his data are just as…
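Here is a minimal sketch of the "real data are noisy" idea (an illustration with made-up numbers, not the analysis from the post): even when the true effect is identical in every study, study-level means should bounce around by roughly one standard error, so summary statistics that agree with each other much more closely than that are a red flag.

```python
# Illustrative only: honest simulated study means vary by about sigma/sqrt(n);
# a hypothetical "too clean" set of means varies far less than chance allows.
import numpy as np

rng = np.random.default_rng(42)
n, sigma, n_studies = 30, 1.0, 10

# Honest simulated studies: each study mean varies by about the standard error.
honest_means = rng.normal(0.5, sigma, (n_studies, n)).mean(axis=1)

expected_se = sigma / np.sqrt(n)
print(f"expected SD of study means:   {expected_se:.3f}")
print(f"observed SD of honest means:  {honest_means.std(ddof=1):.3f}")

# Hypothetical "too clean" means, hand-picked to cluster implausibly tightly.
too_clean_means = np.array([0.50, 0.51, 0.49, 0.50, 0.50,
                            0.51, 0.49, 0.50, 0.51, 0.50])
print(f"observed SD of too-clean means: {too_clean_means.std(ddof=1):.3f}")
```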
[17] No-way Interactions
This post shares a shocking and counterintuitive fact about studies looking at interactions where effects are predicted to get smaller (attenuated interactions). I needed a working example and went with Fritz Strack et al.’s (1988, .html) famous paper [933 Google cites], in which participants rated cartoons as funnier if they saw them while holding a…
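As a back-of-the-envelope illustration of why attenuated interactions are so demanding (a normal-approximation sketch, not the post's calculations; the effect size d = 0.5 is an arbitrary choice), compare the sample needed to detect a simple effect with the sample needed to detect a 2x2 interaction in which that effect is cut in half:

```python
# Rough power sketch (normal approximation, illustrative numbers): detecting that
# an effect shrinks from d to d/2 requires a far larger study than detecting d itself,
# because the interaction contrast is half the size and is estimated from four cells.
import numpy as np
from scipy import stats

def n_per_cell_two_cells(d, alpha=0.05, power=0.80):
    """n per cell to detect a mean difference of d between two cells."""
    za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    return 2 * ((za + zb) / d) ** 2          # Var(diff) = 2*sigma^2 / n

def n_per_cell_interaction(d_full, d_attenuated, alpha=0.05, power=0.80):
    """n per cell to detect the 2x2 interaction (difference of differences)."""
    za, zb = stats.norm.ppf(1 - alpha / 2), stats.norm.ppf(power)
    contrast = d_full - d_attenuated
    return 4 * ((za + zb) / contrast) ** 2   # Var(diff of diffs) = 4*sigma^2 / n

d = 0.5
n_simple = n_per_cell_two_cells(d)
n_inter = n_per_cell_interaction(d, d / 2)
print(f"simple effect:       ~{n_simple:.0f} per cell, ~{2 * n_simple:.0f} total")
print(f"halved interaction:  ~{n_inter:.0f} per cell, ~{4 * n_inter:.0f} total")
# Total N for the interaction study works out to about 16x the simple-effect study.
```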