The smaller your sample, the less likely your evidence is to reveal the truth. You might already know this, but most people don’t (.html), or at least they don’t appropriately apply it (.html). (See, for example, nearly every inference ever made by anyone.) My experience trying to teach this concept suggests that it’s best understood…
[25] Maybe people actually enjoy being alone with their thoughts
Recently Science published a paper concluding that people do not like sitting quietly by themselves (.html). The article received press coverage, that press coverage received blog coverage, which received Twitter coverage, which received meaningful head-nodding coverage around my department. The bulk of that coverage (e.g., 1, 2, and 3) focused on the tenth study in…
[24] P-curve vs. Excessive Significance Test
In this post I use data from the Many-Labs replication project to contrast the (pointless) inferences one arrives at using the Excessive Significance Test with the (critically important) inferences one arrives at with p-curve. The Many-Labs project is a collaboration of 36 labs around the world, each running a replication of 13 published effects in…
[23] Ceiling Effects and Replications
A recent failure to replicate led to an attention-grabbing debate in psychology. As you may expect from university professors, some of it involved data. As you may not expect from university professors, much of it involved saying mean things that would get a child sent to the principal's office (.pdf). The hostility in the debate has obscured an interesting…
[22] You know what's on our shopping list
As part of an ongoing project with Minah Jung, a nearly perfect doctoral student, we asked people to estimate the percentage of people who bought some common items on their last trip to the supermarket. For each of 18 items, we simply asked people (N = 397) to report whether they had bought it on…
[21] Fake-Data Colada: Excessive Linearity
Recently, a psychology paper (.html) was flagged as possibly fraudulent based on statistical analyses (.pdf). The author defended his paper (.html), but the university committee investigating misconduct concluded it had occurred (.pdf). In this post we present new and more intuitive versions of the analyses that flagged the paper as possibly fraudulent. We then rule…
[20] We cannot afford to study effect size in the lab
Methods people often say – in textbooks, task forces, papers, editorials, over coffee, in their sleep – that we should focus more on estimating effect sizes rather than testing for significance. I am kind of a methods person, and I am kind of going to say the opposite. Only kind of the opposite because it…
[19] Fake Data: Mendel vs. Stapel
Diederik Stapel, Dirk Smeesters, and Lawrence Sanna published psychology papers with fake data. They each faked in their own idiosyncratic way; nevertheless, their data share something in common. Real data are noisy. Theirs aren't. Gregor Mendel's data also lack noise (yes, famous peas-experimenter Mendel). Moreover, in a mathematical sense, his data are just as…
[18] MTurk vs. The Lab: Either Way We Need Big Samples
Back in May 2012, we were interested in the question of how many participants a typical between-subjects psychology study needs to have an 80% chance to detect a true effect. To answer this, you need to know the effect size for a typical study, which you can’t know from examining the published literature because it…
[17] No-way Interactions
This post shares a shocking and counterintuitive fact about studies looking at interactions where effects are predicted to get smaller (attenuated interactions). I needed a working example and went with Fritz Strack et al.’s (1988, .html) famous paper [933 Google cites], in which participants rated cartoons as funnier if they saw them while holding a…