Data Colada

[6] Samples Can't Be Too Large


Posted on November 4, 2013 (updated January 23, 2019) by Joe Simmons

Reviewers, and even associate editors, sometimes criticize studies for being “overpowered” – that is, for having sample sizes that are too large. (Recently, the between-subjects sample sizes under attack were about 50-60 per cell, just a little larger than you need to have an 80% chance to detect that men weigh more than women.)
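For readers who want to verify that figure, here is a minimal power calculation in Python, assuming a two-sided independent-samples t-test at α = .05 and the gender-weight effect size of d ≈ .59 discussed later in this post:

```python
# Minimal sketch: per-cell n for 80% power in a two-sample t-test,
# assuming alpha = .05 (two-sided) and d = .59 (men vs. women's weight).
from statsmodels.stats.power import TTestIndPower

n_per_cell = TTestIndPower().solve_power(
    effect_size=0.59,        # Cohen's d for the gender-weight difference
    alpha=0.05,              # conventional significance level
    power=0.80,              # desired chance of detecting the effect
    alternative='two-sided',
)
print(round(n_per_cell))     # ~46 per cell, just under the criticized 50-60
```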

This criticism never makes sense.

The rationale for it is something like this: “With such large sample sizes, even trivial effect sizes will be significant. Thus, the effect must be trivial (and we don’t care about trivial effect sizes).”

But if this is the rationale, then the criticism is ultimately targeting the effect size rather than the sample size.  A person concerned that an effect “might” be trivial because it is significant with a large sample can simply compute the effect size, and then judge whether it is trivial.
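Computing the effect size from the raw data takes only a few lines. A minimal sketch in Python (the data here are made up for illustration):

```python
import numpy as np

def cohens_d(x, y):
    """Cohen's d for two independent cells, using the pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

# illustrative use with made-up data:
treatment = np.array([5.1, 4.8, 5.5, 5.0, 4.9])
control = np.array([4.2, 4.6, 4.4, 4.1, 4.5])
print(cohens_d(treatment, control))  # the standardized mean difference
```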

(As an aside: Assume you want an 80% chance to detect a between-subjects effect. You need about 6,000 per cell for a “trivial” effect, say d = .05, and still about 250 per cell for a meaningful “small” effect, say d = .25. We don’t need to worry that studies with 60 per cell will make trivial effects significant.)
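Those per-cell numbers are easy to reproduce with the same kind of power calculation as above (again assuming a two-sided t-test at α = .05):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.05, 0.25):
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.80)
    print(f"d = {d}: about {round(n):,} per cell")
# d = 0.05: ~6,280 per cell  ("about 6,000")
# d = 0.25: ~252 per cell    ("about 250")
```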

It is OK to criticize a study for having a small effect size. But it is not OK to criticize a study for having a large sample size. This is because sample sizes do not change effect sizes. If I were to study the effect of gender on weight with 40 people or with 400 people, I would, on average, estimate the same effect size (d ~= .59). Collecting 360 additional observations does not decrease my effect size (though, happily, it does increase the precision of my effect size estimate, and that increased precision better enables me to tell whether an effect size is in fact trivial).
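A short simulation makes the point concrete. Assuming a true effect of d = .59 (in standardized units; the numbers below are simulated, not real weight data), the average estimated effect size is essentially the same with 40 people as with 400, while the spread of the estimates shrinks by roughly a factor of three:

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_D = 0.59  # assumed true standardized gender difference in weight

def estimate_d(n_per_cell):
    """Run one simulated two-cell study and return its estimated d."""
    men = rng.normal(TRUE_D, 1.0, n_per_cell)
    women = rng.normal(0.0, 1.0, n_per_cell)
    pooled_sd = np.sqrt((np.var(men, ddof=1) + np.var(women, ddof=1)) / 2)
    return (men.mean() - women.mean()) / pooled_sd

for n in (20, 200):  # 40 vs. 400 people in total
    ds = np.array([estimate_d(n) for _ in range(10_000)])
    print(f"n = {n}/cell: mean d = {ds.mean():.2f}, SD of d = {ds.std():.2f}")
# Both means land near .59 (small samples inflate d very slightly);
# the SD of the estimates is about three times smaller with 400 people.
```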

Our field suffers from a problem of underpowering. When we underpower our studies, we either suffer the consequences of a large file drawer of failed studies (bad for us) or we are motivated to p-hack in order to find something to be significant (bad for the field). Those who criticize studies for being overpowered are using a nonsensical argument to reinforce exactly the wrong methodological norms.

If someone wants to criticize trivial effect sizes, they can compute them and, if they are trivial, criticize them. But they should never criticize samples for being too large.

We are an empirical science. We collect data, and use those data to learn about the world. For an empirical science, large samples are good. It is never worse to have more data.


