Unexpectedly Difficult Statistical Concepts Archives

[70] How Many Studies Have Not Been Run? Why We Still Think the Average Effect Does Not Exist

Posted on March 9, 2018 (updated February 12, 2020) by Leif Nelson

We have argued that, for most effects, it is impossible to identify the average effect (datacolada.org/33). The argument is subtle (but not statistical), and given the number of well-informed people who seem to disagree, perhaps we are simply wrong. This is my effort to explain why we think identifying the average effect is so hard….

[57] Interactions in Logit Regressions: Why Positive May Mean Negative

Posted on February 23, 2017 (updated July 28, 2022) by Uri Simonsohn

Of all economics papers published this century, the 10th most cited appeared in Economics Letters, a journal with an impact factor of 0.5. It makes an inconvenient and counterintuitive point: the sign of the estimate (b̂) of an interaction in a logit/probit regression need not correspond to the sign of its effect on the…
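To make the sign problem concrete, here is a minimal sketch, assuming the effect of interest is the change in the predicted probability. The coefficients are hypothetical numbers chosen for illustration, not estimates from any paper. With two binary predictors, the interaction on the probability scale is the difference between the effect of x1 when x2=1 and the effect of x1 when x2=0.

```python
import math


def logistic(z):
    """Inverse logit: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical logit coefficients (illustrative only); note b12 is POSITIVE.
b0, b1, b2, b12 = 2.0, 2.0, 2.0, 0.1


def prob(x1, x2):
    """Predicted probability for binary predictors x1 and x2."""
    return logistic(b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2)


# Effect of x1 on the probability, separately at each level of x2.
effect_x1_when_x2_is_0 = prob(1, 0) - prob(0, 0)
effect_x1_when_x2_is_1 = prob(1, 1) - prob(0, 1)

# Interaction on the probability scale: a difference of differences.
interaction_on_probability = effect_x1_when_x2_is_1 - effect_x1_when_x2_is_0

print(f"interaction coefficient b12:  {b12:+.3f}")
print(f"effect of x1 when x2 = 0:     {effect_x1_when_x2_is_0:+.3f}")
print(f"effect of x1 when x2 = 1:     {effect_x1_when_x2_is_1:+.3f}")
print(f"interaction on probability:   {interaction_on_probability:+.3f}")
```

With these numbers the coefficient is +0.1 but the interaction on the probability scale is about -0.085. The baseline probability is already high, so the logistic curve flattens and x1 has less room to raise the probability when x2=1.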

[50] Teenagers in Bikinis: Interpreting Police-Shooting Data

Posted on July 14, 2016 (updated February 15, 2020) by Uri Simonsohn

The New York Times, on Monday, showcased (.htm) an NBER working paper (.pdf) that proposed that “blacks are 23.8 percent less likely to be shot at by police relative to whites.” (p.22) The paper involved a monumental data collection effort to address an important societal question. The analyses are rigorous, clever and transparently reported. Nevertheless, I do…

[46] Controlling the Weather

Posted on February 2, 2016 (updated January 30, 2020) by Joe & Uri

Behavioral scientists have put forth evidence that the weather affects all sorts of things, including the stock market, restaurant tips, car purchases, product returns, art prices, and college admissions. It is not easy to properly study the effects of weather on human behavior. This is because weather is (obviously) seasonal, as is much of what…

[42] Accepting the Null: Where to Draw the Line?

Posted on October 28, 2015 (updated February 11, 2020) by Uri Simonsohn

We typically ask if an effect exists. But sometimes we want to ask if it does not. For example, how many of the “failed” replications in the recent reproducibility project published in Science (.pdf) suggest the absence of an effect? Data have noise, so we can never say ‘the effect is exactly zero.’ We can…

[41] Falsely Reassuring: Analyses of ALL p-values

Posted on August 24, 2015 (updated October 18, 2023) by Uri Simonsohn

It is a neat idea. Get a ton of papers. Extract all p-values. Examine the prevalence of p-hacking by assessing if there are too many p-values near p=.05. Economists have done it [SSRN], as have psychologists [.html], and biologists [.html]. These charts with distributions of p-values come from those papers: The dotted circles highlight the excess of…

[39] Power Naps: When do Within-Subject Comparisons Help vs Hurt (yes, hurt) Power?

Posted on June 22, 2015 (updated February 11, 2020) by Uri Simonsohn

A recent Science paper (.html) used a total sample size of N=40 to arrive at the conclusion that implicit racial and gender stereotypes can be reduced while napping. N=40 is a small sample for a between-subject experiment. One needs N=92 to reliably detect that men are heavier than women (SSRN). The study, however, was within-subject; for instance, its dependent…

[33] "The" Effect Size Does Not Exist

Posted on February 9, 2015 (updated February 11, 2020) by Uri Simonsohn

Consider the robust phenomenon of anchoring, where people’s numerical estimates are biased towards arbitrary starting points. What does it mean to say “the” effect size of anchoring? It surely depends on moderators like domain of the estimate, expertise, and perceived informativeness of the anchor. Alright, how about “the average” effect-size of anchoring? That's simple enough….

[27] Thirty-somethings are Shrinking and Other U-Shaped Challenges

Posted on September 17, 2014 (updated February 11, 2020) by Leif and Uri

A recent Psych Science (.pdf) paper found that sports teams can perform worse when they have too much talent. For example, in Study 3 they found that NBA teams with a higher percentage of talented players win more games, but that teams with the highest levels of talented players win fewer games. The hypothesis is easy enough…

[20] We cannot afford to study effect size in the lab

Posted on May 1, 2014 (updated January 30, 2020) by Uri Simonsohn

Methods people often say – in textbooks, task forces, papers, editorials, over coffee, in their sleep – that we should focus more on estimating effect sizes rather than testing for significance. I am kind of a methods person, and I am kind of going to say the opposite. Only kind of the opposite because it…