Behavioral scientists have put forth evidence that the weather affects all sorts of things, including the stock market, restaurant tips, car purchases, product returns, art prices, and college admissions. It is not easy to properly study the effects of weather on human behavior. This is because weather is (obviously) seasonal, as is much of what…
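Not from the post, but a quick illustration of the confound: if both the weather and the outcome move with the seasons, a raw correlation between them can be large even when the weather has no effect at all. One standard fix, sketched below with entirely made-up numbers, is to compare observations within calendar months (month fixed effects) rather than across them.

```python
# Sketch of the seasonality confound (illustrative numbers, not the post's data).
import numpy as np

rng = np.random.default_rng(4)
days = np.arange(730)                            # two years of daily data
month = (days // 30) % 12
season = np.sin(2 * np.pi * days / 365)

temperature = 15 + 10 * season + rng.normal(0, 3, days.size)
# Outcome is seasonal but, by construction, NOT affected by temperature.
outcome = 100 + 20 * season + rng.normal(0, 5, days.size)

def demean_by(x, groups):
    """Subtract each group's mean (here: month fixed effects)."""
    out = x.astype(float)
    for g in np.unique(groups):
        out[groups == g] -= x[groups == g].mean()
    return out

naive = np.corrcoef(temperature, outcome)[0, 1]
within = np.corrcoef(demean_by(temperature, month),
                     demean_by(outcome, month))[0, 1]
print(f"raw correlation: {naive:.2f}   within-month correlation: {within:.2f}")
```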
[45] Ambitious P-Hacking and P-Curve 4.0
In this post, we first consider how plausible it is for researchers to engage in more ambitious p-hacking (i.e., past the nominal significance level of p<.05). Then, we describe how we have modified p-curve (see app 4.0) to deal with this possibility. Ambitious p-hacking is hard. In “False-Positive Psychology” (SSRN), we simulated the consequences of four…
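As a rough illustration of what "ambitious" p-hacking requires (this is not the simulation from "False-Positive Psychology"; the batch size, maximum sample, and number of runs are arbitrary assumptions), one can simulate a researcher who re-tests after every batch of new observations and stops once a target threshold is reached:

```python
# How often does optional stopping with NO true effect reach p < .05
# versus the more "ambitious" p < .01? Illustrative sketch only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def phack_success(threshold, batch=10, max_n=100, sims=2000):
    """Share of null studies reaching p < threshold by adding
    `batch` observations per cell and re-testing after each batch."""
    hits = 0
    for _ in range(sims):
        a, b = [], []
        while len(a) < max_n:
            a.extend(rng.normal(0, 1, batch))   # control, true effect = 0
            b.extend(rng.normal(0, 1, batch))   # treatment, true effect = 0
            if stats.ttest_ind(a, b).pvalue < threshold:
                hits += 1
                break
    return hits / sims

print("p < .05:", phack_success(0.05))
print("p < .01:", phack_success(0.01))
```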
[44] AsPredicted: Pre-registration Made Easy
Pre-registering a study consists of leaving a written record of how it will be conducted and analyzed. Very few researchers currently pre-register their studies. Maybe it’s because pre-registering is annoying. Maybe it’s because researchers don’t want to tie their own hands. Or maybe it’s because researchers see no benefit to pre-registering. This post addresses these…
[43] Rain & Happiness: Why Didn’t Schwarz & Clore (1983) ‘Replicate’?
In my “Small Telescopes” paper, I introduced a new approach to evaluate replication results (SSRN). Among other examples, I described two studies as having failed to replicate the famous Schwarz and Clore (1983) finding that people report being happier with their lives when asked on sunny days. [Figure and text from the Small Telescopes paper (SSRN)] I…
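For readers unfamiliar with the approach: Small Telescopes asks whether the replication effect is too small to have been detectable by the original study, using as a benchmark the effect the original had only 33% power to detect (d33%). The snippet below is my own sketch of that benchmark for a simple two-cell design; the per-cell n is hypothetical, not Schwarz and Clore’s.

```python
# Sketch of the d33% benchmark for a two-sample design (equal n per cell).
import numpy as np
from scipy import stats, optimize

def power_two_sample(d, n_per_cell, alpha=0.05):
    """Power of a two-sided, two-sample t-test for true effect size d."""
    df = 2 * n_per_cell - 2
    ncp = d * np.sqrt(n_per_cell / 2)            # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    return 1 - stats.nct.cdf(t_crit, df, ncp)

def d33(n_per_cell):
    """Effect size the original study would detect with 33% power."""
    return optimize.brentq(lambda d: power_two_sample(d, n_per_cell) - 1/3,
                           0.01, 3.0)

n_orig = 30                                      # hypothetical original n per cell
print(f"d33% for n={n_orig} per cell: {d33(n_orig):.2f}")
```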
[42] Accepting the Null: Where to Draw the Line?
We typically ask if an effect exists. But sometimes we want to ask if it does not. For example, how many of the “failed” replications in the recent reproducibility project published in Science (.pdf) suggest the absence of an effect? Data have noise, so we can never say ‘the effect is exactly zero.’ We can…
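One common way to draw such a line, sketched below purely as an illustration (it is not necessarily the rule the post ends up recommending), is an equivalence test: pick the smallest effect you still care about and run two one-sided tests (TOST) against it. The data and the ±0.3 bound here are made up.

```python
# Two one-sided tests (TOST): can we conclude the effect is inside the line?
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
control = rng.normal(0.0, 1.0, 120)
treatment = rng.normal(0.05, 1.0, 120)           # tiny true effect
bound = 0.3                                      # the "line": |effect| < 0.3

n1, n2 = len(control), len(treatment)
diff = treatment.mean() - control.mean()
sp = np.sqrt(((n1 - 1) * control.var(ddof=1) +
              (n2 - 1) * treatment.var(ddof=1)) / (n1 + n2 - 2))
se = sp * np.sqrt(1 / n1 + 1 / n2)
df = n1 + n2 - 2

p_lower = 1 - stats.t.cdf((diff + bound) / se, df)  # effect above -bound?
p_upper = stats.t.cdf((diff - bound) / se, df)      # effect below +bound?
p_tost = max(p_lower, p_upper)                      # both must be small

print(f"difference = {diff:.3f}, TOST p = {p_tost:.3f}")
print("effect within the line" if p_tost < .05 else "cannot rule out a larger effect")
```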
[41] Falsely Reassuring: Analyses of ALL p-values
It is a neat idea. Get a ton of papers. Extract all p-values. Examine the prevalence of p-hacking by assessing whether there are too many p-values near p=.05. Economists have done it [SSRN], as have psychologists [.html] and biologists [.html]. These charts with distributions of p-values come from those papers; the dotted circles highlight the excess of…
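For concreteness, here is a minimal sketch of the kind of aggregate check those papers ran: pool p-values and ask whether there are “too many” just under .05 relative to the neighboring bin. The p-values are simulated and the bins are an assumption; the post’s argument is that this sort of pooled check can be falsely reassuring.

```python
# Count p-values in the bin just under .05 versus the bin just below it.
import numpy as np

rng = np.random.default_rng(2)

# Simulated "literature": mostly true effects (right-skewed p-values below .05),
# plus a minority of p-hacked results bunched just under .05.
real = rng.beta(0.3, 3.0, 9000) * 0.05
hacked = rng.uniform(0.04, 0.05, 1000)
pvals = np.concatenate([real, hacked])

just_under = np.sum((pvals > 0.04) & (pvals <= 0.05))
bin_below = np.sum((pvals > 0.03) & (pvals <= 0.04))
print(f".04-.05: {just_under}   .03-.04: {bin_below}   "
      f"ratio = {just_under / bin_below:.2f}")
```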
[40] Reducing Fraud in Science
Fraud in science is often attributed to incentives: we reward sexy results → fraud happens. The solution, the argument goes, is to reward other things. In this post I counter-argue, proposing three alternative solutions. Problems with the “change the incentives” solution. First, even if rewarding sexy results caused fraud, it does not follow that we should stop rewarding sexy results. We…
[39] Power Naps: When do Within-Subject Comparisons Help vs Hurt (yes, hurt) Power?
A recent Science paper (.html) used a total sample size of N=40 to arrive at the conclusion that implicit racial and gender stereotypes can be reduced while napping. N=40 is a small sample for a between-subject experiment. One needs N=92 to reliably detect that men are heavier than women (SSRN). The study, however, was within-subject; for instance, its dependent…
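As a hedged sketch of one ingredient in that kind of power calculation (this is not the post’s own analysis; the effect size, n, and correlations are illustrative), the power of a within-subject comparison rises or falls with the correlation between the two measurements:

```python
# Simulated power of a paired (within-subject) t-test as a function of the
# correlation r between the two measurements. Illustrative parameters only.
import numpy as np
from scipy import stats

def power_paired(d, n, r, alpha=0.05, sims=5000, rng=np.random.default_rng(3)):
    """n subjects, true within-subject effect d (in SD units), correlation r."""
    cov = [[1, r], [r, 1]]
    hits = 0
    for _ in range(sims):
        x = rng.multivariate_normal([0, d], cov, size=n)
        if stats.ttest_rel(x[:, 1], x[:, 0]).pvalue < alpha:
            hits += 1
    return hits / sims

d, n = 0.5, 20
for r in (-0.3, 0.0, 0.5, 0.9):
    print(f"r = {r:+.1f}: power = {power_paired(d, n, r):.2f}")
```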
[38] A Better Explanation Of The Endowment Effect
It’s a famous study. Give a mug to a random subset of a group of people. Then ask those who got the mug (the sellers) to tell you the lowest price they’d sell the mug for, and ask those who didn’t get the mug (the buyers) to tell you the highest price they’d pay for…
[37] Power Posing: Reassessing The Evidence Behind The Most Popular TED Talk
A recent paper in Psych Science (.pdf) reports a failure to replicate the study that inspired a TED Talk that has been seen 25 million times. [1] The talk invited viewers to do better in life by assuming high-power poses, just like Wonder Woman’s below, but the replication found that power-posing was inconsequential. If an…