When I taught my first PhD-level methods course, I invited students to submit questions about any topic in statistics or methodology. Six out of ten students asked about the same topic: robust and clustered standard errors. It's clearly a topic they find both important and confusing. Psychologists basically never use robust standard errors. But they…
[132] statuser: R in user-friendly mode
t.test(), the R function for running t-tests, is disconcertingly imperfect. A t-test involves computing the difference between two means. And yet, t.test() does not report said difference of means. It reports the p-value for the difference of means, and it reports the confidence interval for the difference of means, but not the difference of means itself…
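To see how little it would take to report that missing number, here is a minimal pure-Python sketch of the Welch two-sample computation (the default behavior of R's t.test()). The function name `welch_t` and the sample data are made up for illustration; the point is that the difference of means is a one-line computation sitting right next to the t statistic.

```python
from statistics import mean, variance
from math import sqrt

def welch_t(a, b):
    """Welch two-sample pieces: returns the difference of means
    (the quantity the post notes t.test() never prints) and the
    Welch t statistic built from that same difference."""
    diff = mean(a) - mean(b)                       # difference of means
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))  # Welch standard error
    return diff, diff / se                          # (difference, t statistic)

# Hypothetical data, just to exercise the function
a = [5.1, 4.9, 5.4, 5.0, 5.2]
b = [4.6, 4.8, 4.5, 4.9, 4.7]
diff, t = welch_t(a, b)
```

Note that `diff` is an intermediate value the t statistic already depends on, which is what makes its omission from t.test()'s printed output so striking.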
[131] Bending Over Backwards:
The Quadratic Puts the U in AI
For a recent journal club in Barcelona, we read a just-published article in the Journal of Experimental Psychology: General (JEP:G). The paper examines the impact of using gen-AI on creativity. It proposes an inverted U: people are most creative with moderate levels of AI use. The paper has three studies. Studies 1…
[130] ResearchBox: Even Easier to Use and More Transparently Permanent than Before
Over the past 10 years or so, posting data, code, and materials for published papers has gone from eccentric to mundane. There are a few platforms that enable sharing research files, including ResearchBox. ResearchBox is hosted by the Wharton Credibility Lab, which I co-direct. We also host the pre-registration platform AsPredicted, and a new platform…
[129] P-curve works in practice, but would it work if you dropped a piano on it?
P-curve is a statistical tool we developed about 15 years ago to help rule out selective reporting, be it p-hacking or file-drawering, as the sole explanation for a set of significant results. This post is about a forthcoming critique of p-curve in the statistics journal JASA (pdf). The authors identify four p-curve properties they object…
[128] LinkedOut: The Best Published Audit Study, And Its Interesting Shortcoming
There is a recent QJE paper reporting a LinkedIn audit study comparing responses to requests by Black vs White young males. I loved the paper. At every turn you come across a clever, effortful, and effective solution to a challenge posed by studying discrimination in a field experiment. But, no paper is perfect, and this…
[127] Meaningless Means #4: Correcting Scientific Misinformation
Before we got distracted by things like being sued, we had been working on a series called Meaningless Means, which exposed the fact that meta-analytic averaging is (really) bad. When a meta-analysis says something like, “The average effect of mindsets on academic performance is d = .32”, you should not take it at face value….
[126] Stimulus Plots
When we design experiments, we have to decide how to generate and select the stimuli that we use to test our hypotheses. In a forthcoming JPSP article, “Stimulus Sampling Reimagined” (htm), we propose that for at least 60 years we have been thinking about stimulus selection in experiments in the wrong way [1]. Specifically, with…
[125] "Complexity" 2: Don't be mean to the median
In Colada[124] I summarized a co-authored critique (with Banki, Walatka and Wu) of a recent AER paper that proposed that risk preferences reflect 'complexity' rather than preferences à la Prospect Theory. Ryan Oprea, the AER author, has written a rejoinder (.pdf). Its first main point (pages 5-12) is that our results with medians are 'knife edge' (p.8),…
[124] "Complexity": 75% of participants missed comprehension questions in AER paper critiquing Prospect Theory
Kahneman and Tversky’s (1979) “Prospect Theory” article is the most cited paper in the history of economics, and it won Kahneman the Nobel Prize in 2002. Among other things, it predicts that people are risk seeking for unlikely gains (e.g., they pay more than $1 for a 1% chance of $100) but risk averse for…
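The "pay more than $1" benchmark in that example is just the gamble's expected value. A quick sketch of the arithmetic (the willingness-to-pay figure is hypothetical, chosen only to illustrate the risk-seeking pattern):

```python
# Expected value of the gamble in the teaser: a 1% chance of winning $100
ev = 0.01 * 100            # = $1.00

# Risk seeking for unlikely gains means paying MORE than the expected value.
wtp = 1.50                 # hypothetical willingness to pay
risk_seeking = wtp > ev    # True: paying above EV for a long-shot gain
```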
