Data Colada
  • Home
  • About
  • Feedback Policy
  • Table of Contents

[73] Don't Trust Internal Meta-Analysis

Posted on October 24, 2018 by guest co-author Joachim Vosgerau with Uri, Leif, & Joe

Researchers have increasingly been using internal meta-analysis to summarize the evidence from multiple studies within the same paper. Much of the time, this involves computing the average effect size across the studies, and assessing whether that effect size is significantly different from zero. At first glance, internal meta-analysis seems like a wonderful idea. It increases…
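To make the computation the excerpt describes concrete, here is a minimal sketch of a fixed-effect internal meta-analysis; the effect sizes and standard errors are hypothetical, and the post's argument is about why trusting this average can mislead, not how to compute it:

    # Minimal sketch: inverse-variance-weighted average of per-study effect
    # sizes, plus a z-test of whether that average differs from zero.
    # All numbers below are made up for illustration.
    import math

    effects = [0.12, 0.25, 0.08, 0.31]   # hypothetical per-study effect sizes (e.g., Cohen's d)
    ses     = [0.15, 0.14, 0.16, 0.15]   # hypothetical standard errors

    weights = [1 / se**2 for se in ses]                 # inverse-variance weights
    avg = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se_avg = math.sqrt(1 / sum(weights))                # SE of the weighted average
    z = avg / se_avg                                    # z-test of H0: average effect = 0
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))   # two-sided p-value

    print(f"average d = {avg:.3f}, z = {z:.2f}, p = {p:.4f}")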

Read more

[72] Metacritic Has A (File-Drawer) Problem

Posted on July 2, 2018 by Joe Simmons

Metacritic.com scores and aggregates critics' reviews of movies, music, and video games. The website provides a summary assessment of the critics' evaluations, using a scale ranging from 0 to 100. Higher numbers mean that critics were more favorable. In theory, this website is pretty awesome, seemingly leveraging the wisdom of crowds to give consumers the most reliable…

Read more

[71] The (Surprising?) Shape of the File Drawer

Posted on April 30, 2018 by Leif Nelson

Let's start with a question so familiar that you will have answered it before the sentence is even completed: How many studies will a researcher need to run before finding a significant (p<.05) result? (If she is studying a non-existent effect and if she is not p-hacking.) Depending on your sophistication, wariness about being asked…
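As a rough illustration of the question being posed, here is a small simulation sketch, assuming a two-group design with a true effect of exactly zero and a two-sided test at roughly alpha = .05 (the sample size and number of simulations are arbitrary choices):

    # Simulate how many null studies a researcher runs before the first
    # "significant" result, with no p-hacking and a true effect of zero.
    import random
    import statistics

    random.seed(1)
    N_PER_GROUP, SIMS = 30, 500

    def one_null_study(n=N_PER_GROUP):
        """Two-group study with a true effect of zero; True if roughly p < .05."""
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(0, 1) for _ in range(n)]
        diff = statistics.mean(a) - statistics.mean(b)
        se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
        return abs(diff / se) > 1.96   # normal approximation to the t-test

    counts = []
    for _ in range(SIMS):
        k = 1
        while not one_null_study():
            k += 1
        counts.append(k)

    print("mean number of studies until first p<.05:", round(statistics.mean(counts), 1))
    print("most common number of studies:", statistics.mode(counts))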

Read more

[70] How Many Studies Have Not Been Run? Why We Still Think the Average Effect Does Not Exist

Posted on March 9, 2018 by Leif Nelson

We have argued that, for most effects, it is impossible to identify the average effect (datacolada.org/33). The argument is subtle (but not statistical), and given the number of well-informed people who seem to disagree, perhaps we are simply wrong. This is my effort to explain why we think identifying the average effect is so hard….

Read more

[69] Eight things I do to make my open research more findable and understandable

Posted on February 6, 2018 by Uri Simonsohn

It is now common for researchers to post original materials, data, and/or code behind their published research. That's obviously great, but open research is often difficult to find and understand. In this post I discuss 8 things I do, in my papers, code, and datafiles, to combat that. Paper 1) Before all method sections, I…

Read more

[68] Pilot-Dropping Backfires (So Daryl Bem Probably Did Not Do It)

Posted on January 25, 2018 by Joe, Leif, & Uri

Uli Schimmack recently identified an interesting pattern in the data from Daryl Bem's infamous "Feeling the Future" JPSP paper, in which he reported evidence for the existence of extrasensory perception (ESP; .pdf)[1]. In each study, the effect size is larger among participants who completed the study earlier (blogpost: .htm). Uli referred to this as the "decline…

Read more

[67] P-curve Handles Heterogeneity Just Fine

Posted on January 8, 2018 by Joe, Leif, & Uri

A few years ago, we developed p-curve (see p-curve.com), a statistical tool that identifies whether or not a set of statistically significant findings contains evidential value, or whether those results are solely attributable to the selective reporting of studies or analyses. It also estimates the true average power of a set of significant findings [1]….
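As a simplified illustration of the intuition behind p-curve (not the p-curve.com implementation, which combines the significant p-values themselves): if the true effect were zero, significant p-values would be uniform between 0 and .05, so about half should fall below .025. The hypothetical p-values below are checked against that benchmark with a binomial calculation:

    # Simplified sketch of one ingredient of the p-curve idea: count how many
    # significant p-values fall in the "low" half (below .025) and compare to
    # the 50/50 split expected when the true effect is zero.
    from math import comb

    sig_ps = [0.004, 0.011, 0.018, 0.032, 0.002, 0.021, 0.009]  # hypothetical p < .05 results

    k = sum(p < 0.025 for p in sig_ps)   # number of p-values in the low half
    n = len(sig_ps)
    # One-sided binomial probability of k or more low p-values if each
    # significant p-value were equally likely to land in either half.
    p_binom = sum(comb(n, i) * 0.5**n for i in range(k, n + 1))
    print(f"{k}/{n} p-values below .025; binomial p = {p_binom:.3f}")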

Read more

[66] Outliers: Evaluating A New P-Curve Of Power Poses

Posted on December 6, 2017 by Joe, Leif, & Uri

In a forthcoming Psych Science paper, Cuddy, Schultz, & Fosse, hereafter referred to as CSF, p-curved 55 power-posing studies (.pdf | SSRN), concluding that they contain evidential value [1]. Thirty-four of those studies were previously selected and described as "all published tests" (p. 657) by Carney, Cuddy, & Yap (2015; .pdf). Joe and Uri p-curved…

Read more

[65] Spotlight on Science Journalism: The Health Benefits of Volunteering

Posted on November 13, 2017 by Leif Nelson

I want to comment on a recent article in the New York Times, but along the way I will comment on scientific reporting as well. I think that science reporters frequently fall short in assessing the evidence behind the claims they relay, but as I try to show, assessing evidence is not an easy task….

Read more

[64] How To Properly Preregister A Study

Posted on November 6, 2017 by Joe, Leif, & Uri

P-hacking, the selective reporting of statistically significant analyses, continues to threaten the integrity of our discipline. P-hacking is inevitable whenever (1) a researcher hopes to find evidence for a particular result, (2) there is ambiguity about how exactly to analyze the data, and (3) the researcher does not perfectly plan out his/her analysis in advance….

Read more

Your hosts

Uri Simonsohn (.htm)
Joe Simmons (.htm)
Leif Nelson (.htm)

    Twitter & Facebook

    We tweet new posts: @DataColada
    And link to them on our Facebook page

    All posts

    • [81] Data Replicada
    • [80] Interaction Effects Need Interaction Controls
    • [79] Experimentation Aversion: Reconciling the Evidence
    • [78c] Bayes Factors in Ten Recent Psych Science Papers
    • [78b] Hyp-Chart, the Missing Link Between P-values and Bayes Factors
    • [78a] If you think p-values are problematic, wait until you understand Bayes Factors
    • [77] Number-Bunching: A New Tool for Forensic Data Analysis
    • [76] Heterogeneity Is Replicable: Evidence From Maluma, MTurk, and Many Labs
    • [75] Intentionally Biased: People Purposely Don't Ignore Information They "Should" Ignore
    • [74] In Press at Psychological Science: A New 'Nudge' Supported by Implausible Data
    • [73] Don't Trust Internal Meta-Analysis
    • [72] Metacritic Has A (File-Drawer) Problem
    • [71] The (Surprising?) Shape of the File Drawer
    • [70] How Many Studies Have Not Been Run? Why We Still Think the Average Effect Does Not Exist
    • [69] Eight things I do to make my open research more findable and understandable
    • [68] Pilot-Dropping Backfires (So Daryl Bem Probably Did Not Do It)
    • [67] P-curve Handles Heterogeneity Just Fine
    • [66] Outliers: Evaluating A New P-Curve Of Power Poses
    • [65] Spotlight on Science Journalism: The Health Benefits of Volunteering
    • [64] How To Properly Preregister A Study
    • [63] "Many Labs" Overestimated The Importance of Hidden Moderators
    • [62] Two-lines: The First Valid Test of U-Shaped Relationships
    • [61] Why p-curve excludes ps>.05
    • [60] Forthcoming in JPSP: A Non-Diagnostic Audit of Psychological Research
    • [59] PET-PEESE Is Not Like Homeopathy
    • [58] The Funnel Plot is Invalid Because of This Crazy Assumption: r(n,d)=0
    • [57] Interactions in Logit Regressions: Why Positive May Mean Negative
    • [56] TWARKing: Test-Weighting After Results are Known
    • [55] The file-drawer problem is unfixable, and that's OK
    • [54] The 90x75x50 heuristic: Noisy & Wasteful Sample Sizes In The "Social Science Replication Project"
    • [53] What I Want Our Field To Prioritize
    • [52] Menschplaining: Three Ideas for Civil Criticism
    • [51] Greg vs. Jamal: Why Didn't Bertrand and Mullainathan (2004) Replicate?
    • [50] Teenagers in Bikinis: Interpreting Police-Shooting Data
    • [49] P-Curve Won't Do Your Laundry, But Will Identify Replicable Findings
    • [48] P-hacked Hypotheses Are Deceivingly Robust
    • [47] Evaluating Replications: 40% Full ≠ 60% Empty
    • [46] Controlling the Weather
    • [45] Ambitious P-Hacking and P-Curve 4.0
    • [44] AsPredicted: Pre-registration Made Easy
    • [43] Rain & Happiness: Why Didn't Schwarz & Clore (1983) 'Replicate' ?
    • [42] Accepting the Null: Where to Draw the Line?
    • [41] Falsely Reassuring: Analyses of ALL p-values
    • [40] Reducing Fraud in Science
    • [39] Power Naps: When do Within-Subject Comparisons Help vs Hurt (yes, hurt) Power?
    • [38] A Better Explanation Of The Endowment Effect
    • [37] Power Posing: Reassessing The Evidence Behind The Most Popular TED Talk
    • [36] How to Study Discrimination (or Anything) With Names; If You Must
    • [35] The Default Bayesian Test is Prejudiced Against Small Effects
    • [34] My Links Will Outlive You
    • [33] "The" Effect Size Does Not Exist
    • [32] Spotify Has Trouble With A Marketing Research Exam
    • [31] Women are taller than men: Misusing Occam's Razor to lobotomize discussions of alternative explanations
    • [30] Trim-and-Fill is Full of It (bias)
    • [29] Help! Someone Thinks I p-hacked
    • [28] Confidence Intervals Don't Change How We Think about Data
    • [27] Thirty-somethings are Shrinking and Other U-Shaped Challenges
    • [26] What If Games Were Shorter?
    • [25] Maybe people actually enjoy being alone with their thoughts
    • [24] P-curve vs. Excessive Significance Test
    • [23] Ceiling Effects and Replications
    • [22] You know what's on our shopping list
    • [21] Fake-Data Colada: Excessive Linearity
    • [20] We cannot afford to study effect size in the lab
    • [19] Fake Data: Mendel vs. Stapel
    • [18] MTurk vs. The Lab: Either Way We Need Big Samples
    • [17] No-way Interactions
    • [16] People Take Baths In Hotel Rooms
    • [15] Citing Prospect Theory
    • [14] How To Win A Football Prediction Contest: Ignore Your Gut
    • [13] Posterior-Hacking
    • [12] Preregistration: Not just for the Empiro-zealots
    • [11] "Exactly": The Most Famous Framing Effect Is Robust To Precise Wording
    • [10] Reviewers are asking for it
    • [9] Titleogy: Some facts about titles
    • [8] Adventures in the Assessment of Animal Speed and Morality
    • [7] Forthcoming in the American Economic Review: A Misdiagnosed Failure-to-Replicate
    • [6] Samples Can't Be Too Large
    • [5] The Consistency of Random Numbers
    • [4] The Folly of Powering Replications Based on Observed Effect Size
    • [3] A New Way To Increase Charitable Donations: Does It Replicate?
    • [2] Using Personal Listening Habits to Identify Personal Music Preferences
    • [1] "Just Posting It" works, leads to new retraction in Psychology

    Pages

    • About
    • Drop That Bayes: A Colada Series on Bayes Factors
    • Policy on Soliciting Feedback From Authors
    • Table of Contents

    Data Colada - All Content Licensed: CC-BY (Creative Commons)