In the third installment of Data Replicada, we report our attempt to replicate a recently published Journal of Consumer Research (JCR) article entitled, "The Uncertain Self: How Self-Concept Structure Affects Subscription Choice" (.htm).
The central theory in the paper can be expressed in the following way: If you are uncertain about your own self-concept, then you will not want to introduce even more uncertainty by altering your identity. Consumers with "low self-concept clarity" should therefore be more motivated to keep their identities stable by (1) retaining products that are relevant to their identities, and (2) choosing not to acquire new products that are relevant to their identities.
This JCR paper contains six studies. Studies 4 and 6 were conducted on MTurk, and we chose to try to replicate Study 4, both because its results were stronger than those in Study 6, and because it tested the two key hypotheses at once: that low self-concept clarity both increases the tendency to retain and decreases the tendency to acquire an identity-relevant magazine subscription.
We contacted the first author to request the materials needed to conduct a replication. She was extremely forthcoming and polite, replying within 24 hours. She shared the original Qualtrics file that she used to conduct that study, and we modified it only slightly to conduct our replication. We are very grateful to her for her help and professionalism.
In the preregistered replication (https://aspredicted.org/mi7km.pdf), we used the same survey as in the original study, and therefore the same instructions, procedures, images, and questions. This study did not deviate from the original study in any discernible way, except that the criteria we used to choose our MTurk sample, and the amount that we paid them to do the survey, may have been different from the original. The original study reported no exclusions, and so we did not exclude any participants who completed the dependent variable. You can access our Qualtrics survey here (.qsf), our materials here (.pdf), our data here (.csv), our codebook here (.xlsx), and our R code here (.R).
We asked 1,283 MTurkers to complete a survey that was ostensibly composed of separate studies by different researchers. In the first part of the survey, participants were asked to write about three aspects of their lives that made them either uncertain ("Low Self-Concept Clarity" condition) or certain ("High Self-Concept Clarity" condition) about themselves.
After completing that "study," participants moved on to a "Consumer Choice Study." In the "Retention" condition, they were told that they currently have digital subscriptions to both People Magazine and The Economist, and that they are considering cancelling one of those subscriptions. In the "Acquisition" condition, they were told that they are searching for a new digital magazine subscription, and that they are considering People Magazine and The Economist. Participants then used an unnumbered sliding scale, defaulted to the scale midpoint, to indicate which magazine they would keep (in the Retention condition) or which magazine they would buy (in the Acquisition condition).
Though unnumbered on the page, the scale ranged from 0 = People Magazine to 100 = The Economist. Because a pretest showed that The Economist is more identity-relevant than People Magazine, higher numbers on this scale indicate a greater desire to retain or acquire an identity-relevant subscription.
Before presenting the results, it is worth discussing our analysis plan. The original study presents evidence for a crossover interaction, meaning that the study contained evidence for two separate and independent hypotheses. If our replication were to find a significant interaction, that result alone would not distinguish between a successful replication of both of these effects or a successful replication of just one of them. Indeed, whenever you are replicating a crossover interaction, it is best to analyze the two results independently, since the results themselves are independent. Thus, we preregistered to separately analyze the Retention condition results and the Acquisition condition results. Of course, we could still compute the overall interaction, and we report that below as well, in a footnote.
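To see why a significant interaction on its own is ambiguous, consider a small arithmetic sketch. The cell means below are hypothetical numbers chosen purely for illustration, not data from either study, and `simple_effects_and_interaction` is a made-up helper name. In a 2 x 2 design the interaction contrast is simply the difference between the two simple effects, so quite different patterns can produce the same interaction.

```python
def simple_effects_and_interaction(means):
    """Return (retention effect, acquisition effect, interaction contrast).

    Each simple effect is the Low minus High Self-Concept Clarity mean
    within one choice condition; the interaction is their difference.
    """
    retention = means[("retention", "low")] - means[("retention", "high")]
    acquisition = means[("acquisition", "low")] - means[("acquisition", "high")]
    return retention, acquisition, retention - acquisition


# HYPOTHETICAL cell means on the 0-100 slider, for illustration only.
both_effects = {
    ("retention", "low"): 60.0, ("retention", "high"): 55.0,
    ("acquisition", "low"): 50.0, ("acquisition", "high"): 55.0,
}
one_effect = {
    ("retention", "low"): 65.0, ("retention", "high"): 55.0,
    ("acquisition", "low"): 55.0, ("acquisition", "high"): 55.0,
}

for label, means in [("both effects", both_effects), ("one effect", one_effect)]:
    print(label, simple_effects_and_interaction(means))
```

Both hypothetical scenarios yield an interaction contrast of 10 scale points, yet only the first contains the two effects the theory predicts. That is the reason for testing the Retention and Acquisition effects separately.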
Here are the original (left panel) and replication results (right panel):
First, let's consider whether the predicted results were significant in the (larger-sampled) replication attempt. They were not. Self-Concept Clarity condition did not significantly alter people's desire to retain The Economist (p = .379) or to acquire The Economist (p = .496).
Second, let's consider what the results tell us about the possible size of the effect, and about the sample size that would be required to detect it. The figure below shows that our Retention condition effect size estimate is d = -.07, with a 90% confidence interval of [-.20, +.06]. Our Acquisition condition effect size estimate is d = .05, with a 90% confidence interval of [-.08, +.18].
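As a rough check on these intervals, the sketch below applies a common large-sample approximation to the standard error of Cohen's d. The cell size of 320 is an assumption (roughly 1,283 participants split evenly over four cells), not a figure reported in this post, and `d_confidence_interval` is a hypothetical helper name. The R code linked above remains the authoritative analysis.

```python
from math import sqrt
from statistics import NormalDist


def d_confidence_interval(d, n1, n2, level=0.90):
    """Approximate CI for Cohen's d using a large-sample standard error."""
    # Common large-sample approximation to the standard error of d.
    se = sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    # Two-sided critical value, about 1.645 for a 90% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


# ASSUMPTION: about 320 participants per cell (1,283 split four ways).
N_PER_CELL = 320

retention_ci = d_confidence_interval(-0.07, N_PER_CELL, N_PER_CELL)
acquisition_ci = d_confidence_interval(0.05, N_PER_CELL, N_PER_CELL)

print("Retention:   [%.2f, %.2f]" % retention_ci)
print("Acquisition: [%.2f, %.2f]" % acquisition_ci)
```

Under that equal-cell assumption, the approximation gives intervals that round to the ones reported here, which suggests the reported intervals are about what one would expect from a sample of this size.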
If the true effect sizes are equal to what we observed in the replication, then the original study would have had 7.8% power to detect the Retention condition effect and 6.7% power to detect the Acquisition condition effect, and thus 0.5% power to detect both. You would need sample sizes of 6,500 and 10,800 to have an 80% chance to detect each of these effects, respectively.
In sum, using a sample three times as large as that of the original study, we find no evidence that people's tendency to retain or acquire identity-relevant magazine subscriptions is affected by having them write about aspects of their lives that make them feel certain versus uncertain about themselves.
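The arithmetic behind statements like these can be sketched with a normal approximation to a two-sided, two-sample test. This is an illustration, not the calculation used for this post: the function names are hypothetical, the effect sizes are the rounded values reported here, and the original study's cell size of 107 is an assumption (about a third of the replication's assumed 320 per cell), since the original sample size is not stated in this post.

```python
from math import ceil, sqrt
from statistics import NormalDist

NORMAL = NormalDist()


def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation)."""
    z_crit = NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n_per_group / 2)  # expected z statistic under effect d
    return NORMAL.cdf(shift - z_crit) + NORMAL.cdf(-shift - z_crit)


def total_n_for_power(d, power=0.80, alpha=0.05):
    """Approximate total N (two equal groups) needed to reach the target power."""
    z_crit = NORMAL.inv_cdf(1 - alpha / 2)
    z_power = NORMAL.inv_cdf(power)
    n_per_group = ceil(2 * ((z_crit + z_power) / abs(d)) ** 2)
    return 2 * n_per_group


# ASSUMPTION: roughly 107 participants per cell in the original study.
ASSUMED_ORIGINAL_N_PER_CELL = 107

for label, d in [("Retention", -0.07), ("Acquisition", 0.05)]:
    power = two_sample_power(d, ASSUMED_ORIGINAL_N_PER_CELL)
    print(f"{label}: power ~ {power:.1%}, N for 80% power ~ {total_n_for_power(d):,}")

# The two tests use different participants, so the chance that both
# reach significance is the product of the two reported powers.
joint_power = 0.078 * 0.067
print(f"Joint power from the reported figures: {joint_power:.1%}")
```

With the rounded effect sizes, the required totals come out at roughly 6,400 and 12,600 rather than 6,500 and 10,800. The gap is presumably due to rounding of d, since small changes in an effect size this close to zero move the required sample size a great deal. The product of the two reported powers does reproduce the 0.5% figure.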
When we reached out to the authors for comments on the post, they responded as follows:
"We were encouraged to see the predicted crossover pattern in this data, although it was disappointing that the effect of the SCC manipulation is clearly not as strong as in the original data from 2017 (.csv). Overall, while we appreciate the importance of exact replications, one concern is whether strict adherence to the original methods is always the best test of an underlying theory, especially when an experiment relies on activating a mental state to test the relationship between constructs (Stroebe & Strack, 2014; .html). For example, the original study was designed to test how SCC affects subscription choices. To test the theory we asked people to complete an essay task that either increased or decreased their SCC. If this essay task was less effective in altering people's SCC in the replication, we would expect a weaker effect on choice. We did not include a manipulation check of SCC in the original study (although we had reported one in a previous study), and so S&N also did not include it in their methods. As a result, we don't know if the manipulation was as effective in the replication attempt. While there are many reasons that an established manipulation may not activate a mental state in a specific instance (Stroebe & Strack, 2014), one possibility relevant to the current discussion may be related to mTurk data quality.
"Anecdotal and empirical evidence suggests that mTurk data quality has declined, most noticeably since the summer of 2018 (Chmielewski & Kucker, 2019; .html), although we acknowledge that opinions differ on the impact of these issues (see Data Colada). Especially when an experiment relies on a writing task to operationalize a mental state, low quality responses can dampen the impact of the manipulation and reduce effect sizes. To explore whether data quality might matter here, we followed the example of Chmielewski and Kucker (2019), and filtered out responses with low quality indicators (i.e., nonsense responses to the prime, such as, "Testttttt," "yes very good allyes very good" (n = 33), completion times 2SD slower than the mean (n = 37), and multiple responses from the same IP address (n = 39)). The pattern that emerged was much closer to the original 2017 finding, F(1, 1170) = 2.74, p = .098, although the simple effects, while stronger, remain non-significant. Thus it is unclear to us if the effect sizes reported in this post are the most accurate representation of the theoretical impact of SCC on subscription choice. In sum, while the current data may not find evidence of the relationship obtained in our six studies, we certainly support the effort, and agree that more research is required to fully understand how SCC affects subscription choices."
- It is also possible that Study 5 was conducted on MTurk, but when we were deciding which study to replicate we couldn't tell for sure and so we didn't consider it.
- The modifications included a necessarily different consent form, and the addition of a "Pretest" condition, described below.
- We opened up the survey to every U.S. worker on MTurk, and we paid them $0.75 for completing the survey. The first author did not recall or have a record of these details for their study.
- To make sure that our participants also believed that The Economist was more identity-relevant than People Magazine, we randomly assigned an additional subset of our participants (n = 224) to complete this pretest, instead of assigning them to the Retention or Acquisition condition. We replicated the authors' pretest results: Participants found The Economist to be more identity-relevant than People, M_Economist = 3.66, M_People = 2.78, t(223) = 9.84, p < 10^-15. These means are nearly identical to what is reported in the article: M_Economist = 3.6, M_People = 2.8.
- The interaction also failed to reach significance: F(1, 1279) = 1.23, p = .267.
- At the end of the study, we asked participants whether they had "ever before completed an MTurk survey in which you were asked to choose between People magazine and The Economist" (Possible answers: No, Maybe, Yes). Restricting the analyses to those participants who answered "No" to this question (n = 1,144 of 1,283) does not alter the results. Among these participants, Self-Concept Clarity condition did not significantly alter people's desire to retain The Economist (M_HIGH = 57.46, M_LOW = 59.56, p = .549) or to acquire The Economist (M_HIGH = 58.00, M_LOW = 56.96, p = .743). The interaction was also non-significant (p = .506).