Beth Tipton
Statistics professor at @NorthwesternU, fellow at @IPRatNU. meta-analysis, causal generalization. @stats_tipton@toot.community

Nov 7, 2022, 14 tweets

Does growth mindset work? Join me for a tale of two meta-analyses. @chrisjb1

MA1, by Macnamara and Burgoyne, says ‘no’. doi.org/10.1037/bul000…

MA2, by Burnette and colleagues, says ‘under some contexts, for some students.’
doi.org/10.1037/bul000…

If you know me, you know where this is going.

The short answer: Skip MA1. Read MA2.

The answer isn’t “yes” or “no”. Examining the data in both MA1 and MA2 reveals that on average, the effect is close to zero. This is not a surprise. But for at-risk students, the average effect on academic achievement is moderate (~0.15 SD).

Ok, but now for my thoughts. I want to begin by noting that I am a statistician and that I really do not care whether GM works or not.

But I do care about MA and about its use. So we wrote a commentary: dx.doi.org/10.13140/RG.2.…

We focus on 4 best practices in large MAs. We contrast the two MAs, focusing on the different methodological choices they made. I will try to summarize.

1) MA should prominently (abstract, results, discussion) quantify the heterogeneity in effect sizes.

MA2 prominently reports that 95% of effects vary between -0.08 and 0.35.

MA1 focuses only on the average effect. No mention of the prediction interval (PI) or heterogeneity anywhere.
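A minimal sketch of what reporting a PI involves, in the Higgins-style random-effects form. All inputs below are hypothetical round numbers for illustration, not values taken from either paper:

```python
# 95% prediction interval for a random-effects meta-analysis:
# mu_hat +/- t_{k-2} * sqrt(tau^2 + se_mu^2)
from scipy import stats

mu_hat = 0.13  # pooled average effect (hypothetical)
se_mu = 0.03   # standard error of the pooled effect (hypothetical)
tau = 0.11     # estimated between-study SD of true effects (hypothetical)
k = 60         # number of studies (hypothetical)

t_crit = stats.t.ppf(0.975, df=k - 2)
half = t_crit * (tau**2 + se_mu**2) ** 0.5
print(f"95% PI: [{mu_hat - half:.2f}, {mu_hat + half:.2f}]")
```

Unlike a confidence interval, the PI does not shrink toward the mean as studies pile up: it describes where true effects in new settings are likely to fall, which is exactly the heterogeneity a reader needs to see.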

2) Meta-analyses should include all the relevant within-study variation in effect sizes.

There is no need to average effect sizes to the study level. Doing so excludes an important source of variation.

MA2 includes all relevant effect sizes. MA1 does not.
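To see why this matters, here is a toy illustration with invented numbers: averaging to the study level erases exactly the within-study contrasts (e.g., between subgroups or outcomes) that a moderator analysis needs.

```python
import pandas as pd

# Three hypothetical studies, each reporting effects for two subgroups.
effects = pd.DataFrame({
    "study":    ["A", "A", "B", "B", "C", "C"],
    "subgroup": ["at-risk", "not-at-risk"] * 3,
    "d":        [0.18, 0.02, 0.14, 0.00, 0.16, 0.04],
})

# Averaging to the study level: every study looks like d ~ 0.07-0.10,
# and the at-risk vs. not-at-risk contrast disappears.
print(effects.groupby("study")["d"].mean())

# Keeping all effect sizes preserves the contrast.
print(effects.groupby("subgroup")["d"].mean())
```

A real analysis would also model the dependence among effect sizes from the same study, e.g., with a multilevel model or robust variance estimation.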

Aside 1: These methodological choices matter.

MA1’s choices lead them to conclude that GM does not work, since “the effect” is d = 0.04. MA2’s choices lead them to conclude that GM works for particular subgroups, for particular outcomes. These are very different conclusions.

3) MA should appropriately adjust for confounders, including study quality and publication bias.

‘Study quality’ must be measured, and developing a new measure requires that its validity be established. The best measures are those a field has agreed upon.
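On the publication-bias side, one common diagnostic is Egger’s regression test for funnel-plot asymmetry: regress the standardized effect (d/se) on precision (1/se) and examine the intercept. A self-contained sketch with simulated data (not data from either MA):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
se = rng.uniform(0.05, 0.4, size=40)            # hypothetical standard errors
d = rng.normal(0.10, 0.10, size=40) + 0.5 * se  # simulated small-study bias

X = sm.add_constant(1 / se)   # intercept + precision
fit = sm.OLS(d / se, X).fit()
print(f"intercept = {fit.params[0]:.2f}, p = {fit.pvalues[0]:.3f}")
```

An intercept far from zero flags small-study effects. It is a diagnostic, not an adjustment; selection models and related methods go further.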

4) Meta-analyses should seek to explain heterogeneity using moderation analyses.

This requires more than one-variable-at-a-time models. And just because p > .05 does not mean there is no variation: you cannot prove that the effect is constant.
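A minimal sketch of what a multi-moderator meta-regression looks like, as opposed to one-variable-at-a-time models. Everything here is simulated; a full analysis would be random-effects and would account for tau² and dependent effect sizes:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 80
at_risk = rng.integers(0, 2, size=n)    # hypothetical moderator: at-risk sample?
targeted = rng.integers(0, 2, size=n)   # hypothetical moderator: targeted outcome?
se = rng.uniform(0.05, 0.30, size=n)    # per-effect standard errors
d = 0.02 + 0.12 * at_risk + 0.05 * targeted + rng.normal(0.0, se)

# Enter both moderators at once, weighting by inverse variance.
X = sm.add_constant(np.column_stack([at_risk, targeted]))
fit = sm.WLS(d, X, weights=1 / se**2).fit()
print(fit.params)  # [intercept, at_risk, targeted]
```

Note the asymmetry: a non-significant moderator coefficient fails to detect variation; it does not demonstrate that the effect is constant.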

Aside 2: These methodological choices matter.

MA1 created their own measure of quality, using criteria that are out of sync with the field*. They then concluded that the literature was ‘low quality’ and focused their conclusions on a very narrow subset. MA2 did not.

* e.g. 1: MA1 codes studies that randomized classes or schools as ‘low quality’. In education, cluster randomization is the dominant RCT design.

* e.g. 2: MA1’s measure of financial conflict of interest (FCOI) is a ‘post-treatment’ variable: it measured *subsequent* success, after the study was published.

For those who are still reading: Be like MA2. If you’re going to conduct a large MA with a lot of “heterogeneity in”, then it is your job to try to understand the “heterogeneity out.”

And please, can we stop with the “either/or”, “good/bad”, “yes/no” thinking about interventions? I’m tired of writing commentaries.
