Beth Tipton
Nov 7, 2022
Does growth mindset work? Join me for a tale of two meta-analyses. @chrisjb1
MA1, by Macnamara and Burgoyne, says ‘no’. doi.org/10.1037/bul000…

MA2, by Burnette and colleagues, says ‘under some contexts, for some students.’
doi.org/10.1037/bul000…

If you know me, you know where this is going.
The short answer: Skip MA1. Read MA2.

The answer isn’t “yes” or “no”. Examining the data in both MA1 and MA2 reveals that on average, the effect is close to zero. This is not a surprise. But for at-risk students, the average effect on academic achievement is moderate (~0.15SD).
Ok, but now for my thoughts. I want to begin by noting that I am a statistician and that I really do not care whether GM works or not.

But I do care about MA and about its use. So we wrote a commentary: dx.doi.org/10.13140/RG.2.…
We focus on 4 best practices in large MAs and contrast the different methodological choices the two MAs made. I will try to summarize.
1) MA should prominently (abstract, results, discussion) quantify the heterogeneity in effect sizes.

MA2 prominently reports that 95% of effects vary between -0.08 and 0.35.

MA1 focuses only on the average effect. No mention of the prediction interval (PI) or heterogeneity anywhere.
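For the curious, here is a minimal sketch of what that prediction interval is: the random-effects average plus the spread of true effects around it. The effect sizes below are made up for illustration (not either MA's data); the formulas are the standard DerSimonian-Laird tau² and the Higgins-style 95% PI.

```python
# Minimal sketch: random-effects average + 95% prediction interval.
# Effect sizes and variances are HYPOTHETICAL, for illustration only.
import numpy as np
from scipy import stats

d = np.array([0.30, 0.02, -0.05, 0.21, 0.10, 0.15, 0.01, 0.40])          # hypothetical effect sizes
v = np.array([0.010, 0.020, 0.015, 0.008, 0.025, 0.012, 0.030, 0.018])   # their sampling variances
k = len(d)

# DerSimonian-Laird estimate of between-study variance tau^2
w = 1.0 / v
mu_fe = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - mu_fe) ** 2)
C = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - (k - 1)) / C)

# Random-effects average and its standard error
w_re = 1.0 / (v + tau2)
mu_re = np.sum(w_re * d) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))

# 95% prediction interval: where a NEW study's true effect is likely to fall,
# not just where the average sits
t_crit = stats.t.ppf(0.975, df=k - 2)
pi_low = mu_re - t_crit * np.sqrt(tau2 + se_re ** 2)
pi_high = mu_re + t_crit * np.sqrt(tau2 + se_re ** 2)

print(f"average d = {mu_re:.2f}, tau^2 = {tau2:.3f}, 95% PI = [{pi_low:.2f}, {pi_high:.2f}]")
```

The point: a tight confidence interval around a small average can coexist with a PI wide enough to cover both zero and meaningful effects. That is exactly what "heterogeneity" means here.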
2) Meta-analyses should include all the relevant within-study variation in effect sizes.

There is no need to average effect sizes to the study level. Doing so excludes an important source of variation.

MA2 includes all relevant effect sizes. MA1 does not.
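A minimal sketch of what "include all the effect sizes" can look like in practice: keep every effect size and handle the dependence by clustering on study with a robust (sandwich) standard error, instead of collapsing to one number per study. Numbers are hypothetical, and this skips the small-sample corrections real robust-variance analyses use.

```python
# Minimal sketch (made-up numbers): keep all effect sizes, cluster by study.
import numpy as np

# Hypothetical: 3 studies contributing 2-3 effect sizes each
study = np.array([1, 1, 2, 2, 2, 3, 3])
d     = np.array([0.25, 0.05, 0.10, 0.30, -0.02, 0.18, 0.02])
v     = np.array([0.02, 0.02, 0.01, 0.01, 0.01, 0.015, 0.015])

w = 1.0 / v
mu = np.sum(w * d) / np.sum(w)          # weighted average over ALL effect sizes
resid = d - mu

# Cluster-robust variance: sum weighted residuals within each study,
# then square and sum across studies (bare-bones sandwich, no correction)
cluster_sums = np.array([np.sum((w * resid)[study == j]) for j in np.unique(study)])
var_robust = np.sum(cluster_sums ** 2) / np.sum(w) ** 2
print(f"mu = {mu:.2f}, cluster-robust SE = {np.sqrt(var_robust):.3f}")

# Contrast: averaging to the study level first throws away the
# within-study spread of effects before the analysis even starts.
d_study = np.array([d[study == j].mean() for j in np.unique(study)])
print("study-level means:", np.round(d_study, 2))
```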
Aside 1: These methodological choices matter.

MA1’s choices lead them to conclude that GM does not work, since “the effect” is d = 0.04. MA2’s choices lead them to conclude that GM works for particular subgroups, for particular outcomes. These are very different conclusions.
3) MA should appropriately adjust for confounders, including study quality and publication bias.

‘Study quality’ must be measured, and the development of new measures requires validity to be established. The best measures are agreed upon by a field.
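As one concrete example of a widely agreed-upon check in this space, here is a minimal sketch of Egger's regression test for funnel-plot asymmetry (one common publication-bias diagnostic). The numbers are hypothetical, and this is not necessarily the adjustment either MA ran.

```python
# Minimal sketch: Egger's regression test with hypothetical data.
# Regress the standardized effect on its precision; an intercept far from
# zero suggests funnel-plot asymmetry (possible small-study / publication bias).
import numpy as np
from scipy import stats

d  = np.array([0.30, 0.02, -0.05, 0.21, 0.10, 0.15, 0.01, 0.40])
se = np.sqrt(np.array([0.010, 0.020, 0.015, 0.008, 0.025, 0.012, 0.030, 0.018]))

z = d / se            # standardized effect
prec = 1.0 / se       # precision

# Simple OLS: z = intercept + slope * precision
X = np.column_stack([np.ones_like(prec), prec])
beta, *_ = np.linalg.lstsq(X, z, rcond=None)
k, p = len(z), 2
sigma2 = np.sum((z - X @ beta) ** 2) / (k - p)
se_beta = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_int = beta[0] / se_beta[0]
p_int = 2 * stats.t.sf(abs(t_int), df=k - p)

print(f"Egger intercept = {beta[0]:.2f}, p = {p_int:.3f}")  # small p suggests asymmetry
```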
4) Meta-analyses should seek to explain heterogeneity using moderation analyses.

This requires more than one-variable-at-a-time models. Also, p > .05 does not mean that there is no variation: you cannot prove that the effect is constant.
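A minimal sketch of what that looks like: a meta-regression with two moderators fit jointly, with residual heterogeneity reported next to the coefficients. The moderator names "at_risk" and "targeted_outcome" are hypothetical labels, not the coding schemes used in either MA, and the weights here are the simple fixed-effect 1/v for brevity.

```python
# Minimal sketch: meta-regression with two moderators at once (hypothetical data).
import numpy as np

d = np.array([0.30, 0.02, -0.05, 0.21, 0.10, 0.15, 0.01, 0.40])
v = np.array([0.010, 0.020, 0.015, 0.008, 0.025, 0.012, 0.030, 0.018])
at_risk          = np.array([1, 0, 0, 1, 0, 1, 0, 1])   # hypothetical moderator
targeted_outcome = np.array([1, 0, 1, 1, 0, 0, 0, 1])   # hypothetical moderator

X = np.column_stack([np.ones_like(d), at_risk, targeted_outcome])
w = 1.0 / v

# Weighted least squares: solve (X'WX) beta = X'Wy
XtWX = X.T @ (w[:, None] * X)
beta = np.linalg.solve(XtWX, X.T @ (w * d))
se_beta = np.sqrt(np.diag(np.linalg.inv(XtWX)))

# Residual heterogeneity: even with moderators in the model, effects can still
# vary. Q_E much larger than its df means unexplained variation remains,
# regardless of whether any single moderator's p-value clears .05.
resid = d - X @ beta
Q_E, df_E = np.sum(w * resid ** 2), len(d) - X.shape[1]
print("coefficients:", np.round(beta, 2), " SEs:", np.round(se_beta, 2))
print(f"Q_E = {Q_E:.1f} on {df_E} df")
```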
Aside 2: These methodological choices matter.

MA1 created their own measure of quality, using criteria that are out of sync with the field*. They then concluded that the literature was ‘low quality’ and focused their conclusions around a very narrow subset. MA2 did not.
* e.g. 1: MA1 identifies studies that randomized classes or schools as 'low quality'. In education, these are the dominant RCT designs.

* e.g. 2: MA1's measure of financial conflict of interest (FCOI) is a ‘post-treatment’ variable. It measured *subsequent* success, after the study was published.
For those who are still reading: Be like MA2. If you’re going to conduct a large MA, with a lot of “heterogeneity in”, then it is your job to try to understand the “heterogeneity out.”
And please, can we stop with the “either/or”, “good/bad”, “yes/no” thinking about interventions? I’m tired of writing commentaries.
