Nicolas Sommet 🇺🇦
Sep 8, 2022 • 15 tweets • 7 min read
Power analysis for #interactions can be tough!

📢 Our new preprint features:
1. An intuitive taxonomy of 12 types of interaction
...with the Ns to reach power = .80/.90
2. A 😭 meta-study
3. Simulations testing 3 ways to ↗️ power
4. A cool web app!

🧵

osf.io/xhe3u/
๐Ÿญ๐—ฎ As we know from popular blogs/papers, power analyses differ b/w main effects & interactions because:

๐Ÿ‘‰a main effect corresponds to a difference b/w means

๐Ÿ‘‰a two-way interaction corresponds to a difference b/w mean subdifferences

(using simple b/w-Ss designs as examples) Image
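To make the two definitions concrete, here is a minimal sketch for a 2×2 between-subjects design. The cell means and the function names (`main_effect_of_b`, `interaction`) are made up for illustration and are not taken from the preprint.

```python
# Hypothetical cell means for a 2x2 between-subjects design
# (factor A: a1/a2, factor B: b1/b2), in standard-deviation units.
means = {
    ("a1", "b1"): 0.35, ("a1", "b2"): 0.00,
    ("a2", "b1"): 0.00, ("a2", "b2"): 0.00,
}

def main_effect_of_b(m):
    """Difference between the two marginal means of B (averaged over A)."""
    mean_b1 = (m[("a1", "b1")] + m[("a2", "b1")]) / 2
    mean_b2 = (m[("a1", "b2")] + m[("a2", "b2")]) / 2
    return mean_b1 - mean_b2

def interaction(m):
    """Difference between the two simple effects of B (one per level of A)."""
    simple_a1 = m[("a1", "b1")] - m[("a1", "b2")]
    simple_a2 = m[("a2", "b1")] - m[("a2", "b2")]
    return simple_a1 - simple_a2

print(main_effect_of_b(means), interaction(means))
```

The main effect compares two means, each averaged over two cells. The interaction compares two differences, each built from two single cells, so it rests on four noisy estimates instead of two.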
๐Ÿญ๐—ฏ Thus, when running a power analysisโ€ฆ

โœ… It is OK to use a generic value to define the expected effect size of a main effect (e.g., a medium-sized difference of ๐˜ฅ = 0.35)

โŒ But it is NOT OK to use a generic value to define the expected effect size of an interaction Image
๐Ÿญ๐—ฐ To determine the type of interaction you expect, we argue that you must answer two Qs:

๐—ค๐Ÿญ What is the expected shape of my interaction?
โžก๏ธReversed? Fully attenuated? Partially attenuated?

๐—ค๐Ÿฎ What are the expected sizes of my simple slopes?
โžก๏ธSmall? Medium? Large? Image
๐Ÿญ๐—ฑ This results in 12 basic types of interactions.

๐Ÿ‘‡ see Table ๐Ÿ‘‡

E.g., a โ€œ0.35 | 0.00 fully attenuated interactionโ€ (in red) involves a medium-sized simple slope & a null simple slope. If such an interaction is true, ๐˜• = 1,024 will give you an 80% probability to detect it. Image
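The N = 1,024 figure can be roughly recovered by hand. This is a back-of-the-envelope sketch, not the preprint's simulation code. It assumes a 2×2 between-subjects design with equal cell sizes, unit standard deviations, a two-tailed α = .05 (the thread does not state α), and a normal approximation to the test statistic. The function names are hypothetical.

```python
from statistics import NormalDist

def z_needed(power=0.80, alpha=0.05):
    """Sum of the two-tailed critical z and the z for the target power."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)

def n_main_effect(d, power=0.80, alpha=0.05):
    """Approximate total N to detect a difference of d between two groups of N/2."""
    # SE of the difference between two group means is 2 / sqrt(N).
    return (2 * z_needed(power, alpha) / d) ** 2

def n_interaction(slope_1, slope_2, power=0.80, alpha=0.05):
    """Approximate total N to detect a 2x2 interaction with four cells of N/4."""
    # The interaction is the difference between the two simple slopes;
    # its SE is 4 / sqrt(N), twice that of a two-group difference.
    return (4 * z_needed(power, alpha) / abs(slope_1 - slope_2)) ** 2

print(round(n_main_effect(0.35)))        # main effect of d = 0.35
print(round(n_interaction(0.35, 0.00)))  # "0.35 | 0.00" fully attenuated
```

Under these assumptions the interaction lands within a few participants of 1,024, about four times the N needed for a main effect of the same size.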
๐Ÿฎ๐—ฎ From there, we wanted to know how researchers handle power analysis when having an interaction hypothesis.

We ran a prereg meta-study & built a sample of 159 studies testing interactions published 10 influential psychology journals.

Three (kinda depressing) conclusions. ImageImageImage
๐Ÿฎ๐—ฏ Conclusions #1 ๐Ÿ™

The majority of the studies in the lit test partially attenuated interactions (the most difficult to detect) Image
๐Ÿฎ๐—ฐ Conclusions #2 โ˜น๏ธ

Less than 5% of the studies report an adequate power analysis (many use an inadequate generic value to define the expected effect size of the interaction) Image
๐Ÿฎ๐—ฑ Conclusions #3๐Ÿ˜ข

The overall median power to detect a medium-sized interaction of a given shape is .18. Image
๐Ÿฏ๐—ฎ From there, we wanted to find solutions to the problem of power when testing interactions.

We ran zillions of simulations to generate power curves for our 12 types of interaction & tested ways to increase power without increasing ๐˜•.

Three (rather comforting) strategies. Image
๐Ÿฏ๐—ฏ Strategy #1 ๐Ÿ™‚

๐ŸŸฆIf preregistering a one-tailed test (rather than using a two-tailed test), 21% fewer participants are needed to reach a power of .80 (blue curves) Image
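The 21% figure is what a simple normal approximation gives too. The sketch below assumes α = .05 and a z-test; the function name is hypothetical and this is not the authors' simulation. Required N scales with the square of (critical z + power z), so only the critical value changes between the two tests.

```python
from statistics import NormalDist

def one_tailed_saving(power=0.80, alpha=0.05):
    """Proportion of participants saved by a one-tailed vs. two-tailed test."""
    z = NormalDist()
    two_tailed = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    one_tailed = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    # N is proportional to the squared sum of z-values, whatever the effect size.
    return 1 - (one_tailed / two_tailed) ** 2

print(f"{one_tailed_saving():.0%}")
```

The saving does not depend on the effect size or the design, only on α and the target power.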
๐Ÿฏ๐—ฐ Strategy #2 ๐Ÿ˜€

๐ŸŸฉIf using a mixed design* (rather than a between-participant design), 75% fewer participants are needed to reach a power of .80 (green curves)

*assuming a conservative between-measurements correlation of ฯ = .50 Image
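The 75% figure follows from comparing the variance of the interaction estimate in the two designs. This sketch assumes a 2×2 design where the mixed version has one between-participant factor and one within-participant factor, with equal group sizes and unit variances. The function name is hypothetical.

```python
def mixed_design_saving(rho):
    """Proportion of participants saved by a 2x2 mixed design.

    Variances of the interaction estimate are in units of sigma^2 / N,
    where N is the total number of participants.
    """
    # Between design: four cells of N/4, one observation per participant.
    var_between = 16.0
    # Mixed design: two groups of N/2; each participant yields a difference
    # score with variance 2 * (1 - rho), and the two groups are compared.
    var_mixed = 8.0 * (1 - rho)
    # For equal power, required N is proportional to this variance.
    return 1 - var_mixed / var_between

print(mixed_design_saving(0.50))
```

With ρ = .50 the saving is 75%. Even with ρ = 0 the mixed design halves the required N, because every participant contributes two measurements.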
๐Ÿฏ๐—ฑ Strategy #3 ๐Ÿ˜ƒ

๐ŸŸจ If using a planned contrast analysis* (rather than the orthodox factorial approach), 60% fewer participants are needed to reach a power of .80 (yellow curves)

*only applies to fully attenuated interactions Image
4. Finally, we developed INT×Power, a user-friendly web application that enables researchers to draw their interaction & determine the sample size needed to reach a power of .80 with & without using these three strategies.

The beta version of the app:
👉 intxpower.com
THANKS for reading this long thread

The preprint (osf.io/xhe3u/) is not submitted yet, so comments, suggestions, & criticisms are welcome and will be considered (feel free to email me).

I mean, let's be honest, there's probably at least ONE mistake in this appendix 🙃
