Power analysis for #interactions can be tough!
📢 Our new preprint features:
1. An intuitive taxonomy of 12 types of interaction
...with the Ns to reach power = .80/.90
2. A meta-study
3. Simulations testing 3 ways to increase power
4. A cool web app!
🧵
osf.io/xhe3u/
1.2 As we know from popular blogs/papers, power analyses differ b/w main effects & interactions because:
• a main effect corresponds to a difference b/w means
• a two-way interaction corresponds to a difference b/w mean subdifferences
(using simple between-participants designs as examples)
1.3 Thus, when running a power analysis…
✅ It is OK to use a generic value to define the expected effect size of a main effect (e.g., a medium-sized difference of d = 0.35)
❌ But it is NOT OK to use a generic value to define the expected effect size of an interaction
1.4 To determine the type of interaction you expect, we argue that you must answer two Qs:
🤔 1. What is the expected shape of my interaction?
➡️ Reversed? Fully attenuated? Partially attenuated?
🤔 2. What are the expected sizes of my simple slopes?
➡️ Small? Medium? Large?
1.5 This results in 12 basic types of interactions.
See Table.
E.g., a “0.35 | 0.00 fully attenuated interaction” (in red) involves a medium-sized simple slope & a null simple slope. If such an interaction is true, N = 1,024 will give you an 80% probability of detecting it.
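To make the sample-size figure above less of a black box, here is a rough sketch of where a number of that size can come from. It is not the preprint's method: it assumes a 2×2 between-participants design with equal cell sizes, slopes expressed in SD units, a two-tailed test at alpha = .05, and a normal approximation to the test statistic. The function name `total_n_between_2x2` is made up for illustration.

```python
from statistics import NormalDist


def total_n_between_2x2(d_slope_a, d_slope_b, power=0.80, alpha=0.05):
    """Approximate total N to detect a 2x2 between-participants interaction.

    Assumptions (not taken from the preprint): equal cell sizes, slopes in
    SD units, two-tailed test, normal approximation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_power = z.inv_cdf(power)
    # The interaction is the difference between the two simple slopes.
    diff = d_slope_a - d_slope_b
    # A difference of differences over four cells of size n has variance 4/n.
    n_per_cell = 4 * (z_alpha + z_power) ** 2 / diff ** 2
    return 4 * n_per_cell


n_total = total_n_between_2x2(0.35, 0.00)
print(round(n_total))
```

Under these assumptions the result lands within a couple of participants of the 1,024 quoted for the "0.35 | 0.00" case; small gaps are expected because the approximation ignores the t distribution and rounding to whole cells.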
2.2 From there, we wanted to know how researchers handle power analysis when they have an interaction hypothesis.
We ran a preregistered meta-study & built a sample of 159 studies testing interactions published in 10 influential psychology journals.
Three (kinda depressing) conclusions.
2.3 Conclusion #1
The majority of the studies in the lit test partially attenuated interactions (the most difficult to detect)
2.4 Conclusion #2 ☹️
Less than 5% of the studies report an adequate power analysis (many use an inadequate generic value to define the expected effect size of the interaction)
2.5 Conclusion #3 😢
The overall median power to detect a medium-sized interaction of a given shape is .18.
3.2 From there, we wanted to find solutions to the problem of power when testing interactions.
We ran zillions of simulations to generate power curves for our 12 types of interaction & tested ways to increase power without increasing N.
Three (rather comforting) strategies.
3.3 Strategy #1
🟦 If preregistering a one-tailed test (rather than using a two-tailed test), 21% fewer participants are needed to reach a power of .80 (blue curves)
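A back-of-the-envelope check of the one-tailed saving, as a sketch only: it assumes alpha = .05 and a normal approximation, under which the required N scales with the squared sum of the critical value and the power quantile. The thread's figure comes from simulations; the function name here is hypothetical.

```python
from statistics import NormalDist


def n_ratio_one_vs_two_tailed(power=0.80, alpha=0.05):
    """Ratio of required N (one-tailed / two-tailed), normal approximation."""
    z = NormalDist()
    z_power = z.inv_cdf(power)
    two_tailed = (z.inv_cdf(1 - alpha / 2) + z_power) ** 2
    one_tailed = (z.inv_cdf(1 - alpha) + z_power) ** 2
    return one_tailed / two_tailed


saving = 1 - n_ratio_one_vs_two_tailed()
print(f"{saving:.0%} fewer participants")
```

The ratio does not depend on the effect size, which is consistent with a single percentage being quoted for every interaction type.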
3.4 Strategy #2
🟩 If using a mixed design* (rather than a between-participant design), 75% fewer participants are needed to reach a power of .80 (green curves)
*assuming a conservative between-measurements correlation of ρ = .50
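One way to see why the correlation matters so much, sketched under assumptions that are mine rather than the preprint's: a 2×2 design where one factor is measured within participants, equal variances in every condition, equal group sizes, and a single correlation ρ between the two measurements. The interaction is then a between-group difference in within-participant difference scores, and the required N relative to a fully between-participants design works out to (1 − ρ) / 2.

```python
def mixed_vs_between_n_ratio(rho):
    """Required N for a mixed 2x2 relative to a between-participants 2x2.

    Variances are in SD^2 units and multiplied by total N so that N cancels.
    """
    # Between design: four cells of N/4, difference of differences -> 16 / N.
    var_between_times_n = 16.0
    # Mixed design: each difference score has variance 2 * (1 - rho);
    # comparing two groups of N/2 gives 2 * 2 * (1 - rho) / (N / 2).
    var_mixed_times_n = 8.0 * (1.0 - rho)
    return var_mixed_times_n / var_between_times_n


ratio = mixed_vs_between_n_ratio(0.50)
print(f"{1 - ratio:.0%} fewer participants at rho = .50")
```

Under these assumptions ρ = .50 gives a ratio of 0.25, in line with the 75% quoted above; higher correlations would shrink the required N further.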
3.5 Strategy #3
🟨 If using a planned contrast analysis* (rather than the orthodox factorial approach), 60% fewer participants are needed to reach a power of .80 (yellow curves)
*only applies to fully attenuated interactions
4. Finally, we developed INT×Power, a user-friendly web application that enables researchers to draw their interaction & determine the sample size needed to reach a power of .80 with & without using these three strategies.
The beta version of the app:
intxpower.com
THANKS for reading this long thread
The preprint (osf.io/xhe3u/) is not submitted yet, so comments, suggestions, & criticisms are welcome and will be considered (feel free to email me).
I mean, let's be honest, there's probably at least ONE mistake in this appendix