📢 Our new preprint features:
1. An intuitive taxonomy of 12 types of interaction
...with the Ns to reach power = .80/.90
2. A meta-study
3. Simulations testing 3 ways to increase power
4. A cool web app!
1.2 As we know from popular blogs/papers, power analyses differ b/w main effects & interactions because:
• a main effect corresponds to a difference b/w means
• a two-way interaction corresponds to a difference b/w mean subdifferences
(using simple b/w-Ss designs as examples)
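To make the contrast concrete, here is a sketch for a 2 × 2 between-participants design. The cell-mean notation (μ with row and column indices for factors A and B) is an assumption made here for illustration, not notation taken from the preprint:

```latex
\begin{align}
\text{main effect of } A &= \frac{\mu_{11} + \mu_{12}}{2} - \frac{\mu_{21} + \mu_{22}}{2} \\
\text{interaction } A \times B &= (\mu_{11} - \mu_{12}) - (\mu_{21} - \mu_{22})
\end{align}
```

The second line is a difference between two simple effects, so its size depends on both of them, not on a single generic value.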
1.3 Thus, when running a power analysis…
✅ It is OK to use a generic value to define the expected effect size of a main effect (e.g., a medium-sized difference of d = 0.35)
❌ But it is NOT OK to use a generic value to define the expected effect size of an interaction
1.4 To determine the type of interaction you expect, we argue that you must answer two Qs:
Q1 What is the expected shape of my interaction?
➡️ Reversed? Fully attenuated? Partially attenuated?
Q2 What are the expected sizes of my simple slopes?
➡️ Small? Medium? Large?
1.5 This results in 12 basic types of interactions.
See the table.
E.g., a “0.35 | 0.00 fully attenuated interaction” (in red) involves a medium-sized simple slope & a null simple slope. If such an interaction is true, N = 1,024 will give you an 80% probability of detecting it.
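As a rough sanity check on that figure, here is a minimal sketch using a normal approximation. It assumes a 2 × 2 between-participants design with equal cell sizes, a two-tailed α of .05, and simple slopes expressed as Cohen's d. The function name `total_n_between` is hypothetical, and this is not necessarily how the preprint computed its numbers:

```python
from statistics import NormalDist


def total_n_between(d1, d2, power=0.80, alpha=0.05):
    """Approximate total N to detect a 2x2 between-participants interaction.

    d1, d2: standardized simple slopes (Cohen's d).
    Assumptions: normal approximation, equal cell sizes, two-tailed test.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    z_power = z(power)
    # The interaction contrast is the difference between the two simple slopes.
    delta = abs(d1 - d2)
    # The contrast (+1, -1, -1, +1) has variance 4 / n_cell in sigma^2 units.
    n_cell = 4 * (z_alpha + z_power) ** 2 / delta ** 2
    return 4 * n_cell


n = total_n_between(0.35, 0.00)
print(round(n))  # lands within a couple of participants of the N = 1,024 above
```

The same function shows why a generic value misleads: a 0.35 | 0.175 partially attenuated interaction halves the contrast, which quadruples the required N.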
2.2 From there, we wanted to know how researchers handle power analysis when they have an interaction hypothesis.
We ran a prereg meta-study & built a sample of 159 studies testing interactions published in 10 influential psychology journals.
Three (kinda depressing) conclusions.
2.3 Conclusion #1
The majority of the studies in the lit test partially attenuated interactions (the most difficult to detect)
2.4 Conclusion #2 ☹️
Less than 5% of the studies report an adequate power analysis (many use an inadequate generic value to define the expected effect size of the interaction)
2.5 Conclusion #3 😢
The overall median power to detect a medium-sized interaction of a given shape is .18.
3.2 From there, we wanted to find solutions to the problem of power when testing interactions.
We ran zillions of simulations to generate power curves for our 12 types of interaction & tested ways to increase power without increasing N.
Three (rather comforting) strategies.
3.3 Strategy #1
🟦 If preregistering a one-tailed test (rather than using a two-tailed test), 21% fewer participants are needed to reach a power of .80 (blue curves)
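The size of that saving can be approximated with a short calculation. This is a sketch under a normal approximation with α = .05; the function name `one_tailed_saving` is hypothetical, and the preprint's own figure comes from its simulations:

```python
from statistics import NormalDist


def one_tailed_saving(power=0.80, alpha=0.05):
    """Fraction of participants saved by a one-tailed instead of a two-tailed test.

    Assumption: normal approximation, where required N scales with
    (z_alpha + z_power) ** 2 for a fixed effect size.
    """
    z = NormalDist().inv_cdf
    n_two_tailed = (z(1 - alpha / 2) + z(power)) ** 2
    n_one_tailed = (z(1 - alpha) + z(power)) ** 2
    return 1 - n_one_tailed / n_two_tailed


print(f"{one_tailed_saving():.0%}")  # roughly one fifth fewer participants
```

Because the effect size cancels out of the ratio, the saving under this approximation is the same for every type of interaction.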
3.4 Strategy #2
🟩 If using a mixed design* (rather than a between-participant design), 75% fewer participants are needed to reach a power of .80 (green curves)
*assuming a conservative between-measurements correlation of ρ = .50
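Where the 75% comes from can be sketched analytically. The sketch assumes a 2 × 2 design in which one factor becomes within-participant, equal variances, the same correlation ρ between the two measurements in both groups, and a normal approximation. The function name `mixed_to_between_ratio` is hypothetical:

```python
def mixed_to_between_ratio(rho):
    """Ratio of total N (mixed design) to total N (between design), same power.

    Between design: 4 cells of n_cell each; the interaction contrast has
    variance 4 / n_cell (sigma^2 units), so total N is proportional to 4 * 4.
    Mixed design: 2 groups of n_group each; difference scores have variance
    2 * (1 - rho), so the contrast has variance 4 * (1 - rho) / n_group and
    total N is proportional to 2 * 4 * (1 - rho).
    """
    n_between = 4 * 4.0
    n_mixed = 2 * 4.0 * (1 - rho)
    return n_mixed / n_between


saving = 1 - mixed_to_between_ratio(0.50)
print(f"{saving:.0%}")  # 75% with rho = .50
```

Under these assumptions the ratio simplifies to (1 − ρ) / 2, so even ρ = 0 halves the required N, and higher correlations save more.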
3.5 Strategy #3
🟨 If using a planned contrast analysis* (rather than the orthodox factorial approach), 60% fewer participants are needed to reach a power of .80 (yellow curves)
*only applies to fully attenuated interactions
4. Finally, we developed INT×Power, a user-friendly web application that enables researchers to draw their interaction & determine the sample size needed to reach a power of .80 with & without using these three strategies.
At the top U.S. universities in psychology, there are 17 Democrats for every 1 Republican.
Some argue that this kind of political imbalance is not a cause for concern.
Others believe that it poses an existential threat to the field.
🧵 A dialectical thread 🧵
THESIS
Two studies suggest that the political orientation of researchers or their studies has little effect on research outcomes.
THESIS · Study 1/2
@BreznauNate et al. gave the same dataset to 73 research teams & asked them to test the following politically charged research question: Does immigration reduce support for social welfare policies?
@dom_muller @mjbsp @FGabarrot @cedricbatailler I get your point & that of @AntalHaans/@seriousstats. FYI, we intend to adjust the wording of the piece to be more precise & add a subsection on interaction ES. That being said, I don't think there are any mathematical flaws in the preprint or any confound in the sims. 1/5
@dom_muller @mjbsp @FGabarrot @cedricbatailler @AntalHaans @seriousstats First, I agree that: 1) power calculation is the same for main effects & interactions (eq. 1) and 2) ES calculation is essentially the same for main effects & interactions (eqs. 2 & 3, respectively).
(note that the overall interaction ES is displayed by the web app) 2/5
@dom_muller @mjbsp @FGabarrot @cedricbatailler @AntalHaans @seriousstats Second, as an Editor, you surely agree that, despite these formulas being found in most stat textbooks, people often overestimate the expected ES of their interactions by calculating, e.g., “the N to detect a medium-sized [partially attenuated] interaction of d = 0.35 (sic).” 3/5
The Spirit Level has been cited ≈10K times (≈700 times in 2018).
The book is straightforward: It uses cross-sectional data to show negative effects of #IncomeInequality on health.
The problem: It does NOT hold up to scrutiny.
Thread:
#1 Cherry-picking.
In The Spirit Level, some countries are excluded from the analysis without justification. When including these countries and using the latest estimates available, the core findings of the book disappear. [2/5]
#2 A second bite at the cherry
Papers using large survey data with many more countries fail to reproduce the findings. E.g., Jen et al. show that income inequality actually reduces the chances of reporting poor health (especially in developing countries). [3/5]