Andrew Althouse @ADAlthousePhD
(THREAD) As requested/discussed yesterday, here are a few thoughts on post-hoc power
It is not uncommon for reviewers to ask for a “post hoc power” calculation. The most common reasons people ask about this are:
i) the main findings aren’t significant, and they want to know either a) what was the “observed power” (which we’ll discuss in a moment) or b) “given the observed effect size, how large would your study have needed to be for significance”
ii) the main findings are significant, which is an even more baffling time to ask about post-hoc power (yet, I’ve heard from a few folks that say reviewers have asked something like “your effect size seems really large…can you include a post-hoc power calculation?”)
This is all basically rooted in a misunderstanding of “power” and its relationship to statistical hypothesis testing (for now, we must work in the NHST paradigm; please put aside any problems with that)
1) Power is a pre-study design characteristic. It is the probability of rejecting the null hypothesis under a specified set of conditions (sample size, effect size, and estimates of variability).
2) The basic idea of power calculations: we should design studies with large enough sample sizes to have a fairly high probability (usually 80-90%) of rejecting the null hypothesis if an important relationship truly exists.
3) If the study power is sufficiently high, we can be reasonably confident that a “negative” result is in fact a “true negative” and not merely “negative” because the sample wasn’t large enough
4) However, once you have collected your data, the “power” is irrelevant. There is no “probability” of rejecting the null hypothesis any more. Either your data are strong enough evidence to reject the null hypothesis, or they aren’t.
5) Calculating “power” after the fact is generally unhelpful.
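To make points 1 and 2 concrete, here is a minimal sketch of a pre-study power calculation. It assumes a two-sided, two-sample comparison of means with equal group sizes and a known standard deviation (normal approximation); the function names and the example numbers (a 5-point difference, SD of 10) are made up for illustration and are not from any particular study.

```python
from math import ceil, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal


def power_two_sample(n_per_group, delta, sigma, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test (assumed design)."""
    se = sigma * sqrt(2.0 / n_per_group)      # SE of the difference in means
    z_crit = _N.inv_cdf(1 - alpha / 2)        # rejection threshold
    shift = delta / se                        # expected z under the assumed effect
    # Probability that the test statistic lands in either rejection region.
    return _N.cdf(shift - z_crit) + _N.cdf(-shift - z_crit)


def n_per_group_for_power(delta, sigma, power=0.80, alpha=0.05):
    """Smallest per-group n reaching the target power (ignores the far tail)."""
    z_crit = _N.inv_cdf(1 - alpha / 2)
    z_pow = _N.inv_cdf(power)
    return ceil(2 * ((z_crit + z_pow) * sigma / delta) ** 2)


# Hypothetical planning inputs: clinically relevant difference 5, SD 10.
n = n_per_group_for_power(delta=5.0, sigma=10.0, power=0.80)
print(n, round(power_two_sample(n, delta=5.0, sigma=10.0), 3))
```

Note that every input here (effect size, variability, alpha) is chosen before any data exist. That is what makes power a design characteristic.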
Now, let’s address the specific cases above in which people (well-intentioned, just misguided) may ask about post hoc power
Case 1: the main findings aren’t significant, and the reviewers want to know what the “observed power” is. This is usually a well-meaning attempt to determine if your study was underpowered by asking you to calculate the study power with your sample size and observed effect size
But here’s the problem: if you are reporting non-significant effects, your “observed” power will always be low (roughly 50% or less at the usual alpha of 0.05). As @lakens explains in his post on the subject, this is nothing more than reporting the p-value a different way.
The observed effect size was not large enough to conclude that the main effect was “significant” at the recruited sample size. So…it’s a guarantee that “observed” power is going to be low (some people will make a snarky-but-accurate joke that the observed power is zero).
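A rough sketch of why “observed” power is just the p-value in disguise: under a two-sided z-test (an assumption made here for simplicity; `observed_power` is a hypothetical helper, not a library function), plugging the observed effect back in as the “true” effect gives a number that depends only on p and alpha.

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal


def observed_power(p_value, alpha=0.05):
    """'Observed power' of a two-sided z-test, computed from the p-value alone."""
    z_obs = _N.inv_cdf(1 - p_value / 2)   # |z| implied by the two-sided p-value
    z_crit = _N.inv_cdf(1 - alpha / 2)    # rejection threshold
    # Treat the observed z as if it were the true standardized effect.
    return _N.cdf(z_obs - z_crit) + _N.cdf(-z_obs - z_crit)


for p in (0.05, 0.10, 0.20, 0.50, 0.90):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.3f}")
```

No sample size, effect size, or variance appears as an input. Under these assumptions, a larger p-value always maps to a lower observed power, and p = alpha maps to about 50%.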
Case 2: the main findings aren’t significant, and the reviewers want to know “given the observed effect size, how many subjects would you have needed for the results to be significant”
Again: this certainly shouldn’t be done using the observed effect size; the answer is basically just going to be “more than we had”
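As a sketch of why the answer is always “more than we had”: under the same two-sided z-test approximation (an assumption; `inflation_factor` is a hypothetical name), the sample size “needed” for a given power at the observed effect is the actual sample size times a factor that depends only on the p-value.

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal


def inflation_factor(p_value, power=0.80, alpha=0.05):
    """Ratio n_needed / n_actual when the observed effect is taken as the truth.

    Valid for 0 < p_value < 1 (p_value = 1 implies an observed effect of zero).
    """
    z_obs = _N.inv_cdf(1 - p_value / 2)   # |z| implied by the two-sided p-value
    z_crit = _N.inv_cdf(1 - alpha / 2)
    z_pow = _N.inv_cdf(power)
    # n scales with 1 / effect^2, so the ratio reduces to a ratio of z-values.
    return ((z_crit + z_pow) / z_obs) ** 2


for p in (0.06, 0.20, 0.50):
    print(f"p = {p:.2f} -> need about {inflation_factor(p):.1f}x the sample")
```

For any non-significant result the factor is greater than 1, so the calculation tells the reader nothing the p-value had not already said.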
Maybe you can argue that if no power calculation was presented, the authors should be asked to include one that shows what effect size their study would have been powered to detect, or what the power calc would have looked like for some clinically relevant effect
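The more defensible request can be sketched like this: given the sample size the study actually had, what difference would it have been powered to detect? This assumes a two-sided, two-sample comparison of means with equal groups and a known SD; the function name and the numbers (50 per group, SD of 10) are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal


def detectable_difference(n_per_group, sigma, power=0.80, alpha=0.05):
    """Smallest true difference in means detectable with the given power."""
    z_crit = _N.inv_cdf(1 - alpha / 2)
    z_pow = _N.inv_cdf(power)
    se = sigma * sqrt(2.0 / n_per_group)   # SE of the difference in means
    return (z_crit + z_pow) * se


# Hypothetical study: 50 per group, outcome SD of 10.
print(round(detectable_difference(50, 10.0), 2))
```

The key design choice is that the observed effect never enters the calculation. The result is then compared against a clinically relevant effect chosen on subject-matter grounds.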
Case 3: the main findings are significant, and for some reason the reviewer thinks they need to ask about power (sometimes because they just expect to see a power calculation in every paper, sometimes out of misguided logic like “that’s a big effect, can you do a post hoc power calculation?”)
As described above, the power is the probability of rejecting the null hypothesis under a specified set of conditions (sample size, effect size, and estimates of variability).
If your data provide evidence to reject the null hypothesis at your specified alpha level, the power is irrelevant. You already have a large enough sample and effect size that the null hypothesis is rejected.
The power is a moot point. One can argue that the precision of your estimate is worth discussing (e.g. if someone believes you have an unreasonably large effect in a fairly small sample, and wants to argue whether that’s a reliable estimate), but...
...that is not meaningfully informed by a “power” calculation. The confidence intervals and other study characteristics are more informative in that regard.
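For illustration, a minimal sketch of what a confidence interval shows that a power number does not. It assumes a difference in means with equal groups and a known SD (normal approximation); `diff_ci` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal


def diff_ci(mean_a, mean_b, sigma, n_per_group, level=0.95):
    """Normal-approximation CI for a difference in means (assumed known SD)."""
    diff = mean_a - mean_b
    se = sigma * sqrt(2.0 / n_per_group)
    half_width = _N.inv_cdf(0.5 + level / 2) * se
    return diff - half_width, diff + half_width


# Hypothetical small study: 8 per group, SD 10, observed difference of 12.
lo, hi = diff_ci(mean_a=62.0, mean_b=50.0, sigma=10.0, n_per_group=8)
print(f"difference = 12.0, 95% CI = ({lo:.1f}, {hi:.1f})")
```

In this made-up case the interval excludes zero, so the result is “significant”, but its width makes plain how imprecise the estimate is. That is the conversation the reviewer actually wants to have.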
One other brief comment: power and sample size calculations are nearly always done in prospective trial planning.
They are less common in observational studies (especially retrospective studies) because very often the sample is just “as much data as we could get our hands on” and there was/is no realistic possibility of getting more.
So in some cases when you’re asked for a post-hoc power calculation, it’s just because the reviewer may think all studies need a power calculation and wants to show off their methodological rigor by asking something about “power”
But as described above, in most cases, it’s unclear what you can actually do with that information. Most people who ask for it (probably) think they’ll see a low observed power and therefore insert comments that the study was underpowered.
In some cases that will have some merit. People shouldn’t see a p>0.05 and assume that means “no effect” out of hand. It’s just that adding a post-hoc power calculation probably isn’t much additional help
Hope this was useful. For some pretty graphs and technical details, see this very nice post from @lakens
And this post from @StatModeling