1/34 Okay #medtwitter #epitwitter, read on for an #EBM #Tweetorial on p-values, with specific attention to the implications of the recent remdesivir trial with p=0.059 for mortality (full report still not published, which is not ideal …).
2/ This is a follow-up to my prior #EBM #Tweetorial on diagnostic test performance study design.
3/ Again, who am I to do this? My PhD is in #biostatistics, I direct the @MayoClinicSOM #EBM curriculum, and I teach Bayesian Diagnostic Testing Strategies @MayoGradSchool @MayoClinic @MayoMedEd @MayoFacDev.
4/ Also again, this #Tweetorial is a little wonkish but really not so complicated. And anyway, when someone says “wonkish”, remember what I hear:
5/ To begin, let’s go back to the 2x2 table for diagnostic test performance.
6/ The “truth” is represented by the columns, and the test results are represented by the rows. From here, we have standard definitions for sensitivity, specificity, etc.
7/ If we consider a scenario with prevalence (pre-test probability) 10%, sensitivity 80%, and specificity 95%, the table looks like this:
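Since the table images may not come through in this unroll, here is a minimal Python sketch of the counts those numbers imply per 1,000 patients (the variable names are mine):

```python
# 2x2 diagnostic-test table per 1,000 patients, using the numbers above:
# prevalence 10%, sensitivity 80%, specificity 95%.
n = 1000
prevalence, sensitivity, specificity = 0.10, 0.80, 0.95

diseased = n * prevalence                # 100 patients with the disease
healthy = n - diseased                   # 900 without it

true_pos = diseased * sensitivity        #  80  (test +, disease +)
false_neg = diseased - true_pos          #  20  (test -, disease +)
true_neg = healthy * specificity         # 855  (test -, disease -)
false_pos = healthy - true_neg           #  45  (test +, disease -)

ppv = true_pos / (true_pos + false_pos)  # 80/125 = 0.64
npv = true_neg / (true_neg + false_neg)  # 855/875 ≈ 0.977
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

(Note the positive predictive value here, 80/125 = 64%; the same arithmetic shows up again below.)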
8/ All good so far? Everybody with me? What the heck does this have to do with p-values, you may wonder.
9/ Well, since you asked, here is the 2x2 table for statistical hypothesis testing:
10/ The type I error rate is Prob(test rejects the null hypothesis given that it is in fact true). But this should look familiar from the 2x2 table!
11/ The type I error rate, alpha, is analogous to 1-specificity of a diagnostic test.
12/ Similarly, the type II error rate, beta, is Prob(test does not reject the null hypothesis given that it is in fact false). This is analogous to 1-sensitivity for a diagnostic test.
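In code, the analogy is just a relabeling (a sketch; the variable names are mine, and the 5% and 20% values are the conventional ones used a couple of tweets below):

```python
# Hypothesis-test error rates restated in diagnostic-test language.
# Convention: "disease present" corresponds to "the null hypothesis is false".
alpha = 0.05      # type I error rate  ~ 1 - specificity (false positive rate)
beta = 0.20       # type II error rate ~ 1 - sensitivity (false negative rate)
power = 1 - beta  # the test's "sensitivity": P(reject H0 | H0 is false) = 0.80
print(f'"specificity" analog = {1 - alpha:.2f}, "sensitivity" (power) analog = {power:.2f}')
```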
13/ But just as for diagnostic tests, what we really want to know is along the rows, not the columns. For example, given a statistical test result, what is the probability the null hypothesis is in fact false (e.g., remdesivir reduces mortality)?
14/ To illustrate this, we can use common (though arbitrary) values for the type I error rate, alpha (5%), and the type II error rate, beta (20%). Here, I’ll assume an effect we are moderately skeptical about, with the null hypothesis 90% likely to be true:
15/ So what is the probability the null hypothesis is false (i.e., there is an effect) if the test rejects it (p<0.05)? Uh oh … only 80/125 = 64%. There’s still a 36% chance the null hypothesis is true if p<0.05!
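Here is that calculation per 1,000 hypothetical studies, in the same sketch form as before (assumptions as stated: 90% prior on the null, alpha 5%, power 80%):

```python
# Per 1,000 hypothetical studies: prior P(null true) = 90%,
# alpha = 0.05 (type I error), power = 0.80 (i.e., beta = 0.20).
n = 1000
p_null_true = 0.90
alpha, power = 0.05, 0.80

null_true = n * p_null_true             # 900 studies with no real effect
null_false = n - null_true              # 100 studies with a real effect

true_rejections = null_false * power    #  80 (real effect, p < 0.05)
false_rejections = null_true * alpha    #  45 (no effect, p < 0.05 anyway)

p_effect_given_reject = true_rejections / (true_rejections + false_rejections)
print(f"P(real effect | p < 0.05) = {p_effect_given_reject:.0%}")  # 64%
```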
16/ To extend this from a preset type I error rate to a particular observed p-value, imagine we set the type I error rate equal to the observed p-value. So let’s get back to remdesivir.
17/ The p-value for the observed mortality reduction from 11.6% to 8.0% in the as-yet-unpublished study was 0.059. NIH Clinical Trial Shows Remdesivir Accelerates Recovery from Advanced COVID-19 | NIH: National Institute of Allergy and Infectious Diseases niaid.nih.gov/news-events/ni…
18/ Consider the definition and proper usage of p-values from this REALLY IMPORTANT PAPER. Full article: The ASA Statement on p-Values: Context, Process, and Purpose amstat.tandfonline.com/doi/full/10.10…
19/ “a p-value is the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.”
20/ This means Prob(observing data as or more extreme GIVEN THAT in truth there is no effect of remdesivir on mortality) is 0.059, or 5.9%.
21/ This is conditioned on the truth being no effect, like specificity and not having the disease. But this isn’t what we care about. We want to know Prob(in truth there is an effect GIVEN the data we observed), conditioned on the data … cue Bayes’ Theorem.
22/ To get to this, we have to set a prior probability that remdesivir will reduce mortality. I am not a virologist, but many experts have pointed to a paucity of evidence for such an effect in the treatment of other coronaviruses.
23/ So let’s say 10%, keep the type II error rate at 20%, and apply a type I error rate of 5.9% to match the p-value. Then the 2x2 table looks like this:
24/ With p=0.059, the probability remdesivir actually reduces mortality is …
25/ 60.1%! So it carries nothing like the meaning implied by the “almost statistically significant” claims many have made; it’s much closer to the flip of a coin.
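Written out as Bayes’ theorem, under the assumptions above (10% prior, 80% power, and the type I error rate set to the observed p-value), the arithmetic is:

```python
# Bayes' theorem with the remdesivir assumptions from this thread:
# prior P(effect) = 0.10, power = 0.80, alpha set to the observed p = 0.059.
prior_effect = 0.10
power = 0.80   # P(reject | effect), assuming beta = 0.20
alpha = 0.059  # P(reject | no effect), set to the observed p-value

p_reject = power * prior_effect + alpha * (1 - prior_effect)
posterior_effect = power * prior_effect / p_reject
print(f"P(remdesivir reduces mortality | this result) = {posterior_effect:.1%}")  # ~60.1%
```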
26/ Don’t like the 10% pre-study probability assumption?
1% pre-study prob ➡️ post-study prob 12%
25% pre-study ➡️ post-study 82%
50% pre-study ➡️ post-study 93%
Someone can develop an app/graph for different p-values, pre-study probs, and type II error settings, but for me …
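Until someone builds that app, a minimal sweep over pre-study probabilities (same assumptions: 80% power, alpha set to p = 0.059) reproduces the numbers above:

```python
# Posterior P(effect | this result) across different pre-study probabilities,
# holding power = 0.80 and alpha = observed p = 0.059.
power, alpha = 0.80, 0.059

for prior in (0.01, 0.10, 0.25, 0.50):
    posterior = power * prior / (power * prior + alpha * (1 - prior))
    print(f"pre-study {prior:.0%} -> post-study {posterior:.0%}")
# pre-study 1% -> post-study 12%
# pre-study 10% -> post-study 60%
# pre-study 25% -> post-study 82%
# pre-study 50% -> post-study 93%
```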
27/ And, of course, in Bayesian fashion we should be adjusting these probabilities as each new piece of evidence comes in.
28/ For example, there is this study from China in @TheLancet, which should then decrease our estimate of remdesivir’s likelihood of reducing mortality in severe #COVID19. thelancet.com/journals/lance…
29/ To be clear, I would love for remdesivir to work. However, based on prior probability estimates and current data, I think there is most likely a substantially less than 50% chance that it actually reduces mortality in severely ill #COVID19 patients.
30/ On the other hand, the fact that p=0.059 also doesn’t mean remdesivir definitively failed and should now be ignored – statistical test interpretation is not a simple binary process, as this discussion has hopefully illustrated.
31/ Want to know more? Check out this provocative classic paper from Dr. John Ioannidis @StanfordMed @PLOSMedicine : Why Most Published Research Findings Are False dx.plos.org/10.1371/journa…
32/ And with that, I’ll bring this to a close.
33/ Did you enjoy this #EBM #Tweetorial? RT if you liked it, and please add to the conversation!