Andrew Althouse @ADAlthousePhD
This article calls for a TWEETORIAL (h/t to the clever @boback and @ihtanboga for catching this and posting about it)
Each of them independently noticed a curiosity: this trial was stopped early for futility despite early results suggesting a (possible) strong benefit of the treatment.
The abstract reports HR = 0.76 (95% CI 0.55-1.04, p = 0.09) for the primary endpoint, based on the first 240 patients
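As a quick sanity check on those numbers (a minimal sketch using the standard normal approximation for a log hazard ratio; my own back-of-envelope, not from the paper), you can back out the standard error, z-statistic, and p-value from the reported CI:

```python
# Back out SE(log HR), the Wald z, and the two-sided p-value from the
# reported HR = 0.76 with 95% CI 0.55-1.04 (normal approximation).
import math

def phi(x):  # standard normal CDF
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

hr, lo, hi = 0.76, 0.55, 1.04
se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # CI width on the log scale
z = math.log(hr) / se                            # Wald z-statistic
p = 2 * phi(-abs(z))                             # two-sided p-value
print(round(se, 3), round(z, 2), round(p, 2))    # ~0.163, ~-1.69, ~0.09
```

The recovered p of ~0.09 matches the abstract, so the reported numbers hang together.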
Results section begins with “After the inclusion of 240 patients, the fourth planned sequential interim analysis showed that the lower boundary of the stopping-rule triangle had been crossed...."
"... Because no significant between-group difference in mortality at 60 days had been found, trial recruitment was stopped, in accordance with the prespecified rules.”
At first glance, that seems VERY odd. A trial that was looking pretty good was terminated for futility?? So let’s dig into the methods & why this happened.
The trial design was a group-sequential analysis with a two-sided triangular design that allows stopping for either efficacy or futility
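For intuition, here is a toy sketch of the triangular-test geometry (the boundary constants a and c are arbitrary placeholders, and this shows a single triangle rather than the trial's actual two-sided design): the score statistic Z is tracked against the accumulating Fisher information V, and the "futility" edge rises as information accrues.

```python
# Toy illustration of Whitehead's triangular test: continue while Z stays
# inside the triangle; the lower ("futility") edge rises with information,
# so even a positive Z can cross it late in the trial.
def triangular_decision(z_score, v_info, a=4.0, c=0.25):  # a, c: placeholders
    upper = a + c * v_info        # efficacy boundary
    lower = -a + 3 * c * v_info   # futility boundary; meets upper at V = a/c
    if z_score >= upper:
        return "stop: efficacy"
    if z_score <= lower:
        return "stop: futility"
    return "continue"

print(triangular_decision(1.2, 2.0))    # early on: continue
print(triangular_decision(1.2, 10.0))   # same Z later: stop for futility
```

Note what just happened in that second call: the same mildly positive Z that was fine early triggers a futility stop once the lower edge has risen above it. That geometry is the key to this whole story.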
Full details of the design are reported in the Supplementary Appendix
Pages 25-29 have the explanation of trial design and interim analysis strategy
The bottom of page 28 & top of page 29 describe the decision to stop the trial after the 4th interim analysis (240 patients)
In summary, simulations suggested that, based on the mortality difference observed to that point in the trial, the probability of the trial stopping at the next interim analysis (300 patients) was 100%, with the lion’s share (89.9%) being a stop for futility
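I don't have the authors' simulation code, but the flavor of that calculation looks something like this (normal approximation on the score scale; the current score, information values, and boundary constants below are all illustrative placeholders, not the trial's): project the score statistic forward to the next interim and tally which boundary it crosses.

```python
# Project the score statistic forward to the next interim under the
# currently observed effect, then tally efficacy vs. futility stops.
# All inputs are illustrative placeholders.
import math, random

random.seed(1)
z_now, v_now, v_next = 2.2, 12.0, 15.0   # current score, info, info at n=300
theta_hat = z_now / v_now                # drift implied by observed effect

def boundaries(v, a=4.0, c=0.25):        # same toy triangle as above
    return (-a + 3 * c * v, a + c * v)

futility = efficacy = 0
n_sim = 100_000
for _ in range(n_sim):
    dv = v_next - v_now
    z_next = z_now + random.gauss(theta_hat * dv, math.sqrt(dv))
    low, up = boundaries(v_next)
    futility += z_next <= low
    efficacy += z_next >= up
print(futility / n_sim, efficacy / n_sim)  # nearly all stops are "futility"
```

With numbers in this ballpark, almost every simulated path hits a boundary at the next look, and almost all of those hits are on the futility edge, which is the same shape of result the appendix describes.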
Important note: in this case futility does not mean that we have evidence that the treatment is futile; rather, futility means that running the trial to reach the full planned sample size was unlikely to (holds nose) “achieve statistical significance”
THIS IS A CRUCIAL POINT!!!!
The trial was NOT stopped for "futility" because we had sufficient evidence to conclude that the treatment didn't work. It was stopped for "futility" because simulations suggested that enrolling 60 more patients was not likely to push the p-value below 0.05
Which has more to do with the original design being overly optimistic & powered for a GIANT effect size (so despite early results suggesting a non-trivial BENEFIT of treatment, enrolling to the originally planned sample size was not going to be enough for a magic p<0.05)
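One common way to quantify that kind of "futility" is conditional power: if the observed effect is real, what is the chance that finishing the trial still lands p < 0.05? A sketch under the usual Brownian-motion approximation (the interim z and information fraction below are illustrative choices, not numbers from the trial's analysis):

```python
# Conditional power under the current trend: given interim z at information
# fraction f, the final statistic is Z_final ~ N(z / sqrt(f), sqrt(1 - f)).
import math

def phi(x):  # standard normal CDF
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def conditional_power(z_interim, f, z_alpha=1.96):
    return 1 - phi((z_alpha - z_interim / math.sqrt(f)) / math.sqrt(1 - f))

# e.g. an interim z of 1.69 (two-sided p ~ 0.09) at 80% information:
print(conditional_power(1.69, 0.80))  # ~0.44
```

Even under the current (beneficial) trend, the chance of eventual "significance" is modest, because the design was sized for a much larger effect. That is what "futility" is measuring here: distance from p < 0.05, not evidence of no benefit.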
I’ll assume that the authors did the calculations correctly. I have no reason to doubt their statistical prowess. And they are to be lauded for attempting an adaptive design with flexible sample size. But there are key lessons to learn here.
There is something very disturbing about a trial with HR=0.76, 95% CI 0.55-1.04 for the primary endpoint being stopped early for futility. IMO this falls partly on the frequentist statistical paradigm and trials designed to a certain “power” level with guessed effect size.
In an alternative world, this could have been designed to run until there was (numbers flexible, for the sake of argument) sufficient evidence for >90% Bayesian posterior probability (BPP) of a positive treatment effect or <10% predictive probability of a negative treatment effect
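For concreteness (my own back-of-envelope, assuming a flat prior and a normal approximation on the log hazard ratio; not a calculation from the paper or a design the authors specified), the reported point estimate and CI imply roughly a 95% posterior probability that the treatment reduces mortality:

```python
# Posterior probability of benefit, P(HR < 1 | data), under a flat prior
# and a normal approximation on log(HR) - my assumptions, for illustration.
import math

def phi(x):  # standard normal CDF
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

hr, lo, hi = 0.76, 0.55, 1.04
se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
print(phi(-math.log(hr) / se))  # ~0.95
```

Stopping a trial for "futility" when the data imply something like a 95% chance the treatment helps is exactly the disconnect this tweetorial is pointing at.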
There was an accompanying editorial on this titled “Learning from a Trial Stopped by a Data and Safety Monitoring Board”
I do not blame the DSMB for following the pre-designed statistical analysis plan and stopping the trial (YMMV on this…)
But I think the real lesson to take away from this is that adaptive trials need to be VERY VERY carefully planned to avoid the possibility of stopping for "futility" when the early evidence is leaning towards treatment BENEFIT
I'll hand off the mic (for now) and invite some others to comment: @boback, @ihtanboga , @MasriAhmadMD , @petersasieni, @f2harrell, @coachabebe, @statsepi, @jd_wilko, @MedgibbonsCathy, @bogdienache
One more thought, I guess: another editorial on this trial includes an infuriating interpretation:
"Nevertheless, at least one important conclusion can be drawn — the routine use of ECMO in patients with severe ARDS is not superior to the use of ECMO as a rescue maneuver in patients whose condition has deteriorated further"
That is maddeningly incorrect. This trial absolutely positively DOES NOT support that statement.
Would also love for @SCTorg to shine some light on this to get input from a few real-deal trialists (@JasonConnorPhD , @statberry , @RogerJLewis and you types)
CAN WE GET TO 100 RETWEETS PEOPLE?! MOBILIZE YOUR FRIENDS!