The NEJM compassionate-use remdesivir data was always sort of a "passing curiosity while we await trial data" but it's disappointing that the analysis was done wrong.

But it is a teachable moment about "time-to-event" analysis of non-mortality outcomes.
So, for those of you who may *read* a lot of papers that use survival analysis but don't perform it very often yourselves, let's talk a bit about the nuts and bolts of how the data are formatted and why this can be confusing.
I will assume that you, dear reader, generally know what a Kaplan-Meier curve is. Now let's talk a bit about how it is generated and what the data "look like" to make that plot.
Let's first suppose that the outcome is "mortality" and you plan to make a curve that starts at 100% at time zero (all patients "alive" at time zero) and drops each time a patient dies.
To make this plot, you generally need two data fields: a yes/no indicator of whether the patient "died" and their observed time. The observed time is the time of death for patients who died, and the total follow-up time for patients who did not die during the study period.
So a patient that died at 14 days will have a "1" in the "Died" column and a "14" in the "Survival Time" column, letting me know that this patient died on day 14.
A patient that was followed for 28 days who did not die will have a "0" in the "Died" column and a "28" in the "Survival Time" column, indicating that this patient was known to be alive for at least 28 days.
(Note: the length of follow-up varies with the study population. In ICU patients, where the outcome of interest is basically whether they survived the acute event, 28 or 60 days is common; in a population like CVD or cancer, follow-up runs to years.)
Anyways, back to my example. So every patient has their "0" or "1" to indicate whether they died or not and their "time" (time of death for deceased; last-known-alive time of interest for those who did not die during the study period)
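To make the layout concrete, here is a minimal sketch of those two fields for the two example patients above. The `Patient` type and its field names are my own illustration, not from any particular dataset or package.

```python
from typing import NamedTuple


class Patient(NamedTuple):
    """One row of a time-to-event dataset (hypothetical field names)."""
    died: int    # 1 = died during the study period, 0 = did not
    time: float  # days: time of death if died == 1, else total follow-up


patients = [
    Patient(died=1, time=14),  # died on day 14
    Patient(died=0, time=28),  # known to be alive for at least 28 days
]

for p in patients:
    status = "died on" if p.died else "known alive through"
    print(f"{status} day {p.time}")
```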
From this data structure, one can easily generate a Kaplan-Meier curve (or run a Cox model, or your other choice of "this is how I like to analyze my time to event data" method here)
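For the curious, a bare-bones sketch of how a Kaplan-Meier curve comes out of that structure: at each observed death time, multiply the running survival estimate by (1 - deaths / number still at risk). The function name and the five-patient dataset are made up for illustration; in practice you would use a tested survival package rather than this.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  observed time per patient (event time or last follow-up)
    events: 1 if the event happened at that time, 0 if censored
    """
    surv = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Everyone whose observed time is >= t is still "at risk" at t.
        at_risk = sum(1 for ti in times if ti >= t)
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1 - n_events / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical data: two deaths (days 5 and 14), three patients censored alive.
times = [5, 14, 20, 28, 28]
events = [1, 1, 0, 0, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"day {t}: estimated survival {s:.0%}")
```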
But, when the outcome is "time to clinical improvement" rather than "time to mortality" - we have a new wrinkle that many people may not consider...
So now I'm counting an "event" as a good thing - unlike the above example, where dying early (or at all) is bad, we want patients to reach the endpoint of clinical improvement faster.
So a patient that reaches the endpoint of "clinical improvement" at 14 days will be a "1" in my "clinical improvement" column and a "14" in my "time" column
But what about a patient that dies on day 14? THIS IS WHERE IT GETS WEIRD.
The way most of us think about survival analysis in most settings, a patient that is "censored" without experiencing the event is assigned a time equal to their last observed time.
So the initial temptation, and what it seems the authors did if I have understood correctly, is to code this type of patient as "0" (meaning no clinical improvement) with a time of 14 days.
But we have to think a little harder about the meaning of time to event analyses here and the implications to realize why that doesn't make sense.
First glance: a patient that "died" was only "at risk" for the event of "clinical improvement" for the time they were alive; they cannot experience clinical improvement after they've died - so they should be censored at their time of death?
But that's kind of silly. Because we actually know that they will *never* reach the endpoint of clinical improvement. It could be argued that they should actually be given a "0" for improvement and a time of "infinity" - they will *never* have the event.
So what one should do in this setting is use an analysis that properly "penalizes" death as a bad outcome to give us an accurate estimate of the % of the original sample that have reached clinical improvement.
The simplest way to do this is to assign all the deaths a "0" for improvement with the max time in the sample (say, 28 days, if we are only reporting clinical improvement out to 28 days).
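A minimal sketch of that recoding step, assuming each patient is a dict with `improved`, `died`, and `time` keys (hypothetical names of mine) and a 28-day reporting window. Patients who improved before dying keep their improvement time here, which is my own assumption about how to handle that case.

```python
def recode_deaths(records, max_time=28):
    """Keep deaths without improvement in the risk set until max_time.

    Each record is a dict with hypothetical keys:
      improved: 1 if clinical improvement was reached, else 0
      died:     1 if the patient died, else 0
      time:     days to improvement, death, or last follow-up
    """
    recoded = []
    for r in records:
        if r["died"] and not r["improved"]:
            # Never improved and never will: hold at the max time, no event.
            recoded.append({**r, "time": max_time})
        else:
            recoded.append(dict(r))
    return recoded


records = [
    {"improved": 1, "died": 0, "time": 14},  # improved on day 14
    {"improved": 0, "died": 1, "time": 14},  # died on day 14
]
recoded = recode_deaths(records)
print(recoded)
```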
This way, you don't upwardly bias the KM estimate of the "clinical improvement" proportion over time by removing the deaths as standard "censored" observations.
This gets a bit stat-nerdy, but basically, the estimate at the later time points is computed from the number of "events" occurring relative to the number of patients who are still "at risk". So if the deceased patients are "censored" at their time of death, they drop out of the "at risk" denominator and the remaining improvements count for more than they should.
That upwardly biases the estimate of the proportion who have reached the endpoint of "clinical improvement" (in fact, if you were to follow this incorrectly built curve to the end, it would reach a point where the KM estimate showed that 100% of patients "clinically improved").
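Here is a small sketch of that bias on entirely made-up data: 10 patients, 4 of whom improve and 6 of whom die, with nobody censored for any other reason. The function name and the day values are hypothetical; the point is only to compare the two ways of coding the deaths.

```python
def km_event_fraction(times, events):
    """1 minus the Kaplan-Meier estimate after the last event time."""
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for ti in times if ti >= t)
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1 - n_events / at_risk
    return 1 - surv


# Hypothetical sample: 4 improvements and 6 deaths within 28 days.
improved_days = [6, 12, 20, 26]
death_days = [3, 9, 15, 18, 22, 24]
MAX_DAY = 28

# Event indicator is "clinical improvement": 1 for improvers, 0 for deaths.
events = [1] * len(improved_days) + [0] * len(death_days)

# Wrong: deaths censored at their time of death.
naive_times = improved_days + death_days
# Better: deaths kept at risk (with no event) through the max time.
fixed_times = improved_days + [MAX_DAY] * len(death_days)

naive = km_event_fraction(naive_times, events)
fixed = km_event_fraction(fixed_times, events)
print(f"deaths censored at death:  {naive:.0%} 'improved'")
print(f"deaths held to day {MAX_DAY}:    {fixed:.0%} improved")
```

With this toy sample the first line reports 100% and the second 40%, which matches the 4 of 10 patients who actually improved.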
Keep Current with Andrew Althouse
