Here’s a brief follow-up thread answering a sidebar question from the last two weeks’ threads on interim analyses in RCTs and stopping when an efficacy threshold is crossed
The TL;DR of the previous lesson(s): yes, an RCT that stops early based on an efficacy threshold will tend to overestimate the treatment effect a bit, but that doesn’t actually mean the trial is more likely to be a “false-positive” result
(Also, it seems that this is generally true for both frequentist and Bayesian analyses, though the prior may mitigate the degree to which this occurs in a Bayesian analysis)
Anyway, this follow-up question intrigued me: does it matter if the interim analyses are scheduled based on a # of events rather than a fixed # of patients enrolled?
Same basic setup: binary outcome, 40% mortality in control, 30% mortality with intervention, meaning the “true” effect in this example is OR = (0.30/0.70)/(0.40/0.60) ≈ 0.64
This time: the frequentist trial will take one look at “200 deaths” (rather than 500 patients) and stop if p<0.0054 at the interim, otherwise proceeding to recruit until a total of N=1000 patients have been recruited
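(For concreteness, here’s a minimal sketch in Python of how one could simulate this event-driven design, assuming 1:1 alternating allocation and a two-sided Wald test on the log odds ratio; this is my own illustration under those assumptions, not the actual simulation code behind these threads, and every name in it is hypothetical.)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2020)

# true OR implied by the setup: (0.30/0.70) / (0.40/0.60) ~ 0.64

def one_trial(p_ctrl=0.40, p_trt=0.30, interim_events=200,
              max_n=1000, alpha_interim=0.0054):
    """One trial: a single interim look once `interim_events` deaths
    have accrued, otherwise a final analysis at max_n patients."""
    arm = np.tile([0, 1], max_n // 2)             # alternating 1:1 allocation
    death = rng.random(max_n) < np.where(arm == 1, p_trt, p_ctrl)

    def or_and_pvalue(n):
        """Wald test on the log-OR using the first n patients enrolled."""
        a = (death[:n] & (arm[:n] == 1)).sum()    # deaths, treatment arm
        b = (arm[:n] == 1).sum() - a              # survivors, treatment arm
        c = (death[:n] & (arm[:n] == 0)).sum()    # deaths, control arm
        d = (arm[:n] == 0).sum() - c              # survivors, control arm
        log_or = np.log((a * d) / (b * c))        # zero cells not handled; fine here
        se = np.sqrt(1/a + 1/b + 1/c + 1/d)
        return np.exp(log_or), 2 * norm.sf(abs(log_or) / se)

    look_n = int(np.searchsorted(np.cumsum(death), interim_events)) + 1
    if look_n <= max_n:                           # the 200th death occurred
        or_hat, p = or_and_pvalue(look_n)
        if p < alpha_interim:
            return "stopped early", or_hat
    return "ran to completion", or_and_pvalue(max_n)[0]
```

(Calling one_trial() 100 times and collecting the “stopped early” ORs gives the kind of summary reported below.)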
In prior threads I plotted the OR estimated every 100 patients; this time, I’ll plot the OR estimated every 50 deaths (though remember, this is just for illustration; we wouldn’t actually be performing interim analyses with efficacy stopping at those points)
45 of the 100 trials with the frequentist approach would cross the efficacy threshold at the “200 events” interim analysis.
The median OR of the 45 trials that stopped early is 0.56, with a range from 0.42 to 0.61 (pretty similar to the results we saw with the interim analysis scheduled at “500 patients” rather than “200 events”)
I don’t think we would have expected this approach to yield much different results than an interim scheduled for “500 patients,” but anyway, here it is. Pretty similar.
If you’re curious about the Bayesian equivalent: now let’s do it fully Bayesian, where we look every 50 events and stop if Pr(OR<1) ≥ 0.975 at any given interim
(Remember, this isn’t set up to perform similarly to the frequentist approach, it’s just a quick “what happens with this design” question)
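(Here’s a matching sketch of that Bayesian rule, assuming independent Beta(1,1) priors on each arm’s death probability and Monte Carlo draws from the conjugate Beta posteriors to approximate Pr(OR<1); the actual model and prior behind these threads may well differ, particularly since a flat Beta(1,1) prior provides very little of the shrinkage credited to the prior below.)

```python
import numpy as np

rng = np.random.default_rng(2021)

def bayes_trial(p_ctrl=0.40, p_trt=0.30, look_every=50,
                max_n=1000, stop_prob=0.975, n_draws=20_000):
    """Look every `look_every` deaths; stop if Pr(OR < 1) >= stop_prob."""
    arm = np.tile([0, 1], max_n // 2)             # alternating 1:1 allocation
    death = rng.random(max_n) < np.where(arm == 1, p_trt, p_ctrl)
    cum = np.cumsum(death)

    def or_draws(n):
        """Posterior draws of the OR from the first n patients enrolled."""
        dt = (death[:n] & (arm[:n] == 1)).sum()   # deaths, treatment arm
        nt = (arm[:n] == 1).sum()
        dc = (death[:n] & (arm[:n] == 0)).sum()   # deaths, control arm
        nc = (arm[:n] == 0).sum()
        pt = rng.beta(1 + dt, 1 + nt - dt, n_draws)   # Beta(1+deaths, 1+survivors)
        pc = rng.beta(1 + dc, 1 + nc - dc, n_draws)
        return (pt / (1 - pt)) / (pc / (1 - pc))

    for events in range(look_every, int(cum[-1]) + 1, look_every):
        n = int(np.searchsorted(cum, events)) + 1  # patients enrolled at this look
        draws = or_draws(n)
        if (draws < 1).mean() >= stop_prob:
            return "stopped early", events, draws.mean()   # posterior mean OR
    return "ran to completion", int(cum[-1]), or_draws(max_n).mean()
```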
In this case, 92 of the 100 trials would conclude efficacy at one of the “every 50 events” interim analyses before reaching the maximum total of N=1000 patients
The median of the posterior mean ORs for those 92 trials is 0.61 (range from 0.31 to 0.77), so again, the Bayesian approach (thanks to the prior) slightly mitigates the tendency to overestimate but doesn’t eliminate it entirely
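(To see how a prior produces that mitigation, here’s a toy normal-normal conjugate update on the log-OR scale; the skeptical N(0, 0.5²) prior and the interim numbers are my own hypothetical choices, purely to illustrate the shrinkage mechanism.)

```python
import numpy as np

# hypothetical extreme interim estimate: OR = 0.50 with SE(log OR) = 0.25
log_or_hat, se = np.log(0.50), 0.25
tau = 0.50                                   # skeptical prior: log-OR ~ N(0, tau^2)

post_var = 1 / (1 / se**2 + 1 / tau**2)      # precision-weighted combination
post_mean = post_var * (log_or_hat / se**2)  # prior mean of 0 contributes nothing
print(np.exp(post_mean))                     # ~0.57: pulled back toward OR = 1
```

(A flatter prior pulls less toward OR = 1, which is why the mitigation is only partial.)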
In closing: this is one fairly simple simulation of one fairly simple scenario, and you should keep in mind that the specific performance characteristics discussed here will vary depending on all of the dials one can twist…
…number of patients, number of interims, timing of interims, stopping thresholds, etc. will all influence the degree to which these things are true.

So please, keep in mind the all-important caveat: It’s Complicated.
Happy to take requests to answer questions like this that can be illustrated fairly easily via simulation.
Also, please note: if your question is “what does this look like in the vaccine trials?”, @KertViele has already covered that very nicely here:


