Noninferiority trials: a musing thread that I may regret.
NI trials are fickle & unsatisfying. Sometimes there's a legitimately good reason to do them (discussed below); the stats are maddening (also discussed below).
Suppose we have a particular group of patients that need to undergo a certain procedure. The procedure has a theoretical, biologically plausible risk of causing a particular complication; we generally give patients some prophylactic therapy against that complication.
Since no intervention is totally benign, we know that even this prophylactic therapy has its own risks, so we wonder if perhaps we could give patients a lower dose of it than typically used without losing the protective effect against the complication it's intended to prevent.
Perfectly sensible setting for a noninferiority trial: if the lower dose offers the same (or nearly the same) protection as the higher dose, it might be preferable, at least for some patients/clinical situations.
Of course, from here the statistical considerations get maddening. We have to decide on an acceptable NI margin, then we do this weird backwards test where we claim significant evidence of noninferiority if the upper limit of the CI for the difference stays below that margin...makes everyone's head hurt.
(yes, you can argue that this would all feel a bit more coherent if we adopted a Bayesian approach & instead sought to accumulate a high probability that the true difference between groups was less than the NI margin; yay, no more "significance test" but still tricky)
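For what it's worth, a minimal sketch of what that Bayesian framing could look like: independent Beta posteriors on each arm's event risk, then a Monte Carlo estimate of the probability that the true risk difference is below the NI margin. The counts, the flat Beta(1,1) prior, and the 4% margin below are all placeholders, not taken from any real trial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder counts & margin, purely for illustration
events_high, n_high = 9, 450    # high-dose (reference) arm
events_low,  n_low  = 14, 450   # low-dose (experimental) arm
ni_margin = 0.04                # noninferiority margin on the risk-difference scale

# Flat Beta(1, 1) priors -> Beta posteriors for each arm's event risk
post_high = rng.beta(1 + events_high, 1 + n_high - events_high, size=100_000)
post_low  = rng.beta(1 + events_low,  1 + n_low  - events_low,  size=100_000)

# Posterior probability that the true risk difference (low minus high) is below the margin
prob_ni = np.mean(post_low - post_high < ni_margin)
print(f"Posterior P(true difference < {ni_margin:.0%}) = {prob_ni:.3f}")
```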
Anyway, I thought of this recently because I saw an example of such a trial where some might argue that the noninferiority margin appears overly permissive, but I still think we're better off "having" this data to inform decisions than not having it.
You might be aware that appropriately "powering" an NI trial with a very small NI margin requires a huge sample size, often far larger than we can realistically accumulate (because it effectively means you need a very tight CI around "no difference" to demonstrate NI with a small margin)
So maybe we end up with trials where the NI margin looks pretty big, and people are pretty unimpressed by their conclusions of noninferiority, arguing that the trial was basically a waste. I'm not sure that's true.
Suppose we end up seeing something **like** this.
Pre-spec NI margin of 4%. Trial recruits 1000 patients (500 per arm). 8 of 500 in high dose arm (1.6%) have bad outcome. 13 of 500 in low dose arm (2.6%) have bad outcome.
Absolute risk diff = 1.0%, 95% CI from -0.8% to 2.8%; since the entire CI falls below the pre-spec margin of 4%, this trial concludes noninferiority for the low-dose arm vs. the high-dose arm, even though the outcomes are slightly worse in the low-dose arm.
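If you want to reproduce those numbers, a quick sketch with a plain Wald interval for the risk difference gets you there (a hypothetical trial like this might well use something fancier, e.g. a Newcombe or Farrington-Manning interval, but the idea is the same):

```python
import numpy as np
from scipy import stats

# Counts from the hypothetical trial above
events_high, n_high = 8, 500    # high-dose arm: 1.6%
events_low,  n_low  = 13, 500   # low-dose arm: 2.6%
ni_margin = 0.04                # pre-specified NI margin

p_high, p_low = events_high / n_high, events_low / n_low
risk_diff = p_low - p_high

# Wald standard error for a difference in two independent proportions
se = np.sqrt(p_high * (1 - p_high) / n_high + p_low * (1 - p_low) / n_low)
z = stats.norm.ppf(0.975)
ci = (risk_diff - z * se, risk_diff + z * se)

print(f"Risk difference {risk_diff:.1%}, 95% CI ({ci[0]:.1%}, {ci[1]:.1%})")
print("Conclude noninferiority:", ci[1] < ni_margin)   # upper bound vs. 4% margin
# -> roughly 1.0%, (-0.8%, 2.8%); upper bound < 4%, hence "noninferior"
```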
For an outcome this rare (1-2%) you might argue that a 4% noninferiority margin is unreasonably permissive. But powering the trial for a much smaller NI margin (1% or less) at this event rate jacks up the sample size you'd need enormously.
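To put rough numbers on that claim, here's a sketch using a standard normal-approximation sample size formula for noninferiority of two proportions (assuming a true event risk of ~2% in both arms, one-sided alpha of 0.025, and 90% power; all of these are my assumptions, not anything from a real trial).

```python
from scipy import stats

def ni_n_per_arm(p_ref, p_new, margin, alpha=0.025, power=0.90):
    """Normal-approximation sample size per arm for a noninferiority
    comparison of two proportions on the risk-difference scale."""
    z_a, z_b = stats.norm.ppf(1 - alpha), stats.norm.ppf(power)
    var = p_ref * (1 - p_ref) + p_new * (1 - p_new)
    return (z_a + z_b) ** 2 * var / (margin - (p_new - p_ref)) ** 2

# Assume the true event risk is ~2% in both arms (i.e., the doses really are equivalent)
for margin in (0.04, 0.02, 0.01):
    print(f"NI margin {margin:.0%}: ~{ni_n_per_arm(0.02, 0.02, margin):,.0f} patients per arm")
# Shrinking the margin from 4% to 1% inflates the required sample size ~16-fold
```

Under these (invented) assumptions that's a few hundred patients per arm at a 4% margin but roughly 4,000 per arm at 1%, which is exactly the "magic up 10K patients" problem below.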
If this hypothetical situation were indeed an open clinical question, e.g. if it were truly debated whether a lower dose of this prophylaxis was close enough to the higher dose to merit consideration, I'd argue that we're better off "having" this data than not having it at all.
Would the trialists love to just be able to magic up 10K patients instead of 1K for the trial? I'm sure they would, but if the choice is between "no data" and "1000 patient trial" aren't we better off having the small-ish trial to give folks some sense of just how big...
...the difference might actually be between a high dose vs. a low dose of prophylaxis in this setting?
Thanks for your attention. Have a nice day and please fly safely.
(mutes notifications, walks away)
@sim0ngates @chrisdc77 @hafetzj @eturnermd1 Yeah, I thought about (and should have said something about) the distinction between industry-funded vs academic-sponsored trials. The exact process is a bit different but the challenges would be similar-ish. Agree that industry/regulatory bodies would have to be on board.
@sim0ngates @chrisdc77 @hafetzj @eturnermd1 Of course the easiest way to make this happen would be for the major regulators to make it happen. But as Chris (I think?) said a little while ago, this was evidently part of the original discussion for clinical trials dot gov but they didn’t go all the way to RRs.
@sim0ngates @chrisdc77 @hafetzj @eturnermd1 I think some academic trialists might be persuaded or at least attracted by the idea that they could have a much-expedited peer review process on the back end. It can be frustrating to do a trial, write up your results & then spend another year submitting to 3 different journals
Thread on relationships between researchers and statistical consultants. Prompted by a few recent tweets, but not only those as this is a recurring and always-relevant conversation.
On the "researcher seeking stats help" side, there is an often-justified feeling that statistical consultants are difficult to work with (even those in good faith) and sometimes downright unhelpful or unpleasant.
So - let's address those right up front as part of this thread about making these relationships productive & relatively happy.
Has anyone in *medicine* (or otherwise, but particularly interested in US academic medicine) actually proposed a study where they said they'd use an alpha threshold above 0.05? How was it received? (cont)
(Also, please do me a favor, spare me the arguments about NHST being a flawed paradigm on this particular thread)
Clearly not all studies have the same tradeoffs of a false-positive vs a false-negative finding, and in some cases a higher alpha threshold seems like it should be warranted...
@Jabaluck @_MiguelHernan @aecoppock I think (perhaps unsurprisingly) that this shows “different people from different fields see things differently because they work in different contexts” - the scenario you painted here is not really possible with how most *medical* RCTs enroll patients & collect baseline data
@Jabaluck @_MiguelHernan @aecoppock The workflow for most medical RCTs (excepting a few trial designs…which I’ll try to address at the end if I have time) is basically this:
@Jabaluck @_MiguelHernan @aecoppock 1. Clinics/practices/hospitals know that they are enrolling patients in such-and-such trial with such-and-such criteria.
Amusing Friday thoughts: I've been reading Stuart Pocock's 1983 book Clinical Trials: A Practical Approach (do not concern yourself with the reason).
There is a passage on "Statistical Computing" in Chapter 11 of the book which one might have expected would age poorly, but is in fact remarkable for how well several of the statements have held up.
"I would like to refer briefly to the frequent misuse of statistical packages. Since they make each analysis task so easy to perform, there is a real danger that the user requests a whole range of analyses without any clear conception of what he is looking for."
Fun thread using some simulations modeled on the ARREST trial design (presented @CritCareReviews a few months ago) to talk through some potential features you might see when we talk about “adaptive” trials
DISCLAIMER: this is not just a “frequentist” versus “Bayesian” thread. Yes, this trial used a Bayesian statistical approach, but there are frequentist options for interim analyses & adaptive features, and that’s a longer debate for another day.
DISCLAIMER 2: this is just a taste using one motivational example for discussion; please don’t draw total sweeping generalizations about “what adaptive trials do” from this thread, as the utility of each “feature” must always be carefully considered in that specific context
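As a companion to that disclaimer (this is emphatically not the ARREST design itself, nor the thread's actual simulations), here's a toy sketch of one adaptive feature: a two-arm trial with a single Bayesian interim look that can stop early for efficacy. Every number in it, the event risks, the look schedule, the stopping threshold, is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

def prob_trt_better(events_t, n_t, events_c, n_c, draws=5000):
    """Posterior P(p_treatment > p_control) under independent flat Beta(1,1) priors."""
    p_t = rng.beta(1 + events_t, 1 + n_t - events_t, size=draws)
    p_c = rng.beta(1 + events_c, 1 + n_c - events_c, size=draws)
    return np.mean(p_t > p_c)

def simulate_trial(p_ctrl, p_trt, batches=(75, 75), stop_threshold=0.99):
    """One simulated two-arm trial (outcome = success) with a posterior check
    after each enrollment batch; early stopping is allowed at all but the last."""
    events_t = events_c = n_t = n_c = 0
    for i, n_batch in enumerate(batches):
        events_t += rng.binomial(n_batch, p_trt)
        events_c += rng.binomial(n_batch, p_ctrl)
        n_t += n_batch
        n_c += n_batch
        prob = prob_trt_better(events_t, n_t, events_c, n_c)
        if i < len(batches) - 1 and prob > stop_threshold:
            return True, prob   # stopped early for efficacy at an interim look
    return False, prob          # ran to the maximum sample size

# How often does this rule stop early when the treatment genuinely helps?
sims = [simulate_trial(p_ctrl=0.10, p_trt=0.25) for _ in range(500)]
early_rate = np.mean([stopped for stopped, _ in sims])
print(f"Stopped early for efficacy in {early_rate:.0%} of simulated trials")
```

Re-running it with p_trt equal to p_ctrl gives a feel for how often the same rule stops early when there is no true benefit, which is the kind of operating-characteristic question adaptive designs get interrogated on.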