this kind of crap is what happens when RCTs are automatically the "gold standard" in medical evidence.
</grumbly soapbox rant>
Even if we make the (extremely generous) assumption that failure to account for clustering is the only "major" error, I can't emphasize enough just how damning that error is.
It's both incredibly basic and incredibly important to deal with correctly. How did this happen?
Things are developing in a bad direction, so time to talk a little more about it.
Let's start with the most generous version of this, and say they just made a couple of honest mistakes. It happens! Stats and study design is hard!
Abstract: "Participants (n=551) were randomly assigned to calcifediol treatment"
That is an unambiguous statement that individuals were randomized to arms.
But that's false; taking things at face value, the 8 WARDS were randomized, NOT individuals.
That's super important, since when you are assigning treatment at a group level, the grouping is the important bit, and needs to be dealt with from day 1.
Think along the lines of this being more like n=8 than n=930 (not quite right, but you get the idea).
That's a HUGE deal, as it's effectively impossible to get usable results from an 8-cluster cRCT; with so few clusters there is almost no statistical power left.
In effect, we can't really tell if the results had to do with other stuff inherent to the different wards, or the vit-d "treatment"
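To put a rough number on the "more like n=8" intuition, here's a minimal sketch of the standard design-effect calculation. The ICC values below are made-up assumptions for illustration, not anything estimated from this study.

```python
# Rough illustration of why ~930 people in 8 wards behaves like far fewer
# observations. All numbers below are hypothetical, NOT from the trial.

def effective_n(n_total: int, avg_cluster_size: float, icc: float) -> float:
    """Effective sample size under the usual design effect:
    DEFF = 1 + (m - 1) * ICC, n_eff = n_total / DEFF."""
    deff = 1 + (avg_cluster_size - 1) * icc
    return n_total / deff

n_total = 930                # patients
n_clusters = 8               # wards
m = n_total / n_clusters     # ~116 patients per ward

for icc in (0.01, 0.05, 0.10):  # a plausible range of intraclass correlations
    print(f"ICC={icc:.2f}: effective n ~ {effective_n(n_total, m, icc):.0f}")
# Even a modest ICC shrinks ~930 patients to a fraction of the nominal count,
# and inference still hinges on only 8 cluster-level units.
```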
But then there's another weird thing: If you have only 8 clusters, you would want to split them in half (4v4), not what they did: 5v3.
The reason is, again, statistical power. It's typically (not always) MUCH more efficient to split your groups evenly.
Sounds like a small thing, but it's....weird. Even if you knew nothing about clustered RCTs, you would probably know that you want an even split. So why the 5v3?
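For a rough sense of the 5v3 penalty (pure arithmetic, nothing study-specific): with k1 and k2 clusters per arm, the variance of a difference in arm-level means scales with 1/k1 + 1/k2, which is smallest when the split is even.

```python
# Why 4v4 beats 5v3 on efficiency: the variance of a difference in arm-level
# means scales with (1/k1 + 1/k2), where k1 and k2 are clusters per arm.

def relative_variance(k1: int, k2: int) -> float:
    return 1 / k1 + 1 / k2

even = relative_variance(4, 4)    # 0.500
uneven = relative_variance(5, 3)  # ~0.533
print(f"4v4: {even:.3f}, 5v3: {uneven:.3f}, penalty: {uneven / even - 1:.1%}")
# A 5v3 split inflates the variance by roughly 7% relative to 4v4 -- on top of
# having only 8 clusters to work with in the first place.
```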
If that was all that was wrong, dayenu. This isn't subtlety, this is REALLY basic stuff for trial design/stats.
From here, things get .... weirder.
If this was a properly run trial (assuming the design/stats were legit, which they aren't), according to the paper the trial ended in May, 2020.
What was happening from then to now? It often takes a long time to get and clean the data, sure.
But given a 1k-person RCT (which generally requires a HUGE amount of planning and infrastructure) in a pandemic, you'd want to accelerate that to light speed and get those results out ASAP. Especially if they showed these miraculous results (narrator: they don't)!
Then there is some weirdness about the study population: i.e. the embedding in the cohort.
That's not a problem by itself; it can be a huge time and resource saver. I've developed 2 RCTs to date, both of which were embedded in cohort studies for this very reason.
But the way it's described is ... weird.
There is some weird language here around "hospitalized randomly" implying that they assigned patients to wards randomly. Maybe a language or oversimplification issue, so don't read too much into that.
If everything was done properly as described, we have a new issue: in the consent process, it seems that patients were given options. If they're given options, they might choose (or be encouraged) to go to different wards for various reasons.
That breaks a LOT of things.
If knowing which ward is which changes how patients get assigned to wards (e.g. a patient might want to be sent to the vit-d ward, or a doctor might send them there), then randomization with respect to patient assignment is completely broken by selection.
That's...not great.
Then there's ethics and protocol. The manuscript states that it received ethical approval for the study. Great! In theory, that means there is a protocol for this (these are usually not public) and a trial registration.
We should be able to verify that this was planned this way.
I am personally not familiar with the typical required and standard processes for ethics approval, registration, and protocols in Spain.
If it was approved with a protocol that roughly matches what the manuscript says was done, then this is merely* study design and reporting negligence
* "merely" just means nothing more going on; there would still remain a jaw-dropping series of design and reporting errors.
Also worth noting that there are HUGE differences in the baseline levels of vit-d in the trial arms.
Don't fall into the trap of believing that randomization means that the arms are "balanced." That's not true. Differences are both expected and totally ok when things are done right.
But that's a HUGE difference, suggesting that there are fundamental differences between arms.
At best, this is some combination of ward-specific protocols and procedures that happened to line up with the arms (small n's do that), plus the patient selection issue.
Again that would be enough to be super sus.
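As a toy illustration of the "small n's do that" point: with only 8 wards, chance alone can produce sizable baseline gaps between arms. All numbers here are invented for illustration, not taken from the study.

```python
# Simulate ward-level baseline means and see how far apart a 5-ward arm and a
# 3-ward arm land purely by chance. Hypothetical numbers only.
import numpy as np

rng = np.random.default_rng(2)
n_sims, n_wards = 10_000, 8
baseline_mean, ward_sd = 16.0, 4.0   # pretend ward-level vit-D means (ng/mL)

gaps = []
for _ in range(n_sims):
    ward_means = baseline_mean + rng.normal(0, ward_sd, n_wards)
    arm_a, arm_b = ward_means[:5].mean(), ward_means[5:].mean()  # mimic 5v3
    gaps.append(abs(arm_a - arm_b))

print(f"median baseline gap between arms: {np.median(gaps):.1f} ng/mL")
print(f"95th percentile gap: {np.percentile(gaps, 95):.1f} ng/mL")
```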
Altogether though, that is a LOT of issues that kinda happen to fall into place for these results.
So, at best, this is study design and reporting negligence on a lot of dimensions plus a little push from random chance.
Best bet is assuming this is the case.
HOWEVER, given the weirdness about these errors (and a clear willingness to play fast and loose with the word "randomized") we should do some due diligence here and verify that's the case, which is what is happening now.
This should all be pretty cut and dry if the ethics approval and protocol turn up and describe this RCT. If they don't...
In the meantime, we are playing catchup, since this study is blowing up all over the place with unscrupulous sharers and media reports.
This has the makings of yet another HCQ-type debacle (albeit probably not as big) with a hint of DANMASK.
Let's do our best to not let that happen.
If you want a more "live" look at this, @sTeamTraen has been at work here (and has a sterling reputation for discovering all kinds of research mistakes and misconduct), as has well-known research trouble-maker @GidMK.
I'll probably update when I'm more sure of things.
Well, we have an answer from the authors...of sorts. Copied here, because this sure is something.
What on earth does this mean, and how can you possibly square it with what's in the abstract and manuscript itself?
"We never say in the article that it is a randomized control trial (RCT) but we consider an open randomized trial, and an observational study."
The study describes randomization (implied at the individual level, but actually at the ward level). It describes a control (not receiving the "treatment") and it directly describes it as a trial.
There is ZERO question that, if we take them at their word, this is an RCT.
"Formal ethical approval was obtained shortly after the study started although verbal approval from the ethics committee was given at the time it was started while we completed all the bureaucratic process."
Oh. Oh no. That is very, very not ok.
In case there is any doubt at all, this is a direct quote from the manuscript, page 5:
"The effect of calcifediol administration was studied in a prospective open randomized controlled trial."
This was clearly suspicious from the start, but I am quite honestly pretty shocked, and did not expect this result.
There are still some open questions, but at minimum this is a major violation of public trust and ethics, not to mention scientific and statistical rigor.
I hope the authors do the right thing and pull it. There is no version of this situation which can save it.
Best we can do is be honest with our errors and move forward.
Also, folks: please don't take it upon yourselves to try to "fix this" by leaving inappropriate comments or feedback.
The authors already have all the information they need, from well-qualified folks who do this kind of thing.
Let this run its course, no need for more attention.
An update (not doing play by plays): @sTeamTraen has been doing some excellent work following up and figuring out the ethics approval situation for this study, and things are looking .... not great.
The number referenced in the pre-print was a local registration number (NOT an ethical approval), from a body that was not informed about the study until 60ish days into the study.
The PI specified that the study was approved by an external ethics board (totally normal).
@sTeamTraen is checking in with the referenced external ethics board, but noting that there doesn't appear to be any study registered that seems to match this one (ward-randomized vitamin-D trial etc.).
Could be an administrative issue or a miscommunication (it happens!).
Three possibilities here:
1) IRB approval exists, but was misreported/admin errors.
2) There was never randomization in the first place (i.e. no trial).
3) There was neither approval nor consent for this trial, contrary to authors' claims.
I sincerely, truly hope for #1.
.....and it's gone! The Lancet SSRN removed it from their server.
This is a pretty unusual move for a pre-print server to make; it usually only happens in high-profile and extreme situations.
Hopefully this is the end of the story. Lots of mysteries remain, like what actually happened in this study, the ethics situation, etc.
In an ideal world, the authors and all the people unscrupulously promoting this study would work hard to undo the damage done and prevent the next round.
Not gonna happen though.
We desperately need to invest in our research infrastructure and community to do better designed and more ethical research, and prevent this kind of thing from happening in the first place.
To rewind a bit: this started as a story about "just" study/stats design (clustering standard errors), and that should have been the end.
"Just" a stats issue was game over before it started.
But it's "just" stats and study design, so that's not enough.
What unfolded was frankly bonkers. It was "sus" but I would never have expected just how bad it was (and there's still more we don't know).
But it makes me deeply uncomfortable that the original fatal, basic, and unrecoverable flaw isn't enough to have prevented all of this.
Huge amounts of credit to data thug extraordinaire @sTeamTraen for pursuing this, and to all the folks in the back channels discussing and looking into things.
That kind of service to the public and science goes almost entirely unrewarded, and usually at cost to the ones doing it.
Folks often say that DAGs make our causal inference assumptions explicit. But that's only kinda true
The biggest assumptions in a DAG aren't actually IN the DAG; they're in what we assume ISN'T in the DAG. It's all the stuff that's hidden in the white space.
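A toy way to see it: count the edges you drew against the "no direct effect" claims implied by every edge you didn't draw. Hypothetical 4-node DAG, nothing specific to any study.

```python
# Every absent directed edge in a DAG is an assumption: "no direct effect of A on B."
from itertools import permutations

nodes = ["exposure", "outcome", "confounder", "mediator"]
edges = {("confounder", "exposure"), ("confounder", "outcome"),
         ("exposure", "mediator"), ("mediator", "outcome")}

implicit = [(a, b) for a, b in permutations(nodes, 2) if (a, b) not in edges]
print(f"{len(edges)} drawn edges vs {len(implicit)} implicit 'no direct effect' claims:")
for a, b in implicit:
    print(f"  assumed: {a} has no direct effect on {b}")
```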
Time to make it official: short of some unbelievably unlikely circumstances, my academic career is over.
I have officially quit/failed/torpedoed/given up hope on/been failed by the academic system and a career within it.
To be honest, I am angry about it, and have been for years. Enough so that I took a moonshot a few years ago to do something different that might change things or fail trying, publicly.
I could afford to fail since I have unusually awesome outside options.
And here we are.
Who knows what combination of things did me in; incredibly unlucky timing, not fitting in boxes, less "productivity," lack of talent, etc.
In the end, I was rejected from 100% of my TT job and major grant applications.
Always had support from people, but not institutions.
Ever wondered what words are commonly used to link exposures and outcomes in health/med/epi studies? How strongly language implies causality? How strongly studies hint at causality in other ways?
READ ON!
Health/med/epi studies commonly avoid using "causal" language for non-RCTs to link exposures and outcomes, under the assumption that "non-causal" language is more "careful."
But this gets murky, particularly if we want to inform causal q's but use "non-causal" language.
To find answers, we did a kinda bonkers thing:
GIANT MEGA INTERDISCIPLINARY COLLABORATION LANGUAGE REVIEW
As if that wasn't enough, we also tried to push the boundaries on open science, in hyper transparency and public engagement mode.
Granted, we only see the ones that get caught, so "better" frauds are harder to see.
But I think people don't appreciate just how hard it is to make simulated data that don't have an obvious tell, usually because something is "too clean" (e.g. the uniform distribution here).
At some point, it's just easier to actually collect the data for real.
BUT.
The ones that I think are going to be particularly hard to catch are the ones that are *mostly* real but fudged a little haphazardly.
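For the "too clean" flavor of tell specifically, here's a toy sketch (entirely simulated, not anyone's actual data): a lazily fabricated variable drawn straight from a uniform distribution sails through a uniformity check that realistic, lumpy clinical data flunks.

```python
# Toy "too clean" check: test whether a variable is suspiciously consistent with
# a flat (uniform) distribution. Simulated data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
fabricated = rng.uniform(20, 80, size=500)                   # lazy fake: flat everywhere
realistic = np.clip(rng.normal(50, 12, size=500), 20, 80)    # lumpy, like real labs

for name, x in [("fabricated", fabricated), ("realistic", realistic)]:
    # Rescale to [0, 1] and run a KS test against the standard uniform.
    stat, p = stats.kstest((x - 20) / 60, "uniform")
    print(f"{name}: KS p-value vs uniform = {p:.3g}")
# The fabricated column looks "perfectly" uniform; the realistic one is flagged
# as decidedly non-uniform. Real biological measurements are rarely flat.
```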
Perpetual reminder: cases going up when there are NPIs (e.g. stay at home orders) in place generally does not tell us much about the impact of the NPIs.
Lots of folks out there making claims based on reading tea leaves from this kind of data and shallow analysis; be careful.
What we want to know is what would have happened if the NPIs were not there. That's EXTREMELY tricky.
How tricky? Well, we would usually expect cases/hospitalizations/deaths to have an upward trajectory *even when the NPIs are extremely effective at preventing those outcomes.*
The interplay of timing, infectious disease dynamics, social changes, data, etc. make it really really difficult to isolate what the NPIs are doing alongside the myriad of other stuff that is happening.
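A toy SIR sketch (every parameter invented) of why "cases rose while the NPI was in place" is weak evidence on its own: even a transmission-cutting NPI can leave cases rising for weeks, just more slowly than they otherwise would have.

```python
# Minimal SIR model comparing a hypothetical no-NPI world to one where the NPI
# cuts transmission by ~40%. All parameters are made up for illustration.

def sir_daily_incidence(beta: float, gamma: float = 0.1, days: int = 60, i0: float = 1e-4):
    s, i = 1 - i0, i0
    daily_new = []
    for _ in range(days):
        new = beta * s * i
        s, i = s - new, i + new - gamma * i
        daily_new.append(new)
    return daily_new

no_npi = sir_daily_incidence(beta=0.35)    # hypothetical "do nothing" transmission
with_npi = sir_daily_incidence(beta=0.22)  # hypothetical NPI cutting transmission

print(f"peak daily incidence: no NPI {max(no_npi):.4f}, with NPI {max(with_npi):.4f}")
print("cases still rising at day 30 even with the NPI:", with_npi[30] > with_npi[15])
# Both curves rise early on; the counterfactual difference is in how far and how
# fast -- which you cannot read off the single trajectory you actually observe.
```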
The resistance to teaching regression discontinuity as a standard method in epi continues to be baffling.
I can't think of a field for which RDD is a more obviously good fit than epi/medicine.
It's honestly a MUCH better fit for epi and medicine than econ, since healthcare and medicine are just absolutely crawling with arbitrary threshold-based decision metrics.
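For anyone who hasn't seen one, here's a minimal RDD sketch on simulated data, using a hypothetical lab-value cutoff that triggers treatment (none of this is from a real study):

```python
# Toy regression discontinuity: a hypothetical clinical rule treats everyone whose
# score crosses a threshold; compare outcomes just on either side of the cutoff.
import numpy as np

rng = np.random.default_rng(1)
n, cutoff, true_effect = 5000, 50.0, -2.0

score = rng.uniform(30, 70, n)                    # running variable (e.g. a lab value)
treated = (score >= cutoff).astype(float)         # threshold-based treatment rule
outcome = 0.3 * score + true_effect * treated + rng.normal(0, 3, n)

# Local linear fits within a bandwidth on each side of the cutoff
bw = 5.0
left = (score >= cutoff - bw) & (score < cutoff)
right = (score >= cutoff) & (score < cutoff + bw)
fit_l = np.polyfit(score[left], outcome[left], 1)
fit_r = np.polyfit(score[right], outcome[right], 1)

rdd_estimate = np.polyval(fit_r, cutoff) - np.polyval(fit_l, cutoff)
print(f"RDD estimate at the cutoff: {rdd_estimate:.2f} (true effect = {true_effect})")
```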
(psssssst to epi departments: if you want this capability natively for your students and postdocs - and you absolutely do - you should probably hire people with cross-disciplinary training to support it)