I have been corresponding with the authors of the well-known Santa Clara County COVID-19 preprint, and I am alarmed at their sloppy behavior. The confidence interval calculation in their preprint made demonstrable math errors - *not* just questionable methodological choices.
Everyone makes mistakes, but the record must be corrected ASAP. I emailed them on Saturday morning asking them to do so. In the three days since, they have not corrected anything, but a subset of them have released a new study without saying how they did the analysis this time.
Given the critically important and time-sensitive policy decisions being made now, if the authors are still pressing their case in the media using possibly incorrect calculations, then I feel I should make my criticism public too.
The errors are not debatable and can be seen in these two screenshots of the supplement. First, 0.0034, the standard error meant to measure uncertainty about the prevalence pi, is not the square root of 0.039 (which is about 0.197). Second, the variance of a binomial estimate of a proportion depends on the sample size.
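Both points are easy to check numerically. Here is a minimal sketch in Python, using only the two figures quoted in this thread (0.039 and 0.0034) and the textbook standard error of a binomial proportion. The function name `binomial_se` and the example sample sizes are my own choices for illustration; they are not taken from the supplement.

```python
import math

# Figures as quoted in the thread, not re-derived from the supplement.
quoted_variance = 0.039
quoted_standard_error = 0.0034

# A standard error is the square root of a variance, so these should match.
implied_standard_error = math.sqrt(quoted_variance)
print(f"sqrt({quoted_variance}) = {implied_standard_error:.4f}, "
      f"quoted SE = {quoted_standard_error}")

def binomial_se(p_hat, n):
    """Textbook standard error of a sample proportion p_hat from n trials."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)

# Illustrative only: the same proportion with different (hypothetical)
# sample sizes gives different standard errors.
p_hat = 50 / 3330
for n in (371, 3330, 33300):
    print(f"n = {n:>5}: SE = {binomial_se(p_hat, n):.5f}")
```

The point of the loop is only that the standard error shrinks as n grows, so any formula for it has to involve the sample size somewhere.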
I can't redo the whole calculation myself because parts were not described anywhere, but I have low confidence that those parts were done correctly; if not, the corrected confidence interval for prevalence in Santa Clara County might well stretch all the way to include zero.
The authors said by email that they used a built-in Stata function and aren't sure themselves how the software used the input weights. I suspect they misapplied that function (too complicated to tweet why) but I don't know Stata well enough to be sure; it seems neither do they.
Trevor Hastie, Steve Goodman, @robtibshirani and I have been asking for more information about their analysis and their data. The authors have been gracious and reasonably responsive; they say they are redoing the stats in a second draft and will share data if possible.
I was satisfied to wait for the second draft, which is supposedly imminent and which the authors assured us will include a detailed description and code for a different confidence interval method based on bootstrapping.
However, I'm flabbergasted that yesterday afternoon, a group including several of the same authors circulated a press release describing new results for a similar study in LA County, without any accompanying technical report and before correcting the Santa Clara County preprint.
I can only surmise that the numbers in the LAC press release are either still using the same muddled calculations, or using an unspecified new method that hasn't been described publicly and is very different from the one they described in the SCC preprint.
Whichever is the case, before journalists publicize any more results from this group, they should know that the confidence intervals reported in both studies have no known statistical provenance as of now. The calculations are not questionable; they are either wrong or unknown.
What should we believe while we wait for a defensible analysis from the authors? In my opinion, the analysis suggested by @graduatedescent using Fisher's exact test should be treated as authoritative until the authors are ready to give a competing account. bit.ly/2XR0pdL
@graduatedescent shows the SCC data are too noisy even to rule out the possibility that all the positives are false positives. Simply put, the difference between 50 heads in 3330 flips (SCC residents) and 2 heads in 371 flips (negative controls*) isn't statistically significant.
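To make the coin-flip comparison concrete, here is a rough from-scratch sketch of Fisher's exact test on the 2x2 table implied by the counts above (50 of 3330 vs 2 of 371). It uses only the Python standard library. The function names are mine, and the linked analysis may differ in its details, so treat this as an illustration of the idea rather than a reproduction of that work.

```python
from math import comb

def hypergeom_pmf(k, total, positives, draws):
    """P(exactly k positives among `draws` items drawn without replacement)."""
    return (comb(positives, k) * comb(total - positives, draws - k)
            / comb(total, draws))

def fisher_p_values(pos_a, neg_a, pos_b, neg_b):
    """One-sided and two-sided Fisher exact p-values for a 2x2 table.

    Group A is (pos_a, neg_a) and group B is (pos_b, neg_b). The one-sided
    value is the chance, with all margins fixed, of seeing pos_b or fewer
    positives in group B if both groups share one positive rate.
    """
    total = pos_a + neg_a + pos_b + neg_b
    positives = pos_a + pos_b
    draws = pos_b + neg_b
    lo = max(0, draws - (total - positives))
    hi = min(positives, draws)
    pmf = {k: hypergeom_pmf(k, total, positives, draws)
           for k in range(lo, hi + 1)}
    p_observed = pmf[pos_b]
    p_less = sum(p for k, p in pmf.items() if k <= pos_b)
    # Two-sided: add up every table at least as unlikely as the observed one.
    p_two_sided = sum(p for p in pmf.values() if p <= p_observed * (1 + 1e-7))
    return p_less, p_two_sided

# Counts from the thread: SCC residents vs negative controls.
p_less, p_two_sided = fisher_p_values(50, 3280, 2, 369)
print(f"one-sided p = {p_less:.3f}, two-sided p = {p_two_sided:.3f}")
```

If both p-values come out above the usual 0.05 threshold, that agrees with the claim that the data cannot rule out all the positives being false positives.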
The authors have demographic information they have not yet shared, so it's conceivable a more refined analysis will pin down the prevalence more precisely. My point is that right now, as far as I know, no such analysis exists.
Note that beyond the formal statistical analysis there are other good reasons to be skeptical of the study, which have been pointed out publicly by @graduatedescent, @nataliexdean, @StatModeling, and many others.
Thanks also to @jjcherian for his perceptive tweets about the paper that piqued my interest in the first place.

*The "negative controls" are blood samples from people who were known not to have been infected with COVID-19.

The supplement I refer to can be found at medrxiv.org/content/medrxi…
Keep Current with Will Fithian