Duncan MacCannell @drm@mstdn.science
Molecular epi and public health. Pathogen genomics, clin micro, AMR, (bio)tech and open data. Personal account. Opinions/dad jokes are my own. 🔬🦠🧬😷💉🌻🦣

Jan 6, 2023, 13 tweets

The variant surveillance dashboard on the CDC COVID Data Tracker was just updated to include projections up to 1/7/2023; this is a weekly update that posts like clockwork every Friday.

covid.cdc.gov/covid-data-tra…

This week, the national point estimate for XBB.1.5 is 27.6% (95% CI: 14.0-46.5%). This is a substantial decrease from last week’s point estimate of ~41%. You might be wondering why these numbers changed. Let’s take a look:

Genomic surveillance provides ongoing monitoring of circulating variants from across the US. Over 100 labs (public health, academic & commercial) gather positive specimens, sequence the viruses they contain, and share data with PH and researchers all over the world.

If you are interested, you can look at these sequences yourself, as the laboratories that generate them share them through public or open sequence repositories like @GISAID and @NCBI.

>> @GISAID: gisaid.org
>> @NCBI: ncbi.nlm.nih.gov/labs/virus/vss…

Even as these surveillance networks scale up, there is a significant amount of lag in the generation of these data: it takes time to test and confirm a patient result. It takes time to gather positive samples and information about them. It takes time to sequence/analyze/QC/submit them.

So for many ID surveillance programs, SARS-CoV-2 included, it can be a bit like driving while looking in the rear view mirror. The point where you have a representative and meaningful set of samples is almost always a few days or weeks behind. Sometimes less, sometimes far more.

@CDCgov uses a modeling approach that we call “Nowcast” to extrapolate what likely happened in those intervening days and give a better sense of where things are today. This takes into account differences in population, testing and a few other factors: github.com/CDCgov/SARS-Co…
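
To make the idea of extrapolating across the lag concrete, here is a toy sketch. It is not CDC's Nowcast (the real code is in the repository linked above); it simply fits a straight line to one variant's weekly share on the logit scale and projects it forward. The numbers in `observed_share` are made up for illustration, and the helper names (`fit_line`, `toy_nowcast`) are hypothetical. It also ignores the population and testing adjustments mentioned above.

```python
import math

def logit(p):
    """Map a proportion in (0, 1) to the log-odds scale."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def toy_nowcast(observed, weeks_ahead):
    """Project a variant's share forward by extending a logit-scale trend line."""
    xs = list(range(len(observed)))
    ys = [logit(p) for p in observed]
    a, b = fit_line(xs, ys)
    start = len(observed)
    return [inv_logit(a + b * (start + k)) for k in range(weeks_ahead)]

# Hypothetical weekly proportions for one variant (made-up, NOT CDC data).
observed_share = [0.01, 0.02, 0.04, 0.08]

# Three projected weeks, mirroring the three modeled time points on the dashboard.
projected = toy_nowcast(observed_share, weeks_ahead=3)
for week, share in enumerate(projected, start=1):
    print(f"week +{week}: projected share {share:.1%}")
```

With only four early data points, a small change in any one of them shifts the projected weeks a lot, which is the same sensitivity the thread describes once backfilled sequences arrive.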

Let’s look at LAST WEEK’s 41% (95%CI:23-61%) estimate:
This is the dashboard with the NOWCAST turned ON. Note that the last three time points are modeled based on historical data, and that the last week with reliable actual weighted proportions is the week of 12/10/2022.

If we turn the NOWCAST OFF for last week's data, we get a better sense of what the underlying sequence data looked like at the time, without the overlay of modeled projections. A very different picture.

It’s challenging for any model to accurately predict the future – and this can be particularly true for emerging pathogens where a myriad of complex epidemiologic and ecologic factors can impact their local or overall trajectory.

As more data become available, the confidence intervals tighten, and estimates get more and more precise until the data eventually catch up with the model.
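
A rough way to see why intervals tighten with more data: the sketch below computes a plain binomial (Wilson) 95% interval for the same observed proportion at three sample sizes. This is an illustration of the general principle only. The dashboard's intervals come from the Nowcast model and weighted data, not from this formula, and the sample sizes here are hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return center - half, center + half

# Hypothetical numbers of sequenced samples; the share is held near 27.6%.
sample_sizes = (50, 500, 5000)
widths = []
for n in sample_sizes:
    k = round(0.276 * n)          # samples assigned to the variant of interest
    low, high = wilson_interval(k, n)
    widths.append(high - low)
    print(f"n={n:5d}: {k / n:.1%} (95% CI: {low:.1%}-{high:.1%})")
```

The point estimate barely moves across the three rows, but the interval width shrinks as n grows, which is the behavior to expect as backfilled sequences come in.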

Another factor was simply the holidays: many testing and sequencing programs across the country were running at some level of decreased capacity, and there’s a noticeable bump in backfill of sequence data to public repositories for December.

Thank you to the hundreds of labs and thousands of scientists that are working to generate and share these data. It’s a hard and often thankless task, but so incredibly vital. Together, we've made remarkable progress, and the future for genomic pathogen surveillance is bright.
