Over the past few weeks, we’ve noticed that newsrooms of all sizes—and even some government agencies—have fallen into some of the data potholes that we’ve become familiar with in our year of wrangling public COVID-19 data.
So today, we’re offering a brief cheat sheet on avoiding some of the most common errors we’ve seen. covidtracking.com/analysis-updat…
Tip 1: If you see dramatic movement in the data, look for contextual clues before interpreting it as a change in the pandemic. Day-of-week effects in data arranged by date of report produce predictable reporting swings over the course of each week.
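One common way to dampen day-of-week swings is a trailing 7-day average, so every value covers exactly one of each weekday. The sketch below is illustrative only: the function name and the daily counts are hypothetical, not taken from any real dataset.

```python
from statistics import mean

def seven_day_average(daily_counts, window=7):
    """Trailing average over `window` days; None until a full window exists."""
    averages = []
    for i in range(len(daily_counts)):
        if i + 1 < window:
            averages.append(None)  # not enough history yet
        else:
            averages.append(mean(daily_counts[i + 1 - window : i + 1]))
    return averages

# Hypothetical two weeks of reported cases with a dip every weekend.
daily = [100, 110, 120, 115, 105, 60, 50, 100, 110, 120, 115, 105, 60, 50]
smoothed = seven_day_average(daily)

# The raw series swings between 50 and 120, but every full 7-day window
# has the same average, so the smoothed series is flat.
print(smoothed[6:])
```

A flat smoothed line over a swinging raw line suggests the movement is a reporting rhythm rather than a change in the pandemic.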
Tip 2: Data backlogs—and the “data dumps” that occur when those backlogs are resolved—can mimic major declines and then jumps, especially in cases, tests, and deaths. Look for explanations on state dashboards and call public health officials.
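As a rough first pass before checking dashboards or calling officials, you can flag days that sit far above the recent baseline. This is a minimal sketch under stated assumptions: the 7-day window, the 3x ratio, and the sample numbers are all arbitrary and hypothetical, and a flagged day is only a prompt to look for an explanation, not proof of a backlog.

```python
from statistics import median

def flag_possible_dumps(daily_counts, window=7, ratio=3.0):
    """Return indexes of days more than `ratio` times the median of the prior `window` days."""
    flagged = []
    for i in range(window, len(daily_counts)):
        baseline = median(daily_counts[i - window : i])
        if baseline > 0 and daily_counts[i] > ratio * baseline:
            flagged.append(i)
    return flagged

# Hypothetical series: two very low days (a backlog building up),
# then one very high day (the backlog being cleared).
daily = [100, 95, 105, 98, 102, 97, 103, 20, 15, 480, 101]
print(flag_possible_dumps(daily))
```

The median is used instead of the mean so that the low backlog days do not drag the baseline down much.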
Tip 3: Holiday and weather-related reporting issues happen when national or natural events occur across many states at once, and can mimic shifts in the pandemic. Look for holidays or major disruptions that might have artificially depressed—and then inflated—the data.
Tip 4: Watch out for definitional mismatches and alternate dating schemes. Be aware that different jurisdictions chose different ways of defining and reporting their metrics.
Tip 5: Get familiar with caveats. The most recent dates in epidemiological datasets are always incomplete—because, for example, the data points for people who died today won’t finish being reported for many days, weeks, or even months in the future.
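For data arranged by date of death or date of symptom onset, one simple precaution is to leave the most recent days out of any trend calculation. The sketch below assumes a 14-day lag purely for illustration; the right cutoff depends on the jurisdiction and metric, and the function name and rows are hypothetical.

```python
from datetime import date, timedelta

def drop_incomplete_days(rows, as_of, lag_days=14):
    """Keep only (day, count) rows dated at least `lag_days` before `as_of`."""
    cutoff = as_of - timedelta(days=lag_days)
    return [(day, count) for day, count in rows if day <= cutoff]

# Hypothetical deaths by date of death, pulled on March 22, 2021.
rows = [
    (date(2021, 3, 1), 50),
    (date(2021, 3, 10), 40),  # likely still being filled in
    (date(2021, 3, 20), 5),   # almost certainly incomplete
]
stable = drop_incomplete_days(rows, as_of=date(2021, 3, 22))
print(stable)
```

Without a cutoff like this, the apparent drop from 50 to 5 could be read as a decline when it may only reflect reports that have not arrived yet.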
Tip 6: Be cautious about what the data can say. If you’re trying to extract insights from the data itself, it can be very easy—especially within a headline—to make causal claims when only correlative evidence is available.
Read all our advice from @kissane and @jessicamalaty here: covidtracking.com/analysis-updat…


More from @COVID19Tracking

31 Mar
Over the past 10 months, we've tried to determine how COVID-19 affected some of the people most vulnerable to the virus: residents of long-term-care facilities.
Based on official state figures compiled by our team, as of March 4, 2021, at least 174,474 people had died of COVID-19 in long-term-care facilities. This represents 34% of the total deaths due to COVID-19 in the US. covidtracking.com/ltc-topline-es…
We estimate that as of March 2021, about 8 percent of people who live in US long-term-care facilities have died of COVID-19: Nearly one in 12. covidtracking.com/analysis-updat…
22 Mar
While we’ve stopped our data collection, we’ve put together a group of guides to help you navigate and understand federal COVID data.

On Friday we published our latest guide, this one on federal race and ethnicity data. We explain where you can find it, and what you need to know about its limitations.

You can find guides to federal COVID testing, case, death, hospitalization and nursing home data here.

19 Mar
Our Federal Data 101 about race and ethnicity data is published.

Publicly available federal race and ethnicity COVID-19 data is currently usable and improving, although it shares many of the problems we’ve found in state-reported data.

Federal race and ethnicity COVID-19 data is not comprehensive enough to represent people’s experience of the pandemic in the United States. Most data is only available nationally, not by state.
[Image: Two bar charts from the CDC site, one showing cases by race/…]
The federal data can be better, by collecting and publishing race and ethnicity data more consistently and comprehensively, presenting the data in clear, accessible ways, and being transparent about data sources and contexts.
19 Mar
For many weeks now, the number of cases and hospitalizations has been going down across the country. Unfortunately, that trend has now reversed in the state of Michigan. Cases *and* hospitalizations are both on the rise there.
[Image: 4 bar charts with 7-day averages showing cases, …]
There had been some hopes that if we did see cases rise somewhere, hospitalizations would not follow because many vulnerable people have been vaccinated. But Michigan hospitalizations have increased 45% from their February low.

Two important pieces of context: Statewide, just 28% of Black residents 65+ are known to have received a first dose of vaccine. Though that data is incomplete, CDC numbers show that 66% of the U.S. population aged 65+ has received at least one dose of vaccine.
15 Mar
Here's the latest in our ongoing effort to help data users find, understand, and use federal COVID-19 data.

We've created a bit of code that combines federal testing, case, death, and hospitalization data in a single spreadsheet.

One major caveat—we are not committed to maintaining this script should the federal data pages undergo material changes. This is simply a set of instructions for interested data users (and an example of what's possible with federal data).
For inexperienced data users, this process is no more than 2 clicks. For users familiar with Python and pandas, feel free to take this code as a starting point for further exploration.
11 Mar
We’ve concluded our data collection, but fear not: we’ve put together a bunch of resources to help you find COVID data.
First, here’s all the data and metadata we collected over the past year. covidtracking.com/about-data/dat…
If you’d like to see charts and data visualizations, here’s where to find simple topline data. covidtracking.com/analysis-updat…