Disclaimer: with that said, this is still a simple heuristic and hence is not perfect. There are more advanced methods (e.g. see covidestim.org).
The basic idea is this: for each day, we try to estimate the ratio of true infections to reported cases that day.
We call this the prevalence ratio, and we model this ratio as a function of the day and positivity rate:
What does this mean? Two things:
1) As test positivity increases, the prevalence ratio increases because there is not enough testing.
In the plot below, a 20% test positivity corresponds to a prevalence ratio of 4, meaning that there are 4 true infections for every reported case.
2) As time goes on, the prevalence ratio decreases due to increased testing availability.
The reasoning is intuitive: If testing is widespread, we are catching more cases.
The increase in testing was much more rapid in the beginning and has tapered off in recent months.
When you combine the above two factors, you end up with different curves of the prevalence ratio depending on the day and positivity rate.
As you can see in the plot below, test positivity has a more muted effect on the prevalence ratio as time goes on.
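To make the shape of such a model concrete, here is a minimal Python sketch. The functional form and all three constants (`scale`, `offset`, `floor`) are hypothetical placeholders chosen for illustration; they are not the fitted values behind the plots. The sketch is only meant to show the two properties described above: the ratio rises with positivity and falls with time, so the gap between low and high positivity narrows on later days.

```python
import math


def prevalence_ratio(day, positivity, scale=1000.0, offset=50.0, floor=1.0):
    """Hypothetical prevalence-ratio curve (illustration only).

    day        -- days since the start of the tracking period (>= 0)
    positivity -- test positivity as a fraction, e.g. 0.20 for 20%
    scale, offset, floor -- made-up placeholder constants, NOT fitted values
    """
    if day < 0 or not 0.0 <= positivity <= 1.0:
        raise ValueError("day must be >= 0 and positivity must be in [0, 1]")
    # The time factor shrinks as testing availability improves.
    time_factor = scale / (day + offset)
    # Higher positivity means more missed infections; the floor keeps the
    # ratio at or above 1 (true infections never fall below reported cases).
    return time_factor * math.sqrt(positivity) + floor


if __name__ == "__main__":
    for day in (60, 150, 270):
        low = prevalence_ratio(day, 0.05)
        high = prevalence_ratio(day, 0.20)
        print(f"day {day}: 5% -> {low:.2f}, 20% -> {high:.2f}, gap {high - low:.2f}")
```

For the real coefficients, see the write-up; the numbers printed here carry no epidemiological meaning.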
Using the above equations, we can plot the test positivity and prevalence ratio for the US over time.
You can see that we believe the current ratio is around 3.4, meaning that we are catching 1 out of every 3-4 infections (~30% detection rate).
We can then multiply each day's prevalence ratio by that day's reported cases to estimate the true infections (from 14 days earlier, after accounting for lag).
We can do this for each individual state and take the sum, or for the US as a whole. The results are quite similar.
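The multiply-and-shift step can be sketched in a few lines of Python. This is a simplified illustration, not the production code: the function names are made up, the inputs are plain lists with one entry per day, and the toy numbers in the usage example are invented.

```python
def estimate_true_infections(reported_cases, prevalence_ratios, lag_days=14):
    """Estimate true infections, indexed by the day the infections occurred.

    Cases reported on day d, times the prevalence ratio on day d, are
    attributed to day d - lag_days. Reports from the first lag_days days map
    to dates before the series starts, so they are dropped.
    """
    if len(reported_cases) != len(prevalence_ratios):
        raise ValueError("inputs must cover the same days")
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    return [
        cases * ratio
        for cases, ratio in zip(reported_cases[lag_days:], prevalence_ratios[lag_days:])
    ]


def detection_rate(prevalence_ratio):
    """Share of true infections that show up as reported cases."""
    return 1.0 / prevalence_ratio


if __name__ == "__main__":
    # Invented toy data with a 2-day lag so the shift is easy to follow.
    cases = [100.0, 200.0, 300.0, 400.0]
    ratios = [4.0, 4.0, 3.0, 3.0]
    print(estimate_true_infections(cases, ratios, lag_days=2))
    print(f"{detection_rate(3.4):.0%}")
```

A ratio of 3.4 gives a detection rate of 1/3.4, which is where the ~30% figure comes from. Running the same function per state and summing the results gives the state-based national estimate.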
Here's what the estimated true infections look like for a few select states (normalized by population).
Out of the large states, Illinois currently has the largest outbreak, but that may not be the case for long.
We can apply the same heuristic to estimate what true infections look like based on age group. This is based on the CDC COVIDView data:
One thing I have not yet mentioned is the calculation of the positivity rate. Each state has its own standard for reporting test results, and we must first standardize them before computing test positivity.
While I've seen many people cite South Dakota's 50%+ test positivity rate, the adjusted test positivity is closer to 20-25%. This is true for several other states as well.
If you want more details, see the @COVID19Tracking post above or the write-up (warning: it's complicated).
We can also use the true infections and reported deaths to estimate an implied infection fatality rate (IIFR).
The IIFR has fallen from >1% in the spring to ~0.5% now.
After adjusting for age, possible undercounting and excess deaths, we believe the true IFR is ~0.7%.
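As a rough sketch of the IIFR calculation: divide reported deaths by the estimated true infections from some number of days earlier. The function name and the 21-day default lag are assumptions made for illustration (the lag is not stated in this thread), and the toy numbers are invented.

```python
def implied_ifr(true_infections, reported_deaths, lag_days=21):
    """Implied infection fatality rate over the overlapping window.

    Deaths reported on day d are matched with infections estimated for day
    d - lag_days. lag_days is a hypothetical infection-to-death delay.
    """
    n = len(true_infections)
    if n != len(reported_deaths):
        raise ValueError("inputs must cover the same days")
    if not 0 <= lag_days < n:
        raise ValueError("lag_days must be between 0 and the series length")
    infections = sum(true_infections[: n - lag_days])
    deaths = sum(reported_deaths[lag_days:])
    if infections == 0:
        raise ValueError("no infections in the matched window")
    return deaths / infections


if __name__ == "__main__":
    # Invented toy data: 1,000 infections/day, deaths appear 2 days later.
    infections = [1000.0] * 5
    deaths = [0.0, 0.0, 5.0, 5.0, 5.0]
    print(f"{implied_ifr(infections, deaths, lag_days=2):.2%}")
```

This gives the implied rate only; the age, undercounting, and excess-death adjustments behind the ~0.7% figure are not modeled here.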
That about sums it up. As always, let me know if there are any errors / suggestions / feedback.
These estimates are tuned based on a limited sample of randomized serology surveys. So if you come across any new ones in the US, please send them my way.
On a related note, I'm glad that the open-access journal @eLife is transitioning to a "publish then review" model.
While there are always tradeoffs, this will hopefully lead to faster turnaround times for time-sensitive issues (such as a pandemic).
I deployed some new features to covid19-projections.com over the past week. Here's a brief summary:
1) Maps over time - you can now view how the pandemic progresses over time for the US, on both a state and county level: covid19-projections.com/maps-infection…
2) Plots of confirmed cases and deaths for every state and county in the US (in addition to estimates of true infections).
Last week, Illinois reported 15,415 cases in a single day, more than Florida ever did in a single day. This is despite Illinois' population being 40% lower.
Many of you probably did not know the dire situation in Illinois. That's because no mainstream media chose to report it.
Here is how the media chose to report Illinois now (left) vs Florida in July (right).
Unfortunately, no national news outlet is covering the situation in Illinois.
No other state has ever averaged 12,000 cases a day for a whole week. Not even Florida (1.7x pop), California (3x pop), or Texas (2.3x pop).
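To put the 12,000/day figure on a per-capita footing, here is a quick Python sketch that scales it by the population multiples quoted above. The function name is made up for illustration, and the multiples are the rounded ones from this thread, so treat the outputs as rough.

```python
# Population of each state relative to Illinois (rounded figures from this thread).
POP_MULTIPLE_VS_ILLINOIS = {"Florida": 1.7, "California": 3.0, "Texas": 2.3}


def population_equivalent(cases_per_day, pop_multiple):
    """Daily cases a larger state would need to match the same per-capita rate."""
    return cases_per_day * pop_multiple


if __name__ == "__main__":
    illinois_daily_cases = 12_000
    for state, multiple in POP_MULTIPLE_VS_ILLINOIS.items():
        equivalent = population_equivalent(illinois_daily_cases, multiple)
        print(f"{state}: about {equivalent:,.0f} cases/day")
```

In other words, Florida would need to average roughly 20,000 cases a day to match Illinois per capita.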
For deaths per capita, Illinois also exceeded the peak deaths in Florida twice, once in May and once again now. So why is this not news?
Using this map, you can see that many counties in the Northeast and Northwest still have very low rates of prevalence, and thus are susceptible to a future wave.
A few more observations:
Zooming in to the Midwest, it seems like counties in Minnesota have a lower prevalence than those in neighboring states.
The western side of Kansas also shows very high prevalence, but prevalence is many times lower just across the border in eastern Colorado.
If you're under 50, your odds of dying if you contract COVID-19 are ~0.013%, or about 1 in 8,000. This is similar to the odds of dying in a car accident in a year.
BUT many infected people will go on to infect others. In this thread I'll explain why we cannot treat the two risks equally.
The current Rt for the SARS-CoV-2 virus in the US is ~1. This means that an infected person will infect, on average, 1 other person. That person will infect another, and so on.
After 3 months, ~20 infections can be indirectly attributed to that 1st case.
If Rt increases to 1.2, then 130 people will have been infected after 3 months. All stemming from 1 infection.
The chance that at least 1 person among the 130 will die is non-trivial (~50%).
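Here is a minimal Python sketch of the arithmetic behind these numbers. Two inputs are my assumptions rather than figures stated in this thread: roughly 17 generations of spread in 3 months (a generation interval of a bit over 5 days) and an overall fatality rate of 0.5% across all ages. The function names are made up for illustration.

```python
def attributed_infections(rt, generations):
    """Total infections downstream of one case after a number of generations.

    Generation k contributes rt**k infections, so the total is the geometric
    sum rt + rt**2 + ... + rt**generations (the first case is not counted).
    """
    return sum(rt ** k for k in range(1, generations + 1))


def prob_at_least_one_death(n_infected, ifr):
    """Chance that at least one of n_infected people dies, assuming
    independent outcomes with the same fatality rate for everyone."""
    return 1.0 - (1.0 - ifr) ** n_infected


if __name__ == "__main__":
    generations = 17  # assumed: about 3 months of spread
    for rt in (1.0, 1.2):
        total = attributed_infections(rt, generations)
        print(f"Rt = {rt}: about {total:.0f} attributed infections")
    # Assumed overall fatality rate of 0.5% across all ages.
    print(f"P(at least one death among 130) = {prob_at_least_one_death(130, 0.005):.0%}")
```

With these assumptions the sketch gives 17 infections at Rt = 1, roughly 127 at Rt = 1.2, and a bit under 50% for the chance of at least one death among 130 people, in line with the rounded figures above. A small change in Rt compounds over many generations, which is why the two totals differ so much.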
That's why we need to view COVID-19 as a *community risk*, not an *individual risk*.
We increased our forecasts over the past week after incorporating several new factors:
- A potential loss of immunity after >6 months
- Further relaxation of policies
- Increased interactions (school reopenings, return to work, etc)
- Plateau in cases/hospitalizations
There is currently a lack of consensus among the top models about the short-term death forecasts.
Our model and the COVIDhub ensemble model both suggest a possible plateau in reported deaths over the next few weeks.
This week's forecast is a slight uptick from past weeks' forecasts.
We believe new infections may be flattening at 2x the level they were at back in May. That is a cause for concern.
Which direction new infections will go is still uncertain, at least from the data.
We may see cases plateau at around the 40k/day mark. Cases may increase in the near future, but it's unclear if that'll be due to backlog/increased testing or due to a true rise.
Hence, test positivity and hospitalizations are better metrics to monitor.