Adam Kucharski
Epidemiologist/mathematician. Co-director of @LSHTM_CEPR and @TEDFellow. Author of The Rules of Contagion. Views own. https://t.co/vGi7dojauy

Dec 22, 2021, 21 tweets

Unfortunately, this oft-quoted Spectator tracker of COVID 'scenarios vs outcomes' seems at best muddled and at worst actively misleading. A thread on some weird comparisons - and how to do better critiques...
data.spectator.co.uk/category/sage-… 1/

First plot is comparison of scenarios for increased R values with later data. Crucially, these weren't predictions about what R would be (R estimate that week was 0.9-1.1, so pretty flat epidemic). Rather, report showed what could happen if R increased beyond current range... 2/

Not sure why they cut out the R=2 scenario from original plot (which would've made it obvious these were illustrative - assets.publishing.service.gov.uk/government/upl…). TBF SPI-M plot could have included horizontal line to illustrate what R=1 looks like, but don't need a model to draw a flat line... 3/
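
As a rough illustration (this is not the SPI-M model - the generation interval and starting level below are assumptions made purely for the sketch), here's how a fixed R maps onto growth versus a flat line:

```python
# Minimal sketch (not the SPI-M model): daily infections under a constant R,
# assuming a ~5-day generation interval and an illustrative starting level.
GEN_INTERVAL_DAYS = 5      # assumed mean generation interval (illustrative)
START_INFECTIONS = 1000    # illustrative starting daily infections
DAYS = 60

def trajectory(r, days=DAYS):
    """Project daily infections forward under a fixed reproduction number R."""
    daily_growth = r ** (1 / GEN_INTERVAL_DAYS)   # per-day multiplier implied by R
    return [START_INFECTIONS * daily_growth ** t for t in range(days)]

for r in (0.9, 1.0, 1.1, 1.5, 2.0):
    final = trajectory(r)[-1]
    print(f"R = {r:.1f}: day-{DAYS - 1} infections ~ {final:,.0f}")
# R = 1.0 is exactly the flat line; scenarios like R = 2.0 read as
# illustrative 'what ifs' rather than forecasts.
```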

In any case, the upper scenarios were not considered likely by members, as reported at the time: 4/

Next plot is July scenarios for July 19 reopening vs actual outcomes. Not sure why tracker cut out the crucial uncertainty intervals? Remember, uncertainty intervals aren't some superfluous addition to scenarios - they *are* the scenarios. 5/

The plot also focuses on bed occupancy, despite admissions being the metric shown across all scenarios. But picking this metric for comparison (rather than admissions, deaths etc.) means several scenarios are omitted - including all of those from Imperial... 6/

And if bed occupancy was the preferred metric, why not show all the plots that show scenarios for this metric (e.g. fig below from assets.publishing.service.gov.uk/government/upl…)? Why cut these ones out? 7/

Or even better, why not try and match up the parameters in the different scenarios (e.g. mobility, vaccine effectiveness, transmission) with subsequent known values from empirical data (or even re-run the model itself) to see how well things match up? 8/

Trying to extract the most relevant scenario would make it possible to flag any systematic biases in model estimates when using 'ground truth' parameters, allowing for better discussions about what aspects of scenarios are more/less reliable in future. 9/
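
A hedged sketch of what that could look like - the scenario names, parameters and observed values below are placeholders, not real SPI-M or UKHSA numbers:

```python
# Sketch of the idea in 8-9/: pick the scenario whose *assumed* parameters are
# closest to what actually happened, then compare its projection to the data.
# All values here are invented placeholders for illustration only.

scenarios = [
    {"name": "A", "vaccine_eff": 0.85, "mobility": 0.70, "projected_peak_admissions": 1500},
    {"name": "B", "vaccine_eff": 0.90, "mobility": 0.80, "projected_peak_admissions": 1000},
    {"name": "C", "vaccine_eff": 0.95, "mobility": 0.90, "projected_peak_admissions": 700},
]
observed = {"vaccine_eff": 0.91, "mobility": 0.78, "peak_admissions": 950}

def distance(scenario, obs):
    """Crude distance between a scenario's assumptions and the observed parameters."""
    return (abs(scenario["vaccine_eff"] - obs["vaccine_eff"])
            + abs(scenario["mobility"] - obs["mobility"]))

best = min(scenarios, key=lambda s: distance(s, observed))
bias = best["projected_peak_admissions"] - observed["peak_admissions"]
print(f"Closest scenario: {best['name']}, projection error: {bias:+d} admissions")
```

Repeating that kind of check across reports would show whether any over- or under-prediction is systematic, rather than cherry-picking single curves.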

It's also strange to cut off the plot at the end of September, just as some scenarios start to dip below actual hospitalisations - which remained flattish for a prolonged period in reality as caution persisted. Why pick this arbitrary cropping, rather than showing the rest of autumn? 10/

Next plot shows modelling scenarios for 'nothing changes' in autumn 2020 vs what actually happened (i.e. two lockdowns). So fundamentally a redundant comparison, because models weren't trying to estimate impact of a lockdown scenario... 11/

I mean, the report was pretty clear that these weren't predictions (assets.publishing.service.gov.uk/government/upl…) - that would have meant trying to predict policy decisions, which doesn't make sense for analysis designed to inform decisions... 12/

But if we did compare totals in the plot, we'd find central model estimates between Oct and April ranged from 100k to 230k deaths, compared to 75k in reality (after lockdown, Alpha & vaccination). Think about that for a moment... 13/

If I asked you to estimate how many deaths control measures & vaccines averted last winter, what would you say? The totals above suggest control measures & vaccines prevented somewhere in the range of 25-155k deaths last winter - and that's if anything an underestimate, because the models weren't accounting for the higher impact of Alpha... 14/
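
The arithmetic, using the figures quoted above:

```python
# Difference between the 'nothing changes' central estimates (Oct-Apr)
# and what actually happened, as quoted in this thread.
central_low, central_high = 100_000, 230_000   # central model estimates, Oct-Apr
actual_deaths = 75_000                          # outcome after lockdown, Alpha & vaccination
averted_low = central_low - actual_deaths       # 25,000
averted_high = central_high - actual_deaths     # 155,000
print(f"Implied deaths averted: {averted_low:,} to {averted_high:,}")
```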

Next plot compares reopening scenarios with hospitalisations in 9 June report. Again, uncertainty intervals have been cut out for some reason. 15/

And it looks like something odd has happened with the data extraction, e.g. compare the middle estimate (left) with the model output (blue line, right). Also, again strange to cut off the plot at the end of July, immediately after reopening - why not show the later dynamics? 16/

TBF on data side, would be helpful to have more routine release of underlying values behind plots (as in UKHSA technical reports). Also, see earlier point about extracting scenario that matches now-known parameters to get a better idea of underlying model performance... 17/

Next plot is a simple case scenario vs outcome from Sep 2020. In reality, growth was indeed not sustained at that fast a rate, likely influenced by incoming measures and underlying behaviour change (as we've seen repeatedly during the pandemic)... 18/

However, it's odd to only quote the scenario for cases, rather than the accompanying warning of 200 daily deaths by mid-November - which actually turned out to be optimistic... 19/

I get that any visualisations have to make some design choices, but it is strange that pretty much every single choice made above ends up making models look artificially overconfident and pessimistic... 20/

Models need proper scrutiny and challenge, especially as COVID dynamics are influenced by feedbacks between population behaviour & policy - unlike weather, where analysis doesn't change the outcomes. But muddled comparisons don't help anyone, they just sow confusion & anger. /End
