Let me explain why using models to simulate epidemics basically doesn't work when it comes to predicting the future (and why it doesn't mean models are useless). And also why the “reproduction number” is a very problematic quantity. •1/36
When the epidemic starts, we see a nice predictable exponential growth. What we would like our models to tell us is when and how this exponential growth will slow down and stop, how many people will be infected and how many will die. Sorry, but that won't work. •2/36
Of course we have a model which can describe a simple epidemic with constant contagiousness in a homogeneous population: this model is called SIR. I talked about it here: threadreaderapp.com/thread/1236324… … •3/36
… or on my blog (in French, but Google Translate should do a decent job): madore.org/~david/weblog/… •4/36
… or in this more technical note (in which I compare SIR to a variant of it in which people recover in constant time rather than through an exponential process): hal.archives-ouvertes.fr/hal-02537265 •5/36
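For readers who have not seen SIR before, here is a minimal sketch of it in Python, not the code behind the linked texts. It tracks the susceptible, infected and recovered fractions of a homogeneous population with simple Euler steps. The reproduction number of 3, the recovery rate of 0.1 per day and the initial seed are illustrative assumptions, and the function name `simulate_sir` is made up for this example.

```python
def simulate_sir(r0, recovery_rate=0.1, initial_infected=1e-6, dt=0.05, days=1000):
    """Integrate the SIR equations with forward Euler steps.

    s, i, r are fractions of the population (susceptible, infected, recovered).
    The contact rate is beta = r0 * recovery_rate.
    Returns a list of (time, s, i, r) tuples.
    All default values are illustrative assumptions, not fitted to any data.
    """
    beta = r0 * recovery_rate
    s, i, r = 1.0 - initial_infected, initial_infected, 0.0
    history = [(0.0, s, i, r)]
    for step in range(1, round(days / dt) + 1):
        new_infections = beta * s * i * dt   # S -> I
        new_recoveries = recovery_rate * i * dt  # I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


history = simulate_sir(r0=3.0)
# Attack rate: fraction of the population that was ever infected.
final_attack_rate = 1.0 - history[-1][1]
print(f"final attack rate: {final_attack_rate:.3f}")
```

In this idealized setting the attack rate depends only on the reproduction number, which is exactly the kind of clean statement that stops holding once the assumptions are relaxed.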
And of course SIR comes in many more sophisticated variants, like SEIR which takes into account incubation period, or variants with stratification by age, geography and so on. But whatever the variant, we still have some fundamental problems. •6/36
First, there are the unknowns on the medical side: we don't know how many cases of Covid-19 (and asymptomatics) there really are. Tests are being done but the results contradict each other in crazy ways, suggesting that there are more unknowns than we thought. •7/36
And of course, with the number of cases being unknown (and the number of deaths also unknown, though not to the same extent), the fatality rate is as well. Estimates vary from 0.1% to over 5%, which tells you how little we know! It probably depends on many factors. •8/36
Now what a model like SIR tells you is something like “the epidemic has exponential growth until a significant amount of immunity slows it down and then it peaks”. If you don't know how many cases you have, you don't know the immunity, so the model tells you little. •9/36
Now you think “maybe I can read this from the curve of past cases”, but no: the curve is far too noisy. And its exponential growth can slow down for a zillion different reasons, not least of which are lockdowns or people changing their habits (I'll return to this). •10/36
But this is still not the end of the story. Another important aspect is that even if people don't change their habits and even if we know exactly the number of infections, there are still too many unknowns because of social effects. Let me explain this. •11/36
I already wrote a long thread with various simulations on this (and why it can lower the final attack rate), threadreaderapp.com/thread/1241745…, but let me try to explain from a different angle. •12/36
Consider the same epidemic in two different, idealized countries: in country A the population is fairly homogeneous, the epidemic follows the SIR model accurately; in country B, the population is divided into two different sub-populations: … •13/36
… population B₁ behaves just like country A, but in population B₂ the epidemic has a much lower contagiousness because these people have fewer contacts; B₁ and B₂ could be geographical (e.g., urban and rural) but they could also be social categories. •14/36
Now when you start observing the epidemic, countries A and B behave the same: but in country A the epidemic is infecting the whole country, in B it's only infecting B₁ and getting nowhere with B₂. Only you might not notice that because the stats don't tell you. •15/36
If it's urban/rural you'll probably catch on to the B₁/B₂ distinction, but if it's more subtle socioeconomic factors you might not be able to pick them out in the data at all. So you think A and B are doing the same, and you fit the constants of your models from that. •16/36
But then suddenly in B you reach the point where B₁ starts having significant immunity and the epidemic slows down, whereas in A you don't because you need the whole country to have immunity. And you end up with very different attack rates: … •17/36
… because in A the epidemic will infect x% of the population whereas in B it will infect x% of B₁ and almost nothing of B₂. Despite the fact that they seemed to behave identically and have the same parameters! But you were just observing B₁ and not knowing about B₂. •18/36
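The A versus B comparison can be sketched numerically. This is a toy illustration under loud assumptions: B₁ and B₂ are treated as completely uncoupled, B₁ is half of country B, the reproduction number is 3 in A and B₁ and 0.5 in B₂, and each group is seeded with the same tiny infected fraction. None of these numbers come from data, and the helper names are hypothetical.

```python
import math


def run_sir(r0, recovery_rate=0.1, seed=1e-6, dt=0.05, days=1500):
    """Forward-Euler SIR run.

    Returns (attack_rate, infected), where infected[k] is the infected
    fraction at time k * dt.
    """
    beta = r0 * recovery_rate
    s, i = 1.0 - seed, seed
    infected = []
    for _ in range(round(days / dt)):
        infected.append(i)
        new_infections = beta * s * i * dt
        s -= new_infections
        i += new_infections - recovery_rate * i * dt
    return 1.0 - s, infected


def early_growth_rate(infected, dt=0.05, start_day=20, end_day=40):
    """Exponential growth rate per day, read off the early part of the curve."""
    a, b = round(start_day / dt), round(end_day / dt)
    return math.log(infected[b] / infected[a]) / (end_day - start_day)


share_b1 = 0.5  # assumed share of country B that belongs to B1

# Country A: one homogeneous population.
attack_a, curve_a = run_sir(3.0)

# Country B: two uncoupled subpopulations (an idealization).
attack_b1, curve_b1 = run_sir(3.0)
attack_b2, curve_b2 = run_sir(0.5)
curve_b = [share_b1 * x + (1 - share_b1) * y for x, y in zip(curve_b1, curve_b2)]
attack_b = share_b1 * attack_b1 + (1 - share_b1) * attack_b2

growth_a = early_growth_rate(curve_a)
growth_b = early_growth_rate(curve_b)
print(f"early growth per day: A={growth_a:.3f}  B={growth_b:.3f}")
print(f"final attack rate:    A={attack_a:.3f}  B={attack_b:.3f}")
```

The two early growth rates come out nearly identical while the final attack rates differ by roughly a factor of two, which is the point of the thought experiment: the early curve alone does not tell you which country you are in.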
This, of course, is highly simplified, but shows the sort of things that can happen. When we talk about the “reproduction number” of an epidemic, this is a synthetic figure which is so synthetic that it becomes almost meaningless: … •19/36
… the epidemic is progressing in different subpopulations of the population with different dynamics, basically exponentials. But in a sum of exponentials you only see the exponential with the fastest growth! The rest are drowned in the noise. •20/36
So when we say we observe a reproduction number of 3, what this really means is that in the fastest-growing subpopulation the epidemic has this reproduction number. It tells us nothing of the size of this population or of the reproduction number in other subpopulations. •21/36
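A small sketch of the “sum of exponentials” effect, with made-up numbers: a fast subpopulation growing at 0.20 per day is mixed with a slow one growing at 0.05 per day, and the slow one is given two different starting sizes. The growth rate is then fitted the usual way, as the slope of the logarithm of total cases. The initial counts, rates and observation window are all assumptions chosen for illustration.

```python
import math


def fitted_growth_rate(times, counts):
    """Least-squares slope of log(counts) against time."""
    n = len(times)
    logs = [math.log(c) for c in counts]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    var = sum((t - mean_t) ** 2 for t in times)
    return cov / var


def total_cases(times, groups):
    """groups is a list of (initial_cases, growth_rate) pairs, one per subpopulation."""
    return [sum(a * math.exp(rate * t) for a, rate in groups) for t in times]


times = list(range(20, 41))  # observe days 20 to 40

# Same fast group, slow group of two different sizes (hypothetical values).
small_slow_group = [(10, 0.20), (10, 0.05)]
large_slow_group = [(10, 0.20), (30, 0.05)]

rate_small = fitted_growth_rate(times, total_cases(times, small_slow_group))
rate_large = fitted_growth_rate(times, total_cases(times, large_slow_group))
print(f"fitted rate, small slow group: {rate_small:.3f}")
print(f"fitted rate, large slow group: {rate_large:.3f}")
```

Both fits land close to the fast group's 0.20 per day, even though the slow group is three times larger in the second case. The fitted number reflects the fastest-growing component and says very little about the rest.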
So unless we can gather extremely fine geographical and socioeconomic data on the people who are infected (which we do NOT have), we have simply no way of knowing how the epidemic is behaving in different subpopulations or how it will behave in the future. •22/36
And this whole problem is made far worse by the fact that people adapt their behavior based on the news they hear and how afraid they are of the epidemic, not just on measures the authorities decree (lockdowns and such). •23/36
So if you see the exponential growth start to slow down, it's extremely hard to tell whether that's because some subpopulation is becoming significantly immune, or because people have changed how they behave. •24/36
Needless to say, there is no satisfactory modelling of how people react to news about the epidemic and change their behavior, or what effect these changes in behavior have on the contagiousness of the epidemic. •25/36
Oh yeah, did I mention that we don't even know how, where and by whom most infections happen? We don't. So even if we could accurately measure or predict how many fewer contacts people have at work, or with friends, or in public transport, we still couldn't tell the effect. •26/36
So, in effect, trying to predict the epidemic's behavior using models is like trying to predict the weather two weeks from now using just a thermometer in a few major regional capitals. Oh, and the thermometers aren't even calibrated the same way! •27/36
(Because, yeah, various countries or states have entirely different ways of measuring the number of cases and even the number of deaths. And then they report them in various ways. And they don't even accurately document what they report. It's a mess.) •28/36
Of course you can try to build a highly sophisticated model with a zillion different parameters accounting for geographical and socioeconomic subpopulations, different reactions to the epidemic, medical complexities, subpar reporting, imported cases, the full monty. •29/36
But then you're basically building a model of all of society, and there's a reason why psychohistory doesn't exist: with too many free parameters, a model becomes simply useless predictionwise. You can fit anything by twiddling the parameters. •30/36
THIS DOESN'T MEAN MODELS ARE USELESS. They're just useless for making predictions. But they're useful for understanding the SORT of phenomena which CAN happen. Even if you have just a bunch of thermometers you can still theorize some interesting things about weather: … •31/36
… like what an anticyclone is and how it tends to behave. I've explained in texts linked above how “social effects can lower attack rates” and “constant recovery time can make the infection peak sharper”: I can't predict by how much, but these are effects worth noting. •32/36
But everyone running simulations on computer models, no matter how sophisticated (perhaps even MORE so if the models are sophisticated and have many parameters) should be VERY MODEST when putting forward what their models predict as an indication of the future. •33/36
(EVEN if they got it right so far: remember, a trivial model predicting an endless exponential growth will be correct up to a certain point, where it suddenly ceases to be correct. The trick is to predict that point and maybe you can't do it better than a hunch.) •34/36
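To make the “correct up to a certain point” remark concrete, here is a sketch that runs a plain SIR epidemic next to a naive endless-exponential forecast and records the first day the naive forecast overshoots by more than 10%. The SIR parameters (reproduction number 3, recovery rate 0.1 per day), the 10% tolerance and the function name are assumptions for illustration only, and in a real epidemic the breaking point would depend on everything discussed above.

```python
def exponential_vs_sir(r0=3.0, recovery_rate=0.1, seed=1e-6,
                       dt=0.05, days=200, tolerance=0.10):
    """Compare SIR infections with a naive endless-exponential forecast.

    Returns (divergence_day, peak_day): the first day on which the naive
    forecast exceeds the SIR infected fraction by more than `tolerance`,
    and the day on which the SIR infected fraction peaks.
    """
    beta = r0 * recovery_rate
    # Same Euler discretization for both curves, so they agree while s is about 1.
    growth_per_step = 1.0 + (beta - recovery_rate) * dt
    s, i = 1.0 - seed, seed
    naive = seed
    divergence_day = None
    peak_day, peak_i = 0.0, i
    for step in range(1, round(days / dt) + 1):
        new_infections = beta * s * i * dt
        s -= new_infections
        i += new_infections - recovery_rate * i * dt
        naive *= growth_per_step
        t = step * dt
        if divergence_day is None and naive > (1.0 + tolerance) * i:
            divergence_day = t
        if i > peak_i:
            peak_day, peak_i = t, i
    return divergence_day, peak_day


divergence_day, peak_day = exponential_vs_sir()
print(f"naive forecast off by more than 10% from day {divergence_day:.1f}")
print(f"SIR infections peak on day {peak_day:.1f}")
```

For weeks the two curves are indistinguishable, so a good track record up to that point says nothing about whether the model knows when the growth will break.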
Similarly, anyone using the term “reproduction number” should be aware that it's about as informative as the mean Earth temperature when talking about the weather: you don't know what subpopulation you're measuring it for, nor what influences it. •35/36
What worries me is that politicians will listen to epidemiologists who brag most loudly about their model's quality just like they tend to listen to physicians who brag most loudly about their miracle cure. And they might not be the most competent! •36/36