OK, some more explanations about these curves, which model the attack rate as a function of the variance of infectious contacts for a given reproduction number (here R₀=2.5), and about how to read them and how not to read them: •1/48

I wrote a different thread on the mathematics of how they were computed, but let me try to get across some more informal explanations and dispel some misunderstandings. •2/48

This is a very simple, even simplistic, kind of model! It assumes, inter alia, that the dynamics of the epidemic do not change with time (so the reproduction number is a constant); in fact, it doesn't even know about time. … •3/48

… But it allows some complexity in the form of heterogeneity / variance. The point of such “toy” models isn't to describe the real world accurately but to explore the effect of various phenomena one complexity at a time. I wrote about this elsewhere. •4/48

So, what are these “infectious contacts”? By an “infectious contact” x→y, here, we mean a contact between x and y which ① takes place during a typical infectious period for x, ② is counted regardless of whether y is already infected or immune, and ③ would make y infected if possible. •5/48

Clause ① is a bit tricky. If x is indeed infected at some point, we mean a contact while x is infected. But since we're only counting such contacts and assuming that the dynamics stay the same at all times (and ignoring time), the precise period doesn't matter. •6/48

(I mean, if we were to choose a different time period, it would give a different set of infected people, but since we only care about the number of people who get infected, this doesn't matter.) Even if x is never infected, pick an infectious period for them. •7/48

Regarding y, we ignore the fact that they may already have been infected and become immune. We just care that they would be infected by the contact if they were susceptible: this defines the concept of an infectious contact x→y. •8/48

The key point is that if x gets infected (at some point) and there is an infectious contact x→y, then y gets infected. So the set of people who get infected is the set of all people reachable from some infectious seed (patient zero) x₀ in this “infection graph”. •9/48
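
As a minimal sketch of this reachability idea, here is a breadth-first search over a tiny hand-made infection graph. The graph, the names `infection_graph` and `infected_set`, and the people in it are purely illustrative assumptions, not data from the model behind the curves.

```python
from collections import deque

def infected_set(infection_graph, seed):
    """Return everyone reachable from `seed` by following infectious contacts x -> y."""
    infected = {seed}
    queue = deque([seed])
    while queue:
        x = queue.popleft()
        for y in infection_graph.get(x, ()):
            if y not in infected:      # y is infected the first time it is reached
                infected.add(y)
                queue.append(y)
    return infected

# Hypothetical toy graph: each key x maps to the list of y with x -> y.
infection_graph = {
    "x0": ["a", "b"],
    "a": ["c"],
    "b": ["a"],
    "c": [],
    "d": ["x0"],   # d could infect x0, but nobody ever reaches d
    "e": [],
}

reached = infected_set(infection_graph, "x0")
attack_rate = len(reached) / len(infection_graph)
print(sorted(reached), attack_rate)
```

Here the seed x0 reaches a, b and c but not d or e, so the attack rate of this toy epidemic is 4 out of 6.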

Then we try to use mathematical results on random graphs to count the proportion of people who get infected, i.e., the attack rate of the epidemic. The key figure, of course, is the number of infectious contacts per person. •10/48

The number of infectious contacts x→y for a given x, i.e., the number of people whom x infects if x is indeed infectious at some point, is called the out-degree of x. The number for a given y is called the in-degree of y (more tricky to interpret!). •11/48

Here's an obvious but important mathematical fact: the AVERAGE out-degree equals the AVERAGE in-degree. This number is called the “reproduction number” of the epidemic and denoted R₀ by epidemiologists (who really suck at notation btw). •12/48
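
The equality of the two averages can be checked on any list of arrows, since each arrow x→y adds one to exactly one out-degree and one in-degree. The sketch below assumes a made-up population of 1000 people and 2500 random contacts, with receivers deliberately chosen in a lopsided way; all names and numbers are illustrative.

```python
import random

rng = random.Random(0)
n_people = 1000
n_contacts = 2500   # chosen so that the mean degree is 2.5

out_degree = [0] * n_people
in_degree = [0] * n_people
for _ in range(n_contacts):
    x = rng.randrange(n_people)             # spreader chosen uniformly
    y = int(n_people * rng.random() ** 2)   # receiver skewed toward low indices
    # Self-contacts (x == y) are not excluded, to keep the sketch short.
    out_degree[x] += 1
    in_degree[y] += 1

mean_out = sum(out_degree) / n_people
mean_in = sum(in_degree) / n_people
print(mean_out, mean_in)   # both are n_contacts / n_people
```

However unevenly the arrows are distributed, both means equal the number of arrows divided by the number of people.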

Since we're working in a timeless model, this R₀ is a constant: we're not studying variations in time. But we can study variations across individuals: not everyone has the same out-degree or in-degree. The measure of deviations from the mean is called the “variance”. •13/48

The variance σ² is the mean of the square of deviations from the mean; and the standard deviation σ is the square root of that, i.e., the quadratic mean of deviations from the mean. Higher variance means values tend to be spread farther away from the mean. •14/48

So even if R₀=2.5, meaning that the average person makes or receives 2.5 infectious contacts, some will make fewer (or none) and some will make more. A variance of 0 would mean that everyone makes exactly R₀, which is obviously impossible. •15/48

People with high out-degree (and who indeed get infected) have been called “superspreaders”: their existence increases the variance of out-degrees (of course, they also increase the mean! But for a given mean, greater variance means more superspreaders and more subspreaders). •16/48

Now two fairly surprising facts are that ⓐ even though out-degrees and in-degrees have the same mean, their variances are unrelated, and ⓑ even though the epidemic spreads OUTward, the variance of IN-degrees matters much more to computing attack rates (final spread). •17/48

(The variance in the OUT-degrees matters for something else, namely the probability that the epidemic doesn't die out quickly. But here we care about an epidemic that, evidently, didn't die out quickly, so out-degrees don't matter.) •18/48

I pointed this out in a past thread, threadreaderapp.com/thread/1252581…, using the word “superspreadees / superreceivers”, but it was a bit misunderstood (as if superreceivers were a bad thing or something to worry about). •19/48

So remember that we're studying the effect of variance for a GIVEN average (vary one parameter at a time!). If you increase the number of superspreaders or superreceivers you also increase the number of subspreaders or subreceivers. •20/48

The concept of the out-degree of x is fairly easy: look at the infectious period, count how many people x has close contact with, weight it by some probability of infection; it's quite natural. The concept of in-degree of y is a bit trickier: … •21/48

… mostly because it depends on x (not y) being in their infectious period. It's not the total number of contacts that y had, but of course, it will tend to increase with that number: people with more contacts in general have more infectious contacts both in and out. •22/48

I can think of several causes for variance of in-degrees: one is simply the variance in the total number of contacts, as I just said. But there could be variance in behavior (some people will be very strict about hygiene, others less so). •23/48

And maybe there could be medical differences: I'm not a physician so I don't know if this is relevant in this case, but in some diseases some people are more susceptible than others, meaning more likely to BE infected: this will make their in-degrees higher. •24/48

But there's a point I have to emphasize because several people asked me about the point σ=0 on my curves: even if everyone acts the same, there's a sort of “baseline” variance, the Poisson variance, and going below that is very strange and unnatural. … •25/48

… Imagine you have a million people and a million boxes, and you ask everyone to take as many balls as they want from a huge bag, and place them in boxes. They place 2.5 balls on average (out-degree), so each box gets 2.5 balls on average (in-degree), … •26/48

… but even if the boxes are absolutely identical and chosen at random by the participants, there will be random variations in the number of balls (of course!). In fact, the number of balls will follow a Poisson distribution with average 2.5 and that has variance σ²=2.5. •27/48
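
The balls-and-boxes experiment is easy to simulate. The sketch below is scaled down to 100,000 boxes and 250,000 balls (an assumption made only to keep it fast, where the thought experiment uses a million), with each ball dropped in a uniformly random box.

```python
import random

rng = random.Random(12345)
n_boxes = 100_000
n_balls = 250_000   # 2.5 balls per box on average

counts = [0] * n_boxes
for _ in range(n_balls):
    counts[rng.randrange(n_boxes)] += 1   # every box is equally attractive

mean = sum(counts) / n_boxes
variance = sum((c - mean) ** 2 for c in counts) / n_boxes
print(mean, variance)   # the variance should come out close to the mean
```

The mean is exactly 2.5 by construction, and the variance lands close to 2.5 as well, which is the Poisson baseline.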

Now if some boxes are more attractive than others you can expect to have a higher variance, but it would be extremely unusual to have a LOWER variance than Poisson. Not impossible, but bizarre. People have to look inside the boxes or something of the sort. •28/48

So back to my epidemic graph, we expect σ²≥R₀. Anything less would be very unusual (but one example would be “infection by mortality” where (nearly?) everybody gets infected by two people, their parents, so R₀=2 [in a steady population], σ=0 and everyone is infected). •29/48

The point where σ²=R₀ (namely (1.581, 0.893) on my curves: this is where the blue and red curves meet) is what I call the “Poisson point”: the point where the in-degrees are Poisson-distributed, as in “everyone behaves the same and contacts are at random”. •30/48

The attack rate we get for that (so, 89% for R₀=2.5) coincides with the attack rate predicted by the classical SIR model, which I discussed in threadreaderapp.com/thread/1236324… (formula: 1 + W(−R₀·exp(−R₀))/R₀ ≈ 1 − exp(−R₀), where W is the Lambert W function). The two models coincide there. •31/48
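
For a numerical check of the SIR figure: the Lambert W expression is the closed form of the solution z of z = 1 − exp(−R₀·z), so the sketch below simply iterates that equation instead of calling a Lambert W routine. The function name `sir_attack_rate` and the iteration count are illustrative choices.

```python
import math

def sir_attack_rate(r0, iterations=200):
    """Solve z = 1 - exp(-r0 * z) by fixed-point iteration, starting from z = 1."""
    z = 1.0
    for _ in range(iterations):
        z = 1.0 - math.exp(-r0 * z)
    return z

r0 = 2.5
exact = sir_attack_rate(r0)       # the value written with W in the formula
rough = 1.0 - math.exp(-r0)       # the cruder approximation 1 - exp(-R0)
print(round(exact, 3), round(rough, 3))
```

For R₀=2.5 the iteration settles near 0.893, while the rough approximation gives about 0.918, so the “≈” is only accurate to a few percentage points at this value of R₀.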

But now we can see what happens when the degrees have a different variance σ², while keeping the same average value R₀=2.5. This is what my plots are about. Of course, there is some fine print! •32/48

It's not just the variance that matters but the full distribution of degrees. I chose a family of distributions that allows me to keep the same mean while varying the variance: the binomial and negative-binomial (aka Pólya) distributions, with Poisson as limit case. •33/48

I won't get into the details of why I think this makes sense, and it might be worth investigating how much the results depend on that choice. Then there is the whole matter of the random graph model and how we compute attack rates as a function of all that: … •34/48

… again, this is discussed in the other (long!) thread I wrote about this, and even there I had to gloss over many details. This present thread is meant to be less technical, so let's not get into this. •35/48

The difference between the blue curve and the red curve is that the blue curve uses a directed graph model (which is mostly what I described so far: x→y means x potentially infects y) whereas the red one uses an undirected graph model (all arrows go both ways). •36/48

The main effect of the undirected graph model (red) is essentially that when x infects y, since it's completely symmetric, there is a higher probability that y has high degree, so they will in turn infect more other nodes. … •37/48

… Whereas in the directed graph model (blue), if x infects y, y has a higher probability of having a high in-degree, but in-degrees and out-degrees don't correlate and there's no particular reason for y to infect more nodes. This explains the difference. •38/48

Both models are simplistic (see tweets 3–4 again!) but I would tend to say that the directed graph model is more realistic to describe the effect of variance in intrinsic susceptibility, while the undirected model would describe variance in social contacts. •39/48

Of course, another thorny question is how much variance we should expect. As I pointed out above, we certainly expect σ²≥R₀, i.e., σ≥√R₀. But how much more? I don't know! •40/48

Besides the Poisson case σ²=R₀, there is another point of special significance, σ²=R₀(R₀+1). This corresponds to a “geometric” distribution of contacts. In the case of the OUT-degrees, a geometric distribution is fairly natural: … •41/48

… the classical SIR model assumes recovery to follow an exponential distribution and we then get a geometric distribution on out-degrees. For IN-degrees, I can't think of a really natural model that would give a geometric distribution, … •42/48

… but I noticed that if we assume such a distribution, the attack rate computed by this (directed graph) model, i.e., the point with σ=√(R₀(R₀+1)) on my curve, coincides with the herd immunity threshold 1 − 1/R₀, for the classical SIR model. •43/48

This is probably not a coincidence, and even though I don't have an explanation for this, I think it suggests that σ = √(R₀(R₀+1)) ≈ R₀ is somehow a natural value as well. This might hint at an order of magnitude. •44/48
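
The geometric coincidence can be reproduced numerically under one assumption that is not spelled out in this thread: in the directed model, the probability u that a person is never infected is taken to be the smallest solution of u = G(u), where G is the generating function of the in-degree distribution (a person escapes only if every one of their in-neighbours escapes). The sketch below uses the Poisson and negative-binomial generating functions; the function name and parametrization are illustrative.

```python
import math

def attack_rate_directed(r0, sigma2, iterations=10_000):
    """Attack rate 1 - u, with u the smallest root of u = G(u) (assumed model).

    G is the in-degree generating function: Poisson if sigma2 == r0,
    negative binomial with mean r0 and variance sigma2 if sigma2 > r0.
    """
    if sigma2 < r0:
        raise ValueError("this sketch only covers sigma^2 >= R0")

    def G(u):
        if sigma2 == r0:
            return math.exp(-r0 * (1.0 - u))
        k = r0 * r0 / (sigma2 - r0)          # dispersion: variance = r0 + r0^2 / k
        return (1.0 + (r0 / k) * (1.0 - u)) ** (-k)

    u = 0.0                                   # iterating from 0 reaches the smallest root
    for _ in range(iterations):
        u = G(u)
    return 1.0 - u

r0 = 2.5
poisson_point = attack_rate_directed(r0, r0)
geometric_point = attack_rate_directed(r0, r0 * (r0 + 1))
print(round(poisson_point, 3), round(geometric_point, 3), 1 - 1 / r0)
```

Under that assumption the Poisson case gives about 0.893 and the geometric case gives 0.6, which equals 1 − 1/R₀ for R₀=2.5. In the geometric case the equation u = G(u) factors as (R₀·u − 1)(u − 1) = 0, which is where u = 1/R₀ comes from.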

Note that there is nothing intrinsically wrong or contradictory about having σ > R₀, even though it may be surprising to write that each person has, say, “2.5 ± 5.0 contacts” when the number of contacts is always nonnegative (duh): … •45/48

… it just means that we have large values and enough very small values (including 0: people who receive 0 infectious contacts obviously don't get infected) to make up for the same average. The distribution will be skewed, but it can happen. •46/48
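
To make the “2.5 ± 5.0” case concrete, the sketch below tabulates a negative-binomial distribution with mean 2.5 and standard deviation 5.0. Using the negative binomial for this particular example is an assumption for illustration, and the cutoff `n_max` is just a truncation point beyond which the probabilities are negligible.

```python
def negative_binomial_pmf(mean, variance, n_max=2000):
    """Probabilities P(0..n_max) of a negative binomial with the given mean and variance."""
    k = mean * mean / (variance - mean)   # dispersion: variance = mean + mean^2 / k
    q = mean / (k + mean)
    p = (1.0 - q) ** k                    # P(0)
    pmf = []
    for n in range(n_max + 1):
        pmf.append(p)
        p *= (n + k) / (n + 1) * q        # recurrence P(n+1) / P(n)
    return pmf

pmf = negative_binomial_pmf(2.5, 5.0 ** 2)
mean = sum(n * p for n, p in enumerate(pmf))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(pmf))
share_zero = pmf[0]
print(round(mean, 3), round(variance, 3), round(share_zero, 3))
```

With these parameters a little over half of the population receives no infectious contact at all, and a long tail of large values keeps the mean at 2.5.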

However, the fact that this is logically possible doesn't make it realistic. I don't know if it is. I'm not a sociologist, I'm not a physician, and even those who are lack accurate data (measuring R₀ is hard enough, out-degree variance is harder, and in-degree, ugh!). •47/48

So I think the curves should be taken as an indication of the fact that variance in number of contacts matters, and give an order of magnitude of how much we might expect from this effect, but not much beyond that. This is clearly not intended as a forecast of any kind! •48/48