Let me try to explain a point which I think isn't widely known: in computing attack rates, the distribution of infectious contacts a person MAKES is less important than the distribution of such contacts they RECEIVE. Maybe we should care about “superspreadees”! •1/44
This is a subtle point, and somewhat counterintuitive, coming from random graph theory (which I'm not a specialist in, and I had a hard time understanding this). I'm afraid it might not be well-known to epidemiologists, who seem to focus on superspreaders. •2/44
(I came across this fact while trying to analyse the variant of SIR where recovery takes place in constant time instead of taking place according to an exponential process: hal.archives-ouvertes.fr/hal-02537265 — the point I wish to make is ¶4.2 in this note, … •3/44
… although it really doesn't have much to do with the rest of the note. My surprise was that changing the recovery distribution changes the distribution of the number of infections each person causes but doesn't change the attack rate at all! Why is that? •4/44
Also, what I found seemed to contradict the results of the paper arxiv.org/abs/2002.04004 so I was completely confused until I understood the point which I now wish to try to explain.) Let's see if I can explain it without being TOO technical (and not 100% rigorous). •5/44
We wish to model an epidemic through a random oriented graph and apply results from random graphs to say things about the epidemic. The graph will have people as vertices and “potentially infectious contacts” as edges: … •6/44
… this means that we wait for the epidemic to fully run its course, then we draw an arrow from person x to person y when y was in contact with x while x was infectious in such a way that y would be infected if susceptible (but maybe they weren't). •7/44
So the set of people infected by the epidemic is the set of graph vertices which can be reached (following oriented edges) from patient zero, whom I will call x₀. Let's call this the “down-set” of x₀. The proportion of vertices reached is the “attack rate” of the epidemic. •8/44
The “down-set” of a vertex x is the set of vertices reachable FROM x, i.e., those which get infected if x is (by a sequence of contaminations). But I'll also need to talk about the “up-set” of x, namely the vertices from which x can be reached, i.e., those which can infect x. •9/44
Now let's consider the following way to construct a random oriented graph: start with a huge number of people, and assume given a probability distribution on the natural numbers, which will be the distribution of OUT-degrees (or OUT-distribution): … •10/44
… for each vertex x independently, pick a random number using the specified (out-)distribution, which will define the number of edges leading FROM x, i.e., the number of people whom x will infect. Then pick that many vertices independently at random, … •11/44
… and add edges from x to the nodes (:=vertices) in question. Do this for every node x. This defines a random oriented graph, whose out-degrees (:=the number of edges going OUT of a given vertex) follow the distribution prescribed at the start. •12/44
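To make the construction of tweets 10–12 concrete, here is a minimal Python sketch of it. The function name `build_out_graph` and the example out-distribution (0 or 4 contacts with equal probability) are purely illustrative assumptions of mine; repeated targets and self-loops are allowed for simplicity, which hardly matters when the graph is large.

```python
import random

def build_out_graph(n, sample_out_degree, rng):
    """Return adjacency lists: edges[x] lists the targets y of edges x -> y."""
    edges = []
    for x in range(n):
        # out-degree of x, drawn from the prescribed out-distribution
        d = sample_out_degree(rng)
        # d targets picked independently and uniformly at random
        # (repeats and self-loops allowed: an assumption made for simplicity)
        edges.append([rng.randrange(n) for _ in range(d)])
    return edges

if __name__ == "__main__":
    rng = random.Random(0)
    # hypothetical out-distribution: 0 or 4 contacts, each with probability 1/2
    graph = build_out_graph(10_000, lambda r: r.choice([0, 4]), rng)
    print("number of edges:", sum(len(targets) for targets in graph))
```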
In particular, the average out-degree of a vertex will be (by the law of large numbers, very close to) the expected value of the distribution which was prescribed at the start. This average out-degree is the reproduction number of our epidemic, … •13/44
… or at least the average reproduction number, namely, the number of people each given person infects (if possible). Now of course the average OUT-degree of a vertex is the same as the average IN-degree (number of vertices pointing TO that node), … •14/44
… because both are simply given by the number of edges divided by the number of vertices (this is what mathematicians call a “double counting” argument). But the distribution and variance of in-degrees need not match the distribution of out-degrees! •15/44
And indeed, with the random graph construction explained in tweets 10–12, the in-degree distribution will be a Poisson distribution. (A Poisson distribution is what you get by counting a large number of independent events, each individually unlikely, with a fixed expected total.) •16/44
Indeed, for each given vertex y, each other vertex x has a certain (small) probability of creating an edge x→y, and the expected number is fixed, so we get a Poisson distribution when counting the number of x which connect to y. So Poisson is, in a sense, universally unbiased. •17/44
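A quick numerical check of this claim, as a sketch: give every vertex the same out-degree (no variance at all) and look at the resulting in-degrees. For a Poisson distribution the variance equals the mean, so that is what the in-degrees should show approximately. The function name and the chosen sizes are illustrative assumptions.

```python
import random
from statistics import mean, pvariance

def in_degrees_fixed_out(n, out_degree, rng):
    """In-degree of every vertex when each of the n vertices sends
    `out_degree` edges to targets chosen independently and uniformly."""
    counts = [0] * n
    for _ in range(n * out_degree):
        counts[rng.randrange(n)] += 1
    return counts

if __name__ == "__main__":
    rng = random.Random(0)
    deg = in_degrees_fixed_out(20_000, 3, rng)
    # out-degrees are all exactly 3 (variance 0), yet the in-degrees
    # should have variance close to their mean, as a Poisson law does
    print("mean in-degree:    ", mean(deg))
    print("in-degree variance:", pvariance(deg))
```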
But we could construct random oriented graphs where we prescribe the IN-distribution (distribution of IN-degrees) instead of the OUT-distribution, or we could even prescribe both. I'll return to this. •18/44
Anyway, consider a (large, randomly constructed) oriented graph which we wish to use to model an epidemic. There are two important things we can wish to study: non-extinction probability, and attack rate. And they are very symmetric. Let me explain. •19/44
The non-extinction probability is this: choose a random vertex x₀ and ask yourself “if this is patient 0, do they infect a large number of people?”, i.e., is its down-set (set of vertices reachable from x₀) a large proportion of the graph? •20/44
This is basically the probability that, if the epidemic starts at a given point in the graph, it will indeed reach large proportions. (Here “large” means a number comparable to the total number N of vertices in the graph, not something like √N or so. … •21/44
… I'll discuss in a minute the actual proportion that “large” entails, but it's not too important: the epidemic either ends quickly or reaches many nodes (people).) Now this non-extinction probability depends crucially on the distribution of OUT-degrees. •22/44
Basically we have here what is known as a Galton-Watson process constructed on the distribution of out-degrees. Technically, if p_i is the probability that a node x connects to i other nodes, and G(t) := ∑ p_i t^i (expected value of t^i) is the generating function, … •23/44
… (so G(1)=1 and the average out-degree is G′(1)) then the extinction probability is the smallest t≥0 such that G(t)=t. Sorry, that was technical, but the point is, non-extinction probability (probability that the epidemic will really start) depends on OUT-degrees. •24/44
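For those who prefer code to generating functions, the smallest solution of G(t)=t can be found numerically by iterating t ← G(t) starting from t=0. This is only a sketch: the function name and the example distribution (0, 1 or 2 contacts with probabilities 1/4, 1/4, 1/2) are assumptions chosen for illustration.

```python
def extinction_probability(pmf, tol=1e-12, max_iter=100_000):
    """Smallest t >= 0 with G(t) = t, where G(t) = sum(pmf[i] * t**i).

    Iterating t <- G(t) from t = 0 increases monotonically towards that
    smallest fixed point, because G is increasing on [0, 1].
    """
    def G(t):
        # Horner evaluation of the generating function
        acc = 0.0
        for p in reversed(pmf):
            acc = acc * t + p
        return acc

    t = 0.0
    for _ in range(max_iter):
        nxt = G(t)
        if abs(nxt - t) < tol:
            return nxt
        t = nxt
    return t

if __name__ == "__main__":
    # hypothetical out-distribution with mean 1.25 (so the epidemic can start)
    pmf = [0.25, 0.25, 0.5]
    q = extinction_probability(pmf)
    print("extinction probability:    ", q)
    print("non-extinction probability:", 1 - q)
```

For this example G(t)=t reads 2t² − 3t + 1 = 0, whose roots are 1/2 and 1, so the iteration should settle near 1/2.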
Now what about the attack rate? Well, there's a beautiful symmetry here: the attack rate is the probability that, if you consider a random vertex y, and look at its “up-set” (set of vertices FROM which y is reachable), it will be large. •25/44
Indeed, in deciding whether a given vertex y gets infected, it doesn't really matter who patient zero is. What matters is (whether the epidemic really started, and) whether that vertex y is reachable from a large set of possible patients zero. •26/44
OK, I've been doing a lot of mathematical handwaving here (and will continue to do so), but I'm trying to communicate the essential idea without getting into TOO many technicalities. Lots of things need to be made precise with, e.g., limits when the number of vertices tends to infinity. •27/44
But for those who want technical details, essentially, what I'm alluding to is this: arxiv.org/abs/1409.4371 (M.D.Penrose, “The strong giant in a random digraph”) theorem 4. His random graph model is the one I tried to describe in tweets 10–12. •28/44
Anyway, back to what I was saying. The symmetry is that if you REVERSE the graph (reverse the orientation of every edge), the non-extinction probability becomes the attack rate and vice versa, because down-sets become up-sets and vice versa. •29/44
So, just as the non-extinction probability depends essentially just on the distribution of OUT-degrees (and can be computed as a Galton-Watson process), the attack rate depends on that of the IN-degrees. •30/44
Again, the out-degrees and the in-degrees have the same average (expected) value but their distributions are otherwise unrelated. The graph construction in tweets 10–12 gives you a specified out-degree distribution but Poisson in-degrees. •31/44
For a Poisson distribution with expected value κ as out-degrees we get a non-extinction probability (and hence, for that distribution as in-degrees, an attack rate) of 1−Γ = 1+W(−κ·exp(−κ))/κ, where Γ is the smallest solution of Γ = exp(−κ·(1−Γ)) and W is the Lambert W function. •32/44
This (Galton-Watson non-extinction probability for a Poisson distribution) coincides with the attack rate predicted by the basic SIR model. I discussed this formula in this thread: threadreaderapp.com/thread/1236324… (and how to approximate it). •33/44
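As a sanity check on the Poisson formula 1−Γ = 1+W(−κ·exp(−κ))/κ, here is a sketch which computes it two ways: through the Lambert W function (principal branch, obtained here by a hand-rolled Newton iteration rather than a library call) and by directly iterating Γ ← exp(−κ·(1−Γ)). All function names are my own illustrative choices.

```python
import math

def lambert_w0(x, tol=1e-14, max_iter=200):
    """Principal branch of Lambert W for -1/e < x <= 0, by Newton's
    method on w * exp(w) = x, started at w = 0."""
    w = 0.0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

def attack_rate_closed_form(kappa):
    """1 - Gamma = 1 + W(-kappa * exp(-kappa)) / kappa, for kappa > 1."""
    return 1.0 + lambert_w0(-kappa * math.exp(-kappa)) / kappa

def attack_rate_fixed_point(kappa, n_iter=10_000):
    """Iterate Gamma <- exp(-kappa * (1 - Gamma)) starting from Gamma = 0."""
    gamma = 0.0
    for _ in range(n_iter):
        gamma = math.exp(-kappa * (1.0 - gamma))
    return 1.0 - gamma

if __name__ == "__main__":
    for kappa in (1.5, 2.0, 3.0):
        print(kappa, attack_rate_closed_form(kappa), attack_rate_fixed_point(kappa))
```

The two columns should agree; for κ=2 both give an attack rate of roughly 80%.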
But for something other than Poisson we get completely different things. For an exponential (geometric) distribution with same expected value κ, we get 1 − 1/κ, which is much lower. For a Dirac distribution with constant integer value κ (no variance), we get exactly 1. •34/44
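For the curious, here is how these two values come out of the equation G(t)=t. I am assuming that “exponential (geometric)” means the geometric distribution on {0, 1, 2, …}, and that the constant κ is an integer ≥ 2.

```latex
% Geometric distribution on {0, 1, 2, ...} with mean \kappa:
p_i = \frac{1}{1+\kappa}\left(\frac{\kappa}{1+\kappa}\right)^{i},
\qquad
G(t) = \sum_{i \ge 0} p_i\, t^i = \frac{1}{1+\kappa-\kappa t}.

% Fixed points of G:
G(t) = t
\iff \kappa t^2 - (1+\kappa)\,t + 1 = 0
\iff (\kappa t - 1)(t - 1) = 0,

% so for \kappa > 1 the smallest root is t = 1/\kappa, and the
% non-extinction probability is 1 - 1/\kappa.

% Dirac distribution at an integer \kappa \ge 2:
G(t) = t^{\kappa},
\qquad
\text{the smallest root of } G(t) = t \text{ is } t = 0,
% so the non-extinction probability is 1.
```

With κ=2, for example, the geometric case gives 1/2, against roughly 0.80 for Poisson.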
So: ✱the distribution of out-degrees (contacts MADE) is crucial in determining the non-extinction probability, and symmetrically, the distribution of in-degrees (contacts RECEIVED) is crucial in determining the attack rate✱; the reproduction number alone does not suffice. •35/44
The attack rate predicted by the simplistic SIR model is valid only if the distribution of in-degrees is Poisson. Qualitatively, the LARGER the variance in the in-degrees (for a GIVEN expected value = reproduction number) the SMALLER the attack rate will be. •36/44
We can construct random graphs with a given distribution of IN-degrees by taking the construction explained in tweets 10–12 above, and reversing it (construct edges TO the nodes rather than FROM them). •37/44
This can provide examples of epidemic graphs where the attack rate is very modest despite the average in/out-degree (reproduction number) being high. So it's essential to look beyond the reproduction number! •38/44
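Here is a small simulation sketch of such a reversed construction, comparing geometric and Poisson in-degrees with the same mean. Everything about it is an illustrative assumption: the function names, the graph size, the choice κ=3, and the use of several initial cases (so that the epidemic is very unlikely to die out at once).

```python
import math
import random
from collections import deque

def sample_geometric(mean, rng):
    """Geometric variate on {0, 1, 2, ...} with the given mean."""
    p = 1.0 / (1.0 + mean)
    u = 1.0 - rng.random()                      # uniform on (0, 1]
    return int(math.log(u) / math.log(1.0 - p))

def sample_poisson(mean, rng):
    """Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def attack_rate(n, sample_in_degree, n_seeds, rng):
    """Fraction of vertices reachable from n_seeds random initial cases, in a
    graph where each vertex y draws its in-degree, then that many sources x."""
    out_edges = [[] for _ in range(n)]
    for y in range(n):
        for _ in range(sample_in_degree(rng)):
            out_edges[rng.randrange(n)].append(y)   # edge x -> y
    infected = [False] * n
    queue = deque(rng.sample(range(n), n_seeds))
    for s in queue:
        infected[s] = True
    while queue:                                    # breadth-first search along x -> y
        x = queue.popleft()
        for y in out_edges[x]:
            if not infected[y]:
                infected[y] = True
                queue.append(y)
    return sum(infected) / n

if __name__ == "__main__":
    rng = random.Random(0)
    kappa = 3.0
    print("geometric in-degrees:", attack_rate(20_000, lambda r: sample_geometric(kappa, r), 20, rng))
    print("Poisson in-degrees:  ", attack_rate(20_000, lambda r: sample_poisson(kappa, r), 20, rng))
```

With κ=3 the formulas above predict about 1 − 1/3 ≈ 0.67 for geometric in-degrees and about 0.94 for Poisson ones, so the simulated values should land near those.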
And also, insofar as we look beyond the reproduction number, if we care about attack rates (and I think we do…), it's more important to look at the variance in the distribution of IN-degrees (contacts RECEIVED) than OUT-degrees. •39/44
This suggests that we look at “superspreadees” (“superreceivers”? IDK: people at high risk of being infected; e.g., celebrities, politicians), not “superspreaders” (people at high risk of infecting others), as contributors to the variance of the in-degrees. •40/44
I know it sounds paradoxical that “superspreadees” would LOWER the attack rate, but remember that this is for a GIVEN (measured) reproduction number: once such a person has been made immune, they are effectively removed from the epidemic. •41/44
This is related to the general phenomena I was discussing in this past thread, although the exact connection isn't clear to me at present: threadreaderapp.com/thread/1241745… •42/44
But it seems that epidemiologists have concentrated their attention on the variance in the distribution of OUT-degrees (superspreaders; see nature.com/articles/natur…), which should have much more modest effects on the attack rate by the results I discussed above. •43/44
Now of course I don't know how much effect this has in real life! I don't know how much variance there really is and I can't make any predictions. All I'm saying is: this deserves much more attention, and this graph-theoretic phenomenon should be more widely known. •44/44
OK, since this thread has gathered a lot of attention, I need to add a few clarifications to clear up a few misunderstandings. First, I agree that, of course, there is certainly a high correlation between infectious contacts made and received! •45/(44+9)
So yes, many superspreaders are probably superreceivers (superspreadees, whatever you want to call it) and conversely. But insofar as they are both, it is the latter which matters for the attack rate, and insofar as they are not, it is the superreceivers who matter. •46/(44+9)
Second, the point I was making was that, for a GIVEN reproduction number (average number of infectious contacts either made or received), the attack rate will be lower if the variance of the number of contacts received is higher: … •47/(44+9)
… so yes, this is a bit complex. I'm not saying that superreceivers (superspreadees, whatever) decrease the attack rate: obviously they tend to increase the reproduction number R₀ and hence the attack rate. But they do it LESS than their increase of R₀ suggests. •48/(44+9)
So it's not easy to translate what I was saying into policy decisions! I'm not saying something like “confine the superreceivers” or whatever. I'm mostly saying… modelling is HARD. Maybe this other thread says it better in general: threadreaderapp.com/thread/1249738… •49/(44+9)
Basically, I think models shouldn't be used so much to predict what will happen (this is just beyond our abilities) as to predict the SORT of effects that various phenomena CAN have. Like variance in the number of contacts a person makes or receives. •50/(44+9)
My concern is that people are giving simple models like SIR, which assume a homogeneous population with perfect mixing of contacts and Poisson distribution of contaminations received, too much credence while forgetting their limits and hypotheses. •51/(44+9)
And specifically, epidemiologists have focused on superspreaders and the effect of the variance of individual infectiousness, but I worry that they didn't take into account correlation with, or variance of, infectious contacts received. •52/(44+9)
More generally, there are many complex social effects in the prediction of epidemics which go far beyond the purview of SIR-like models. I discussed a few more in this past thread: threadreaderapp.com/thread/1241745… •53/(44+9)