There's something really fishy going on with “effective reproduction numbers” in various analyses of this pandemic. And I'm not talking about predicting the future, I'm talking about analysing the past. Here we see a claim of R~3 for Belgium. •1/21
On the other hand, researchers at the Centre for Mathematical Modelling of Infectious Diseases have a web site computing R(t) for many countries, and their estimates for Belgium never went above 1.5 in the computed time frame: epiforecasts.io/covid/posts/na… •2/21
Now briefly speaking, computing R(t) depends on two things: the exponential rate of growth r (=logarithmic slope, =logarithmic derivative) of the number of new cases, and temporal data on infections, notably the “serial interval” and its distribution. •3/21
If the serial interval d is constant (each infected person infects R new people, each exactly d days after their own infection), we can compute R very simply as the number of new cases at date D divided by the same number at date D−d. This is because R = exp(r·d). Simple. BUT! •4/21
First, the data aren't perfect (e.g., there's a strong weekly bias in the number of cases, and reporting isn't instantaneous); second, the timing is variable (between infection and spreading, between infection and test, and between test and reporting): serial intervals vary. •5/21
AFAIU, many people, including France's own health agency, simply compute a “reproduction number” by the simplistic formula above with d=7, i.e., their “R” is: number of cases at date D ÷ number of cases at date D−7 (both numbers averaged over 7 days, I guess?). •6/21
This kind of “reproduction number” is just a rate of exponential growth converted by assuming a constant serial interval. It's useful for simply seeing whether cases are on an upward or downward trend, how fast, and for comparing places; but it has many problems. •7/21
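Concretely, the simplistic quotient just described is a one-liner; here is a sketch with invented data (under exact exponential growth it reproduces exp(r·d), as the constant-d formula says):

```python
import math

def naive_R(daily_cases, D, d=7, window=7):
    """The simplistic 'R': cases around day D over cases around day D-d,
    each summed over a 7-day window to dampen the weekly reporting bias."""
    recent = sum(daily_cases[D - window + 1 : D + 1])
    earlier = sum(daily_cases[D - d - window + 1 : D - d + 1])
    return recent / earlier

# Invented series with exact 5%/day exponential growth: the quotient then
# equals exp(r*d) = exp(0.05*7) = exp(0.35), exactly as R = exp(r*d) predicts.
series = [100 * math.exp(0.05 * t) for t in range(40)]
print(round(naive_R(series, 39), 2))  # → 1.42
```

On real, noisy case counts the same quotient jumps around wildly, which is the first problem discussed below.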
The first is that the value will be very noisy. Maybe the real R is also very noisy, but its short-term variations probably bear absolutely no relation to the rapid variations of the value thus computed. See … •8/21
… I'm pretty sure all these rapid variations are just pure noise, due to bad data to start with and to the difficulty of removing the underlying noise and random timing variations. •9/21
The second problem is that, if the serial interval isn't constant (people don't take a constant time d to infect others), the relation between R and the exponential slope r of infection becomes more delicate than simply R = exp(r·d) … •10/21
… which underlies the computation of R for constant d as a simple ratio of new cases d days apart. Instead, we must average over the distribution of the serial interval. For example, imagine ½ of infections occur after d₁ days and ½ after d₂ (with d₂>d₁ say): … •11/21
… then R will be 1/[½(exp(−r·d₁)+exp(−r·d₂))] (compare Euler-Lotka equation), so even in a constant and steady exponential growth, the d we should use to compute R is −log(½(exp(−r·d₁)+exp(−r·d₂)))/r, which depends on r: … •12/21
… essentially, when r (or R) is large, fast infections (d₁) dominate slow infections (d₂) whereas if r (or R) is small, it's the other way around. So we can't simply use some “average” d value and divide values d days apart, … •13/21
… because doing so will overestimate R when it is large, or underestimate it when it is small (or both, depending on what kind of average d was chosen). So in effect, it artificially amplifies the variations of R, even with perfect data. •14/21
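To put numbers on this, here is the two-point example above in code (d₁, d₂ and the growth rates are invented): the d that would make the simple ratio exact depends on r, sliding from the mean toward d₁ as growth speeds up.

```python
import math

d1, d2 = 4.0, 10.0   # invented: half of infections after 4 days, half after 10

def effective_d(r):
    # The d for which R = exp(r*d) holds, from the Euler-Lotka relation
    # R = 1 / (1/2*exp(-r*d1) + 1/2*exp(-r*d2)):
    return -math.log(0.5 * math.exp(-r * d1) + 0.5 * math.exp(-r * d2)) / r

print(effective_d(0.01))   # slow growth: close to the mean (d1+d2)/2 = 7
print(effective_d(0.30))   # fast growth: fast infections dominate, pulled toward d1
print(effective_d(-0.10))  # shrinking epidemic: slow infections dominate, above 7
```

So any fixed “average” d is wrong at one end of the range or the other, which is exactly the amplification effect described above.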
Now the “sophisticated” way to compute R(t) is more delicate: we use an a priori distribution on it, hypotheses on the distribution of the serial interval (not assumed constant, that's the point), predict values and condition by the observations made, … •15/21
… and then we get an a posteriori distribution on R, of which we generally retain the most likely (maximum a posteriori) value, plus uncertainty bounds. This is well explained here (in French): •16/21
Of course the real nitty-gritty details are, well, nitty and gritty. There are subtly different ways to do the computation, implemented in various packages for the (abominable) “R” stats program, such as EpiEstim cran.r-project.org/web/packages/E…; and they are difficult to use. •17/21
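For the curious, the heart of the Cori-et-al. method that EpiEstim implements is surprisingly compact; here is a minimal sketch (my own simplification: Gamma prior, Poisson likelihood, known discretized serial-interval weights w, all numbers invented — the real package handles much more, notably uncertainty in the serial interval itself):

```python
# Sketch of a Cori-style posterior for R(t) (a simplification of what EpiEstim
# does): with a Gamma(a, b) prior on R (rate parametrization) and a Poisson
# likelihood I_t ~ Poisson(R * Lambda_t), conjugacy gives a Gamma posterior.
def r_posterior_mean(incidence, w, t_start, t_end, a=1.0, b=0.2):
    """incidence: daily new-case counts; w[s]: probability that the serial
    interval is s+1 days; the estimate is pooled over [t_start, t_end]."""
    sum_I = sum_Lambda = 0.0
    for t in range(t_start, t_end + 1):
        # Total infectiousness on day t: past cases weighted by the serial interval.
        sum_Lambda += sum(incidence[t - s - 1] * w[s]
                          for s in range(len(w)) if t - s - 1 >= 0)
        sum_I += incidence[t]
    return (a + sum_I) / (b + sum_Lambda)  # posterior mean of R

# Invented data: serial interval on {1,2,3} days, one week of growing cases.
w = [0.25, 0.5, 0.25]
cases = [10, 12, 15, 18, 22, 27, 33, 40]
print(round(r_posterior_mean(cases, w, 4, 7), 2))  # → 1.48
```

Note that the whole difficulty is hidden in w: change the assumed serial-interval distribution and the estimate of R changes with it.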
(I thank @jeuasommenulle for giving me a start on how to actually use these tools, which I still need to investigate more fully, both in theory and in actual practice.) And of course we need hypotheses on the distribution of the serial interval, which isn't well known! •18/21
All this is to say, computing R(t) is a more delicate matter than dividing two numbers, even though many people are content with that. How it's done doesn't matter much if all we want is to extrapolate a simple exponential growth, of course, but it matters as soon as we try to look any further than that. •19/21
So, bottom line: if you see an estimate of an “effective reproduction number”, you should ask how it was computed. Is it a simple quotient or a proper maximum-likelihood estimate? And crucially: what hypotheses were made about the serial interval? •20/21
When in doubt, I would tend to consider as more accurate the values computed by the CMMID at epiforecasts.io/covid/ — their inputs won't be magically better, uncertainties are huge, and their graphs aren't updated as often as one might wish, but I trust their expertise. •21/21
PS1: Correction: it seems that the French health agency doesn't use a simple quotient but a more sophisticated method, running the computation under Excel (😭). Details, however, are still very unclear on what exactly they do. •22/(21+2)
PS2: I also didn't discuss the issue that R(t) can be computed from various input sources (lab positives, symptomatic cases, hospital admissions, serious cases, or even deaths), all of which will have their own specific issues. •23/(21+2)
One hundred and sixty years ago, French and British soldiers, to punish China for refusing to let opium be dealt to it, looted and then destroyed what was surely one of the greatest marvels of palatial and horticultural art: the Yuánmíng yuán, or Summer Palace of the Qīng.
Victor Hugo expressed his indignation rather well: “Artists, poets, philosophers, knew the Summer Palace. […] That marvel has disappeared. One day, two bandits entered the Summer Palace. One plundered, the other burned.” (monde-diplomatique.fr/2004/10/HUGO/1…)
That date of October 1860 is obviously well known to the Chinese, but I wonder to what extent it is in France and the United Kingdom. Yet it is, and continues to be, geopolitically important.
But fine, let's be understanding: maybe this is all a simple misunderstanding, and the préfets mistook the Journal Officiel for a joke book, trying to lighten the mood by publishing the one that will finally make the French laugh in these difficult times.
Seriously, we should run a little contest: “guess the next ridiculous restriction in your département”. Careful: to be eligible, the proposed measure must be totally ineffective and must include a €135 fine!
From this observation of R ~ 1.2, nearly constant for months, one can of course deduce an estimate (at least in order of magnitude) of the size of the second covid wave awaiting France: ⤵️ •1/11
Everyone has stuck with the idea that covid's R₀ is around 3 (giving a naive herd-immunity threshold of 1 − 1/R₀ ~ 65%), but clearly, for France, whatever the reasons (I can't explain it myself), right now ~1.2 is where the action is. •2/11
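The naive arithmetic behind that parenthesis, spelled out (a back-of-the-envelope illustration only; it ignores population heterogeneity and everything else):

```python
def naive_threshold(R):
    """Naive herd-immunity threshold 1 - 1/R (ignores heterogeneity etc.)."""
    return 1 - 1 / R

print(round(100 * naive_threshold(3.0)))  # → 67: the oft-quoted ~65%
print(round(100 * naive_threshold(1.2)))  # → 17: far smaller if R really is ~1.2
```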
(And as I point out in the thread linked above, the measures taken by the government don't seem to have had an impact on this reproduction number. It may have gone from 1.22 to 1.15, the uncertainty is large, but in any case it's still hovering around 1.2.) •3/11
Here is the estimate of the variation of covid's reproduction number in France, taken from epiforecasts.io/covid/posts/na… as of today. An effective public-health measure should show up as a sudden and/or sharp drop in this curve. Hmmm… 🤔
Someone should label this graph with the measures taken over that period: mandatory masks outdoors, bars closing at 10pm, gyms closing, curfew (well, for that one we'll have to wait a bit).
Note that this effective-R computation is done by the CMMID people, who know how to use the EpiEstim package (cran.r-project.org/web/packages/E…), level 0 of epidemiology, as opposed to SPF, which presumably just computes: number of cases on date D ÷ number of cases on date D−7 (level −1).
Here's a plot of the most populated European countries, showing
⁃ abscissa = peak covid deaths/day/1Mhab (7-day avg) during 1st covid wave,
⁃ ordinate = ratio of current covid deaths/day/1Mhab (last 7 days avg) to this 1st peak value, in log scale.
In other words, the abscissa is how bad the first wave was (in terms of deaths per inhabitant during a single day), the ordinate is how the current state of the second wave compares to the peak of the first.
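In case that description is easier to read as code, here is a sketch (function and data invented) of how each country's point would be computed:

```python
def wave_coords(daily_deaths, population_m, first_wave_days):
    """(abscissa, ordinate) for one country: peak 7-day-average deaths/day
    per 1M inhabitants in the first wave, and the ratio of the latest 7-day
    average to that peak (the ratio is what goes on the log-scale axis)."""
    avg7 = [sum(daily_deaths[i - 6 : i + 1]) / 7
            for i in range(6, len(daily_deaths))]
    peak = max(avg7[:first_wave_days]) / population_m
    current = avg7[-1] / population_m
    return peak, current / peak

# Invented country of 10M: a 70-deaths/day first-wave plateau, second wave
# currently running at half that level.
series = [0] * 6 + [70] * 7 + [0] * 20 + [35] * 7
print(wave_coords(series, 10, 20))  # → (7.0, 0.5)
```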
I just watched this video on why England has £1M (“giant”) and £100M (“titan”) banknotes, and I'm left with nothing but questions. •1/14
Their explanation is basically that Scottish and Northern Irish banks are still allowed to print their own notes, but they are required to deposit the same amount at the Bank of England as security, to ensure that these private banknotes keep their value in case of a collapse. •2/14
Superficially this makes sense. But the more I think about it, the more absurd and confusing it becomes. First, why store it in paper form and deposit it in high security vaults… at the Bank of England itself? Why not store it electronically? •3/14