Earlier today, we put out a preprint that asked: how do we design and analyze SARS-CoV-2 seroprevalence surveys? @yhgrad wrote a lovely explainer thread, linked here. I want to highlight some #stats and #networks results. 1/
First, the basics. This paper's 1st result is like a statistical inference midterm problem: If you observe n+ positive tests, n- negative tests, and you know the sensitivity/specificity of your test, what is the posterior prob. of actual positives? Solution: Bayes' rule. ✅
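Written out, the Bayes' rule step looks roughly like this. This is a sketch of the form, not copied from the paper: it assumes a uniform prior on the true prevalence θ, and writes se and sp for the test's sensitivity and specificity.

```latex
\Pr(\theta \mid n_+, n_-) \;\propto\;
\bigl[\mathrm{se}\,\theta + (1-\mathrm{sp})(1-\theta)\bigr]^{n_+}
\bigl[(1-\mathrm{se})\,\theta + \mathrm{sp}\,(1-\theta)\bigr]^{n_-},
\qquad 0 \le \theta \le 1 .
```

The first bracket is the probability that a random person tests positive (true positives plus false positives); the second is the probability of a negative test. With a perfect test (se = sp = 1) this reduces to the usual binomial posterior.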
Now a practical problem: The posterior looks like a binomial posterior, but due to sensitivity/specificity, we end up with incomplete beta functions & small things raised to n+ and n- powers & can't invert the CDF. Solution: take logs and use an accept-reject algorithm to sample. ✅ 3/
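To make the accept-reject step concrete, here is a minimal Python sketch. It is not the paper's code: it assumes a uniform prior on prevalence, Uniform(0, 1) proposals, and sensitivity/specificity strictly between 0 and 1. The function names and the example counts are made up for illustration.

```python
import numpy as np

def log_posterior(theta, n_pos, n_neg, se, sp):
    """Unnormalized log posterior of prevalence theta (uniform prior assumed)."""
    # Probability that a randomly chosen person tests positive
    p_pos = se * theta + (1.0 - sp) * (1.0 - theta)
    return n_pos * np.log(p_pos) + n_neg * np.log1p(-p_pos)

def sample_prevalence(n_pos, n_neg, se, sp, n_samples=5000, seed=0):
    """Accept-reject sampler with Uniform(0, 1) proposals, done in log space."""
    rng = np.random.default_rng(seed)
    # The likelihood is binomial in p_pos, so its maximum sits at the observed
    # positive fraction, clipped to the range p_pos can reach as theta varies.
    lo, hi = sorted((1.0 - sp, se))
    p_hat = np.clip(n_pos / (n_pos + n_neg), lo, hi)
    log_max = n_pos * np.log(p_hat) + n_neg * np.log1p(-p_hat)
    accepted, total = [], 0
    while total < n_samples:
        theta = rng.uniform(0.0, 1.0, size=n_samples)
        # log of a Uniform(0, 1) draw is minus an Exponential(1) draw
        log_u = -rng.exponential(size=n_samples)
        keep = theta[log_u < log_posterior(theta, n_pos, n_neg, se, sp) - log_max]
        accepted.append(keep)
        total += keep.size
    return np.concatenate(accepted)[:n_samples]

# Hypothetical survey: 50 positives, 450 negatives, imperfect test
samples = sample_prevalence(n_pos=50, n_neg=450, se=0.93, sp=0.975)
print(samples.mean(), np.percentile(samples, [2.5, 97.5]))
```

Working with log densities keeps the tiny powers from underflowing, and comparing against the maximum of the log density keeps every acceptance probability at or below 1.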
The result is something that uses some prob/stats classics, and is therefore fast enough to do Monte Carlo in your browser, as Sam demonstrated when he made this interactable. 🔥 larremorelab.github.io/covid-serology 4/
Next, the fraction of people who are seropositive can be an input to a dynamical model! But because of the previous step, what would *typically* be a fixed model parameter is now a random variable. When your parameter becomes a distribution, your output does too... 5/
This is often called a "compound probability distribution," and here's how it works: take data X and infer the posterior over seroprevalence θ. Then integrate that against the distribution/function that maps inputs+model to outputs. ✅ Here are two examples. 6/
That means that high/low uncertainty in your sampling procedure can be "pushed through" the model to find high/low uncertainty in model outputs. Here's the figure from the paper where the final output was epidemic peak height/timing. 7/
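The "push through" idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's model: a single-population SIR integrated with Euler steps stands in for the dynamical model, two Beta distributions stand in for posteriors from a large and a small survey, and R0, the recovery rate, and all names are made-up illustration values.

```python
import numpy as np

def sir_peak(theta, r0=2.5, gamma=1 / 7, i0=1e-4, dt=0.1, days=365):
    """Peak infected fraction and its timing when a fraction theta starts immune.

    Vectorized over theta, so an array of posterior samples runs in one pass.
    """
    theta = np.asarray(theta, dtype=float)
    beta = r0 * gamma
    s = 1.0 - theta - i0            # susceptible fraction
    i = np.full_like(theta, i0)     # infected fraction
    peak = i.copy()
    peak_day = np.zeros_like(theta)
    for step in range(1, int(days / dt) + 1):
        new_inf = beta * s * i * dt
        s = s - new_inf
        i = i + new_inf - gamma * i * dt
        higher = i > peak
        peak = np.where(higher, i, peak)
        peak_day = np.where(higher, step * dt, peak_day)
    return peak, peak_day

rng = np.random.default_rng(1)
# Stand-in posteriors with the same mean (0.10) but different spread
theta_narrow = rng.beta(100, 900, size=2000)   # e.g. many tests
theta_wide = rng.beta(5, 45, size=2000)        # e.g. few tests

peaks_narrow, days_narrow = sir_peak(theta_narrow)
peaks_wide, days_wide = sir_peak(theta_wide)
print("peak height sd:", peaks_narrow.std(), peaks_wide.std())
print("peak timing sd:", days_narrow.std(), days_wide.std())
```

Each posterior sample of θ produces one model run, so the histogram of outputs is a Monte Carlo estimate of the compound distribution. The wider stand-in posterior yields a wider spread in peak height and timing.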
Finally, something for the #networks and #modeling people out there, which was a discovery (to me, but if you know refs, please share). Problem: you have n serological tests, but you have to decide which subpops to allocate them to, and your goal is to get better model estimates.
The thing is: some subpopulations have greater influence on epidemic trajectory than others. Intuition says we should spend more of our budget sampling on them; but how much? We answered this by thinking of part of the epi model as a network... 9/
The "next generation matrix" of a model is like a network of amplifying flows between subpops of infected people. [Its Perron-Frobenius e-val is R0 in fact.] Applying that matrix to a vector of infections in subpops will make next-gen infections more parallel to the PF e-vec. 10/
[This is why the power method works.] But it means that subpops with larger PF eigenvector entries are more influential on dynamics. A related Lagrange multipliers result can then be used to show: allocate samples to subpopulations proportionally to the PF eigenvector entries. 11/
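Here is a small Python sketch of that chain of ideas: repeatedly apply a next-generation matrix, watch the vector line up with the PF eigenvector, then split a test budget in proportion to its entries. The 3-group matrix and the budget are invented numbers, and the sketch leaves out the extra seroprevalence term from the supplement, so treat it as an illustration rather than the paper's allocation rule.

```python
import numpy as np

def perron_vector(ngm, tol=1e-12, max_iter=10_000):
    """Power method: returns (dominant eigenvalue, eigenvector summing to 1)."""
    v = np.full(ngm.shape[0], 1.0 / ngm.shape[0])
    for _ in range(max_iter):
        w = ngm @ v          # one "generation" of infections
        w = w / w.sum()      # renormalize so entries are shares
        converged = np.abs(w - v).max() < tol
        v = w
        if converged:
            break
    # Since v sums to 1 and ngm @ v = r0 * v at convergence, the sum is r0
    r0 = (ngm @ v).sum()
    return r0, v

# Hypothetical next-generation matrix: entry [i, j] is the expected number of
# infections in group i caused by one infected person in group j.
ngm = np.array([[1.2, 0.4, 0.1],
                [0.5, 1.0, 0.3],
                [0.1, 0.3, 0.6]])

r0, v = perron_vector(ngm)
n_tests = 1000
allocation = n_tests * v     # tests per group, proportional to PF entries
print("R0:", r0)
print("allocation:", np.round(allocation))
```

In practice the allocation would be rounded to whole tests; the point here is only that the groups with the largest eigenvector entries receive the largest share.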
We called this "Model & Demographic Informed" sampling (MDI), but from a networks view, it allocates samples by *eigenvector centrality* (plus a term reflecting existing estimates of seroprevalence). "See supplement" as they say. But there's one more implication worth noting. 12/
The next-gen-matrix contains the age contact structure of the population, so *if you want to model interventions* [closing schools, soc. distancing], you change the matrix, change the e-vec, and therefore should allocate samples differently! Variation by country too. [Figure] 13/
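A quick way to see this effect is to change one block of a next-generation matrix and recompute the eigenvector shares. The sketch below uses an invented 3-group matrix (children, adults, older adults) and a made-up "school closure" that cuts child-to-child transmission by 80%; the numbers and the helper name `pf_shares` are assumptions for illustration only.

```python
import numpy as np

def pf_shares(ngm):
    """PF eigenvector of a nonnegative matrix, scaled so its entries sum to 1."""
    eigvals, eigvecs = np.linalg.eig(ngm)
    k = np.argmax(eigvals.real)          # PF eigenvalue has the largest real part
    v = np.abs(eigvecs[:, k].real)       # fix the arbitrary sign from eig
    return v / v.sum()

# Hypothetical baseline: rows/columns are children, adults, older adults
baseline = np.array([[1.5, 0.4, 0.1],
                     [0.4, 1.0, 0.3],
                     [0.1, 0.3, 0.5]])

# Hypothetical intervention: closing schools scales child-child contacts by 0.2
schools_closed = baseline.copy()
schools_closed[0, 0] *= 0.2

shares_baseline = pf_shares(baseline)
shares_closed = pf_shares(schools_closed)
print("baseline shares:      ", np.round(shares_baseline, 3))
print("schools-closed shares:", np.round(shares_closed, 3))
```

With these toy numbers the children's share of the test budget drops once schools close, and the adults' share grows, which is the qualitative shift the tweet describes.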
This MDI allocation should help reduce the uncertainty that we get from scarce and unreliable serological test kits by strategically allocating them, particularly when there are interventions that policymakers are considering. We hope these results will be useful in study design.
This is a preprint, so if you spot issues with the math or otherwise, please chime in. It's important that we get this stuff right. But the techniques here involved some classics and some new connections for me, so I wanted to share. Thx: @baileyfosdick @jugander Arjun Seshadri
Thanks of course to tireless colleagues too: @baileyfosdick @yhgrad @Caroline_OF_B @CJEMetcalf @bubar_kate Sam Zhang @StephenKissler. Plus the @BioFrontiers IT/HPC staff, @overleaf for their pandemic account upgrades, & @_nickdavies for sanity checks and his #openscience help.