As a companion to our new paper in @NatureMicrobiol we've opened up the Host-Virus Model Database, the @viralemergence team library of studies that try to predict the host-virus network. How does it work? 🧵
In our new paper, we define a taxonomy of six types of models that try to predict the host-virus network; in practice, they don't always look and feel like network questions (e.g., do some mammals have a higher richness of zoonotic viruses?) nature.com/articles/s4156…
We outline six big model "shapes": predicting host-virus associations; host / reservoir / vector identification; predicting zoonotic potential; predicting viral sharing; analyzing viral host range; and analyzing host viral richness. Plus, some odd ones out (e.g., viral transmissibility)
You can think of HVMD as an annotated library of studies that apply statistics or machine learning to each of these different problems, including some information on the modelling - what data did they use? Which methods? What predictors did they try?
Let's see it in practice. Take this question from @b_longdon: what predicts (1) viral zoonotic potential and (2) viral host range? Both of those are network questions, and both have a pretty extensive evidence base in HVMD.
So drop into the AirTable, select "Host range" and "Zoonotic risk" studies, select studies that include viruses, and...
...there are 23 studies that use everything from logistic regression to reverse complement neural networks! An easy starting place when you're kicking off a study and looking for ideas, avoiding pseudoreplication, or just learning about how viruses work.
Since we started writing our study, this field has exploded. We're entering new studies all the time (and you'll see there's a bit of a backlog too) so keep checking back in, and let us know if you want to help / your paper should be in here!
One last thing before I drop off for the holidays… a bit of behind the scenes on our new paper
Before there was a thing called Verena, there was me and @Gfalbery, some long chats over beers, and 12% of an idea for something called VirusNet - a team that would stop duplicating efforts and pseudoreplicating the same analyses of the host-virus network, and go far together
It’s absolutely bewildering to think this is the first thing @viralemergence worked on and that it’s finally, finally out there.
And now, a mega-thread: If you've ever wondered what connects all our work at @viralemergence, our new paper in @NatureMicrobiol ties it all together. No, really. It's all one thing. Want to step through the Verena Cinematic Universe together?
Our team uses big data, statistics, and machine learning to understand "the science of the host-virus network", a broad methodological problem that includes a number of smaller, more applied problems. nature.com/articles/s4156…
If you want to study the host-virus network, you need data. But as @roryjgibb & co. showed, existing datasets are full of taxonomic inconsistencies and conflicts. We needed a synthesis. academic.oup.com/bioscience/art…
With machine learning and network science, we can start to recover the source code of the global virome. We wrote an instruction manual - out now in @NatureMicrobiol - and along the way, we've tried to solve what it would mean to really "predict and prevent the next pandemic" 🧵
Practically every big question about viral ecology, evolution, and emergence - from "why do bats have so many deadly viruses?" to "can we spot a pandemic flu before the first human case?" - is a variation on a fundamental scientific challenge: predicting the host-virus network.
Over three years of research, we've compiled these kinds of studies into a unified framework, allowing us to put our finger on a new convergence science - "the science of the host-virus network" - that uses computational inference to understand viral biology across scales.
We're learning today that alphacoronavirus 1, previously not known to be zoonotic, jumped from dogs to humans > a year ago. Key lessons for where COVID-19 has been pointing us in the wrong direction 🧵
1⃣ Singular focus on wildlife trade / wildlife farming as a human-animal interface is a mistake, given other natural pathways of emergence like, here, pet dogs or cats (probably).
(This doesn't mean we have to start getting rid of pets / livestock to prevent pandemics! The point of building strong healthcare systems - including One Health monitoring systems that include vets - is to stay safe by catching these kinds of events early and often.)
Something missing in a lot of viral ecology / "stop pandemics at the source" work right now:
If you're not including flu in your schema for pandemic risk, you're not actually talking about pandemics. You're talking about general disease emergence, not pandemic preparedness.
Before COVID-19, the answer to the question "what's the next pandemic most likely to be?" was influenza. After COVID-19? It's actually still influenza, believe it or not.
So much of how we respond when a new virus emerges through a new pathway is to try to hyperfocus on sealing that entryway. But it's a bit like only locking the specific window a burglar came into your house through, and not checking the front door.