Bloom Lab
Feb 9, 2022 · 32 tweets · 13 min read
I'd like to add my preliminary thoughts on a new pre-print by Istvan Csabai & Norbert Solymosi that is receiving a lot of attention because of speculation it might contain new data relevant to origins or early spread of #SARSCoV2 in China: researchsquare.com/article/rs-133… (1/n)
This is actually second pre-print by these authors on topic. I initially heard about first pre-print on Dec-23-2021 from @carlzimmer, who had noticed it: researchsquare.com/article/rs-117… (2/n)
That first pre-print describes analysis of metagenomic samples collected in Antarctica in 2018-2019 that were subsequently sequenced and found to contain #SARSCoV2 reads. The pre-print suggests these might be early #SARSCoV2 reads based on the mutations they contain. (3/n)
After hearing about the pre-print from @carlzimmer, I downloaded the samples and performed the analyses myself. A GitHub repo with my computer code, results, and notes about the analysis / timeline is available here: github.com/jbloom/PRJNA69… (4/n)
My analysis confirmed main findings of first pre-print: some samples did contain #SARSCoV2 reads, with most reads in 3 of 11 samples. In addition, some reads contained three key mutations: C8782T, C18060T, and T28144C, although there is clearly a mixed viral population. (5/n)
Those three mutations are intriguing because they are all "ancestral" mutations that move the sequence *closer* to the bat CoV relatives RaTG13 and BANAL-20-52 relative to first reported Wuhan-Hu-1 sequence from the Huanan Seafood Market. (6/n)
A virus with those three mutations relative to Wuhan-Hu-1 is one of the two plausible progenitors for all currently known human #SARSCoV2 (the other plausible progenitor has C29095T rather than C18060T). See academic.oup.com/mbe/article/38… and academic.oup.com/mbe/article/38… (7/n)
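As a minimal sketch of the two candidate progenitor genotypes described above, the mutation labels can be parsed and compared as sets. The names `Mutation`, `parse_mutation`, `PROGENITORS`, and `matching_progenitors` are hypothetical and for illustration only; this is not code from the GitHub repo or from either pre-print.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    ref: str  # nucleotide in the Wuhan-Hu-1 reference
    pos: int  # 1-based genome position
    alt: str  # nucleotide observed instead


def parse_mutation(label: str) -> Mutation:
    """Parse a label such as 'C8782T' into (ref, pos, alt)."""
    m = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
    if m is None:
        raise ValueError(f"not a nucleotide substitution: {label!r}")
    return Mutation(m.group(1), int(m.group(2)), m.group(3))


# The two plausible progenitors named in the thread, relative to Wuhan-Hu-1.
# The dictionary keys are illustrative labels, not established lineage names.
PROGENITORS = {
    "with C18060T": frozenset(
        parse_mutation(x) for x in ("C8782T", "C18060T", "T28144C")
    ),
    "with C29095T": frozenset(
        parse_mutation(x) for x in ("C8782T", "C29095T", "T28144C")
    ),
}


def matching_progenitors(observed_labels):
    """Return the progenitor labels whose mutations are all in the observed set."""
    observed = {parse_mutation(x) for x in observed_labels}
    return [name for name, muts in PROGENITORS.items() if muts <= observed]
```

Because the samples contain a mixed viral population, a match here only says the defining mutations are all present somewhere in the reads, not that they sit together on one genome.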
This fact suggests that some sequencing reads come from a virus genetically ancestral to the known sequences from the Huanan Seafood Market, although the stochastic nature of viral mutations means that a more genetically ancestral sequence is not always temporally earlier. (8/n)
In early January, I was contacted by lead author of pre-print, Istvan Csabai. He reached out because the three samples with most #SARSCoV2 reads had just been deleted from @NIH's Sequence Read Archive, reminding him of my paper on deleted sequences: academic.oup.com/mbe/article/38… (9/n)
I confirmed the sequences had been deleted, and archived weblinks showing the original, deleted, and then subsequently restored (see below) pages for the samples are linked in the README in the GitHub repo I created: github.com/jbloom/PRJNA69… (10/n)
Istvan showed me info he had received from Chinese scientists who deposited the sequences. The samples were submitted for sequencing by Sangon Biotech in Dec 2019, & they received results in early 2020. Suggests #SARSCoV2 reads from contamination at Sangon Biotech. (11/n)
I agree this is almost certainly the explanation. Contamination at large-scale sequencing facilities happens, and can be due to index hopping / mis-assignment or physical cross-contamination. In this case, former more likely due to bias towards #SARSCoV2 in read 2. (12/n)
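One hedged way to quantify a bias towards read 2 is to count viral hits in each mate and ask how surprising the split would be if hits were equally likely in either. The sketch below uses an exact two-sided binomial test; the function names are hypothetical, the counts passed in would come from your own alignment step, and this is not the analysis in the GitHub repo.

```python
from math import comb


def read2_fraction(r1_hits: int, r2_hits: int) -> float:
    """Fraction of viral hits that fall in read 2 of the pairs."""
    total = r1_hits + r2_hits
    if total == 0:
        raise ValueError("no viral reads to compare")
    return r2_hits / total


def binomial_two_sided_p(r1_hits: int, r2_hits: int) -> float:
    """Exact two-sided p-value for H0: a hit is equally likely in read 1 or read 2."""
    n = r1_hits + r2_hits
    if n == 0:
        raise ValueError("no viral reads to compare")
    k = max(r1_hits, r2_hits)
    # Upper tail P(X >= k) for X ~ Binomial(n, 0.5), doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * tail)
```

A small p-value only says the split is uneven; whether an uneven split points to index mis-assignment rather than physical cross-contamination is the interpretation given in the tweet above, not something this test establishes.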
Timeline matters a lot here. According to Chinese scientists, samples submitted in Dec 2019, results received in early 2020. If they were sequenced in Dec 2019 then exceptionally important, because Chinese govt holds #SARSCoV2 not discovered until Dec 30-31... (13/n)
... On other hand, if sequenced in early 2020 then they could be contaminated with some early patient samples and still concord with Chinese govt timeline. Right now it doesn't seem there is enough info to narrow down timeline to distinguish between these. (14/n)
Istvan also explained to me the unfortunate fact that their first pre-print (which is very rigorous and matter-of-fact) was rejected by @biorxivpreprint, which is why they had to post it to Research Square where it got less notice. (15/n)
Shortly thereafter, Istvan & Norbert performed some ingenious further analyses that are basis for their second even more intriguing pre-print researchsquare.com/article/rs-133…, which was unfortunately also rejected by @biorxivpreprint. (16/n)
In their second pre-print, they analyzed *host* reads alongside viral reads and found they came from human, African Green Monkey & hamster. (17/n)
I have *not* yet independently validated these host analyses and so cannot vouch for them, although they appear solid from textual description. Assuming they are correct, the presence of these host reads in the samples is intriguing. (18/n)
Obviously, none of these hosts should be in Antarctica metagenomes, & abundance of host reads parallels abundance of #SARSCoV2 reads. The hosts are interesting: Vero cells are African Green Monkey; CHO cells are from hamster, & hamsters themselves are also used to study #SARSCoV2 (18/n)
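Whether host-read abundance "parallels" viral-read abundance across samples could be checked with a rank correlation. This is a minimal sketch using Spearman's coefficient with average ranks for ties; `average_ranks` and `spearman` are hypothetical helper names, and the per-sample counts would have to come from the actual data, which are not reproduced here.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (vx * vy) ** 0.5
```

With only 11 samples, and most viral reads in 3 of them, any such coefficient would be a rough descriptive summary rather than strong evidence.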
This fact suggests some #SARSCoV2 reads come from samples grown in Vero & hamster cells (or hamsters). Again significance depends on timeline. #SARSCoV2 in Vero cells in Dec 2019 inconsistent with current account of viral origins, but WIV had virus in Vero cells by early to mid Jan (19/n)
Without knowing the sequencing timeline more precisely than the current "December 2019 to early 2020," all we can say is that these samples were contaminated at Sangon Biotech with some early #SARSCoV2 viruses, some of which appear to have been from lab-grown samples. (20/n)
This is obviously super interesting, and I hope further analyses or additional data can shed more light. (21/n)
One postscript: After being deleted from @NIH Sequence Read Archive in early Jan 2022, data were restored later that month. I asked Chinese authors & they did ask to have #SARSCoV2 contaminated samples deleted, but did *not* ask to have them restored. So unclear what happened (22/n)
To clarify, mutations C8782T, C18060T & T28144C towards RaTG13 / BANAL-20-52 *not* unique to these samples. They are also in some other non-market lineage A viruses (see Tweet 7/n). So suggest a sequence about as ancestral as oldest known ones, but not more ancestral (23/n)
Also to more strongly clarify another point above, these samples are almost certainly contaminated with a *mix* of different #SARSCoV2 samples as indicated by presence of multiple non-fixed viral mutations and reads from multiple host species. (24/n)
Another postscript: this comment @acritschristoph posted on the Antarctica-#SARSCoV2 pre-print makes good points that should also be considered in continued analysis of the data: researchsquare.com/article/rs-133… (25/n)
To follow up on another question, this point by @K_G_Andersen is also relevant. There is a mix of mutations in these samples, because they are almost certainly contaminated with several different #SARSCoV2-containing samples. (26/n)
Above I called some mutations "ancestral" which I am using to loosely mean mutations likely present in earliest #SARSCoV2 viruses. Eg, @sergeilkp et al propose earliest virus had mutations at 8782, 18060, & 28144 (academic.oup.com/mbe/article/38…), and those are in these samples. (27/n)
Kristian correctly points out some other mutations (eg at 23525) are "derived," which he is using loosely to mean mutations unlikely to be present in earliest #SARSCoV2 viruses. These two observations consistent w idea that we are seeing mutations from a mix of viruses. (28/n)
Overall point is we can't precisely date samples just from mutations. Several reasons: (a) we see mix of mutations, not full sequences, (b) we don't know exact true most "ancestral" sequence, although we can guess it was closer to RaTG13/BANAL-20-52 than later viruses, ... (29/n)
... (c) sequences cannot be dated at high resolution just from mutations due to stochasticity of evolution. (30/n)
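Point (c) can be illustrated with a toy Poisson model of mutation accumulation. The rate of about two substitutions per month across the genome is an assumed ballpark parameter, not a figure from this thread, and the function names are hypothetical; treat this as a sketch of the reasoning rather than a dating method.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events when the expected count is `mean`."""
    return exp(-mean) * mean**k / factorial(k)


def prob_mutation_count(k: int, days: float, subs_per_month: float = 2.0) -> float:
    """Probability that a lineage has accumulated exactly k substitutions after `days`.

    `subs_per_month` is an assumed genome-wide rate (30-day months).
    """
    mean = subs_per_month * days / 30.0
    return poisson_pmf(k, mean)
```

Under that assumed rate, `prob_mutation_count(0, 15)` is exp(-1), about 0.37, so a sequence identical to its ancestor could easily be two weeks younger than it. That spread is why a more genetically ancestral sequence is not always temporally earlier.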
To sum up, given mutations, I think we can be confident these are contaminated w "early viruses" in sense of late 2019 to early 2020, which is also consistent w what authors reported as sequencing timeline. But I doubt more precision than that possible from mutations alone (31/n)

