Steve Massey
May 10 · 24 tweets · 7 min read
A new animal coronavirus, possibly from bamboo rat, is present in abundance in sample Q61 from the Huanan Seafood Market

This find, however, stands in stark contrast to the single SARS2 read present in the same sample, which has a high proportion of raccoon dog sequences 🧵
2/ Sample Q61 was generated by Liu et al in their survey of the Huanan Seafood Market

It has generated discussion for having a high level of raccoon dog reads; the raccoon dog has been postulated by some workers as an intermediate host for SARS2
nature.com/articles/s4158…
3/ Previously, I had identified reads related to human OC43 /HKU1 coronaviruses using the coronascan procedure
4/ Subsequently, several other investigators and I determined, using Blast, that these reads were more closely related to rodent coronaviruses (Blast allowed matching to partial CoV sequences and to additional whole genome sequences)
5/ The next step is to add the complete coronavirus genomes that were hit in the Blast search to the coronascan procedure

Coronascan systematically maps NGS reads against a dataset of complete coronavirus genomes, to find the best match
github.com/semassey/Scann…
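The ranking idea described in 5/ can be illustrated with a minimal Python sketch. This is not the coronascan code: the function names, reference names and alignment spans below are all hypothetical, and it assumes the alignments have already been reduced to (start, end) spans per reference genome.

```python
def percent_coverage(intervals, genome_length):
    """Percentage of reference positions covered by at least one read.

    intervals: iterable of (start, end) spans, 0-based and half-open.
    """
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Disjoint from the running block: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching: extend the running block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / genome_length


def rank_by_coverage(alignments, genome_lengths):
    """Return [(reference, percent coverage)] sorted from best to worst."""
    scores = {
        ref: percent_coverage(spans, genome_lengths[ref])
        for ref, spans in alignments.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: made-up reference names and spans, not values from Q61.
toy_alignments = {
    "ref_A": [(0, 150), (100, 250), (600, 750)],
    "ref_B": [(0, 150)],
}
toy_lengths = {"ref_A": 1000, "ref_B": 1000}
ranking = rank_by_coverage(toy_alignments, toy_lengths)
print(ranking)
```

Overlapping reads are merged before counting, so a position covered by many reads contributes only once to the % coverage.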
6/ The coronascan results show that reads from Q61 map best to the Bamboo rat coronavirus GX/GX19-89/2019 genome (OQ297694)

The genome coverage is 15.3 %

(note I tweaked bowtie2 mapping by adding the -a and --non-deterministic options, which allows more multi-mapping of reads)
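For readers who want to see what such a bowtie2 call might look like, here is a hedged Python sketch that only builds the command line. The index prefix, FASTQ and SAM file names are hypothetical, and the use of -U (unpaired reads) and -p (threads) is an assumption; only -a and --non-deterministic come from the thread.

```python
import shlex


def bowtie2_command(index_prefix, reads_fastq, out_sam, threads=4):
    """Build a bowtie2 argument list with the two options named in the thread."""
    return [
        "bowtie2",
        "-x", index_prefix,       # prefix of a bowtie2-build index (hypothetical name)
        "-U", reads_fastq,        # unpaired reads; an assumption about the input
        "-S", out_sam,            # SAM output file
        "-p", str(threads),       # number of alignment threads
        "-a",                     # report all valid alignments per read
        "--non-deterministic",    # seed the random generator per run, not per read
    ]


cmd = bowtie2_command("cov_genomes", "Q61.fastq", "Q61_vs_cov.sam")
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where bowtie2 is installed. With -a, a read that aligns to several reference genomes is reported for each of them rather than for a single one.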
7/ A level of 15.3 % is comparable to other wildlife associated RNA viruses in sample Q61
8/ The coronavirus genomes with the next best % coverage were:

rodent CoV RCoV/GD/2020 genome (MW855473) 8.9 %

Bamboo rat CoV GX-F1-1 genome (OM480511) 8.4 %

Bamboo rat CoV GX/GX18-49/2018 genome (OQ316388) 6.4 %
9/ Given the close relationships of these four CoVs with OC43/HKU1, they are likewise embecoviruses, which are beta-coronaviruses

SARS2 is a sarbecovirus, which is another subgenus of beta-coronavirus
en.wikipedia.org/wiki/Embecovir…
10/ While it is difficult to be certain as to the host animal of the new CoV detected in the Q61 sample, the type of correlational analysis done by @jbloom_lab across all NGS samples might reveal whether the novel CoV is correlated with bamboo rat reads
biorxiv.org/content/10.110…
11/ Liu et al also did a qualitative spatial correlation analysis which could likewise be informative:
12/ Previously, I had noted that the coronascan procedure could be improved by adding more complete and partial coronavirus genomes

However, partial genomes would confuse the ranking using # reads and % coverage. Some form of normalization would therefore be required
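One possible normalization, shown here purely as an illustration and not as the method used in coronascan, is to divide the read count by reference length (reads per kilobase). The names and numbers below are invented toy values.

```python
def reads_per_kb(read_count, reference_length):
    """Read count scaled by reference length in kilobases."""
    return read_count / (reference_length / 1000.0)


# Toy values (read count, reference length in nt); not taken from Q61.
toy = {
    "complete_genome": (300, 30000),
    "partial_fragment": (30, 1000),
}

raw_best = max(toy, key=lambda name: toy[name][0])
normalized = {name: reads_per_kb(n, length) for name, (n, length) in toy.items()}
normalized_best = max(normalized, key=normalized.get)

print(raw_best, normalized_best, normalized)
```

In this toy case the complete genome wins on raw read count, while the short partial sequence wins once length is taken into account, which is the kind of ranking distortion the normalization is meant to expose.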
13/ Ranking whole coronavirus genomes on the basis of % coverage is useful as % coverage is related to phylogenetic distance (as well as viral titer in the sample)

In addition, masking of low-complexity regions using bbmask.sh should improve accuracy h/t henjin
14/ When I mapped the Q61 reads to the GX/GX19-89/2019 genome only, there was a substantially higher level of mapping: 32.9 % coverage

This is due to absence of other genomes, to which reads might cross-map

Cross-mapping dynamics need to be better understood and controlled for
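A first step toward quantifying cross-mapping could be to separate reads that hit a single reference from reads that hit several. The sketch below assumes the alignments have already been summarized as a mapping from read ID to the set of references it aligned to; the function name and toy data are hypothetical.

```python
from collections import Counter


def split_unique_and_shared(read_hits):
    """Count, per reference, reads that map only there vs. reads shared with others."""
    unique, shared = Counter(), Counter()
    for refs in read_hits.values():
        target = unique if len(refs) == 1 else shared
        for ref in refs:
            target[ref] += 1
    return unique, shared


# Toy data: made-up read IDs and reference names.
toy_hits = {
    "read1": {"ref_A"},
    "read2": {"ref_A", "ref_B"},
    "read3": {"ref_A", "ref_B"},
    "read4": {"ref_B"},
    "read5": {"ref_A"},
}
unique, shared = split_unique_and_shared(toy_hits)
print(dict(unique), dict(shared))
```

A high shared count relative to the unique count for a reference would suggest that its apparent coverage depends heavily on which other genomes are in the panel.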
15/ I took the 331 reads that mapped to GX/GX19-89/2019 only, and tried to assemble them using Megahit

Unfortunately, Megahit did not generate any assembled contigs

(contigs can give precise information regarding SNVs, when compared to the GX/GX19-89/2019 genome)
16/ I then displayed the mapped reads on the GX/GX19-89/2019 genome using Integrative Genomics Viewer (IGV) - they are fairly evenly mapped across the genome, although somewhat denser in the second half of the genome
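The visual impression of read density could also be checked numerically by binning read start positions into windows along the genome. This is a minimal sketch with invented positions and an assumed genome length, not the actual Q61 mapping.

```python
def window_counts(read_starts, genome_length, n_windows=2):
    """Count read start positions (0-based) falling in each equal-width window."""
    counts = [0] * n_windows
    for pos in read_starts:
        # Clamp so a position at the very end still lands in the last window.
        idx = min(pos * n_windows // genome_length, n_windows - 1)
        counts[idx] += 1
    return counts


# Toy start positions on a hypothetical 30,000 nt genome.
toy_starts = [100, 5000, 16000, 20000, 29000]
halves = window_counts(toy_starts, 30000, n_windows=2)
print(halves)
```

Using more windows (for example n_windows=10) gives a coarse coverage profile that can be compared with the IGV view.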
17/ henjin (from Discord metagenomics chat) did a more comprehensive analysis than me, with a larger # of complete CoV genomes and did a distance based clustering tree using ggtree

henjin found the 3 bamboo rat CoVs group between OC43 and HKU1 CoVs
18/ "In the plot ... the x-axis shows the distance to the bamboo rat virus which had the most aligned reads from Env_0576, and the y-axis shows the distance to the HKU1 reference genome"
19/ "Each point is connected with a line to its two neighbors. "Mouse coronavirus PREDICT/PDF-2560" is connected to two of the bamboo rat coronaviruses even though it has over 7,000 nucleotide changes from "Bamboo rat coronavirus isolate GX/GX19-89/2019""

h/t henjin
20/ Given that some of the novel CoV reads are divergent from the GX/GX19-89/2019 genome, possessing SNVs, there are likely additional reads present that do not map to GX/GX19-89/2019 (due to the sequence divergence)
21/ Consequently, the number of reads derived from the new coronavirus is likely higher than 331, strongly outnumbering the single SARS2 read
22/ The value of this novel CoV is that it shows that the # of reads that can be expected from an animal-associated CoV in the market NGS datasets is on the order of hundreds

It also shows that a market animal CoV had not degraded prior to sampling, being present at significant levels
23/ In the light of these considerations, the presence of only a single SARS2 read, in a sample with a large proportion of raccoon dog reads, appears increasingly inconsistent with the presence of infected raccoon dogs
