I’ve refrained from commenting on metagenomics of environmental samples from Huanan Seafood Market, since media coverage preceded description of data or analysis.
Data still not available, but there is now enough written down to offer partially informed assessment.
These data don’t tell us how pandemic began, but every bit of information helps.
Here is what I glean from the available analyses by the two groups w access.
First analysis is pre-print Chinese CDC posted over year ago (Feb-25-2022) that describes sampling animals & environment at Huanan Market: researchsquare.com/article/rs-137…
It is moving thru peer review at glacial pace: @researchsquare status says still under review at a Nature journal.
Below is my summary of Chinese CDC preprint when it posted last year:
Human infections started in Wuhan no later than Nov 2019, which limits how much can be concluded from either positive or negative samples collected on Jan 1.
Chinese CDC collected samples throughout market, but focused on southwest corner where animals sold. Because sampling focused on southwest corner, both negative (green) and positive (orange) environmental samples were mostly from that part of market.
Chinese CDC reports testing samples for SARS2 by PCR & deep sequencing them. They report no tendency for the rate of PCR positive samples to be associated with any type of market vendor: similar positivity rates for stalls selling wildlife, vegetables, livestock, seafood, etc.
Table 1 summarizes environmental sample testing.
I’ve added highlighting & raccoon dog image next to sample Q61.
This sample was collected on Jan-12 & was negative for SARS2 by PCR but positive by NGS, which presumably means it had detectable SARS2 reads in deep sequencing.
Does fact Q61 was negative for SARS2 by PCR testing mean it had less viral RNA than samples w measurable Ct values? Unclear, because neither Chinese CDC pre-print nor subsequent analysis discussed below quantify fraction of deep sequencing reads that map to SARS2 for all samples.
This brings us to the second analysis by Crits-Christoph et al which recently posted on Zenodo. This pre-print analyzes deep sequencing data for some samples the authors obtained via GISAID, although those data remain unavailable to others. zenodo.org/record/7754299…
The way Crits-Christoph et al obtained these data has itself become a controversy. I am not going to address that. Read the first 3 pages of Crits-Christoph report linked above & GISAID statement (gisaid.org/statements-cla…) for opposing viewpoints.
Instead, I’ll focus on the substance of the Crits-Christoph analysis. Their analysis focused on the non-SARS2 metagenomic content of the deep sequenced environmental samples.
Key results are in Fig 1 of their analysis, shown below.
Contour heatmap shows fraction of positive samples from different regions of market.
Positive samples mostly from southwest corner, since that is where Chinese CDC mostly collected samples.
So contour plot just reflects where Chinese CDC collected samples (as shown in Fig 1A of their pre-print, mentioned earlier in this thread).
In other words, contour does not show percent positivity of collected samples, but total positive samples not normalized by total samples.
The novel part of Fig 1 of Crits-Christoph et al is fraction of mammalian mtDNA reads in each analyzed sample that mapped to variety of mammalian species. These are pie chart circles.
There are samples w mtDNA from humans, pigs, sheep, cows, Siberian weasels, raccoon dogs, etc.
The sample that has attracted the most attention is the mostly light green pie circle, for which the dominant mtDNA is from a raccoon dog. This has attracted attention because raccoon dogs are one of several species known to be susceptible to SARS2.
That green pie circle is sample Q61 from Chinese CDC pre-print, which was SARS2 negative by PCR but positive by deep sequencing. When data become available, I would like to quantify SARS2 reads in each sample, as it would be useful to know how much SARS2 was in each sample.
For instance, it is unclear whether there is enough SARS2 in the raccoon dog mtDNA dominated Q61 sample to actually infer a viral sequence, or if the undetectable Ct value for that sample means there are just trace levels of reads that are too low to obtain a viral sequence.
There are also lots of pie circles where the dominant mtDNA is from humans, & also from animals that were probably only sold as meat (eg, cow, sheep, pig). And although not shown in figure (since it’s not a mammal), the text reports some samples had fish mtDNA.
So as Crits-Christoph et al note in text, fact that mtDNA from a species is found in a sample doesn’t mean that species was infected w SARS2. It just means material from that animal ended up in the same place as viral RNA, which was widespread in the Huanan Market by Jan 2020.
They also analyze a few samples in more detail, including assembling near-complete mtDNA for some animals, and some genomic / cDNA contigs for sample Q61. This could be useful for learning more details about animals from which genetic material in the market was derived.
Overall, main thing we learn is details of which animals or products (eg, meats) were in market before it closed on Jan-1-2020.
This doesn’t tell us if any infected w SARS2. But knowing more about animals could help trace supply chain, which is valuable line of investigation.
However, human SARS2 infections started in Wuhan no later than Nov 2019. So we have to be circumspect about environmental samples from Jan 2020.
Eg, raccoon dog sample Q61 was collected on Jan-12-2020, which is at least 6 and probably >8 weeks after first human infections.
This brings me back to main caveat I noted about Chinese CDC pre-print when it posted last year: we are unlikely to get conclusive answers about origin of an outbreak that started in Nov 2019 (or earlier) by looking at samples collected in Jan 2020:
Analyses of Jan 2020 samples is definitely worthwhile, because as @DrTedros rightly noted each piece of data is important for better understanding initial outbreak in Wuhan.
But to identify actual origin of outbreak, we need details from Nov or early Dec 2019.
Recall that there are descriptions of confirmed cases, both unlinked and linked to the market, dating back to Nov and very beginning of Dec 2019:
I’ve updated SARSCoV2 antibody-escape calculator w new deep mutational scanning data of @yunlong_cao @jianfcpku
My interpretation: antigenic evolution currently constrained by pleiotropic effects of mutations on RBD-ACE2 affinity, RBD up-down position & antibody neutralization
@Nucleocapsoid @HNimanFC @mrmickme2 @0bFuSc8 @PeacockFlu @CVRHutchinson @SCOTTeHENSLEY To add to thread linked above, human British Columbia H5 case has a HA sequence (GISAID EPI_ISL_19548836) that is ambiguous at *both* site Q226 and site E190 (H3 numbering)
Both these sites play an important role in sialic acid binding specificity
@Nucleocapsoid @HNimanFC @mrmickme2 @0bFuSc8 @PeacockFlu @CVRHutchinson @SCOTTeHENSLEY If you are searching literature, these sites are E190 and Q226 in H3 numbering, E186 and Q222 in mature H5 numbering, and E202 and Q238 in sequential H5 numbering (see: )dms-vep.org/Flu_H5_America…
Here is analysis of HA mutations in H5 influenza case in Missouri resident without known contact w animals or raw milk.
TLDR: there is one HA mutation that strongly affects antigenicity, and another that merits some further study.
As background, CDC recently released partial sequence of A/Missouri/121/2024, which is virus from person in Missouri who was infected with H5 influenza.
Here I am analyzing HA protein from this release, GISAID accession EPI_ISL_19413343cdc.gov/bird-flu/spotl…
Sequence covers all of HA except signal peptide, and residues 325-351 (sequential numbering) / 312-335 (H3 numbering). The missing residues encompass HA1-HA2 boundary, and any missed mutations there unlikely to affect antigenicity or receptor binding, but could affect stability.
In new study led by @bblarsen1 in collab w @veeslerlab @VUMC_Vaccines we map functional & antigenic landscape of Nipah virus receptor binding protein (RBP)
Results elucidate constraints on RBP function & provide insight re protein’s evolutionary potentialbiorxiv.org/content/10.110…
Nipah is bat virus that sporadically infects humans w high (~70%) fatality rate. Has been limited human transmission
Like other paramyxoviruses, Nipah uses two proteins to enter cells: RBP binds receptor & then triggers fusion (F) protein by process that is not fully understood
RBP forms tetramer in which 4 constituent monomers (which are all identical in sequence) adopt 3 distinct conformations
RBP binds to two receptors, EFNB2 & EFNB3
RBP’s affinity for EFNB2 is very high (~0.1 nM, over an order of magnitude higher than SARSCoV2’s affinity for ACE2)