2/ The existence of such intermediate genomes in humans is incompatible with the two-spillover model for the origin of C19
To recap: lineage A has T8782/C28144 (T/C) while lineage B has C8782/T28144 (C/T)
An A/B intermediate will have C/C or T/T
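The recap above can be written as a small lookup. This is only an illustrative sketch: the function name and labels are hypothetical, and the only facts used are the nucleotides at positions 8782 and 28144 given in the thread.

```python
# Genotypes at positions 8782 / 28144 (Hu-1 coordinates), as stated in the thread.
LINEAGE_BY_GENOTYPE = {
    ("T", "C"): "A",
    ("C", "T"): "B",
    ("C", "C"): "A/B intermediate (C/C)",
    ("T", "T"): "A/B intermediate (T/T)",
}


def classify_genotype(nt_8782: str, nt_28144: str) -> str:
    """Classify a genome from its bases at positions 8782 and 28144.

    Hypothetical helper for illustration; anything other than C or T at
    either position is reported as 'unclassified'.
    """
    key = (nt_8782.upper(), nt_28144.upper())
    return LINEAGE_BY_GENOTYPE.get(key, "unclassified")


print(classify_genotype("C", "T"))  # Hu-1 reference genotype
```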
3/ Firstly, as previously pointed out, Pekar et al.'s exclusion criterion of 'low read depth' is inconsistent with data from GISAID showing high read depth for the majority of the datasets
Only 1 genome falls below this criterion (Table 1, in yellow)
5/ Curiously, 'contamination' is used as an exclusion criterion, yet nowhere is any evidence of contamination presented. One way to demonstrate contamination is haplogroup analysis of human mitochondrial genomes, showing that more than one haplogroup is present, which they fail to do
6/ 'Personal communications' are used to exclude 11 C/C and 3 T/T genomes. An 'L.Chen' is credited for the information that 11 intermediate C/C genomes from Sichuan and Wuhan are sequencing errors
7/ However, the identity of L.Chen is unclear. In addition, the C/C genomes (actually 12 not 11) from Sichuan and Wuhan were sequenced in different sequencing centers, so likely sequenced by two different people
8/ Since the authors explicitly link L.Chen to Sichuan 👇, the Wuhan genome was sequenced by an unidentified person, whom we term 'person X'. It is concerning that a C/C intermediate genome was excluded on the basis of a personal communication with an unidentified person
9/ One of the Sichuan C/C genomes (EPI_ISL_451320) excluded by Pekar et al. is actually used by NextStrain to root their phylogenetic tree (as an A/B root). This genome has a 1335X sequencing depth. Clearly @nextstrain do not concur that it is erroneous
10/ Pekar et al. exclude genomes from Singapore (EPI_ISL_462306) and South Korea (EPI_ISL_413017) that had raw data available, on the basis of low sequencing depth at positions 8782/28144 and 28144, respectively
However, we show that the Singapore genome is clearly a T/T genome
11/ We do this by mapping the raw reads to the Hu-1 reference genome (which is C/T, lineage B)
Position 8782 has 12/12 T, while 28144 has 6/6 T
h/t @humblesci
The genome is therefore clearly T/T
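As a rough illustration of the read-support check, the sketch below tallies the base calls covering one position and reports the majority base. It assumes the bases at each site have already been extracted from the mapped reads (for example from a pileup); the function name is hypothetical and this is not the pipeline actually used.

```python
from collections import Counter
from typing import Tuple


def call_site(bases: str) -> Tuple[str, int, int]:
    """Return (majority base, supporting reads, depth) for one position.

    `bases` is a string with one base call per read covering the site.
    Characters other than A, C, G, T are ignored.
    """
    counts = Counter(b for b in bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return ("N", 0, 0)
    base, support = counts.most_common(1)[0]
    return (base, support, depth)


# Read support reported in the thread for the Singapore genome
site_8782 = call_site("T" * 12)   # 12/12 reads T
site_28144 = call_site("T" * 6)   # 6/6 reads T
print(site_8782, site_28144)
```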
12/ We also identify an additional intermediate T/T genome from Guangzhou (GZMU0025.capture, SRR13616010), which has the following SNVs when compared to Hu-1
13/ Remarkably, two C/C intermediate genomes from Beijing (2500X and 1850X sequencing depth) were excluded because 'no underlying data was available'.
This is hard to understand, and was selectively applied (it was not applied to the 787 remaining genomes used in their analysis)
14/ Puzzlingly, one T/T intermediate genome EPI_ISL_493182 was discarded even though it conformed to their (contentious) 10X read depth cutoff. Position 8782 is a consensus T nucleotide, with 19/29 reads T
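The arithmetic for this genome can be checked with a short sketch. The 10X depth cutoff and the 19/29 read count come from the thread; the simple-majority threshold for a consensus call is an assumption made here for illustration, and the function name is hypothetical.

```python
from typing import Tuple


def passes_cutoff(support: int, depth: int,
                  min_depth: int = 10,
                  min_fraction: float = 0.5) -> Tuple[bool, float]:
    """Check a site against a depth cutoff and a majority threshold.

    min_depth=10 follows the 10X cutoff mentioned in the thread;
    min_fraction=0.5 (simple majority) is an assumed threshold.
    Returns (passes, fraction of reads supporting the base).
    """
    fraction = support / depth if depth else 0.0
    return (depth >= min_depth and fraction > min_fraction, fraction)


# Position 8782 of EPI_ISL_493182: 19 of 29 reads are T
ok, frac = passes_cutoff(19, 29)
print(ok, round(frac, 3))
```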
15/ Pekar et al. fail to explain how repeated sequencing errors can occur in the same positions 8782 and 28144 in multiple genomes
If sequencing depth were a problem in causing miscalls, there should be a significant number of unique SNVs in these genomes, which is not observed
16/ 'Convergence' was used to exclude 7 intermediates that possessed A-, B- or A/B-specific SNVs. However, 5 of these possessed only 1 A- or B-specific SNV. These could be true intermediates that picked up an A- or B-specific SNV by convergence
This caveat was not discussed
17/ To conclude, the exclusion of most of the 20 intermediate genomes from the analysis of Pekar et al. is untenable, and represents an insurmountable problem for the conclusion of two zoonoses leading to the establishment of lineages A and B
A link between MERS-CoV pathogenicity and epithelial sodium channel (ENaC) was described in a 2016 proposal by Luis Enjuanes, collaborator of Baric h/t @USRightToKnow
Coincidentally, the human α-ENaC furin cleavage site (FCS) is identical to that of SARS2 🧵
2/ As far back as 2009 it was known that SARS1 spike and E proteins interacted with human ENaC, modulating its activity leading to fluid buildup in the lungs
Fluid buildup (pulmonary edema) is a key symptom of both SARS1 and SARS2
The GOF Executive Order includes no clear ban on US-based GOF research, and instead kicks the can down the road
Weak gruel at this stage, unfortunately; vested interests seem to have inserted themselves 🧵 whitehouse.gov/presidential-a…
@thackerpd @emilyakopp @HansMahncke @ban_epp_gofroc @lewiskamb 2/ While executive focus on GOF is to be welcomed, unfortunately this Executive Order (EO) does not deliver a ban on US GOF research. While it bans funding of dangerous GOF in countries of concern, why does it not do the same in the US? Why not a blanket ban?
3/ There is also a ban on GOF in countries with inadequate oversight, but presumably that allows GOF in countries that do pass the oversight test (and what that test is remains to be described)
2/ @quay_dr uploaded our original preprint on March 26
We identified 7 'frozen' SARS2 sequences from UNC, that were basal to other sequences generated at the same time. These could represent early SARS2 strains that had escaped from a research facility
3/ Before uploading the preprint, we sent an email to Dirk Dittmer and Melissa Miller of the Clinical Molecular Microbiology Laboratory, UNC Hospital, regarding the 7 anomalous sequences we had detected from their sequencing facility
Steve Quay @quay_dr and I have published evidence for a potential lab leak of SARS2 at the University of North Carolina (UNC)
7 genome sequences collected by UNC in 2020-21 are basal (ie 'frozen') indicating a potential lab origin 🧵 zenodo.org/records/150833…
2/ The 7 sequences were collected from Jun 2020 to Jan 2021 and submitted by Dr Dirk Dittmer of the Clinical Molecular Microbiology Laboratory, UNC Hospital
Puzzlingly, they show strong similarities to the SARS2 reference sequence Hu-1, sampled in Dec 2019
3/ On the dates the 7 sequences were collected, they should have accumulated more SNVs than they actually possess, according to the SARS2 molecular clock
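A minimal sketch of the molecular-clock reasoning: under a steady clock, the number of SNVs separating a sample from Hu-1 grows with time, so an expected count can be estimated for any collection date. The rate used here (about 8e-4 substitutions per site per year, roughly two a month) is an assumed, commonly quoted approximation, and the exact Hu-1 sampling day and the two example dates are illustrative placeholders, not values from the preprint.

```python
import math
from datetime import date

SUBS_PER_SITE_PER_YEAR = 8e-4       # assumed clock rate (approximation)
GENOME_LENGTH = 29903               # length of the Hu-1 reference genome
HU1_SAMPLED = date(2019, 12, 26)    # assumed day within Dec 2019


def expected_snvs(collected: date, reference: date = HU1_SAMPLED) -> float:
    """Expected SNVs relative to Hu-1 for a sample collected on `collected`."""
    years = (collected - reference).days / 365.25
    return SUBS_PER_SITE_PER_YEAR * GENOME_LENGTH * years


def prob_at_most(k: int, lam: float) -> float:
    """Poisson probability of seeing k or fewer SNVs when lam are expected."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


# Illustrative dates spanning the stated collection window (Jun 2020 to Jan 2021)
for d in (date(2020, 6, 1), date(2021, 1, 31)):
    print(d, round(expected_snvs(d), 1))
```

Passing a sequence's observed SNV count as `k` to `prob_at_most` gives a rough sense of how unusual a near-Hu-1 genome would be on that date under these assumptions.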
I've published a paper on the emerging phenomenon of 'frozen' virus genome sequences, and how they can indicate lab leaks
'Frozen' virus genome sequences are generated from recent outbreaks, but show surprising similarity to historic strains 🧵 mdpi.com/2076-2607/12/1…
2/ The classic example is the 1977 'Russian' H1N1 flu, which re-emerged after having gone extinct in the 1950s. Genomic sequences were identical to strains from the 1950s, indicating it had been stored for > 20 years before release nature.com/articles/27433…
3/ An alternative scenario of viral host persistence / latency seems unlikely
A clue to its non-natural origin was that only younger people got sick, while older people were immune. This is because the latter had been exposed to H1N1 in the 1950s and before pmc.ncbi.nlm.nih.gov/articles/PMC23…
It has been reported that German Intel has assessed that COVID19 originated via a lab leak
Swiss paper NZZ reports that our discovery of a MERS-related infectious clone with a chimeric spike from Wuhan sequencing datasets was considered by German Intel 🧵
2/ The study by @humblesci @Daoyu15 @BiophysicsFL @ydeigin @quay_dr and myself was published last year and demonstrated unreported GOF experimentation on novel MERS-related coronaviruses in Wuhan prior to the pandemic fortunejournals.com/articles/disco…
3/ Concerningly, we found evidence that a MERS spike (with FCS) was inserted into a HKU4r-CoV backbone, likely enhancing transmissibility in human cells, as the RBD is expected to have a high binding affinity for the hDPP4 receptor
Here is a thread on the preprint version of the paper: