A leak of polio from a research facility is indicated by a poliovirus genome sequence generated from a sample collected in 2014 in China by the Wuhan Institute of Virology
2/ Chesnais et al. sequenced 3 poliovirus genomes recovered from 60-year-old historic samples generated by Albert Sabin at the Pasteur Institute, Paris
Sabin is the father of the oral polio vaccine, created from attenuated polio strains
3/ Poliovirus is a single-stranded, positive-sense RNA virus that only naturally infects humans (no other natural hosts are known)
There are 3 poliovirus serotypes: PV-1, PV-2 and PV-3
4/ The 3 genome sequences were from strains Glenn (PV-3), cold variant P2149 (PV-1) and cold variant P712 (PV-2)
Glenn was a cold adapted strain isolated before 1956 from a child in Cincinnati
5/ The P712 cold variant was a serially passaged derivative of strain P712. P712 is a constituent of the oral polio vaccine
The P2149 cold variant is likewise a serially passaged derivative of strain P2149
Both cold variants are able to grow at 25C
6/ Surprisingly, the Glenn genome showed 95% identity to WIV14, generated from a sample collected in Dec 2014 by the Wuhan Institute of Virology (WIV)
7/ As Glenn is related to Saukett A (PV-3), which is used to produce inactivated polio vaccine (IPV), the group also sequenced the Saukett A genome
They obtained a sample from the British National Institute for Biological Standards and Control (NIBSC)
8/ Saukett A was first obtained from a child, James Sarkett (the name of the strain was misspelled), in California prior to 1954 by medical pioneer Jonas Salk, creator of the IPV
Salk never received a Nobel for his work, but perhaps should have done
9/ The Saukett A genome showed a 99% match to the WIV14 genome, providing a closer match than Glenn
This can be seen on the tree below (genetic distance is indicated by branch lengths)
10/ The high sequence identity, with only 70 nucleotide differences, indicates a recent common ancestor between Saukett A and WIV14
However, Saukett A was isolated before 1954. Given the high mutation rate of polioviruses, the high sequence identity is anomalous
11/ WIV14 should be more divergent at the sequence level if it had been in circulation for 60 years after splitting from Saukett A
The most plausible explanation is that its progenitor was in storage for a substantial amount of time, and then released into the population
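The arithmetic behind this argument can be sketched roughly as follows. This is a back-of-envelope illustration, not the analysis from the paper. The genome length (about 7,440 nt) and the substitution rate (about 1% per site per year, a commonly quoted ballpark for poliovirus) are assumptions introduced here, and the sketch simplifies by placing all 70 differences on the WIV14 lineage and applying a Jukes-Cantor correction.

```python
import math

GENOME_LENGTH = 7440   # assumption: approximate poliovirus genome length (nt)
DIFFERENCES = 70       # from the thread: Saukett A vs WIV14
RATE = 0.01            # assumption: substitutions per site per year (ballpark)
YEARS_APART = 60       # from the thread: roughly 1954 to 2014


def observed_identity(differences, length):
    """Fraction of identical sites between two aligned genomes."""
    return 1.0 - differences / length


def expected_p_distance(rate, years):
    """Jukes-Cantor expected fraction of differing sites after `years`
    of evolution along a single lineage at `rate` subs/site/year."""
    d = rate * years
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def years_of_evolution(differences, length, rate):
    """Invert the Jukes-Cantor correction to get the evolutionary time
    implied by the observed differences."""
    p = differences / length
    d = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
    return d / rate


identity = observed_identity(DIFFERENCES, GENOME_LENGTH)
expected = expected_p_distance(RATE, YEARS_APART)
implied_years = years_of_evolution(DIFFERENCES, GENOME_LENGTH, RATE)

print(f"observed identity:            {identity:.4f}")
print(f"expected p-distance at 60 yr: {expected:.2f}")
print(f"years implied by 70 diffs:    {implied_years:.2f}")
```

Under these assumed numbers, 60 years of continuous circulation would leave the two genomes far more divergent than observed, while 70 differences correspond to roughly a year of evolution. A proper molecular clock analysis would use site-specific rates and a dated phylogeny.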
12/ This would be reflected in WIV14's retarded sequence evolution in comparison to Saukett A
The most likely reason for storing its likely progenitor, Saukett A, would be for vaccine development, given Saukett A's role in the IPV
13/ Could the WIV14 sequence be the result of contamination from a WIV source during the sequencing process?
I think this is unlikely as contaminating genome sequences often have low sequencing depth, and the sequence was described as coming from a poliovirus live isolate
14/ In addition, the sequence was generated using Sanger sequencing, which is not as prone to contamination as next generation sequencing
However, PCR amplification was used to generate the sequence fragments, and this can amplify contaminating sequences
15/ Given the widespread use of the IPV, and consequently of Saukett A, it is difficult to identify a potential originating facility
There should be a paper trail, however, regarding those facilities doing that type of work
16/ Given that WIV14 was isolated from a child in Anhui, China it implies the originating facility was in China
WIV14 had apparently been in circulation for some time before isolation, given the sequence divergence from Saukett A; molecular clock analyses can give an approximate date
17/ Of interest are the 10 nonsynonymous mutations in the capsid protein, which suggest immune pressure, consistent with circulation in the human population or serial passage in a lab animal (rather than a cell line)
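For readers unfamiliar with the term, a nonsynonymous mutation is a nucleotide change that alters the encoded amino acid. A minimal sketch of how codon changes can be classified between two aligned, in-frame coding sequences is shown below. The function name and the toy sequences are hypothetical illustrations, not poliovirus data, and the sketch assumes the standard genetic code with no gaps or ambiguous bases.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(ref, alt):
    """Compare two aligned, in-frame coding sequences codon by codon.

    Returns (synonymous, nonsynonymous), each a list of
    (codon_number, ref_codon, alt_codon) tuples, numbered from 1.
    """
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    ref = ref.upper().replace("U", "T")
    alt = alt.upper().replace("U", "T")
    synonymous, nonsynonymous = [], []
    for i in range(0, len(ref), 3):
        a, b = ref[i:i + 3], alt[i:i + 3]
        if a == b:
            continue
        change = (i // 3 + 1, a, b)
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous.append(change)
        else:
            nonsynonymous.append(change)
    return synonymous, nonsynonymous


# Toy example (made-up sequences): M-A-K-V vs M-A-R-V
syn, nonsyn = classify_codon_changes("ATGGCTAAAGTT", "ATGGCCAGAGTT")
print("synonymous:", syn)
print("nonsynonymous:", nonsyn)
```

In the toy example, GCT to GCC leaves alanine unchanged (synonymous), while AAA to AGA swaps lysine for arginine (nonsynonymous).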
18/ Given that mouse-adapted PV-3 strains exist, passaging in a lab mouse cannot be ruled out
Vincent Racaniello @profvrr, for example, has worked on a mouse-adapted PV-3 strain; it would be interesting to hear his opinion
19/ In principle, examination of the mutations in WIV14 might provide evidence for mouse adaptation (as would infectivity studies in mice)
20/ Chesnais et al note the possibility that WIV14 had undergone serial passage in a lab before release into the population, and that this would result in accelerated evolution (from a Saukett A progenitor)
21/ While WIV14 was isolated by the WIV in Vero cells, only 6 passages were reported, which is insufficient to generate the nucleotide differences observed with Saukett A
22/ To conclude, it appears highly likely that WIV14 was the result of a leak from a scientific facility
Its 'frozen' sequence is reminiscent of the Russian H1N1 flu genome, which was also likely released from a scientific facility
23/ Another example of 'frozen' sequences is that of hand, foot and mouth disease (HFMD) strains sampled in 2007-2009 in China, which bear high identity to the BrCr prototype strain sampled in 1970 in the USA, indicating a likely lab leak origin
25/ Finally, another intriguing example of a 'frozen' sequence is that of the 2021 Ebola outbreak in Guinea. The genome sequence was little different from 2014 sequences from the same region
26/ Is there a connection with research facilities in the region? Collaborators of the West African Emerging Infectious Disease Research Center, Drs Kristian Andersen and Robert Garry, no doubt would have an enlightening take
Why are virologists freely allowed to anonymize lethal synthetic viruses, but developers are put in jail for writing code that anonymizes bitcoin transactions?
Genetically enhanced infectious clones present a much higher risk than anonymous monetary transactions 🧵
@R_H_Ebright @SenGaryPeters @COVIDSelect @BiosafetyNow @CharlesRixey @HSGAC_GOP @RepBradWenstrup @RepRaulRuizMD Virologists often synthesize infectious clones (ICs), which are used to produce live infectious viruses
Because a coronavirus genome is large, making an IC requires synthesizing its constituent fragments and then ligating them together to form a complete genome
Unique restriction (cut) sites at the ends of the fragments allow them to be assembled in the correct order
These are often left in the genome as a signature of the ligation, as in this SARS1-related IC by ZLS and Daszak
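The ordering logic described above can be illustrated with a purely conceptual string-matching sketch: each fragment carries a unique label at each end, and a fragment can only join the one whose end label matches. The `assemble` function and the four-letter labels are hypothetical toy values, not real enzyme recognition sites, and nothing here models actual cloning chemistry.

```python
def assemble(fragments):
    """Order fragments by matching end labels and join them.

    fragments: list of (left_label, body, right_label) tuples.
    Assumes each label is unique to one junction, so only one order is possible.
    """
    by_left = {left: (left, body, right) for left, body, right in fragments}
    right_labels = {right for _, _, right in fragments}

    # The first fragment is the one whose left label matches no right label.
    starts = [f for f in fragments if f[0] not in right_labels]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting fragment")

    ordered = [starts[0]]
    for _ in range(len(fragments) - 1):
        nxt = by_left.get(ordered[-1][2])
        if nxt is None:
            raise ValueError("no fragment matches label " + ordered[-1][2])
        ordered.append(nxt)

    # Each shared junction label appears once in the joined sequence.
    return ordered[0][0] + "".join(body + right for _, body, right in ordered)


# Toy fragments supplied out of order; labels are made up.
toy_fragments = [
    ("GGCC", "CCCC", "TTAA"),
    ("AATT", "AAAA", "GGCC"),
    ("TTAA", "GGGG", "CGCG"),
]
assembled = assemble(toy_fragments)
print(assembled)
```

Because each junction label is unique, the input order does not matter, and the labels remain visible at every junction in the joined string, which is the sense in which such sites act as a signature.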
Re-watching this video, it becomes clear why virologists have lobbied so hard to detach the names of novel pathogens / strains from the location where they arise
Detaching regional / place names from pathogens is a form of anonymization, e.g. delta, omega etc.
Why would virologists want to use names which are arbitrary and non-informative? Geographic information is an important part of epidemiology and helps in understanding origin and spread
Claiming that the use of place / regional names is 'racist' is a red flag - geography is neutral
The implication that the uneducated masses who use such terminology are reflexively bigoted is surely a form of prejudice and elitism in itself
Some further insights into the SARS2 spike sequences found in Pseudomonas aeruginosa datasets, recorded as being sampled in 2019 🧵
2/ Complete SARS2 spike gene sequences were found in contigs generated from Pseudomonas aeruginosa cultures sampled in 2019, by @iximeno
The spike sequence displayed codon optimization and lacked the furin cleavage site
3/ The spike sequences are found in four contigs 👇, inserted into the pcDNA3.1 plasmid h/t @raqueltobes, with a t-PA (tissue plasminogen activator) leader h/t @Daoyu15
Differential gene expression analysis of the controversial RaTG13 dataset reveals strong similarity to the RaTG15 dataset, also described as generated from a Rhinolophus affinis 'rectal swab' from the Mojiang Mine
This indicates they have a common, undefined source 🧵
2/ The source of the RaTG13 dataset has been a key puzzle of the C19 Origin debate
RaTG13 was sequenced by the Wuhan Institute of Virology prepandemic in 2017/2018 and remains the closest related CoV backbone to SARS2
3/ While the sample is described as being generated from a Rhinolophus affinis 'fecal swab', numerous investigators have noted this is inconsistent with the low % of bacteria present in the NGS dataset
The Zhang group of Fudan University have identified and validated two A-B intermediate SARS2 genomes from the early pandemic
This provides a key to understanding the origin of COVID19 🧵
2/ In their new paper, the Zhang group sequence 343 new SARS2 genomes from the early pandemic (sampled up to Oct 2020). The genomes were obtained from COVID19 patients in the Shanghai Public Health Center academic.oup.com/ve/advance-art…
3/ Importantly, they identify two SARS2 genomes intermediate between lineage A and lineage B
These were validated using two methods, RT-PCR (Sanger sequencing) and Next Generation Sequencing (NGS). @jbloom_lab verified that the sequencing depth was high on one of them