Steve Massey
May 9, 38 tweets, 12 min read
Some further insights into the SARS2 spike sequences found in Pseudomonas aeruginosa datasets, recorded as being sampled in 2019 🧵
2/ Complete SARS2 spike gene sequences were found in contigs generated from Pseudomonas aeruginosa cultures sampled in 2019, by @iximeno

The spike sequence displayed codon optimization and lacked the furin cleavage site
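
A minimal sketch of how the absence of the furin cleavage site could be checked, assuming the spike translation is available as a plain amino acid string. PRRAR is the motif at the S1/S2 junction of the SARS2 spike; the two short fragments are toy examples modeled on that junction, not the contig sequences, and the function name is hypothetical.

```python
# Sketch: test a translated spike sequence for the SARS2 furin cleavage site motif.
# The fragments below are illustrative toy strings, NOT the contig sequences.

FCS_MOTIF = "PRRAR"  # motif at the S1/S2 junction of the SARS2 spike


def has_fcs(protein: str, motif: str = FCS_MOTIF) -> bool:
    """Return True if the motif occurs anywhere in the protein sequence."""
    return motif in protein.upper()


# Toy fragments (assumptions for illustration only)
fragment_with_fcs = "QTQTNSPRRARSVASQ"
fragment_without_fcs = "QTQTNSRSVASQ"

print(has_fcs(fragment_with_fcs))
print(has_fcs(fragment_without_fcs))
```

A plain substring test is enough here because the question is presence or absence of one short motif; an alignment against the reference spike would be needed to say exactly what replaced it.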
3/ The spike sequences are found in four contigs 👇, inserted into the pcDNA3.1 plasmid (h/t @raqueltobes), with a t-PA (tissue plasminogen activator) leader (h/t @Daoyu15)
4/ I re-examined the following 3 features of the sequences:

i) the C-terminus extension of the spike sequence
ii) the N-terminus t-PA (tissue plasminogen activator) extension
iii) Reasons for variation between the contigs that contain the spike and plasmid sequences
5/ i) The C-terminus of the spike sequences has a peptide extension not found in SARS2

@VBruttel proposed that this is a trimerization domain, but it does not match anything in the database

6/ The C-terminus extension can be traced to a publication from Hong Ling's group, Harbin Medical University

The sequence is termed an 'MTQ' domain and is designed to promote trimerization of overexpressed virus glycoproteins
link.springer.com/article/10.100…
7/ The construct described in Ling's paper also uses the pcDNA3.1 expression vector, with a t-PA leader and the MTQ domain, and is therefore a close match to the contig sequences
8/ The MTQ domain has had limited published use for the overexpression and trimerization of HIV, RSV and SARS2 surface glycoproteins (spike protein is classed as a glycoprotein)

These are all immunogenic as they are exposed to the host immune system, so of medical interest
9/ The MTQ extension starts with a flexible linker (GGGSGGS), followed by a trimerization domain designed de novo using PSIPRED and MARCOIL1.0, incorporating elements that produce coiled-coil alpha helices (including heptad repeats)
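
A rough sketch of how the C-terminal extension could be pulled out of a translated sequence: find the last occurrence of the GGGSGGS linker, take everything after it, and lay it out in 7-residue frames to eyeball heptad repeats. The protein string and the repeat after the linker are made-up toy values (the real MTQ sequence is not given in this thread), and the function names are hypothetical.

```python
# Sketch: split a protein at the flexible linker and view the tail in heptad frames.
# The toy sequence is an assumption for illustration, NOT the real MTQ domain.

LINKER = "GGGSGGS"  # flexible linker named in the thread


def split_at_linker(protein, linker=LINKER):
    """Return (body, extension) around the LAST linker occurrence, or None if absent."""
    idx = protein.rfind(linker)
    if idx == -1:
        return None
    return protein[:idx], protein[idx + len(linker):]


def heptads(segment):
    """Chunk a segment into consecutive 7-residue frames (last frame may be shorter)."""
    return [segment[i:i + 7] for i in range(0, len(segment), 7)]


# Toy example: arbitrary body + linker + an invented two-heptad tail
toy_protein = "MKTAYIAKQR" + LINKER + "LEELKKKLEELKKK"
body, extension = split_at_linker(toy_protein)
print(body)
print(heptads(extension))
```

This only frames the sequence; deciding whether the frames really form a coiled coil is what tools such as MARCOIL are for.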
10/ Trimerization of the spike protein means that it is in its native state, reflecting how it is oriented on the SARS2 membrane surface (as a trimer).

This native state conformation can be important for boosting the immunogenicity of subunit vaccines
11/ The MTQ domain has been patented by Harbin Medical University, which means that if used commercially there should be royalty payments

patents.google.com/patent/CN10172…
12/ The MTQ domain is mentioned in a patent for a RSV subunit vaccine by Jiangsu Ruike Biotechnology Co Ltd (however the plasmid used is not pcDNA3.1)

patents.google.com/patent/CN11758…
13/ The identification of the MTQ domain indicates that the spike proteins in the 4 contigs were likely intended for subunit vaccines, either for humans or for testing in mouse/bat (as commonly conducted by the WIV, and described in DEFUSE h/t @VBruttel)
14/ Using 'MTQ' as a search term on Addgene (addgene.org) yields no results (C-terminal tags are sometimes provided in plasmid descriptions). This indicates that it is only rarely used (hence no matches on Genbank) and constitutes a useful identifier
15/ There are 14 Chinese spike protein subunit vaccines (2 approved, 12 undergoing trials) listed at the COVID19 vaccine tracker (note, however, that the site has not been updated since Dec 2022)

covid19.trackvaccines.org/country/china/
16/ Not all of these use the entire spike protein: several use the NTD/RBD instead

Of those that use the entire spike protein and report which trimerization domain was used, none report using MTQ
17/ They report using alternative trimerization domains such as:
collagen (SCB-2019)
foldon (SCTV01C)
fibritin (202-CoV9)
18/ Consequently, the spike constructs in the 4 contigs do not appear to have been used in official vaccine trials

This may be because the project was abandoned, was never openly subjected to trials, or entered a trial after Dec 2022
19/ ii) The N-terminus of the spike sequences in the contigs possesses a t-PA leader sequence

The t-PA leader is immunogenic, but is also a secretion signal in mammalian protein expression systems (this was Ling's purpose for adding the t-PA leader)
20/ It is notable that P22 of t-PA has been replaced by asparagine (N) 👇

In the publication by Hong Ling's group, they test two modifications of the t-PA leader:

P22A and P22G
21/ The substitution of small, neutral amino acids at P22 was used to enhance cleavage of the t-PA signal sequence from the overexpressed protein after secretion

SignalP was used to predict enhanced cleavage of the tPA signal peptide due to the P22A and P22G substitutions
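
A small sketch of how the P22 variants could be prepared as FASTA input for a signal peptide predictor such as SignalP. The 23-residue leader string is the commonly quoted human t-PA signal peptide and is an assumption here (the thread does not give the sequence); the helper names are hypothetical, and no SignalP call is shown. A real run would also append the first residues of the fused spike after the leader, since the predicted cleavage depends on what follows the join.

```python
# Sketch: build P22 point mutants of the t-PA leader and format them as FASTA.
# TPA_LEADER is an ASSUMED sequence (commonly quoted t-PA signal peptide),
# not taken from the thread or the contigs.

TPA_LEADER = "MDAMKRGLCCVLLLCGAVFVSPS"


def substitute(seq, mutation):
    """Apply a point substitution written like 'P22A' (1-based position)."""
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def to_fasta(records):
    """Format a {name: sequence} mapping as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


variants = {"tPA_wt": TPA_LEADER}
for mut in ("P22A", "P22G", "P22N"):
    variants[f"tPA_{mut}"] = substitute(TPA_LEADER, mut)

print(to_fasta(variants))
```

Checking the reference residue before substituting guards against off-by-one numbering, which is the easiest mistake to make when a leader is quoted with or without flanking residues.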
22/ Oddly, however, using SignalP-6.0, the P22N mutation reduces cleavage efficiency of t-PA, compared to P22A and P22G (probability 0.30 vs 0.92 and 0.81, respectively)

23/ In addition, had they joined the t-PA signal sequence 1 amino acid upstream of the current join site, so that an additional valine was included, cleavage would be predicted to be much better (probability 0.97)
24/ This indicates that P22N is suboptimal for cleavage; whether this was deliberate is unclear

Given suboptimal cleavage, these exact constructs are likely to have been unsuccessful as subunit vaccines or in experiments
25/ This emphasizes that designers are perfectly capable of designing suboptimal cleavage sites, with relevance to the supposedly suboptimal FCS of the SARS2 spike protein
26/ Of note, the P22A substitution has been used before Ling's 2011 paper, in 2004 👇 This indicates that the effect of mutating P22 on cleavage was known before Ling's paper, and so it is not a unique marker

ncbi.nlm.nih.gov/pmc/articles/P…
27/ iii) Lastly, I looked for differences between the contig sequences

When aligned, I found that they are identical in sequence but differ in size. I attribute this to low sequencing depth: assembly from few reads can produce contigs of variable sizes (h/t @Kevin_McKernan)
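
One simple way to check "identical but different sizes", sketched under the assumption that the contigs are available as plain nucleotide strings: test whether every contig is contained in the longest one, on either strand. The toy sequences and function names are hypothetical, not the actual contigs.

```python
# Sketch: check that all contigs are exact sub-sequences of the longest contig.
# Toy sequences are assumptions for illustration, NOT the real contigs.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_contained(short, long):
    """True if `short` occurs in `long` on either strand (exact match)."""
    short, long = short.upper(), long.upper()
    return short in long or revcomp(short) in long


def consistent_with_one_source(contigs):
    """True if every contig is contained in the longest contig."""
    longest = max(contigs, key=len)
    return all(is_contained(c, longest) for c in contigs)


toy_contigs = ["ATGGCCTTAGGC", "GCCTTA", "CTAAGG"]  # third is reverse-strand
print(consistent_with_one_source(toy_contigs))
```

Exact containment is a strict test. Because the plasmid is circular, contigs that start at different points on the circle can overlap without one containing the other, so a failed check here would call for a proper alignment rather than a conclusion that the sequences differ.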
28/ This substack by Kevin contains a lot of interesting ideas and background information on the plasmids

anandamide.substack.com/p/chocolate-in…
29/ A 2023 paper by Dongsheng Zhou and others describes the P. aeruginosa genome sequences, but unfortunately does not provide the sequencing date or location, which could help trace the provenance of the plasmid sequences

ann-clinmicrob.biomedcentral.com/articles/10.11…
30/ Can the observations overall tell us more about the date the spike sequences were generated?

Not directly, but the identification of the MTQ domain indicates the purpose of the sequences (subunit vaccine), and implies that there should be documentation if this was a post-2019 project
31/ (documentation is likely deleted if it is pre-2020)
32/ The observations are consistent with contamination of the P. aeruginosa sequence datasets during either sample/library prep or sequencing

This is because the construct is almost identical to Ling's, which was used for mammalian expression, so it would not be expected in a bacterial culture
33/ Lastly, in a striking coincidence, the P. aeruginosa contigs match experiments described in DEFUSE

34/ Of interest also, pcDNA3.1 was used extensively by Zhengli Shi for a variety of purposes, including cloning codon optimized SARS1 spike h/t @VBruttel

nature.com/articles/natur…
journals.plos.org/plosone/articl…
nature.com/articles/s4142…
journals.plos.org/plospathogens/…
link.springer.com/article/10.100…
35/ ZLS has also used the t-PA signal sequence with pcDNA3.1

Unfortunately, plasmid sequences are not provided in her papers, so use of the MTQ domain is unclear

The use of MTQ would be a red flag

sciencedirect.com/science/articl…
journals.asm.org/doi/10.1128/jv…
36/ The pcDNA3.1 expression system is quite widely used; entering the search terms "pcDNA3.1" "spike" "SARS-CoV-2" "S" into Addgene brings back 55 entries from over 7 groups

However, the MTQ domain is rare and key for tracing provenance
addgene.org/search/catalog…
37/ Strictly speaking, it cannot be ruled out that these plasmids came from the WIV, or that they were generated pre-pandemic

This is due to the lack of an audit of the WIV