Alina Chan
Scientist against lab-based pandemics 🧬 Co-author of VIRAL: the search for the origin of Covid-19 📖 A dangerous young investigator 🕵🏻‍♀

Sep 18, 2020, 30 tweets

Was RaTG13, the closest virus genome to SARS-CoV-2 (96.2% match), fabricated?

Whistleblower Limeng Yan says she will share a report showing evidence.

But a solid analysis by Eldholm & Brynildsrud shows that the genome is supported by raw sequence data.

virological.org/t/on-the-verac…

@shingheizhan and I, alongside several other teams of scientists, started independently looking into the RaTG13 raw data because of amplicons that were quietly deposited by the WIV onto NCBI on May 19, 2020, months after its genome was published in Nature. ncbi.nlm.nih.gov/sra/SRX8357956

These amplicons revealed that RaTG13 had been sequenced in 2017 & 2018, which threw everyone for a loop because we all thought that RaTG13 had only been full genome sequenced AFTER COVID-19 broke out. Acknowledgements: @babarlelephant @franciscodeasis

Later in July, a @ScienceMagazine Q&A revealed that the WIV had actually fully sequenced the genome of RaTG13 in 2018 - contrary to both their 2020 Zhou et al. Nature article and to multiple interviews of Peter Daszak, collaborator and funder of the work.
sciencemag.org/news/2020/07/t…

Talking to @nytimes and @WIRED, Daszak went so far as to say that if they had more money, they "could have sequenced the whole genome... maybe then when we were designing vaccines for SARS, those could have targeted this one too"
nytimes.com/2020/04/21/mag…
wired.co.uk/article/corona…

This has raised concerns about the transparency surrounding research performed on SARS2-like viruses prior to the COVID-19 outbreak.

Are there other SARS2-like CoVs we don't know about? When do we (and even EcoHealth/NIH) get to find out about these?

minervanett.no/alina-chan-cor…

It is based on these scientific discrepancies, not to mention RaTG13's connection to unresolved SARS-like cases in Yunnan (2012), that a need arose to verify the raw sequencing data and genome of RaTG13 -- which has been used in dozens of studies to understand how SARS2 evolved.

As Eldholm & Brynildsrud tactfully noted in their @virological_org post, the methods used for RaTG13 genome assembly were only "cursorily" described by the Zhou et al. Nature paper. Not having these methods makes it challenging to reproduce the assembly... virological.org/t/on-the-verac…

One head scratcher is the discovery of 2 sites in the raw data that don't match the published RaTG13 genome. @shingheizhan and @notoriousFIL have found this mismatch as well and noted that these 2 bases are identical to their counterparts in SARS-CoV-2.
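
To make the idea of "sites that don't match" concrete, here is a minimal sketch of that kind of check, assuming three sequences that have already been aligned to the same length. The helper name `mismatch_sites` and the toy sequences are hypothetical and made up for illustration; this is not the pipeline used by any of the teams mentioned here, and the toy strings are not real RaTG13 or SARS-CoV-2 data.

```python
def mismatch_sites(published: str, raw_consensus: str) -> list[int]:
    """Return 1-based positions where two aligned, equal-length sequences differ."""
    if len(published) != len(raw_consensus):
        raise ValueError("sequences must be aligned to the same length")
    return [
        pos
        for pos, (p, r) in enumerate(zip(published, raw_consensus), start=1)
        if p != r
    ]


# Toy sequences (hypothetical, for illustration only)
published = "ACGTACGTAC"      # stands in for a published genome
raw_consensus = "ACGAACGTTC"  # stands in for a consensus rebuilt from raw reads
third_genome = "ACGAACGTTC"   # stands in for a related genome used for comparison

sites = mismatch_sites(published, raw_consensus)
print(sites)  # [4, 9]

# Of the mismatching sites, which ones carry the same base in the raw data
# and in the third genome?
shared = [pos for pos in sites if raw_consensus[pos - 1] == third_genome[pos - 1]]
print(shared)  # [4, 9]
```

With real data, the consensus would first have to be built by mapping the raw reads to the published genome, which this sketch does not attempt.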

It's unclear how this error happened. From the rest of the analysis, it appears that the amplicon data (uploaded in May 2020) were likely deposited to fill gaps that remained in the RaTG13 genome when it was assembled from the metagenomic data alone.

Nonetheless, because no methods were provided for how RaTG13 was processed prior to sequencing, there are still outstanding questions, raised for example by @MonaRahalkar, about why RaTG13 has so many fewer bacterial reads than other bat CoV metagenomic data.

Getting back to the question: Was RaTG13 fabricated?

The short answer: Not that we can tell.

However, there are still questions pertaining to the integrity of this disintegrated sample: the source, how it was processed, and if it is the only known CoV with the 4991 sequence.

I've been getting a particular question for months now: why would the WIV offer up this RaTG13 96.2% match to SARS2? Wouldn't it raise all sorts of suspicions about lab origins? Isn't this a sign of their honesty?

To explain this, we have to go back to January, 2020...

This story was put together by Twitter vigilantes and journalists, too many to name. I append links to verify each statement:

The first SARS2 genome sequence was published in early Jan, triggering a race to find similar known viruses and SARS2's origins: virological.org/t/novel-2019-c…

In early Jan, 2020, the most closely related virus genomes were from two SARS-like coronaviruses sampled from bats in Zhoushan city, Zhejiang province: the ZXC21 virus was obtained in July, 2015, and ZC45 in February, 2017.
ncbi.nlm.nih.gov/pmc/articles/P…

The study showed that a novel SARS virus could cause clinical symptoms manifesting in the rat lung, intestines, and brain, with the highest viral loads detected in the lung despite intracerebral introduction of the virus...

The study spoke to the genetic diversity of and potential for cross-species transmission of SARS-like coronaviruses found in bats --- even though this team was not able to culture the viruses in Vero (monkey) cells.

Shortly after the first SARS2 genome went public, two papers were published to shed more light: Chen et al. pointed out that SARS2 exhibited 98.7% nucleotide identity to the partial RNA-dependent RNA polymerase (RdRp) gene of a bat coronavirus BtCoV/4991. tandfonline.com/doi/full/10.10…

BtCoV/4991 was sampled from an abandoned mineshaft in the town of Tong Guan in Mojiang county, Yunnan province in 2013, and published by Shi Zhengli's lab in Virologica Sinica in 2016.

Chen et al. expressed regret that the rest of the genome of BtCoV/4991 was not available for comparison, but noted that SARS2 shared 87.9% nucleotide identity with the two aforementioned SARS-like coronaviruses ZXC21 and ZC45.

In parallel, in Nature, Zhou et al. (Shi lab, WIV) found that SARS-CoV-2 shared high sequence identity in a short region of the RdRp with a new virus named BatCoV RaTG13, also sampled from a R. affinis bat in Yunnan province. nature.com/articles/s4158…

RaTG13 shared 100% identity with BtCoV/4991 in that 370 bp of the RdRp. With no prior citation provided by Zhou et al. Nature for RaTG13, this led to speculation that BtCoV/4991 and RaTG13 could be the same virus, or, at the very least, closely related viruses from Yunnan.
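
For readers unfamiliar with what "100% identity" over a fragment means, here is a minimal sketch, assuming two sequences that are already aligned to equal length. The function name `percent_identity` and the toy fragments are hypothetical illustrations, not real BtCoV/4991 or RaTG13 data, and published identity figures may be computed with different conventions (for example in how gaps are counted).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions where both sequences carry the same base.

    Assumes the sequences are pre-aligned and of equal length. Positions
    where either sequence has a gap ('-') are skipped (one possible convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy fragments (made up for illustration only)
fragment_x = "ATGGCTTACGATCGATTGCA"
fragment_y = "ATGGCTTACGATCGATTGCA"  # identical to fragment_x
fragment_z = "ATGGCTTACGTTCGATTGCA"  # differs from fragment_x at one position

print(percent_identity(fragment_x, fragment_y))  # 100.0
print(percent_identity(fragment_x, fragment_z))  # 95.0
```

Note that identity over a short fragment says nothing about the rest of the genome, which is why a 370 bp match alone could not settle whether the two names referred to the same virus.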

Finally, on July 24, 2020, it was revealed by the Wuhan Institute of Virology that RaTG13 was indeed BtCoV/4991, and that its full genome had been sequenced in 2018 and not after the COVID-19 outbreak as some readers had initially thought. sciencemag.org/news/2020/07/t…

However, by April 2020, this had happened: "Studies on the origin of the virus will receive extra scrutiny and must be approved by central government officials" cnn.com/2020/04/12/asi…

How do we make sense of this?
A novel SARS virus, connected to unresolved SARS-like cases in 2012, was buried in a 2016 Virologica Sinica paper with only a 370bp RdRp fragment published.
We find out months post-COVID that its genome was sequenced in 2018. ncbi.nlm.nih.gov/pmc/articles/P…

Dr. Yan interprets this to mean that RaTG13 is a fabricated genome to distract from the Zhoushan viruses, which were the earliest known matches to SARS2 (see above).

However, RaTG13 looks like a real genome, despite the (unintentional?) misdirection/obfuscation by WIV+EcoHealth.

What we know:
1. The 1st SARS2 genome was published without gov approval; lab was shut down for rectification
2. Scientists would easily match it to the BtCoV/4991 sequence from the WIV
3. Even other Chinese scientists didn't have access to the RaTG13 full genome (seq'ed in 2018)

4. RaTG13 amplicons were uploaded in May 2020, exposing the fact that these were sequenced in 2017/2018
5. July 2020, WIV reveals that RaTG13 is 4991! and that it was full genome sequenced in 2018 - even their co-funder EcoHealth was in the dark

It's not that there was a 6-year-long cover-up. No one in 2012 could've predicted that SARS2 would infect 30+ million people in 2020.

It's that there's a lot of research going on that even funders have no idea about. And right now, we're not able to know what viruses were found.

That is why it astounds me that some scientists can say that the raw sequences seem to add up so we can all relax now.

We can't relax now. There are so many questions that are unanswered relating to RaTG13 and perhaps other unpublished SARS2-like viruses.
