Over the course of several threads, I will present evidence that the genome of RaTG13, the closest relative of SARS2, was not generated from a fecal swab as claimed but from a ‘live’ isolate.
2/ Here, I will show that most reads in the RaTG13 dataset belong to a bat transcriptome. This is inconsistent with a bat fecal swab, where only a minority of reads are expected to belong to the animal being sampled, with the rest belonging to bacteria.
3/ 87.5 % of RaTG13 paired reads mapped to the Rhinolophus ferrumequinum genome (the closest genome to R.affinis available) using BBMap. This mapping rate is higher than the rate obtained against the genomes of other species known to be in cell culture at the WIV.
4/ Reads from a Rhinolophus sp. anal swab sample, also generated by the WIV (NCBI MN611522), were used as a comparison. Only 2.6 % of these reads mapped to the R.ferrumequinum genome. This is consistent with a true anal swab.
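A minimal sketch of how a mapping rate like the two figures above could be recomputed from an aligner's SAM output. It is not the pipeline used for this thread: it assumes the alignments were written as SAM text, counts only primary records, and uses a toy input (`toy_sam`) invented purely for illustration.

```python
def mapping_rate(sam_lines):
    """Return the fraction of primary SAM records that are flagged as mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        # Ignore secondary (0x100) and supplementary (0x800) records so that
        # each read is counted once.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        # Bit 0x4 set means the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    return mapped / total if total else 0.0


# Toy SAM records (hypothetical data, not taken from either dataset).
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t16\tchr1\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t256\tchr2\t50\t0\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

if __name__ == "__main__":
    print(f"mapping rate: {mapping_rate(toy_sam):.3f}")
```

Running the same function over the SAM output for each dataset against the same reference would give two rates that can be compared directly, provided both runs use identical aligner settings.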
5/ The majority of reads in the RaTG13 dataset therefore appear to belong to a bat species. If an R.affinis genome were available, a higher mapping rate than to the R.ferrumequinum genome would presumably be observed.
6/ This assumes that the RaTG13 sample was derived from R.affinis. I will identify the exact phylogenetic affiliation of the RaTG13 dataset in a future thread.
7/ The next question is whether the RaTG13 sample was DNA or RNA. If a significant proportion of reads map to annotated genes in the R.ferrumequinum genome, then the dataset was derived from an RNA sample and represents a transcriptome.
8/ 62.2 % of the reads that map to the R.ferrumequinum genome map to protein-coding genes (using a GFF annotation file and bedtools). This indicates that the RaTG13 genome was generated from an RNA sample and represents a transcriptome.
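The overlap calculation can be illustrated with a small sketch. This is not the bedtools command used for the analysis; it is a plain-Python stand-in that assumes the relevant annotation rows have feature type `CDS` (the thread does not say which feature type was used) and that alignments are already reduced to `(seqid, start, end)` tuples with 1-based inclusive coordinates. The annotation and alignments below are invented toy data.

```python
from collections import defaultdict


def load_features(gff_lines, feature_type="CDS"):
    """Collect (start, end) intervals per sequence for one GFF feature type."""
    features = defaultdict(list)
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # A GFF row has 9 tab-separated columns; column 3 is the feature type.
        if len(cols) < 9 or cols[2] != feature_type:
            continue
        # Columns 4 and 5 are the 1-based inclusive start and end.
        features[cols[0]].append((int(cols[3]), int(cols[4])))
    return features


def fraction_overlapping(alignments, features):
    """Fraction of alignments that overlap at least one feature interval."""
    total = 0
    hits = 0
    for seqid, start, end in alignments:
        total += 1
        # Two closed intervals overlap when each starts before the other ends.
        # Linear scan: fine for a toy example, too slow for a real genome.
        if any(start <= f_end and f_start <= end
               for f_start, f_end in features.get(seqid, [])):
            hits += 1
    return hits / total if total else 0.0


# Toy annotation and alignments (hypothetical, for illustration only).
toy_gff = [
    "##gff-version 3",
    "chr1\ttoy\tgene\t100\t500\t.\t+\t.\tID=gene1",
    "chr1\ttoy\tCDS\t150\t300\t.\t+\t0\tID=cds1",
    "chr2\ttoy\tCDS\t1000\t1200\t.\t-\t0\tID=cds2",
]
toy_alignments = [
    ("chr1", 140, 160),    # overlaps cds1
    ("chr1", 310, 350),    # inside gene1 but outside any CDS
    ("chr2", 1190, 1250),  # overlaps cds2
    ("chr3", 1, 50),       # sequence with no annotation
]

if __name__ == "__main__":
    cds = load_features(toy_gff)
    print(f"fraction in CDS: {fraction_overlapping(toy_alignments, cds):.2f}")
```

The choice of feature type matters: counting overlaps against `gene` rows would also include introns and untranslated regions, so the resulting fraction depends on which rows of the annotation are treated as "protein coding".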
9/ Consequently, it can be inferred that the RaTG13 genome was generated from a sample extracted from a bat cell line or, less likely, bat tissue. Details of the technical analysis will be provided in a preprint. More to come.
In this thread, I dissect the microbial taxa present in the RaTG13 dataset and show that they are inconsistent with a fecal swab sample.
2/ Using Metaxa2, only 1.8 % of the reads in the RaTG13 dataset (GSA CRR122287) correspond to small subunit rRNA sequences. This contrasts with 20.7 % in a Rhinolophus sp. anal swab sample from the WIV (NCBI MN611522).
3/ This implies that the RaTG13 sample underwent rRNA depletion, in contrast to the anal swab sample. rRNA depletion is an optional step when using the TruSeq library preparation kit, which the RaTG13 GSA webpage indicates was used.