[Thread] FOIA from @USRightToKnow regarding Latinne et al. (2020) and clade 7896
TLDR: No sequence was deleted/modified since Aug-2019, but it seems they wanted to buy time to avoid publishing the viruses very early in the pandemic. usrtk.org/biohazards/foi…
13-Aug-2019: Ben Hu wrote an email to NCBI attaching 11 sqn files with 630 viruses, created with Sequin 15.10 (MS WINDOWS VISTA). All aa seqs are identical to the currently known seqs, with only two minor exceptions (two ends of two viruses trimmed by NCBI) usrtk.org/wp-content/upl…
14-Aug-2019: NCBI replied with the 630 accessions. Note: the viruses were "accessioned" but not "deposited" in GenBank (this is something new: we were not sure it was possible, and it is what happens when submitting "manually" by email). usrtk.org/wp-content/upl…
07-Nov-2019: Automatic NCBI reply to a submission probably made around two days before. Scheduled release date is fixed as "Dec 31, 2020". The attached FASTA file with 630 viruses is exactly the same as today's, including nt seqs (except for minor things such as the journal and date).
13-Apr-2020: Ben Hu told them to hold because "both the authors and the editor of the journal decided that the manuscript involving those sequences need to be further revised. The acceptance and publication date of the paper will thus be delayed, which will be determined later".
Conclusions:
- All sequences are original (unless we do not trust NCBI). Synonymous mutations might have been introduced between Aug-2019 and Nov-2019, although it seems unlikely.
- Clade 7896 RdRps seem original, or at least they were not faked as a result of the outbreak. But they underwent some changes in the latest WIV article.
- It seems clade 7896 was really downplayed from the beginning, on 13-Aug-2019, because they are the only SARS viruses without a SARS label. A similar case to Ra4991 in 2016?
- It seems they just wanted to buy time to avoid publishing the viruses very early in the pandemic. I guess to avoid being questioned on clade 7896 (first 9 hits for the mine when BLASTing RdRp would have been very hard...).
- Accession numbers were assigned by NCBI maintaining the order given by Ben Hu (first the 11 files, and then the order inside each file).
Up to MN312664 the viruses were sorted by sample ID number as integers, but after MN312665 the viruses were sorted as text.
So, surprisingly, 7952 was sorted as text despite having no non-digit characters. I am now more convinced of this.
And now we can remember this from Hu's thesis and from Ge et al. (2013). Double SARSr in the same sample... (2nd photo):
- We are still missing the original draft sent to Nature Comm.
- There could have been a previous submission to another journal between Aug-2019 and Nov-2019.
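A minimal sketch of the integer-versus-text ordering mentioned above for the accessions around MN312664/MN312665. The sample IDs are illustrative only (151491 is invented for the example) and `detect_order` is a hypothetical helper; this is not the actual order of the submission.

```python
# Illustrative sample IDs only; 151491 is invented for this example.
sample_ids = ["7896", "7952", "151491", "162387", "4991"]

# Sorting by numeric value: shorter numbers come first.
as_integers = sorted(sample_ids, key=int)

# Sorting as text: compared character by character, so "1..." precedes "4..." and "7...".
as_text = sorted(sample_ids)


def detect_order(ids):
    """Return which ordering a list of all-digit IDs is consistent with."""
    is_int = ids == sorted(ids, key=int)
    is_text = ids == sorted(ids)
    if is_int and is_text:
        return "both"
    if is_int:
        return "integer"
    if is_text:
        return "text"
    return "neither"


print(as_integers)  # ['4991', '7896', '7952', '151491', '162387']
print(as_text)      # ['151491', '162387', '4991', '7896', '7952']
```

The two orderings only diverge when IDs of different lengths are mixed, which is why a purely numeric ID such as 7952 can still reveal a text sort by landing after a six-digit ID.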
To end:
What does "both the authors and the editor of the journal decided that the manuscript [...] need to be further revised" mean? Why did Hu change his mind? Did they hear of the imminent release of RmYN02?
The RmYN02 article was "Received 2020 Apr 16; Revised May 1; Accepted May 6". Without it, the first 9 viruses closest to SARS-CoV-2 RdRp would have been RaTG13 + clade 7896. So, with RmYN02 published, they could leave out clade 7896 in June and nobody would notice.
"CNN published one piece of their works in June, which shows that the group ramped up efforts to expand its voices via these influencing media outlets." -->
"currently available data" is highly biased (at best). If you conclude something from biased data, your conclusions will be biased.
A good scientist assesses the quality and biases of the data first. That is how science works.
If you do not study something enough, you do not make an absolute judgment, much less give extreme probabilities. There you exceeded your mandate and did something unprofessional.
[Thread] New unforced error. GIABR, the lab of the pangolins, has just uploaded sequences (MW600658:MW600715) that show a trip to the Mojiang mineshaft or nearby on 22-Aug-2017, well after the last known WIV trip in 2015.
Libiao Zhang explicitly credited as collector.
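For anyone who wants to list the records in that range, here is a rough sketch that expands a `FIRST:LAST` accession range into individual accessions. It assumes both ends share the same letter prefix and digit width; `expand_accession_range` is a hypothetical helper, not an NCBI tool.

```python
import re


def expand_accession_range(spec):
    """Expand 'MW600658:MW600715' into a list of individual accessions.

    Assumption: both ends have the same letter prefix and the same number of digits.
    """
    first, last = spec.split(":")
    m_first = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m_last = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError(f"unsupported accession range: {spec!r}")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep zero padding if there is any
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if end < start:
        raise ValueError(f"range is reversed: {spec!r}")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


accessions = expand_accession_range("MW600658:MW600715")
print(len(accessions), accessions[0], accessions[-1])  # 58 MW600658 MW600715
```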
Some context: very few CoVs with collection dates after 2016 have been published by Chinese institutions.
WIV used two main series of sample IDs (NNNN, e.g. 4991; and YYNNNN, e.g. 162387) plus some ad-hoc series.
We discovered yesterday that one of them was from GIABR.
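A small sketch of how sample IDs could be bucketed into those series. It assumes NNNN means exactly four digits and YYNNNN means exactly six digits starting with a two-digit year; both rules, the function name `classify_sample_id` and the "XYZ-01" ID are assumptions made for illustration.

```python
import re

# Assumed patterns: four digits for NNNN, six digits (YY + NNNN) for YYNNNN.
NNNN = re.compile(r"\d{4}")
YYNNNN = re.compile(r"(\d{2})(\d{4})")


def classify_sample_id(sample_id):
    """Return 'NNNN', 'YYNNNN' or 'ad-hoc' for a sample ID string."""
    if NNNN.fullmatch(sample_id):
        return "NNNN"
    if YYNNNN.fullmatch(sample_id):
        return "YYNNNN"
    return "ad-hoc"


def year_prefix(sample_id):
    """Return the two-digit year prefix of a YYNNNN ID, or None for other series."""
    match = YYNNNN.fullmatch(sample_id)
    return match.group(1) if match else None


for sid in ["4991", "162387", "XYZ-01"]:  # "XYZ-01" is a made-up ad-hoc ID
    print(sid, classify_sample_id(sid), year_prefix(sid))
```

Anything that matches neither pattern falls into the "ad-hoc" bucket, which is where an ID borrowed from another institution would be expected to show up.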
[Thread] Who is the first known patient?
There is a lot of confusion, so let's review all possible patients according to published onset dates [of symptoms] up to 15-Dec-19.
Notation
Patients are anonymized, so they are identified as <AgeSex> (e.g. 49F is a 49-year-old female). When several patients share the same age & sex, suffixes are used (e.g. 65M1, 65M2, ...).
U = Unknown.
Problem: people can have a birthday during their illness
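The notation can be sketched in a few lines. This assumes each patient is given as an (age, sex) pair with `None` for unknown values, and that suffixes are numbered in input order; `patient_labels` is a hypothetical helper and the example patients are made up.

```python
from collections import Counter


def patient_labels(patients):
    """Build <AgeSex> labels, adding numeric suffixes when a label repeats.

    patients: list of (age, sex) pairs; None stands for an unknown value (U).
    """
    base = [
        f"{'U' if age is None else age}{'U' if sex is None else sex}"
        for age, sex in patients
    ]
    totals = Counter(base)  # how many patients share each base label
    seen = Counter()        # running index per repeated base label
    labels = []
    for label in base:
        if totals[label] > 1:
            seen[label] += 1
            labels.append(f"{label}{seen[label]}")
        else:
            labels.append(label)
    return labels


# Made-up patients, only to show the notation.
example = [(49, "F"), (65, "M"), (65, "M"), (None, None)]
print(patient_labels(example))  # ['49F', '65M1', '65M2', 'UU']
```

Because the label depends on the reported age, the same person can appear under two labels (say 61F and 62F) if sources report the age before and after a birthday.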
XX Su (61F), XX Wang (62M) & XX X (UU) onset 14, 21 & c. 30-Nov-19.
Info inadvertently leaked in Health Times and uncovered by DRASTIC and @ianbirrell
[Thread] Necessary corrections to the China-WHO report.
What they will probably fix and what they will not.
TLDR: circular swap of 3 IPCAMS genomes + tampered onset of Wuchang accountant; First patients and first cluster; Some falsehoods in articles washingtonpost.com/world/asia_pac…
As we said, the problems were with S01, S05 & S11. Absolutely chaotic, but with a little table it all becomes clearer. Part of the problem is inherited from Ren et al. (2020), who followed different orders for the patients in the text and in the genomes.
"It shows how an authoritarian government can successfully shape the narrative of a disease outbreak and how it can take years — and, perhaps, regime change — to get to the truth"
"“You can concoct a completely crazy story and make it plausible by the way you design it,” Dr. Meselson said, explaining why the Soviets had succeeded in dispelling suspicions about a lab leak"
“Those who don’t want to accept the truth will always find ways not to accept it.”