@scotub@MJnanostretch@PatrickSSte@Biorealism OK, I'll give it an honest effort. But if MJ continues his childish belittling, I'll block him and suggest you do the same in the interest of real dialogue.
The early A and B lineages are distinguished by mutations at two sites. A is T/C at those sites, and B is C/T.
@scotub@MJnanostretch@PatrickSSte@Biorealism The haplotype in B is seen in previously sequenced related bat coronaviruses, and also in early cases at the market. There are also a small number of early cases that show C/C or T/T haplotypes, but those are very infrequent, and it was unclear how to interpret them.
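To make the two-site definition concrete, here is a minimal sketch of how a sequence could be binned by those two sites. The function name and labels are hypothetical, and it assumes you have already extracted the bases at the two lineage-defining sites, in the same order as above (A = T/C, B = C/T). It is not anyone's actual pipeline.

```python
def classify_haplotype(site1: str, site2: str) -> str:
    """Bin a genome by the bases at the two lineage-defining sites.

    Hypothetical helper for illustration: A is T/C, B is C/T,
    and C/C or T/T are the rare "intermediate" haplotypes.
    """
    pair = (site1.strip().upper(), site2.strip().upper())
    if pair == ("T", "C"):
        return "A"
    if pair == ("C", "T"):
        return "B"
    if pair in (("C", "C"), ("T", "T")):
        return "intermediate"
    # Anything else (ambiguous calls such as N, gaps, other bases)
    return "unknown"


if __name__ == "__main__":
    for bases in [("T", "C"), ("C", "T"), ("C", "C"), ("T", "T"), ("N", "T")]:
        print(bases, classify_haplotype(*bases))
```

Anything that is not one of the four combinations lands in "unknown", which is where ambiguous base calls would go before deciding whether an apparent intermediate is real or artefactual.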
@scotub@MJnanostretch@PatrickSSte@Biorealism The base of each lineage has very high polytomy, and yes, that can happen in humans. High polytomy was somewhat characteristic of the initial introduction to new geographic areas. And it would *also* be expected in a zoonotic spillover for the same reasons.
@scotub@MJnanostretch@PatrickSSte@Biorealism But it's not the usual case. For example, *monophyletic* relations between viral sequences have been used in criminal prosecutions of people accused of knowingly transmitting HIV, but you can't use polytomy because an individual carries a viral population.
@scotub@MJnanostretch@PatrickSSte@Biorealism Since lineage B had been observed in the earliest known cases at the market, one might think, OK lineage B crossed over and gave rise to lineage A. Or maybe lineage A crossed over somewhere else and gave rise to lineage B at the market. Or maybe there was more than one jump.
@scotub@MJnanostretch@PatrickSSte@Biorealism There's no particular reason Pekar would want to conclude >1 jump. But the early epidemiology from phylogenetics was (and is) an interesting topic. And they start with a bunch of questions, like what about the 20 "intermediate" (C/C or T/T) sequences?
@scotub@MJnanostretch@PatrickSSte@Biorealism With enough genomes later, intermediates would be expected to appear, but with the earliest cases, one would like to be dealing with non-convergent evolution. And they were so few that it's important to see if they might be artefactual. That's what they found.
@scotub@MJnanostretch@PatrickSSte@Biorealism As I said elsewhere, I haven't examined the sequences myself, and don't feel inclined to because I've read enough of the analysis of others to be satisfied that the exclusions were done well. I've also read the "unwarranted exclusion" paper, and am, well, less impressed.
@scotub@MJnanostretch@PatrickSSte@Biorealism So the paper works both backwards and forwards in time. From the curated early sequences, they first work backwards to infer some information about the MRCA, including both ancestral site reconstruction and an ancestral genome.
@scotub@MJnanostretch@PatrickSSte@Biorealism It's in the ancestral genome haplotype and its rooting that they develop some new phylogenetic rooting methods, partly because of the amount of recombination and partly because of the different substitution rates expected for generation of different haplotypes.
@scotub@MJnanostretch@PatrickSSte@Biorealism They actually work through two *different* approaches to the question, and they get conflicting answers. A head-to-head battle favours B but A is favoured by the unconstrained rooting and the related bat CoVs (I had it backwards in this tweet earlier)
@scotub@MJnanostretch@PatrickSSte@Biorealism As far as I can tell they didn't really start to ask about the possibility of two introductions until they had to grapple with three possible MRCA haplotypes. That's just a guess, but that's how I read it.
@scotub@MJnanostretch@PatrickSSte@Biorealism So this is where they switch to working forward in time, simulating different epidemic possibilities. They simulated all the realistic possibilities for a single introduction (C/C, A, or B) and asked how well each could reproduce the observed structure (including polytomy).
@scotub@MJnanostretch@PatrickSSte@Biorealism And the single introduction doesn't really cut it. It often generates hundreds of but not the separated clades observed. That's true even with C/C as the haplotype at the introduction.
Dammit, I keep misspelling polytomy as polytony 😡
@scotub@MJnanostretch@PatrickSSte@Biorealism And that's at least how I see the "why not human" question from Scott and Patrick. It wouldn't be expected to give rise to two clades separated by two substitutions.
@scotub@MJnanostretch@PatrickSSte@Biorealism Then the "why two crossovers" question. That's pretty straightforward at this point. Just allow separate introductions of two different haplotypes and see how likely the observed phylogenetic structure is. From that, one can then move forward and try to infer timing.
@scotub@MJnanostretch@PatrickSSte@Biorealism To wrap up: I really don't think they were looking for two cross-overs. They were looking to understand the early epidemiology using phylogenetics, and only *after* the MRCA haplotype inference work did they start to ask about whether >1 crossover could be examined.
@scotub@MJnanostretch@PatrickSSte@Biorealism In addition, at the time, inferring two spillovers was a potential setback for the market origin theory! This is because only B lineage samples had been observed among the earliest samples at the market.
@scotub@MJnanostretch@PatrickSSte@Biorealism Their prediction was actually quite bold: their analysis predicts that lineage A virus was circulating AT the crossover epicentre in 2019. That hadn't been observed at the time. But then ....
@scotub@MJnanostretch@PatrickSSte@Biorealism Gao's virtually simultaneous preprint, which they were unaware of, showed exactly that. An A lineage sample taken on Jan 1 2020, but left from earlier.
• • •
In the last day *alone*, you've hyped false allegations against @acritschristoph et al, repeated and amplified falsehoods about a table in their paper, hyped absolute *nonsense* about genomic DNA depletion, and made (and deleted) false accusations against @angie_rasmussen
@acritschristoph@angie_rasmussen Receipts: part 1. Hyping false allegations against @acritschristoph et al. (Note: GISAID withdrew its statement, issued a revised statement without accusations, and restored access to the authors after the authors documented their communications)
Receipts part 2: The table. My favorite part of this is that the highlighted part of the table shows the exact opposite of the falsehood you were making. And please don't bullshit your way out of this by trying to pretend you were making an argument about read counts.
"...and human nucleic acid was removed using an enrichment kit to improve the sensitivity of viral RNA detection."
This is almost surely generic gDNA depletion. Nobody's doing hsDNA subtractive hybridisation on environmental swab samples for NGS.
It's really astonishing how many scientists adopted a literal interpretation of that line from Gao et al because they wanted to use it against @acritschristoph et al's new report. I mean, COME ON
@acritschristoph A far better response to that line would be "WHAT SORCERY IS THIS?"
Those who find that Gao et al make a compelling case for human introduction of the virus to the market rely heavily on arguments summarised in Figure 4 (4B below). You can't find those arguments compelling and also discard the new analysis of susceptible wildlife as "nothing new"
In fact, as @PeaseRoland points out for Figure 4B and I mentioned for Figure 4A, there are reasons to be cautious in accepting Gao et al's interpretations of their own figure.
But at any rate, the new results really are interesting, so let's turn to those.
The number of stalls with specific SARS-CoV-2 susceptible animal material is striking. This provides more information than, and sometimes conflicts with, what these specific stalls were previously reported to contain.
I want to expand on this point about the environmental sampling because people seem to have missed how central this (now putatively falsified) assertion is to the lab leak argument. 🧵
I'll start with a counterpoint by @WashburneAlex: finding evidence of SARS-CoV-2 susceptible animals proves nothing about viral spillover. Good point, Alex; alone, it does not.
You might want to stress that point to @Ayjchan, who described the positive environmental samples with human genetic material as pointing to a human introduction of the virus to the market.
About the new findings (shown at SAGO meeting on Tuesday) from analysis of environmental sequencing data at Wuhan HSM. What can we say from news reports of a presentation at a meeting we didn't attend? ("We" here being most of us). Not as much as we'd like, but not nothing 🧵
First, I'll start with an agreement with lab-leakers: the news reports are FAR too breathless for something that's not available for public and scientific scrutiny. There *is* something that's essential to report and important for our understanding of what happened, though.
New evidence is an opportunity to update our beliefs. So rather than trying to pointlessly argue about the science from the articles, I'll describe how the results might lead us to update our beliefs. This assumes the results are accurately reported and will hold up.
The Lancet Commission Report's treatment of Covid origins is the opposite of what it pretends to be: it calls for inquiry, but actively contributes to disinformation.
The @Commissioncovid report ironically COVERS UP inquiry into Covid origins 1/10
The most important and substantive investigations into Covid origins in the last year are the @ScienceMagazine papers by Pekar et al (2022) and Worobey et al (2022), as well as the Gao et al preprint. According to @Commissioncovid, these never happened. 2/