This could have the record for private spike mutations, surpassing even BA.1 & BA.2.86. ~35 spike AA mutations—& likely many others hiding behind dropout at S:447-471. Singlet in a country w/good surveillance, so unlikely to transmit. I'm super busy, so a few short remarks. 1/15
First, of the nucleotide mutations in amino acid (AA) coding regions, 61/71, or 85.9%, are non-synonymous, i.e. they cause a change in the AA. This is extraordinarily high and indicates strong positive selection—i.e. selection for advantageous mutations. 2/15
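As a quick check of the arithmetic above, here is a minimal sketch that recomputes the non-synonymous share from the two counts quoted in the tweet. The function name is illustrative only; it is not from any tool used in the thread.

```python
def nonsynonymous_fraction(nonsynonymous: int, total_coding: int) -> float:
    """Share of coding-region nucleotide mutations that change the amino acid."""
    if total_coding <= 0:
        raise ValueError("total_coding must be positive")
    if not 0 <= nonsynonymous <= total_coding:
        raise ValueError("nonsynonymous must be between 0 and total_coding")
    return nonsynonymous / total_coding


# Counts taken from the thread: 61 non-synonymous out of 71 coding mutations.
fraction = nonsynonymous_fraction(61, 71)
print(f"{fraction:.1%}")  # 85.9%
```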
This sort of positive selection is a hallmark of chronic infections.
• Glycans—These are a type of sugar molecule that can hide exposed parts of the virus. S:F32S is somewhat convergent in chronics, & this is likely because it creates the N-X-[S/T] motif that can add a glycan. 3/15
• Glycans (cont): S:Y248N also creates an N-X-[T/S] motif (where X is any AA but P) & so likely adds a glycan. We've seen Y248N before, notably in BA.2.76. BA.2.86 (JN.1 now) has S:H245N, which adds a glycan at a nearby N residue, as well as K356T—which also adds a glycan. 4/15
• XBB mutations—This seq shares an unusual number of mutations with XBB. Most are highly convergent in chronics & may have been added independently (S:D339H, S:L368I, E:T11A, ORF8:G8*, ORF9b:I5T). But others aren't so common (S:V83A, T27889C reversion). I'm not sure if... 5/15
...these resulted from a complex intrahost recombination process or were all acquired independently.
Normally, shared mutations w/another variant w/no clear breakpoint suggest a sequencing artifact, but there are too many chronic markers for this to be an artifact. 6/15
• ORF1a:A2710T—Also in BA.1 and BA.2.86. Outside those 2 lineages it's been extremely rare. Curiously, SARS-CoV-1 & most Bat-CoVs have ORF1a:2710S, a similar AA. Both form the glycan N-X-[S/T] pattern. However, most structures show this region of NSP3 (NSP3_1892)... 7/15
...to be in the cytosol, in which glycans do not exist (as I understand it). Previously there had been predictions of >2 transmembrane regions of NSP3, but I do not know if those have been decisively refuted or not. 8/15
• C26881T—This is in BA.2.86 & is convergent in chronics. It was also in Epsilon (B.1.427/9), a variant that becomes more intriguing the more I learn about it.
It's synonymous, so it doesn't cause an AA change. What is it doing? I don't know. 9/15
Credit @shay_fleishon for calling attention to C26881T. C->T are by far the most common mut & are caused by a cell protein called APOBEC. But APOBEC favors certain nuc contexts. C26881T has an awful context & hence is less likely to occur by chance. 10/15
C26881T could have something to do with the genome's secondary RNA structure, something I speculated about previously, but these things are poorly understood, so it's hard to come to any real conclusions on this front. 11/15
• Unknown nucleotide at A5648 almost certainly forms ORF1a:K1795Q, perhaps the single most convergent mutation in chronic-infection sequences, as @SolidEvidence first pointed out.
And for once, we actually know what this mutation does. 12/15
• ORF1a:I3255T (NSP4_I492T) reversion—Rare mutation but has shown up in numerous highly mutated chronic-infection seqs. Any convergent reversion is undoubtedly purposeful. E.g., we know S:R493Q hugely increased ACE2 affinity. But most non-spike reversions remain mysterious. 13/15
One study found ORF1a:T3255I increases infectivity & immune evasion & reduces severity by improving the efficiency of NSP5 cleavage from the ORF1ab polyprotein. (All NSPs come in 1 big chunk & must be cut off to function properly.)
What purpose might its reversion serve? 14/15
• S:V486H—Very rare 2-nuc mut. Looks unfavorable on the @jbloom_lab/@bdadonaite spike-mutation tool in XBB background, but this spike is its own beast & it could act differently in this context.
I could go on forever, but I have too much else to do, so I'll end here. 15/15
One final note—S:F759V has only ever appeared in two previous sequences, making it (along with the insertion at S:248) the most distinctive mutation of all in this sequence.
• • •
This one's a doozy. Has to be the wildest RBD yet.
It's hard to convey just how crazy this one is, but in this 🧵 I'll share my initial thoughts on some of these RBD mutations. (RBD = receptor-binding domain of the spike protein)
1/22
I've analyzed 1000's of chronic-infection sequences, yet this Cryptic has 9 mutations I've never seen in any of those: G413R, T415E, Y421F, N440T, V445N, G446K, Y449T, K462Q, and G482T—*in the RBD alone.* Two others I've only seen once. 2/22
Another striking feature of this Cryptic spike is the number of multi-nucleotide mutations it has. Almost all amino acid (AA) mutations (>>99.9%) involve 1 nucleotide mutation. Yet this partial RBD sequence has *seven* 2-nuc mutations & two *3-nuc* mutations. 3/22
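To make "2-nuc" and "3-nuc" mutations concrete, here is a small sketch that uses the standard genetic code to find the minimum number of nucleotide changes needed to get from one amino acid to another. It gives a lower bound only: the real count depends on the actual parent codon in the genome, which this sketch does not know. All names are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' = stop).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nuc_differences(codon_a: str, codon_b: str) -> int:
    """Number of positions (0 to 3) at which two codons differ."""
    return sum(a != b for a, b in zip(codon_a, codon_b))


def min_nuc_changes(aa_from: str, aa_to: str) -> int:
    """Fewest nucleotide changes between any codon pair for the two AAs."""
    from_codons = [c for c, aa in CODON_TABLE.items() if aa == aa_from]
    to_codons = [c for c, aa in CODON_TABLE.items() if aa == aa_to]
    return min(nuc_differences(a, b) for a in from_codons for b in to_codons)


print(min_nuc_changes("Y", "N"))  # 1  (e.g. TAT -> AAT)
print(min_nuc_changes("V", "H"))  # 2  (e.g. GTT -> CAT)
print(min_nuc_changes("K", "F"))  # 3  (e.g. AAA -> TTT)
```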
Longer 🧵 on BA.2.87. So far, it has not grown quickly or spread far geographically, so its future is murky. It could be a flash-in-the-pan that soon disappears or it could mutate to become more fit & challenge JN.1. My summary of its private mutations is below. 1/22
This is *not* to be confused w/the equally amazing BA.2 singlet discovered by @BorisUitham. While these 2 astoundingly divergent BA.2s are completely different, they do share one remarkable similarity: total loss of the same spike NTD disulfide bond. 2/22
Like BA.2.86, BA.2.87 comes from the root of BA.2, meaning the person it evolved in was infected for >1.5 years. The tree gives an idea of how divergent it is but doesn’t include the most striking aspect mentioned above: its large, disulfide-destroying NTD deletions. 3/22
Phenomenal 🧵 by @Tuliodna describing a wild new BA.2 lineage circulating at low levels in South Africa.
Incredible find by one of the best & most valuable teams in the world. We're all in their debt. Thank you, @Dikeled61970012, @nicd_sa, @DarrenM98230782, @houzhou, & team.
Will this new BA.2 sputter and disappear without making much impact? We've seen many chronic-infection BA.2 saltation variants do exactly that: BA.2.83, DD.1, BP.1, BA.2.10.4, + multiple undesignated ones (exceptionally divergent BA.2s in Chile & Ukraine come to mind). 2/
Or will it be more like BA.2.86 or BA.2.75, with modest growth at first, followed by a breakthrough RBD mutation(s) & rapid growth?
One important question: What is the ACE2 binding strength of this new BA.2? BA.2.86 & BA.2.75 had very high ACE2 binding. 3/
We see a lot of BA.2* these days: ~100% of cases have been due to BA.2-derived variants in the past year. But what ever happened to BA.1?
It’s still around, just not transmitting… much. The BA.1 below, w/32 additional spike mutations, was transmitted at least once. 1/15
Another recent variant also had 32 spike mutations relative to its ancestor: BA.2.86 (which later became JN.1). This is not the first extremely divergent, chronic infection-derived BA.1* sequence we’ve seen. But it may be the first we’ve seen transmitted. 2/15
I'll confine myself to commenting on ~6 mutations. 1) S:∆149-157—This deletion is never in BA.2 but often seen in chronic BA.1. I have no idea why this is. Any guesses, @PriscillaFalzi1, @jbloom_lab, @GuptaR_lab, @EnyaQing, @veeslerlab?
Note: SARS-1 had ∆149-152. 3/15
I'm not 100% sure due to uncertainties in the sequencing, but I think something remarkable has happened in JC.5.1 (an XBB.1.41 branch that has S:Q173K, L335S, R403K, K478R, S486P, & N:H300Y).
There's a large ORF7a deletion that leads to ORF7a-7b fusion—and more. 1/8
This deletion causes a change in the reading frame, i.e. a frameshift. This not only leads to fusion of ORF7a-7b but also completely changes the amino acid content of the last 20 AA of ORF7a. 2/8
Instead of hitting a stop codon, the new frame leads seamlessly into ORF7b. The red region below is the 173-nt deletion, and the purple is the new ORF7a reading frame.
Image made with @theosanderson's invaluable Gensplore genome browser. 3/8 gensplore.genomium.org
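The frameshift follows directly from the deletion length: 173 is not a multiple of 3, so every codon downstream of the deletion is read in a new frame. A minimal sketch of that check, with a made-up toy sequence (not real ORF7a) to show the effect; the function names are hypothetical.

```python
def causes_frameshift(indel_length: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    if indel_length <= 0:
        raise ValueError("indel_length must be positive")
    return indel_length % 3 != 0


def codon_groups(seq: str) -> list:
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


print(causes_frameshift(173))  # True: 173 = 3 * 57 + 2
print(causes_frameshift(9))    # False: an in-frame deletion of 3 codons

# Toy illustration: deleting 2 nt after the first codon changes every later codon.
toy = "ATGAAACCCGGGTTT"
shifted = toy[:3] + toy[5:]
print(codon_groups(toy))      # ['ATG', 'AAA', 'CCC', 'GGG', 'TTT']
print(codon_groups(shifted))  # ['ATG', 'ACC', 'CGG', 'GTT']
```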
So the deer-derived sequence I described in the thread below has a companion—from a different US state. They share 35 mutations, so they are clearly related. But the shorter branches leading directly to each seq have 19 and 24 mutations, so they differ by 43 nuc mutations. 1/13
These are BF.11's, a lineage that hasn't circulated in humans for ~1 year. There are 2 possibilities: 1) This deer variant is circulating at low levels in humans in the US Midwest
Seems extremely unlikely, esp w/great recent surveillance in Minnesota. @CIDRAP @mtosterholm 2/13
2) There were two independent deer-to-human transmissions of a BF.11 variant widespread among deer in the upper Midwest.
This also seems unlikely. Still, I think this is a far more likely scenario, for a couple other reasons. 3/13
@mnhealth