Fantastic review on chronic SARS-CoV-2 infections by virological superstars Richard Neher & Alex Sigal in Nature Microbiology. I’ll do a short overview, outline a couple minor quibbles, & defend the honor of ORF9b w/some stats & 3 striking sequences from the past week.
1/64
First, let me say that this is well-written, extremely readable, and accessible to non-experts, so you should go read the full paper yourself, if you can find a way to access it. (Just realized it’s paywalled, ugh.) 2/64 nature.com/articles/s4157…
Neher & Sigal focus on the 2 most important aspects of SARS-CoV-2 persistence: its relationship to Long Covid (including increased risk of adverse health events) & its vital importance to the evolution of SARS-CoV-2 variants. I’ll focus on the evolutionary aspects.
3/64
They divide persistent infections into 2 categories, based on immune status of the host: immunocompromised (IC) or immunocompetent. There are of course intermediate immune states—immunosenescent the most important example to me—but the binary model is the most useful model. 4/
Chronic SARS-2 infections in the IC are exceedingly well documented. In one sense, this isn't abnormal: chronic infections in IC have been documented for many “short infection” viruses—influenza, parainfluenza, rhinoviruses, adenoviruses, “common-cold” CoVs, norovirus, etc. 5/
What makes chronic SARS-CoV-2 infections different? One difference is the way such long-term infections have dominated the evolution of major variants. Nothing like this is known to happen w/other human viruses (though it can't be definitively ruled out due to spotty sequencing).
6/
Prolonged influenza infections in the IC, for example, have been documented, but there is no sign they’ve ever led to major flu variants. This prompts 2 questions: 1) Why do major SARS-2 variants arise from chronic infections? 2) Why not in other viruses? 7/
One major reason chronic infections have created variants is the lack of a transmission bottleneck, which I’ve discussed previously (e.g. posts 6-11 in thread below). Neher & Sigal also write about this. 8/
Another possible explanation for the success of chronic infection-derived variants is that the adaptive immune response takes time to kick in (after the period of transmission in many/most cases), so it is in long-term infections that antibody-escape mutations evolve efficiently.
9/64
The long branches leading to new, chronic infection-derived variants exhibit more rapid evolution than circulating lineages, and, as outlined in the paper, many studies of infections in the immunocompromised have directly observed the same. 10/
It’s important to recognize that the vast, vast majority of chronic infections do not lead to serious variants; in fact, it seems they rarely transmit at all, for unknown reasons, even when they possess potent immune evasion (see 1 hypothesis below). 11/
There are thousands of examples of such chronic-infection sequences, and each sequence undoubtedly represents hundreds of similar, unsequenced potential variants. Neher/Sigal outline numerous studies that hint at the enormous number of hidden, long-term infections.
12/64
It’s not at all clear why a tiny fraction of these highly mutated variants transmit widely and others do not. But the success of such variants was not, as far as I know, predicted by anyone, and this was not just because there were no clear previous examples.
13/
There are good theoretical & empirical reasons these variants were unexpected. I haven’t read a similar discussion of this elsewhere, so this is one of the highlights of the paper to me. One example: ancestral HIV variants transmit more successfully than host-adapted ones. 14/
Many mutations in chronic seqs seem to fit this description, like E:T30I & M:H125Y. Many dozens of others that almost never transmit are reversions to Bat-CoV residues & likely adaptations to the GI tract that impede transmission. See @SolidEvidence for more on this. 15/
There are innumerable ways to be immunocompromised (inherited genetic disorders, immunosuppressive therapies, cancer, etc.), but advanced HIV may be an especially potent incubator of potential variants & may explain why so many have originated in Southern Africa. 16/64
Many people suffering chronic infections are very sick: the metadata from sequences very often indicates hospitalization. Others, aware of their vulnerability to infection, have few social contacts and/or mask in social situations. But HIV is often very different.
17/
Many people with advanced HIV who are chronically infected with SARS-CoV-2 have few or no symptoms & therefore have a normal number of social contacts. 18/
Furthermore, intermittent use of antiretrovirals (ARVs) could periodically subject the virus to enough antibodies to foster escape mutations but not enough to clear the virus entirely.
19/
Speculatively, perhaps 99% is cleared but a reservoir in the GI tract or other tissue persists, that later, when ARV treatment lapses, expands to the respiratory tract & potentially transmits. There is indirect evidence for this that requires a look at Bat-CoVs & Cryptics.
20/
I don’t think transmission in bats is well characterized, but it seems CoVs preferentially infect GI tissue in bats. Tissue tropism of one Merbecovirus was analyzed & found to favor GI tissue in a paper by @thijskuiken, @MarionKoopmans, & @VeraMols
21/64 journals.asm.org/doi/10.1128/jv…
There are a striking number of mutations in major variants that are “reversions” to AA residues in Bat-CoVs. Gamma’s ORF1a:K1795Q, as @SolidEvidence first pointed out, is one example. BA.2.86 possesses a panoply of such mutations.
22/64
This at least suggests the possibility that some major variants spent extensive time evolving in the GI tract of the chronically infected individual they evolved in.
Two remarkable chronic-infection sequences also hint at a possible GI-to-lung transition.
23/64
The Cryptics documented by @SolidEvidence bear some distinct mutational signatures that almost never appear in conventional nasal-swab sequences. One is S:L828F.
24/64
One of the most remarkable seqs ever was from the lab of @GBazykin: a B.1.1 collected in Oct 2022 whose closest relatives were collected around July-Sept 2020. It had many distinctive Cryptic mutations—but not S:L828F.
25/64
BUT, it very clearly *had* S:L828F at some previous time, and I think this likely represents the virus’s transition from the lungs, to the GI tract, and back again. See posts #13-19 below for my previous explanation of this. 26/
Another mutation that’s even rarer than S:L828F—& which appears almost exclusively in long-term, highly mutated chronic infections—is S:L1186F. Though it’s uncommon in Cryptics, it sometimes appears with very rare Cryptic-type mutations like… 27/
…the ORF1b:L314P reversion & S:Q498H, (both of which are in the B.1.1 sequence described above & in many Bat-CoVs). Two remarkable chronic BA.5.2 seqs from Canada did not have S:L1186F. 28/
BUT, like L828F in the B.1.1 sequence discussed above, they clearly had S:L1186F previously. C25118T is in every S:L1186F sequence ever. C25120A causes the F1186L AA reversion, but leaves a tell-tale trace in the genomic record. 29/
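To make the "tell-tale trace" concrete, here is a minimal Python sketch of the codon arithmetic. It assumes the spike ORF starts at nucleotide 21563 in Wuhan-Hu-1 coordinates, and it infers the reference codon CTC from the two substitutions named above (C at 25118, C at 25120, and Leu as the encoded residue). The helper names are purely illustrative.

```python
# Only the standard-genetic-code entries this example needs.
CODON_TO_AA = {"CTC": "L", "TTC": "F", "TTA": "L"}

S_START = 21563  # ASSUMPTION: first nucleotide of the spike ORF (Wuhan-Hu-1 numbering)


def codon_coords(aa_pos, orf_start=S_START):
    """Return (first, last) genomic positions of the codon for residue aa_pos."""
    first = orf_start + 3 * (aa_pos - 1)
    return first, first + 2


def apply_sub(codon, codon_start, sub):
    """Apply a substitution written like 'C25118T' to a codon string."""
    ref, pos, alt = sub[0], int(sub[1:-1]), sub[-1]
    i = pos - codon_start
    if codon[i] != ref:
        raise ValueError(f"{sub}: expected {ref} at {pos}, found {codon[i]}")
    return codon[:i] + alt + codon[i + 1:]


start, end = codon_coords(1186)                   # codon for S residue 1186
ref_codon = "CTC"                                 # INFERRED from the text, not looked up
mutant = apply_sub(ref_codon, start, "C25118T")   # L1186F
reverted = apply_sub(mutant, start, "C25120A")    # F1186L

for label, codon in [("reference", ref_codon), ("L1186F", mutant), ("reverted", reverted)]:
    print(label, codon, CODON_TO_AA[codon])
```

The reverted codon encodes Leu again but is not the reference codon, which is why a sequence can show no S:L1186F at the AA level and still carry C25118T at the nucleotide level.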
There are only a few dozen S:L1186F sequences ever, with close to half clear chronics. It may be meaningful that 3 are labeled as having come from bronchoalveolar lavage. Perhaps an inability to thrive in the URT prevents L1186F from transmitting.
30/
Interesting tidbit: in chronics there’s a very tight connection between mutations at E:5-42 (esp ∆V14 & T30I) & a region of NSP4 (ORF1a:3049-3089 plus—oddly—ORF1a:2972).
*All* S:L1186F chronics have ≥1 mut at E:5-36 and 8/11 have ≥1 in the NSP4 region.
31/64
Back to the paper. I have a few tiny quibbles, one involving non-spike muts in chronics, particularly ORF9b mutations. Sigal/Neher state that “mutations are not often observed outside spike” in chronics and note that “Mutations in ORF9b, ORF6, and N, key viral proteins…
32/64
…which mediate escape from innate immunity, do not appear in the list of mutations from persistent infections.” The list is based on 3 great studies by
@Mahan_Ghafari, @LucaFerrettiEvo, @jcbarret;
@sigallab, @farinakarim, @tuliodna;
@PeacockFlu, @pathogenomenick 33/
…& it’s a good list, but, I’d argue, incomplete. Using several thousand chronic seqs I’ve documented I’ve tried to calculate the rate of private mutations per AA residue for various proteins. It’s an imperfect measure but I think broadly accurate. 34/
In raw counts, ORF9b ranks #1. An adjusted version that excludes mutations that are clearly not overrepresented in chronic sequences puts spike first, followed by E, then ORF9b. I need to improve this measure further by adjusting for mutation rates in circulating sequences. 35/
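For readers who want to see what a "private mutations per AA residue" measure looks like in practice, here is a rough Python sketch. The protein lengths are the standard SARS-CoV-2 ones, but the mutation counts are invented placeholders, not the counts from my chronic-sequence dataset, and the function names are hypothetical.

```python
# Protein lengths in amino acids (standard SARS-CoV-2 values).
PROTEIN_LENGTHS = {"S": 1273, "E": 75, "ORF9b": 97, "N": 419}


def mutations_per_residue(counts, lengths=PROTEIN_LENGTHS):
    """Divide each protein's private-mutation count by its length in residues."""
    return {protein: counts[protein] / lengths[protein] for protein in counts}


def rank_by_density(density):
    """Protein names sorted from highest to lowest mutation density."""
    return sorted(density, key=density.get, reverse=True)


# PLACEHOLDER counts, made up purely to show the calculation.
placeholder_counts = {"S": 1273, "E": 150, "ORF9b": 291, "N": 419}

density = mutations_per_residue(placeholder_counts)
ranking = rank_by_density(density)
print(density)
print(ranking)
```

Normalizing by length is what lets a 97-residue protein be compared with a 1273-residue one. The adjustments mentioned above (dropping mutations not overrepresented in chronics, correcting for rates in circulating sequences) would be extra steps on top of this.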
As I noted in a previous thread, ORF9b is coded from an internal, overlapping reading frame within the N gene. Eddie Holmes & @SimonLoriereLab have noted that mutation rates tend to be lower in overlapping genes—especially internal ones like ORF9b.
36/64
But ORF9b is not just an overlapping gene; it’s also a fold-switching protein. Remarkably, it takes on two completely different protein structures: a monomer whose ordered regions are all alpha helices & a dimer consisting entirely of beta strands.
37/64
I’ll eventually finish a thread/paper elaborating a bit on this, but briefly, the ORF9b monomer powerfully antagonizes the IFN response while the dimer holds a lipid molecule & likely plays an important role in assembly. See great work by @KroganLab & @DoctorBou on this.
38/64
Considering this, the very high mutation rate in ORF9b is very surprising: Not only are ORF9b’s mutational options severely limited because it shares a frame with N, it also has to preserve two structural forms, which carry out entirely different functions.
39/64
Another neglected ORF9b-related feature of chronic infections (& CI-derived variants) is the repeated destruction of the N Kozak sequence, which dramatically increases ORF9b expression. See 46-53 in thread below for a partial outline of this. 40/
I did not mention in that thread that these N-Kozak-killing, ORF9b-enhancing mutations can be found in *virtually every major variant* of the VOC era. There are only two major exceptions: Beta and Gamma.
41/64
But even Beta and Gamma managed to increase ORF9b protein expression about 3-4-fold compared to WT through unknown mechanisms.
(graph via @KroganLab @DoctorBou @lucygthorne @akreuschl)
42/64
And Beta may have simply arrived too early: 5 of 7 chronic Betas I’ve found acquired one of the three maximal N-Kozak killers (A28271T, A28271C, or ∆A28271).
43/64
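A small Python sketch of why all three of these count as Kozak killers: each one removes the purine at position -3 relative to the N start codon. The coordinates and the upstream sequence here are my assumptions about the Wuhan-Hu-1 reference (N start codon at 28274, preceded by AACTAAA) and should be checked against NC_045512.2. The "purine at -3" rule is the standard Kozak criterion, and the helper names are illustrative.

```python
N_START = 28274       # ASSUMPTION: position of the A in N's ATG (Wuhan-Hu-1 numbering)
UPSTREAM = "AACTAAA"  # ASSUMPTION: bases 28267-28273, immediately 5' of the ATG


def index_of(pos, upstream=UPSTREAM):
    """String index of genomic position pos within the unmodified upstream context."""
    return pos - (N_START - len(upstream))


def substitute(upstream, pos, alt):
    """Replace the base at genomic position pos with alt."""
    i = index_of(pos)
    return upstream[:i] + alt + upstream[i + 1:]


def delete(upstream, pos):
    """Delete the base at genomic position pos, shifting 5' bases toward the ATG."""
    i = index_of(pos)
    return upstream[:i] + upstream[i + 1:]


def has_purine_at_minus3(upstream):
    """True if the third base upstream of the ATG is A or G."""
    return upstream[-3] in "AG"


variants = {
    "reference": UPSTREAM,
    "A28271T": substitute(UPSTREAM, 28271, "T"),
    "A28271C": substitute(UPSTREAM, 28271, "C"),
    "delA28271": delete(UPSTREAM, 28271),
}
for name, context in variants.items():
    print(name, context + "ATG", has_purine_at_minus3(context))
```

Under these assumptions the deletion behaves like the substitutions because removing one A from the run slides the upstream T into the -3 slot.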
There’s also a plausible explanation for Gamma’s lack of an N-Kozak killer. ORF9b seems to be the most vulnerable SARS-CoV-2 protein to K48 ubiquitination & degradation by the proteasome, as documented by two excellent papers in mBio & @JVirology.
44/64
Gamma is the only major variant to have the classic chronic mutation ORF1a:K1795Q—also found in virtually all related Bat-CoVs. This mutation dramatically increases the NSP3 PLpro domain’s ability to deubiquitinate K48-tagged proteins. 45/
So it seems plausible that, like other sarbecoviruses, Gamma has no need to increase ORF9b because its PLpro efficiently prevents ORF9b degradation. Perhaps there’s even more active ORF9b protein in Gamma-infected cells than in any other variant.
46/64
Three sequences uploaded in the past week (two collected in Dec 2024) exemplify the high ORF9b mutation rate in chronic infections—and two other striking ORF9b trends in chronics.
47/64
First a BA.1.1 from Texas with 4.5 ORF9b mutations & a BA.4.6 from Arizona with five.
And a BA.5.2 with a stunning eight private ORF9b mutations:
R13H, G16D (R), R47H, P51Q, L64P, A75V, E86D, V93L
The two major trends in ORF9b mutations in chronics can be seen here:
#1) C-terminal muts, often in clusters
#2) Reversions to Bat-CoV residues
49/64
Here’s an alignment of the ancestral SARS-CoV-2 ORF9b with closely related sarbecovirus ORF9b seqs. The second sequence is the consensus sarbeco ORF9b. The 6 reversions to Bat-CoV residues in these 3 chronics are labeled.
50/64
Many other 9b “reversions” are also common, both in chronics & in circulation. I5T, in particular, was seen in chronic infections years before it became ~universal in XBB.1.9’s descendants. Others include R32H, D33G, G38D, N55S (XBB.1.16), & N62D.
51/64
What are the C-terminal ORF9b mutation clusters about? The ORF9b monomer CTD is disordered & projects outward. In the dimer, the CTD is also on the exposed outer edge of the protein, perhaps making it vulnerable to antibodies or T-cell attack.
52/64
Anyway, my mutation density measure masks huge variations in mutation density within proteins. For a subset of proteins, I separated out the various known domains. As expected, spike RBD is the runaway winner here. I need to add domains for E, M, ORF9b, ORF3a, & others.
53/64
NSP12 palm region spans 2 separate regions of the linear AA seq separated by a linker region. The linker has a much higher mutation rate, & this is generally true for NSP3 (& likely others) as well. But exact NSP3 linker residues aren’t well defined, so I didn’t include them. 54/
I have one other nano-quibble that wouldn’t merit mentioning except that it leads to a more important point. They state that BA.2.86, before acquiring L455S & becoming JN.1, had limited spread. In fact, it spread quite widely & was among the fastest lineages globally.
55/64
This was despite BA.2.86 having weaker antibody-evasion than other circulating lineages at the time—and exhibiting weaker infectivity in vitro. So why did BA.2.86 spread at all?
56/64
Partly, this may have been due to an increased ability to utilize cell-surface heparan sulfate as a co-receptor, as @yunlong_cao & @GuptaR_lab showed in a fascinating paper.
57/64
There’s more to the story than just that, but I think
@StuartTurville may be the only person in the world (along with Stefan Pöhlmann) who can properly explain it, so I’ll leave the story to him.
58/64
The excellent discussion section of Sigal & Neher’s paper is a treat. In describing the extreme rarity of chronic-infection variants, they make a point that should be obvious but which many fail to grasp: a single infection is all it takes.
59/64
Furthermore, the random nature and timing of these divergent variants means traditional seasonal vaccination schedules are inadequate, and future changes in disease severity & symptoms are entirely possible, something @georgimarinov has discussed.
60/64
Finally, the fundamental questions: Why do we see this in SARS-CoV-2 & not influenza or other viruses? A mostly uniform immune history & antigenic imprinting (OAS) may be part of it. Could the rampant recombination typical of CoVs factor in as well?
61/64
And what to make of the many, many studies showing signs of persistent viral RNA or proteins—most recently in the “skull-meninges-brain axis”—even in the immunocompetent? I feel more baffled than ever on this front, & have nothing useful to add.
62/64 cell.com/cell-host-micr…
I’ve always been skeptical that viral RNA or proteins could persist for many months or years with no viral replication occurring, except maybe in germinal centers. Are the skull meninges immune privileged enough to be an exception? Or is my skepticism wholly unjustified?
63/64
I haven’t read much comment or discussion among experts on that study, which seems extremely well done & potentially really important. Would love to hear more expert thoughts on this.
64/64
In SARS-2 evolution, amino acid (AA) mutations get the lion’s share of attention, & rightfully so, as noncoding & synonymous nucleotide muts (which cause no AA change) are mostly inconsequential. But there are many exceptions, including a possible new one I find intriguing. 1/30
I’ll discuss four categories of such “silent” mutations, two of which might be involved in the recent growth of one synonymous mutation. 2/30
Maybe the single most remarkable example of convergent evolution in SARS-CoV-2 involves noncoding mutations: the multitude of muts in major variants that have pulverized the nucleocapsid (N) Kozak sequence.
I wrote about this below & a few other 🧵s 3/
@SolidEvidence There was yet another paper this week describing someone chronically infected, with serious symptoms, but who repeatedly tested negative for everything with nasopharyngeal swabs. On bronchoalveolar lavage (BAL), they were Covid-positive. 1/ ijidonline.com/article/S1201-…
@SolidEvidence BAL is very rarely performed, yet there must be dozens of documented cases now where NP-swab PCR-negative patients who were very ill tested positive by BAL. This has to be way more common than we realize.
If we had a similar GI test, I imagine we'd find something similar. 2/
@SolidEvidence Importantly, the patient was treated and improved, likely clearing the virus for good. Many, maybe most, chronic infections could be treated and cleared. But they have to know they're infected for that to happen. 3/
Read full 🧵for explanation, but the short story is that the best apparent escape mutations all interact w/something else—like a nearby spike protomer or other important AA—making mutations there prohibitively costly.
In short, the virus has mutated itself into a corner. 2/6
It's very hard to effectively mutate out of such a local fitness peak via stepwise mutation in circulation since multiple simultaneous muts might be required to reach a higher fitness peak. 3/6
It's an interesting thought. I think the evidence is strong that all new, divergent variants have derived from chronic infections. The first wave of such variants—Alpha, Beta, Gamma—IMO involved chronic infections lasting probably ~5-7 months. It's controversial to say.... 1/15
…that Delta originated in a chronic infection, but I think the evidence that it did is strong. One characteristic of chronic-infection branches is a high rate of non-synonymous nucleotide (nuc) substitutions (subs)—i.e. ones that result in an amino acid (AA) change. 2/15
For example, if 80% of nuc subs in coding regions cause an AA change, that’s a very high nonsynonymous rate. The branch leading to Delta has 17 AA changes—from just *15* nuc subs! That’s over 100%. How is this possible? 3/15
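The arithmetic behind "over 100%" is just AA changes divided by nucleotide substitutions. Here is a tiny Python sketch using the two numbers quoted above; the function name is illustrative, and the code only computes the ratio without trying to explain it.

```python
def aa_changes_per_nuc_sub(aa_changes, nuc_subs):
    """Ratio of amino acid changes to nucleotide substitutions on a branch."""
    return aa_changes / nuc_subs


# Hypothetical "very high" benchmark from the text: 80 AA changes per 100 nuc subs.
benchmark = aa_changes_per_nuc_sub(80, 100)

# Delta branch as quoted above: 17 AA changes from 15 nuc subs.
delta_branch = aa_changes_per_nuc_sub(17, 15)

print(f"benchmark: {benchmark:.0%}")
print(f"Delta branch: {delta_branch:.0%}")
```

A ratio above 1 can only happen if at least one nucleotide substitution is counted toward more than one AA change.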
I'd add that XEC's had no noticeable impact on cases & isn't likely to going forward barring a serious change, which we've not seen since S:Q493E & the glycan-adding S:S31-/S:T22N appeared months ago. Next major change seems likely to take the form of an entirely new variant. 1/4
I've been in lockstep with @SolidEvidence and @JPWeiland on this front. Despite the sensational early growth advantages XEC appeared to have, it never seemed likely to me ever to have a noticeable real-world impact. 2/4
In fact, XEC resembles BA.5.2 + ORF1b:T1050N, which had a similar growth advantage in summer 2022. That one, however, never had a sexy name like "XEC" that was distinct from other major contemporary variants so it passed unnoticed. Names matter. 3/4
Molnupiravir-created mutants still show up intermittently, mostly in Australia and Japan. A remarkable one popped up today: A KP.3.1.1 with 94 private mutations. 1/6
The closest related sequences are from the same region and from about 1 month earlier, suggesting these 94 consensus mutations were acquired in about one month, and possibly a shorter period of time. 2/6
It has the classic MOV signature of an extremely high percentage of transitions, primarily C->T and (especially) G->A.
93/94 mutations are transitions
27/94 are C->T
38/94 are G->A
More detailed discussion of this in 2022 thread below.
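For anyone who wants to check a mutation list for this signature themselves, here is a minimal Python sketch. The transition/transversion rule is standard (purine to purine or pyrimidine to pyrimidine is a transition), and the counts are the ones quoted above. The function and variable names are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def share(part, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Counts quoted in the thread for this KP.3.1.1 sequence.
total, transitions, c_to_t, g_to_a = 94, 93, 27, 38

print("C->T is a transition:", is_transition("C", "T"))
print("G->A is a transition:", is_transition("G", "A"))
print("C->A is a transition:", is_transition("C", "A"))
print("transitions %:", share(transitions, total))
print("C->T %:", share(c_to_t, total))
print("G->A %:", share(g_to_a, total))
```

With these counts, C->T and G->A together make up 65 of the 94 mutations, and the other transition types account for the remaining 28 transitions.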