Marc Johnson

@SolidEvidence

Dec 23, 2023 • 26 tweets • 5 min read • Read on X

We are now officially into year 5 of the SARS-CoV-2 pandemic/endemic.

I gave a lecture to my virology class this Fall about the history of the pandemic through the lens of viral genotypes.

I thought I would share that lecture as a thread.

This is a long one.
1/

In the beginning there were 2 genotypes of SARS-CoV-2, A and B.

The two differed by only 2 nt, but both lineages would go on to circle the globe.

2/

The fact that both lineages were present in the Wuhan Seafood market from very early on is one strong piece of evidence that the market was the likely origin.

If the market were just a single superspreader location, you wouldn’t have expected it to have both lineages.

3/

The virus was found to be closely related (96% identical) to a bat sarbecovirus, RaTG13.

A striking difference was a 4 AA insertion that would create what is called a furin-cleavage site (FCS), a protein sequence that could be cut by the cellular protein furin.

4/

Lab leak proponents will claim that this is evidence that the virus was engineered because many other coronaviruses have an FCS at the same site, and investigators had talked about testing these kinds of changes.

5/

Zoonosis proponents will counter that coronaviruses make random insertions all the time, and no idiot would generate an FCS that was preceded by a proline (P) since that would make a very poor cleavage site.

6/

The virus agreed with the zoonosis proponents on this count and proceeded to eliminate the proline numerous times. 681P went extinct in circulating lineages years ago.

7/

But the A and B lineages started having offspring and eventually a B descendant called B.1 took over. Bette Korber was the first to point out the dramatic increase in lineages containing the mutation D614G, a key mutation in the B.1 lineage.

8/

I now need to explain how PANGO designations work if you aren’t familiar already.

Every lineage has a numerical designation starting with A or B.

Any time a lineage gets a descendant, the descendant gets the same designation as the parent with a new number added at the end.

9/

The first descendant of B was B.1, the second was B.2, etc.
When B.1 had descendants, they were B.1.1, B.1.2, etc.
And so on.

10/

However, they couldn’t have strings of numbers going on forever, so they put a cap at 3 numbers.

Once a lineage gets a 4th number, that is converted to the next available letter in the alphabet.

11/

The first time this happened was with the lineage B.1.1.1.1, which became C.1.

Eventually they ran out of single letters and had to switch to a two-letter code. We are about halfway towards needing a 3-letter code.

12/
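The naming rules in tweets 9 through 12 can be sketched in a few lines of Python. This is a minimal illustration, not the PANGO team's actual tooling: the function names (`expand`, `compress`, `parent`) are made up for this example, and the alias table holds only the two aliases mentioned in this thread (C for B.1.1.1 and P for B.1.1.28), while the real list is far longer.

```python
from typing import Optional

# Tiny illustrative alias table (an assumption for this sketch).
ALIASES = {
    "C": "B.1.1.1",   # from the thread: B.1.1.1.1 became C.1
    "P": "B.1.1.28",  # from the thread: B.1.1.28.1 is P.1
}

MAX_NUMBERS = 3  # cap on the numbers that may follow the letter prefix


def expand(lineage: str) -> str:
    """Replace an alias prefix with the full name it stands for."""
    prefix, _, rest = lineage.partition(".")
    full_prefix = ALIASES.get(prefix, prefix)
    return f"{full_prefix}.{rest}" if rest else full_prefix


def compress(full_name: str) -> str:
    """Shorten a full name so it carries at most MAX_NUMBERS numbers.

    Only handles one level of aliasing, which is all this toy table needs.
    """
    fields = full_name.split(".")
    letters, numbers = fields[0], fields[1:]
    if len(numbers) <= MAX_NUMBERS:
        return full_name
    # The prefix plus the first MAX_NUMBERS numbers name the aliased ancestor.
    ancestor = ".".join([letters] + numbers[:MAX_NUMBERS])
    for alias, target in ALIASES.items():
        if target == ancestor:
            return ".".join([alias] + numbers[MAX_NUMBERS:])
    raise KeyError(f"no alias in this toy table for {ancestor}")


def parent(lineage: str) -> Optional[str]:
    """Return the parent designation, or None for a root lineage (A or B)."""
    full = expand(lineage)
    if "." not in full:
        return None
    return compress(full.rsplit(".", 1)[0])
```

With this table, `compress("B.1.1.1.1")` gives `"C.1"`, `expand("P.1")` gives `"B.1.1.28.1"`, and `parent("C.1")` gives `"B.1.1.1"`. Any lineage whose ancestor is not in the toy table raises a `KeyError` instead of guessing an alias.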

Last thing, when a viral recombination occurs, the lineage starts with X (like XBB), but then all the same rules apply.

BTW, I would like to thank the dedicated scientists (mostly volunteers) that keep track of these lineages through tireless analysis. The list is long.

13/

The B.1 and B.1.1 lineages dominated by mid-2020 and everyone thought the pandemic would soon be over.

Then something surprising (to me) happened with a virus that is supposed to make very few replication errors.

14/

We started getting various lineages that seemed to be spreading at a much faster rate. The lineages (at the time) were called the UK variant (B.1.1.7), the South Africa variant (B.1.351), and the Brazil variant (B.1.1.28.1/P.1). These are now called Alpha, Beta, and Gamma.

15/

All three of these lineages, which were from very different viral backgrounds and parts of the world, had the same Spike mutation N501Y.

Why did this only start appearing a year into the pandemic?

16/

Some said it was about immune evasion, and the virus didn’t need it before because no one had immunity before.

I never bought this. The most successful of the three N501Y lineages was Alpha, which is not particularly immune evasive.

17/

Others said that N501Y gave a general growth advantage because it enhanced receptor binding.

If true, why did it take so long to be selected for?

18/

The explanation for the timing of the N501Y lineages probably has to do with how they emerged.

There is a lot of evidence that all 3 of these lineages were derived from persistent infections. This helps explain the timing.

19/

We know now that people can be infected for months or even years with SARS-CoV-2 in some instances, and this is basically like sending the virus to college.

20/

In a persistent infection the virus has lots of time to try out different combinations of changes that it doesn’t get to try when it is ‘working’ (in circulation).

I know I’m anthropomorphizing, but it fits.

21/

Generally, the novel lineages from persistent infections take months to years before they ‘escape’ and start circulating again. (In the vast majority of cases, this never happens and the lineage just stays in the one patient.)
22/

Although the Alpha/Beta/Gamma lineages all had N501Y, they also had other changes (such as changes at the FCS), and they were all B.1/B.1.1 derivatives (containing D614G).

23/

It was probably just a matter of timing.

In the first few months B.1/B.1.1 took over, started a bunch of persistent infections, and then three of these started circulating again a few months later, and they all had certain ‘obvious’ changes like N501Y.

24/

Oops, I guess Twitter has a string limit now. I'll continue in another thread.

https://twitter.com/SolidEvidence/status/1738635438301262271

More from @SolidEvidence

Marc Johnson

@SolidEvidence

Jan 24

We found a new (I think) cryptic lineage this week.
I know I say this all the time, but this is really weird.
Warning, this thread is for nerds only.
1/

Here’s what we do. Every week we download all of the new sequences from SRA and run a bunch of screens to look for anachronistic or cryptic lineages.

This new one popped up in 3 different screens.
2/

A good way to spot anachronistic lineages is to look for sequences that have been deleted in contemporary lineages. The virus can only undo a deletion through recombination. If we find seqs that lack the deletions, they have to be old (or contaminated with something old).
3/

Read 16 tweets

Marc Johnson

@SolidEvidence

Nov 23, 2025

What should we expect this flu season?

Here’s a forecast from a wastewater perspective (because sh*t don’t lie)

1/

Background. The 4 main kinds of influenza circulating among humans (in order of severity) are:
FluA H3N2
FluA H1N1
FluB
FluC (many don’t know this one)

2/

Last season, there was a pretty even split between H1N1 and H3N2, with a little bit of FluB late in the season. At least according to CDC patient data.
3/

Read 13 tweets

Marc Johnson

@SolidEvidence

Nov 21, 2025

https://x.com/SolidEvidence/status/1758255313440907668

This is wild.

Remember the NJ cryptic lineage?

I posted 18 months ago that the Spike was too divergent to predict ACE2 binding, and asked if someone else could figure it out.

Some colleagues took me up on it.

Guess what they found?
1/

This preprint just came out. @wchnicholas and team reconstructed and tested the NJ Spike and found that it has the tightest ACE2 binding of any SC2 Spike ever measured.
2/
medrxiv.org/content/10.110…

https://x.com/SolidEvidence/status/1590072665561497601

We first found the NJ variant in 2023 in a sewershed in NJ with 1.5 million people, because it regularly had a sequence that was a reversion to the bat sarbeco sequence, which is common in cryptics.
3/

Read 9 tweets

Marc Johnson

@SolidEvidence

Oct 31, 2025

Can you take a quarter cup of composite sewage, simply ask ‘what’s in there?’, and find out all of the pathogens circulating in that community?

That is the question we asked in our latest pre-print.

Turns out you can.
1/
medrxiv.org/content/10.110…

We are not the first group to do unbiased sequencing of wastewater to monitor circulating viruses, but I think we are the first to ever do it at this scale.

Weekly wastewater samples for 18 months, totaling over 85 Billion sequence reads.

2/

Among the ‘known’ viruses, there was a fairly even split between bacteria viruses (phages) and eukaryotic viruses.
This was just raw reads though. If you look at diversity, there were considerably more species of phages.
3/

Read 23 tweets

Marc Johnson

@SolidEvidence

Oct 24, 2025

Help me out, I’ve got another wastewater virus mystery.

This one really blows my mind.
1/

Starting in late 2023, we + @securebio have been doing ultra-deep metagenomic sequencing of the virome from Columbia, MO wastewater.

We’ve collected and sequenced samples for over 90 consecutive weeks.
2/
Lung.fish

We sequence about a billion reads per sample. That’s generated about 16TB of data from this site so far.

To put this in perspective for people my age, it would take a stack of 3.5 in floppy disks 200 miles high to store this data.
3/

Read 12 tweets

Marc Johnson

@SolidEvidence

Oct 17, 2025

It looks like the Coeur d’Alene, ID cryptic is gone for now, but it has still managed to answer a lot of lingering questions for me about SARS-CoV-2 evolution, and what to expect next.

Here's a whole genome summary and interpretation.
1/

For a long time cryptic lineages were all from pre-Omicron lineages.

I started wondering:

Will there be Omicron cryptics?

If so, will they have the same evolutionary trajectories as the pre-Omicron cryptics?

ID shows that the answer to both questions is yes.
2/

We don’t do a lot of whole genome sequencing, so I sent 3 samples to @dho lab, who got fantastic sequences for all 3.
These samples were virtually 100% cryptic, so we have nearly complete coverage of the genome for a change.
3/

Read 12 tweets
