With 3 more days of data, here's a look at how far various collections of immune escape variants have progressed so far.
I now count S:493 as a key mutation, so what used to be >=5 (Pentagon) is now >=6.
The more RBD mutations, the faster the growth. cov-spectrum.org/explore/World/… 1/
Looking at the data like this, we reduce the complexity behind numerous emerging variants into 4 buckets of increasing escape. This is not perfect as what's in a bucket is not totally the same, but it solves the problem of having to remember BQ.1.1, BA.2.75.2, XBB etc. 2/
If you want to connect this way of looking at things with various lineage names you may have heard of, here is a summary:
Level 3 (not shown, as it is "boring") = BA.4/5, BA.2.75 = BA.2 + 3 RBD muts
Level 4 = BA.4/5 + 1 RBD mut (e.g. BA.4.6, BF.7) or BA.2.75 + 1 RBD mut (BA.2.75.5) 3/
Each key RBD mutation that's added increases the growth advantage. But it also takes longer for that combination to evolve, so higher levels start out at lower share - to catch up later
While Level 3 (vanilla BA.4/5) still dominates in North America/Europe, that won't last long 5/
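The "lower share now, catch up later" argument can be made concrete with a rough sketch. It assumes two non-overlapping buckets whose share ratio changes exponentially at a constant rate; the function name and all numbers are made up for illustration and are not estimates from the data.

```python
import math

def days_until_overtake(share_low, share_high, rate_low, rate_high):
    """Days until the higher-level bucket reaches the share of the lower-level one.

    Assumes the ratio share_high / share_low grows exponentially at
    (rate_high - rate_low) per day. A simplification, not a fitted model.
    """
    if rate_high <= rate_low:
        raise ValueError("the higher level must grow faster to ever catch up")
    return math.log(share_low / share_high) / (rate_high - rate_low)

# Made-up example: lower level at 40% share, higher level at 5% share,
# higher level growing 0.07 per day faster than the lower level.
days = days_until_overtake(0.40, 0.05, 0.00, 0.07)
print(f"higher level catches up after about {days:.0f} days")
```

The point is only qualitative: an 8-fold head start is used up within weeks once the per-day advantage is a few percent.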
S:346T, which is in most of level 4, is on track to become dominant by the end of October as I predicted 7 weeks ago here: github.com/neherlab/SARS-…
(note any higher level variant is automatically a member of lower levels, hence level 5 (BA.2.75.2) is also level 4) 5/
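A minimal sketch of the "N or more" membership rule follows. The set of key sites is a hypothetical stand-in (the actual 12 sites are not listed here), and the sample sequence is made up.

```python
# Hypothetical key RBD sites, for illustration only;
# this is NOT the actual list of 12 sites used for the levels.
KEY_SITES = frozenset({346, 444, 452, 460, 486, 490})

def key_mutation_count(mutated_sites):
    """Number of key sites at which a sequence carries a mutation."""
    return len(KEY_SITES & set(mutated_sites))

def in_level(mutated_sites, level):
    """Level N means N or more key mutations, so levels are nested."""
    return key_mutation_count(mutated_sites) >= level

# Made-up sequence: mutated at five key sites plus one non-key site (614).
sample = {346, 444, 452, 460, 486, 614}
for level in (4, 5, 6):
    print(level, in_level(sample, level))
```

With five key mutations, the sample counts towards level 4 and level 5, but not level 6.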
But higher levels are catching up very fast. Given the rapid growth in particular of level 6 variants (BQ.1.1, BN.1, BM.1.1.1, XBB), a variant-driven infection wave is inevitable by the end of November, possibly as soon as early November. 6/
As a case in point for using level buckets rather than individual lineages when evaluating the variant situation: the level 6 variants mentioned above (BQ.1.1, BN.1, BM.1.1.1, XBB) each make up only 10-35% of that bucket. If you look at only one of them, you'll miss most! 7/
If you prefer to have each lineage broken out individually, collection 24 maintained by @siamosolocani is the one to go for.
Note that it's important to not take the growth advantages at face value. Use the lower bound of the confidence interval instead: cov-spectrum.org/collections/24 9/
Beware that the proposed simplification into RBD mutation count levels ignores a lot of details that are important for individual variant success.
It only looks at mutations at 12 sites, when in fact many other positions matter, for example in the N-terminal domain of Spike. 10/
But for now, this seems to be a useful additional way of looking at the data: it makes things less confusing while remaining conceptually useful.
For the gold standard of growth advantage estimates, please have a look at @TWenseleers's analyses. 11/
@GenSpectrum is quick and dirty, and definitely useful, but it's important to not take these values at face value when there is not enough data yet.
In contrast, Tom's estimates are much more robust and less biased.
Prompted by a tweet from @BarakRaveh I added "pure" level X queries to collection 54:
A pure level 4 query shows everything with exactly 4 RBD mutations on top of BA.2 instead of 4 or more.
Because pure levels don't overlap, their shares can be added up: pure level 4 + pure level 5 etc
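The relation between "N or more" levels and pure levels can be sketched as below. It assumes consecutive levels, and the shares are made-up numbers, not values from collection 54.

```python
def pure_shares(at_least):
    """Convert 'N or more' shares into 'exactly N' shares.

    at_least maps level -> share of sequences with that many or more key
    mutations. Levels are assumed to be consecutive. The top level has no
    higher level to subtract, so it stays 'N or more'.
    """
    levels = sorted(at_least)
    pure = {}
    for lower, higher in zip(levels, levels[1:]):
        pure[lower] = at_least[lower] - at_least[higher]
    pure[levels[-1]] = at_least[levels[-1]]
    return pure

# Made-up shares for illustration.
cumulative = {4: 0.50, 5: 0.20, 6: 0.05}
pure = pure_shares(cumulative)
print(pure)
print(sum(pure.values()))  # adds back up to the level 4 ('4 or more') share
```

Adding the nested "N or more" shares directly would count higher-level sequences several times; the pure shares avoid that.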
Here are counts of BA.2.86 and overall sequence submissions to GISAID
Note that the English sample included in week 2023-08-07 likely got expedited, so it may be best to exclude it from this analysis. 1/
In my very rough reading, there's not enough data yet to pin down growth advantage. It could be small or non-existent, or it could be sizeable, e.g. doubling every week.
Bear in mind that due to constant antigenic drift, there are always lineages with decent growth advantage. 2/
To really have an impact, BA.2.86 would have to become dominant, outgrowing even the fittest lineages around, e.g. HK.3 (EG.5.1 with S:L455F) or FL.1.5.1 (456L, 478R) which themselves are doubling in share about every two weeks. 3/
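To put the doubling times above into numbers, here is a rough sketch. It assumes a simple logistic model in which the odds share / (1 - share) grow exponentially, which is close to the share itself doubling while the share is small. The 1% starting share is made up.

```python
import math

def daily_growth_rate(doubling_days):
    """Exponential growth rate per day for a given doubling time."""
    return math.log(2) / doubling_days

def share_after(share0, rate_per_day, days):
    """Share after `days`, with the odds growing at `rate_per_day`."""
    odds = share0 / (1 - share0) * math.exp(rate_per_day * days)
    return odds / (1 + odds)

weekly = daily_growth_rate(7)     # doubling every week
biweekly = daily_growth_rate(14)  # doubling every two weeks
print(f"{weekly:.3f} vs {biweekly:.3f} per day")

# Made-up starting share of 1%, followed for four weeks under each scenario.
for rate in (weekly, biweekly):
    print(f"{share_after(0.01, rate, 28):.3f}")
```

Only the difference between the two rates matters for whether one lineage outgrows the other.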
EG.5 (and EG.5.1) has recently got attention due to being highlighted by @UKHSA and @WHO.
EG.5, which is an alias for XBB.1.9.2.5, is a sublineage of XBB characterized in particular by Spike RBD mutation F456L. 1/
EG.5 is one of the fastest growing XBB sublineages, particularly common in China where it appears to be dominant.
As EG.5 has only one RBD difference compared to the upcoming vaccine strain XBB.1.5, vaccine protection is expected to be good. 2/
While the name EG.5 may sound very different from XBB, it is important to know that this is just due to naming - EG.5 is the short form of XBB.1.9.2.5. 3/
It appears that China has stopped uploading SARS-CoV-2 sequences to GISAID and now shares via its own version of Genbank: Genbase github.com/yatisht/usher/…
I don't yet fully understand the terms under which China shares the sequences - I assume (and hope) they are just as open as Genbank.
In that case this is a great development towards having as much SARS-CoV-2 data being free of usage restrictions as possible.
GISAID should still be able to integrate the data into their platform unless the license prohibits it: if it's Genbank-like, GISAID can pull the data as they've done with sequences published only on Genbank.
I'm seeing quite a bit of discussion about the potential impact of a big wave in mainland China on variant evolution.
I do not think a big wave in mainland China would have major consequences outside of China. 1/
While China is a big country, it has less than 20% of the global population. The rate at which new variants evolve would only increase slightly as a result of fractionally more infections worldwide. 2/
We have by now seen multiple second generation BA.2 variants evolve independently: BA.4/5 in Southern Africa, BA.2.75 in South Asia, BA.2.3.20 in the Philippines, BS.1 in South East Asia. 3/
For straightforward problems, @github Copilot really rocks.
Yes, as a Python dev, I could code this up myself in less than a minute
But why spend mental energy on this if it can be auto-generated?
As a beginner, this could have easily taken me a few minutes. Now it's just seconds
And the only unnatural thing about this example is that I deleted the output generated by Copilot before recording.
Everything else is exactly as it was when I did it.
Nothing contrived, very natural usage pattern.
It's hard to think of a better way of solving this problem.
Copilot chose a very pythonic solution.
There's a good chance handwritten solutions by many developers would be less idiomatic and more confusing.
BQ.1* and XBB have different geographic foci
BQ.1* is mostly in Africa, Europe and North America
XBB in South (East) Asia
Three countries with similar levels of both, worth watching for comparison and potential co-circulation, are:
- Japan
- Australia
- South Korea cov-spectrum.org/explore/World/… 1/
BQ.1.1 and XBB have quite a lot of spike differences and seem to have similar growth advantages - this makes them candidates for co-circulation.
We may only be able to know once we see how the variants do in countries that had a wave with the other one. 2/
Here are the mutations that only occur in one of the two variants:
XBB only: S:V83A, S:Y144-, S:H146Q, S:Q183E, S:V213E, S:G339H, S:L368I, S:V445P, S:G446S, S:F486S, S:F490
BQ.1.1 only:
S:H69-, S:V70-, S:V213G, S:G339D, S:K444T, S:L452R, S:F486V 3/
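A comparison like this boils down to set differences. The sketch below uses deliberately partial mutation sets: a few of the entries listed above, plus two shared mutations (S:N460K, S:D614G) added purely for illustration. It is not a full definition of either lineage.

```python
# Partial, illustrative spike mutation sets; not complete lineage definitions.
xbb = {"S:V83A", "S:V445P", "S:G446S", "S:F486S", "S:N460K", "S:D614G"}
bq11 = {"S:H69-", "S:V70-", "S:K444T", "S:L452R", "S:F486V", "S:N460K", "S:D614G"}

only_xbb = sorted(xbb - bq11)   # mutations found in XBB but not BQ.1.1
only_bq11 = sorted(bq11 - xbb)  # mutations found in BQ.1.1 but not XBB
shared = sorted(xbb & bq11)     # mutations found in both

print("XBB only:", ", ".join(only_xbb))
print("BQ.1.1 only:", ", ".join(only_bq11))
print("shared:", ", ".join(shared))
```

Different substitutions at the same site (here S:F486S vs S:F486V) show up on both "only" lists, since the comparison is on the full mutation string.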