As we head deeper into 2021, a gentle reminder to double-check your #SARSCoV2 sequencing scripts for hardcoded '2020's! 😬
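For anyone wondering what that check might look like, here's a minimal sketch. It assumes your script has the real collection date available as a Python `date`. The function names and the 270-day lag threshold are made up for illustration and don't come from any particular pipeline.

```python
from datetime import date

# The risky pattern is something like f"2020-{month:02d}-{day:02d}",
# which silently stamps every sample with the wrong year after New Year.

def format_sampling_date(collected: date) -> str:
    """Build the metadata date string from the actual collection date."""
    return collected.isoformat()

def looks_like_stale_year(sample: date, submitted: date, max_lag_days: int = 270) -> bool:
    """Flag sample dates that are in the future or suspiciously long before submission.

    max_lag_days is an arbitrary illustrative threshold, not a standard.
    """
    lag = (submitted - sample).days
    return lag < 0 or lag > max_lag_days
```

A sample dated 15 Jan 2020 but submitted in late Jan 2021 would be flagged for a second look; one dated 15 Jan 2021 would pass.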

We've seen a real uptick recently in sequences with sampling dates in Jan 2020, with divergence that clearly indicates they meant 2021! 📅🤦🏻‍♀️
"How can we tell?" you might wonder. Actually, these cases are pretty easy to spot!
Let's take a look at two sequences reported as being from Jan 2020 - 1 in green, 2 in blue. (Spoiler, green is the imposter.)
This view is 'date view' where sequences are plotted on the X-axis by the date they were sampled. Then, we try and infer how they're connected using their genetic information - specifically, what mutations they share & don't share.
But we can also view the tree in 'divergence view' - *just* looking at the mutation differences and not considering time.
Colouring by sample date, you can see that generally, recent sequences (red) have higher divergence (further to right).
In this view, the blue #2 sequence is wayyy to the left - as we'd expect from a sequence from Jan 2020.

But our friend green #1 - supposedly from the same time - is sitting in the middle, 20 muts away from #2!

(This fits a lot closer to the ~24 muts we'd expect for Jan 2021!)
We can also look at this in 'clock view' - plotting sample date (x-axis) against # of mutations (y-axis). On average, we expect sequences to lie roughly around the black line.
But you can see imposter green sitting wayy too high, far from the line.
Given it's at 20 mutations, if we trace that y-axis line to the right, it crosses the black line at around the end of 2020. (However, mutation rates are an average - there's noise, so Jan 2021 also fits.)
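The same 'distance from the line' idea can be sketched in a few lines of code. This is only a rough illustration, not how Nextstrain actually does it: it assumes a straight clock line starting at zero mutations on 1 Jan 2020, a rate of 24 mutations per year (taken from the ~24 muts expected for Jan 2021 mentioned above), and an arbitrary tolerance of 10 mutations for noise. All names here are hypothetical.

```python
from datetime import date

# Assumptions for illustration only (see the text above):
ROOT_DATE = date(2020, 1, 1)   # where the assumed clock line starts at 0 mutations
MUTS_PER_YEAR = 24.0           # assumed average rate, from "~24 muts for Jan 2021"

def expected_mutations(sample_date: date) -> float:
    """Mutations the assumed clock line predicts for a given sample date."""
    years = (sample_date - ROOT_DATE).days / 365.25
    return MUTS_PER_YEAR * years

def clock_residual(sample_date: date, observed_muts: int) -> float:
    """How far above (+) or below (-) the line a sequence sits."""
    return observed_muts - expected_mutations(sample_date)

def is_clock_outlier(sample_date: date, observed_muts: int, tolerance: float = 10.0) -> bool:
    """True if the sequence is further from the line than the chosen tolerance."""
    return abs(clock_residual(sample_date, observed_muts)) > tolerance
```

With these assumptions, a sequence with 20 mutations dated mid-Jan 2020 sits far above the line and gets flagged, while the same 20 mutations dated mid-Jan 2021 falls within the tolerance.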
These views can also illustrate what you may have heard about 501Y.V1 (B.1.1.7) & 501Y.V3 having 'long branches' or 'many mutations'.

Look at them in divergence & clock view. Even though the samples are mostly from end-2020/Jan-2021, they have 30+ mutations.
How these long branches happen and what they mean for #SARSCoV2 is much discussed by scientists, but I hope now if you hear it again, you can picture how scientists 'measure' this and why it 'stands out' so much!
And, I hope this mini-thread has helped you learn a little about how we can sometimes spot bad dates (and bad sequences!) in @Nextstrain runs - and a few things about how our analyses work, and some interesting ways you can view things on Nextstrain!
