All of our B.1.621 + B.1.621.1 (important emerging SARS-CoV-2 variant) submissions are being rejected by GISAID. This has some important implications that data producers + public health agencies should be aware of.
Technical 🧵 (1/12)
We usually have a few sequences/week that get rejected for QC reasons (e.g., an indel in a string of As or Ts). It takes a few days for these to be fixed and reposted, sometimes longer depending on our bandwidth. These rejections are usually randomly distributed, so a delay is not a problem. (2/12)
This past week we had 40+ sequences get rejected, and almost all of them were B.1.621/B.1.621.1. @JosephFauver found that all of these sequences have a 4 nt deletion in ORF3A that results in a premature stop codon ~50 nt upstream. GISAID sees these and kicks them back. (3/12)
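To illustrate why a 4 nt deletion trips a QC check, here is a minimal sketch on a made-up toy reading frame. It is not the real ORF3A sequence or its coordinates, and the helper names `first_stop_index` and `delete` are hypothetical. Because 4 is not a multiple of 3, the deletion shifts the reading frame, and a stop codon can appear earlier than the original one.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(seq):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def delete(seq, start, length):
    """Return seq with `length` nucleotides removed starting at `start`."""
    return seq[:start] + seq[start + length:]


# Toy reading frame (hypothetical): ATG GCA GTC ACT GAC CGT TTA AGG TAA
reference = "ATGGCAGTCACTGACCGTTTAAGGTAA"

# Remove 4 nt (not a multiple of 3), which shifts everything after it.
mutant = delete(reference, start=6, length=4)

ref_stop = first_stop_index(reference)
mut_stop = first_stop_index(mutant)
print(f"reference stop at nt {ref_stop}, mutant stop at nt {mut_stop}")
```

In this toy case the stop codon moves from the end of the frame to a position several codons earlier, which is the kind of truncation an automated check would flag even when the deletion is real.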
Why is this important?
Some groups may not have the bandwidth to investigate and approve rejected sequences and thus they don't get uploaded to GISAID. Other times it means that these data are delayed by several days. (4/12)
In both cases they bias the data.
- Inflates the proportions of all other lineages, including Delta, by changing the denominator
- Deflates the proportions of B.1.621 + B.1.621.1
Depending on the location, the overall impact could be quite large. (5/12)
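A minimal sketch of the denominator effect, using made-up counts (the numbers below are hypothetical, not our actual data). Dropping one lineage from the accepted set shrinks the total, so every remaining lineage looks more common than it is.

```python
def lineage_proportions(counts):
    """Convert {lineage: count} into {lineage: fraction of total}."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences to compute proportions from")
    return {lineage: n / total for lineage, n in counts.items()}


# Hypothetical weekly counts as sequenced by a lab.
submitted = {"Delta": 70, "B.1.621": 10, "Other": 20}

# Same week as it would appear in a database if every B.1.621 is rejected.
accepted = {**submitted, "B.1.621": 0}

true_props = lineage_proportions(submitted)
seen_props = lineage_proportions(accepted)

for lineage in submitted:
    print(f"{lineage}: true {true_props[lineage]:.1%}, "
          f"database {seen_props[lineage]:.1%}")
```

With these assumed counts, Delta goes from 70 of 100 to 70 of 90 sequences, and B.1.621 disappears entirely.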
We do the majority of the sequencing for Connecticut, Puerto Rico, US Virgin Islands, and the Dominican Republic. We are finding that B.1.621 + B.1.621.1 recently emerged in these regions and at times can make up ~10% of the sequenced cases. (6/12)
Yet if you go to outbreak[.]info, which pulls its data from GISAID, you won't find any B.1.621 sequences from Connecticut because we are still working to get GISAID to accept them. And if you look at the entire US, it shows that B.1.621 is <1%. (7/12)
I suspect that B.1.621 + B.1.621.1 are actually much higher than 1% in the US and are significantly under-reported across the world.
Outbreak[.]info classifies B.1.621 as a 'Variant of Interest' as it has E484K, N501Y, and P681H found in other important variants. (8/12)
Beyond missing the emergence of a potentially important variant, the rejection of B.1.621 submissions can also inflate the frequency of other variants.
For example, last week we reported that Delta in CT was <70% while the @CTDPH report said ~80%. (9/12)
The difference between our frequencies is that we use data directly produced by our lab and reported to us by our partners and the @CTDPH only uses data they find on GISAID (which includes our data). Since B.1.621 is being rejected by GISAID, their denominator is off. (10/12)
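As a rough back-of-the-envelope check, the two Delta estimates can be turned into an implied share of missing sequences. This sketch assumes the only difference between the two datasets is a block of non-Delta sequences missing from one of them, which is a simplification. The function name `implied_missing_fraction` is hypothetical, and the inputs are rounded from the "<70%" and "~80%" figures.

```python
def implied_missing_fraction(p_full, p_filtered):
    """Fraction of sequences that must be missing to explain the gap.

    Assumes the missing sequences are all non-target lineage, so that
    p_filtered = p_full / (1 - missing).
    """
    if not 0 < p_full <= p_filtered <= 1:
        raise ValueError("expected 0 < p_full <= p_filtered <= 1")
    return 1 - p_full / p_filtered


# Rounded inputs: Delta at about 70% in the full data, about 80% in the database.
missing = implied_missing_fraction(0.70, 0.80)
print(f"implied missing share: {missing:.1%}")
```

Under that assumption the gap corresponds to roughly one in eight sequences being absent from the database, which is in the same range as the ~10% B.1.621 share mentioned earlier in the thread.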
The @CTDPH is aware of this issue and we are working with GISAID to correct it. I'm telling y'all because this is likely an issue in other places too.
**Please be aware of potential biases of using only data pulls from GISAID for surveillance** (11/12)
Recap:
- GISAID is rejecting B.1.621 sequences due to a real 4 nt deletion in ORF3A
- Please try to confirm these as accurate ASAP to limit the data bias that these rejections may cause.
Hopefully GISAID will fix this rejection issue soon. (12/12)
One of the key parts of our study is that we used virus isolates (not pseudoviruses) that represent much of the genetic diversity in our region. This allowed us to examine local effects and to dive into the genetic components of neutralization (2/22)
These are the results that I want to spend some time with as there is a lot to unpack here. I know that I am a bit biased, but this is such a great figure! (3/22)
2/10 Our data combined with the CDC's indicate that Delta was ~64% by 6/28 in Connecticut and may have been as high as 80% by 7/6 (remember that sequencing data always has a bit of a lag). Also, the rise in Delta is replacing almost all other variants.
3/10 In addition to B.1.617.2, we are also seeing the sub-lineages AY.1, AY.2, and AY.3, which are all classified as Delta. Some AY.1's have K417N and some AY.2's have V70F. AY.3 is defined by mutations outside of spike. The functional differences between these are unknown.
2/9 Last week when we reported that Delta was only 2.3% I said: "This is probably more of a reflection of noisy data when trying to estimate frequencies from a small number of cases", and followed that up with an expectation that we will see Delta increase.
3/9 This week we are seeing the expected increase in Delta (B.1.617.2), but the caveat still remains that our estimates are noisy because of the low numbers of sequenced cases (a product of the low numbers of cases, which is a good problem to have)
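To put a number on "noisy", here is a minimal sketch of a Wilson score interval for a proportion. This is a standard textbook method that I am swapping in for illustration, not necessarily how our estimates are produced. The counts are hypothetical: 1 of 43 sequences is just one combination that gives about 2.3%.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical: 1 Delta sequence out of 43 sequenced cases (~2.3%).
low, high = wilson_interval(1, 43)
print(f"1/43: {low:.1%} to {high:.1%}")

# Same proportion with ten times as many sequences gives a tighter range.
low_big, high_big = wilson_interval(10, 430)
print(f"10/430: {low_big:.1%} to {high_big:.1%}")
```

With only a few dozen sequences, the plausible range for the true frequency spans more than an order of magnitude, so week-to-week swings of a few percentage points are hard to interpret.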
2/8 In Connecticut, the % of sequenced cases that are the Delta variant (B.1.617.2) decreased in recent weeks. This is probably more of a reflection of noisy data when trying to estimate frequencies from a small number of cases vs an actual decline in delta.
3/8 Looking at our neighbors in Massachusetts and New York, delta is 10-20%, so we in Connecticut are probably pretty close to that. My guess is that we'll see an increase in delta in the coming weeks to reflect the trends of our neighbors.
2/7 Gamma (P.1) and Delta (B.1.617.2) continue rising in Connecticut, while Alpha (B.1.1.7) and others decline, following national trends (see next tweet).
3/7 Data from outbreak.info shows that in the US 🇺🇸, Delta (B.1.617.2) is increasing exponentially, while Alpha (B.1.1.7) is on the decline. Despite this, COVID-19 cases are still dropping (for now).
2/6 We now have 8 cases of B.1.617.2 (5 shown in the 🌲) and 2 of B.1.617.1. To my knowledge, none of these are associated with travel. Our phylo 🌲 shows that there are at least 4 independent transmission chains of these viruses, spanning at least 3 counties. Definitely one to watch.
3/6 The drop in B.1.1.7 shown in tweet 1 is probably a combination of noisy data with few cases and the emergence of other lineages. The figure in tweet 1 is from TaqPath SGTF data, which is a week ahead of the sequencing shown here, where we don't yet see the sudden drop.