The Antarctica metagenomic samples that @jbloom_lab nicely covered have some quirks.
NCBI claims it's Illumina data.
FASTQ files have headers that look like MGISEQ data.
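
A rough way to check this yourself, as a minimal sketch. The two regexes are my assumptions about typical read-name layouts (Illumina Casava 1.8+ style colon-separated fields vs. the MGI/BGI flowcell-L-lane-C-column-R-row style), not an official spec, and `classify_header` is a hypothetical helper name. It checks every whitespace-separated token because SRA downloads often prepend their own read ID.

```python
import re

# Assumed layout, Illumina (Casava 1.8+): instrument:run:flowcell:lane:tile:x:y
ILLUMINA_RE = re.compile(r"^[\w-]+:\d+:[\w-]+:\d+:\d+:\d+:\d+$")
# Assumed layout, MGI/BGI DNBSEQ: <flowcell>L<lane>C<col>R<row><dnb id>[/1|/2]
MGI_RE = re.compile(r"^[A-Za-z]+\d+L\dC\d{3}R\d{3}_?\d+(/[12])?$")


def classify_header(header: str) -> str:
    """Guess the platform from a FASTQ header line (heuristic only)."""
    tokens = header.strip().lstrip("@").split()
    for tok in tokens:
        if ILLUMINA_RE.match(tok):
            return "illumina-like"
        if MGI_RE.match(tok):
            return "mgi-like"
    return "unknown"


if __name__ == "__main__":
    # Made-up example headers, not taken from the actual dataset.
    print(classify_header("@A00123:45:HXXXXXXXX:1:1101:1000:2000 1:N:0:ACGT"))
    print(classify_header("@V300012345L1C001R0010000001/1"))
```

A header match is only a hint. Submitters and archives can rewrite read names, so the adaptor check is the stronger evidence.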

So I decided to take a look at their adaptor sequences, as each sequencer has its own unique flow cell primers.
TrimGalore indeed confirms these are MGISEQ reads.
This paper has the MGISEQ adaptor sequences:
frontiersin.org/articles/10.33…
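
If you don't want to run a trimmer just to check, counting adaptor hits directly is a quick sanity test. A minimal sketch: the adaptor strings below are the commonly cited Illumina TruSeq prefix and MGI/BGI read 1 and read 2 adaptors as I recall them, so treat them as assumptions and verify them against the paper above. `count_adapter_hits` is a hypothetical helper, and it only does exact prefix matching, with no mismatch tolerance like a real trimmer has.

```python
from collections import Counter

# Assumed adaptor sequences; verify against the platform documentation.
ADAPTERS = {
    "illumina_truseq": "AGATCGGAAGAGC",
    "mgi_read1": "AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA",
    "mgi_read2": "AAGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG",
}


def count_adapter_hits(reads, adapters=ADAPTERS, min_len=12):
    """Count reads containing the first `min_len` bases of each adaptor."""
    hits = Counter()
    for seq in reads:
        seq = seq.upper()
        for name, adapter in adapters.items():
            if adapter[:min_len] in seq:
                hits[name] += 1
    return hits


if __name__ == "__main__":
    # Toy reads for illustration only.
    reads = [
        "ACGTACGTACGTACGTACGT" + "AAGTCGGAGGCCAAGCGGTC",
        "TTTT" + "AGATCGGAAGAGCACACG",
        "ACGTACGT",
    ]
    print(count_adapter_hits(reads))
```

On a real library you would expect one adaptor family to dominate by a wide margin in the read-through reads.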
Why does this matter? The authors posit that the contamination could be a result of the high index hopping problem seen on Illumina platforms.
MGISEQ's documented index hopping rate is orders of magnitude lower than Illumina's.
bmcgenomics.biomedcentral.com/articles/10.11…
This implies the contamination would have to occur prior to index ligation, and the SARS construct would most likely have to be DNA, not RNA, for a DNA-based metagenomic library to capture it. Has anyone looked for vector sequences in the data?
Why do we care about vector sequence? It would imply human manipulation in Dec 2019.
If index hopping is ruled out, then the contamination must have occurred earlier, and RNA molecules must eventually be turned into DNA for metagenomic libraries to capture them.
Someone had it as cDNA/vector.
I have sent the authors this thread. Their finding of cell line DNA in the metagenomic DNA is also suggestive of human manipulation prior to December 2019.
After using the MGISEQ adaptors for trimming, we get more forward reads to map, but still 2X more reverse reads. The MGISEQ has more signal on the reverse reads as there is an additional polymerase replication event.
However, the reads look very noisy on their 3' ends, a sign of dim DNBs (DNA nanoballs).
Dim DNBs usually have lower quality and are more prone to index hopping from neighboring bright DNBs.
This is the quality of the reverse reads that mapped versus all reverse reads. There is a 10Q difference. Q20 reads have 1 error every 100 bases. Q30 reads have 1 error every 1,000 bases. That 10-fold drop in quality means index hopping may still be on the table. We have 6K/55M reads mapping.
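
For anyone who wants the Q-score arithmetic spelled out, here is a minimal sketch using the standard Phred definition Q = -10 * log10(p). The function names and the Phred+33 offset default are my assumptions for illustration. The mean is taken over error probabilities rather than over raw Q values, since Q is a log scale.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Per-base error probability for a Phred score: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score for an error probability: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


def mean_quality(qual_string: str, offset: int = 33) -> float:
    """Mean quality of one FASTQ quality string (assumes Phred+33 encoding)."""
    probs = [phred_to_error_prob(ord(c) - offset) for c in qual_string]
    return error_prob_to_phred(sum(probs) / len(probs))


if __name__ == "__main__":
    p20 = phred_to_error_prob(20)  # about 1 error per 100 bases
    p30 = phred_to_error_prob(30)  # about 1 error per 1,000 bases
    print(p20, p30, p20 / p30)
    # Fraction of reads mapping, using the counts quoted in the thread.
    print(6_000 / 55_000_000)
```

A 10Q gap is a 10-fold difference in expected error rate, and 6K out of 55M reads is roughly 0.01% of the library.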
