Nick Loman · Aug 13, 2019 · 18 tweets
Today's #longreadclub episode will stream live with @koadman at 11am to ask him about @longastech, who are promising a method to generate long reads using short-read platforms, resulting in accurate, single-contig de novo assemblies!
First watch:
(1/n)
@koadman @longastech Aaron is the CSO and co-founder of Longas Technologies, and an academic bioinformatician at UTS Sydney, who has been responsible for some key bioinformatics software and algorithms, notably Mauve/progressiveMauve and Phylosift. #longreadclub
@koadman @longastech Aaron starts by introducing the two variables that most influence assembly: read length and coverage, and reiterates @torstenseemann's two laws of assembly:

#longreadclub
@koadman @longastech @torstenseemann Long read platforms (ONT, PacBio, linked reads like 10X) should solve these problems but Aaron notes that the databases are not yet filling up with complete microbial genomes relative to draft genomes - why?
#longreadclub
@koadman @longastech @torstenseemann Introducing Morphoseq: a way of getting long "virtual" reads from short-read platforms like Illumina. The basic principle is to mutagenise the sample so that repeats effectively disappear: each read gets a unique signature of mutations!
#longreadclub
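To make the "unique signature" idea concrete, here is a minimal sketch that is not Longas's actual chemistry or software. It assumes a uniform 7% substitution rate (within the 6-8% range quoted later in the thread) and lets any base change to any other, which is a simplification. The names `mutate` and `hamming` are made up for this example.

```python
import random

def mutate(seq, rate, rng):
    """Swap each base for a different random base with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

rng = random.Random(0)  # fixed seed so the toy example is repeatable

# One 1 kb repeat that occurs twice in a genome: identical before mutagenesis.
repeat = "".join(rng.choice("ACGT") for _ in range(1000))

# Each copy sits on a different template molecule and is mutated independently.
copy_1 = mutate(repeat, 0.07, rng)
copy_2 = mutate(repeat, 0.07, rng)

print("differences before mutagenesis:", hamming(repeat, repeat))
print("differences after mutagenesis: ", hamming(copy_1, copy_2))
```

Before mutagenesis the two copies are indistinguishable. Afterwards they differ at many positions, so short reads drawn from one copy can be told apart from reads drawn from the other.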
@koadman @longastech @torstenseemann Process: tagment the sample to make long fragments (10kb), perform mutagenesis by incorporating nucleotide analogues like dPTP by PCR, add sample barcodes, then replace dPTP with (random) natural nucleotides, size select and then perform enrichment PCR.
@koadman @longastech @torstenseemann Bioinformatics process requires unmutated data also: make short-read assembly graph, then map the mutated reads onto the assembly graph. Follow the breadcrumbs of mutated reads to give you a unique path through repeats! Nifty.
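A toy sketch of "following the breadcrumbs" through a single repeat node follows. It is an illustration under stated assumptions, not the Morphoseq pipeline. It assumes the mutated reads have already been mapped to the graph, and that for each flanking node we know which positions inside the repeat were seen mutated in reads spanning that flank. The names `jaccard` and `pair_flanks`, the node labels and the positions are all hypothetical.

```python
def jaccard(a, b):
    """Similarity of two mutation signatures (sets of mutated positions)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def pair_flanks(entry_sigs, exit_sigs):
    """Link each entry flank to the exit flank whose repeat signature matches best."""
    return {
        entry: max(exit_sigs, key=lambda name: jaccard(sig, exit_sigs[name]))
        for entry, sig in entry_sigs.items()
    }

# Repeat node R has two ways in (A, C) and two ways out (B, D).
# Each set lists the positions within R seen mutated in reads spanning that flank.
entry_sigs = {"A": {3, 17, 42}, "C": {8, 25, 51}}
exit_sigs = {"B": {17, 42, 60}, "D": {25, 51, 70}}

paths = pair_flanks(entry_sigs, exit_sigs)
print(paths)  # {'A': 'B', 'C': 'D'}
```

Reads entering from A share mutated positions with reads leaving via B, so the path A-R-B is chosen over A-R-D. A real implementation would have to cope with sequencing error, partial overlaps and many templates per repeat.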
@koadman @longastech @torstenseemann Evaluated using 60 microbial genomes available from BEI with fairly low yield and quality, as well as 3 genomes with varying GC contents that they also generated nanopore data for:
#longreadclub
@koadman @longastech @torstenseemann What does the data look like?
Lengths look good, and the induced mutation rate is around 6-8% regardless of GC content:

#longreadclub
@koadman @longastech @torstenseemann When you inspect the raw reads you can see the mutated sequences (plus sequencing error). Then these reads can be reconstructed into long mutated reads:
#longreadclub
@koadman @longastech @torstenseemann So can these long reads help with assembly? Using the long reads and short reads in @rrwick's Unicycler pipeline, they were able to reconstruct a low GC Arcobacter organism into a single contig! #longreadclub
@koadman @longastech @torstenseemann @rrwick When applied to the BEI data many of the genomes are coming out as circular contigs or at least improved: but room for gains in the software. #longreadclub
@koadman @longastech @torstenseemann @rrwick Accuracy matters: a 90% accuracy read != a 99% accuracy read for de novo assembly. Shows chart from Jain et al. 2018 showing effect of accuracy on NG50 (nature.com/articles/nbt.4…), recently reused in the PacBio HiFi read paper out today: nature.com/articles/s4158… #longreadclub
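One rough way to see why 90% and 99% are so different is to ask how often a k-mer is read without any error. This is a back-of-envelope sketch, not the analysis from either cited paper. It assumes errors are independent and uniform along the read, and k = 21 is an arbitrary choice for illustration. The function name `error_free_kmer_prob` is hypothetical.

```python
def error_free_kmer_prob(accuracy, k):
    """Probability that k consecutive bases are all correct, assuming independent errors."""
    return accuracy ** k

K = 21  # illustrative k-mer size, not a value taken from the talk

for accuracy in (0.90, 0.99):
    p = error_free_kmer_prob(accuracy, K)
    print(f"per-base accuracy {accuracy:.2f}: P(error-free {K}-mer) = {p:.3f}")
```

Under these assumptions only about one 21-mer in nine is clean at 90% accuracy, versus roughly four in five at 99%, which is part of why the more accurate reads are so much easier to overlap and assemble.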
@koadman @longastech @torstenseemann @rrwick Unlike linked reads, Morphoseq can resolve through complex local repeats (VNTRs, microsatellites, etc.). It is also better at resolving gene calls (using @BioMickWatson's broken gene predictor) than nanopore-only assemblies:
#longreadclub
@koadman @longastech @torstenseemann @rrwick @BioMickWatson Looking at cost: if you try to generate 135x short-read data and 15x long you can theoretically assemble a complete E. coli genome for the price of an extra value meal at McDonald's on NovaSeq S4 (at least with respect to sequencing cost)

#longreadclub
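For scale, the coverage targets above translate into sequenced bases as coverage times genome size. This sketch assumes an E. coli genome of roughly 4.6 Mb. It deliberately leaves out any price per base, since the thread does not give one. The function name `bases_needed` is hypothetical.

```python
E_COLI_GENOME_BP = 4_600_000  # assumption: approximate E. coli genome size

def bases_needed(coverage, genome_bp):
    """Sequenced bases required to reach a given average depth of coverage."""
    return coverage * genome_bp

short_bp = bases_needed(135, E_COLI_GENOME_BP)  # unmutated short-read data
long_bp = bases_needed(15, E_COLI_GENOME_BP)    # long "virtual" read data

print(f"short reads: {short_bp / 1e6:.0f} Mb")
print(f"long reads:  {long_bp / 1e6:.0f} Mb")
print(f"total:       {(short_bp + long_bp) / 1e6:.0f} Mb")
```

The total is well under a gigabase per genome, which is a tiny fraction of a high-output flow cell. That is the intuition behind the fast-food price comparison.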
@koadman @longastech @torstenseemann @rrwick @BioMickWatson Summing up here:

Data available, looking forward to the bioRxiv preprint!
#longreadclub
@koadman @longastech @torstenseemann @rrwick @BioMickWatson Thanks @koadman! We'll be sitting down to chat with him in just under 2 hours (11am UK time). If you have questions you'd like to ask him, pop them on the end of this thread, or on the YouTube video, or in the comments of the live Q&A stream when we post the link later.
@koadman @longastech @torstenseemann @rrwick @BioMickWatson Go check out the archived Q&A with @koadman at:
