Excited to announce a new preprint! We did a study comparing two different @nanopore library prep approaches (ligation and rapid) for bacterial genomes with small plasmids: biorxiv.org/content/10.110…
(1/11)
I really like this paper because it has a clear conclusion simple enough to fit in a tweet: rapid preps are better than ligation preps at recovering small plasmids.
(2/11)
Figure 1 gives a simplified illustration of why we think this is the case: due to their size, small circular plasmids can avoid fragmentation during DNA extraction, leaving no ends for adapter ligation. Rapid preps, in contrast, don't depend on DNA ends.
(3/11)
Figure 2 shows our main results: using Illumina reads to approximate true plasmid abundance, the smaller a plasmid is, the more underrepresented it will likely be in a ligation read set. But not in a rapid read set!
(4/11)
I like the plots in Figure 2 because the effect is obvious - I didn't even think it was necessary to do statistics (xkcd.com/2400). But I reluctantly did do linear regressions 😄
(5/11)
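For anyone who wants to try this kind of comparison on their own read sets, here is a minimal sketch of the idea, not the paper's actual analysis code. It assumes you already have each plasmid's read depth relative to the chromosome in both the Nanopore and Illumina read sets. The function names and the toy numbers are made up for illustration and are not results from the preprint.

```python
import math

def log2_representation(nanopore_rel_depth, illumina_rel_depth):
    """log2 ratio of a plasmid's Nanopore depth to its Illumina depth.

    Both depths are relative to the chromosome. Illumina is treated as the
    approximation of true abundance, so a negative value means the plasmid
    is underrepresented in the Nanopore read set.
    """
    return math.log2(nanopore_rel_depth / illumina_rel_depth)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up toy values (NOT data from the preprint):
# (plasmid size in bp, Nanopore relative depth, Illumina relative depth)
toy_plasmids = [
    (2_000, 0.5, 8.0),
    (5_000, 2.0, 8.0),
    (50_000, 2.0, 2.0),
    (150_000, 1.0, 1.0),
]

xs = [math.log10(size) for size, _, _ in toy_plasmids]
ys = [log2_representation(nano, ill) for _, nano, ill in toy_plasmids]
slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f} log2 units per 10-fold increase in plasmid size")
```

A positive slope here means representation drops as plasmids get smaller, which is the pattern described above for ligation read sets. A slope near zero would correspond to the flat pattern described for rapid read sets.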
We also found another interesting difference between ligation and rapid: there were fewer chimeric reads in the rapid sets. This makes sense because rapid preps don't involve ligase, so there is less chance of combining two DNA fragments together.
(6/11)
So the advantages to using ligation preps are: better yield, more versatility and greater multiplexing of samples. For these reasons, it's what the @DrKatHolt lab mostly uses for Nanopore sequencing of bacterial isolates.
(7/11)
The advantages to using rapid preps are: simpler/faster procedure, potential for longer reads (if you're careful with DNA extraction), better representation of small plasmids and fewer chimeras.
(8/11)
The main takeaway of the paper is this: if you're doing Nanopore-only sequencing of a bacterial isolate and small plasmids matter to you, we recommend rapid preps! If you use a ligation prep, you might miss the small plasmids.
(9/11)
If you're doing hybrid (Nanopore+Illumina) sequencing, then ligation should be fine because the small plasmids will be captured by the Illumina reads. But expect the small plasmids to be underrepresented in your Nanopore reads.
(10/11)
Many thanks to the co-authors (@JuddLmj, @KelWyres and @DrKatHolt) and all the other members of the Holt Lab. And more generally, thanks to all the researchers out there who help make @nanopore bacterial genomics so good!
(11/11)
We've once again updated our paper benchmarking long-read assemblers for bacterial genomes! Take a look at the fresh results here: f1000research.com/articles/8-2138
Updates since the last version include...
(1/9)
New versions of some assemblers: Canu v2.0, Flye v2.8, Raven v1.1.10 and Shasta v0.5.1. My favourite change here is that Flye no longer requires a genome size parameter.
(2/9)
I've also added a new assembler to the comparison: NextPolish/NextDenovo. It performed well on chromosomes but not on plasmids, and it was more cumbersome to run than the other tools.
(3/9)
Trycycler is for generating a consensus long-read assembly of a bacterial genome.
(1/9)
I.e. you give Trycycler multiple different long-read assemblies of the same genome, and it produces a single consensus assembly that is better than any of the inputs.
(2/9)
In doing so, Trycycler can repair most of the problems that hide in long-read assemblies. These include: 1) missing/spurious contigs, 2) bad circularisation and 3) glitchy sequence regions.
New paper for the new year! It compares different long-read assemblers for microbial genome assembly: f1000research.com/articles/8-213…
Two Twitter threads follow - one about the paper itself and one about my experience with @F1000Research.
(1/n)
In this paper, we did a ton of long-read microbial genome assemblies (using both real and simulated long-read sets) to see how the current assemblers perform.
(2/5)
I won't get into detailed results here, but very briefly: Flye, Raven and Miniasm/Minipolish were our favourites, each excelling in particular ways 🏆
(3/5)
Minipolish does Racon-polishing on a miniasm long-read assembly. Why not just use Racon directly? For a few reasons...
(1/6)
1. Minipolish keeps the assembly in graph form (GFA format) whereas Racon produces FASTA sequences.
2. Racon has a nasty habit of sometimes truncating sequences a little bit when it polishes them - Minipolish will repair this.
(2/6)
3. Minipolish 'rotates' circular contigs (like in bacterial genomes) between polishing rounds. This ensures that final polished contigs circularise cleanly (no missing or overlapping bases).
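To make the 'rotation' in point 3 concrete, here is a minimal sketch of what rotating a circular contig means. This is an illustration only, not Minipolish's actual code: the function names are hypothetical, and it works on a plain string for a single strand. The idea is that moving the start position to a different point in the circle puts the old start/end junction in the middle of the sequence, where a later polishing round can presumably treat it like any other region.

```python
def rotate_circular(seq, offset):
    """Return the same circular sequence starting `offset` bases later."""
    if not seq:
        return seq
    offset %= len(seq)  # wrap around for offsets beyond the contig length
    return seq[offset:] + seq[:offset]

def same_circle(a, b):
    """True if b is a rotation of a (same strand only, no reverse complement)."""
    return len(a) == len(b) and b in a + a

contig = "ACGTTG"
rotated = rotate_circular(contig, 2)
print(rotated)                       # GTTGAC
print(same_circle(contig, rotated))  # True
```

No bases are gained or lost by rotating, so the contig still represents the same circular molecule. Only the arbitrary start point changes.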
Nothing too fancy, but if you work with large numbers of bacterial genomes, read on... (1/9)
I occasionally encounter a situation where I have lots of genome assemblies, but some are quite redundant. Often because the set contains outbreaks of near-identical genomes. (2/9)
For example, if you download all Klebsiella pneumoniae genomes, you'll get many thousands, but common disease-causing lineages (like CG258) are heavily represented. (3/9)