We've once again updated our paper benchmarking long-read assemblers for bacterial genomes! Take a look at the fresh results here:
f1000research.com/articles/8-2138
Updates since the last version include...
(1/9)
New versions of some assemblers: Canu v2.0, Flye v2.8, Raven v1.1.10 and Shasta v0.5.1. My favourite change here is that Flye no longer requires a genome size parameter.
(2/9)
I've also added a new assembler to the comparison: NextPolish/NextDenovo. It performed well on chromosomes but not on plasmids, and it was more cumbersome to run than the other tools.
(3/9)
I've moved some supplementary figures into the main text. Most interesting to me is panel E, which shows the maximum indel error size in assemblies.
(4/9)
This shows that all assemblers can sometimes make very large errors: hundreds or even thousands of bases in size! Flye performed best in this regard, often keeping its errors under 10 bp, but it wasn't totally immune to the problem.
(5/9)
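For anyone wondering how a "maximum indel size" number could be measured, here's a minimal sketch. It assumes you already have a CIGAR string from aligning an assembly to its reference. The function name and the example string are made up for illustration, and this isn't necessarily how the paper computed the metric.

```python
import re

# One CIGAR operation: a run length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def max_indel_size(cigar: str) -> int:
    """Return the length of the largest insertion (I) or deletion (D) run.

    Hypothetical helper for illustration only. Returns 0 when the
    alignment contains no indels.
    """
    sizes = [int(length) for length, op in CIGAR_OP.findall(cigar) if op in "ID"]
    return max(sizes, default=0)


# Invented example: a 3 bp insertion and a 1250 bp deletion.
example_cigar = "1500=3I2000=1250D800=1X40="
print(max_indel_size(example_cigar))  # 1250
```

A single big deletion like the one above barely moves overall sequence identity, which is why looking at the maximum indel size separately is useful.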
This issue (big indel errors in assemblies) was one of the main reasons I created Trycycler. It makes a consensus assembly from multiple input assemblies and can therefore avoid large-scale errors such as these.
github.com/rrwick/Trycycl…
(6/9)
My main recommendations have not really changed. Favourites are still Flye (for overall quality), Miniasm/Minipolish (for clean circularisation) and Raven (for speed and reliability).
(7/9)
If you twisted my arm for a single recommendation, I'd have to pick Flye. It does well in most metrics and I really like that it makes fewer big indel errors. Nice work, @fenderglass!
(8/9)
That's all for now! Thanks again to @F1000Research for facilitating these updates so the benchmark can stay up-to-date. Canu v2.1 and NECAT v20200803 are out, so I'll get started on the next version 😀
(9/9)
