Robyn Wright
Apr 28 · 11 tweets · 3 min read
Q: Which is better for taxonomic classification of #metagenomics samples - Kraken2 or MetaPhlAn 3?

A: It really depends!

Read the very short story in this thread, or the full story in my preprint w/ @BetaScience and André Comeau: bit.ly/3EWkYJf

1/
Now, you may be thinking "but aren't there loads of studies that compare different metagenomic taxonomic classifiers already?" (& you would be right), but what they don't do is compare the impact of different parameters and reference databases on the classifications.

2/
What started about two years ago as a quick test of which tool/parameters we should use to classify some samples took on a whole life of its own, and I'm really pleased it's finally out there. If you use Kraken or MetaPhlAn, then I think some of this will be useful for you.

3/
We noticed that running MetaPhlAn on metagenome [MG] samples might only identify 10 species, or even none at all, while running Kraken on the same samples might identify up to tens of thousands of species.

4/
To investigate this systematically, we collected ~400 previously made simulated/mock MG samples with known compositions and classified them with both Kraken2 and MetaPhlAn 3. We made several different Kraken2 databases (all available on Dropbox)...

5/
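Because the simulated/mock samples have known compositions, each taxonomic profile can be scored against the truth. The thread doesn't say which metrics the preprint uses, so the sketch below assumes simple presence/absence precision, recall and F1; the function name `score_profile` and the taxon names are hypothetical.

```python
def score_profile(predicted, truth):
    """Score a predicted set of taxa against the known composition.

    Assumption: presence/absence scoring only (abundances are ignored).
    Returns (precision, recall, f1).
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)

    # Precision: fraction of reported taxa that are really in the sample.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of taxa in the sample that were reported.
    recall = true_positives / len(truth) if truth else 0.0
    # F1: harmonic mean of the two, defined as 0 when both are 0.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 3 taxa reported, 4 actually present, 2 shared.
precision, recall, f1 = score_profile(
    {"Escherichia coli", "Bacillus subtilis", "Homo sapiens"},
    {"Escherichia coli", "Bacillus subtilis",
     "Listeria monocytogenes", "Salmonella enterica"},
)
```

A profile with tens of thousands of reported species on a small mock community would show up here as high recall but very low precision, and an empty profile as zero on everything.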
And we tested out almost all of the tool-parameter-database [DB] combinations that we could (and took a really deep dive into Kraken2's confidence threshold), giving >60,000 taxonomic profiles for us to analyse.

6/
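A confidence-threshold sweep like this could be scripted roughly as below. This is only a sketch, not the pipeline from the preprint: the file names, database path and threshold values are placeholders, and the function `kraken2_commands` is hypothetical. It only builds the command lines (using Kraken2's `--db`, `--threads`, `--confidence`, `--report` and `--output` options) so you can inspect them before running anything.

```python
from pathlib import Path


def kraken2_commands(reads, db, thresholds, threads=4):
    """Build one Kraken2 command per confidence threshold.

    Assumptions: single-end reads in one file, and output names derived
    from the reads file name (both are illustrative choices).
    """
    sample = Path(reads).stem
    commands = []
    for confidence in thresholds:
        tag = f"{sample}_conf{confidence:.2f}"
        commands.append([
            "kraken2",
            "--db", str(db),
            "--threads", str(threads),
            "--confidence", f"{confidence:.2f}",
            "--report", f"{tag}.kreport",   # per-taxon summary report
            "--output", f"{tag}.kraken",    # per-read classifications
            str(reads),
        ])
    return commands


# Placeholder inputs; 0.0 is Kraken2's default confidence threshold.
for command in kraken2_commands("sample.fastq", "kraken2_db", [0.0, 0.25, 0.5]):
    print(" ".join(command))
```

Each command list can be passed to `subprocess.run(command, check=True)` once Kraken2 is installed and the database path is real.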
We show that if you run Kraken2 with the default parameters & DB then it really performs quite badly. But, by making changes to the parameters (the confidence threshold) & using a complete DB, it outperforms MetaPhlAn.

7/
The best tool-parameter-DB combination also depends on sample characteristics (we provide some guidance on how to choose this for your samples), but in general, we were able to achieve higher performance with Kraken than MetaPhlAn.

8/
We do want to highlight, though, that MetaPhlAn is *really* quick (& easy) to install & run (& its computational requirements are very low). It also performs pretty well out-of-the-box on these samples (i.e. no optimisation required).

9/
So there isn't really a one-size-fits-all "best" option (particularly when also considering available computational resources), but if you've just been running Kraken with the default DB & parameters, then you can likely achieve more accurate classification than you have been getting.

10/
And finally, this is just a preprint at the moment so we'd welcome any feedback you may have on it 🙂

11/11
