Flanker is a reference-free tool to facilitate analysis of flanking regions around genes. It extracts sequences around genes of interest, then clusters them based on Mash distance thresholds (lots of tunable options to customise this).
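To make "extracts sequences around genes of interest" concrete, here is a rough sketch of the idea. It is not Flanker's actual code: the function name `extract_flanks`, the 0-based half-open coordinates and the toy contig are all assumptions made for illustration.

```python
def extract_flanks(contig, gene_start, gene_end, window):
    """Return (upstream, downstream) flanks of up to `window` bases.

    Hypothetical helper, not part of Flanker. Coordinates are 0-based and
    half-open; flanks are truncated where the contig runs out.
    """
    if not (0 <= gene_start < gene_end <= len(contig)):
        raise ValueError("gene coordinates fall outside the contig")
    upstream = contig[max(0, gene_start - window):gene_start]
    downstream = contig[gene_end:gene_end + window]
    return upstream, downstream


# Toy contig: the "gene" is the CCCC block at positions 4..8.
contig = "AAAACCCCGGGGTTTT"
up, down = extract_flanks(contig, 4, 8, 3)
print(up, down)  # AAA GGG
```

In a real run the gene coordinates would come from an annotation step rather than being typed in by hand.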
There are lots of great tools for studying mobile genetic elements, like TETyper, MEFinder and Galileo, but these all rely on reference databases and specialist knowledge.
Flanker allows users to identify clusters of variation (which we term flank patterns in the paper) with no prior knowledge. Several people have done similar things with hacky custom scripts, but there's no easy off-the-shelf tool.
Using blaKPC- and blaOXA-containing plasmids as examples, we show that flank patterns have epidemiological relevance (e.g. greater carbapenem resistance and geographical restriction).
When you plot these clusters of variation in what we call a "Flankergram" (figures 2/3 - also see binder on github) it becomes easy to spot edges of transposons for example. Flanker should help to boil large datasets down into a few groups to focus on with existing tools.
Flanker knits together several great existing tools - Abricate, Mash, Biopython and NetworkX and the manuscript reuses data from Dutch CPE and EUSCAPE studies - thanks to all for #OpenScience
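For a feel of how pairwise distances and a graph fit together, here is a plain-Python sketch of threshold clustering: link any two flanks whose distance is at or below a cutoff, then take connected components. It is an illustration under assumptions, not Flanker's implementation; the function name, the sample labels and the distance values are made up, and a graph library could replace the hand-written traversal.

```python
from collections import defaultdict


def cluster_by_threshold(distances, threshold):
    """Group items into connected components.

    `distances` maps (item_a, item_b) pairs to a distance; two items are
    linked when their distance is <= threshold. Hypothetical helper.
    """
    neighbours = defaultdict(set)
    items = set()
    for (a, b), d in distances.items():
        items.update((a, b))
        if d <= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for item in sorted(items):
        if item in seen:
            continue
        # Depth-first walk over everything reachable from this item.
        stack, component = [item], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Made-up distances between four flanks.
distances = {
    ("p1", "p2"): 0.001, ("p2", "p3"): 0.002, ("p1", "p3"): 0.05,
    ("p1", "p4"): 0.2, ("p2", "p4"): 0.2, ("p3", "p4"): 0.2,
}
print(cluster_by_threshold(distances, 0.01))  # [['p1', 'p2', 'p3'], ['p4']]
```

Note that p1 and p3 end up together even though their own distance is above the cutoff, because both are close to p2. That chaining is a property of this single-linkage style of grouping, and one reason the threshold is worth tuning.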
As a bonus we have provided an R script demonstrating how we parse Prokka genbank files into gggenes plots (like in the figures) - not rocket science but might be useful for some people - see binder repo on GitHub.
Also huge thanks to @beconstant for lots of help with the script including packaging for #bioconda