Benchmarking the MinION: Evaluating long reads for microbial profiling

Nanopore based DNA-sequencing delivers long reads, thereby simplifying the decipherment of bacterial communities. Since its commercial appearance, this technology has been assigned several attributes, such as its error proneness, comparatively low cost, ease-of-use, and, most notably, aforementioned long reads. The technology as a whole is under continued development. As such, benchmarks are required to conceive, test and improve analysis protocols, including those related to the understanding of the composition of microbial communities.

Here we present a dataset composed of twelve different prokaryotic species split into four samples differing by nucleic acid quantification technique to assess the specificity and sensitivity of the MinION nanopore sequencer in a blind study design.

Taxonomic classification was performed by standard taxonomic sequence classification tools, namely Kraken, Kraken2 and Centrifuge directly on reads.

This allowed taxonomic assignments of up to 99.27% on genus level and 92.78% on species level, enabling true-positive classification of strains down to 25,000 genomes per sample. Full genomic coverage is achieved for strains abundant as low as 250,000 genomes per sample under our experimental settings.

In summary, we present an evaluation of nanopore sequence processing analysis with respect to microbial community composition. It provides an open protocol and the data may serve as basis for the development and benchmarking of future data processing pipelines.

Authors: Robert Maximilian Leidenfrost, Dierk-Christoph Pöther, Udo Jäckel, Röbbe Wünschiers