NanoOK: Multi-reference alignment analysis of nanopore sequencing data, quality and error profiles

Motivation: The Oxford Nanopore MinION sequencer, currently in pre-release testing through the MinION Access Programme (MAP), promises long reads in real time from a cheap, compact USB device. Tools have been released to extract FASTA/Q from the MinION base calling output and to provide basic yield statistics. However, no single tool yet exists to provide comprehensive alignment-based quality control and error profile analysis, which is extremely important given the speed with which the platform is evolving.

Results: NanoOK generates detailed tabular and graphical output plus an in-depth multi-page PDF report including error profile, quality and yield data. NanoOK is multi-reference, enabling detailed analysis of metagenomic or multiplexed samples. It supports four popular nanopore aligners and is easily extensible to include others.

Availability and implementation: NanoOK is open-source software, implemented in Java with supporting R scripts. It has been tested on Linux and Mac OS X and can be downloaded from: https://github.com/TGAC/NanoOK. A VirtualBox VM containing all dependencies and the DH10B read set used in the paper is available from http://opendata.tgac.ac.uk/nanook/. A Docker image is also available from Docker Hub – see program documentation.

Supplementary Information: Program documentation is available at: https://documentation.tgac.ac.uk/display/NANOOK. The complete E. coli report referred to below is provided as supplementary data.

Authors: Richard M. Leggett, Darren Heavens, Mario Caccamo, Matthew D. Clark, Robert P. Davey