MINTyper: A method for generating phylogenetic distance matrices with long read sequencing data

In this paper we present a complete pipeline for generating a phylogenetic distance matrix from a set of sequencing reads. Importantly, the program is able to handle a mix of both short reads from the Illumina sequencing platforms and long reads from Oxford Nanopore Technologies’ (ONT) platforms as input.

By combining automated reference identification, KMA alignment, optional methylation masking, recombination SNP pruning and pairwise distance calculations, we built a complete pipeline for rapidly and accurately calculating the phylogenetic distances between a set of sequenced isolates with a presumed epidemiological relation.
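The final step, the pairwise distance calculation, can be illustrated with a minimal sketch. It is not MINTyper's implementation: the function name `pairwise_snp_distances`, the input format (a dictionary of equal-length consensus sequences aligned to the same reference) and the rule that a position is excluded as soon as any isolate lacks a confident A, C, G or T call are all assumptions made for illustration.

```python
from itertools import combinations

VALID_BASES = {"A", "C", "G", "T"}


def pairwise_snp_distances(seqs):
    """Count SNP differences between equal-length consensus sequences.

    seqs maps an isolate name to its consensus sequence (hypothetical input
    format). Positions where any isolate has a non-ACGT call (for example N)
    are excluded for all pairs; this exclusion rule is an assumption.
    Returns (distance matrix as nested dicts, number of excluded positions).
    """
    names = sorted(seqs)
    length = len(seqs[names[0]])
    if any(len(seqs[n]) != length for n in names):
        raise ValueError("all consensus sequences must have the same length")

    # Keep only positions with a confident call in every isolate.
    kept = [
        i for i in range(length)
        if all(seqs[n][i].upper() in VALID_BASES for n in names)
    ]

    dist = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = sum(1 for i in kept if seqs[a][i].upper() != seqs[b][i].upper())
        dist[a][b] = dist[b][a] = d
    return dist, length - len(kept)


# Toy input, not real data: position 3 of iso3 is an uncertain call (N).
example = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACNTTCGA",
}
matrix, excluded = pairwise_snp_distances(example)
```

In this toy input one position is excluded because of the uncertain call in `iso3`, which mirrors how lower-quality reads can reduce the number of positions that contribute to the distance matrix.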

The pipeline supports both high-accuracy base-called MinION reads (hac m Q10) and rapidly generated lower-quality reads (fast Q8). With correct input parameters, the phylogenetic output obtained from the different qualities of ONT data was nearly identical; however, a higher number of base pairs was excluded from the calculated distance matrix when fast Q8 reads were used.

Authors: Malte B. Hallgren, Søren Overballe-Petersen, Henrik Hasman, Ole Lund, Philip T. L. C. Clausen