
Critical assessment of bioinformatics methods for the characterization of pathological repeat expansions with single-molecule sequencing data


Over the last 5 years, a number of studies have reported the successful application of single-molecule sequencing technologies to determining the size and sequence of pathological expanded microsatellite repeats. However, each study employed a different custom bioinformatics pipeline, preventing meaningful comparisons and somewhat limiting the reproducibility of the results.

In this review, we provide a brief summary of state-of-the-art methods for the characterization of expanded repeat alleles, along with a detailed comparison of bioinformatics tools for the determination of repeat length and sequence, using both real and simulated data.

Our reanalysis of publicly available human genome sequencing data suggests a modest, but statistically significant, increase in the error rate of single-molecule sequencing technologies at genomic regions containing short tandem repeats. However, all the methods tested here, whether based on the alignment or the assembly of the reads, show high sensitivity in both detecting expanded tandem repeats and estimating the expansion size. This suggests that approaches based on single-molecule sequencing technologies are highly effective for the detection and quantification of tandem repeat expansions and contractions.

Authors: Matteo Chiara, Federico Zambelli, Ernesto Picardi, David S Horner, Graziano Pesole
