Genome-wide survey of tandem repeats by nanopore sequencing shows that disease-associated repeats are more polymorphic in the general population

Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms. However, it is difficult to predict which tandem repeats may cause disease. We performed a genome-wide survey of the millions of human tandem repeats using long-read genome sequencing data from 16 individuals. We found that known Mendelian disease-causing or disease-associated repeats, especially coding CAG and 5' UTR GGC repeats, are relatively long and polymorphic in the general population. This method, especially if used in GWAS, may point to new candidate pathogenic or biologically important tandem repeats in human genomes.

Authors: Satomi Mitsuhashi, Martin C Frith, Naomichi Matsumoto