London Calling 2023: Pathological short tandem repeats analysis by long-read sequencing in affected individuals


Short tandem repeats (STRs) are repetitive DNA sequences found throughout the genome that can expand and contract in length. Pathological expansions cause various genetic diseases, including Huntington's disease, Fragile X syndrome, and myotonic dystrophy. Traditional short-read sequencing technologies are limited in detecting and accurately characterizing STRs because of the repeats' inherent instability and repetitive nature. Long-read sequencing (LRS), on the other hand, offers a potential solution for accurate and comprehensive STR analysis.

Nine affected individuals with known pathogenic expansions at nine different loci were sequenced on PromethION to obtain long-read whole-genome sequencing (LR-WGS) data. STR genotyping was performed with the straglr software as bundled in the Oxford Nanopore Technologies wf-human-variation workflow, run with default parameters and with the `--str` flag to enable STR genotyping.

Straglr provided repeat unit counts from the phased reads that span the collection of pre-annotated disease-associated human STRs, and it detected all known repeat expansions. LR-WGS yielded allele sizes nearly identical to those known from repeat-primed PCR, and it provided sizing information for large expansions whose size was previously unknown.
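To illustrate how per-read repeat unit counts can be turned into allele sizes and expansion calls, here is a minimal Python sketch. It is not the workflow's code and does not parse straglr's actual output format: the record layout, the allele labels, the function names, and the read values are all hypothetical and made up for illustration. The HTT cutoff of 40 CAG units is the commonly cited full-penetrance threshold and is included only as an example.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-read records: (locus, allele label, repeat-unit copies in the read).
# These values are illustrative only and are NOT results from the study.
reads = [
    ("HTT", "a1", 17.0), ("HTT", "a1", 18.0), ("HTT", "a1", 17.0),
    ("HTT", "a2", 44.0), ("HTT", "a2", 46.0), ("HTT", "a2", 45.0),
]

# Illustrative threshold in repeat units (assumption: >= 40 CAG units at HTT
# is treated as a pathogenic expansion).
PATHOGENIC_MIN = {"HTT": 40}


def summarize(records):
    """Return the median repeat-unit count per (locus, allele)."""
    grouped = defaultdict(list)
    for locus, allele, copies in records:
        grouped[(locus, allele)].append(copies)
    return {key: median(values) for key, values in grouped.items()}


def flag_expansions(allele_sizes, thresholds):
    """Mark each allele whose size meets or exceeds its locus threshold."""
    return {
        (locus, allele): size >= thresholds[locus]
        for (locus, allele), size in allele_sizes.items()
    }


sizes = summarize(reads)
flags = flag_expansions(sizes, PATHOGENIC_MIN)

for key in sorted(sizes):
    print(key, sizes[key], "expanded" if flags[key] else "normal range")
```

The median is used here because it is robust to the occasional read with an outlying count; a real analysis would rely on the genotypes reported by the tool itself.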

As the cost of genome sequencing continues to drop and new technologies emerge that can detect all types of genetic variation in one test, we believe it is likely that clinical STR detection will eventually shift towards LR-WGS. This is further supported by the ability to analyze DNA methylation from Oxford Nanopore data, which is relevant to several pathological STRs.

Before LR-WGS is implemented for clinical STR detection, further validation will be necessary, including testing multiple affected individuals for each genetic locus.

Authors: Hagar Mor-Shaked