Repeat expansions

Tandem repeats (TRs) are a type of structural variant, and TR expansions are genetic aberrations associated with a variety of neurological diseases. Studying these diseases requires accurate sizing of the expanded repeat, which has proven challenging with traditional analysis methods such as short-read sequencing. Long nanopore reads can span repeat expansions end to end, without the need for PCR, enabling unambiguous size determination.

  • Sequence repeat expansions end to end, accurately determining repeat length and nucleotide composition
  • Explore the epigenetic profile of repeat expansions and eliminate bias with direct sequencing
  • Scale to your requirements – from targeted Cas9 enrichment of known loci to a whole genome survey of repeat expansions

What are repeat expansions?

Tandem repeats (TRs) are contiguously repeated units of DNA, classed by unit length as short tandem repeats (STRs, 1-6 bp) or variable number tandem repeats (VNTRs, > 6 bp) (see Figure 1 and Table 1). TRs occupy approximately 3% of the human genome, with around 500,000 mapped. Moreover, TRs are highly mutable, with a notable propensity to expand, and it is these expanded repeats that are implicated in neurological diseases. Yet, despite their well-documented contribution to genetic variation, TRs remain poorly understood, and their impact on phenotype and disease is likely underestimated. This is largely due to the repetitive nature, high GC content, and length of these expansions, which have rendered them refractory to PCR amplification, a typical step prior to short-read sequencing. In addition, repeat expansions regularly exceed 10 kb in length, so many cannot be spanned by short reads, which is a major challenge for their accurate computational resolution. For these reasons, most technologies cannot resolve repeat expansions at base level.
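The unit-length distinction described above can be illustrated with a minimal Python sketch. The function names `classify_repeat_unit` and `longest_tandem_run` are hypothetical and used only for illustration; the sketch assumes error-free sequence and exact motif matches, which real sequencing data will not always provide.

```python
import re


def classify_repeat_unit(unit: str) -> str:
    """Classify a repeat unit by length: STR (1-6 bp) or VNTR (> 6 bp)."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    return "STR" if len(unit) <= 6 else "VNTR"


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the copy number of the longest uninterrupted run of `unit`.

    Illustrative assumption: copies must match the unit exactly, so any
    interruption motif splits a run in two.
    """
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


if __name__ == "__main__":
    print(classify_repeat_unit("CGG"))                      # 3 bp unit
    print(longest_tandem_run("AACGGCGGCGGTTCGG", "CGG"))    # copies in longest run
```

Because only exact copies are counted, this toy approach would undercount a repeat that contains interruptions or sequencing errors; it is meant to make the definitions concrete, not to serve as a genotyping method.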

Figure 1: Expansion of the CGG trinucleotide repeat in the 5' UTR of the FMR1 gene underlies the pathogenesis of Fragile X syndrome.

Table 1: Tandem repeats can be subdivided into two categories - STRs and VNTRs, which differ in terms of their repeat unit length. (From: De Roeck et al. 2019.)

Figure 2: Methylation of intron 1 at the FXN repeat locus in carrier parents and their affected child.

Comprehensive repeat expansion analysis, including direct detection of modified bases

With nanopore technology, there is no limit to read length: single reads frequently reach hundreds of kilobases, with a current record of over 4 Mb. This means that even the largest repeat expansions can be sequenced end to end in single reads, enabling unambiguous determination of repeat length, removing the need for assembly, and simplifying downstream computational analysis. Amplification is not required, which eliminates PCR bias and enables repeat expansion detection across the genome, irrespective of GC content or low-complexity regions.
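To illustrate why a spanning read makes sizing unambiguous, here is a minimal Python sketch that estimates the repeat copy number from a single read containing both flanking sequences. The function name `repeat_count_from_spanning_read` and the flank sequences are hypothetical, and exact string matching of the flanks is a simplifying assumption; real analyses rely on alignment-based tools that tolerate sequencing errors.

```python
from typing import Optional


def repeat_count_from_spanning_read(
    read: str, left_flank: str, right_flank: str, unit: str
) -> Optional[int]:
    """Estimate repeat copies between two flanks in one spanning read.

    Returns None when the read does not contain both flanks in order,
    i.e. when it does not span the repeat end to end.
    Illustrative assumption: flanks are matched exactly.
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)              # first base after the left flank
    end = read.find(right_flank, start)   # first base of the right flank
    if end == -1:
        return None
    return (end - start) // len(unit)     # whole repeat units in between


if __name__ == "__main__":
    # Hypothetical flanks around a 50-copy CGG expansion.
    read = "TTGACC" + "CGG" * 50 + "AGCTGA"
    print(repeat_count_from_spanning_read(read, "TTGACC", "AGCTGA", "CGG"))
```

A read that ends inside the repeat returns `None` here, which mirrors the limitation of reads that are too short to span an expansion: without both flanks, the repeat length cannot be read off directly.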

Repeat expansion loci have been shown to have an altered methylation status, which can change the disease phenotype. Characterising the methylation status of expansion loci, and untangling its effects on different disease phenotypes, is therefore an important area for further study. Because nanopore sequencing does not require amplification, base modifications can be detected directly (see Figure 2) alongside the nucleotide sequence for comprehensive repeat expansion interrogation.

Case study

Investigating repeat expansions in dementia using PromethION

We show that long-read sequencing with a single Oxford Nanopore Technologies PromethION flow cell per individual achieves 30x human genome coverage and enables accurate assessment of tandem repeats including the 10,000-bp Alzheimer’s disease-associated ABCA7 VNTR.

De Roeck et al.

De Roeck et al. demonstrated the utility of nanopore sequencing for accurately resolving repeat expansions. Using the PromethION for whole-genome sequencing, the group accurately characterised many tandem repeats, including the 10,000 bp Alzheimer’s disease-associated ABCA7 VNTR. In addition, they developed a novel algorithm that uses raw nanopore squiggle data to robustly determine repeat sequence composition. The ability to resolve nucleotide composition offers the prospect of exploring interruption motifs, which are known to act as disease modifiers in other repeat disorders.

Case study

Cas9 enrichment and nanopore sequencing for repeat expansion resolution

We demonstrate the precise quantification of repeat numbers in conjunction with the determination of CpG methylation states in the repeat expansion.

Giesselmann et al.

Cas9 enrichment combined with nanopore sequencing significantly increases coverage of target sequences without the need for PCR amplification. It is therefore possible to enrich for targets in genomic regions that are refractory to PCR, with the added benefit of preserving epigenetic modifications, thereby enabling simultaneous detection of methylation. Giesselmann et al. demonstrated accurate determination of both repeat length and methylation status at the C9ORF72 locus.

Get started

Repeat expansion detection: in action

For high-throughput whole genome sequencing with repeat expansion detection and characterisation, we recommend the following:

Ligation Sequencing Kit


Analysis: Custom tools, e.g. tandem-genotypes, NanoSatellite

