Advancing long-read genome sequencing towards improved clinical and molecular diagnosis of rare genetic disease
Katie (Canada’s Michael Smith Genome Sciences Centre) explained that a rare disease is defined as one that occurs in fewer than 1 in 2,000 individuals (UK) or that affects fewer than 200,000 individuals (US); worldwide, 260-450 million people have a rare disease, so ‘collectively they pose a significant burden on the health system’. Many rare diseases (~70-80%) are suspected to be genetic in origin. Katie shared a saying in medicine from Theodore Woodward: ‘When you hear hoofbeats, think horses, not zebras’; in rare diseases, however, you have to think of zebras.
Many individuals with rare diseases lack a diagnosis, and as a result they often go through a long “diagnostic odyssey” involving multiple sequential clinical visits and molecular tests. This is a significant burden for patients and their families. Genetic and phenotypic heterogeneity is one challenge for rare disease molecular diagnosis: symptoms overlap between diseases, and many candidate genes may be involved. A second is incomplete penetrance, meaning that not everyone with the same underlying genetic abnormality has the same clinical phenotype. This is particularly important in rare inherited cancers, and can make it hard to identify carriers of the genetic changes. Thirdly, variants of unknown biological or clinical significance are a challenge in rare disease diagnosis; awareness of these variants has increased through the application of whole-genome and whole-exome sequencing. The last challenge Katie presented, which was central to her talk, was that of unresolved complex variation: the technologies most widely used for genetic diagnosis, such as short-read sequencing, have difficulty resolving such variation. Katie highlighted her team’s published work in this area (Thibodeau, O’Neill, Dixon et al. 2020. Genetics in Medicine). In this work, they investigated the use of nanopore sequencing to identify such complex variants in clinical research samples; Katie presented an example in which long nanopore reads resolved a complex variant, which ultimately helped explain the lack of a particular phenotype in the patient from whom the sample had been sourced.
Building on this previous work, Katie’s team have performed whole-genome nanopore sequencing for structural variation (SV) characterisation in 32 human genomes, to identify SVs that may not have been found in these samples using standard orthogonal testing methods. From these sequencing runs, they achieved an average genome depth of coverage of 30X per sample. Per genome, and based on the output from 3 SV callers, the unified number of intrachromosomal SVs identified was ~40,000. Katie stated that almost 98% of reads were phased at 30X depth, ‘suggesting that nanopore genome sequencing at 30X could potentially be a replacement for other short-read based genome sequencing in the future’. They found that higher depth and longer reads were associated with increased phase block lengths, showing potential for haplotype-phased assemblies.
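The talk summary does not say how the output of the 3 SV callers was unified. As a rough illustration only, the sketch below groups calls that share a chromosome and SV type and have at least 50% reciprocal overlap, a commonly used heuristic. The `SVCall` record, the caller names, and the `min_reciprocal` threshold are all assumptions made for this example, not details from Katie's pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal record for one interval-type SV call."""
    caller: str   # name of the SV caller (placeholder values below)
    chrom: str
    start: int
    end: int
    svtype: str   # e.g. "DEL", "DUP", "INV"


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Fraction of overlap relative to the longer of the two intervals."""
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    # Guard against zero-length intervals; insertions would need
    # breakpoint-distance matching instead, which is not shown here.
    len_a = max(a.end - a.start, 1)
    len_b = max(b.end - b.start, 1)
    return min(overlap / len_a, overlap / len_b)


def unify_calls(calls: List[SVCall], min_reciprocal: float = 0.5) -> List[List[SVCall]]:
    """Greedily cluster calls that plausibly describe the same SV.

    Each call is compared with the first member of every existing cluster;
    this is quadratic in the worst case and is meant as a sketch, not as a
    scalable implementation.
    """
    clusters: List[List[SVCall]] = []
    ordered = sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start, c.end))
    for call in ordered:
        for cluster in clusters:
            rep = cluster[0]
            if (rep.chrom == call.chrom
                    and rep.svtype == call.svtype
                    and reciprocal_overlap(rep, call) >= min_reciprocal):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return clusters


if __name__ == "__main__":
    example = [
        SVCall("callerA", "chr17", 1000, 2000, "DEL"),
        SVCall("callerB", "chr17", 1010, 1990, "DEL"),
        SVCall("callerC", "chr17", 5000, 5600, "DUP"),
    ]
    for cluster in unify_calls(example):
        supporters = sorted({c.caller for c in cluster})
        print(cluster[0].chrom, cluster[0].svtype, cluster[0].start, supporters)
```

Counting the clusters gives a unified per-genome SV count, and the set of callers in each cluster gives a simple measure of support. Established tools for merging SV callsets handle many cases this sketch ignores, such as insertions, translocations, and imprecise breakpoints.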
Within the cohort, 16 individuals were found to have SVs in known breast cancer-predisposition genes. This included two individuals from unrelated families with recurrent deletion of the same exons in BRCA1. Explaining this further, Katie noted that almost 20 years ago a research group had reported a potentially recurrent recombination hotspot within an intron of BRCA1, and had observed different deletions in the gene, which they hypothesised were mediated by sequence homology at the locus. In the two unrelated individuals in Katie’s cohort, two differently sized deletions were found, which is consistent with what had been reported in the literature; ‘what was striking was that nanopore sequencing was really cleanly able to identify both of these deletions and help us identify where this exact recombination hotspot was’, and therefore the precise sequence mediating these recurrent deletion events.
Katie’s team also found that nanopore sequencing could detect founder deletion variants as well as duplication events, ‘which are inherently difficult to identify using short-read technologies’. Katie showed how variant-containing reads could be phased with the long-read nanopore sequencing data. Given these clean findings, Katie suggested that ‘targeted long-read sequencing may be a good alternative, and a cost-effective alternative, to approaches like targeted Sanger sequencing or MLPA for copy number analysis’.
Turning to paediatric disorders, Katie explained that paediatric cases tend to undergo a wider variety of tests than adult rare disease cases, particularly hereditary cancer syndromes, so choosing an appropriate testing strategy is essential. Classic testing strategies such as karyotyping and microarrays work at a coarse resolution suited to SVs and large chromosomal abnormalities, but fail to identify sequence or base-level changes, which can only be identified with exome or genome sequencing. However, short-read whole-genome sequencing has only ‘marginally improved’ the rates of diagnosis over exome sequencing, which may be due to limitations in its ability to identify certain types of genetic variation.
Katie introduced one example case of a child with anophthalmia, microcephaly, hypertonia, hepatosplenomegaly, and seizures. In this individual, standard testing had identified a chromosomal rearrangement involving duplication, triplication, and deletion in chromosome 13q. No pathogenic or likely pathogenic small variants were identified using exome sequencing. Katie mentioned that this isn’t the first time that chromosome 13q abnormalities have been identified in anophthalmia patients. Nanopore sequencing was essential in this example to uncover the underlying complex SVs: it revealed a specific signature of inverted sequences and resolved the order of the complex rearrangement, which involved duplications, an inverted triplication, and a terminal deletion.
Returning to her earlier point that genome sequencing has only marginally improved the rates of diagnosis compared to exome sequencing, Katie explained that this is partly because the functional significance of many variants remains unknown, especially as much of the genome is non-coding, and partly because traditional technologies struggle to access and resolve both structural variants and epigenetic variation. ‘These are all things [where] nanopore sequencing could potentially help fill some of the gap’.
Katie concluded that nanopore sequencing is sensitive for detecting SVs at known disease loci, and that global haplotype inference has the potential to inform molecular aetiology and the segregation of disease-associated variants. A range of variant calling methods is needed for agnostic identification of germline SVs in cases where the underlying genetic association is unknown. From this work, her team aim to use nanopore sequencing to uncover dark regions of the genome, which are difficult to access using other sequencing technologies. In future, they aim to investigate how targeted sequencing could be used for rapid, sensitive, and cost-effective analysis of well-characterised candidate disease genes. They also plan to investigate the potential application of nanopore sequencing in the neonatal intensive care unit, where rapid turnaround times could have important implications for prognosis or treatment.