Full-length RNA isoforms deliver new insights into human health and disease
The importance of accurate isoform characterisation
Although it has long been understood that a single gene can generate multiple RNA isoforms that can result in different proteins, our knowledge of these isoforms (where they are expressed and how their functions may vary) remains limited. Legacy technologies such as short-read sequencing have prevented researchers not only from discovering and characterising all isoforms of a gene, but also from quantifying their expression and fully understanding their functions [1]. Short-read sequencing requires RNA to be fragmented, and the resulting short fragments often map equally well to several isoforms (multimapping), which compromises both isoform identification and differential expression analysis. Researchers have ‘historically been forced to collapse all isoforms into a single gene expression measurement [which is] a major oversimplification of the underlying biology’ [2]. Long nanopore sequencing reads can span full-length RNA transcripts, enabling accurate, unambiguous isoform identification and quantification.
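The multimapping problem can be illustrated with a minimal sketch. It assumes a toy gene with three made-up isoforms, each represented as an ordered chain of exon names; the isoform names, exon names, and helper functions are hypothetical and are not taken from the cited studies or from any alignment tool. A fragment that covers only part of a transcript is compatible with every isoform containing that stretch of exons, whereas a read spanning the whole transcript matches one exon chain only.

```python
# Toy isoforms of one hypothetical gene, each an ordered tuple of exon names.
ISOFORMS = {
    "isoform_A": ("e1", "e2", "e3", "e4"),
    "isoform_B": ("e1", "e3", "e4"),  # skips e2
    "isoform_C": ("e1", "e2", "e4"),  # skips e3
}


def is_contiguous_run(fragment, exons):
    """True if `fragment` appears as a consecutive run inside `exons`."""
    n = len(fragment)
    return any(
        tuple(exons[i:i + n]) == tuple(fragment)
        for i in range(len(exons) - n + 1)
    )


def compatible_isoforms(read_exons, full_length=False):
    """Return the sorted names of isoforms a read could have come from.

    A fragmented read only needs to match a consecutive run of exons;
    a full-length read must match the complete exon chain.
    """
    hits = []
    for name, exons in ISOFORMS.items():
        if full_length:
            if tuple(read_exons) == exons:
                hits.append(name)
        elif is_contiguous_run(read_exons, exons):
            hits.append(name)
    return sorted(hits)


short_read = ("e1", "e2")       # fragment covering a single exon junction
long_read = ("e1", "e2", "e4")  # read spanning the whole transcript

print(compatible_isoforms(short_read))                   # ['isoform_A', 'isoform_C']
print(compatible_isoforms(long_read, full_length=True))  # ['isoform_C']
```

In this toy model the short fragment cannot be assigned to a single isoform, so any count derived from it has to be shared or estimated, while the full-length read is assigned unambiguously.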
Uncovering novel insights into neurodegenerative disorders
In their recent publication, Aguzzoli Heberle et al. emphasised that long reads at high read depth are necessary to truly bridge the ‘substantial gaps ... in our understanding of RNA isoform diversity’ [2]. With the aim of mapping medically relevant RNA isoforms in the human brain, the team used the cDNA-PCR Sequencing Kit and a PromethION device to perform whole-transcriptome sequencing of 12 post-mortem, aged, frontal cortex brain research samples: six with Alzheimer’s disease (AD) and six cognitively unimpaired controls (CT). Sequencing yielded a median of 35.5 million aligned reads per sample.
The team identified 7,042 genes expressing two or more RNA isoforms, 1,917 of which were determined to be medically relevant. Ninety-eight genes implicated in brain-related diseases were found to express multiple RNA isoforms, including AD genes such as APP (Aβ-precursor protein) with five isoforms, MAPT (tau protein) with four isoforms, and BIN1 with eight isoforms. Several other genes implicated in other neurodegenerative diseases and neuropsychiatric disorders also expressed multiple RNA isoforms in the prefrontal cortex, including: SOD1 (amyotrophic lateral sclerosis and frontotemporal dementia), SNCA (Parkinson’s disease), TARDBP (involved in several neurodegenerative diseases), and SHANK3 (autism spectrum disorder).
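As a rough sketch of how a tally such as ‘genes expressing two or more RNA isoforms’ can be derived from a table of per-isoform counts, the example below groups isoforms by gene and keeps genes with at least two isoforms above a count threshold. The gene names, isoform identifiers, counts, and the `min_count` threshold are all hypothetical placeholders; the expression criteria actually used in the study are not described here and may differ.

```python
from collections import defaultdict

# Hypothetical per-isoform read counts: (gene, isoform, count). Not study data.
ISOFORM_COUNTS = [
    ("GENE_A", "GENE_A-201", 120),
    ("GENE_A", "GENE_A-202", 45),
    ("GENE_A", "GENE_A-203", 0),
    ("GENE_B", "GENE_B-201", 300),
    ("GENE_C", "GENE_C-201", 10),
    ("GENE_C", "GENE_C-202", 2),
]


def expressed_isoforms_per_gene(rows, min_count=1):
    """Map each gene to the list of its isoforms with count >= min_count."""
    per_gene = defaultdict(list)
    for gene, isoform, count in rows:
        if count >= min_count:
            per_gene[gene].append(isoform)
    return dict(per_gene)


def multi_isoform_genes(rows, min_count=1):
    """Return the sorted genes expressing two or more isoforms at the threshold."""
    per_gene = expressed_isoforms_per_gene(rows, min_count)
    return sorted(g for g, isoforms in per_gene.items() if len(isoforms) >= 2)


print(multi_isoform_genes(ISOFORM_COUNTS))               # ['GENE_A', 'GENE_C']
print(multi_isoform_genes(ISOFORM_COUNTS, min_count=5))  # ['GENE_A']
```

The second call shows how sensitive such a tally is to the chosen threshold: raising the assumed minimum count removes GENE_C, whose second isoform is only weakly supported.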
Using a strict threshold for high-confidence isoform identification, the team reported 428 new isoforms, 53 of which originated from medically relevant genes involved in brain-related diseases, including MTHFS (implicated in major depression, schizophrenia, and bipolar disorder), CPLX2 (implicated in schizophrenia, epilepsy, and synaptic vesicle pathways), and MAOB (currently targeted for Parkinson’s disease treatment).
Five new spliced mitochondrial RNA (mtRNA) isoforms with two exons each were also identified. Explaining why this was surprising, the team noted that all previously annotated human mitochondrial transcripts have only one exon, and that spliced mitochondrial transcripts had never before been reported in human tissue [2]. Highlighting that mitochondria are involved in many age-related diseases, the team expressed strong interest in determining the function of these spliced mtRNA isoforms. Building on their new discoveries, they also identified RNA isoforms from genomic regions where transcription was not expected: 1,267 isoforms from 245 new gene bodies were reported. The median length was 1,529 nucleotides, with 96.6% of isoforms having only two exons, which they suggest may be a feature of ageing in mammalian tissues.
The team shared that ‘the most compelling value’ in using long nanopore sequencing reads is the ability to perform differential isoform expression analyses. Analysis of six AD and six CT samples revealed expression patterns associated with AD that were hidden when performing gene-level analysis. The team reported 176 differentially expressed genes and 105 differentially expressed RNA isoforms (Figure 1). Of the 105 isoforms, 99 came from genes that were not differentially expressed at the gene level. Using the gene TNFSF12 as an example, the team showed that the TNFSF12-219 isoform was significantly upregulated in AD research samples, whereas the TNFSF12-203 isoform was significantly upregulated in control samples (Figure 1).
Figure 1. a) Differential gene expression and b) differential RNA isoform expression between AD research samples and CT samples. c) The TNFSF12 gene was not significantly differentially expressed when collapsing all transcripts into a single gene measurement. d) TNFSF12-219 was upregulated in AD research samples. e) TNFSF12-203 was upregulated in CT samples. Figure taken from Aguzzoli Heberle et al. [2] and made available under a Creative Commons licence (CC BY 4.0, creativecommons.org/licenses/by/4.0).
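The way gene-level collapsing can hide opposing isoform changes can be sketched with a small arithmetic example. All numbers below are invented for illustration and are not values from the study; the isoform labels `iso_1` and `iso_2` are placeholders, and the log2 ratio of group means stands in for a proper statistical test, which a real analysis would require.

```python
import math
from statistics import mean

# Hypothetical normalised counts per sample; NOT values from the study.
COUNTS = {
    "iso_1": {"AD": [90, 110, 100], "CT": [30, 50, 40]},
    "iso_2": {"AD": [35, 45, 40], "CT": [95, 105, 100]},
}


def log2_fold_change(ad, ct):
    """log2 ratio of group means (AD over CT)."""
    return math.log2(mean(ad) / mean(ct))


def gene_level(counts, group):
    """Collapse isoforms: sum counts across isoforms within each sample."""
    per_isoform = [c[group] for c in counts.values()]
    return [sum(sample) for sample in zip(*per_isoform)]


isoform_lfc = {
    name: log2_fold_change(c["AD"], c["CT"]) for name, c in COUNTS.items()
}
gene_lfc = log2_fold_change(gene_level(COUNTS, "AD"), gene_level(COUNTS, "CT"))

for name, lfc in isoform_lfc.items():
    print(f"{name}: log2FC = {lfc:.2f}")
print(f"gene level: log2FC = {gene_lfc:.2f}")
```

In this toy data one isoform is higher in the AD group and the other is higher in the CT group by the same amount, so the summed gene-level counts are identical between groups and the gene-level fold change is zero, even though both isoforms change substantially.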
Understanding disease mechanisms and developing treatments require methods that offer ‘substantial improvement over short-read sequencing’ approaches. Using long nanopore sequencing reads, Aguzzoli Heberle et al. demonstrated that a large proportion of medically relevant genes in the human frontal cortex expressed multiple RNA isoforms. Differential expression analysis at the isoform level can go further, revealing which isoforms are expressed in particular cell and tissue types and potentially facilitating direct targeting of RNA isoforms for disease treatment.
This case study was taken from the RNA sequencing white paper.
1. Page, M.L. et al. Surveying the landscape of RNA isoform diversity and expression across 9 GTEx tissues using long-read sequencing data. bioRxiv 579945 (2024). DOI: https://doi.org/10.1101/2024.02.13.579945
2. Aguzzoli Heberle, B. et al. Mapping medically relevant RNA isoform diversity in the aged human frontal cortex with deep long-read RNA seq. Nat. Biotechnol. (2024). DOI: https://doi.org/10.1038/s41587-024-02245-9