Ultra-rich nanopore data offers unprecedented insights into the transcriptomes of single cells

According to some estimates, about 95% of multi-exon genes in the human genome are alternatively spliced [1]. Many of the resulting isoforms have been linked to diseases, making them valuable in clinical and translational research. Biopharma scientists routinely use RNA sequencing to analyse isoforms and entire transcriptomes as part of their biomarker discovery and target identification workflows.

Unfortunately, most sequencing platforms struggle to represent the full depth of this isoform diversity. In the human genome, at least 95% of isoforms are longer than 300 base pairs, which exceeds the typical length of reads produced by short-read sequencing, so short reads cannot span most complete isoforms [2]. Instead, isoform sequences must be assembled during data analysis. However, even the most advanced assembly methods have had limited success in stitching short reads together into full isoforms: studies show that just 20% to 40% of the human transcriptome can be accurately reconstructed this way [3].

In contrast, long nanopore sequencing reads overcome this challenge by capturing entire isoforms in single reads, avoiding the need for downstream assembly. Now, scientists at Genentech have released a new method that generates accurate, reliable single-cell expression data for a targeted set of genes using only nanopore sequencing data, eliminating the need for short-read sequencing on an orthogonal platform. The method also addresses the throughput and artifact challenges seen with other long-read approaches.

Depth and breadth

In their study, Byrne et al. addressed the ongoing challenges of relying on short-read data for transcriptome analysis [4]. Their method was designed to increase throughput considerably, allowing the interrogation of significantly more genes per run than conventional approaches, which are often used for panels of 50 genes or fewer.

The authors reported how they ‘developed Single-cell Targeted Isoform Long-Read Sequencing (scTaILoR-seq), a hybridization capture method which targets over a thousand genes of interest’. The technique improves the ‘median number of unique transcripts per cell by 29-fold’ and relies on broad gene panels to achieve its throughput. Biotinylated PCR primers are also incorporated to reduce the impact of cDNA artifacts.

The researchers evaluated this new method in three ovarian cancer cell lines as well as tumor research samples, comparing results obtained with and without short-read sequencing data to guide the identification of cellular barcodes and unique molecular identifiers (UMIs). cDNA libraries were generated with 10x Genomics technology, prepared for sequencing using the Oxford Nanopore Ligation Sequencing Kit, and sequenced on both MinION and PromethION Flow Cells. The authors reported that scTaILoR-seq, based on nanopore sequencing data, ‘is capable of accurately producing single-cell transcriptomes from a complex tumor tissue without the need for supplemental short-read information’. Using a single platform makes it possible to reduce costs and produce results faster.
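
To illustrate how cellular barcodes and UMIs turn raw reads into per-cell transcript counts, here is a minimal Python sketch. It is not the authors' pipeline: the function name, the input format (one barcode, UMI and gene per read) and the toy reads are all assumptions made for illustration, and real workflows also correct sequencing errors in barcodes and UMIs.

```python
from collections import defaultdict
from statistics import median


def unique_transcripts_per_cell(reads):
    """Count unique molecules per cell.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, one per read.
    Reads sharing the same barcode, UMI and gene are treated as copies of a
    single original molecule and are counted once (hypothetical simplification).
    """
    molecules = defaultdict(set)
    for barcode, umi, gene in reads:
        molecules[barcode].add((umi, gene))
    return {barcode: len(seen) for barcode, seen in molecules.items()}


# Toy reads, invented for illustration only.
reads = [
    ("AAAC", "UMI1", "TP53"),
    ("AAAC", "UMI1", "TP53"),   # duplicate of the read above
    ("AAAC", "UMI2", "TP53"),
    ("AAAC", "UMI1", "VEGFA"),
    ("TTTG", "UMI9", "VEGFA"),
]

counts = unique_transcripts_per_cell(reads)
median_count = median(counts.values())
print(counts, median_count)
```

A statistic such as the median number of unique transcripts per cell is then simply the median of these per-cell counts.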

‘Nanopore long-read sequencing is used to enable assignment, identification and quantification of transcript isoforms in thousands of single cells’

The new method generated nearly 11,000 single-cell transcriptomes. Its gene-enrichment protocol resulted in ‘a 16.4-fold increase of on-target reads compared to untargeted long-read sequencing, yielding a significant boost in read counts per gene’, and identified almost 2,500 annotated transcripts that were not detected by a comparable untargeted technique. The method even enabled profiling of the T cell receptor repertoire, an area of great interest for drug discovery research, despite the small fraction of T cells present in the cell populations. The authors noted that improved sensitivity for transcript detection makes this method a good fit for variant discovery, analysis of differential isoform expression, and other common applications.
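
A fold increase in on-target reads is usually a ratio of two on-target fractions. The sketch below shows that arithmetic in Python under that assumption; the function names and read counts are invented for illustration and are not taken from the study.

```python
def on_target_fraction(on_target_reads, total_reads):
    """Fraction of reads that map to the targeted gene panel."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return on_target_reads / total_reads


def fold_enrichment(targeted_fraction, untargeted_fraction):
    """Ratio of on-target fractions: targeted run versus untargeted run."""
    if untargeted_fraction <= 0:
        raise ValueError("untargeted_fraction must be positive")
    return targeted_fraction / untargeted_fraction


# Hypothetical read counts, NOT values from the study.
untargeted = on_target_fraction(on_target_reads=4_000, total_reads=100_000)
targeted = on_target_fraction(on_target_reads=60_000, total_reads=100_000)

enrichment = fold_enrichment(targeted, untargeted)
print(f"{enrichment:.1f}-fold increase in on-target reads")
```

With these made-up counts, the on-target fraction rises from 4% to 60%, a 15-fold increase.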

The team was also able to delve into differential isoform usage, identifying 43 events they found to be significant. These included differential usage of isoforms related to CD8+ T cells and certain fibroblasts, as well as low expression of an isoform that influences the epithelial-mesenchymal transition.

Nanopore-based isoform detection offered other advantages as well. With excellent read depth and broad coverage, nanopore data ‘enabled more comprehensive detection of expressed SNVs, which was fundamental for the characterization of transcript structure alterations’, the authors reported. Some of these structural alterations could be especially important in elucidating the biological function of cancer mutations, such as an SNV that changes the DNA binding domain of TP53. In addition, the scientists were able to use long nanopore reads with multiple SNVs to reconstruct haplotypes, phase variants, and identify imbalances in allelic expression. They highlighted that ‘of potential therapeutic relevance is the observed allele-specific expression of VEGFA, which is the target of bevacizumab for treatment of platinum-resistant recurrent epithelial ovarian cancer’, pointing to the potential impact of using allele-specific transcript information to inform diagnosis and treatment selection.
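
One common way to flag an imbalance in allelic expression at a heterozygous SNV is to test whether reads split evenly between the two alleles. The Python sketch below uses an exact two-sided binomial test for this. It is a generic illustration, not necessarily the test the authors used, and the function name and read counts are hypothetical.

```python
from math import comb


def allelic_imbalance_p_value(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """Exact two-sided binomial test for allelic imbalance.

    Sums the probabilities of every outcome that is no more likely than the
    observed split, assuming reads are drawn independently. Intended for
    modest read depths, since very large counts underflow floating point.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("at least one read is required")
    p = expected_ref_fraction
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # The small tolerance keeps equally likely outcomes despite rounding error.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical counts at one heterozygous site, NOT values from the study.
p_skewed = allelic_imbalance_p_value(ref_reads=18, alt_reads=2)
p_balanced = allelic_imbalance_p_value(ref_reads=10, alt_reads=10)
print(f"skewed site p = {p_skewed:.6f}, balanced site p = {p_balanced:.2f}")
```

In practice, p-values from many sites would also need correction for multiple testing before any site is called allele-specific.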

With this strong demonstration of scTaILoR-seq and its performance using only nanopore sequencing reads, researchers now have access to a technique capable of generating comprehensive transcriptome coverage at the single-cell level in one straightforward experimental process.

1. Pan, Q. et al. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413–1415 (2008). DOI: https://doi.org/10.1038/ng.259

2. Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012). DOI: https://doi.org/10.1101/gr.135350.111

3. Steijger, T. et al. Assessment of transcript reconstruction methods for RNA-seq. Nat. Methods 10, 1177–1184 (2013). DOI: https://doi.org/10.1038/nmeth.2714

4. Byrne, A. et al. Single-cell long-read targeted sequencing reveals transcriptional variation in ovarian cancer. bioRxiv (2023). DOI: https://doi.org/10.1101/2023.07.17.549422