
Nanopores allow direct sequencing of RNA strands, giving full-length reads with low bias

Poster

Date: 24th May 2018

Complete RNA strands can be sequenced on the MinION, GridION and PromethION using a simple library prep, without the need to convert to double-stranded DNA

Fig. 1 Direct RNA sequencing a) library-prep workflow, b) ‘squiggle’, c) alignment

Direct RNA offers native-strand sequencing, quick library prep and full-length transcripts

Nanopore sequencing is the only technology that can sequence an RNA strand directly, rather than analysing the products of reverse transcription and PCR reactions. In the workflow shown in Fig. 1a, an adapter is attached to the poly-A tail at the 3’ end of the RNA strand. This adapter is pre-loaded with a motor protein, which controls the speed at which the RNA strand translocates through the nanopore. Fig. 1b shows the raw data produced by translocation of a complete 1,500 nt transcript through the nanopore. The poly-A tail can be seen close to the start of the read because, unlike DNA, RNA is sequenced 3’ end first. Fig. 1c shows a section of a Direct RNA read obtained from a yeast transcriptome dataset, aligned to the reference.

Fig. 2 Direct RNA a) accuracy, b) alignment coverage, c) Circos plot, d) correlation with Illumina

Increases in throughput allow generation of transcriptome-wide, full-length datasets

We sequenced a whole-transcriptome RNA library prepared from Saccharomyces cerevisiae S288C. The modal read accuracy was >90% (Fig. 2a). Mapping the reads to the reference transcriptome and calculating the proportion of each transcript covered by each read shows that a high proportion of reads are full-length (Fig. 2b). We also obtained 100-base paired-end Illumina data from the same sample, aligned both the Direct RNA and Illumina reads to the reference, and calculated the log-fold difference in coverage (Fig. 2c). 2,045,748 (63.43%) Direct RNA reads mapped using GMAP 22, and 708,592,030 (98.22%) Illumina reads mapped using GSNAP. Direct RNA gene coverage corresponds well with the Illumina results, with a correlation of 0.73 (Fig. 2d).
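The two per-read and per-gene quantities described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the poster: the function names, the 0-based end-exclusive coordinate convention, the use of base 2 for the logarithm, and the pseudocount are all assumptions, and any normalisation for sequencing depth is left out.

```python
import math

def covered_fraction(aln_start, aln_end, transcript_length):
    """Fraction of a transcript spanned by one read alignment.

    Coordinates are assumed to be 0-based and end-exclusive.
    """
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    start = max(0, aln_start)
    end = min(transcript_length, aln_end)
    return max(0, end - start) / transcript_length

def log2_fold_difference(coverage_a, coverage_b, pseudocount=1.0):
    """log2 ratio of two coverage values.

    The pseudocount (an assumption) avoids division by zero for
    genes with no coverage on one platform.
    """
    return math.log2((coverage_a + pseudocount) / (coverage_b + pseudocount))

# Hypothetical read aligned over 1,425 nt of a 1,500 nt transcript.
print(covered_fraction(0, 1425, 1500))
# Hypothetical coverage values for one gene on the two platforms.
print(log2_fold_difference(7, 3))
```

A read with a covered fraction close to 1 spans essentially the whole transcript, which is the sense in which Fig. 2b reports full-length reads.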

Fig. 3 Unambiguous detection of a) isozymes, b) and c) spike-ins, d) quantitative measurement

RNA data identifies isozymes and splice variants unambiguously and with low bias

Two of the yeast transcripts were GAPDH isozymes, forms of the same gene residing at different genomic positions. Although the sequence identity between these genes is higher than the current read accuracy, reads can still be mapped to each gene without ambiguity (Fig. 3a). The long reads of Direct RNA sequencing should allow straightforward detection of splice variants. We investigated this using Lexogen’s SIRV panel, and detected the majority of variants in the panel (Figs. 3b and 3c). Because there is no amplification step, Direct RNA sequencing should show low quantitative bias. However, the high similarity of many SIRV RNAs results in some mismapping. With the more dissimilar RNAs in the ERCC panel, read counts match the expected values extremely closely (Fig. 3d, Spearman r = 0.97; p = 5.9e-56).
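For readers unfamiliar with the statistic quoted for Fig. 3d, the sketch below shows how a Spearman rank correlation between expected spike-in amounts and observed read counts can be computed. It uses only the Python standard library; the input values are invented for illustration and are not the ERCC data from the poster, and the p-value calculation is omitted.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expected concentrations and observed read counts.
expected = [1, 2, 4, 8, 16]
observed = [3, 5, 9, 20, 31]
print(spearman(expected, observed))
```

Because the statistic depends only on rank order, it measures whether more abundant spike-ins consistently receive more reads, without assuming a linear relationship between the two.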

Fig. 4 Modifications in human rRNA a) Tombo prediction, b) raw reads, c) Tombo performance

Detecting alternative bases from highly modified human rRNA

Our Tombo software suite can detect different modifications in Direct RNA data using a single algorithm. Figure 4a shows accurate prediction of many different annotated RNA modifications in close proximity at a region of the human small subunit (SSU) ribosomal RNA. Figure 4b shows the raw signal (one red line per read) over the ionic m7G modification on the SSU. The normalised current levels deviate from the expected levels (grey distributions) around this modification, indicating the presence of m7G. When Tombo is not used, this deviation causes consistent errors in the consensus sequence produced from these reads. Over the entire span of the large and small subunits, Tombo identifies a higher percentage of modified reads at and near annotated modified bases than at positions further from an annotated modified base (Fig. 4c).
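The idea of a current level deviating from its expected distribution can be illustrated with a simple standardised difference. This is a toy sketch only and is not Tombo's algorithm or API: the function name, the use of a single mean and standard deviation for the expected level, and the example numbers are all assumptions made for illustration.

```python
from statistics import mean

def deviation_score(observed_levels, expected_mean, expected_sd):
    """Standardised difference between the mean observed normalised
    current at one position and the expected (unmodified) level.

    Illustrative only; not how Tombo computes its statistics.
    """
    if expected_sd <= 0:
        raise ValueError("expected_sd must be positive")
    return (mean(observed_levels) - expected_mean) / expected_sd

# Hypothetical normalised current levels from three reads at one position,
# compared with a hypothetical expected level of 0.0 +/- 0.5.
print(deviation_score([1.0, 1.2, 0.8], 0.0, 0.5))
```

A score far from zero at a position would correspond to the kind of shift away from the grey distributions shown in Fig. 4b.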
