New insights into Arabidopsis transcriptome complexity revealed by direct sequencing of native RNAs

Arabidopsis thaliana transcriptomes have been extensively characterized under different conditions. However, most current RNA-sequencing technologies produce relatively short reads and require a reverse-transcription step, which prevents effective characterization of transcriptome complexity.

Here, we performed Direct RNA Sequencing (DRS) using the latest Oxford Nanopore Technology (ONT), which yields exceptionally long reads.

We demonstrate that the complexity of the A. thaliana transcriptomes has been substantially underestimated. ONT DRS identified novel transcript isoforms at both the vegetative (14-day-old seedlings, stage 1.04) and reproductive (stage 6.00–6.10) stages of development.

Using in-house software called TrackCluster, we detected alternative transcription initiation (ATI), alternative polyadenylation (APA), alternative splicing (AS), and fusion transcripts. More than 38 500 novel transcript isoforms were identified, including six categories of fusion transcripts that may result from differential RNA processing mechanisms.
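
The event types named above can be illustrated with a minimal sketch. This is not TrackCluster's algorithm, which is not described here; it only assumes that a long read and a reference isoform are each given as sorted exon intervals. The names Isoform, classify and overlapped_genes, and the 50-nt end tolerance, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive genomic coordinates


@dataclass(frozen=True)
class Isoform:
    """Hypothetical container: exons sorted by start coordinate."""
    exons: Tuple[Interval, ...]
    strand: str = "+"

    @property
    def span(self) -> Interval:
        # Leftmost and rightmost genomic coordinates of the isoform.
        return (self.exons[0][0], self.exons[-1][1])

    @property
    def introns(self) -> Tuple[Interval, ...]:
        # An intron runs from the end of one exon to the start of the next.
        return tuple((a[1], b[0]) for a, b in zip(self.exons, self.exons[1:]))


def classify(read: Isoform, ref: Isoform, tolerance: int = 50) -> Set[str]:
    """Label how a read differs from a reference isoform (assumed same gene).

    The tolerance (in nt) is an arbitrary illustrative threshold for calling
    a transcript end "different".
    """
    labels: Set[str] = set()
    if read.introns != ref.introns:
        labels.add("AS")  # intron chain differs
    left_differs = abs(read.span[0] - ref.span[0]) > tolerance
    right_differs = abs(read.span[1] - ref.span[1]) > tolerance
    # On the minus strand the 5' end is the rightmost coordinate.
    if read.strand == "+":
        five_differs, three_differs = left_differs, right_differs
    else:
        five_differs, three_differs = right_differs, left_differs
    if five_differs:
        labels.add("ATI")  # 5' end shifted
    if three_differs:
        labels.add("APA")  # 3' end shifted
    return labels


def overlapped_genes(read: Isoform, genes: Dict[str, Interval]) -> Set[str]:
    """Return annotated genes whose span overlaps the read's span.

    A read overlapping two or more genes is only a fusion candidate.
    """
    start, end = read.span
    return {g for g, (gs, ge) in genes.items() if start <= ge and gs <= end}
```

In this simplification, a read whose first or last intron changes together with its end is labeled both AS and ATI/APA. A real pipeline would also need to separate genuine fusion transcripts from read-through and chimeric artifacts, which the sketch does not attempt.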

Aided by the Tombo algorithm, we found an enrichment of m5C modifications in mobile mRNAs, consistent with a recent finding that m5C modification of mRNAs is crucial for their long-distance movement.

In summary, ONT DRS offers an advantage in the identification and functional characterization of novel RNA isoforms and RNA base modifications, significantly improving annotation of the A. thaliana genome.

Authors: Shoudong Zhang, Runsheng Li, Li Zhang, Shengjie Chen, Min Xie, Liu Yang, Yiji Xia, Christine H Foyer, Zhongying Zhao, Hon-Ming Lam