Systematic assessment of long-read RNA-Seq datasets and its application in transcriptome analysis


Abstract

Recent advances in long-read sequencing enable the direct measurement of transcripts, which requires full-length libraries that differ from those used in short-read RNA-Seq. A systematic understanding of the distinct characteristics of long-read RNA-Seq is much needed. Here, we assess nanopore sequencing data for various library properties. Our analysis shows that truncation in the 5’ and 3’ UTRs and shifts in splice sites are common, posing a major challenge for accurate transcript analysis. By taking these biases into account, we improve transcript identification and quantification for nanopore sequencing and validate the improvements using synthetic controls. These insights will facilitate future transcriptome analysis using nanopore sequencing.

Authors: Chenchen Zhu