nanoDoc: RNA modification detection using Nanopore raw reads with Deep One-Class Classification

Advances in Nanopore single-molecule direct RNA sequencing (DRS) have opened the possibility of comprehensively detecting post-transcriptional modifications (PTMs) as an alternative to experimental approaches combined with high-throughput sequencing. DRS has been shown to detect the change in the raw electric current signal caused by a PTM; however, its accuracy and reliability still require improvement.

Here, we present new software, called nanoDoc, for detecting PTMs from DRS data using a deep neural network. Current signal deviations caused by PTMs are analyzed via Deep One-Class Classification with a convolutional neural network. Using a ribosomal RNA dataset, the software achieved an area under the curve (AUC) of 0.96 for the detection of 23 different kinds of modifications in Escherichia coli and Saccharomyces cerevisiae.
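To make the one-class scoring and AUC evaluation concrete, the following is a minimal sketch and not nanoDoc's implementation. It assumes that each read has already been reduced to a feature vector (in nanoDoc this would come from the convolutional network; here random vectors stand in), and it uses distance from the centroid of unmodified reference features as a simplified anomaly score. The function names `one_class_scores` and `auc`, the feature dimension, and the toy data are all hypothetical.

```python
import numpy as np


def one_class_scores(reference, query):
    """Score each query vector by its Euclidean distance from the
    centroid of the reference (assumed unmodified) feature vectors."""
    center = reference.mean(axis=0)
    return np.linalg.norm(query - center, axis=1)


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    positive example outscores a negative one (ties count as half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


# Toy data (an assumption for illustration, not real DRS features):
# "modified" vectors are shifted away from the reference distribution.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 8))
unmodified = rng.normal(0.0, 1.0, size=(50, 8))
modified = rng.normal(1.5, 1.0, size=(50, 8))

scores = one_class_scores(reference, np.vstack([unmodified, modified]))
labels = np.array([0] * 50 + [1] * 50)
toy_auc = auc(scores, labels)
print(f"toy AUC: {toy_auc:.3f}")
```

The key property of a one-class setup is that only unmodified data is needed to define the reference; modified reads are detected as deviations from it rather than learned as a labeled class.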

We also demonstrate a tentative classification of PTMs using unsupervised clustering. Finally, we apply this software to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) data and identify commonly modified sites among three groups.

Authors: Hiroki Ueda