Genome-wide detection of cytosine methylations in plant from Nanopore sequencing data using deep learning

Methylation states of DNA bases can be detected from native Nanopore reads directly. At present, there are many computational methods that can detect 5mCs in CpG contexts accurately by Nanopore sequencing. However, there is currently a lack of methods to detect 5mCs in non-CpG contexts. In this study, we propose a computational pipeline which can detect 5mC sites in both CpG and non-CpG contexts of plant genomes by using Nanopore sequencing. And we sequenced two model plants Arabidopsis thaliana (A. thaliana) and Oryza sativa (O. sativa) by using Nanopore sequencing and bisulfite sequencing.

The results of our proposed pipeline in the two plants achieved high correlations with bisulfite sequencing: above 0.98, 0.96, 0.85 for CpG, CHG, and CHH (H indicates A, C or T) motif, respectively. Our proposed pipeline also achieved high performance on Brassica nigra (B. nigra). Experiments also showed that our proposed pipeline can achieve high performance even with low coverage of reads. Moreover, by using Nanopore sequencing, our proposed pipeline is capable of profiling methylation of more cytosines than bisulfite sequencing.

Authors: Peng Ni, Neng Huang, Fan Nie, Jun Zhang, Zhi Zhang, Bo Wu, Lu Bai, Wende Liu, Chuan-Le Xiao, Feng Luo, Jianxin Wang