Discovering multiple types of DNA methylation from bacteria and microbiome using nanopore sequencing

Nanopore sequencing enables direct detection of chemical DNA modifications. However, existing computational methods either were trained to detect a specific form of DNA modification in one or a few specific sequence contexts (e.g. 5-methylcytosine in CpG dinucleotides) or allow de novo detection without effectively differentiating between forms of DNA modification. As a result, none of these methods supports de novo, systematic study of unknown bacterial methylomes.

In this work, by examining three types of DNA methylation in a large diversity of sequence contexts, we observed that the nanopore sequencing signal displays complex heterogeneity across methylation events of the same type. To capture this complexity and make nanopore sequencing broadly applicable to methylation discovery, we generated a training dataset from an assortment of bacterial species and developed a novel method that couples the identification and fine mapping of the three forms of DNA methylation in a multi-label classification design. We evaluated the method and then applied it to individual bacteria and the mouse gut microbiome for reliable methylation discovery. In addition, in the microbiome analysis we demonstrated the use of DNA methylation for binning metagenomic contigs, associating mobile genetic elements with their host genomes, and, for the first time, identifying misassembled metagenomic contigs.
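One way to picture how identification (which methylation type) and fine mapping (which base is modified) can be coupled is to treat each combination of type and position as a single class label. The Python sketch below illustrates that joint label encoding only; it is not the authors' implementation. The type names (4mC, 5mC, 6mA), the offset range, and the function names `encode`, `decode`, and `call_site` are all assumptions made for illustration.

```python
from itertools import product

# Assumption: the three methylation types are named here for illustration only.
METHYLATION_TYPES = ("4mC", "5mC", "6mA")
# Assumption: hypothetical offsets of the methylated base relative to the
# centre of the signal window being classified.
OFFSETS = tuple(range(-2, 3))

# Each class is a (type, offset) pair, so one prediction answers both
# "which modification?" and "at which position?".
LABELS = tuple(product(METHYLATION_TYPES, OFFSETS))
LABEL_TO_INDEX = {label: i for i, label in enumerate(LABELS)}


def encode(meth_type, offset):
    """Map a (type, offset) pair to its integer class index."""
    return LABEL_TO_INDEX[(meth_type, offset)]


def decode(index):
    """Map an integer class index back to its (type, offset) pair."""
    return LABELS[index]


def call_site(scores):
    """Return the (type, offset) pair with the highest classifier score.

    `scores` is a hypothetical list with one score per class, in the
    same order as LABELS.
    """
    if len(scores) != len(LABELS):
        raise ValueError("expected one score per (type, offset) class")
    best = max(range(len(scores)), key=scores.__getitem__)
    return decode(best)
```

With this encoding, any classifier that outputs one score per class yields a methylation type and a position in a single call, for example `call_site(scores)` on the score vector of one candidate window.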

This novel method has broad utility for discovering different forms of DNA methylation from bacteria, assisting functional studies of epigenetic regulation in bacteria, and exploiting bacterial epigenomes for more effective metagenomic analyses.

Authors: Alan Tourancheau, Edward A. Mead, Xue-song Zhang, Gang Fang