15th December 2016 - BioRxiv
Advances in single-molecule sequencing technology have enabled the investigation of the full catalogue of covalent DNA modifications. We present an assay, Modified DNA sequencing (MoD-seq), that leverages raw nanopore data processing, visualization, and statistical testing to directly survey DNA modifications without the need for a large prior training dataset. We present case studies applying MoD-seq to identify three distinct marks, 4mC, 5mC, and 6mA, and demonstrate quantitative reproducibility across biological replicates processed in different labs. In a ground-truth dataset created via in vitro treatment of synthetic DNA with selected methylases, we show that modifications can be detected in a variety of distinct sequence contexts. We recapitulate known methylation patterns and frequencies in E. coli, and propose a pipeline for the comprehensive discovery of DNA modifications in a genome without a priori knowledge of their chemical identities.
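As an illustration of what training-free statistical testing on raw signal could look like, the following Python sketch compares per-position current levels between two samples and flags positions where the distributions differ. It rests on several assumptions not stated in the abstract: that a modified (native) sample is compared against an unmodified control, that a Mann-Whitney U test is a suitable test, and that signal has already been assigned to genomic positions. The function names, the `alpha` threshold, and the toy current values are all hypothetical, and no multiple-testing or continuity correction is applied.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using a tie-corrected normal approximation.

    Returns (U statistic for x, p-value). The choice of test is an assumption
    of this sketch, not something the abstract specifies.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign average ranks to runs of tied values.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u_x, 1.0  # all values identical: no evidence of a shift
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u_x, p


def flag_positions(native, control, alpha=0.01):
    """Return (position, p-value) pairs where the two samples differ.

    `native` and `control` map a genomic position to a list of per-read
    mean current levels (hypothetical input layout).
    """
    hits = []
    for pos in sorted(native.keys() & control.keys()):
        _, p = mann_whitney_u(native[pos], control[pos])
        if p < alpha:
            hits.append((pos, p))
    return hits


# Toy values for illustration only; these are not measurements.
native = {
    10: [80.1, 81.0, 79.5, 80.7, 81.3, 80.2, 79.9, 80.8],
    11: [70.1, 69.8, 70.5, 70.0, 69.9, 70.3, 70.2, 69.7],
}
control = {
    10: [75.0, 74.2, 75.9, 74.8, 75.3, 74.5, 75.6, 75.1],
    11: [70.0, 70.2, 69.9, 70.4, 69.8, 70.1, 70.3, 69.6],
}
hits = flag_positions(native, control)
print([pos for pos, _ in hits])
```

A rank-based test is used here because it makes no assumption about the shape of the current distribution; a real analysis would also need to correct for testing many positions at once.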