Fig. 3 Partitioning bacterial reads using methylation patterns. a) Overview of experimental set-up. b) Bioinformatics workflow. c) Hexbin plot showing partitioned reads. d) Verification of results.
Sequence similarity between strains in microbial communities can present challenges for analysis. Fortunately, although bacterial methylation occurs only at specific sequence motifs, the motifs themselves are highly diverse, even among members of the same species. These methylation patterns can be detected in native nanopore reads and used to bin reads by strain. We co-cultured a wild-type K12 E. coli strain and a mutant strain lacking the Dcm and Dam methyltransferases (5'-CCWGG-3' and 5'-GATC-3' motifs respectively, Fig. 3a). After aligning all reads to a K12 reference, we used Tombo to characterise methylation status at each motif site and then calculated the median methylation score across these sites for each read (Fig. 3b). The hexbin plot shows that reads divide into two groups based solely on the read-level methylation assessments at the two motifs. In addition, each strain had been transformed with a different plasmid, and the reads from each plasmid segregated with the expected genome (Fig. 3c). De novo assembly of reads from the unmethylated cluster gave an assembly in which much of the Dam methyltransferase gene was deleted, explaining the lack of Dam methylation (Fig. 3d). The Dcm methyltransferase gene had also been inactivated by a point mutation (Fig. 3e). Next, we used the methylation binning tool Nanodisco to group genomic DNA and plasmids from a four-species mock community, which included three strains with high nucleotide similarity but different methylation motifs. We used a de novo approach to discover and cluster significant motifs (Fig. 3f). Plasmids and genomic contigs from the same host clustered together. Lastly, we calculated the difference in methylation between each plasmid and each host genome across known motifs (Fig. 3g). Each plasmid was most similar to its correct host genome.
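The two binning ideas described above can be illustrated with a minimal Python sketch. It is not the Tombo or Nanodisco implementation and does not call either tool. It assumes per-site methylation scores have already been extracted for each read and scaled so that higher values mean "more likely methylated". The function names, the input dictionaries, the 0.5 threshold, and the use of mean absolute difference as the plasmid-to-host distance are all hypothetical choices made for illustration.

```python
from statistics import median

def read_medians(site_scores):
    """Median score per motif for each read.

    site_scores: {read_id: {"dam": [scores...], "dcm": [scores...]}}
    (hypothetical layout; scores assumed to lie in [0, 1]).
    """
    medians = {}
    for read_id, by_motif in site_scores.items():
        dam = by_motif.get("dam", [])
        dcm = by_motif.get("dcm", [])
        if dam and dcm:  # skip reads that do not cover both motifs
            medians[read_id] = (median(dam), median(dcm))
    return medians

def bin_reads(medians, threshold=0.5):
    """Split reads by whether both medians fall above or below a threshold.

    The threshold is an assumed value, not one taken from the text.
    """
    bins = {"methylated": [], "unmethylated": [], "ambiguous": []}
    for read_id, (dam, dcm) in medians.items():
        if dam >= threshold and dcm >= threshold:
            bins["methylated"].append(read_id)
        elif dam < threshold and dcm < threshold:
            bins["unmethylated"].append(read_id)
        else:
            bins["ambiguous"].append(read_id)
    return bins

def closest_host(plasmid_profile, host_profiles):
    """Name of the host whose per-motif methylation is nearest the plasmid's.

    Profiles map motif -> methylated fraction. Distance is the mean absolute
    difference over shared motifs (an assumed metric).
    """
    def distance(host):
        shared = plasmid_profile.keys() & host.keys()
        if not shared:
            return float("inf")
        return sum(abs(plasmid_profile[m] - host[m]) for m in shared) / len(shared)

    return min(host_profiles, key=lambda name: distance(host_profiles[name]))

# Toy data, invented purely to exercise the functions.
scores = {
    "r1": {"dam": [0.9, 0.8, 0.95], "dcm": [0.7, 0.9]},
    "r2": {"dam": [0.1, 0.05], "dcm": [0.2, 0.1, 0.0]},
    "r3": {"dam": [0.9], "dcm": [0.1]},
    "r4": {"dam": [0.9], "dcm": []},
}
bins = bin_reads(read_medians(scores))
host = closest_host(
    {"GATC": 0.9, "CCWGG": 0.1},
    {"A": {"GATC": 0.95, "CCWGG": 0.05}, "B": {"GATC": 0.1, "CCWGG": 0.9}},
)
```

Taking the median over all motif sites in a read damps the effect of individual miscalled sites, which is why a per-read summary can separate strains even when single-site calls are noisy. In practice the cluster boundary would be chosen from the observed score distribution rather than fixed in advance.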