Fig. 1 Pore-C: a) laboratory workflow, b) multi-contact reads, c) overview of bioinformatics workflow, d) good concordance between Hi-C pairwise and Pore-C virtual pairwise datasets
Genomic DNA must be folded to fit inside the nucleus, yet must remain accessible for gene transcription, replication and repair. Control elements and their target genes are not always adjacent in the linear sequence, so folding is not random. Pore-C explores the folded state of the genome, which can tell us about genome function and regulation. Genomic DNA is first cross-linked to histones, preserving the spatial proximity of interacting loci. Restriction digestion followed by proximity ligation joins the cross-linked, interacting fragments, which are then sequenced (Fig. 1a). Nanopore reads span entire ligation products (concatemers), which can contain fragments from multiple interacting loci (Fig. 1b). Each segment of a read is aligned to the reference and assigned to a restriction fragment, as determined by in silico digestion of the reference sequence. The reference genome is then divided into equally sized bins, and each restriction fragment is assigned to its corresponding bin. Finally, the total number of bin-to-bin contacts is calculated from all reads and visualized in a contact map (Fig. 1c). When the Pore-C reads are simplified to a set of virtual pairwise contacts, the data are concordant with Hi-C contacts at the chromosome and territory level (Fig. 1d). We have released Pore-C sample preparation protocols for cells grown in culture, insects, nematode worms, mammalian blood and tissue, and plants.
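The bioinformatics steps described above (in silico digestion, fragment assignment, binning and contact counting) can be illustrated with a minimal Python sketch. This is not the released Pore-C pipeline. The function names, the toy reference sequence, the `CATG` recognition site, the use of a single alignment coordinate per read segment, and the choice to bin each fragment by its start coordinate are all assumptions made for illustration.

```python
import bisect
from collections import Counter
from itertools import combinations


def digest(seq, site):
    """In silico digest: return the start coordinate of each fragment.

    Assumption: a new fragment begins at the first base of every
    occurrence of the recognition site.
    """
    starts = [0]
    pos = seq.find(site, 1)
    while pos != -1:
        starts.append(pos)
        pos = seq.find(site, pos + 1)
    return starts


def fragment_of(starts, pos):
    """Index of the restriction fragment containing reference position pos."""
    return bisect.bisect_right(starts, pos) - 1


def virtual_pairwise_contacts(reads, starts, bin_size):
    """Count bin-to-bin contacts over all reads.

    Each read is a list of reference positions, one per aligned segment.
    A multi-contact read is decomposed into every pairwise combination
    of its segments (the "virtual pairwise" contacts).
    """
    counts = Counter()
    for segments in reads:
        bins = []
        for pos in segments:
            frag = fragment_of(starts, pos)
            # Assumption: a fragment belongs to the bin holding its start.
            bins.append(starts[frag] // bin_size)
        for a, b in combinations(bins, 2):
            counts[(min(a, b), max(a, b))] += 1
    return counts


# Toy reference with three CATG sites (hypothetical data).
reference = "AAAA" + "CATG" + "TTTTTT" + "CATG" + "CCCCCCCC" + "CATG" + "GGGG"
starts = digest(reference, "CATG")

# Two toy reads: one with three aligned segments, one with two.
reads = [[2, 20, 31], [10, 20]]
counts = virtual_pairwise_contacts(reads, starts, bin_size=10)

for (a, b), n in sorted(counts.items()):
    print(f"bin {a} - bin {b}: {n} contact(s)")
```

In this toy example the three-segment read contributes three virtual pairwise contacts and the two-segment read contributes one. The resulting counts are the entries of a symmetric contact matrix of the kind drawn as a contact map.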