Pore-C: using nanopore reads to delineate long-range interactions between genomic loci in the human genome


Date: 24th May 2019

Imielinski lab (NYGC) collaboration: probing the three-dimensional spatial organisation of chromatin in human cells using a combination of long-read sequencing and chromatin-conformation capture (3C)

Fig. 1 Pore-C laboratory workflow

Sequencing multiple concatenated fragments of interacting loci with Pore-C

Chromatin conformation capture (3C) is a method for investigating interactions between genomic loci that are not adjacent in the primary sequence. Genomic DNA is first cross-linked to histones using formaldehyde, which preserves the spatial proximity of interacting loci. Restriction digestion followed by proximity ligation then joins the cross-linked, interacting fragments. The ligated fragments may be size-selected and amplified by PCR before sequencing (Fig. 1). Long nanopore reads can span entire amplicons, which can contain fragments from multiple interacting loci. Because a single read can capture several loci at once, Pore-C can reveal functionally relevant contacts such as promoter-enhancer interactions.

Fig. 2 Overview of Pore-C bioinformatics workflow

Obtaining a genome-wide chromosomal contact map from Pore-C reads

The concatemeric Pore-C reads are first aligned to a reference sequence using BWA-SW to identify the separate alignments within each read. Each aligned read is then filtered to retain only the minimal set of alignments that covers the majority of the read. Following optimisation of the alignment path, each segment of the read is assigned to a restriction fragment, determined through in silico digestion of the reference sequence. The reference genome is then divided into equally sized bins and restriction fragments are assigned to their corresponding bin. Finally, the total number of bin-to-bin contacts is calculated from all reads and visualised in a contact map (Fig. 2).
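The fragment-assignment, binning and contact-counting steps can be illustrated with a minimal sketch. It is not the pipeline shown in Fig. 2: the function names (`digest`, `fragment_index`, `read_contacts`) are hypothetical, the cut is placed at the start of each recognition site, each aligned segment is assigned to a fragment by its midpoint, each fragment is assigned to a bin by its start coordinate, and every pair of fragments on a read counts as one contact. All of these are simplifying assumptions.

```python
import bisect
import re
from collections import Counter
from itertools import combinations


def digest(sequence, site):
    """Return sorted fragment start positions from an in silico digest.

    Assumption: the enzyme cuts at the first base of each recognition site.
    """
    cuts = [m.start() for m in re.finditer(re.escape(site), sequence)]
    return sorted({0, *cuts})


def fragment_index(starts, position):
    """Index of the restriction fragment that contains a 0-based position."""
    return bisect.bisect_right(starts, position) - 1


def read_contacts(alignments, frag_starts, bin_size):
    """Count bin-to-bin contacts contributed by a single read.

    alignments  : list of (chrom, start, end) segments from one read
    frag_starts : dict mapping chrom -> sorted fragment start positions
    bin_size    : width of the genomic bins in bases
    """
    # Assign each aligned segment to a fragment using its midpoint.
    fragments = set()
    for chrom, start, end in alignments:
        midpoint = (start + end) // 2
        fragments.add((chrom, fragment_index(frag_starts[chrom], midpoint)))

    # Every pair of distinct fragments on the read is one contact,
    # recorded against the pair of bins that hold the two fragments.
    contacts = Counter()
    for (chrom_a, frag_a), (chrom_b, frag_b) in combinations(sorted(fragments), 2):
        bin_a = (chrom_a, frag_starts[chrom_a][frag_a] // bin_size)
        bin_b = (chrom_b, frag_starts[chrom_b][frag_b] // bin_size)
        contacts[tuple(sorted((bin_a, bin_b)))] += 1
    return contacts
```

Summing the counters returned for every read gives genome-wide bin-to-bin totals. A read with n assigned fragments contributes n(n-1)/2 pairwise contacts under this scheme, so long concatemers carry more weight than short ones.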

Fig. 3 Contact maps of breast cancer cell line HCC1954 and standard cell line NA12878

Genome-wide interaction maps of breast cancer and lymphoblastoid cell lines

Using the bioinformatics pipeline outlined in Fig. 2, we constructed contact maps at 1 Mb resolution by mapping against human reference hg37, with each cell in the matrix representing the number of Pore-C reads that map to both of the corresponding genomic bins. For both cell lines the chromosome boundaries are clearly visible, indicating that most of the interactions are intra-chromosomal (Fig. 3a). In the cancer cell line (HCC1954, below the diagonal), amplifications of the underlying genome are visible as higher-density horizontal and vertical bands, while regions of higher intensity off the diagonal indicate either genomic rearrangement or changes in the spatial organisation of the chromatin associated with cancer (Fig. 3b).
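As a rough illustration of how such a matrix is read, the sketch below builds a symmetric contact matrix for a single sample and computes the share of contacts that are intra-chromosomal. The input format (a mapping from a pair of `(chromosome, bin)` keys to a count) and the function names are assumptions made for this example, and the numbers in the usage note are invented toy values, not data from Fig. 3.

```python
def to_matrix(contacts, bins):
    """Build a symmetric contact matrix as a list of lists.

    contacts : dict mapping ((chrom_a, bin_a), (chrom_b, bin_b)) -> count
    bins     : ordered list of (chrom, bin) keys defining rows and columns
    """
    index = {b: i for i, b in enumerate(bins)}
    matrix = [[0] * len(bins) for _ in bins]
    for (a, b), n in contacts.items():
        i, j = index[a], index[b]
        matrix[i][j] += n
        if i != j:
            # Mirror off-diagonal counts so the matrix is symmetric.
            matrix[j][i] += n
    return matrix


def intra_fraction(contacts):
    """Fraction of contacts whose two bins lie on the same chromosome."""
    total = sum(contacts.values())
    if total == 0:
        return 0.0
    intra = sum(n for ((ca, _), (cb, _)), n in contacts.items() if ca == cb)
    return intra / total
```

For example, with toy counts of 6 contacts between two chr1 bins, 2 within one chr2 bin and 2 between chr1 and chr2, `intra_fraction` returns 0.8. In a plotted map, a high intra-chromosomal fraction appears as dense square blocks along the diagonal, which is what makes the chromosome boundaries visible.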

Using genome-wide interaction maps to improve assembly contiguity

Pore-C reads from NA12878 were mapped against the contigs produced by de novo whole-genome assembly of the same sample. The Salsa2 tool uses the resulting contact density map to split, re-orient and join contigs into scaffolds that are consistent with the contact data. Fig. 4a shows the contact densities for the contigs (left) and scaffolds (right) of chromosome 8. Salsa2 determined that the two largest contigs for this chromosome were adjacent but in opposite orientations and could therefore be joined, creating a scaffold that spans ~90% of the whole chromosome. The assembled contigs and resulting scaffolds are shown in the dot plots, which were created using MUMmer (Fig. 4b).
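The intuition behind contact-guided orientation can be shown with a toy example. This is not Salsa2's algorithm or API: it assumes only that the two contig ends that are adjacent in the genome share more contacts than any other pair of ends, and the function name `best_join` and the input format are hypothetical.

```python
def best_join(end_contacts):
    """Choose orientations for placing contig A before contig B.

    end_contacts : dict mapping (end_of_A, end_of_B) -> contact count,
                   where each end is "head" or "tail"

    Toy rule: the pair of ends with the most contacts is taken to be the
    junction. With A placed first, the junction should be A's tail and
    B's head, so a contig is reversed when its other end is at the junction.
    """
    (end_a, end_b), _ = max(end_contacts.items(), key=lambda item: item[1])
    return {"reverse_a": end_a == "head", "reverse_b": end_b == "tail"}
```

If the tail of A and the tail of B share far more contacts than the other three combinations, the function returns `{"reverse_a": False, "reverse_b": True}`: the contigs are adjacent, but B must be reverse-complemented before joining. A real scaffolder must also normalise for contig length and coverage and resolve conflicts among many contigs.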
