Eoghan Harrington - Pore-C: a method for genome-wide, multi-contact chromosome conformation capture
London Calling 2019
The DNA within the nucleus of an interphase cell is organised into a complex hierarchy of folds and loops known as the 3D Genome. The development of various chromatin conformation capture methods has enabled the detection of the structures that define each level of this hierarchy e.g. chromosome territories, A/B compartments, topologically associated domains (TADs) and promoter-enhancer loops. This in turn has facilitated functional studies which have uncovered some of the mechanisms behind the formation and maintenance of these structures, as well as their effect on gene expression. However, most of these studies rely on methods that could only capture interactions between two points on the genome, and thus lacked the ability to resolve higher-order interactions. We will share our progress on Pore-C, a method to generate genome-wide, multi-contact chromatin conformation maps. We will also demonstrate how it can be used to improve whole genome assemblies and help resolve complex structural variants in cancer.
Eoghan introduced himself as a member of the Genomics Applications team, part of the Applications group at Oxford Nanopore; the group's main role is to find projects to showcase the various strengths of the nanopore platform.
Eoghan introduced the background of his talk, describing how chromatin has a tightly packed, 3D organisational structure within the cell. This spatial structure is important both architecturally and for the regulation of gene expression. Contacts are generally represented as a chromatin contact map: a 2D matrix of genomic positions in which loci that are close in 3D space show a high contact count. These maps either illustrate contacts within a single sample or compare contacts between two samples, allowing a closer comparison of the data. Because the maps are generally quite sparse at base-pair resolution, the genome is divided into bins, and the aggregate contact count in each pair of bins is taken as the measure.
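The binning step can be illustrated with a minimal sketch. This is not the Pore-C pipeline itself; the function name `bin_contacts`, the input layout, and the 100 kb bin size are assumptions chosen for illustration.

```python
from collections import Counter

def bin_contacts(contacts, bin_size):
    """Aggregate pairwise contacts into fixed-size genomic bins.

    contacts: iterable of ((chrom_a, pos_a), (chrom_b, pos_b)) pairs
              (hypothetical input layout, not a real file format).
    Returns a Counter keyed by an order-independent pair of (chrom, bin_index).
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in contacts:
        bin_a = (chrom_a, pos_a // bin_size)
        bin_b = (chrom_b, pos_b // bin_size)
        # Sort so that (a, b) and (b, a) fall into the same cell of the symmetric map.
        key = tuple(sorted((bin_a, bin_b)))
        counts[key] += 1
    return counts

# Toy data: two contacts between the same pair of chr1 bins, one inter-chromosomal contact.
example = [
    (("chr1", 1_200), ("chr1", 950_000)),
    (("chr1", 960_500), ("chr1", 4_000)),
    (("chr1", 10_000), ("chr2", 55_000)),
]
binned = bin_contacts(example, bin_size=100_000)
```

Larger bins give denser but coarser maps; smaller bins need more contacts before the counts become informative.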
A variety of methods are used to investigate chromatin interactions, such as 3C, 4C, 5C, and Hi-C, which reveal pairwise interactions. Pore-C also investigates chromatin interactions, but with the ability to capture multi-way interactions. The chromatin is first cross-linked to preserve interactions within the cell. It is then enzymatically digested and re-ligated (DNA fragments that are close together in 3D space tend to re-ligate), and the cross-links are reversed; the DNA is then purified and nanopore sequenced as a linear concatemer of interacting fragments. The bioinformatics pipeline, currently under development, involves read alignment to the reference genome (using BWA-SW), optimisation of the alignments, assignment of the alignments to the corresponding restriction fragments, and aggregation of the data into bins to create a contact map.
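As a rough sketch of the fragment-assignment step, the code below maps each alignment of a single read to a restriction fragment and expands the resulting multi-way contact into pairwise contacts. Assigning by alignment midpoint, the helper names, and the input layout are all assumptions made for illustration; the talk did not describe how the pipeline does this.

```python
from bisect import bisect_right
from itertools import combinations

def fragment_index(cut_sites, pos):
    """Index of the restriction fragment containing pos.

    cut_sites: sorted cut positions on one chromosome. Fragment 0 runs from
    the chromosome start to the first cut, fragment 1 to the second cut, etc.
    """
    return bisect_right(cut_sites, pos)

def read_to_contacts(alignments, cut_sites_by_chrom):
    """Turn one concatemer read into pairwise fragment contacts.

    alignments: list of (chrom, start, end) for the pieces of one read.
    Each piece is assigned to the fragment containing its midpoint
    (an assumed rule), and every pair of distinct fragments is emitted.
    """
    frags = []
    for chrom, start, end in alignments:
        mid = (start + end) // 2
        frag = (chrom, fragment_index(cut_sites_by_chrom[chrom], mid))
        if frag not in frags:  # collapse pieces that hit the same fragment
            frags.append(frag)
    return list(combinations(frags, 2))

# Toy cut sites and a read made of three aligned pieces.
cut_sites = {"chr1": [1000, 5000, 9000], "chr2": [2000]}
read = [("chr1", 100, 900), ("chr1", 5200, 8800), ("chr2", 2500, 4000)]
pairs = read_to_contacts(read, cut_sites)
```

A read touching n distinct fragments yields n(n-1)/2 pairwise contacts, which is why longer concatemers contribute disproportionately to a pairwise contact map.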
Eoghan described some of the advantages of Pore-C over other commonly used methods for investigating chromatin interactions: simpler sample prep than, for example, Hi-C; simultaneous readout of DNA modifications and contact probabilities, since direct nanopore sequencing avoids PCR; multi-way contacts, which allow higher-order interactions to be visualised, with greater resolution obtained from longer concatemers formed from shorter individual fragments; and the ability to sequence across repetitive regions with long nanopore reads. Eoghan stated that combining long concatemers with sequencing on the high-throughput PromethION platform gives high pairwise contact counts and greater resolution.
A comparison between the restriction enzymes DpnII and HindIII has been performed: DpnII generates 7 times more fragments than HindIII, but the majority are between 100 and 1,000 bp in length; HindIII generates far fewer fragments, but they are mostly between 1 and 10 kb, which makes them easier to map. Eoghan stated that HindIII has predominantly been used in Pore-C protocol development.
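The difference in fragment sizes follows from the recognition sites: DpnII recognises the 4-base site GATC, whereas HindIII recognises the 6-base site AAGCTT, which occurs less often. A minimal in-silico digest, run here on a toy sequence rather than a real genome, might look like the sketch below; the `digest` helper is hypothetical.

```python
# Recognition sequence and cut offset within the site:
# DpnII cuts before the G of GATC, HindIII between the two As of AAGCTT.
SITES = {"DpnII": ("GATC", 0), "HindIII": ("AAGCTT", 1)}

def digest(seq, enzyme):
    """Return the fragment lengths produced by cutting seq at every site."""
    site, offset = SITES[enzyme]
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + offset)
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments (a cut exactly at the sequence start or end).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

toy = "TTGATCAAAAGCTTCCGATCGG"
dpnii_fragments = digest(toy, "DpnII")
hindiii_fragments = digest(toy, "HindIII")
```

Applied per chromosome to a reference genome, the same approach gives the fragment-length distributions that were compared for the two enzymes.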
Pore-C has been tested on various cell types, with and without PCR or DNA fragment size selection, and under different cross-linking conditions. Eoghan stated that one of the most important factors considered during protocol development is the yield: the number of contacts per Gb of sequence data. With 1 Gb of sequence data, around 3,000 contacts can be identified. Another QC metric is the ratio of inter:intra-chromosomal contacts; 1.5 is considered a good ratio to have. Eoghan stated that native samples, which tend to have a good number of multi-way contacts, also tend to have a lot of inter-chromosomal contacts, which is something that needs to be addressed.
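Both QC metrics are simple to compute from a list of pairwise contacts. The sketch below assumes each contact is a pair of (chromosome, position) tuples and that the total number of sequenced bases is known; the function name and output keys are illustrative only.

```python
def qc_metrics(contacts, total_bases):
    """Compute contacts per Gb and the inter:intra-chromosomal contact ratio.

    contacts: list of ((chrom_a, pos_a), (chrom_b, pos_b)) pairs (assumed layout).
    total_bases: number of bases sequenced in the run.
    """
    inter = sum(1 for a, b in contacts if a[0] != b[0])
    intra = len(contacts) - inter
    return {
        "contacts_per_gb": len(contacts) / (total_bases / 1e9),
        # Guard against division by zero when no intra-chromosomal contacts exist.
        "inter_intra_ratio": inter / intra if intra else float("inf"),
    }

# Toy data: two intra-chromosomal contacts, one inter-chromosomal, from 2 Gb of sequence.
toy_contacts = [
    (("chr1", 100), ("chr1", 90_000)),
    (("chr2", 5_000), ("chr2", 7_500)),
    (("chr1", 300), ("chr3", 42_000)),
]
metrics = qc_metrics(toy_contacts, total_bases=2_000_000_000)
```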
Finally, Eoghan presented a chromatin contact map comparing Pore-C and Hi-C data for a breast cancer cell line (HCC1954); the map showed good correlation between the two methods and also revealed features such as the structural variations present. Quantitative comparisons have also been performed to compare the data to that from the reference sample.