A complex CTCF binding code defines TAD boundary structure and function

Topologically Associating Domains (TADs) compartmentalize vertebrate genomes into sub-megabase functional neighbourhoods for gene regulation, DNA replication, recombination and repair. TADs are formed by Cohesin-mediated loop extrusion, which compacts the DNA within the domain, followed by blocking of loop extrusion by the CTCF insulator protein at the domain boundaries. CTCF blocks loop extrusion in an orientation-dependent manner, with both experimental and in silico studies assuming that a single site of static CTCF binding is sufficient to create a stable TAD boundary. Here, we report that most TAD boundaries in mouse cells are modular entities where CTCF binding clusters within extended genomic intervals. Optimized ChIP-seq analysis reveals that this clustering of CTCF binding occurs not only among peaks but also frequently within individual peaks. Using a newly developed multi-contact Nano-C assay, we confirm that individual CTCF binding sites additively contribute to TAD separation. This clustering of CTCF binding may counteract the dynamic DNA-binding kinetics of CTCF, which calls for a re-evaluation of current models for the blocking of loop extrusion. Our work thus reveals an unexpectedly complex code of CTCF binding at TAD boundaries that expands the regulatory potential for TAD structure and function and can help to explain how distant non-coding structural variation influences gene regulation, DNA replication, recombination and repair.

Authors: Li-Hsin Chang, Sourav Ghosh, Andrea Papale, Mélanie Miranda, Vincent Piras, Jéril Degrouard, Mallory Poncelet, Nathan Lecouvreur, Sébastien Bloyer, Amélie Leforestier, David Holcman, Daan Noordermeer