Profiling the transposable element epigenome with nanopore sequencing

Seth (Mater Research Institute, Australia) explained that repetitive elements, generally derived from retrotransposons, comprise a large proportion of the human genome. The largest class of these is the LINEs. LINE-1 (L1) is the only human retrotransposon that is still autonomously active: it encodes the proteins that allow it to mobilise via a ‘copy and paste’ mechanism, generating 6 kb insertions in the embryo, the brain, and in cancer – the focus of Seth’s talk.

L1 insertions can disrupt tumour suppressor genes; for example, loss of a second APC allele by L1 insertional mutagenesis can initiate colorectal cancer. Even though there are hundreds of thousands of copies of L1 in the genome, very few of them are active, and their activity is highly heterogeneous across tissues. Until recently, most research treated L1 elements en masse, averaging analyses across all copies without looking into locus-specific effects. Seth’s team wanted to address the question of whether epigenetic heterogeneity explains why some L1 elements are hyperactive in cancer.

To do this, Seth’s team profiled L1 element methylation with nanopore sequencing. Nanopore sequencing has ‘distinct advantages for looking at L1 insertions’. Firstly, short reads tend to map ambiguously to the genome, as the internal L1 sequences are highly repetitive; in comparison, long nanopore reads tend to be able to resolve the insertion completely. Secondly, nanopore technology enables direct detection of methylation. For their project, they used tissue derived from brain, heart, and liver (representing the three different germ layers) from one individual; and from a second individual, they used hepatocellular carcinoma tissue and adjacent tissue samples. Whole-genome nanopore sequencing was performed on the extracted DNA with the PromethION platform, and transposons were called using the TLDR (Transposons from Long DNA Reads) tool. TLDR was designed by Dr Adam Ewing (University of Queensland), and both identifies transposable elements and profiles their methylation status.

From their analysis of normal liver vs. hepatocellular carcinoma tissue, they found that L1s were demethylated across the genome in cancer. Seth highlighted that a locus on chromosome 22, which is known to be the most active L1 copy in normal tissue, seemed to change its methylation status the least in cancer. They wanted to know what was special about this particular L1 copy. They discovered that it was already demethylated in normal liver, but not in heart or hippocampus, suggesting this demethylation was restricted to endoderm tissue. This copy could therefore be a candidate for initiating carcinogenesis, by transposing in pre-cancerous tissue.
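The locus-by-locus comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the TLDR workflow: it assumes per-site CpG calls have already been reduced to hypothetical `(locus, is_methylated)` pairs, and the locus names and values are invented rather than taken from the talk.

```python
def locus_methylation(calls):
    """Methylated fraction per locus from (locus, is_methylated) CpG calls."""
    by_locus = {}
    for locus, is_methylated in calls:
        by_locus.setdefault(locus, []).append(1 if is_methylated else 0)
    return {locus: sum(vals) / len(vals) for locus, vals in by_locus.items()}


def methylation_change(normal_calls, tumour_calls):
    """Tumour minus normal methylated fraction for loci seen in both samples."""
    normal = locus_methylation(normal_calls)
    tumour = locus_methylation(tumour_calls)
    shared = normal.keys() & tumour.keys()
    return {locus: tumour[locus] - normal[locus] for locus in shared}


def least_changed(changes):
    """Locus whose methylation differs least between the two samples."""
    return min(changes, key=lambda locus: abs(changes[locus]))


# Invented toy calls: locus_A is methylated in normal tissue and loses
# methylation in tumour; locus_B is already largely unmethylated in normal.
normal_calls = (
    [("locus_A", True)] * 3 + [("locus_A", False)]
    + [("locus_B", True)] + [("locus_B", False)] * 3
)
tumour_calls = (
    [("locus_A", True)] + [("locus_A", False)] * 3
    + [("locus_B", True)] + [("locus_B", False)] * 3
)

changes = methylation_change(normal_calls, tumour_calls)
stable_locus = least_changed(changes)
```

In this toy input, a locus that is already unmethylated in normal tissue shows the smallest change, which mirrors the reasoning given for the chromosome 22 copy.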

‘With the long reads we can do really cool things, such as look at allele specific methylation’ of L1 copies, Seth said, to reveal how the different alleles behave. Reads can also be mapped back to the reference to identify non-reference insertions, and then their methylation status can be determined.
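As a rough illustration of allele-specific methylation, the sketch below pools CpG calls from reads that have already been assigned to a haplotype. The input layout, a hypothetical list of `(haplotype, cpg_calls)` pairs, is an assumption made for this example, and the reads are invented; phasing itself is not shown.

```python
from collections import defaultdict


def allele_methylation(reads):
    """Methylated CpG fraction per haplotype.

    `reads` is an iterable of (haplotype, cpg_calls) pairs, where cpg_calls
    is a list of booleans, one per CpG site covered by that read.
    """
    methylated = defaultdict(int)
    total = defaultdict(int)
    for haplotype, cpg_calls in reads:
        methylated[haplotype] += sum(cpg_calls)
        total[haplotype] += len(cpg_calls)
    # Skip haplotypes with no covered CpG sites to avoid dividing by zero.
    return {hap: methylated[hap] / total[hap] for hap in total if total[hap] > 0}


# Invented toy reads covering one L1 copy, two reads per haplotype.
toy_reads = [
    (1, [True, True, True, False]),
    (1, [True, True, True, True]),
    (2, [False, False, True, False]),
    (2, [False, False, False, False]),
]

per_allele = allele_methylation(toy_reads)
```

Because each long read carries both its haplotype assignment and its methylation calls, the two alleles of a single L1 copy can be summarised separately instead of being averaged together.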

Seth next discussed how TLDR could identify functional tumour-specific insertions, providing an example of two such instances in which insertions had occurred within genes: in the 3’UTR of one gene and in an intron of the other. Both of those insertions were found to impact gene expression. Seth emphasised how this highlights the impact that these L1 insertions can have in cancer.

Seth discussed his team’s current work using NanoNOME (from the Timp lab), a method for investigating chromatin accessibility. In this technique, a methylase methylates GpC motifs in nucleosome-free accessible regions; detecting GpC methylation by sequencing then allows chromatin accessibility to be inferred from methylation status. This allows them to look at accessibility in the long nanopore reads, and thereby the activation of the L1 locus. Seth displayed example data from two cell lines, demonstrating how NanoNOME-seq enabled identification of individual L1 copies, their methylation status, and thereby their accessibility and activation. Looking genome-wide, L1 methylation status and extent of chromatin accessibility are correlated. This suggests that DNA methylation is probably the key determinant of whether or not an L1 is active.
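The logic of reading accessibility from GpC methylation, and then comparing it with endogenous CpG methylation across L1 copies, might be sketched as follows. This is a simplified sketch under stated assumptions: calls are assumed to be already separated into CpG and GpC contexts, the function names and the `toy_loci` values are hypothetical and invented for illustration, and a plain Pearson coefficient stands in for whatever statistic the team actually used.

```python
from math import sqrt


def fraction(calls):
    """Fraction of True values in a list of per-site methylation calls."""
    return sum(calls) / len(calls)


def summarise_locus(cpg_calls, gpc_calls):
    """Return (CpG methylation, GpC-based accessibility) for one L1 copy."""
    return fraction(cpg_calls), fraction(gpc_calls)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Invented toy data: (CpG calls, GpC calls) per hypothetical L1 copy.
toy_loci = {
    "copy_1": ([True, True, True, True], [False, False, False, True]),
    "copy_2": ([True, True, False, False], [True, False, True, False]),
    "copy_3": ([False, False, False, True], [True, True, True, False]),
}

summaries = {
    name: summarise_locus(cpg, gpc) for name, (cpg, gpc) in toy_loci.items()
}
methylation = [m for m, _ in summaries.values()]
accessibility = [a for _, a in summaries.values()]
r = pearson(methylation, accessibility)
```

The toy values are constructed so that heavily methylated copies have few methylated GpC sites, so `r` comes out negative here; the sign and strength in real data depend on the samples analysed.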

Seth stated that a potential future direction of this work is to identify whether there is a selective pressure for L1 demethylation in cancer – the whole genome does not get demethylated in cancer; rather, L1 itself is much more profoundly demethylated than the surrounding genome, and more so than other families of transposons. They have found that L1 demethylation can activate expression of proto-oncogenes, such as MET.

Concluding the talk, Seth stated that analysis of nanopore whole-genome sequencing data using TLDR enables comprehensive profiling of transposable element insertions and their epigenetic states. From this, they have found that L1s are globally demethylated in cancer. Using NanoNOME-seq, they have been able to identify active L1s. Nanopore sequencing therefore helps to elucidate the roles of transposable elements in human biology and cancer.

Authors: Seth Cheetham