Epigenetics, Cas9-mediated enrichment and novel insights into transcriptome variation: catch up on talks from ASHG 2020

This week we heard from Ariel Gershman, Shruti Iyer and Tuuli Lappalainen on their latest research addressing some of the key challenges in human genomics using nanopore technology. The talks are now available to watch on-demand:

Ariel Gershman, Johns Hopkins University – Into the unknown: Epigenetics of repetitive DNA

Ariel discussed her research into profiling CpG methylation in the human genome, specifically in repetitive regions that are challenging to access. CpG methylation can be detected from nanopore sequencing data with ‘extremely high accuracy’. Compared with other methylation-calling methods, such as bisulphite sequencing, nanopore sequencing is PCR-free, which removes GC bias from modification detection, while its longer reads increase the mappability of the data and reveal long-range epigenetic patterns.

Ariel investigated methylation patterns in the haploid human genome CHM13, assembled to high contiguity by the Telomere-to-Telomere (T2T) consortium. She discovered distinctive and consistent methylation patterns in the centromeric higher-order repeats (HORs) of every centromere analysed, and revealed the role of these patterns in kinetochore attachment. With long, native nanopore reads, the repetitive DXZ4 allele could be phased to the active and inactive X chromosomes based on methylation state alone.

Concluding her talk, Ariel explained that it was the long nanopore reads that enabled her to probe the methylation states of these large repetitive arrays. In future, she plans to perform phased epigenetics on diploid human genome assemblies, and to analyse other epigenetic regulatory events using exogenous DNA labelling and long nanopore reads.

Shruti Iyer, Cold Spring Harbor Laboratory / Stony Brook University – An affinity-based Cas9-mediated enrichment method using nanopore sequencing

Shruti explained that whole-genome sequencing has revealed substantial heterogeneity in the cancer genome, but that detecting and analysing the complex genomes of tumour subpopulations is very difficult with this approach. Targeted short-read sequencing methods have been used to increase coverage over genomic features of interest, but this technology has an inherent ‘blind spot’ for structural variants (SVs), which play an important role in cancer. To analyse SVs, single reads that span the whole variant are needed, as these reduce the mapping errors that the variants themselves cause, ‘and that’s where long-read sequencing really comes in’.

Shruti is interested in applying targeted nanopore sequencing to cancer genomics, and she and her team opted for the Cas9-mediated PCR-free enrichment method. The goal with this method, she explained, was to cover the entire target region with a single read, and as it was amplification-free, they could also call methylation in parallel.

Having obtained a record 198 kb read targeting the BRCA1 region with this approach, she explained how her team explored ways to increase the depth of coverage at the target, developing the Affinity-Based Cas9-Mediated Enrichment (ACME) method. By reducing background, this method doubled the sequencing depth of their targets, with single-read target sizes of up to 100 kb. Importantly, when the approach was applied to SV detection in cancer genes, ACME called all of the SVs that were detected in parallel by long-read whole-genome sequencing. To achieve the method’s ‘full sequencing potential’, Shruti next plans to multiplex samples and to scale down the input material; in the long term, she will apply this approach to a panel of genes taken from existing cancer diagnostic panels, for testing in organoids and tumour samples.

Tuuli Lappalainen, New York Genome Center – Long-read sequencing of human tissues provides novel insights into transcriptome variation

Tuuli noted that much of our understanding of the transcriptome has come from short-read RNA sequencing, but that this technology does not directly measure the real biological units of the transcriptome: transcripts. The Genotype-Tissue Expression (GTEx) project, the largest project to date investigating tissue-specific gene expression, has primarily been based on short-read data. Because short reads have limited ability to resolve individual transcripts, allele-specific transcript structures could not be identified. Tuuli explained that long-read technology is now mature enough for the affordable production of large-scale datasets, enabling comprehensive investigation of the structure and regulation of the transcriptome. To this end, her team analysed 80 GTEx transcriptomes across multiple tissues, using nanopore technology.

Tuuli’s team obtained more than 10,000 high-quality novel transcripts, which showed higher-resolution tissue-specific clustering than already-annotated transcripts, confirming that true biological variation had been captured in the transcript-level data. Long reads were essential for the analysis of allele-specific transcript structures (ASTS), an important analysis, Tuuli explained, as each of the two haplotypes can have different effects. Their novel software, Long Read Allele Specific analysis (LoRALs), was used to assign transcript reads to haplotypes. They found 1,264 unique genes displaying significant allele-specific expression (ASE) and 152 displaying ASTS, and the two often overlapped for a given gene.
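For readers unfamiliar with the idea of allele-specific expression, the sketch below shows one common textbook way to test for it once reads have been assigned to haplotypes: an exact two-sided binomial test of the two haplotype read counts against an even 50:50 split. This is an illustration only and is not the method implemented in LoRALs or used in the talk; the function names, the gene labels and the read counts are all hypothetical, and a real analysis would also need multiple-testing correction and handling of reference bias, which are omitted here.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is at least as unlikely
    as the observed count k out of n. Suited to modest read counts only;
    very large n would underflow in floating point.
    """
    if n == 0:
        return 1.0
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so that ties are counted despite rounding.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


def allelic_imbalance(hap_counts, alpha=0.05):
    """Return {gene: (p_value, is_significant)} for haplotype read counts.

    hap_counts maps a gene label to (reads on haplotype 1, reads on haplotype 2).
    No multiple-testing correction is applied (an assumption of this sketch).
    """
    results = {}
    for gene, (h1, h2) in hap_counts.items():
        p_value = binomial_two_sided_p(h1, h1 + h2)
        results[gene] = (p_value, p_value < alpha)
    return results


# Hypothetical counts: GENE_A is balanced, GENE_B is strongly skewed.
example_counts = {"GENE_A": (48, 52), "GENE_B": (90, 10)}
example_results = allelic_imbalance(example_counts)
```

With long reads, the same per-haplotype counting can in principle be done per transcript rather than per gene, which is what makes allele-specific transcript structure analysis possible.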

Presenting the largest human long-read cDNA sequencing dataset to date, Tuuli summarised how long reads have helped to resolve the effects of genetic variants on expression, splicing, and transcript structure.

Find out more about human genomics and nanopore sequencing.

Read the Oxford Nanopore announcement from ASHG 2020 to find out about the latest releases.