Understanding the drivers of oncogenesis

Comparing the genomes of tumour and paired normal tissue enables the identification of somatic variants that can act as driver mutations in cancer progression, and helps identify germline variants that reveal hereditary cancer risk. In the Cancer research breakout session at London Calling 2023, we heard how scientists at the Michael Smith Genome Sciences Centre in Canada are using nanopore sequencing to gain comprehensive insights into how genomic and transcriptomic alterations affect oncogenesis.

In his talk, The potential of nanopore sequencing for personalised oncogenomics, Kieran O’Neill introduced the Personalized OncoGenomics (POG) Program research trial, which has been running for over 12 years, and has involved the sequencing of over 1,000 genomes and transcriptomes from clinical research samples from participants with advanced cancer. Motivated by the wide range of tumour-specific variation that can be captured within a single sequencing assay using nanopore sequencing, including large, complex structural variants (SVs) and epigenetic modifications, Kieran highlighted how they were ‘very interested in the potential for nanopore sequencing to improve upon what we’re getting’.

Genomic variants can be difficult to resolve using traditional short-read sequencing technology because short reads cannot span large SVs, which can reach megabase scale. Furthermore, short fragments of DNA must undergo PCR, which erases epigenetic modifications and introduces bias, limiting detection to regions amenable to PCR. Long, PCR-free nanopore reads can span SVs end to end and directly detect methylation in the same sequencing run, with no additional library prep. With this in mind, Kieran explained how they used nanopore sequencing to resolve complex SVs, detect methylation, and perform phasing analyses in tumour-normal (T-N) research samples. Using the PromethION 48 sequencing device, the group generated 70 Gb of data per PromethION Flow Cell, totalling 17 Tb sequenced, with ‘very good’ read length N50s of ~30 kb.
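For readers less familiar with the read length N50 metric quoted here, the following is a minimal Python sketch of how it is commonly computed: the N50 is the read length at which reads of that length or longer contain at least half of all sequenced bases. The function name and the toy read lengths are illustrative assumptions, not the group's pipeline or data. The last line is simple arithmetic on the figures quoted above, assuming 70 Gb is taken as a typical yield per flow cell.

```python
def read_length_n50(lengths):
    """Return the N50 of a collection of read lengths (in bases).

    N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("need at least one read length")
    ordered = sorted(lengths, reverse=True)  # longest reads first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy read lengths in bases (illustrative values, not data from the talk).
toy_lengths = [50_000, 40_000, 30_000, 20_000, 10_000, 5_000, 5_000]
n50 = read_length_n50(toy_lengths)

# Rough flow cell count implied by the quoted totals:
# 17 Tb expressed in Gb, divided by an assumed 70 Gb per flow cell.
flow_cells = 17_000 / 70

print(f"toy N50: {n50} bases")
print(f"approximate flow cells: {flow_cells:.0f}")
```

Because the N50 is weighted by bases rather than by read count, a handful of very long reads can raise it substantially, which is why it is usually reported alongside total yield.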

‘we have recalled all known clinically relevant fusions and detected structural variants not identified from short read data’

Describing a complex SV affecting the SMG1 gene (Figure 1), which is suggested to have roles in responding to DNA and RNA damage and cellular stress, Kieran highlighted how ‘it was completely invisible to [short-read sequencing]’, even at read depths of 184x for tumour and 44x for matched normal research samples. Using nanopore sequencing, the team were able to resolve the SV, including its multiple breakpoints, at 38x depth for tumour and 28x for matched normal research samples.

Figure 1. A complex structural variant in the SMG1 gene on chromosome 16 was resolved using long nanopore sequencing reads but was not detected using short-read sequencing.

Since nanopore sequencing technology enables the direct sequencing of native DNA, it is possible to directly detect epigenetic variants in the same sequencing run as genomic variants. Aberrant DNA methylation can cause inappropriate silencing of tumour suppressor genes (TSGs) or expression of oncogenes in tumours. Using nanopore sequencing, Kieran and his group confirmed previous reports that tumours display global hypomethylation relative to normal tissue, and went on to distinguish the tumour's tissue of origin based on methylation patterns. Many POG research samples are metastatic, which makes it difficult to determine tissue of origin. Studying enhancer regions and CTCF-binding sites, Kieran and his group showed that methylation clustered by tumour type and not tissue type for the cancer research samples; Kieran explained that they were ‘excited about the possibility of using this as an orthogonal approach for detecting tumour tissue of origin’.

Using such long reads, the team were also able to assign methylation according to haplotype. Assigning a variant to the maternal or paternal chromosome can reveal haplotype regions with uneven distribution of cancer-associated mutations. Highlighting that ‘phasing is the other really quite exciting potential of using nanopore’, Kieran explained that a read length N50 of >20 kb is needed to achieve megabase-sized phasing blocks. Kieran compared nanopore technology to ‘the other long-read sequencing technology which does tend to cap out at about 20 kb’ and shared how excited he was that nanopore sequencing could capture entire gene bodies in one phase block.

‘Even some of these enormous genes, which are megabases in size, we can capture them 25–30% of the time — the entire gene body — which is quite impressive’
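To make the idea of capturing an entire gene body in one phase block concrete, here is a small Python sketch that checks whether a gene interval lies wholly inside a single phase block. The function name, the coordinates, and the gene labels are hypothetical and chosen only for illustration; they are not taken from the talk, and real phase blocks would come from a phasing tool's output.

```python
from bisect import bisect_right


def gene_in_single_block(gene, blocks):
    """Return True if the gene lies wholly inside one phase block.

    gene   -- (start, end) coordinates, inclusive
    blocks -- (start, end) phase blocks on the same chromosome,
              sorted by start and non-overlapping
    """
    starts = [start for start, _ in blocks]
    # Index of the last block that starts at or before the gene start.
    i = bisect_right(starts, gene[0]) - 1
    if i < 0:
        return False  # gene begins before the first phase block
    return blocks[i][1] >= gene[1]


# Hypothetical phase blocks and gene intervals (toy coordinates).
blocks = [(0, 1_200_000), (1_200_001, 1_500_000), (1_600_000, 4_000_000)]
genes = {
    "geneA": (100_000, 900_000),      # inside the first block
    "geneB": (1_100_000, 1_300_000),  # straddles a block boundary
    "geneC": (1_700_000, 3_900_000),  # a megabase-scale gene in one block
}

captured = {name: gene_in_single_block(iv, blocks) for name, iv in genes.items()}
fraction = sum(captured.values()) / len(captured)
print(captured, f"fraction fully phased: {fraction:.2f}")
```

Repeating this check across many samples is one way a "captured X% of the time" figure for a given gene could be tallied, although the source does not describe how the quoted percentages were derived.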

With phase block lengths well into the megabases (median 1.27 Mb), the group were able to dig deeper into the two-hit hypothesis, which states that, in order to drive oncogenesis, a TSG must be inactivated on both alleles. Explaining that they usually assume a TSG has two inactivating mutations acting in trans, Kieran highlighted that they ‘can find out for sure’ using nanopore sequencing. Phasing analysis of somatic variants suggested that the ‘majority of multi-hit TSGs are biallelic’. Kieran went on to describe a case where two somatic variants in the PTEN gene were acting in cis to each other (Figure 2), before showing how phasing enabled them to determine that allelic expression was largely explained by imbalanced copy number and allele-specific methylation.

Figure 2. Phasing somatic variants in the PTEN gene using long nanopore sequencing reads.
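The cis versus trans distinction can be illustrated with a deliberately simplified Python sketch. It assumes each long read spanning both variant sites has already been reduced to a pair of booleans (does it carry the alternate allele at site A, and at site B). The function name, thresholds, and toy reads are hypothetical; this is not the group's method, and it ignores sequencing error, tumour purity, copy number, and subclonality, all of which a real analysis would need to handle.

```python
from collections import Counter


def classify_variant_pair(read_calls, min_reads=3, min_fraction=0.8):
    """Classify two variants as 'cis', 'trans' or 'undetermined'.

    read_calls -- iterable of (has_alt_a, has_alt_b) booleans,
                  one tuple per read spanning both variant sites
    """
    counts = Counter(read_calls)
    # Both alternate alleles on the same molecule supports cis.
    cis_support = counts[(True, True)]
    # Exactly one alternate allele per molecule supports trans.
    trans_support = counts[(True, False)] + counts[(False, True)]
    informative = cis_support + trans_support
    if informative < min_reads:
        return "undetermined"  # too few reads carry either variant
    if cis_support / informative >= min_fraction:
        return "cis"
    if trans_support / informative >= min_fraction:
        return "trans"
    return "undetermined"


# Toy reads (illustrative only): mostly both-alt or neither-alt reads.
cis_reads = [(True, True)] * 6 + [(False, False)] * 5 + [(True, False)]
# Toy reads where each molecule carries only one of the two variants.
trans_reads = [(True, False)] * 4 + [(False, True)] * 5 + [(False, False)] * 2

cis_call = classify_variant_pair(cis_reads)
trans_call = classify_variant_pair(trans_reads)
print(cis_call, trans_call)
```

Under the two-hit hypothesis, a "trans" call means the two variants hit different alleles, whereas a "cis" call means one allele carries both and the other may still be intact unless it is inactivated by another mechanism.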

Concluding his presentation, Kieran noted that the group will be releasing their data via the European Genome-phenome Archive as a resource to help other researchers developing data analysis software for tumour characterisation.

1. O'Neill, K. The potential of nanopore sequencing for personalised oncogenomics. Presentation. Available at: https://nanoporetech.com/resource-centre/london-calling-2023-potential-nanopore-sequencing-personalised-oncogenomics [Accessed: 11 September 2023]