NCM 2021: Resolution of complex human papillomavirus and human sequences

Nicole Rossi (National Cancer Institute, USA) began the presentation by discussing the huge burden of human papillomavirus (HPV), which causes 300,000 deaths worldwide per year. To study HPV and cervical cancer in low- and middle-income countries, Nicole and her colleagues established a cohort in Guatemala and recruited over 700 women, collecting blood, tumour tissue, and clinical data. HPV has a 7.9 kb genome and replicates in the nucleus as an episome by hijacking host cell machinery. The HPV genome encodes two oncogenes, E6 and E7, the products of which inhibit the human p53 and RB1 proteins, respectively, and contribute to chromosomal instability. Integration into the human genome almost always leads to deletion of E1 and E2.

Nicole explained that HPV integration isn’t always as simple as the HPV genome being flanked by human sequences; it can be quite complex. For instance, the integration process often generates multiple copies of the HPV genome and flanking DNA, which short-read sequencing technologies cannot resolve. A proposed mechanism for such complex integration events involves amplification of the HPV genome and flanking human DNA, resulting in a looping structure. To investigate these events, Nicole performed whole-genome sequencing using long nanopore reads. Adaptive sampling was also used to target specific genes within the human and HPV genomes.

The CaSki cervical cancer cell line is known to carry 600–800 HPV genomes in complex arrays containing both the full-length HPV16 genome and multiple copies of the truncated genome. These structures are integrated at 30–50 locations in the tumour cell line. Since the same HPV genomes are integrated in multiple chromosomes, they must have originated before integration occurred; Nicole and her team have coined the term ‘superspreading’ for this phenomenon. Using long nanopore reads, Nicole was able to identify reads with human DNA on one end and HPV genomes on the other. Some of the HPV-only reads were up to 160 kb in length.
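The read categories mentioned above (HPV-only reads, and reads with human DNA on one end and HPV on the other) can be illustrated with a minimal Python sketch. It is not the team's pipeline: it assumes the reads have already been aligned to a combined human and HPV reference, that each aligned segment is available as a (read ID, contig name) pair, and that the viral contig is called "HPV16", a hypothetical name chosen for illustration.

```python
from collections import defaultdict

# Hypothetical contig name for the viral reference; not taken from the talk.
HPV_CONTIG = "HPV16"


def classify_reads(alignments, hpv_contig=HPV_CONTIG):
    """Label each read as 'hpv_only', 'human_only' or 'chimeric'.

    alignments: iterable of (read_id, contig) pairs, one pair per aligned
    segment of a read. Any contig other than hpv_contig is treated as human.
    """
    contigs_by_read = defaultdict(set)
    for read_id, contig in alignments:
        contigs_by_read[read_id].add(contig)

    classes = {}
    for read_id, contigs in contigs_by_read.items():
        has_hpv = hpv_contig in contigs
        has_human = any(c != hpv_contig for c in contigs)
        if has_hpv and has_human:
            classes[read_id] = "chimeric"  # spans a human-HPV junction
        elif has_hpv:
            classes[read_id] = "hpv_only"
        else:
            classes[read_id] = "human_only"
    return classes


example = [("r1", "chr11"), ("r1", "HPV16"), ("r2", "HPV16"), ("r3", "chr2")]
print(classify_reads(example))
```

In practice the (read ID, contig) pairs would come from an aligner's output, and a real analysis would also need to consider alignment quality and segment order along the read, which this sketch ignores.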

Next, Nicole discussed a potential model for HPV superspreading. The phenotype is thought to arise when a normal 7.9 kb episome undergoes abnormal replication to generate a larger episome containing many complete and truncated HPV genomes. Subsequently, these HPV concatemers are inserted at multiple loci in the human genome. Similar observations were made in the head and neck cancer cell line SCC152, in which complete and truncated HPV genomes are randomly arranged as integrated multimers.

One of the unique properties of HPV16 is that not every tumour has an integrated virus, with about one third of cervical tumours exhibiting integrated HPV16 genomes. This raises the question of how HPV16 causes cancer without integration. To address this, Nicole and her team characterised a unique cell line, SNU-1000, which carries both episomal and integrated HPV16. Whole-genome sequencing (WGS) of SNU-1000 revealed a 150 bp HPV fragment inserted into an intron of the CEP126 gene on chromosome 11. The fragment consists of a small portion of the E7 gene and cannot encode a functional protein. Interestingly, the integration did result in amplification of the YAP1 oncogene; YAP1 can cause cervical cancer by interfering with the immune response. HPV16, in the absence of integration, was shown to cause cervical cancer in mice by increasing expression of YAP1.

Nicole showed SNU-1000 HPV reads from nanopore WGS, and the alignments revealed some large insertions within the episomal reads. Some of these insertions are multiples of the 7.9 kb episome, meaning that these episomes contain more than one full-length HPV genome. Nicole and Michael’s group coined the term ‘multimer episomes’ for this phenomenon. Some episomes also have a 634 bp deletion which removes the E1/E2 genes. Using a rapid tagmentation protocol, the team captured an array of episomes, representing monomers and long multimers as well as scrambled HPV genomes.
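The reasoning that an insertion whose length is a multiple of the 7.9 kb episome implies extra full-length genomes can be sketched in Python. This is only an illustration under stated assumptions: the genome length is rounded to 7,900 bp, and the 10% tolerance is an arbitrary value chosen here, not a threshold from the talk.

```python
GENOME_BP = 7900  # approximate HPV genome length (7.9 kb), rounded for illustration


def extra_genome_copies(insertion_bp, genome_bp=GENOME_BP, tolerance=0.1):
    """Return the number of whole extra genomes an insertion could represent.

    The insertion length is divided by the genome length; if the ratio is
    within `tolerance` of a positive integer, that integer is returned.
    Otherwise None is returned, meaning the insertion does not look like a
    whole-genome multiple. The tolerance is an illustrative assumption.
    """
    ratio = insertion_bp / genome_bp
    nearest = round(ratio)
    if nearest >= 1 and abs(ratio - nearest) <= tolerance:
        return nearest
    return None


for length in (7900, 15700, 12000, 3000):
    print(length, extra_genome_copies(length))
```

An insertion of about 15.7 kb would be read as two extra genomes (a trimer in total), whereas a 12 kb insertion would not be called as a clean multiple and might instead reflect a truncated or rearranged copy.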

The sequencing of SNU-1000 demonstrates that abnormal extrachromosomal episomal replication can occur and is able to cause cancer without integration. Nicole also observed E1 and E2 deletion in episomal HPV16, similar to what occurs during integration, and stated that if such episomes were to integrate, superspreading may ensue. To uncover whether this was occurring in actual tumours, Nicole and her team carried out nanopore WGS of 62 Guatemalan cervical tumours, which were previously characterised as episomal only or episomal and integrated. They found a subset of tumours carrying monomer episomes only, suggesting HPV can cause cancer without integration. The other episomal-only tumours contained multimers and rearranged episomes; the rearrangements frequently occurred within the E1 and E2 genes. Some of the rearranged episomes exist as dimers, with one example carrying a deletion in a key regulatory site, which the group postulate was selected for to drive the increased expression of E6 and E7 in the absence of integration. The tumours containing both episomal and integrated HPV genomes also showed complex rearrangements.

In the future, Nicole plans to sequence these tumour types more extensively. The team is now starting to prepare ultra-high molecular weight DNA for long-read sequencing. This is required to capture the multimer episomes and the concatenated integrations occurring in many of the cell lines. Nicole added that they used adaptive sampling to enrich for HPV-containing ultra-long reads, and beamed as she revealed that this is uncharted waters. For the CaSki cells, Nicole obtained a four-fold enrichment of both HPV-only reads and HPV-human reads using the ‘Ultra-Adapt’ protocol. Impressively, Nicole also obtained HPV-containing reads as long as 350 kb and human-only reads up to 1.5 Mb. These ultra-long reads helped identify blocks of HPV DNA flanked by human DNA.
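A fold enrichment such as the four-fold figure above is typically the ratio of the on-target read fraction with enrichment to the fraction without it. The Python sketch below shows that arithmetic; the read counts are invented for illustration and are not data from the talk, and the talk did not state exactly how the figure was calculated.

```python
def fold_enrichment(target_enriched, total_enriched, target_control, total_control):
    """Ratio of on-target read fractions: enriched run versus control run.

    Equivalent to (target_enriched / total_enriched) divided by
    (target_control / total_control), written as a cross-multiplication.
    """
    if min(total_enriched, total_control, target_control) <= 0:
        raise ValueError("totals and control target count must be positive")
    return (target_enriched * total_control) / (total_enriched * target_control)


# Hypothetical counts: 400 HPV reads per 100,000 with adaptive sampling,
# versus 100 HPV reads per 100,000 without it.
print(fold_enrichment(400, 100_000, 100, 100_000))
```

Comparing fractions rather than raw counts keeps the figure meaningful when the two runs produce different total numbers of reads.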

Nicole then handed over to Michael Dean, who discussed the effects of HPV integration on viral and human gene expression. Michael gave a brief reminder that expression of E1 and E2 represses expression of E6 and E7, and that loss of E1 and E2 upon integration enables high E6 and E7 expression. Michael conducted nanopore direct cDNA sequencing to uncover the full-length HPV transcripts in CaSki cell lines, which showed knockout of the E1 and E2 genes. In the SNU-1000 cell line, which has only episomal DNA, the expression is very similar to the CaSki cells with HPV integration. They next extracted RNA from cervical tumours with only episomal DNA and performed cDNA-PCR sequencing with barcoding. In all the cell lines, representing monomer, rearranged and multimer episomes, E6 and E7 expression predominates.

How HPV expression is regulated, especially in the absence of integration, is poorly understood. To unravel how expression is controlled, Michael sought to examine the epigenetic profile of the cell lines and tumours. 5-methylcytosine (5mC) was called from direct nanopore sequencing using the Megalodon program. The nanopore data were in complete concordance with the bisulfite data. Michael justified his choice of nanopore sequencing: ‘the attraction of the nanopore data is that, for the reads across the full HPV genome, we can see the phase of all the methylated bases, and no other technology can show this’. Furthermore, there is a distinct difference between cell lines and a considerable amount of heterogeneity in the methylation patterns of individual HPV genomes within a cell line.

No study has looked at the role of 5-hydroxymethylcytosine (5hmC) in HPV genomes, and ‘5hmC can be called from the same sequencing data’. The team’s preliminary analysis indicates that this modification is present on HPV DNA. Michael plans to pursue this project further to gain a better understanding of how 5hmC regulates viral gene expression.

Another observation made from patient cell lines was that integration activates or disrupts important local cellular genes and oncogenes. For example, HPV integration disrupted RAD51B, which is involved in DNA repair. Michael stated that further work is needed to elucidate how integration impacts the local genome and epigenome. To that end, the team used Pore-C, a chromatin interaction method adapted to nanopore sequencing. So far, Michael and his team have generated 8 million reads, with over 3 million contacts, of which over 1 million occur at long range. Interestingly, there are over 12,000 HPV-containing reads in the dataset and 8,000 of these connect to human sequences. Michael hopes that this will shed light on how HPV integration influences the human genome.

Michael moved on to discuss the PI3K signalling pathway, in which almost half of cervical tumours carry aberrations. The drug Piqray acts as a PIK3CA inhibitor and is currently FDA-approved to treat breast cancer, but its therapeutic potential for treating cervical cancer is unknown. Michael proceeded to test this drug on several cervical cancer cell lines with mutations in the PI3K pathway, revealing a marked reduction in E6 and E7 expression. The drug also dramatically reduced proliferation and upregulated the T-cell checkpoint marker PD-1. These are promising findings and suggest the drug reduces cancer growth as well as galvanising T-cells into action. Using nanopore cDNA-PCR analysis following Piqray treatment, Michael showed HPV expression was reduced. In the future, Michael plans to investigate how methylation and chromatin structure respond to treatment with the drug. He concluded the talk by highlighting the goal of this research, which is to ‘reduce the cancer burden of human papillomaviruses’.

Authors: Nicole Rossi and Michael Dean