Amplicons to whole genomes: clinical sequencing using nanopore technology


Ultra-high-resolution HLA typing

Andrew introduced his first topic, ultra-high-resolution HLA typing using Oxford Nanopore sequencing, by asking: why type HLA at all? He explained that organ and stem cell transplantation are critically important, life-saving treatments.

Andrew explained how kidney dialysis is very important but incredibly expensive; you can get around these issues with transplantation, but this also isn't cheap - costing $17,000 for the first year, then ~$5,000 in subsequent years.

HLA mismatching increases the risk of transplant rejection, so it is "critical to get a good match". However, HLA typing is difficult for a number of reasons - the locus is highly polymorphic, it is co-dominantly inherited, and HLA gene expression is also important to measure.

Current methods for HLA typing, such as serology and sequence-specific PCR amplification ("like the first version of HD"), give low resolution; short-read NGS gives higher resolution (4-field), but it cannot easily phase haplotypes and struggles with homozygosity. Moreover, all of these methods are expensive - surely there is a better way?

Andrew then introduced his solution - nanopore-based HLA typing. This also provides 4-field (8-digit) resolution and, exploiting the inherent advantages of long reads, involves long-range PCR. Compared to previous PCR assays which used 96 well plates, this is a "single-tube assay with a 150 minute turn-around".

The approach costs a total of $109, with 60 ng DNA input required for the PCR reaction. Post-PCR, ligation library preparation is performed (Nanopore library prep kit SQK-LSK109), followed by 12-sample multiplexed sequencing on a single MinION Flow Cell. The run is basecalled in real time with Guppy, and finally HLA assembly and calling are carried out with HLA-LA*, which uses reference graph assembly. The total time of the workflow is 5.5 hours.

When tested on 33 reference samples, the workflow outperformed current technologies, with 100% concordance for class I calls and only one sample showing a second-field mismatch; in that case, it turned out that the nanopore sequencing result was correct, not the short-read sequencing result. In response to this, Andrew said that "accuracy at the moment outperforms current state of the art".

For class II calls, concordance was also 100% for the first field, and one sample had an error in the second field of DPA1*. However, when indels were polished out of the data, this was corrected.
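The field-level concordance described above can be illustrated with a minimal Python sketch. It assumes standard colon-separated HLA nomenclature; the function names and the example genotypes are hypothetical illustrations, not data or code from the talk.

```python
def truncate_allele(allele: str, fields: int) -> str:
    """Cut an HLA allele name to its first `fields` colon-separated fields.

    Any expression suffix (e.g. a trailing 'N') on the last field is dropped
    when that field is truncated away; this is a simplification.
    """
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def concordant(call_a, call_b, fields: int) -> bool:
    """Compare two genotype calls (pairs of alleles) at a given resolution.

    Allele order is ignored, because the two haplotypes carry no fixed order.
    """
    a = sorted(truncate_allele(x, fields) for x in call_a)
    b = sorted(truncate_allele(x, fields) for x in call_b)
    return a == b


# Hypothetical genotypes, for illustration only.
nanopore_call = ("DPA1*01:03:01:01", "DPA1*02:01:01:02")
reference_call = ("DPA1*02:02:02:01", "DPA1*01:03:01:01")

first_field_match = concordant(nanopore_call, reference_call, fields=1)
second_field_match = concordant(nanopore_call, reference_call, fields=2)
print(first_field_match, second_field_match)
```

In this made-up example the two calls agree at the first field but differ at the second, which is the kind of discrepancy reported for the single DPA1* sample.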

Haplotyping was performed using the WhatsHap tool, and runs of homozygosity could also be called as part of the HLA algorithm they were using.

Andrew stated that they have been testing the R10 pores, and these show substantially lower numbers of mismatches during alignment. They have also run a single sample on a Flongle Flow Cell in 2 hours, and he suggested that HLA could be called in only 50 minutes from 8 samples multiplexed on the MinION.

To conclude this section, Andrew said that, with their approach, HLA can be typed within a day: at much higher resolution than technologies of comparable speed, and faster than technologies of equivalent resolution. He shared future work that he will be doing in this area, such as Cas9 enrichment, combined DNA/RNA expression, and incorporating SNP typing into the assay. He pointed out that there is great potential here to democratise sequencing - at the moment, reference laboratories perform these sorts of assays, but there is no reason that they cannot be taken into the field thanks to the portability and speed of nanopore technology.

CNV resolution of clinical samples on the Flongle

"Wouldn't it be really cool if we could try doing CNV calling on a Flongle?"

The second part of Andrew’s talk discussed “quick and dirty CNV calling” on the Flongle. Many human diseases are caused by germline CNVs, and CNVs are associated with cancers, such as EGFR amplification in lung cancer. Taking 1 µg input DNA from blood or tumour samples, library prep (SQK-LSK109) was performed, followed by an 8-hour Flongle run. This produced ~0.05x depth of coverage of the whole human genome.
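To put the ~0.05x figure in context, depth of coverage is simply total sequenced bases divided by genome size. The sketch below works backwards from the reported depth; the genome size of 3.1 Gb and the 10 kb mean read length are assumptions chosen for illustration, not values given in the talk.

```python
import math

# Assumed approximate haploid human genome size (not stated in the talk).
HUMAN_GENOME_BP = 3.1e9


def bases_for_depth(depth: float, genome_size: float = HUMAN_GENOME_BP) -> float:
    """Total sequenced bases needed to reach a mean depth of coverage."""
    return depth * genome_size


def reads_for_depth(depth: float, mean_read_length: float,
                    genome_size: float = HUMAN_GENOME_BP) -> int:
    """Approximate number of reads needed, given a mean read length."""
    return math.ceil(bases_for_depth(depth, genome_size) / mean_read_length)


flongle_bases = bases_for_depth(0.05)
# The 10 kb mean read length is a hypothetical value for illustration.
flongle_reads = reads_for_depth(0.05, mean_read_length=10_000)
print(f"{flongle_bases / 1e6:.0f} Mb, roughly {flongle_reads} reads")
```

Under these assumptions, 0.05x corresponds to roughly 155 Mb of sequence. That is far too shallow for SNV calling, but read-depth methods can still work at this depth because they count reads in large genomic bins rather than inspecting individual bases.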

Early results from this work on a few colorectal cancer samples, using Sniffles for SV calling and QDNAseq/Bioconductor for CNV calling, found concordance between the nanopore and short-read WGS data. Known translocations and deletions were also detected, as was loss of heterozygosity, although not with high confidence. Andrew suggested that generating more reads on the MinION, or using an enrichment-type method, could improve this.

Clinical whole-genome sequencing using the PromethION

To introduce this section, Andrew discussed the UK 100,000 Genomes Project (UK 100K GP), which has sequenced >20,000 human tumour genomes.

Clinical whole-genome sequencing (WGS) is likely to transform patient care; many patients with advanced/metastatic disease have had treatment changes due to WGS findings. However, the workflow of the UK 100K GP was relatively slow, with an average turn-around time of 4-6 weeks, which is "too slow for patient care". Short-read WGS, which was used for the UK 100K GP, also struggles to provide high-quality SV calls due to read length.

Andrew said that we need to make it quicker and that, taking some samples from the UK 100K GP, they "had a go" at doing just that.

Andrew discussed their approach to clinical WGS and variant calling, using the PromethION sequencing platform. With 3 µg DNA from GeL samples, library prep was performed with the Ligation Sequencing Kit, followed by 72-hour sequencing runs and a custom bioinformatics pipeline which included alignment (Minimap2), variant calling with various tools (Clair, Longshot and Sniffles), and methylation calling (Nanopolish).

One of the challenges they faced was that sequencing so many human genomes on the PromethION produced so much data that the data transfer requirements clashed with what was available at the genome centre. Thanks to the BEAR/Castles team, they managed to reduce their computational burden.

So far, 12 samples have been processed via this pipeline (48 will be processed in total), with a median flow cell output of 100 Gbases and a longest read of 1.14 Mbp ("we should be part of the long read club really").

They have observed a reduced output with very long read lengths, but shearing before library prep increased the yield. In terms of variant calling, SNV accuracy was comparable to short-read sequencing, and many SVs were identified in cancer that were not seen in the short-read WGS data, "which was definitely fascinating"; typically, they observed mostly intronic variants. CNVs were “relatively straightforward” to call on the PromethION data, including complex CNVs and loss of heterozygosity, with the bin size reduced to 15 kb and only QDNAseq and Bioconductor used for calling.
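The idea behind read-depth CNV calling with fixed-size bins can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the QDNAseq implementation: QDNAseq additionally corrects for GC content and mappability and segments the resulting profile, all of which are omitted here. The function names and the toy read positions are hypothetical.

```python
import math
from statistics import median

BIN_SIZE = 15_000  # 15 kb bins, the bin size mentioned in the talk


def bin_counts(read_starts, chrom_length, bin_size=BIN_SIZE):
    """Count reads per fixed-size bin from 0-based read start positions."""
    n_bins = math.ceil(chrom_length / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def log2_ratios(counts, pseudocount=0.5):
    """Log2 ratio of each bin against the median bin count.

    The pseudocount (an arbitrary choice here) avoids log2(0) in empty bins.
    """
    med = median(counts)
    return [math.log2((c + pseudocount) / (med + pseudocount)) for c in counts]


# Toy 60 kb "chromosome": three bins with 10 reads each, one with 20 reads.
toy_reads = (
    [100 * i for i in range(10)]               # bin 0
    + [15_000 + 100 * i for i in range(10)]    # bin 1
    + [30_000 + 100 * i for i in range(20)]    # bin 2, simulated gain
    + [45_000 + 100 * i for i in range(10)]    # bin 3
)
toy_counts = bin_counts(toy_reads, chrom_length=60_000)
toy_ratios = log2_ratios(toy_counts)
print(toy_counts, [round(r, 2) for r in toy_ratios])
```

Bins with a log2 ratio well above zero suggest a gain and bins well below zero suggest a loss. Smaller bins give finer breakpoints but need more reads per bin to keep the counts stable, which is presumably why a 15 kb bin size is practical with PromethION-scale output.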

Andrew described how you "can detect fusions much more easily at the DNA level" compared to short-read sequencing. Fusions were detected with the Sniffles tool, although Andrew suggested that it may be preferable to detect fusions from RNA sequencing data. The Nanopolish and Methplotlib tools were used to call methylation; hypomethylation of MLH1 near its promoter, as is commonly observed in colorectal cancer, was detected - and Andrew said that this is a drug target. Andrew stated that methylation calling from PromethION data had a much higher resolution "compared to anything else we do" with other technologies. It was a "piece of cake" and "in fact we are going to move all our methylation assays onto PromethION and nanopolish".

In conclusion, Andrew said that clinical WGS on the PromethION “has the potential to be game changing” although we are still in the “beta” stage of its application – we need better variant calling tools, a clinical pipeline and ISO accreditation (which they are going to work with Genomics England to achieve). In terms of accuracy, nanopore sequencing data is comparable to short-read data, and “in some ways better”. Moreover, with multiplexing, clinical WGS will be possible in under 24 hours, and this will be "a transformation for clinical genetics".

What could be the future of clinical nanopore sequencing?

Andrew suggested that amplicon targeted sequencing on Flongle is ideal for the clinic, and clinical WGS is “potentially a game changer” for nanopore sequencing, as it is the same price as short-read sequencing yet much more information is obtained from a sequencing run, such as methylation and SV calling.

It is "a very exciting time to be involved in nanopore sequencing", which has the potential to radically change clinical genetics in the next few years.

Authors: Andrew Beggs