Nanopore Tech Tour: a round up from Beijing

This week, we're hosting the first Nanopore Technology Tour in China. On September 23rd we kicked off in Beijing with an excellent line-up of speakers from the Nanopore Community and a packed agenda of tutorials and live technology demonstrations. Read on for a summary from the event in Beijing.

Oxford Nanopore technology and application upgrades: from basic principles to application

Dan Turner, Vice President of Applications at Oxford Nanopore Technologies, and Sissel Juul, Director of Genomic Applications, provided nanopore sequencing technology updates and information about new product releases. Their talks featured:

  1. VolTRAX V2 for automated library preparation.
  2. Update on read length and accuracy: maximum sequencing read length now up to 2.3 Mb using the R9.4.1 nanopore; single-read DNA basecalling accuracy up to 95%, with R9.4.1 single-read accuracy as high as 94%; and a consistent Q value of 44 for S. aureus.
  3. Options for library preparation for nanopore sequencing, including a 10 minute rapid sequencing kit, PCR and PCR-free kits, and the latest improvements to throughput.
  4. Protocol Builder for generation of complete application-specific protocols, from extraction to data analysis.
  5. Recent developments in Pore-C, a chromatin conformation capture technique with the ability to correct misassemblies, detect genomic rearrangements and copy number changes, and resolve complex structural variations. The benefits of Pore-C were demonstrated on an assembly of the NA12878 cell line, producing a contig N50 of over 30 Mb, with the longest scaffold 129 Mb in length and corresponding to 89% of chromosome 8, including the centromere.
  6. PCR-free CRISPR/Cas9-mediated enrichment, which makes use of the long read lengths of nanopore sequencing to phase variants and methylation. The benefits of Cas9 enrichment were demonstrated on homozygous and heterozygous carriers of Friedreich’s ataxia to quantify the triplet repeat expansion and examine the associated hypermethylation.
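The accuracy figures in the list above are easier to compare on a single scale. The following is a minimal sketch, assuming the Q value quoted in the talk is Phred-scaled (Q = -10 log10 of the per-base error probability); the function names are illustrative and not part of any Oxford Nanopore software.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred-scaled Q value to per-base accuracy (1 - error probability)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (strictly between 0 and 1) to a Phred-scaled Q value."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    # Single-read accuracies quoted in the talk summary.
    for acc in (0.94, 0.95):
        print(f"accuracy {acc:.0%} -> Q{accuracy_to_phred(acc):.1f}")
    # Q value quoted for S. aureus.
    print(f"Q44 -> error rate {1.0 - phred_to_accuracy(44):.1e}")
```

Under that assumption, 95% single-read accuracy is roughly Q13, while Q44 corresponds to about one error in 25,000 bases.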

Popular research results obtained by global scientists using nanopore sequencing

Professor Yutaka Suzuki, University of Tokyo, Japan

“Long-read sequencing can help us understand the molecular etiology of cancerous mutations that are still unknown in patients for whom there is no effective treatment.”

Professor Yutaka Suzuki shared his experience using nanopore sequencing for whole-genome sequencing of clinical cancer samples. Beginning with lung cancer, Yutaka's team performed targeted amplicon sequencing on PromethION to accurately identify single-base mutations and alleles of cancer-associated genes (such as EGFR, KRAS, NRAS, and NF1). In a single PromethION run they generate >50 Gb of sequencing data, which reduces the cost of long-read cancer genome sequencing. Whole-genome sequencing of 6 lung cancer cell lines generated 30x depth of coverage per sample, with a read N50 of 20 kb. This data enabled Suzuki and colleagues to identify a new class of cancer-associated structural variation, which they termed “Cancerous Local Copy Number Lesions” or CLCLs. In the CLCL region, the genome sequence is affected by complex aberration patterns consisting of tandem repeats, short inversions, and repeats. Such structural aberrations are also found in clinical lung adenocarcinoma specimens.

Yutaka and the team also performed genome-wide sequencing of Kobe cattle (Wagyu). Most Kobe cattle breeding has been done by artificial insemination and, whilst this increased yield, it led to an increase in genetic diseases, especially recessive genetic diseases. This is work in progress, so stay tuned.

Assistant Professor Xia Yu, Southern University of Science and Technology, China

“The nanopore-based long-read metagenomic sequencing technology provides an effective alternative to NGS assembly and binning.”

Discharged wastewater from sewage treatment plants contains antibiotic resistant bacteria (ARB), and resistance genes can be highly enriched in water supplies even far from the source. Due to the lack of reliable ARB identification methods, the mechanism of propagation and spread of resistance genes in these environments is not fully understood. Konghong Ji demonstrated a genomic approach to identifying resistant bacteria based on nanopore sequencing, with the results showing that effluent water has 10 times the concentration of resistance genes found in seawater. These findings indicate that sewage discharge plays an important role in establishing a resistant environment, accounting for approximately 87% of resistance genes present in the receiving water supply.

Xia Yu's team also used MinION to sequence frozen soil and glacial meltwater from the Qilian Mountains at an altitude of 4,000 meters. The results showed that the core microbial communities with strong adaptability in the frozen soil were resistant to changes in altitude, and that the photosynthetic autotroph Oscillatoria (a cyanobacterium) is abundant in samples from an altitude of 4,000 meters. The strong solar radiation at this location can provide favorable conditions for microorganisms that depend on light-driven metabolism.

Professor Martin Frith, National Institute of Advanced Industrial Science and Technology (AIST) & University of Tokyo, Japan

Martin Frith highlighted the difficulty of identifying complex, disease-causing mutations, such as tandem repeat expansion/contraction, homologous recombination, chromosome breakage, and virus/transposon insertion. Using the PromethION, he performed whole-genome sequencing to overcome some of the challenges surrounding the identification of these variations. Some, such as repeat expansion or contraction, can only be identified by spanning the whole region with long reads, as they can’t be accurately mapped and quantified with short reads. Long reads from the PromethION also allowed Frith to discover transposon, pseudogene and mitochondrial sequence insertions.

In a neurodegenerative disease example, Frith was able to show that nuclear inclusion was associated with repeat expansion by enriching for the target region with Cas9 and building a consensus to examine mutations. This consensus allowed for full identification of congenital mutations caused by chromosome breakage, and association of the repeat expansion with disease symptoms.

Dr. Yue Wan, Institute of Genetics, Singapore

Studying how RNA folds is critical to understanding RNA function inside cells. Mapping RNA secondary structure using short-read sequencing can provide large-scale structural information but lacks the connectivity between structures along a transcript. Yue Wan and her team used nanopore direct RNA sequencing to detect structural modifications, and, in combination with machine-learning models, accurately detected RNA secondary structures and their dynamics in known RNAs.

Structure probing of the human embryonic stem cell transcriptome with nanopore direct RNA sequencing captured structural features seen in other high-throughput structural datasets, and allowed Yue to detect structural information in individual isoforms of a gene. The experimental system used demonstrated high reproducibility and low noise. The results showed that the structural features could be located, for example, at the 5'UTR region of a particular transcript. Nanopore long-read sequencing was also able to span the entire structure of a particular section, helping the team to understand the relationship between different transcripts.

Large-scale nanopore sequencing

Fritz Sedlazeck Ph.D., Baylor College of Medicine, United States

The core of biology and medicine is to better understand the relationship between genes and phenotypes, through research into genetic variation, gene regulation and related fields. Fritz said that sequencing advances such as long nanopore reads generated on the MinION and GridION have given him a better understanding of the diversity of genetic variation, and that PromethION has enabled this on a large scale.

Fritz first verified the capacity of the PromethION to detect biologically-relevant variants. He noted that their first run generated 74 Gb of data from one PromethION Flow Cell, whilst now they achieve 140 Gb: a yield difficult to achieve via short-read sequencing technology. In the Centers for Common Disease Genomics (CCDG) 4400 Human Genome Project for cardiovascular disease research, the team used the tool SVCollector and existing short-read data to optimise sample selection. These samples were sequenced on the PromethION platform; the tool PRINCESS, developed by Fritz, was then used to identify SVs, SNPs and perform phasing; it was also possible to evaluate methylation. Fritz noted that the resolution of each genome was greater than that generated via short-read sequencing. He then introduced the project to sequence 100 tomato genomes in 100 days using high-throughput sequencing on the PromethION.
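To put the per-flow-cell yields quoted above in context, the following is a minimal sketch of the usual depth-of-coverage arithmetic (mean depth = sequenced bases / genome size). The human genome size of 3.1 Gb is an assumption introduced for illustration, not a figure from the talk, and the function name is hypothetical.

```python
def mean_depth(yield_gb: float, genome_size_gb: float) -> float:
    """Return the mean depth of coverage for a given yield and genome size (both in Gb)."""
    if genome_size_gb <= 0:
        raise ValueError("genome_size_gb must be positive")
    return yield_gb / genome_size_gb


# Assumption: approximate haploid human genome size in Gb (not stated in the talk).
HUMAN_GENOME_GB = 3.1

if __name__ == "__main__":
    # Yields per PromethION Flow Cell quoted in the talk summary.
    for yield_gb in (74, 140):
        depth = mean_depth(yield_gb, HUMAN_GENOME_GB)
        print(f"{yield_gb} Gb -> about {depth:.0f}x mean depth on a human genome")
```

With that assumed genome size, 74 Gb works out to roughly 24x and 140 Gb to roughly 45x from a single flow cell. This ignores reads that fail to align and uneven coverage, so real usable depth would be somewhat lower.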

Liu Min, Product Director, Biomarker

“Using the nanopore platform, we are able to perform accurate and quantitative analysis of transcripts to better examine the expression of isoforms in vivo”

Liu Min, Director at Biomarker, explained how full-length transcriptomics can reveal complex transcriptional events in organisms, identifying alternative splicing, polyadenylation, fusion genes and gene family association in the transcriptome. Nanopore long-read sequencing can span the entire isoform, removing or reducing ambiguous alignment, and bringing obvious advantages to high quality genome assembly.

Biomarker have a MinION, GridION and PromethION, which allowed them to sequence and assemble 60 species with a contig N50 consistently above 1 Mb, with some difficult marine organisms reaching contig N50s of 22-27 Mb. Adding nanopore transcriptome data to these helped to improve genome annotation by providing information on the alternative splicing of genes.

Min described how Biomarker have used nanopore transcriptomics not only for annotation but also for two further purposes. The first of these is quantitative analysis of gene expression: nanopore sequencing correlates well with expected transcript frequencies, allowing investigation into expression of genes and isoforms. The second is further investigation of gene structure; the Biomarker team have obtained over 70% full-length transcripts from each dataset, allowing more effective analysis of gene families, evolutionary relationships and non-coding RNAs than with short-read RNA data.

When comparing short reads to nanopore long reads, Biomarker’s results were highly consistent, with a correlation coefficient of above 0.8 for all runs and a similar number of genes found in each dataset. When compared to other long-read RNA sequencing, the number of transcripts found in 2 Gb of nanopore data was approximately the same as that found in 20 Gb of the alternative data.

Zhang Shiwei, Director of Research, Novo Zhiyuan

“The long barrel length and long read length of the nanopore are necessary for contig assembly”

Director Zhang Shiwei first described how plant genomes range in size from 0.6 Gb to 148 Gb, and how most plant genomes contain many repeats, ranging from a few to millions of copies. Nanopore long-read sequencing shows lower GC bias than short-read sequencing, and can span highly repetitive regions, enabling more accurate and complete assembly of genomes, and detection of different types of mutations (such as deletions, insertions, translocations, or SNPs).

They sequenced two plant genomes using the current R9 flow cell, obtaining over 115 Gb (N50 of 38 kb) and 81 Gb (N50 of 41 kb) of data. The team are also early testers of the new R10 pore.

Size-selected long fragments (>10-20 kb) and ultra-long fragments (>40-50 kb) were sequenced using the Nanopore ligation sequencing kit on the PromethION platform, with an average read length of >25 kb (N50 of >40 kb). The average yield of the flow cell was >60 Gb, and the maximum yield was as high as 106 Gb. The average read length of the ultra-long library was >35 kb, with an N50 of >55 kb; the average yield of a single flow cell of ultra-long reads was greater than 30 Gb, and the maximum yield was up to 45 Gb. Combining multiple platforms, they obtained three high-quality plant genome assembly results.

That's all for now from Beijing but we'll be posting updates from Shanghai — the second leg of the Nanopore Tech Tour — soon!