Nanopore Tech Tour: a round-up from Shanghai

Yesterday the Nanopore Tech Tour landed in Shanghai for another fantastic day of talks, live sequencing demos, tutorials and flow cell loading clinics. Read on for summaries of the talks.

Dr. Chun Hang Au, Hong Kong Sanatorium & Hospital, China

"No need for qPCR - the MinION can perform sequencing directly; I strongly recommend molecular diagnosticians to consider the use of the real-time MinION sequencer."

Real-time long-read sequencing for structural variation detection in clinical applications

Dr. Chun Hang Au stated that structural variation (SV) is a key component in the comprehensive analysis of hematological malignancies. Current mainstream techniques for such analysis include karyotype analysis, qRT-PCR, fluorescence in situ hybridization (FISH), and next-generation sequencing. Since 2017, Dr. Au and his team have been evaluating the clinical application of long-read, real-time nanopore sequencing in the diagnostic molecular pathology laboratory.

In a patient with acute myeloid leukemia (AML), Au and colleagues performed whole-genome sequencing (WGS) on the MinION, detecting a cryptic driver translocation t(5;11) and a passenger translocation t(10;12). In adult patients with T-cell acute lymphoblastic leukemia (T-ALL), nanopore WGS accurately detected a breakpoint of translocation t(11;14), and revealed a potential TRA/TRD-LMO2 fusion. Breakpoint information can not only provide important diagnostic and prognostic information, but also disease-specific markers for therapeutic monitoring. The team are also performing nanopore WGS for aneuploidy screening, and CRISPR/Cas9 mediated, PCR-free target enrichment for SV breakpoint detection, achieving targeted sequencing results in less than one hour.

Finally, Dr. Au concluded that, thanks to its low cost, portability and simple library preparation workflow, the MinION can be a very good complement to short-read sequencing technology, digital PCR, and other conventional techniques for improving the understanding of clinical diseases. He highly recommended that molecular diagnosticians begin to consider using the real-time MinION sequencing platform.

Associate Professor Xiao Chuanle, Sun Yat-sen University, China

Development and application of key tools for third generation sequencing data

Dr. Xiao Chuanle, Associate Professor at Sun Yat-sen University, China, began by describing the clear advantages of nanopore sequencing technology, including the long-read output, the ability to sequence native strands without the need for PCR amplification, and how these features enable both de novo assembly and the analysis of base modifications in plant and animal samples. He went on to introduce the key calculation methods and applications that he and his team have developed for analysis of data from third generation sequencing technology:

  1. NECAT: an efficient correction and assembly tool for nanopore long reads

The team generated the rapid assembly system NECAT, proposing a global seed voting scoring model and a partial map sequence correction model. Results indicate that the method is 17-56x faster than two similar tools; it is currently being used to assemble multiple plant genomes.

  2. Detecting epigenetic modifications in nanopore data

Using a recurrent neural network (RNN), the team developed DeepMod, a tool for high-precision detection of the modifications 5mC and 6mA in nanopore sequencing data across whole genomes. Results for the tool were published in Nature Communications in 2019 (DOI: 10.1038/s41467-019-10168-2), with accuracy reaching 99% for 5mC detection and 90% for 6mA.

Wang Depeng, CEO, GrandOmics Bioscience Co. Ltd

G1000: nanopore sequencing of 1000 whole genomes

Mr. Wang Depeng, CEO of GrandOmics Bioscience Co. Ltd, began by describing how in the early days of human genome sequencing, such as in the Yan Huang Project, limitations were imposed by the technology used. Although the technology enabled significant achievements, he noted the difficulties encountered in resolving repetitive regions. Wang then described the HX001 project, launched with the goal of detecting structural variations in the human genome. The project involved the development of a workflow utilising assembly tool NextDenovo and polishing tool NextPolish. Using the high-throughput PromethION platform, they were able to sequence and assemble the 12-chromosome Oryza sativa (rice) “9311” genome to almost whole-chromosome contig level, with a contig N50 reaching 23.6 Mb.
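For readers less familiar with the metric, contig N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. The following is a minimal Python sketch of that calculation; the function name and the toy contig lengths are illustrative assumptions and are not taken from the NextDenovo or NextPolish workflow described in the talk.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy, made-up contig lengths (not data from the talk).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(contig_n50(toy_contigs))  # prints 70
```

A higher N50 means more of the assembly sits in long contigs, which is why a value of 23.6 Mb for a 12-chromosome genome approaches whole-chromosome contiguity.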

Wang then went on to describe how there are many structural variations (SVs) responsible for genomic disease which cannot be investigated using short-read sequencing methods. Now, he explained, nanopore sequencing can be used to comprehensively detect mutations, from SVs to SNPs, insertions, deletions, copy number variants (CNVs), short tandem repeats (STRs) and methylation, enabling a complete picture of variation.

Wang described how they are using the PromethION platform to sequence large numbers of human whole genomes from the Chinese population, generating data for dbSV: a vast database of structural variants. He highlighted the very high throughput of PromethION sequencing, enabling the generation of very large datasets in a short time span, noting that the throughput can be significantly greater than that of alternative high-throughput third-generation sequencing technologies. The project has three phases: in the first, 1,424 human genomes were sequenced, generating an average yield per run of 50 Gb, an average read length of 17 kb and average read N50 of 22 kb – sufficient for 18x depth of coverage of the genome. He described how the sequencing enabled resolution of chromosomal deletions, repetitive regions, insertions, inversions and more. The project is now in its second phase, and Wang described how he plans to obtain more nanopore sequencing platforms and recruit more participants to complete the database.

Linfeng Yang, R&D Director, BGI-Tech, China

Long read lengths for solving complex genome assembly

As one of the most experienced sequencing service providers in China, BGI and their collaborators have published over 165 research papers, and have recently completed a number of plant and animal genome assembly projects with nanopore sequencing. Director Linfeng Yang began by explaining that use of traditional short-read sequencing technologies alone gives highly fragmented assemblies, particularly of larger genomes, which limits their utility as reference sequences. To resolve complex genomes, much longer read lengths are needed, so nanopore sequencing shows tremendous potential for BGI in generating high-quality genome assemblies.

To assess the utility of nanopore sequencing for their needs, Yang’s team sequenced five plant and animal species, obtaining 80% of their data with a raw Q score of greater than 10. To explore the data quality further, Yang and team then compared basecalling options, calling the data with both transducer and “flip flop” models. Yang showed some initial conclusions, explaining that re-basecalling the data with the latest version of Guppy lowered the per-read error rate by 4%, and switching to the Guppy High-Accuracy mode reduced the error by a further 2%, allowing for better assemblies. These findings could be taken forward to form recommendations for future projects, including updating to the latest basecalling option prior to downstream analysis.
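As a reminder of what a raw Q score of 10 means: Phred-scaled quality scores relate to error probability by Q = -10 log10(P), so Q10 corresponds to an expected error rate of 10% per base. The short Python sketch below shows the conversion in both directions. The function names are illustrative, and how the per-read Q scores quoted in the talk were aggregated is not stated, so this is only the standard definition.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred-scaled score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


# Illustrative values only; Q10 is the threshold mentioned in the talk.
for q in (7, 10, 20):
    print(f"Q{q}: expected error rate {phred_to_error_probability(q):.3f}")
```

Because the scale is logarithmic, each gain of 10 in Q score is a tenfold reduction in expected errors.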

Moving on to some assembly examples, Yang presented tables of metrics for more plant and animal species, some of which displayed a high proportion of repetitive content which had made them historically very tricky to work with. Here the team aimed to generate assemblies with a high contig N50 to give more complete assemblies than existing draft genomes. In one particular example of a 633 Mb genome, 120x depth of coverage gave a contig N50 of 3.6 Mb and a BUSCO completeness score of 92.4%, despite the plant exhibiting extremely high heterozygosity.
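Depth of coverage is the total number of sequenced bases divided by the genome size, so the 633 Mb example at 120-fold depth implies roughly 76 Gb of sequence data. The Python sketch below works through that arithmetic; it assumes the quoted depth is a mean over the whole genome, and the function names are illustrative.

```python
def bases_required(genome_size_bp, depth):
    """Total sequenced bases needed to reach a mean depth of coverage."""
    return genome_size_bp * depth


def depth_of_coverage(total_bases, genome_size_bp):
    """Mean depth of coverage given total sequenced bases and genome size."""
    return total_bases / genome_size_bp


genome_size = 633_000_000  # 633 Mb genome from the example above
required = bases_required(genome_size, 120)
print(f"{required / 1e9:.1f} Gb")  # prints 76.0 Gb
```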

In two further cases, Yang described how the addition of nanopore sequencing data greatly improved assemblies of highly repetitive plant genomes, where the repetitive content ranged from 71 to 74%. Error correction with Medaka and Racon, followed by assembly with NECAT, proved an efficient and accurate pipeline for these assemblies.

Yang said in conclusion: “For de novo assembly, because nanopore sequencing gives long read lengths, low bias, high consistency and high throughput, it can be used effectively to optimise assembly strategies”.