Accuracy
Page last updated: December 2020
For many years Oxford Nanopore has continuously iterated our technology to improve its performance. We continue to improve the nanopore sensing system, through updates to analytical methods and new chemistries. This page guides you on what to expect from the nanopore sequencing system, and which tools to choose to achieve these results.
Introduction
Nanopore DNA and RNA sequencing accuracy can be measured in a number of ways, and the relevant metric for a scientist will depend on the specific experiments being performed.
As with all systems, choosing the most up-to-date tools for the analysis you are interested in is critical, and the quality of the sample can also influence the outcome. With so many relevant variables, clear guidelines are important. Below we define several types of accuracy measurement and include recommendations for best performance.
Raw read accuracy
Nanopore sequencing provides direct electronic analysis of the target molecule, rather than sequencing a synthetic copy or using surrogate markers such as fluorescence. Basecalling algorithms are then used to provide an interpretable output of the sequencing reads. Nanopore basecalling algorithms are continuously improved to enhance accuracy over time, also allowing new methods to be applied to previously sequenced raw data.
Direct sequencing avoids sources of bias such as PCR and gives native information about the target molecule. We define raw read accuracy as the accuracy achieved when reading a single DNA or RNA fragment/molecule once. Applications for which raw read accuracy is relevant include those where time-to-result may be critical, but at this time most applications are more likely to focus on variant calling, consensus accuracy or other metrics. Improvements in raw read accuracy can drive improvements in other accuracy metrics.
Latest updates to nanopore sequencing achieve:
Chemistry | Raw read accuracy (modal) | Analytical tools | Sample |
---|---|---|---|
R9.4.1 | >97% | MinKNOW 4.0, Guppy 3.6 (production software) | Mixed genomes |
R9.4.1 | 98.3% | Bonito 0.3.0 (research tool) | Mixed genomes |
R10.3 | >97% | MinKNOW 4.0, Guppy 3.6 (production software) | Mixed genomes |
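As a rough illustration of how a per-read accuracy figure can be derived, the sketch below computes the identity of a single aligned read from an extended CIGAR string and then takes the most common (modal) value across reads. It assumes one common definition, matches divided by matches plus mismatches, insertions and deletions; the page does not state which definition underlies the table, and the CIGAR strings and function names here are hypothetical.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def read_identity(cigar: str) -> float:
    """Identity of one read from an extended CIGAR string ('=' and 'X' ops).

    Assumed definition: matches / (matches + mismatches + insertions + deletions).
    Clipped bases (S, H) are ignored.
    """
    counts = Counter()
    for length, op in CIGAR_OP.findall(cigar):
        counts[op] += int(length)
    if counts["M"]:
        # 'M' does not distinguish matches from mismatches.
        raise ValueError("CIGAR uses 'M'; '=' and 'X' operations are required")
    aligned = counts["="] + counts["X"] + counts["I"] + counts["D"]
    if aligned == 0:
        raise ValueError("CIGAR contains no aligned bases")
    return counts["="] / aligned


def modal_identity_percent(identities) -> float:
    """Most common identity across reads, binned to the nearest 0.1%."""
    bins = Counter(round(100 * identity, 1) for identity in identities)
    return bins.most_common(1)[0][0]


# Hypothetical CIGAR strings, for illustration only.
cigars = ["970=10X10I10D", "485=5X5I5D", "5S95=5X"]
identities = [read_identity(c) for c in cigars]
print([f"{i:.3f}" for i in identities], modal_identity_percent(identities))
```

A modal value describes where most reads fall and is less sensitive to a tail of poor reads than a mean would be.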
Consensus accuracy
Latest updates to nanopore sequencing achieve:
Chemistry | Consensus accuracy | Analytical tools | Sample |
---|---|---|---|
R9.4.1 | Q45 | Guppy basecall, Flye assembly, Medaka polish | Zymo mock community (bacterial) |
R10.3 | Q50 | Guppy basecall, Flye assembly, Medaka polish | Zymo mock community (bacterial) |
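Consensus accuracy is quoted as a Q value, whereas raw read accuracy is quoted as a percentage. Assuming the Q values are Phred-scaled (the usual convention, Q = -10 log10 of the error probability), the two can be converted as in this minimal sketch. The function names are illustrative.

```python
import math


def q_to_accuracy(q: float) -> float:
    """Convert a Phred-scaled quality value to an accuracy fraction."""
    return 1.0 - 10 ** (-q / 10)


def accuracy_to_q(accuracy: float) -> float:
    """Convert an accuracy fraction (0 <= accuracy < 1) to a Phred-scaled Q."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10 * math.log10(1.0 - accuracy)


# Express the consensus Q values quoted on this page as expected errors.
for q in (45, 50):
    errors_per_megabase = 10 ** (-q / 10) * 1_000_000
    print(f"Q{q}: accuracy {q_to_accuracy(q):.6%}, "
          f"about {errors_per_megabase:.1f} errors per megabase")

# Express the raw read accuracies quoted on this page as Q values.
for accuracy in (0.97, 0.983):
    print(f"{accuracy:.1%} accuracy is roughly Q{accuracy_to_q(accuracy):.1f}")
```

Under this convention, Q50 corresponds to about one error in every 100,000 bases.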
Single molecule consensus
Consensus generation can also be applied to specific regions of interest, by combining multiple exact copies of a single original fragment or molecule into a single high-quality sequence. These exact copies could be sequenced together in a single read, for example generated by circular or linear amplification, or could be associated by use of a unique identifier (UMI). Through combining multiple copies together, a higher confidence in accuracy is achieved.
Applications where single molecule consensus could be particularly useful include liquid biopsy low-frequency variant detection, or 16S sequencing.
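The idea of combining several copies of one molecule can be sketched as a per-position majority vote. This is a deliberately simplified illustration, not the method used by any particular tool: it assumes the copies have already been grouped (for example by UMI), aligned to one another and trimmed to the same length, and the sequences shown are hypothetical.

```python
from collections import Counter


def majority_consensus(copies):
    """Per-position majority vote across pre-aligned, equal-length copies.

    Ties are resolved in favour of the base seen first at that position.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("all copies must have the same length")
    consensus = []
    for column in zip(*copies):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical copies of one molecule, each with at most one random error.
copies = [
    "ACGTTAGC",
    "ACGATAGC",
    "ACGTTAGC",
    "ACCTTAGC",
    "ACGTTAGG",
]
print(majority_consensus(copies))
```

Because errors in independent copies rarely fall on the same position, each erroneous base is outvoted by the correct ones, which is why confidence rises with the number of copies.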
Covering all of the genome
To create an accurate picture of the genome, it is important for a sequencing technology to reach all parts of it, even the parts which are difficult to map. Genomes are littered with repetitive and low-complexity regions, which are difficult to sequence and align using traditional technologies. For example, it is estimated that short-read technology reaches only 92% of the human genome; the remaining 8%, which contains many disease-relevant genes, is excluded from the dataset. Nanopore technology has been shown to reduce these “dark” areas of the genome by 81%, shedding light on parts of the genome not sequenced by any other technology (Ebbert, 2019), and giving a more complete picture.
Variant calling
Single nucleotide variants (SNVs), small indels and structural variants (SVs) are critical for our understanding of how genomic changes drive phenotypes. The ability of nanopore technology to sequence any length of nucleic acid molecule allows for unprecedented resolution of complex structural variants, as well as identification and haplotype phasing of single nucleotide alterations.
The ability to accurately call variants is often expressed as precision and recall values, generated from reads covering the position of interest multiple times. Precision is the proportion of calls in the call set that are correct, whereas recall is the percentage of variants present in the genome that are found in the call set.
The latest precision, recall and F1 values (F1 is the harmonic mean of precision and recall) for nanopore chemistries can be found below, along with a recommended tool chain to achieve similar metrics.
Latest updates to nanopore sequencing achieve:
Variant type | Chemistry | Coverage | Precision (%) | Recall (%) | F1 (%) | Tools | Sample |
---|---|---|---|---|---|---|---|
SV | R9.4.1 | 50X | 97.5 | 95.5 | 96.49 | EPI2ME workflow, github pipeline, EPI2ME Labs tutorial | human, HG002 |
SV | R10.3 | 60X | 95.6 | 96.4 | 96.0 | EPI2ME workflow, github pipeline, EPI2ME Labs tutorial | human, HG002 |
SNP | R9.4.1 | 50X | 99.92 | 99.92 | 99.92 | Medaka, EPI2ME Labs tutorial | human, HG002 |
SNP | R10.3 | 60X | 98.92 | 99.46 | 99.12 | Medaka, EPI2ME Labs tutorial | human, HG002 |
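The definitions of precision, recall and F1 given in this section can be written out as a short sketch. The true positive, false positive and false negative counts below are hypothetical and serve only to show the arithmetic; the final lines recompute F1 from the precision and recall quoted for structural variants (SV) on R9.4.1.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1) as fractions from call-set counts.

    tp: variants called that are truly present
    fp: variants called that are not present
    fn: variants present that were not called
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from_percent(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    return 2 * precision_pct * recall_pct / (precision_pct + recall_pct)


# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")

# Recompute F1 for the SV, R9.4.1 figures quoted on this page.
print(round(f1_from_percent(97.5, 95.5), 2))
```

Because F1 is a harmonic mean, it is pulled towards the lower of the two values, so a call set cannot score well by maximising only precision or only recall.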
Base modifications
The four ‘canonical’ bases (A, C, G and T in DNA and A, C, G and U in RNA) can be biologically modified by the addition of chemical groups, for example by methylation. These modifications can significantly alter gene expression and are implicated in a range of diseases including cancer. Scientists are only just beginning to scratch the surface of how newly recognised epigenetic changes impact function; for example, RNA is known to possess over 170 distinct modifications.
Oxford Nanopore’s technology sequences DNA or RNA molecules directly, enabling real-time detection of 5mC, 5hmC and 6mA.
This allows for detection of these base modifications with no additional experiments or sample preparation steps required, and modification information is accessible through onboard software. In contrast, traditional technologies can require a separate process called bisulphite sequencing, which uses aggressive sample treatment and has a number of limitations.
Recent publications have shown excellent correlation between nanopore modification detection and existing techniques, as well as further modification sites detected that had previously been missed by alternative approaches.
Compared to whole-genome bisulphite sequencing, nanopore demonstrates: |
---|
Strong correlation |
Higher number of CpG positions called |
Less data required |
Faster analysis |
Simpler workflow with no toxic components |
Better reproducibility and consistency run-to-run |
Phasing of methylation is possible |
More even coverage, less effect of GC bias |
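To make "correlation" concrete, one way such a comparison might be carried out is to compute a methylation frequency per CpG site for each method and then take the Pearson correlation across the shared sites. The sketch below assumes that approach; the per-site counts are hypothetical and are not results from any study.

```python
import math


def methylation_frequency(methylated_calls: int, total_calls: int) -> float:
    """Fraction of reads covering a site that were called as methylated."""
    if total_calls <= 0:
        raise ValueError("a site needs at least one covering read")
    return methylated_calls / total_calls


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical (methylated, total) read counts at the same five CpG sites.
nanopore_counts = [(18, 20), (2, 25), (10, 20), (29, 30), (1, 22)]
bisulphite_counts = [(27, 30), (3, 30), (16, 30), (28, 30), (2, 30)]

nanopore = [methylation_frequency(m, t) for m, t in nanopore_counts]
bisulphite = [methylation_frequency(m, t) for m, t in bisulphite_counts]
print(f"Pearson r = {pearson(nanopore, bisulphite):.3f}")
```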
Test accuracy
Sequencing may be used to perform a specific biological test, for example detecting the presence or absence of a particular organism, identifying a species, testing for one or more genetic variants, or performing multi-omics testing in one assay. Test accuracy can be defined as the ability of the technology to answer that question correctly every time, and it can be quantified by identifying the proportion of true and false positives and negatives among a total number of cases. Test accuracy is an important metric for areas such as food safety and microbial surveillance. Nanopore sequencing has been shown to be effective at accurately performing many different types of tests. Browse the resource centre for examples.
For these examples, the analysis pipeline is specific to the test in question, but tool recommendations can be found in the protocol builder.
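Quantifying test accuracy from true and false positives and negatives can be sketched as follows. The metric names (sensitivity, specificity, overall accuracy) are the standard ones for a binary test rather than terms defined on this page, and the counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary test evaluated against known truth."""
    true_positives: int
    false_positives: int
    true_negatives: int
    false_negatives: int

    @property
    def total(self) -> int:
        return (self.true_positives + self.false_positives
                + self.true_negatives + self.false_negatives)

    def accuracy(self) -> float:
        """Proportion of all cases that the test answered correctly."""
        return (self.true_positives + self.true_negatives) / self.total

    def sensitivity(self) -> float:
        """Proportion of truly positive cases that the test detected."""
        return self.true_positives / (self.true_positives + self.false_negatives)

    def specificity(self) -> float:
        """Proportion of truly negative cases that the test reported negative."""
        return self.true_negatives / (self.true_negatives + self.false_positives)


# Hypothetical evaluation of 100 samples, for illustration only.
counts = ConfusionCounts(true_positives=48, false_positives=1,
                         true_negatives=49, false_negatives=2)
print(f"accuracy={counts.accuracy():.2f} "
      f"sensitivity={counts.sensitivity():.2f} "
      f"specificity={counts.specificity():.2f}")
```

Reporting sensitivity and specificity alongside overall accuracy matters when positives are rare, since a test can then score a high overall accuracy while still missing most true positives.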
Future developments
Our goal is to enable the genetic analysis of anything, by anyone, anywhere, and as such we are pursuing constant iterative performance improvements. We continue to improve the nanopore sensing system, boosting accuracy through updates to analytical methods and new chemistries. Latest releases can be found in the Nanopore Community, or in the News section.