In a paper published in Nature Biotechnology, teams at the European Molecular Biology Laboratory (EMBL) and the University of Nottingham describe an approach based on Bayesian experimental design that incorporates experimental data in real time to dynamically optimise targeted sequencing, without intervention from the experimenter. This allows sequencing effort to be focused, as the run progresses, on the areas of highest benefit.
Oxford Nanopore sequencing technology has been used to provide real-time data that can be applied to enrich or deplete regions of interest during sequencing experiments without the need for additional library preparation, optimising information gain while saving time and reducing cost. In a publication released today in Nature Biotechnology, a team led by Matt Loose of the University of Nottingham, and Lukas Weilguny and Ewan Birney at the European Molecular Biology Laboratory (EMBL), describe how they applied a method called adaptive sampling that uses an algorithmic framework and software to generate dynamically updated decision strategies. During this process, the system focuses on sequencing the DNA fragments most likely to contribute to the success of the experiment, either rejecting or accepting reads as they enter the nanopore.
This marks a significant development as, until now, selection with Oxford Nanopore’s current product line has been based on predetermined regions of interest that are specified in a reference file by the user and remain constant throughout an experiment.
In this study, the authors created a new approach they have named BOSS-RUNS (Benefit-Optimising Short-term Strategy for Read Until Nanopore Sequencing) that dynamically incorporates already-observed data into the decision-making process. They found that by using BOSS-RUNS, decisions could be made using only 0.8 s of sequence data, roughly 350 bases.
This allows real-time enrichment to focus on the areas of highest benefit, which can provide even more coverage across a genome, or between genomes, mitigating sampling or preparation biases. It can also enhance variant calling capabilities, including for SNPs, because it increases the data collected from the most relevant genomic positions.
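As a rough illustration of the idea of updating the accept/reject decision from data already observed, the toy Python simulation below may help. It is not the BOSS-RUNS algorithm and does not use any Oxford Nanopore API: the benefit score (one divided by one plus the current coverage), the threshold, the genome and read lengths, and all function names are hypothetical choices made only for this sketch.

```python
import random

GENOME_LENGTH = 1_000   # hypothetical toy genome size, in bases
READ_LENGTH = 100       # hypothetical read length, in bases
THRESHOLD = 0.05        # hypothetical benefit cut-off for accepting a read


def position_benefit(coverage):
    """Toy benefit score: positions with less coverage are worth more."""
    return 1.0 / (1.0 + coverage)


def should_accept(start, coverage):
    """Accept a read if the region it would cover still has a high mean benefit."""
    span = coverage[start:start + READ_LENGTH]
    mean_benefit = sum(position_benefit(c) for c in span) / len(span)
    return mean_benefit >= THRESHOLD


def simulate(num_reads, seed=0):
    """Simulate biased sampling with a decision rule that updates as data arrive."""
    rng = random.Random(seed)
    coverage = [0] * GENOME_LENGTH
    accepted = rejected = 0
    for _ in range(num_reads):
        # Sampling bias: about 90% of reads start in the first half of the genome.
        if rng.random() < 0.9:
            start = rng.randrange(0, GENOME_LENGTH // 2 - READ_LENGTH)
        else:
            start = rng.randrange(GENOME_LENGTH // 2, GENOME_LENGTH - READ_LENGTH)
        if should_accept(start, coverage):
            # Accepted reads are sequenced in full and change future decisions.
            for i in range(start, start + READ_LENGTH):
                coverage[i] += 1
            accepted += 1
        else:
            # Rejected reads add no coverage, freeing the pore for the next read.
            rejected += 1
    return coverage, accepted, rejected
```

Calling `simulate(5000)` and comparing the mean coverage of the two halves of the toy genome gives a feel for how a decision rule that depends on accumulated coverage can counteract a sampling bias, in contrast to a fixed list of target regions.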
This study used the GridION device, which suits this application because its powerful on-board NVIDIA GPU provides the rapid basecalling needed for real-time decisions. The software interfaces (APIs) used by the team to extend the features of adaptive sampling are freely available to the Nanopore Community, and we encourage users to experiment with, and extend, our nanopore sequencing toolset.
Ewan Birney, Director of the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), said:
“This has real utility. In this use case it is about trading additional coverage on highly prevalent bacteria for more coverage on the less prevalent bacteria. And this can all be done as the experiment runs - the system dynamically senses the right balance at every point. We think this has many other applications beyond metagenomics, including genomic assembly, human resequencing and other experiments as well.”
The paper can be accessed in full here.