Highly multiplexed, fast and accurate nanopore sequencing for verification of synthetic DNA constructs and sequence libraries

Synthetic biology utilises the Design-Build-Test-Learn pipeline for the engineering of biological systems. Typically, this requires the construction of specifically designed, large and complex DNA assemblies. The availability of cheap DNA synthesis and automation enables high-throughput assembly approaches, which generate a heavy demand for DNA sequencing to verify correctly assembled constructs. Next-generation sequencing is ideally positioned to perform this task; however, because of high hardware costs and bespoke data analysis requirements, few laboratories use this technology in-house. Here a workflow for highly multiplexed sequencing is presented, capable of fast and accurate sequence verification of DNA assemblies using nanopore technology. A novel PCR-based sample barcoding system is introduced, and sequencing data is analysed with a bespoke analysis algorithm. Crucially, this algorithm overcomes the high error rate of nanopore data (which typically prevents identification of single nucleotide variants) through statistical analysis of strand bias, permitting accurate sequence analysis with single-base resolution. As an example, 576 constructs (6 × 96-well plates) were processed in a single workflow in 72 hours (from E. coli colonies to analysed data). With its low hardware costs and highly multiplexed capability, the workflow provides cost-effective access to powerful DNA sequencing for any laboratory, with applications beyond synthetic biology including directed evolution, SNP analysis and gene synthesis.
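
To illustrate the idea of a strand-bias check, the following is a minimal sketch and not the algorithm used in this work: the abstract does not specify the statistical test, so a two-sided Fisher's exact test is assumed here as one common choice. The function names, the per-strand read counts and the `alpha` threshold are all hypothetical. The reasoning is that a genuine variant should be supported at similar frequencies by forward- and reverse-strand reads, whereas a systematic sequencing error tends to appear predominantly on one strand.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows are forward/reverse strand,
    columns are reads supporting the reference/alternate base.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables that are no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


def is_strand_biased(fwd_ref, fwd_alt, rev_ref, rev_alt, alpha=0.01):
    """Flag a candidate variant whose support differs significantly by strand.

    alpha is an illustrative threshold, not a value taken from the paper.
    """
    return fisher_exact_two_sided(fwd_ref, fwd_alt, rev_ref, rev_alt) < alpha
```

In such a scheme, a candidate single nucleotide variant flagged as strand-biased would be treated as a likely sequencing artefact, while one supported evenly on both strands would be retained for reporting.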

Authors: Andrew Currin, Neil Swainston, Mark S Dunstan, Adrian J Jervis, Paul Mulherin, Christopher J Robinson, Sandra Taylor, Pablo Carbonell, Katherine A Hollywood, Cunyu Yan, Eriko Takano, Nigel S Scrutton, Rainer Breitling