De novo assembly of large eukaryotic genomes with long nanopore reads plus scaffolding using Pore-C

Poster

Date: 5th December 2019

Long flip-flop-basecalled nanopore reads simplify de novo assembly of eukaryotic genomes, resulting in increased contiguity and accuracy. Pore-C can be used to correct and scaffold the assembled contigs.

Fig. 1 Illustration of a typical approach to genome assembly

Long nanopore reads enable de novo assembly of large and complex genomes

Nanopore reads can reach hundreds of kilobases in length, which is more than sufficient to span entire viral genomes in single reads. For bacterial and larger genomes, however, a complete genome sequence currently has to be reconstructed by aligning and joining together overlapping sequence reads, a process termed ‘de novo genome assembly’ (Fig. 1). Assembling genomes from short-read sequencing data is computationally challenging, and the results tend to be imperfect, particularly when the genomes contain extensive repetitive regions. Long reads make assembly far easier and allow us to resolve repeats and structural variants that are several kilobases in length.
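
The idea of joining overlapping reads can be illustrated with a toy sketch. This is a deliberately naive greedy overlap merge on exact string matches, not the method used by production assemblers such as redbean; the function names and the `min_len` threshold are hypothetical, and real reads contain errors that require alignment rather than exact matching.

```python
from itertools import permutations


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest exact overlap.

    Toy illustration only: assumes error-free reads from a single strand.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_a, best_b = 0, None, None
        for a, b in permutations(contigs, 2):
            k = overlap(a, b, min_len)
            if k > best_len:
                best_len, best_a, best_b = k, a, b
        if best_len == 0:
            break  # no remaining overlaps; leave the contigs separate
        contigs.remove(best_a)
        contigs.remove(best_b)
        contigs.append(best_a + best_b[best_len:])
    return contigs


reads = ["ATGGCGT", "GCGTACC", "TACCGGA"]
print(greedy_assemble(reads))
```

When a repeat is longer than the reads, several merges look equally good and a greedy approach like this can join the wrong pair, which is one way to see why reads that span the whole repeat simplify the problem.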

Fig. 2 Read length: a) typical distribution, b) assembly, c) mapped long human MinION reads

Nanopore sequencing can give long reads without the need for size selection

The read length that can be obtained from nanopore sequencing is limited only by the integrity of the DNA extracted from the sample and the care taken during library preparation. The read-length distribution corresponds closely to the fragment-length distribution of the sample DNA. When starting with high-molecular-weight genomic DNA, it is straightforward to obtain reads that are tens of kilobases in length (Fig. 2a). The longer the sequence read, the longer the repetitive region or structural variant (SV) that can be resolved, allowing the correct structure of the variant to be elucidated (Fig. 2b). Recent increases in throughput make it realistic to sequence whole human genomes on a MinION (Fig. 2c).
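
As a rough way to relate a read-length distribution to the size of feature it can resolve, the sketch below computes the fraction of reads long enough to cover a repeat or SV plus some unique flanking sequence on each side. The function name, the 1 kb default flank and the example lengths are assumptions made for illustration; being long enough is necessary but not sufficient, since a read must also start and end in the right places.

```python
def fraction_spanning(read_lengths, feature_len, flank=1000):
    """Fraction of reads at least `feature_len + 2 * flank` bases long.

    `flank` is the unique sequence assumed to be needed on each side of the
    feature to anchor the read (a hypothetical default, not a published value).
    """
    if not read_lengths:
        return 0.0
    needed = feature_len + 2 * flank
    long_enough = sum(1 for n in read_lengths if n >= needed)
    return long_enough / len(read_lengths)


# Made-up read lengths in bases, for illustration only.
example_lengths = [5_000, 12_000, 30_000, 80_000]
print(fraction_spanning(example_lengths, feature_len=10_000))
```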

Fig. 3 Nanopore-only assembly of the Sadri and Basmati varieties of rice

Highly contiguous assemblies of Sadri and Basmati varieties of rice

Rice is the third-highest agricultural commodity worldwide and is a staple food for around one third of the world’s population. The genome is diploid, with 12 pairs of chromosomes, and is just under 400 Mb in size. We generated ~47x and ~57x coverage of flip-flop-basecalled reads for the varieties Sadri and Basmati, respectively, and assembled both genomes using a custom workflow (Fig. 3a). Benchmarking analysis showed the resulting assemblies to be highly contiguous (Figs. 3b–3d), with assembly N50 values of 11.05 and 8.86 Mb, respectively. BUSCO analysis (Fig. 3e) indicates a level of completeness in these assemblies that is comparable with the reference assembly, even though our complete sample-to-answer nanopore-only workflow took one week, compared to many months and several technologies for the reference.
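
For readers unfamiliar with the metrics quoted above, the following sketch shows how an assembly N50 is conventionally computed and how a coverage figure translates into sequenced bases. The helper names and the example contig lengths are illustrative assumptions; the 400 Mb genome size is a rounded-up value taken from the text, so the resulting yield is only an approximate upper estimate.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if running * 2 >= total:
            return n
    return 0  # empty input


def bases_for_coverage(genome_size, depth):
    """Total sequenced bases implied by a mean depth of coverage."""
    return genome_size * depth


# Made-up contig lengths, for illustration only.
print(n50([10, 8, 6, 4, 2]))

# ~47x of a genome assumed to be about 400 Mb, expressed in gigabases.
print(bases_for_coverage(400_000_000, 47) / 1e9)
```

Because N50 is normalised by the total assembly length rather than the true genome size, it is best compared between assemblies of the same genome.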

Fig. 4 Combining Pore-C and assembly data for NA24385

Improving the contiguity of human-genome assembly by scaffolding with Pore-C data

Assembly contiguity can be improved substantially by adding relatively low coverage of Pore-C data, using the pipeline shown in Fig. 4a. We generated approximately 130 Gb of reads from the human genome NA24385 using one PromethION flowcell and assembled these using redbean. The resulting assembly had an NG50 of 10.4 Mb. Including an additional flowcell of Pore-C reads increased the assembly contiguity substantially, with reads produced using HindIII giving the greatest increase in NG50, to 98.6 Mb (Fig. 4b). Prior to scaffolding, when Pore-C reads are plotted against the redbean contigs, many off-diagonal features are visible, indicating suboptimal assembly (Fig. 4c, shown for Chr. 4). Following Pore-C scaffolding, an improved assembly is obtained, with just three scaffolds covering the entire chromosome (Figs. 4d and 4e).
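
NG50 differs from N50 in that the halfway point is taken relative to an assumed genome size instead of the total assembly length. The sketch below is a minimal illustration of that definition and of how joining contigs into scaffolds raises the value; the function name, the toy lengths and the genome size are hypothetical, not data from this poster.

```python
def ng50(lengths, genome_size):
    """Length L such that sequences of length >= L cover at least half of `genome_size`."""
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if running * 2 >= genome_size:
            return n
    return 0  # the assembly covers less than half of the assumed genome size


# Toy example: five contigs against an assumed genome size of 40 units.
contigs = [10, 8, 6, 4, 2]
print(ng50(contigs, genome_size=40))

# Scaffolding the three largest contigs into one sequence (gaps ignored).
scaffolds = [10 + 8 + 6, 4, 2]
print(ng50(scaffolds, genome_size=40))
```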

© 2019 Oxford Nanopore Technologies. All rights reserved. Oxford Nanopore Technologies' products are currently for research use only. 
