Main menu

Pushing the limits of de novo genome assembly for complex prokaryotic genomes harboring very long, near identical repeats


Generating a complete, de novo genome assembly for prokaryotes is often considered a solved problem. However, we here show that Pseudomonas koreensis P19E3 harbors multiple, near identical repeat pairs up to 70 kilobase pairs in length. Beyond long repeats, the P19E3 assembly was further complicated by a shufflon region. Its complex genome could not be de novo assembled with long reads produced by Pacific Biosciences technology, but required very long reads from the Oxford Nanopore Technology. Another important factor for a full genomic resolution was the choice of assembly algorithm. Importantly, a repeat analysis indicated that very complex bacterial genomes represent a general phenomenon beyond Pseudomonas. Roughly 10% of 9331 complete bacterial and a handful of 293 complete archaeal genomes represented this dark matter for de novo genome assembly of prokaryotes. Several of these dark matter genome assemblies contained repeats far beyond the resolution of the sequencing technology employed and likely contain errors, other genomes were closed employing labor-intense steps like cosmid libraries, primer walking or optical mapping. Using very long sequencing reads in combination with assemblers capable of resolving long, near identical repeats will bring most prokaryotic genomes within reach of fast and complete de novo genome assembly.

Authors: Michael Schmid, Daniel Frei, Andrea Patrignani, Ralph Schlapbach, Juerg E. Frey, Mitja N.P. Remus-Emsermann, Christian H. Ahrens

入门指南

购买 MinION 启动包 Nanopore 商城 测序服务提供商 全球代理商

纳米孔技术

订阅 Nanopore 更新 资源库及发表刊物 什么是 Nanopore 社区

关于 Oxford Nanopore

新闻 公司历程 可持续发展 领导团队 媒体资源和联系方式 投资者 合作者 在 Oxford Nanopore 工作 职位空缺 商业信息 BSI 27001 accreditationBSI 90001 accreditationBSI mark of trust
Chinese flag