HapSolo: An optimization approach for removing secondary haplotigs during diploid genome assembly and scaffolding


Background Despite marked recent improvements in long-read sequencing technology, the assembly of diploid genomes remains a difficult task. A major obstacle is distinguishing between alternative contigs that represent highly heterozygous regions. If primary and secondary contigs are not properly identified, the primary assembly will overrepresent both the size and complexity of the genome, which complicates downstream analyses such as scaffolding.

Results Here we illustrate a new method, which we call HapSolo, that identifies secondary contigs and defines a primary assembly based on multiple pairwise contig alignment metrics. HapSolo evaluates candidate primary assemblies using BUSCO scores and then distinguishes among candidate assemblies using a cost function. The cost function can be defined by the user but by default considers the number of missing, duplicated and single BUSCO genes within the assembly. HapSolo performs hill climbing to minimize cost over thousands of candidate assemblies. We illustrate the performance of HapSolo on genome data from three species: the Chardonnay grape (Vitis vinifera), a mosquito (Anopheles funestus) and the Korean Mudskipper fish (Periophthalmus magnuspinnatus).
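To make the search procedure concrete, the following is a minimal sketch of hill climbing over alignment-metric thresholds with a BUSCO-based cost. It is not HapSolo's actual code: the cost formula, the `toy_evaluate` stand-in (which replaces a real BUSCO run with made-up arithmetic), and all names and parameters are assumptions chosen for illustration only.

```python
import random
from typing import Callable, Dict, Tuple

BuscoCounts = Dict[str, int]
Thresholds = Tuple[float, ...]


def busco_cost(counts: BuscoCounts) -> float:
    """Illustrative cost (an assumption, not HapSolo's default formula):
    missing plus duplicated genes, relative to single-copy genes."""
    single = counts["single"]
    if single == 0:
        return float("inf")
    return (counts["missing"] + counts["duplicated"]) / single


def toy_evaluate(thresholds: Thresholds) -> BuscoCounts:
    """Hypothetical stand-in for building a candidate assembly and scoring it.
    Stricter thresholds remove more duplicates but lose more genes."""
    strictness = sum(thresholds) / len(thresholds)
    duplicated = round(200 * (1 - strictness))
    missing = round(400 * strictness ** 2)
    return {
        "single": 1000 - duplicated - missing,
        "duplicated": duplicated,
        "missing": missing,
    }


def hill_climb(
    evaluate: Callable[[Thresholds], BuscoCounts],
    start: Thresholds,
    steps: int = 1000,
    step_size: float = 0.05,
    seed: int = 0,
) -> Tuple[Thresholds, float]:
    """Randomly perturb the thresholds and keep a candidate only if it lowers the cost."""
    rng = random.Random(seed)
    best, best_cost = start, busco_cost(evaluate(start))
    for _ in range(steps):
        # Propose a nearby point, clamped to the unit interval.
        candidate = tuple(
            min(1.0, max(0.0, t + rng.uniform(-step_size, step_size)))
            for t in best
        )
        cost = busco_cost(evaluate(candidate))
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


start = (0.9, 0.9)
best, best_cost = hill_climb(toy_evaluate, start)
```

In a real setting, `evaluate` would filter contigs at the given thresholds and return BUSCO counts for the resulting candidate assembly; because the cost is passed through a single function, a user-defined cost could be swapped in without changing the search loop.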

Conclusions HapSolo rapidly identifies candidate assemblies that yield dramatic improvements in assembly metrics, including decreased genome size and improved N50 scores. N50 scores improved by 26%, 8% and 21% for Chardonnay, mosquito and mudskipper, respectively, relative to unreduced primary assemblies. The benefits of HapSolo were amplified in downstream analyses, which we illustrated by scaffolding with Hi-C data.
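For readers unfamiliar with the N50 metric mentioned above, here is a small sketch of how it is commonly computed: the contig length at which contigs of that length or longer account for at least half of the total assembly. The function name and the example lengths are illustrative assumptions, not values from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if running * 2 >= total:
            return length
    return ordered[-1]  # not reached for positive lengths


# Made-up contig lengths: the two longest contigs (6 + 5) cover half of 20.
example = n50([2, 3, 4, 5, 6])  # → 5
```

Removing short secondary contigs lowers the total length without touching the longest contigs, which is one reason a reduced assembly can show a higher N50.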

We found, for example, that prior to the application of HapSolo, only 39% of the Chardonnay genome was captured in the largest 19 scaffolds, corresponding to the number of chromosomes. After the application of HapSolo, this value increased to ~77%. The improvements for mosquito scaffolding were similar to those for Chardonnay, and the improvement for mudskipper was even more dramatic.
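The "fraction of the genome in the largest k scaffolds" statistic used above is simple arithmetic; a hedged sketch follows. The function name and the toy scaffold lengths are assumptions for illustration and do not reproduce the Chardonnay data.

```python
from typing import Sequence


def top_k_fraction(lengths: Sequence[int], k: int) -> float:
    """Fraction of total assembly length contained in the k longest scaffolds."""
    if k < 0:
        raise ValueError("k must be non-negative")
    total = sum(lengths)
    if total == 0:
        raise ValueError("total scaffold length must be positive")
    largest = sorted(lengths, reverse=True)[:k]
    return sum(largest) / total


# Made-up lengths: the two longest scaffolds hold 80 of 100 units.
fraction = top_k_fraction([50, 30, 10, 5, 5], 2)
```

For a species with 19 chromosomes, calling `top_k_fraction(scaffold_lengths, 19)` gives the kind of value quoted in the text.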

Authors: Edwin A. Solares, Yuan Tao, Anthony D. Long, Brandon S. Gaut