
MaSuRCA assembler version 3.2.2 for hybrid ONT genome assembly


The newest version of the MaSuRCA genome assembler can now produce highly contiguous assemblies from long reads generated by Oxford Nanopore's MinION sequencer in combination with deep-coverage short Illumina reads. The open-source MaSuRCA (Maryland Super-Reads with Celera Assembler) genome assembly software has been under development at the University of Maryland and Johns Hopkins University since 2011, with recent work focusing on the assembly of hybrid data sets (Zimin et al., 2013). MaSuRCA is based on the idea of super-reads, originally suggested by James A. Yorke, in which short Illumina reads are extended using k-mers into longer, highly accurate super-reads. Because super-reads are longer than the original reads, they can be mapped more accurately to long MinION reads. The latest MaSuRCA release (v3.2.2) employs a new mega-reads algorithm (Zimin et al., 2016) that uses super-reads to correct errors in each long read and then assembles the resulting mega-reads. Using a publicly available 30x-coverage Nanopore data set for human NA12878, MaSuRCA 3.2.2 constructed an assembly with an N50 contig size of 4.06 million base pairs (Mbp) and an N50 scaffold size of 5.04 Mbp, totaling 2.88 billion bases. The assembly covers over 95% of the human reference genome with a sequence quality of fewer than 4 errors per 10,000 bases. It took about 50,000 CPU-hours on an AMD Opteron 6000 series cluster. The latest release of MaSuRCA is available from http://masurca.blogspot.com.

Authors: Aleksey V. Zimin, Guillaume Marçais, Daniela Puiu, Michael Roberts, Steven L. Salzberg, James A. Yorke
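
The super-read idea mentioned in the abstract can be illustrated with a toy sketch. This is not MaSuRCA's implementation: the function names (`count_kmers`, `extend_right`, `extend_left`, `super_read`), the example sequence, and the choice of k are all hypothetical, and the sketch assumes error-free reads while ignoring reverse complements and k-mer count thresholds. It only shows the core principle that a read is extended one base at a time for as long as the k-mer table allows exactly one continuation.

```python
from collections import Counter

BASES = "ACGT"


def count_kmers(reads, k):
    """Count every k-mer that occurs in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def _kmers_of(seq, k):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def extend_right(seq, kmers, k, max_steps=10_000):
    """Append bases while exactly one next k-mer is present in the table."""
    if k < 2 or len(seq) < k:
        return seq
    seen = _kmers_of(seq, k)  # guards against walking around a cycle
    for _ in range(max_steps):
        suffix = seq[-(k - 1):]
        candidates = [b for b in BASES if suffix + b in kmers]
        if len(candidates) != 1:
            break  # dead end or ambiguous branch: stop extending
        nxt = suffix + candidates[0]
        if nxt in seen:
            break
        seen.add(nxt)
        seq += candidates[0]
    return seq


def extend_left(seq, kmers, k, max_steps=10_000):
    """Prepend bases while exactly one previous k-mer is present in the table."""
    if k < 2 or len(seq) < k:
        return seq
    seen = _kmers_of(seq, k)
    for _ in range(max_steps):
        prefix = seq[:k - 1]
        candidates = [b for b in BASES if b + prefix in kmers]
        if len(candidates) != 1:
            break
        prev = candidates[0] + prefix
        if prev in seen:
            break
        seen.add(prev)
        seq = candidates[0] + seq
    return seq


def super_read(read, kmers, k):
    """Extend a read in both directions as far as the extension is unique."""
    return extend_left(extend_right(read, kmers, k), kmers, k)


# Toy data (hypothetical): overlapping error-free 8-base reads from a 20-base sequence.
genome = "ATGGCGTACCTTAGGACTCA"
reads = [genome[i:i + 8] for i in range(0, 13, 3)]
kmers = count_kmers(reads, k=5)
super_reads = {super_read(r, kmers, k=5) for r in reads}
```

In this toy case every read extends to the same sequence, so many short reads collapse into one longer super-read. That collapse is the property the abstract relies on when it says super-reads map more accurately to long reads than the original short reads do.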
