Mash: fast genome and metagenome distance estimation using MinHash

  • Published on: June 20, 2016
  • Source: Genome Biology

Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence collections. Mash reduces large sequences and sequence sets to small, representative sketches, from which global mutation distances can be rapidly estimated. We demonstrate several use cases, including the clustering of all 54,118 NCBI RefSeq genomes in 33 CPU h; real-time database search using assembled or unassembled Illumina, Pacific Biosciences, and Oxford Nanopore data; and the scalable clustering of hundreds of metagenomic samples by composition. Mash is freely released under a BSD license (https://github.com/marbl/mash).

Authors: Brian D. Ondov, Todd J. Treangen, Páll Melsted, Adam B. Mallonee, Nicholas H. Bergman, Sergey Koren, Adam M. Phillippy
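
The sketch-and-compare idea in the abstract can be illustrated with a minimal Python sketch. It is not the Mash implementation: the function names, the use of `hashlib.blake2b` as the hash, and the small parameter values in the usage note are assumptions made for illustration. It builds a bottom-s MinHash sketch of canonical k-mers, estimates the Jaccard index j from two sketches, and converts j to a mutation distance with the formula given in the paper, D = -(1/k) ln(2j / (1 + j)).

```python
import hashlib
import math

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def sketch(seq, k=21, s=1000):
    """Bottom-s MinHash sketch: the s smallest distinct hashes of canonical k-mers.

    Assumption: blake2b stands in for whatever hash a real tool would use.
    """
    hashes = set()
    for i in range(len(seq) - k + 1):
        kmer = canonical(seq[i:i + k])
        digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
        hashes.add(int.from_bytes(digest, "big"))
    return sorted(hashes)[:s]


def jaccard_estimate(sketch_a, sketch_b, s=1000):
    """Estimate the Jaccard index from the s smallest hashes of the merged sketches."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    merged = sorted(set_a | set_b)[:s]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def mash_distance(j, k=21):
    """Convert a Jaccard estimate j into a mutation distance for k-mer size k."""
    if j == 0:
        return 1.0  # convention assumed here: no shared hashes means maximal distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)
```

For example, `mash_distance(jaccard_estimate(sketch(a, k=5, s=50), sketch(b, k=5, s=50), s=50), k=5)` would compare two short strings `a` and `b`; identical inputs give a distance of 0. Real genomes need a larger k and sketch size to give meaningful estimates.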