
Theory of local k-mer selection with applications to long-read alignment

  • Published on: May 23 2021
  • Source: BioRxiv

Motivation: Selecting a subset of k-mers in a string in a local manner is a common way for bioinformatics tools to speed up computation. Arguably the best-known method is the minimizer technique, which selects the ‘lowest-ordered’ k-mer in a sliding window. Recently, minimizers have been shown to be sub-optimal for selecting subsets of k-mers when mutations are present. However, the theory of why certain methods perform well is poorly understood.
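To make the minimizer idea concrete, here is a minimal Python sketch. It is an illustration only, not the implementation used by any particular tool. It assumes plain lexicographic order as the k-mer ordering and breaks ties by taking the leftmost k-mer; real tools typically use a hash-based ordering instead. The function name `minimizers` and the parameter `w` (the number of consecutive k-mers per window) are hypothetical names chosen for this example.

```python
def minimizers(seq, k, w):
    """Return the sorted start positions of k-mers selected as minimizers.

    Assumptions for illustration: k-mers are ordered lexicographically,
    ties go to the leftmost k-mer, and a window holds w consecutive k-mers.
    """
    # All k-mers of the sequence, indexed by their start position.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    # Slide a window over the k-mer list and keep the smallest k-mer in each.
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (kmers[i], i))
        selected.add(best)
    return sorted(selected)


if __name__ == "__main__":
    print(minimizers("ACGTACGT", k=3, w=2))
```

Because adjacent windows overlap, the same k-mer is often the minimum of several windows, which is why the selected positions are collected in a set.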

Results: We first theoretically investigate the conservation metric for k-mer selection methods and derive an exact expression for the conservation of a k-mer selection method. This expression turns out to be tractable enough for us to prove closed-form expressions for a variety of methods, including (open and closed) syncmers and (α, b, n)-words, as well as an upper bound for minimizers. As a demonstration of our results, we modified the minimap2 read aligner to use a more optimal k-mer selection method and observed up to an 8.2% relative increase in the number of mapped reads.
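The abstract does not define syncmers, so the following Python sketch relies on the commonly used definition, which is an assumption here: a k-mer is an open syncmer if its smallest s-mer (a substring of length s < k) starts at a fixed offset t, and a closed syncmer if its smallest s-mer is its first or last s-mer. The lexicographic ordering, the leftmost tie-breaking rule, and the names `open_syncmers`, `closed_syncmers`, `s`, and `t` are illustrative choices, not taken from the paper or from minimap2.

```python
def _smallest_smer_offset(kmer, s):
    """Offset of the smallest s-mer inside kmer (lexicographic, leftmost on ties)."""
    offsets = range(len(kmer) - s + 1)
    return min(offsets, key=lambda j: (kmer[j:j + s], j))


def open_syncmers(seq, k, s, t=0):
    """Start positions of k-mers whose smallest s-mer sits at offset t."""
    return [i for i in range(len(seq) - k + 1)
            if _smallest_smer_offset(seq[i:i + k], s) == t]


def closed_syncmers(seq, k, s):
    """Start positions of k-mers whose smallest s-mer is the first or last s-mer."""
    return [i for i in range(len(seq) - k + 1)
            if _smallest_smer_offset(seq[i:i + k], s) in (0, k - s)]


if __name__ == "__main__":
    print(open_syncmers("ACGTACGT", k=4, s=2))
    print(closed_syncmers("ACGTACGT", k=4, s=2))
```

Under this definition, whether a k-mer is selected depends only on the k-mer itself, not on its neighbours in a window, which is the main structural difference from minimizers.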

Authors: Jim Shaw, Yun William Yu
