*De novo* clustering of long-read transcriptome data using a greedy, quality value-based algorithm

Published on: March 16, 2020

Long-read sequencing of transcripts with Pacific Biosciences (PacBio) Iso-Seq and Oxford Nanopore Technologies has proven to be central to the study of complex isoform landscapes in many organisms. However, current de novo transcript reconstruction algorithms from long-read data are limited, leaving the potential of these technologies unfulfilled. A common bottleneck is the dearth of scalable and accurate algorithms for clustering long reads according to their gene family of origin.

To address this challenge, we develop isONclust, a clustering algorithm that is greedy (to scale) and makes use of quality values (to handle variable error rates).
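The two ideas named above, greedy assignment and the use of quality values, can be illustrated with a toy sketch. This is not the isONclust implementation: the k-mer set comparison, the `k` and `min_shared` parameters, the Phred+33 encoding, and all function names are assumptions chosen for illustration. Reads are processed from lowest to highest expected error rate, and each read joins the first cluster whose representative shares enough k-mers with it, or otherwise starts a new cluster.

```python
from typing import List, Set, Tuple

Read = Tuple[str, str, str]  # (name, sequence, Phred+33 quality string)


def expected_error_rate(quals: str) -> float:
    """Mean per-base error probability, assuming Phred+33 encoding."""
    probs = [10 ** (-(ord(c) - 33) / 10) for c in quals]
    return sum(probs) / len(probs)


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_cluster(reads: List[Read], k: int = 5,
                   min_shared: float = 0.5) -> List[List[str]]:
    """Toy greedy clustering (hypothetical, not the published algorithm).

    Reads are visited in order of increasing expected error rate, so the
    most reliable reads become cluster representatives first.
    """
    ordered = sorted(reads, key=lambda r: expected_error_rate(r[2]))
    representatives: List[Set[str]] = []  # k-mer set of each cluster's first read
    clusters: List[List[str]] = []        # read names per cluster

    for name, seq, _quals in ordered:
        read_kmers = kmers(seq, k)
        for i, rep in enumerate(representatives):
            # Fraction of this read's k-mers also found in the representative.
            shared = len(read_kmers & rep) / max(1, len(read_kmers))
            if shared >= min_shared:
                clusters[i].append(name)  # greedy: take the first match
                break
        else:
            # No representative was similar enough: start a new cluster.
            representatives.append(read_kmers)
            clusters.append([name])
    return clusters


if __name__ == "__main__":
    example = [
        ("a1", "ACGTACGTTAGCCGATAGCT", "I" * 20),
        ("a2", "ACGTACGTTAGCCGATAGCA", "5" * 20),
        ("b1", "TTTTGGGGCCCCAAAATTGG", "I" * 20),
    ]
    print(greedy_cluster(example))
```

Because each read is compared only against one representative per cluster rather than against every other read, the number of comparisons grows with the number of clusters instead of the number of read pairs, which is the scaling argument behind a greedy design.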

We test isONclust on three simulated and five biological data sets, across a breadth of organisms, technologies, and read depths. Our results demonstrate that isONclust is a substantial improvement over previous approaches in overall accuracy, in scalability to large data sets, or in both.

Authors: Kristoffer Sahlin, Paul Medvedev

Full text - Journal of Computational Biology
