Publications tagged "Tools"

Methylartist: Tools for visualising modified bases from nanopore sequence data

Methylartist is a consolidated suite of tools for processing, visualising, and analysing nanopore methylation data derived from modified basecalling methods. All detectable methylation types (e.g. 5...

Trycycler: consensus long-read assemblies for bacterial genomes

Assembly of bacterial genomes from long-read data (generated by Oxford Nanopore or Pacific Biosciences platforms) can often be complete: a single contig for each chromosome or plasmid in the genome....

Dysgu: efficient structural variant calling using short or long reads

Structural variation (SV) plays a fundamental role in genome evolution and can underlie inherited or acquired diseases such as cancer. Long-read sequencing technologies have led to improvements in t...

Jasmine: population-scale structural variant comparison and analysis

The increasing availability of long-reads is revolutionizing studies of structural variants (SVs). However, because SVs vary across individuals and are discovered through imprecise read technologies...

BoardION: real-time monitoring of Oxford Nanopore sequencing instruments

Background One of the main advantages of the Oxford Nanopore Technology (ONT) is the possibility of real-time sequencing. This gives access to information during the experiment and allows either to...

NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks

Motivation Oxford Nanopore sequencing producing long reads at low cost has made many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assembly affect the ac...

WeFaceNano: a user-friendly pipeline for complete ONT sequence assembly and detection of antibiotic resistance in multi-plasmid bacterial isolates

Bacterial plasmids often carry antibiotic resistance genes and are a significant factor in the spread of antibiotic resistance. The ability to completely assemble plasmid sequences would facilitate ...

PlasLR enables adaptation of plasmid prediction for error-prone long reads

Plasmids are extra-chromosomal genetic elements commonly found in bacterial cells that support many functional aspects including environmental adaptations. The identification of these genetic elemen...

Theory of local k-mer selection with applications to long-read alignment

Motivation Selecting a subset of k-mers in a string in a local manner is a common task in bioinformatics tools for speeding up computation. Arguably the most well-known and common method is the mini...

DNA methylation calling tools for Oxford Nanopore sequencing: a survey and human epigenome-wide evaluation

Background Nanopore long-read sequencing technology greatly expands the capacity of long-range single-molecule DNA-modification detection. A growing number of analytical tools have been actively dev...

Hapo-G, haplotype-aware polishing of genome assemblies with accurate reads

Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (kilobases to megabases order)...

BugSeq 16S: NanoCLUST with improved consensus sequence classification

NanoCLUST has enabled species-level taxonomic classification from noisy nanopore 16S sequencing data for BugSeq’s users and the broader nanopore sequencing community. We noticed a high misclassifica...

MMMVI: Detecting SARS-CoV-2 Variants of Concern in Metagenomic Samples

Motivation SARS-CoV-2 is the causative agent of the COVID-19 pandemic. Variants of Concern (VOCs) and Variants of Interest (VOIs) are lineages that represent a greater risk to public health, and can...

Illuminating the transposon insertion landscape in plants using Cas9-targeted Nanopore sequencing and a novel pipeline

Transposable elements (TEs), which occupy significant portions of most plant genomes, are a major source of genomic novelty, contributing to plant adaptation, speciation and new cultivar production....

Linked machine learning classifiers improve species classification of fungi when using error-prone long-reads on extended metabarcodes

The increased usage of long-read sequencing for metabarcoding has not been matched with public databases suited for error-prone long-reads. We address this gap and present a proof-of-concept study f...

JAFFAL: Detecting fusion genes with long read transcriptome sequencing

Massively parallel short read transcriptome sequencing has greatly expanded our knowledge of fusion genes which are drivers of tumor initiation and progression. In cancer, many fusions are also impo...

A comparative analysis of computational tools for the prediction of epigenetic DNA methylation from long-read sequencing data

Recent development of Oxford Nanopore long-read sequencing has opened new avenues of identifying epigenetic DNA methylation. Among the different epigenetic DNA methylations, N6-methyladenosine is th...

SENSV: detecting structural variations with precise breakpoints using low-depth WGS data from a single Oxford Nanopore MinION flowcell

Structural variation (SV) is a major cause of genetic disorders. In this paper, we show that low-depth (specifically, 4x) whole-genome sequencing using a single Oxford Nanopore MinION flow cell suff...

InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses

Motivation: InterARTIC is an interactive web application for the analysis of viral whole-genome sequencing (WGS) data generated on Oxford Nanopore Technologies (ONT) devices. A graphical interface e...

DNA-based data storage via combinatorial assembly

Persistent data storage is the basis of all modern information systems. The long-term value and volume of data are growing at an accelerating rate and pushing extant storage systems to their limits....

Penguin: a tool for predicting pseudouridine sites in direct RNA Nanopore sequencing data

Pseudouridine is one of the most abundant RNA modifications, occurring when uridines are catalyzed by Pseudouridine synthase proteins. It plays an important role in many biological processes and als...

ModPhred: an integrative toolkit for the analysis and storage of nanopore sequencing DNA and RNA modification data

DNA and RNA modifications can now be identified using Nanopore sequencing. However, we currently lack a flexible software to efficiently encode, store, analyze and visualize DNA and RNA modification...

Freely accessible ready to use global infrastructure for SARS-CoV-2 monitoring

The COVID-19 pandemic is the first global health crisis to occur in the age of big genomic data. Although data generation capacity is well established and sufficiently standardized, analytical capac...

Pan-genomic matching statistics for targeted nanopore sequencing

Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software ca...

Deeplasmid: Deep learning accurately separates plasmids from bacterial chromosomes

Plasmids are mobile genetic elements that play a key role in microbial ecology and evolution by mediating horizontal transfer of important genes, such as antimicrobial resistance genes. Many microbi...

Tiled-ClickSeq for targeted sequencing of complete coronavirus genomes with simultaneous capture of RNA recombination and minority variants

High-throughput genomics of SARS-CoV-2 is essential to characterize virus evolution and to identify adaptations that affect pathogenicity or transmission. While single-nucleotide variations (SNVs) a...

MinION barcodes: biodiversity discovery and identification by everyone, for everyone

DNA barcodes are a useful tool for discovering, understanding, and monitoring biodiversity. This is critical at a time when biodiversity loss is a major problem for many countries. However, widespre...

ANCHOR, a technical approach to monitor single-copy locus localization in planta

Gene expression is governed by several layers of regulation which in addition to genome organization, local chromatin structure, gene accessibility and the presence of transcription factors also inc...

Haplotype-aware variant calling enables high accuracy in nanopore long-reads using deep neural networks

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read based phasing. ...

Adaptation of Oxford Nanopore technology for hepatitis C whole genome sequencing and identification of within-host viral variants

Background Hepatitis C (HCV) and many other RNA viruses exist as rapidly mutating quasi-species populations in a single infected host. High throughput characterization of full genome, within-host v...

Identification and quantification of SARS-CoV-2 leader subgenomic mRNA gene junctions

Introduction: SARS-CoV-2 has a complex strategy for the transcription of viral subgenomic mRNAs (sgmRNAs), which are targets for nucleic acid diagnostics. Each of these sgRNAs has a unique 5′ sequenc...

Epstein-Barr virus long non-coding RNA RPMS1 full-length spliceome in transformed epithelial tissue

Epstein-Barr virus is associated with two types of epithelial neoplasms, nasopharyngeal carcinoma and gastric adenocarcinoma. The viral long non-coding RNA RPMS1 is the most abundantly expressed pol...

Rapid and detailed characterization of transgene insertion sites in genetically modified plants via Nanopore sequencing

Molecular characterization of genetically modified plants can provide crucial information for the development of detection and identification methods, to comply with traceability, and labeling requi...

IsoTV: processing and visualizing functional features of translated transcript isoforms

Despite the continuous discovery of new transcript isoforms, fueled by the recent increase in accessibility and accuracy of long-read RNA sequencing data, functional differences between isoforms ori...

NGSpeciesID: DNA barcode and amplicon consensus generation from long‐read sequencing data

Third‐generation sequencing technologies, such as Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio), have gained popularity over the last years. These platforms can generate millio...

Evaluation of full-length nanopore 16S sequencing for detection of pathogens in microbial keratitis

Background Microbial keratitis is a leading cause of preventable blindness worldwide. Conventional sampling and culture techniques are time-consuming, with over 40% of cases being culture-negative...

Automated strain separation in low-complexity metagenomes using long reads

High-throughput short-read metagenomics has enabled large-scale species-level analysis and functional characterization of microbial communities. Microbiomes often contain multiple strains of the sam...

A benchmarking of human mitochondrial DNA haplogroup classifiers from whole-genome and whole-exome sequence data

The mitochondrial genome (mtDNA) is of interest for a range of fields including evolutionary, forensic, and medical genetics. Human mitogenomes can be classified into evolutionary related haplogroup...

On the application of BERT models for nanopore methylation detection

Motivation DNA methylation is a common epigenetic modification, which is widely associated with various biological processes, such as gene expression, aging, and disease. Nanopore sequencing provide...

LoopViz: A uLoop assembly clone verification tool for nanopore sequencing reads

Cloning has been an integral part of most laboratory research questions and continues to be an essential tool in defining the genetic elements determining life. Cloning can be difficult and time con...

MicroPIPE: An end-to-end solution for high-quality complete bacterial genome construction

Oxford Nanopore Technology (ONT) long-read sequencing has become a popular platform for microbial researchers; however, easy and automated construction of high-quality bacterial genomes remains chal...

GraphUnzip: unzipping assembly graphs with long reads and Hi-C

Long reads and Hi-C have revolutionized the field of genome assembly as they have made highly continuous assemblies accessible for challenging genomes. As haploid chromosome-level assemblies are now...

A deep learning framework for real-time detection of novel pathogens during sequencing

Motivation Novel pathogens evolve quickly and may emerge rapidly, causing dangerous outbreaks or even global pandemics. Next-generation sequencing is the state-of-the art in open-view pathogen detec...

Haploflow: Strain-resolved de novo assembly of viral genomes

In viral infections often multiple related viral strains are present, due to coinfection or within-host evolution. We describe Haploflow, a de Bruijn graph-based assembler for de novo genome assembl...

Unique K-mer sequences for validating cancer-related substitution, insertion and deletion mutations

Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that m...

Raven: a de novo genome assembler for long reads

We present new methods for the improvement of long-read de novo genome assembly incorporated into a straightforward tool called Raven (https://github.com/lbcb-sci/raven). Compared with other assembl...

HapSolo: An optimization approach for removing secondary haplotigs during diploid genome assembly and scaffolding

Background Despite marked recent improvements in long-read sequencing technology, the assembly of diploid genomes remains a difficult task. A major obstacle is distinguishing between alternative con...

Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data

Background Compared with second-generation sequencing technologies, third-generation single-molecule RNA sequencing has unprecedented advantages; the long reads it generates facilitate isoform-leve...

PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores

Motivation Recent advances in high-throughput long-read sequencers, such as PacBio and Oxford Nanopore sequencers, produce longer reads with more errors than short-read sequencers. In addition to t...

LIQA: Long-read Isoform Quantification and Analysis

Long-read RNA sequencing (RNA-seq) technologies have made it possible to sequence full-length transcripts, facilitating the exploration of isoform-specific gene expression over conventional short-rea...

Freddie: annotation-independent detection and discovery of transcriptomic alternative splicing isoforms

Alternative splicing (AS) is an important mechanism in the development of many cancers, as novel or aberrant AS patterns play an important role as an independent onco-driver. In addition, cancer-spe...

DNAModAnnot: a R toolbox for DNA modification filtering and annotation

Motivation Long-read sequencing technologies can be employed to detect and map DNA modifications at the nucleotide resolution on a genome-wide scale. However, published software packages neglect th...

SEQU-INTO: Early detection of impurities, contamination and off-targets (ICOs) in long read/MinION sequencing

The MinION sequencer by Oxford Nanopore Technologies turns DNA and RNA sequencing into a routine task in biology laboratories or in field research. For downstream analysis it is required to have a s...

Machine Boss: rapid prototyping of bioinformatic automata

Motivation Many C++ libraries for using Hidden Markov Models in bioinformatics focus on inference tasks, such as likelihood calculation, parameter-fitting, and alignment. However, construction of th...

SARS-CoV-2 RECoVERY: a multi-platform open-source bioinformatic pipeline for the automatic construction and analysis of SARS-CoV-2 genomes from NGS sequencing data

Background Since its first appearance in December 2019, the novel Severe Acute Respiratory Syndrome Coronavirus type 2 (SARS-CoV-2), spread worldwide causing an increasing number of cases and deaths...

NanoMethViz: an R/Bioconductor package for visualizing long-read methylation data

Motivation A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. Tools for effective visualization of data generated by thi...

S-IRFindeR: stable and accurate measurement of intron retention

Accurate quantification of intron retention levels is currently the crux for detecting and interpreting the function of retained introns. Using both simulated and real RNA-seq datasets, we show that...

Rapid screening and detection of inter-type viral recombinants using phylo-k-mers

Motivation Novel recombinant viruses may have important medical and evolutionary significance, as they sometimes display new traits not present in the parental strains. This is particularly concerni...

Merqury: reference-free quality and phasing assessment for genome assemblies

Recent long-read assemblies often exceed the quality of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation based...

DeeReCT-APA: prediction of alternative polyadenylation site usage through deep learning

Alternative polyadenylation (APA) is a crucial step in post-transcriptional regulation. Previous bioinformatic works have mainly focused on the recognition of polyadenylation sites (PAS) in a given ...

Rapid Mycobacterium tuberculosis spoligotyping from uncorrected long reads using Galru

Spoligotyping of Mycobacterium tuberculosis provides a subspecies classification of this major human pathogen. Spoligotypes can be predicted from short read genome sequencing data; however, no metho...

MINTyper: A method for generating phylogenetic distance matrices with long read sequencing data

In this paper we present a complete pipeline for generating a phylogenetic distance matrix from a set of sequencing reads. Importantly, the program is able to handle a mix of both short reads from t...

Two-pass alignment using machine-learning-filtered splice junctions increases the accuracy of intron detection in long-read RNA sequencing

Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long-reads reveals the true complexity of processing, however the relatively...

Reads2Resistome: An adaptable and high-throughput whole-genome sequencing pipeline for bacterial resistome characterization

Summary The bacterial resistome is the collection of all the antibiotic resistance genes, virulence genes, and other resistance elements within a bacterial isolate genome including plasmids and bact...

Efficiently processing amplicon sequencing data for microbial ecology with dadasnake, a DADA2 implementation in Snakemake

Background Amplicon sequencing of phylogenetic marker genes, e.g. 16S, 18S or ITS rRNA sequences, is still the most commonly used method to estimate the structure of microbial communities. Microbial...

NanoCLUST: a species-level analysis of 16S rRNA nanopore sequencing data

Summary NanoCLUST is an analysis pipeline for classification of amplicon-based full-length 16S rRNA nanopore reads. It is characterized by an unsupervised read clustering step, based on Uniform Mani...

abPOA: an SIMD-based C library for fast partial order alignment using adaptive band

Summary Partial order alignment, which aligns a sequence to a directed acyclic graph, is now frequently used as a key component in long-read error correction and assembly. We present abPOA (adaptive...

AnVIL: An overlap-aware genome assembly scaffolder for linked reads

10X Genomics Chromium linked reads contain information that can be used to link sequences together into scaffolds in draft genome assemblies. Existing software for this purpose perform the scaffoldi...

Multiplex single-molecule kinetics of nanopore-coupled polymerases

DNA polymerases have revolutionized the biotechnology field due to their ability to precisely replicate stored genetic information. Screening variants of these enzymes for unique properties gives th...

MetaGenomic analysis of short and long reads

Identifying single organisms in environmental samples is one of the key tasks of metagenomics. During the last few years, third generation sequencing technologies have enabled researchers to sequenc...

yacrd and fpa: upstream tools for long-read genome assembly

Motivation Genome assembly is increasingly performed on long, uncorrected reads. Assembly quality may be degraded due to unfiltered chimeric reads; also, the storage of all read overlaps can take up...

Antibiotic resistance prediction for Mycobacterium tuberculosis from genome sequence data with Mykrobe

Two billion people are infected with Mycobacterium tuberculosis, leading to 10 million new cases of active tuberculosis and 1.5 million deaths annually. Universal access to drug susceptibility testi...

RabbitQC: high-speed scalable quality control for sequencing data

Motivation Modern sequencing technologies continue to revolutionize many areas of biology and medicine. Since the generated datasets are error-prone, downstream applications usually require quality...

HASLR: Fast Hybrid Assembly of Long Reads

Third-generation sequencing technologies from companies such as Oxford Nanopore and Pacific Biosciences have paved the way for building more contiguous and potentially gap-free assemblies. The large...

A benchmark of structural variation detection by long reads through a realistic simulated model

Despite the rapid evolution of new sequencing technologies, structural variation detection remains poorly ascertained. The high discrepancy between the results of structural variant analysis program...

Fast gap-affine pairwise alignment using the wavefront algorithm

Motivation Pairwise alignment of sequences is a fundamental method in modern molecular biology, implemented within multiple bioinformatics tools and libraries. Current advances in sequencing te...

nanoDoc: RNA modification detection using Nanopore raw reads with Deep One-Class Classification

Advances in Nanopore single-molecule direct RNA sequencing (DRS) have presented the possibility of detecting comprehensive post-transcriptional modifications (PTMs) as an alternative to experimental...

VIRUSBreakend: viral integration recognition using single breakends

Integration of viruses into infected host cell DNA can cause DNA damage and can disrupt genes. Recent cost reductions and the growth of whole genome sequencing have produced a wealth of data in which vi...

TranscriptomeReconstructoR: data-driven annotation of complex transcriptomes

Background The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing number of genome sequence information calls for experimental and co...

DAJIN-assisted multiplex genotyping to validate the outcomes of CRISPR-Cas-based genome editing

Genome editing induces various on-target mutations. Accurate identification of mutations in founder mice and cell clones is essential to perform reliable genome editing experiments. However, no geno...

Hapo-G, Haplotype-Aware Polishing Of Genome assemblies

Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (kilobases to megabases order) ...

Detecting and phasing minor single-nucleotide variants from long-read sequencing data

Cellular genetic heterogeneity is common in many biological conditions including cancer, microbiome, co-infection of multiple pathogens. Detecting and phasing minor variants, which is to determine w...

Swan: a library for the analysis and visualization of long-read transcriptomes

Motivation Long-read RNA-sequencing technologies such as PacBio and Oxford Nanopore have discovered an explosion of new transcript isoforms that are difficult to visually analyze using currently av...

Streamlining quantitative analysis of long RNA sequencing reads

Transcriptome analyses allow for linking RNA expression profiles to cellular pathways and phenotypes. Despite improvements in sequencing methodology, whole transcriptome analyses are still tedious, ...

Sensitive alignment using paralogous sequence variants improves long-read mapping and variant calling in segmental duplications

The ability to characterize repetitive regions of the human genome is limited by the read lengths of short-read sequencing technologies. Although long-read sequencing technologies such as Pacific Bi...

Systematic benchmarking of tools for CpG methylation detection from Nanopore sequencing

DNA methylation plays a fundamental role in the control of gene expression and genome integrity. Although there are multiple tools that enable its detection from Nanopore sequencing, their accuracy ...

A Python-based optimization framework for high-performance genomics

Exponentially-growing next-generation sequencing data requires high-performance tools and algorithms. Nevertheless, the implementation of high-performance computational genomics software is inaccess...

Genome ARTIST_v2 software – a support for annotation of class II natural transposons in new sequenced genomes

Transposon annotation is a very dynamic field of genomics and various tools assigned to support this bioinformatics endeavor were reported. Genome ARTIST (GA) software was initially developed for ma...

DNAscent v2: detecting replication forks in nanopore sequencing data with deep learning

The detection of base analogues in Oxford Nanopore Technologies (ONT) sequencing reads has become a promising new method for the high-throughput measurement of DNA replication dynamics with single-m...

DR2S: an integrated algorithm providing reference-grade haplotype sequences from heterozygous samples

Background High resolution HLA genotyping of donors and recipients is a crucially important prerequisite for haematopoetic stem-cell transplantation and relies heavily on the quality and completenes...

Nucleotide-resolution bacterial pan-genomics with reference graphs

Bacterial genomes follow a U-shaped frequency distribution whereby most genomic loci are either rare (accessory) or common (core) - the alignable fraction of two genomes from a single species might ...

Towards inferring nanopore sequencing ionic currents from nucleotide chemical structures

The characteristic ionic currents of nucleotide kmers are commonly used in analyzing nanopore sequencing readouts. We present a graph convolutional network-based deep learning framework for predict...

lra: the Long Read Aligner for Sequences and Contigs

It is computationally challenging to detect variation by aligning long reads from single-molecule sequencing (SMS) instruments, or megabase-scale contigs from SMS assemblies. One approach to efficie...

A long read mapping method for highly repetitive reference sequences

About 5-10% of the human genome remains inaccessible for functional analysis due to the presence of repetitive sequences such as segmental duplications and tandem repeat arrays. To enable high-quali...

Nanopanel2 calls phased low-frequency variants in Nanopore panel sequencing data

Clinical decision making is increasingly guided by accurate and recurrent determination of presence and frequency of (somatic) variants and their haplotype through panel sequencing of disease-releva...

NanoGalaxy: Nanopore long-read sequencing data analysis in Galaxy

Background Long-read sequencing can be applied to generate very long contigs and even completely assembled genomes at relatively low cost and with minimal sample preparation. As a result, long-read...

BugSeq: a highly accurate cloud platform for long-read metagenomic analyses

As the use of nanopore sequencing for metagenomic analysis increases, tools capable of performing long-read taxonomic classification in a fast and accurate manner are needed. Existing tools were eit...

Metagenomics Strain Resolution on Assembly Graphs

We introduce a novel bioinformatics pipeline, STrain Resolution ON assembly Graphs (STRONG), which identifies strains de novo, when multiple metagenome samples from the same community are available....

Reference-free reconstruction and quantification of transcriptomes from nanopore long-read sequencing

Single-molecule long-read sequencing with Nanopore provides an unprecedented opportunity to measure transcriptomes from any sample. However, current analysis methods rely on the comparison with a re...

Ratatosk - Hybrid error correction of long reads enables accurate variant calling and assembly

Motivation Long Read Sequencing (LRS) technologies are becoming essential to complement Short Read Sequencing (SRS) technologies for routine whole genome sequencing. LRS platforms produce DNA fragme...

periscope: sub-genomic RNA identification in SARS-CoV-2 ARTIC network nanopore sequencing data

We have developed periscope, a tool for the detection and quantification of sub-genomic RNA in ARTIC network protocol generated Nanopore SARS-CoV-2 sequence data. We applied periscope to 1155 SARS-...

Gaussian Mixture Model-Based Unsupervised Nucleotide Modification Number Detection Using Nanopore Sequencing Readouts

Motivation Nucleotides modification status can be decoded from the Oxford Nanopore Technologies (ONT) nanopore sequencing ionic current signals. Although various algorithms have been developed for n...

Liftoff: an accurate gene annotation mapping tool

Improvements in DNA sequencing technology and computational methods have led to a substantial increase in the creation of high-quality genome assemblies of many species. To understand the biology of...

The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read tools

Application of Oxford Nanopore Technologies’ long-read sequencing platform to transcriptomic analysis is increasing in popularity. However, such analysis can be challenging due to small library size...

Detection of differential RNA modifications from direct RNA sequencing of human cell lines

Differences in RNA expression can provide insights into the molecular identity of a cell, pathways involved in human diseases, and variation in RNA levels across patients associated with clinical ph...

Using SPAdes de novo assembler

SPAdes—St. Petersburg genome Assembler—was originally developed for de novo assembly of genome sequencing data produced for cultivated microbial isolates and for single‐cell genomic DNA sequencing. ...

BoardION: real-time monitoring of Oxford Nanopore Technologies devices

One of the main advantages of the Oxford Nanopore Technology (ONT) is the possibility of sequencing in real time. However, the ONT sequencing interface is not sufficient to explore the quality of se...

CSA: A high-throughput chromosome-scale assembly pipeline for vertebrate genomes

Background Easy-to-use and fast bioinformatics pipelines for long-read assembly that go beyond the contig level to generate highly continuous chromosome-scale genomes from raw data remain scarce. R...

NanoSPC: a scalable, portable, cloud compatible viral nanopore metagenomic data processing pipeline

Metagenomic sequencing combined with Oxford Nanopore Technology has the potential to become a point-of-care test for infectious disease in public health and clinical settings, providing rapid diagn...

GALA: gap-free chromosome-scale assembly with long reads

High-quality genome assembly has wide applications in genetics and medical studies. However, it is still very challenging to achieve gap-free chromosome-scale assemblies using current workflows of l...

Porcupine: Rapid and robust tagging of physical objects using nanopore-orthogonal DNA strands

Porcupine lets end-users label physical objects with custom DNA tags, without requiring a lab to create or read tags, and offers rapid readout using nanopore sequencing. Molecular tagging is an app...

iGenomics: Comprehensive DNA sequence analysis on your Smartphone

Background: Following the miniaturization of integrated circuitry and other computer hardware over the past several decades, DNA sequencing is on a similar path. Leading this trend is the Oxford Nan...

Opportunities and challenges in long-read sequencing data analysis

Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characterist...

BOSS-RUNS: a flexible and practical dynamic read sampling framework for nanopore sequencing

Real-time selective sequencing of individual DNA fragments, or 'Read Until', allows the focusing of Oxford Nanopore Technology sequencing on pre-selected genomic regions. This can lead to large impr...

Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED

Conventional targeted sequencing methods eliminate many of the benefits of nanopore sequencing, such as the ability to accurately detect structural variants or epigenetic modifications. The ReadUnt...

Benchmarking of long-read assemblers for prokaryote whole genome sequencing

Background: Data sets from long-read sequencing platforms (Oxford Nanopore Technologies and Pacific Biosciences) allow for most prokaryote genomes to be completely assembled – one contig per chromoso...

Integrating Hi-C links with assembly graphs for chromosome-scale assembly

Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming...

HyPo: Super fast and accurate polisher for long read genome assemblies

Efforts towards making population-scale long read genome assemblies (especially human genomes) viable have intensified recently with the emergence of many fast assemblers. The reliance of these fast...

The string decomposition problem and its applications to centromere analysis and assembly

Motivation: Recent attempts to assemble extra-long tandem repeats (such as centromeres) faced the challenge of translating long error-prone reads from the nucleotide alphabet into the alphabet of rep...

Accurate and simultaneous identification of differential expression and splicing using hierarchical Bayesian analysis

The regulation of mRNA controls both overall gene expression as well as the distribution of mRNA isoforms encoded by the gene. Current algorithmic approaches focus on characterization of significant...

Molecular barcoding of native RNAs using nanopore sequencing and deep learning

Nanopore sequencing enables direct measurement of RNA molecules without conversion to cDNA, thus opening the gates to a new era for RNA biology. However, the lack of molecular barcoding of direct RN...

Wengan: Efficient and high quality hybrid de novo assembly of human genomes

The continuous improvement of long-read sequencing technologies along with the development of ad hoc algorithms has launched a new de novo assembly era that promises high-quality genomes. However, i...

SVJedi: Genotyping structural variations with long reads

Motivation: Studies on structural variants (SV) are expanding rapidly. As a result, and thanks to third generation sequencing technologies, the number of discovered SVs is increasing, especially in ...

MasterOfPores: A workflow for the analysis of Oxford Nanopore Direct RNA sequencing datasets

The direct RNA sequencing platform offered by Oxford Nanopore Technologies allows for direct measurement of RNA molecules without the need of conversion to complementary DNA, fragmentation or amplif...

Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing

Whole-genome sequencing using sequencing technologies such as Illumina enables the accurate detection of small-scale variants but provides limited information about haplotypes and variants in repeti...

Spectral Jaccard Similarity: a new approach to estimating pairwise sequence alignments

A key step in many genomic analysis pipelines is the identification of regions of similarity between pairs of DNA sequencing reads. This task, known as pairwise sequence alignment, is a heavy comput...

Completing circular bacterial genomes with assembly complexity by using a sampling strategy from a single MinION run with barcoding

The Oxford Nanopore MinION is an affordable and portable DNA sequencer that can produce very long reads (tens of kilobase pairs), which enable de novo bacterial genome assembly. Although many algori...

Bioinformatics of nanopore sequencing

Nanopore sequencing is one of the most exciting new technologies and is undergoing dynamic development. With its development, a growing number of analytical tools are becoming available for researche...

Assembly methods for nanopore-based metagenomic sequencing: a comparative study

Background: Metagenomic sequencing has led to the recovery of previously unexplored microbial genomes. In this sense, short-read sequencing platforms often result in highly fragmented metagenomes, ...

kASA: Taxonomic Analysis of Metagenomic Data on a Notebook

The taxonomic analysis of sequencing data has become important in many areas of life sciences. However, currently available software tools for that purpose either consume large amounts of RAM or yie...

Introduction to the Analysis of Environmental Sequences: Metagenomics with MEGAN

Metagenomics has become a part of the standard toolkit for scientists interested in studying microbes in the environment. Compared to 16S rDNA sequencing, which allows coarse taxonomic profiling of ...

Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs

Background: Mycobacterium tuberculosis resistance to anti-tuberculosis drugs is a major threat to global public health. Whole genome sequencing (WGS) is rapidly gaining traction as a diagnostic tool...

A rapid and accurate MinION-based workflow for tracking species biodiversity in the field

Genetic markers (DNA barcodes) are often used to support and confirm species identification. Barcode sequences can be generated in the field using portable systems based on the Oxford Nanopore Techn...

Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data

DNA base modifications, such as C5-methylcytosine (5mC) and N6-methyldeoxyadenosine (6mA), are important types of epigenetic regulations. Short-read bisulfite sequencing and long-read PacBio sequenc...

GraphClust2: annotation and discovery of structured RNAs with scalable and accessible integrative clustering

RNA plays essential roles in all known forms of life. Clustering RNA sequences with common sequence and structure is an essential step towards studying RNA function. With the advent of high-throughp...

Assembly of long, error-prone reads using repeat graphs

Accurate genome assembly is hampered by repetitive regions. Although long single molecule sequencing reads are better able to resolve genomic repeats than short-read data, most long-read assembly al...

Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads

Tandemly repeated DNA is highly mutable and causes at least 31 diseases, but it is hard to detect pathogenic repeat expansions genome-wide. Here, we report robust detection of human repeat expansion...

Decoding the epitranscriptional landscape from native RNA sequences

Traditional epitranscriptomics relies on capturing a single RNA modification by antibody or chemical treatment, combined with short-read sequencing to identify its transcriptomic location. This appr...

NanoSatellite: accurate characterization of expanded tandem repeat length and sequence through whole genome long-read sequencing on PromethION

Technological limitations have hindered the large-scale genetic investigation of tandem repeats in disease. We show that long-read sequencing with a single Oxford Nanopore Technologies PromethION f...

Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads

The Illumina DNA sequencing platform generates accurate but short reads, which can be used to produce accurate but fragmented genome assemblies. Pacific Biosciences and Oxford Nanopore Technologies ...

A framework and an algorithm to detect low-abundance DNA by a handy sequencer and a palm-sized computer

Motivation: Detection of DNA at low abundance with respect to the entire sample is an important problem in areas such as epidemiology and field research, as these samples are highly contaminated wit...

Harnessing the MinION: An example of how to establish long‐read sequencing in a laboratory using challenging plant tissue from Eucalyptus pauciflora

Long‐read sequencing technologies are transforming our ability to assemble highly complex genomes. Realizing their full potential is critically reliant on extracting high‐quality, high‐molecular‐wei...

NanoPack: visualizing and processing long read sequencing data

Summary: Here we describe NanoPack, a set of tools developed for visualization and processing of long-read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. Availability an...

npInv: accurate detection and genotyping of inversions using long read sub-alignment

Background: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic ...

Efficient data structures for mobile de novo genome assembly by third-generation sequencing

Mobile/portable (third-generation) sequencing technologies, including Oxford Nanopore’s MinION and SmidgION, are revolutionizing once again –after the advent of high-throughput sequencing– biomedica...

Fast and sensitive mapping of nanopore sequencing reads using GraphMap

Realizing the democratic promise of nanopore sequencing requires the development of new bioinformatics approaches to deal with its specific error characteristics. Here we present GraphMap, a mapping...

Picopore: A tool for reducing the size of Oxford Nanopore Technologies' datasets without losing information

A tool for reducing the size of Oxford Nanopore Technologies' datasets without losing information. Options: Lossless compression: reduces footprint without reducing the ability to use other nano...

poRe GUIs for parallel and real-time processing of MinION sequence data

Motivation: Oxford Nanopore's MinION device has matured rapidly and is now capable of producing over one million reads and several gigabases of sequence data per run. The nature of the MinION output ...

Hybrid assembly pipeline released (using Canu, racon and Pilon)

The long sequencing reads produced by Oxford Nanopore’s platforms enable the assembly of genomes with superior contiguity compared to those produced by second generation technologies. In some circum...

BWA and LAST have been tuned to work with nanopore reads

Burrows-Wheeler Aligner (BWA) for pairwise alignment between DNA sequences. BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome. It consists...

De novo sequencing and variant calling with nanopores using PoreSeq

The accuracy of sequencing single DNA molecules with nanopores is continually improving, but de novo genome sequencing and assembly using only nanopore data remain challenging. Here we describe Pore...

LAST

Martin Frith, Computational Biology Research Center in Tokyo. Release Date: 18-Sep-2015. LAST finds similar regions between sequences, and aligns them. It is designed for comparing large datasets to...

Mash: fast genome and metagenome distance estimation using MinHash

Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence co...

Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences

Motivation: Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore technologies (ONT) produce reads over 10kbp in length, which have enabled high-quality genome assembly at an af...

nanopolish – nanopore sequence analysis and genome assembly software

Jared Simpson, University of Toronto. Release Date: 04-Sept-2015. A nanopore consensus algorithm using a signal-level hidden Markov model.

npReader – real-time conversion and analysis of Nanopore reads

npReader (jsa.np.f5reader) is a program that extracts Oxford Nanopore sequencing data from FAST5 files, performs an initial analysis of the data and streams them to real-time analysis pipelines. The...

DeepNano: Deep recurrent neural networks for base calling in MinION nanopore reads

The MinION device by Oxford Nanopore produces very long reads (reads over 100 Kbp were reported); however, it suffers from a high sequencing error rate. We present an open-source DNA base caller based ...

Rapid antibiotic resistance predictions from genome sequence data for S. aureus and M. tuberculosis

Rapid and accurate detection of antibiotic resistance in pathogens is an urgent need, affecting both patient care and population-scale control. Microbial genome sequencing promises much, but many ba...

Evaluation of hybrid and non-hybrid methods for de novo assembly of nanopore reads

Motivation: Recent emergence of nanopore sequencing technology set a challenge for established assembly methods. In this work, we assessed how existing hybrid and non-hybrid de novo assembly methods ...

LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads

Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the ina...

Successful test launch for nanopore sequencing

Nanopore sequencing gets a boost with accurate error modelling and variant-calling tools for Oxford Nanopore Technology’s highly anticipated MinION platform.

nanoCORR – error correction tool for nanopore sequence data

Sara Goodwin and James Gurtowski at Cold Spring Harbor Laboratory have created error correction software for Oxford Nanopore data. Release Date: 24-Aug-2015

poRe: an R package for the visualization and analysis of nanopore sequencing data

Motivation: The Oxford Nanopore MinION device represents a unique sequencing technology. As a mobile sequencing device powered by the USB port of a laptop, the MinION has huge potential applications...

Poretools: a toolkit for analyzing nanopore sequence data

Motivation: Nanopore sequencing may be the next disruptive technology in genomics, due to its ability to detect single DNA molecules without prior amplification, lack of reliance on expensive optica...