Publications tagged "Data Analysis"

Blog: Metagenomic sequencing: which assembly method is best?

In this blog, Adriel Latorre-Perez shares his work on comparing methods for assembling metagenomes from nanopore sequencing data, and he provides recommendations on the best tools to use ...

Oxford Nanopore Technologies and Geneyx announce the first scalable software solution to advance the future clinical use of nanopore sequencing

The new platform is designed to enable the end-to-end analysis and clinical reporting of nanopore sequencing, empowering clinical labs, researchers and other users with an advanced, "one-...

Methylartist: Tools for visualising modified bases from nanopore sequence data

Methylartist is a consolidated suite of tools for processing, visualising, and analysing nanopore methylation data derived from modified basecalling methods. All detectable methylation types (e.g. 5...

Jasmine: population-scale structural variant comparison and analysis

The increasing availability of long-reads is revolutionizing studies of structural variants (SVs). However, because SVs vary across individuals and are discovered through imprecise read technologies...

Widespread occurrence of hybrid internal-terminal exons in human transcriptomes

Alternative RNA processing is a major mechanism for diversifying the human transcriptome. Messenger RNA isoform differences are predominantly driven by alternative first exons, cassette internal exo...

Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex seg...

NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks

Motivation: Oxford Nanopore sequencing producing long reads at low cost has made many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assembly affect the ac...

PlasLR enables adaptation of plasmid prediction for error-prone long reads

Plasmids are extra-chromosomal genetic elements commonly found in bacterial cells that support many functional aspects including environmental adaptations. The identification of these genetic elemen...

Theory of local k-mer selection with applications to long-read alignment

Motivation: Selecting a subset of k-mers in a string in a local manner is a common task in bioinformatics tools for speeding up computation. Arguably the most well-known and common method is the mini...

Hapo-G, haplotype-aware polishing of genome assemblies with accurate reads

Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (kilobases to megabases order)...

MinION 16S datasets of a commercially available microbial community enables the evaluation of DNA extractions and data analyses

New advances in sequencing technology and bioinformatics analysis tools have significantly supported the culture-independent analysis of complex microbial communities associated with environmental, ...

BugSeq 16S: NanoCLUST with improved consensus sequence classification

NanoCLUST has enabled species-level taxonomic classification from noisy nanopore 16S sequencing data for BugSeq’s users and the broader nanopore sequencing community. We noticed a high misclassifica...

MMMVI: Detecting SARS-CoV-2 Variants of Concern in Metagenomic Samples

Motivation: SARS-CoV-2 is the causative agent of the COVID-19 pandemic. Variants of Concern (VOCs) and Variants of Interest (VOIs) are lineages that represent a greater risk to public health, and can...

A reference-quality, fully annotated genome from a Puerto Rican individual

Until 2019, the human genome was available in only one fully-annotated version, which was the result of 18 years of continuous improvement and revision. Despite dramatic improvements in sequencing t...

A comparative analysis of computational tools for the prediction of epigenetic DNA methylation from long-read sequencing data

Recent development of Oxford Nanopore long-read sequencing has opened new avenues of identifying epigenetic DNA methylation. Among the different epigenetic DNA methylations, N6-methyladenosine is th...

Redefining the PTEN promoter: identification of two upstream transcription start regions

Germline mutation of PTEN is causally observed in Cowden syndrome (CS) and is one of the most common genetic causes of autism spectrum disorder (ASD). However, the majority of individuals who presen...

Principles of mRNA targeting and regulation via the Arabidopsis m6A-binding proteins ECT2 and ECT3

Gene regulation dependent on N6-methyladenosine (m6A) in mRNA involves RNA-binding proteins that recognize m6A through a YTH domain. The Arabidopsis YTH-domain protein ECT2 is thought to influence m...

Generation of a novel SARS-CoV-2 sub-genomic RNA due to the R203K/G204R variant in nucleocapsid

The adjacent amino acid polymorphisms in the nucleocapsid (R203K/G204R) of SARS-CoV-2 arose on the background of the spike D614G change and strains harboring these changes have become dominant circu...

Pervasive cis effects of variation in copy number of large tandem repeats on local DNA methylation and gene expression

Variable number tandem repeats (VNTRs) are composed of large tandemly repeated motifs, many of which are highly polymorphic in copy number. However, because of their large size and repetitive natur...

Comparison of long read sequencing technologies in interrogating bacteria and fly genomes

The newest generation of DNA sequencing technology is highlighted by the ability to generate sequence reads hundreds of kilobases in length. Pacific Biosciences (PacBio) and Oxford Nanopore Technolo...

ModPhred: an integrative toolkit for the analysis and storage of nanopore sequencing DNA and RNA modification data

DNA and RNA modifications can now be identified using Nanopore sequencing. However, we currently lack a flexible software to efficiently encode, store, analyze and visualize DNA and RNA modification...

Freely accessible ready to use global infrastructure for SARS-CoV-2 monitoring

The COVID-19 pandemic is the first global health crisis to occur in the age of big genomic data. Although data generation capacity is well established and sufficiently standardized, analytical capac...

Mitochondrial genome sequencing of marine leukemias reveals cancer contagion between clam species in the seas of Southern Europe

Clonally transmissible cancers are tumour lineages that are transmitted between individuals via the transfer of living cancer cells. In marine bivalves, leukemia-like transmissible cancers, called h...

Haplotype-aware variant calling enables high accuracy in nanopore long-reads using deep neural networks

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read based phasing. ...

Identification and quantification of SARS-CoV-2 leader subgenomic mRNA gene junctions

Introduction: SARS-CoV-2 has a complex strategy for the transcription of viral subgenomic mRNAs (sgmRNAs), which are targets for nucleic acid diagnostics. Each of these sgRNAs has a unique 5′ sequenc...

Epstein-Barr virus long non-coding RNA RPMS1 full-length spliceome in transformed epithelial tissue

Epstein-Barr virus is associated with two types of epithelial neoplasms, nasopharyngeal carcinoma and gastric adenocarcinoma. The viral long non-coding RNA RPMS1 is the most abundantly expressed pol...

How the replication and transcription complex of SARS-CoV-2 functions in leader-to-body fusion

Background: Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Although unprecedented efforts are underway to develop therapeutic strategie...

Rapid and detailed characterization of transgene insertion sites in genetically modified plants via Nanopore sequencing

Molecular characterization of genetically modified plants can provide crucial information for the development of detection and identification methods, to comply with traceability, and labeling requi...

The complete mitochondrial genome of Labeo catla (Hamilton, 1822) using long read sequencing

Labeo catla is a widely cultured species in monoculture and polyculture systems of the Indian subcontinent. In this study, the complete mitochondrial genome sequence of catla was reconstructed from ...

Automated strain separation in low-complexity metagenomes using long reads

High-throughput short-read metagenomics has enabled large-scale species-level analysis and functional characterization of microbial communities. Microbiomes often contain multiple strains of the sam...

Long-reads are revolutionizing 20 years of insect genome sequencing

The first insect genome (Drosophila melanogaster) was published two decades ago. Today, nuclear genome assemblies are available for a staggering 601 different insects representing 20 orders. Here, w...

A benchmarking of human mitochondrial DNA haplogroup classifiers from whole-genome and whole-exome sequence data

The mitochondrial genome (mtDNA) is of interest for a range of fields including evolutionary, forensic, and medical genetics. Human mitogenomes can be classified into evolutionary related haplogroup...

Context-independent function of a chromatin boundary in vivo

Mammalian genomes are partitioned into sub-megabase to megabase-sized units of preferential interactions called topologically associating domains or TADs, which are likely important for the proper i...

On the application of BERT models for nanopore methylation detection

Motivation: DNA methylation is a common epigenetic modification, which is widely associated with various biological processes, such as gene expression, aging, and disease. Nanopore sequencing provide...

LoopViz: A uLoop assembly clone verification tool for nanopore sequencing reads

Cloning has been an integral part of most laboratory research questions and continues to be an essential tool in defining the genetic elements determining life. Cloning can be difficult and time con...

MicroPIPE: An end-to-end solution for high-quality complete bacterial genome construction

Oxford Nanopore Technology (ONT) long-read sequencing has become a popular platform for microbial researchers; however, easy and automated construction of high-quality bacterial genomes remains chal...

GraphUnzip: unzipping assembly graphs with long reads and Hi-C

Long reads and Hi-C have revolutionized the field of genome assembly as they have made highly continuous assemblies accessible for challenging genomes. As haploid chromosome-level assemblies are now...

Haploflow: Strain-resolved de novo assembly of viral genomes

In viral infections often multiple related viral strains are present, due to coinfection or within-host evolution. We describe Haploflow, a de Bruijn graph-based assembler for de novo genome assembl...

Hybrid clustering of long and short-read for improved metagenome assembly

Next-generation sequencing has enabled metagenomics, the study of the genomes of microorganisms sampled directly from the environment without cultivation. We previously developed a proof-of-concept,...

Unique K-mer sequences for validating cancer-related substitution, insertion and deletion mutations

Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that m...

GPU accelerated adaptive banded event alignment for rapid comparative nanopore signal analysis

Background: Nanopore sequencing enables portable, real-time sequencing applications, including point-of-care diagnostics and in-the-field genotyping. Achieving these outcomes requires efficient bioi...

Barapost: binning of nucleotide sequences according to taxonomic annotation

Contemporary sequencing technologies, Oxford Nanopore in particular, provide a way to sequence multiple samples during a single run using molecular barcodes. Specific circumstances, however, can make ...

HapSolo: An optimization approach for removing secondary haplotigs during diploid genome assembly and scaffolding

Background: Despite marked recent improvements in long-read sequencing technology, the assembly of diploid genomes remains a difficult task. A major obstacle is distinguishing between alternative con...

Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data

Background: Compared with second-generation sequencing technologies, third-generation single-molecule RNA sequencing has unprecedented advantages; the long reads it generates facilitate isoform-leve...

LIQA: Long-read Isoform Quantification and Analysis

Long-read RNA sequencing (RNA-seq) technologies have made it possible to sequence full-length transcripts, facilitating the exploration of isoform-specific gene expression over conventional short-rea...

TGS-GapCloser: a fast and accurate gap closer for large genomes with low coverage of error-prone long reads

Background: Analyses that use genome assemblies are critically affected by the contiguity, completeness, and accuracy of those assemblies. In recent years single-molecule sequencing techniques gener...

Accurate spliced alignment of long RNA sequencing reads

Long-read RNA sequencing techniques are quickly establishing themselves as the primary sequencing technique to study the transcriptome landscape. Many such analyses are dependent upon splice alignme...

Improvements in the sequencing and assembly of plant genomes

Background: Advances in DNA sequencing have reduced the difficulty of sequencing and assembling plant genomes. A range of methods for long read sequencing and assembly have been recently compared and...

Freddie: annotation-independent detection and discovery of transcriptomic alternative splicing isoforms

Alternative splicing (AS) is an important mechanism in the development of many cancers, as novel or aberrant AS patterns play an important role as an independent onco-driver. In addition, cancer-spe...

The effect of hybridization on transposable element accumulation in an undomesticated fungal species

Transposable elements (TEs) are mobile genetic elements that can profoundly impact the evolution of genomes and species. A long-standing hypothesis suggests that the merging of diverged genomes with...

AsmMix: A pipeline for high quality diploid de novo assembly

In this paper, we report a pipeline, AsmMix, which is capable of producing both contiguous and high-quality diploid genomes. The pipeline consists of two steps. In the first step, two sets of assemb...

Machine Boss: rapid prototyping of bioinformatic automata

Motivation: Many C++ libraries for using Hidden Markov Models in bioinformatics focus on inference tasks, such as likelihood calculation, parameter-fitting, and alignment. However, construction of th...

SARS-CoV-2 RECoVERY: a multi-platform open-source bioinformatic pipeline for the automatic construction and analysis of SARS-CoV-2 genomes from NGS sequencing data

Background: Since its first appearance in December 2019, the novel Severe Acute Respiratory Syndrome Coronavirus type 2 (SARS-CoV-2), spread worldwide causing an increasing number of cases and deaths...

NanoMethViz: an R/Bioconductor package for visualizing long-read methylation data

Motivation: A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. Tools for effective visualization of data generated by thi...

S-IRFindeR: stable and accurate measurement of intron retention

Accurate quantification of intron retention levels is currently the crux for detecting and interpreting the function of retained introns. Using both simulated and real RNA-seq datasets, we show that...

Rapid screening and detection of inter-type viral recombinants using phylo-k-mers

Motivation: Novel recombinant viruses may have important medical and evolutionary significance, as they sometimes display new traits not present in the parental strains. This is particularly concerni...

Merqury: reference-free quality and phasing assessment for genome assemblies

Recent long-read assemblies often exceed the quality of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation based...

Read trimming has minimal effect on bacterial SNP calling accuracy

Read alignment is the central step of many analytic pipelines that perform SNP calling. To reduce error, it is common practice to pre-process raw sequencing reads to remove low-quality bases and res...

Highly accurate barcode and UMI error correction using dual nucleotide dimer blocks allows direct single-cell nanopore transcriptome sequencing

Droplet-based single-cell sequencing techniques have provided unprecedented insight into cellular heterogeneities within tissues. However, these approaches only allow for the measurement of the dist...

stLFRsv: a germline SV analysis pipeline using co-barcoded reads

Co-barcoded reads originating from long DNA fragments (mean length larger than 50 kbp) with barcodes maintain both single-base-level accuracy and long-range genomic information. We propose a pipeline ...

DeeReCT-APA: prediction of alternative polyadenylation site usage through deep learning

Alternative polyadenylation (APA) is a crucial step in post-transcriptional regulation. Previous bioinformatic works have mainly focused on the recognition of polyadenylation sites (PAS) in a given ...

Minimum error correction-based haplotype assembly: considerations for long read data

The single nucleotide polymorphism (SNP) is the most widely studied type of genetic variation. A haplotype is defined as the sequence of alleles at SNP sites on each haploid chromosome. Haplotype in...

Tirant stealthily invaded natural Drosophila melanogaster populations during the last century

It was long thought that solely three different transposable elements - the I-element, the P-element and hobo - invaded natural D. melanogaster populations within the last century. By sequencing the...

MINTyper: A method for generating phylogenetic distance matrices with long read sequencing data

In this paper we present a complete pipeline for generating a phylogenetic distance matrix from a set of sequencing reads. Importantly, the program is able to handle a mix of both short reads from t...

Two-pass alignment using machine-learning-filtered splice junctions increases the accuracy of intron detection in long-read RNA sequencing

Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long-reads reveals the true complexity of processing, however the relatively...

Reads2Resistome: An adaptable and high-throughput whole-genome sequencing pipeline for bacterial resistome characterization

Summary: The bacterial resistome is the collection of all the antibiotic resistance genes, virulence genes, and other resistance elements within a bacterial isolate genome including plasmids and bact...

Efficiently processing amplicon sequencing data for microbial ecology with dadasnake, a DADA2 implementation in Snakemake

Background: Amplicon sequencing of phylogenetic marker genes, e.g. 16S, 18S or ITS rRNA sequences, is still the most commonly used method to estimate the structure of microbial communities. Microbial...

NanoCLUST: a species-level analysis of 16S rRNA nanopore sequencing data

Summary: NanoCLUST is an analysis pipeline for classification of amplicon-based full-length 16S rRNA nanopore reads. It is characterized by an unsupervised read clustering step, based on Uniform Mani...

abPOA: an SIMD-based C library for fast partial order alignment using adaptive band

Summary: Partial order alignment, which aligns a sequence to a directed acyclic graph, is now frequently used as a key component in long-read error correction and assembly. We present abPOA (adaptive...

AnVIL: An overlap-aware genome assembly scaffolder for linked reads

10X Genomics Chromium linked reads contain information that can be used to link sequences together into scaffolds in draft genome assemblies. Existing software for this purpose perform the scaffoldi...

Overlap detection on long, error-prone sequencing reads via smooth q-gram

Motivation: Third generation sequencing techniques, such as the Single Molecule Real Time technique from PacBio and the MinION technique from Oxford Nanopore, can generate long, error-prone sequenci...

Genomic signals of admixture and reinforcement between two closely related species of European sepsid flies

Interspecific gene flow by hybridization may weaken species barriers and adaptive divergence, but also initiate reinforcement of reproductive isolation through natural and sexual selection. The exten...

Benchmarking Oxford Nanopore read assemblers for high-quality molluscan genomes

Choosing the optimum assembly approach is essential to achieving a high-quality genome assembly suitable for comparative and evolutionary genomic investigations. Significant recent progress in long-...

Chromosome-scale genome assembly provides insights into speciation of allotetraploid and massive biomass accumulation of elephant grass (Pennisetum purpureum Schum.)

Elephant grass (Pennisetum purpureum Schum., A’A’BB, 2n=4x=28), which is characterized by robust growth and high biomass and is widely distributed in tropical and subtropical areas globally, is an imp...

yacrd and fpa: upstream tools for long-read genome assembly

Motivation: Genome assembly is increasingly performed on long, uncorrected reads. Assembly quality may be degraded due to unfiltered chimeric reads; also, the storage of all read overlaps can take up...

Antibiotic resistance prediction for Mycobacterium tuberculosis from genome sequence data with Mykrobe

Two billion people are infected with Mycobacterium tuberculosis, leading to 10 million new cases of active tuberculosis and 1.5 million deaths annually. Universal access to drug susceptibility testi...

RabbitQC: high-speed scalable quality control for sequencing data

Motivation: Modern sequencing technologies continue to revolutionize many areas of biology and medicine. Since the generated datasets are error-prone, downstream applications usually require quality...

HASLR: Fast Hybrid Assembly of Long Reads

Third-generation sequencing technologies from companies such as Oxford Nanopore and Pacific Biosciences have paved the way for building more contiguous and potentially gap-free assemblies. The large...

A benchmark of structural variation detection by long reads through a realistic simulated model

Despite the rapid evolution of new sequencing technologies, structural variation detection remains poorly ascertained. The high discrepancy between the results of structural variant analysis program...

nanoDoc: RNA modification detection using Nanopore raw reads with Deep One-Class Classification

Advances in Nanopore single-molecule direct RNA sequencing (DRS) have presented the possibility of detecting comprehensive post-transcriptional modifications (PTMs) as an alternative to experimental...

VIRUSBreakend: viral integration recognition using single breakends

Integration of viruses into infected host cell DNA can cause DNA damage and can disrupt genes. Recent cost reductions and growth of whole genome sequencing have produced a wealth of data in which vi...

TrancriptomeReconstructoR: data-driven annotation of complex transcriptomes

Background: The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing amount of genome sequence information calls for experimental and co...

Manual annotation of genes within Drosophila species: the Genomics Education Partnership protocol

Annotating the genomes of multiple organisms allows us to study their genes as well as the evolution of those genes. While many eukaryotic genome assemblies already include computational gene predic...

Swan: a library for the analysis and visualization of long-read transcriptomes

Motivation: Long-read RNA-sequencing technologies such as PacBio and Oxford Nanopore have discovered an explosion of new transcript isoforms that are difficult to visually analyze using currently av...

Streamlining quantitative analysis of long RNA sequencing reads

Transcriptome analyses allow for linking RNA expression profiles to cellular pathways and phenotypes. Despite improvements in sequencing methodology, whole transcriptome analyses are still tedious, ...

Sensitive alignment using paralogous sequence variants improves long-read mapping and variant calling in segmental duplications

The ability to characterize repetitive regions of the human genome is limited by the read lengths of short-read sequencing technologies. Although long-read sequencing technologies such as Pacific Bi...

The Ensembl COVID-19 resource: Ongoing integration of public SARS-CoV-2 data

The Ensembl COVID-19 browser (covid-19.ensembl.org) was launched in May 2020 in response to the ongoing pandemic. It is Ensembl’s contribution to the global efforts to develop treatments, diagnostic...

The emergence of inter-clade hybrid SARS-CoV-2 lineages revealed by 2D nucleotide variation mapping

I performed whole-genome sequencing on SARS-CoV-2 collected from COVID-19 samples at Mayo Clinic Rochester in mid-April, 2020, generated 85 consensus genome sequences and compared them to other geno...

Systematic benchmarking of tools for CpG methylation detection from Nanopore sequencing

DNA methylation plays a fundamental role in the control of gene expression and genome integrity. Although there are multiple tools that enable its detection from Nanopore sequencing, their accuracy ...

Strain-level sample characterisation using long reads and MAPQ scores

A simple but effective method for strain-level characterisation of microbial samples using long read data is presented. The method, which relies on having a non-redundant database of reference genom...

Comparative genomic analysis of Mycobacterium tuberculosis reveals evolution and genomic instability within Uganda I sub-lineage

Introduction: Tuberculosis (TB) is the leading cause of morbidity and mortality globally, responsible for an estimated annual 10.0 million new cases and 1.3 million deaths among infectious diseases w...

A Python-based optimization framework for high-performance genomics

Exponentially-growing next-generation sequencing data requires high-performance tools and algorithms. Nevertheless, the implementation of high-performance computational genomics software is inaccess...

Genome ARTIST_v2 software – a support for annotation of class II natural transposons in new sequenced genomes

Transposon annotation is a very dynamic field of genomics and various tools assigned to support this bioinformatics endeavor were reported. Genome ARTIST (GA) software was initially developed for ma...

DNAscent v2: detecting replication forks in nanopore sequencing data with deep learning

The detection of base analogues in Oxford Nanopore Technologies (ONT) sequencing reads has become a promising new method for the high-throughput measurement of DNA replication dynamics with single-m...

Practical probabilistic and graphical formulations of long-read polyploid haplotype phasing

Resolving haplotypes in polyploid genomes using phase information from sequencing reads is an important and challenging problem. We introduce two new mathematical formulations of polyploid haplotype...

DR2S: an integrated algorithm providing reference-grade haplotype sequences from heterozygous samples

Background: High resolution HLA genotyping of donors and recipients is a crucially important prerequisite for haematopoetic stem-cell transplantation and relies heavily on the quality and completenes...

Nucleotide-resolution bacterial pan-genomics with reference graphs

Bacterial genomes follow a U-shaped frequency distribution whereby most genomic loci are either rare (accessory) or common (core) - the alignable fraction of two genomes from a single species might ...

precisionFDA Truth Challenge V2: Calling variants from short- and long-reads in difficult-to-map regions

The precisionFDA Truth Challenge V2 aimed to assess the state-of-the-art of variant calling in difficult-to-map regions and the Major Histocompatibility Complex (MHC). Starting with FASTQ files, 20...

Introducing the new MinKNOW App

Today marks the release of the MinKNOW App for iOS and Android devices, now available to download from the Apple Store and Google Play.

lra: the Long Read Aligner for Sequences and Contigs

It is computationally challenging to detect variation by aligning long reads from single-molecule sequencing (SMS) instruments, or megabase-scale contigs from SMS assemblies. One approach to efficie...

A long read mapping method for highly repetitive reference sequences

About 5-10% of the human genome remains inaccessible for functional analysis due to the presence of repetitive sequences such as segmental duplications and tandem repeat arrays. To enable high-quali...

Nanopanel2 calls phased low-frequency variants in Nanopore panel sequencing data

Clinical decision making is increasingly guided by accurate and recurrent determination of presence and frequency of (somatic) variants and their haplotype through panel sequencing of disease-releva...

Benchmarking small variant detection with ONT reveals high performance in challenging regions

Background: The development of long read sequencing (LRS) has led to greater access to the human genome. LRS produces long read lengths at the cost of high error rates and has been shown to be more useful...

Single cell transcriptome sequencing on the Nanopore platform with ScNapBar

The current ecosystem of single cell RNA-seq platforms is rapidly expanding, but robust solutions for single cell and single molecule full-length RNA sequencing are virtually absent. A high-through...

MAJORA: Continuous integration supporting decentralised sequencing for SARS-CoV-2 genomic surveillance

Genomic epidemiology has become an increasingly common tool for epidemic response. Recent technological advances have made it possible to sequence genomes rapidly enough to inform outbreak response,...

BugSeq: a highly accurate cloud platform for long-read metagenomic analyses

As the use of nanopore sequencing for metagenomic analysis increases, tools capable of performing long-read taxonomic classification in a fast and accurate manner are needed. Existing tools were eit...

Reference-free reconstruction and quantification of transcriptomes from nanopore long-read sequencing

Single-molecule long-read sequencing with Nanopore provides an unprecedented opportunity to measure transcriptomes from any sample. However, current analysis methods rely on the comparison with a re...

Ratatosk - Hybrid error correction of long reads enables accurate variant calling and assembly

Motivation: Long Read Sequencing (LRS) technologies are becoming essential to complement Short Read Sequencing (SRS) technologies for routine whole genome sequencing. LRS platforms produce DNA fragme...

periscope: sub-genomic RNA identification in SARS-CoV-2 ARTIC network nanopore sequencing data

We have developed periscope, a tool for the detection and quantification of sub-genomic RNA in ARTIC network protocol generated Nanopore SARS-CoV-2 sequence data. We applied periscope to 1155 SARS-...

Gaussian Mixture Model-Based Unsupervised Nucleotide Modification Number Detection Using Nanopore Sequencing Readouts

Motivation Nucleotides modification status can be decoded from the Oxford Nanopore Technologies (ONT) nanopore sequencing ionic current signals. Although various algorithms have been developed for n...

Liftoff: an accurate gene annotation mapping tool

Improvements in DNA sequencing technology and computational methods have led to a substantial increase in the creation of high-quality genome assemblies of many species. To understand the biology of...

The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read tools

Application of Oxford Nanopore Technologies’ long-read sequencing platform to transcriptomic analysis is increasing in popularity. However, such analysis can be challenging due to small library size...

Detection of differential RNA modifications from direct RNA sequencing of human cell lines

Differences in RNA expression can provide insights into the molecular identity of a cell, pathways involved in human diseases, and variation in RNA levels across patients associated with clinical ph...

Using SPAdes de novo assembler

SPAdes—St. Petersburg genome Assembler—was originally developed for de novo assembly of genome sequencing data produced for cultivated microbial isolates and for single‐cell genomic DNA sequencing. ...

BoardION: real-time monitoring of Oxford Nanopore Technologies devices

One of the main advantages of the Oxford Nanopore Technology (ONT) is the possibility of sequencing in real time. However, the ONT sequencing interface is not sufficient to explore the quality of se...

Stability of SARS-CoV-2 Phylogenies

The SARS-CoV-2 pandemic has led to unprecedented, nearly real-time genetic tracing due to the rapid community sequencing response. Researchers immediately leveraged these data to infer the evolution...

CSA: A high-throughput chromosome-scale assembly pipeline for vertebrate genomes

Background Easy-to-use and fast bioinformatics pipelines for long-read assembly that go beyond the contig level to generate highly continuous chromosome-scale genomes from raw data remain scarce. R...

NanoSPC: a scalable, portable, cloud compatible viral nanopore metagenomic data processing pipeline

Metagenomic sequencing combined with Oxford Nanopore Technology has the potential to become a point-of-care test for infectious disease in public health and clinical settings, providing rapid diagn...

GALA: gap-free chromosome-scale assembly with long reads

High-quality genome assembly has wide applications in genetics and medical studies. However, it is still very challenging to achieve gap-free chromosome-scale assemblies using current workflows of l...

A computational toolset for rapid identification of SARS-CoV-2, other viruses, and microorganisms from sequencing data

In this paper, we present a toolset and related resources for rapid identification of viruses and microorganisms from short-read or long-read sequencing data. We present fastv as an ultra-fast tool...

Characterization of SARS-CoV-2 viral diversity within and across hosts

In light of the current COVID-19 pandemic, there is an urgent need to accurately infer the evolutionary and transmission history of the virus to inform real-time outbreak management, public health p...

Quality control of low-frequency variants in SARS-CoV-2 genomes

During the current outbreak of COVID-19, research labs around the globe submit sequences of the local SARS-CoV-2 genomes to the GISAID database to provide a comprehensive analysis of the variability...

Three adjacent nucleotide changes spanning two residues in SARS-CoV-2 nucleoprotein: possible homologous recombination from the transcription-regulating sequence

The COVID-19 pandemic is caused by the single-stranded RNA virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a virus of zoonotic origin that was first detected in Wuhan, China in D...

Portable nanopore analytics: Are we there yet?

Motivation Oxford Nanopore Technologies (ONT) add miniaturization and real-time to high-throughput sequencing. All available software for ONT data analytics run on cloud/clusters or personal compute...

Accurate detection of single nucleotide polymorphisms using nanopore sequencing

Nanopore sequencing is a powerful single molecule DNA sequencing technology which provides a high throughput and long sequence reads. Nevertheless, its relatively high native error rate limits the d...

F5N: Nanopore Sequence Analysis Toolkit for Android Smartphones

F5N is the first ever Android application for nanopore sequence analysis on a mobile phone, comprised of popular tools for read alignment (Minimap2), sequence data manipulation (Samtools) and methyl...

Benchmarking the MinION: Evaluating long reads for microbial profiling

Nanopore based DNA-sequencing delivers long reads, thereby simplifying the decipherment of bacterial communities. Since its commercial appearance, this technology has been assigned several attribute...

Discovering multiple types of DNA methylation from bacteria and microbiome using nanopore sequencing

Nanopore sequencing provides a great opportunity for direct detection of chemical DNA modification. However, existing computational methods were either trained for detecting a specific form of DNA m...

iGenomics: Comprehensive DNA sequence analysis on your Smartphone

Background Following the miniaturization of integrated circuitry and other computer hardware over the past several decades, DNA sequencing is on a similar path. Leading this trend is the Oxford Nan...

Computational methods for 16S metabarcoding studies using Nanopore sequencing data

Assessment of bacterial diversity through sequencing of 16S ribosomal RNA (16S rRNA) genes has been an approach widely used in environmental microbiology, particularly since the advent of high-throu...

Complete, closed bacterial genomes from microbiomes using nanopore sequencing

Microbial genomes can be assembled from short-read sequencing data, but the assembly contiguity of these metagenome-assembled genomes is constrained by repeat elements. Correct assignment of genomic...

Economic Genome Assembly from Low Coverage Illumina and Nanopore Data

We describe a new approach to assemble genomes from a combination of low-coverage short and long reads. LazyBastard starts from a bipartite overlap graph between long reads and restrictively filtere...

Metagenomics workflow for hybrid assembly, differential coverage binning, transcriptomics and pathway analysis (MUFFIN)

Metagenomics has redefined many areas of microbiology. However, metagenome-assembled genomes (MAGs) are often fragmented, primarily when sequencing was performed with short reads. Recent long-read s...

Opportunities and challenges in long-read sequencing data analysis

Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characterist...

High precision Neisseria gonorrhoeae variant and antimicrobial resistance calling from metagenomic Nanopore sequencing

The rise of antimicrobial resistant Neisseria gonorrhoeae is a significant public health concern. Against this background, rapid culture-independent diagnostics may allow targeted treatment and prev...

BOSS-RUNS: a flexible and practical dynamic read sampling framework for nanopore sequencing

Real-time selective sequencing of individual DNA fragments, or 'Read Until', allows the focusing of Oxford Nanopore Technology sequencing on pre-selected genomic regions. This can lead to large impr...

Readfish enables targeted nanopore sequencing of gigabase-sized genomes

Nanopore sequencers can be used to selectively sequence certain DNA molecules in a pool by reversing the voltage across individual nanopores to reject specific sequences, enabling enrichment and de...

Benchmarking of long-read assemblers for prokaryote whole genome sequencing

Background Data sets from long-read sequencing platforms (Oxford Nanopore Technologies and Pacific Biosciences) allow for most prokaryote genomes to be completely assembled – one contig per chromoso...

Integrating Hi-C links with assembly graphs for chromosome-scale assembly

Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming...

HyPo: Super fast and accurate polisher for long read genome assemblies

Efforts towards making population-scale long read genome assemblies (especially human genomes) viable have intensified recently with the emergence of many fast assemblers. The reliance of these fast...

The string decomposition problem and its applications to centromere analysis and assembly

Motivation Recent attempts to assemble extra-long tandem repeats (such as centromeres) faced the challenge of translating long error-prone reads from the nucleotide alphabet into the alphabet of rep...

Accurate and simultaneous identification of differential expression and splicing using hierarchical Bayesian analysis

The regulation of mRNA controls both overall gene expression as well as the distribution of mRNA isoforms encoded by the gene. Current algorithmic approaches focus on characterization of significant...

Critical assessment of bioinformatics methods for the characterization of pathological repeat expansions with single-molecule sequencing data

A number of studies have reported the successful application of single-molecule sequencing technologies to the determination of the size and sequence of pathological expanded microsatellite repeats ...

Wengan: Efficient and high quality hybrid de novo assembly of human genomes

The continuous improvement of long-read sequencing technologies along with the development of ad hoc algorithms has launched a new de novo assembly era that promises high-quality genomes. However, i...

RNA modifications detection by comparative Nanopore direct RNA sequencing

RNA molecules undergo a vast array of chemical post-transcriptional modifications (PTMs) that can affect their structure and interaction properties. To date, over 150 naturally occurring P...

SVJedi: Genotyping structural variations with long reads

Motivation Studies on structural variants (SV) are expanding rapidly. As a result, and thanks to third generation sequencing technologies, the number of discovered SVs is increasing, especially in ...

MasterOfPores: A workflow for the analysis of Oxford Nanopore Direct RNA sequencing datasets

The direct RNA sequencing platform offered by Oxford Nanopore Technologies allows for direct measurement of RNA molecules without the need of conversion to complementary DNA, fragmentation or amplif...

Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing

Whole-genome sequencing using sequencing technologies such as Illumina enables the accurate detection of small-scale variants but provides limited information about haplotypes and variants in repeti...

Spectral Jaccard Similarity: a new approach to estimating pairwise sequence alignments

A key step in many genomic analysis pipelines is the identification of regions of similarity between pairs of DNA sequencing reads. This task, known as pairwise sequence alignment, is a heavy comput...

Bioinformatics of nanopore sequencing

Nanopore sequencing is one of the most exciting new technologies, which undergoes dynamic development. With its development, a growing number of analytical tools are becoming available for researche...

kASA: Taxonomic Analysis of Metagenomic Data on a Notebook

The taxonomic analysis of sequencing data has become important in many areas of life sciences. However, currently available software tools for that purpose either consume large amounts of RAM or yie...

Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data

DNA base modifications, such as C5-methylcytosine (5mC) and N6-methyldeoxyadenosine (6mA), are important types of epigenetic regulations. Short-read bisulfite sequencing and long-read PacBio sequenc...

Semi-quantitative characterisation of mixed pollen samples using MinION sequencing and Reverse Metagenomics (RevMet)

Peel and colleagues describe their RevMet (Reverse Metagenomics) pipeline that enables reliable and semi-quantitative characterisation of mixed eukaryote samples, such as mixed pollen samples. This ...

Featherweight long read alignment using partitioned reference indexes

The advent of Nanopore sequencing has realised portable genomic research and applications. However, state of the art long read aligners and large reference genomes are not compatible with most mobil...

Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads

Tandemly repeated DNA is highly mutable and causes at least 31 diseases, but it is hard to detect pathogenic repeat expansions genome-wide. Here, we report robust detection of human repeat expansion...

SquiggleKit: A toolkit for manipulating nanopore signal data

The management of raw nanopore sequencing data poses a challenge that must be overcome to accelerate the development of new bioinformatics algorithms predicated on signal analysis. SquiggleKit is a ...

Performance of neural network basecalling tools for Oxford Nanopore sequencing

Background Basecalling, the computational process of translating raw electrical signal to nucleotide sequence, is of critical importance to the sequencing platforms produced by Oxford Nanopore Tech...

A framework and an algorithm to detect low-abundance DNA by a handy sequencer and a palm-sized computer

Motivation Detection of DNA at low abundance with respect to the entire sample is an important problem in areas such as epidemiology and field research, as these samples are highly contaminated wit...

Deepbinner: Demultiplexing barcoded Oxford Nanopore reads with deep convolutional neural networks

Multiplexing, the simultaneous sequencing of multiple barcoded DNA samples on a single flow cell, has made Oxford Nanopore sequencing cost-effective for small genomes. However, it depends on the abi...

On site DNA barcoding by nanopore sequencing

Note: the chemistry in this paper has since been superseded. Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of t...

Mapping DNA methylation with high-throughput nanopore sequencing

DNA chemical modifications regulate genomic function. We present a framework for mapping cytosine and adenosine methylation with the Oxford Nanopore Technologies MinION using this nanopore sequencer...

Efficient data structures for mobile de novo genome assembly by third-generation sequencing

Mobile/portable (third-generation) sequencing technologies, including Oxford Nanopore’s MinION and SmidgION, are revolutionizing once again, after the advent of high-throughput sequencing, biomedica...

Nanopore detection of bacterial DNA base modifications

The common bacterial base modification N6-methyladenine (m6A) is involved in many pathways related to an organism's ability to survive and interact with its environment. Recent research has shown th...

Fast and sensitive mapping of nanopore sequencing reads using GraphMap

Realizing the democratic promise of nanopore sequencing requires the development of new bioinformatics approaches to deal with its specific error characteristics. Here we present GraphMap, a mapping...

The use of Oxford Nanopore native barcoding for complete genome assembly

The Oxford Nanopore Technologies MinION™ is a mobile DNA sequencer that can produce long read sequences with a short turn-around time. Here we report the first demonstration of single contig genome...

Picopore: A tool for reducing the size of Oxford Nanopore Technologies' datasets without losing information.

A tool for reducing the size of Oxford Nanopore Technologies' datasets without losing information. Options: Lossless compression: reduces footprint without reducing the ability to use other nano...

poRe GUIs for parallel and real-time processing of MinION sequence data

Motivation Oxford Nanopore's MinION device has matured rapidly and is now capable of producing over one million reads and several gigabases of sequence data per run. The nature of the MinION output ...

Hybrid assembly pipeline released (using Canu, racon and Pilon)

The long sequencing reads produced by Oxford Nanopore’s platforms enable the assembly of genomes with superior contiguity compared to those produced by second generation technologies. In some circum...

Comparison of bacterial genome assembly software for MinION data and their applicability to medical microbiology

Translating the Oxford Nanopore MinION sequencing technology into medical microbiology requires on-going analysis that keeps pace with technological improvements to the instrument and release of ass...

Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches.

Background Improved DNA sequencing methods have transformed the field of genomics over the last decade. This has become possible due to the development of inexpensive short read sequencing technolo...

Quality Assessment Tools for Oxford Nanopore MinION data

IONiseR provides tools for the quality assessment of Oxford Nanopore MinION data. It extracts summary statistics from a set of fast5 files and can be used either before or after base calling. In add...

NanoOk – Flexible, multi-reference software for pre- and post-alignment analysis of nanopore sequencing data, quality and error profiles

The recent launch of the Oxford Nanopore Technologies MinION Access Program (MAP) resulted in the rapid development of a number of open source tools aimed at extracting reads and yield information f...

LAST

Martin Frith, Computational Biology Research Center in Tokyo. Release Date: 18-Sep-2015. LAST finds similar regions between sequences, and aligns them. It is designed for comparing large datasets to...

Mash: fast genome and metagenome distance estimation using MinHash

Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence co...

Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences

Motivation: Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore technologies (ONT) produce reads over 10kbp in length, which have enabled high-quality genome assembly at an af...

A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits

The enrichment of targeted regions within complex next generation sequencing libraries commonly uses biotinylated baits to capture the desired sequences. This method results in high read coverage ov...

Real-time selective sequencing on the MinION

The MinION replaces the conventional model of "sequence followed by analysis to final result" with instant access to data before the completion of a sequencing run. This instant access extends to th...

Clive Brown, CTO of Oxford Nanopore, talks at the London Calling conference

Clive is Chief Technology Officer at Oxford Nanopore. On the Executive team, he is responsible for all of the Company’s product-development activities. Clive leads the specification and design of th...

Running and Reading in Real Time: Looking at Squiggles on the MinION

Dr Matthew Loose, Head of Next Generation Sequencing Service, Nottingham University talks to the MinION community.

Leveraging MinHash for Rapid Identification of Nanopore Data on Mobile Hardware

Dr Brian Ondov, Bioinformatics Engineer at National Biodefense Analysis and Countermeasures Centre talks to the MinION Community about Leveraging MinHash.

Oxford Nanopore MinION Applications: Kits and Tools, Genetics and Metagenomics

The Applications team at Oxford Nanopore has two overarching responsibilities: creation and development of sample and library preparation protocols for a wide variety of sample types, and undertakin...

A year of happy MAPping

In this talk I will cover the highs and lows of being part of the Oxford Nanopore MinION Access Programme. Our laboratory joined the MAP programme in May 2014. Soon afterwards we published the first...

Real-time identification of pathogens and antibiotic resistance profile using Oxford Nanopore sequencing

Clinical pathogen sequencing has been demonstrated to have a positive outcome on treatment of patients with unknown bacterial infection. However, widespread adoption of clinical pathogen sequencing ...

minoTour – real time analysis tools for minIONs

Nanopore sequencing introduces true real-time sequencing for the first time. Full exploitation of real-time sequencing requires a novel approach to data analysis for which we have developed the mino...

Error correction, assembly and consensus algorithms for MinION data

In my talk, I will discuss my collaboration with Nick Loman’s lab to develop de novo assembly methods for MinION data. We have built a pipeline to error correct nanopore reads using partial order gr...

Democratizing DNA Fingerprinting

We report a rapid, inexpensive, and portable strategy to re-identify human DNA using the MinION, a miniature sequencing sensor by Oxford Nanopore Technologies. Our strategy requires only 10-30 minut...

Centrifuge: rapid and sensitive classification of metagenomic sequences

Centrifuge is a novel microbial classification engine that enables rapid, accurate and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing sch...

Use of Unamplified RNA/cDNA–Hybrid Nanopore Sequencing for Rapid Detection and Characterization of RNA Viruses

Nanopore sequencing, a novel genomics technology, has potential applications for routine biosurveillance, clinical diagnosis, and outbreak investigation of virus infections. Using rapid sequencing o...

DeepNano: Deep recurrent neural networks for base calling in MinION nanopore reads

The MinION device by Oxford Nanopore produces very long reads (reads over 100 Kbp were reported); however it suffers from high sequencing error rate. We present an open-source DNA base caller based ...

HPG pore: an efficient and scalable framework for nanopore sequencing data

The use of nanopore technologies is expected to spread in the future because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequen...

MinION Analysis and Reference Consortium: Phase 1 data release and analysis

The advent of a miniaturised DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme ...

LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads

Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the ina...

Community utility of nanopore data | Ewan Birney

Ewan Birney gives a talk at London Calling 2015 on community utility of nanopore data.

Successful test launch for nanopore sequencing

Nanopore sequencing gets a boost with accurate error modelling and variant-calling tools for Oxford Nanopore Technology’s highly anticipated MinION platform.

Improved data analysis for the MinION nanopore sequencer

Speed, single-base sensitivity and long read lengths make nanopores a promising technology for high-throughput sequencing. We evaluated and optimised the performance of the MinION nanopore sequencer...

marginAlign, marginCaller, marginStats – tools to align nanopore reads to a reference genome

The marginAlign package can be used to align reads to a reference genome and call single nucleotide variations (SNVs). It is specifically tailored for Oxford Nanopore reads. The package comes with t...

nanoCORR – error correction tool for nanopore sequence data

Sara Goodwin and James Gurtowski at Cold Spring Harbor Laboratory have created error correction software for Oxford Nanopore data. Release Date: 24-Aug-2015

A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer

Background The MinION™ is a new, portable single-molecule sequencer developed by Oxford Nanopore Technologies. It measures four inches in length and is powered from the USB 3.0 port of a laptop comp...

poRe: an R package for the visualization and analysis of nanopore sequencing data

Motivation: The Oxford Nanopore MinION device represents a unique sequencing technology. As a mobile sequencing device powered by the USB port of a laptop, the MinION has huge potential applications...