NanoCaller for accurate detection of SNPs and small indels from long-read sequencing by deep neural networks


Background
Variant detection from high-throughput sequencing data remains an important, unresolved, yet often overlooked problem. Long-read sequencing technologies, such as Oxford Nanopore and PacBio sequencing, offer unique advantages for detecting SNPs and small indels in genomic regions that short-read sequencing cannot reliably examine (for example, only ~80% of genomic regions are marked as "high-confidence regions" with SNP/indel calls in the Genome in a Bottle project). However, existing software tools for short-read data perform poorly on long-read data; instead, several recent studies have shown promising results for variant detection on long-read data using deep learning.

Methods
Here we present NanoCaller, a computational method that integrates haplotype structure into a deep convolutional neural network to detect SNPs and indels from long-read sequencing data. NanoCaller uses long-range information to generate predictions for each candidate variant site by considering pileup information from other candidate sites that share reads with it. It then phases the reads and performs local realignment on each set of phased reads to call indels.
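
The idea of using pileup information from other candidate sites that share reads can be illustrated with a small sketch. This is not NanoCaller's implementation: the toy read data, the function names (`reads_covering`, `neighbors_sharing_reads`, `pileup_counts`) and the `min_shared` threshold are all hypothetical, chosen only to show how sites linked by common reads might be grouped before being passed to a model.

```python
from collections import Counter

# Hypothetical toy data: each read maps a candidate-site position to the base
# observed there. Real input would come from aligned long reads.
READS = {
    "r1": {100: "A", 250: "C", 900: "G"},
    "r2": {100: "A", 250: "C"},
    "r3": {100: "T", 250: "T", 900: "G"},
    "r4": {900: "A", 1500: "C"},
}


def reads_covering(reads, pos):
    """Return the set of read IDs that cover the given position."""
    return {rid for rid, bases in reads.items() if pos in bases}


def neighbors_sharing_reads(reads, target, candidates, min_shared=2):
    """List candidate sites (other than target) that share at least
    min_shared reads with the target site, in position order."""
    target_reads = reads_covering(reads, target)
    neighbors = []
    for pos in sorted(candidates):
        if pos == target:
            continue
        shared = target_reads & reads_covering(reads, pos)
        if len(shared) >= min_shared:
            neighbors.append(pos)
    return neighbors


def pileup_counts(reads, target, sites):
    """Count bases at each site, using only reads that also cover target."""
    target_reads = reads_covering(reads, target)
    return {
        pos: Counter(
            reads[rid][pos] for rid in target_reads if pos in reads[rid]
        )
        for pos in sites
    }


# Assumed threshold: a neighbor must share at least two reads with the target.
neighbors = neighbors_sharing_reads(READS, 100, {100, 250, 900, 1500})
features = pileup_counts(READS, 100, [100] + neighbors)
```

In this toy case, sites 250 and 900 are linked to site 100 through common reads, while site 1500 is not. Restricting the counts to reads that cover the target site is one simple way to keep the long-range signal tied to the same molecules.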

Results
We evaluate NanoCaller on multiple human genomes (NA12878/HG001, NA24385/HG002, NA24149/HG003, NA24143/HG004 and HX1) using cross-genome, cross-chromosome, cross-reference genome, and cross-platform benchmarking tests. Our results demonstrate that NanoCaller performs competitively against other long-read variant callers. In particular, NanoCaller can generate SNP/indel calls in complex genomic regions that other software tools exclude from variant calling.

Conclusions
In summary, NanoCaller enables the detection of genetic variants in genomic regions that were previously inaccessible to genome sequencing, and may facilitate the use of long-read sequencing to find disease variants in human genetic studies.

Authors: Umair Ahsan, Qian Liu, Kai Wang