Short and long-read genome sequencing methodologies for somatic variant detection; genomic analysis of a patient with diffuse large B-cell lymphoma

Recent advances in throughput and accuracy mean that the Oxford Nanopore Technologies (ONT) PromethION platform is now a viable solution for genome sequencing. Much of the validation of bioinformatic tools for this long-read data has focused on calling germline variants (including structural variants). Somatic variants are outnumbered many-fold by germline variants, and their detection is further complicated by the effects of tumour purity and subclonality.

Here, we evaluate the extent to which Nanopore sequencing enables genome-wide detection and analysis of somatic variation. We do this by sequencing tumour and germline genomes for a patient with diffuse large B-cell lymphoma and comparing the results with 150 bp short-read sequencing of the same samples. Calling germline single nucleotide variants (SNVs) from the long-read data achieved good specificity and sensitivity. However, results of somatic SNV calling highlight the need for the development of specialized joint calling algorithms.

We find that the comparative performance of different tools varies significantly between structural variant types, and suggest that long reads are especially advantageous for calling large somatic deletions and duplications.

Finally, we highlight the utility of long reads for phasing clinically relevant variants, confirming that a somatic 1.6 Mb deletion and a p.(Arg249Met) mutation involving TP53 are oriented in trans.

Authors: Hannah E Roberts, Maria Lopopolo, Alistair T Pagnamenta, Eshita Sharma, Duncan Parkes, Lorne Lonie, Colin Freeman, Samantha J L Knight, Gerton Lunter, Helene Dreau, Helen Lockstone, Jenny C Taylor, Anna Schuh, Rory Bowden, David Buck