Precise characterization of somatic structural variations and mobile element insertions from paired long-read sequencing data with nanomonsv

We introduce nanomonsv, a novel software tool for detecting somatic structural variations (SVs) at single-base resolution from tumor and matched control long-read sequencing data. Using paired long-read sequencing data from three cancer cell lines and their matched lymphoblastoid lines, we demonstrate that our approach identifies not only somatic SVs that can be captured with short-read technologies but also novel ones, especially those whose breakpoints are located in repeat regions.

In addition, we have developed a workflow for classifying mobile element insertions and elucidating their detailed properties, such as 5′ truncations, internal inversions, and, in the case of LINE1 transductions, source sites.

Finally, by examining the co-occurrence of multiple somatic SVs in common supporting reads, we identify complex SVs probably caused by replication mechanisms or telomere crisis. In summary, our approaches, applied to cancer long-read sequencing data, can reveal various features of somatic SVs and will lead to further understanding of their mutational processes and functional consequences.
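As a rough illustration of what examining co-occurrence in common supporting reads could look like, the following Python sketch groups SVs that are supported by the same reads. It is not nanomonsv's implementation: the function name `cooccurring_svs`, the `min_shared_reads` threshold, and the input layout (a mapping from SV identifier to the set of supporting read identifiers) are all assumptions made for this example.

```python
from collections import defaultdict


def cooccurring_svs(sv_to_reads, min_shared_reads=2):
    """Group SVs that appear together on the same supporting reads.

    sv_to_reads: dict mapping a (hypothetical) SV id to a set of read ids.
    Returns a dict mapping each frozenset of co-occurring SV ids to the
    set of reads that support exactly that combination of SVs.
    """
    # Invert the mapping: for each read, collect every SV it supports.
    read_to_svs = defaultdict(set)
    for sv_id, reads in sv_to_reads.items():
        for read_id in reads:
            read_to_svs[read_id].add(sv_id)

    # Keep only reads that support two or more SVs, keyed by the SV combination.
    group_to_reads = defaultdict(set)
    for read_id, svs in read_to_svs.items():
        if len(svs) >= 2:
            group_to_reads[frozenset(svs)].add(read_id)

    # Require a minimum number of shared reads (an assumed, illustrative filter).
    return {
        group: reads
        for group, reads in group_to_reads.items()
        if len(reads) >= min_shared_reads
    }


# Toy input: sv1 and sv2 share two reads; sv3 stands alone.
example = {
    "sv1": {"readA", "readB", "readC"},
    "sv2": {"readA", "readB"},
    "sv3": {"readD"},
}
result = cooccurring_svs(example)
```

In this toy input, `sv1` and `sv2` are reported together because two reads carry both, while `sv3` is not grouped with anything. A real analysis would also need to account for breakpoint order and orientation along each read, which this sketch ignores.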

Authors: Yuichi Shiraishi, Junji Koya, Kenichi Chiba, Yuki Saito, Ai Okada, Keisuke Kataoka