Structural variation in All of Us analyzed with long-read sequencing at scale

Novel long-read technologies enable scientists to analyze previously inaccessible regions of the human genome, including 193 medically relevant genes and regions such as centromeres and telomeres. Combined with nationwide initiatives such as the All of Us Research Program, these technologies can help identify meaningful variants in the human genome at population scale and link them to specific phenotypes and diseases. Long-read data require different analysis approaches, so we have developed the Sniffles structural variant caller to detect variants at the germline and population scale as well as at the somatic/mosaic level. Moreover, we are interested in the underlying causes of structural variations and their consequences, including insertional or deleterious mutations and chromosomal rearrangements, which have been observed to be associated with transposable elements. Our further goal is therefore to characterize transposable elements in the dataset and to investigate whether specific elements are associated with the observed structural variants, which may in turn be linked to alterations in medically relevant genes and other regions associated with numerous diseases.
