Identification of genetic variation and transcriptional perturbations implicated in Parkinson’s disease

Anastasia (DZNE Tübingen, Germany) stated that, although genetic analyses of Parkinson’s disease (PD), such as genome-wide association studies (GWAS), have identified a range of genes involved in susceptibility, there remains what is termed “missing heritability”: the genetic basis of the disease has not been fully resolved. Structural variants (SVs) may underlie some of this missing heritability, and Anastasia’s team have proposed that SVs play a particular role in the pathogenesis of sporadic PD.

Identification of structural variants in Parkinson’s disease

Anastasia introduced the FOUNDIN-PD initiative, the source of the data used in her research. These data include those derived from DNA-based assays, RNA-based assays, and live-cell analysis of dopaminergic neuronal cell lines differentiated from 100 induced pluripotent stem (iPS) cell lines. The iPS lines were obtained from PD patients and healthy controls via the Parkinson’s Progression Markers Initiative.

Ten cell lines were sequenced with both long-read and short-read technologies, and these data were used in Anastasia’s study, which aimed to identify novel candidate SVs associated with sporadic PD. Her SV detection pipeline comprised PromethION sequencing of genomic DNA (gDNA) from the 10 cell lines (3 healthy control lines and 7 lines carrying known causal PD mutations); basecalling and filtering with Guppy v4.4.1; alignment to the GRCh38 reference genome with Winnowmap v2.01; SV calling with cuteSV via the Oxford Nanopore SV analysis pipeline v2.0.2; and filtering of the SV calls. Calls were then unified with SURVIVOR v1.0.6.

Summarising her results, Anastasia stated that 92,727 SVs were identified in total, including deletions, insertions, duplications, and inversions. On average, 40,000 SVs were identified per sample; most were insertions or deletions located in intergenic or intronic regions. Her team were particularly interested in variants present in the PD samples but absent from the controls.
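The selection of variants seen in PD samples but not in controls can be illustrated with a minimal sketch. It assumes a merged call set in which each SV carries a per-sample support string of 0s and 1s, similar to the SUPP_VEC field that SURVIVOR writes when merging. The sample order, the function name `is_pd_specific`, and the toy calls are hypothetical and are not taken from the talk.

```python
from typing import List


def is_pd_specific(supp_vec: str, is_control: List[bool]) -> bool:
    """Return True if the SV is supported by at least one PD sample
    and by none of the control samples."""
    if len(supp_vec) != len(is_control):
        raise ValueError("support vector and sample list differ in length")
    seen_in_pd = False
    for flag, control in zip(supp_vec, is_control):
        if flag == "1":
            if control:
                return False  # any control support disqualifies the SV
            seen_in_pd = True
    return seen_in_pd


# Assumed sample order: 3 control lines followed by 7 PD lines.
IS_CONTROL = [True] * 3 + [False] * 7

# Made-up merged calls: (SV id, per-sample support string).
toy_calls = [
    ("sv1", "0001100000"),  # supported by PD samples only
    ("sv2", "1001000000"),  # also supported by a control
    ("sv3", "0000000000"),  # no support at all
]

pd_specific = [sv_id for sv_id, vec in toy_calls if is_pd_specific(vec, IS_CONTROL)]
print(pd_specific)
```

With only three control lines, a filter of this kind cannot distinguish disease-relevant variants from rare benign ones, which is one reason further prioritisation is needed.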

Prioritisation of SVs

Moving on to prioritisation of the candidate SVs, Anastasia explained that, at the genomic level, this was based on intersection with PD GWAS variants. In total, 1,312 SVs were identified in 89 PD GWAS loci. According to the Variant Effect Predictor (VEP), four heterozygous insertions were predicted to have a high or moderate impact, affecting four genes.
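Intersecting SVs with GWAS loci comes down to an interval-overlap test. The sketch below shows one way this could be done, assuming half-open, 0-based coordinates and representing an insertion as a 1 bp interval at its insertion point. The `Interval` type, the function names, and all coordinates are illustrative assumptions, not the method or data presented in the talk.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """Two half-open intervals overlap if they share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def svs_in_loci(svs: List[Interval], loci: List[Interval]) -> List[Interval]:
    """Keep SVs that overlap at least one locus (simple O(n*m) scan)."""
    return [sv for sv in svs if any(overlaps(sv, locus) for locus in loci)]


# Made-up locus and SV coordinates for illustration only.
loci = [Interval("chr1", 1000, 2000, "locusA")]
svs = [
    Interval("chr1", 1500, 1501, "ins1"),  # insertion inside the locus
    Interval("chr1", 2000, 2500, "del1"),  # starts exactly where the locus ends
    Interval("chr2", 1500, 1600, "del2"),  # different chromosome
]

hits = svs_in_loci(svs, loci)
print([sv.name for sv in hits])
```

For tens of thousands of SVs and only 89 loci a linear scan like this is adequate; an interval tree or a dedicated tool would be a more natural choice for larger comparisons.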

Anastasia also investigated whether any SVs could be prioritised at the transcriptome level. For this, transcriptome data were obtained via short-read sequencing and PromethION sequencing of cDNA derived from the 10 cell lines, and the FLAIR pipeline was used to identify isoforms. SVs were prioritised based on their association with differential isoform usage and alternative splicing events. Interestingly, no differentially expressed transcripts were found from the genes that had been prioritised based on their intersection with PD GWAS variants. Overall, the transcriptome-based analyses resulted in the prioritisation of 25 SV candidates. Anastasia shared one example: an insertion in the promoter region of a ribosomal protein gene, which was found to be associated with differential transcript expression. A deletion had also been identified in an intron of another gene at this locus, LRSAM1. According to the Variant Effect Predictor, the insertion might lead to nonsense-mediated decay of the transcript.

Conclusions

Anastasia summarised that they had prioritised 29 SVs as potential candidates for pathogenesis in sporadic PD: four insertions, based on their association with PD GWAS hits and a high or moderate predicted impact on protein sequence; and 25 SVs within the 5 kb window of genes with differentially expressed transcripts. She plans to validate her findings at the protein level and to assess allele-specific expression of heterozygous SVs overlapping regulatory or coding regions. She also aims to assess cis-regulatory effects of SVs in a larger dataset.
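As a rough illustration of the 5 kb window criterion, the sketch below checks whether an SV falls within a gene body extended by a fixed flank. It assumes that the window extends 5 kb on either side of the gene and that the SV and gene lie on the same chromosome; the talk summary does not specify either point, and the function name and coordinates are hypothetical.

```python
WINDOW_BP = 5_000  # flank size; 5 kb as mentioned in the summary


def within_gene_window(sv_start: int, sv_end: int,
                       gene_start: int, gene_end: int,
                       window: int = WINDOW_BP) -> bool:
    """True if the SV overlaps the gene body extended by `window` bp
    on each side. Coordinates are half-open and on the same chromosome."""
    window_start = max(0, gene_start - window)
    window_end = gene_end + window
    return sv_start < window_end and sv_end > window_start


# Made-up gene spanning 100,000-120,000.
print(within_gene_window(96_000, 96_050, 100_000, 120_000))  # upstream flank
print(within_gene_window(90_000, 90_500, 100_000, 120_000))  # too far upstream
```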

Authors: Anastasia Illarionova