Martin Elferink: Multiplex CRISPR-Cas enrichment of clinically relevant genomic repeat structures
Martin Elferink from the University Medical Center in Utrecht spoke about using CRISPR-Cas9 to enrich for genomic repeat structures implicated in a number of diseases. Martin explained that over 40 neurological and neuromuscular diseases are caused by repeat expansions. The repeated sequences can range in size from approximately 2 to 60 bp and can propagate through replication errors, resulting in repeat structures up to kilobases in length. In addition, the number of repeats is often indicative of disease severity, so an accurate count matters, yet structures of this length are difficult to resolve using traditional or short read sequencing solutions.
Long reads which span the entire repeat expansion “have the potential to solve this problem” and could be used to accurately categorise the types and count the number of expansion events. Martin said that his overall aims were to use nanopore long read sequencing to generate reads spanning entire loci containing repeat expansions and, ideally, to determine epigenetic modifications.
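To illustrate why a read spanning the whole locus makes counting straightforward, here is a minimal sketch, not the method used in the talk. It assumes the repeat sits between two known flanking sequences and that the read matches them exactly; real nanopore reads contain errors and would need alignment-based or error-tolerant matching. The function name, flanks and read below are all hypothetical.

```python
import re


def count_repeat_units(read, left_flank, right_flank, unit):
    """Count repeat units in the longest uninterrupted run between two flanks.

    Returns None when the read does not contain both flanks, i.e. it does
    not span the whole repeat and cannot give a reliable count.
    Assumes exact matches (a simplification for illustration only).
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    region = read[start:end]
    # Find every uninterrupted run of the repeat unit inside the region.
    runs = re.findall(f"(?:{re.escape(unit)})+", region)
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(unit)


# Hypothetical flanks and a made-up read carrying 12 copies of "CAG".
LEFT, RIGHT = "ACGTTGCA", "TTGACCGT"
spanning_read = LEFT + "CAG" * 12 + RIGHT
truncated_read = spanning_read[:30]  # ends inside the repeat

print(count_repeat_units(spanning_read, LEFT, RIGHT, "CAG"))
print(count_repeat_units(truncated_read, LEFT, RIGHT, "CAG"))
```

The truncated read mimics a short read that ends inside the repeat: without the second flank there is no way to know how many units were missed, so the sketch returns None rather than an undercount.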
To do this, Martin described a targeted enrichment approach using CRISPR-Cas9. First, all native DNA ends are dephosphorylated, preventing ligation of sequencing adapters. Next, cleavage of the target sites is performed using an RNA guide and Cas9 protein. This cleavage leaves phosphorylated ends to which adapters can be preferentially ligated. The advantage of this approach is that multiple target sites can be excised in one step, providing a multiplex solution to targeted sequencing. As a result, cost-prohibitive whole genome sequencing (WGS) is not required, and more information can be obtained in a single workflow than by targeting a single locus. Furthermore, this protocol does not use PCR, so it is the actual native DNA, with preserved modifications, that goes through to sequencing.
Looking at 10 specific loci, Martin undertook a project in collaboration with Oxford Nanopore Technologies. The main aims of this collaboration were to determine whether all 10 loci could be accurately distinguished, along with the number of repeat expansion events in each one. Using DNA from a healthy control sample, Wigard Kloosterman, both short and long read sequencing were performed, to 30x and 70x coverage respectively, and the number of repeat copies was determined for 6 of the 10 loci. Two enrichment protocols, CRISPR-Cas9 and CRISPR-Cas12a, were then compared, each on a single flow cell on a GridION: the Cas9 approach yielded a median of just under 600x coverage of the loci under study, while the Cas12a approach yielded a median coverage of approximately 200x. Both protocols gave over 100x coverage of the target sequences. Examining the read coverage and alignment plots, Martin stated that cleavage sites were very specific and that there was no evidence of allelic dropout across the multiplex panel. Expanding upon this, he stated that for 7 of the loci all alleles were detected, as were the relevant SNPs; the remaining three loci were detected, but no informative variants were seen.
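As a rough illustration of how a per-locus coverage figure and its median across a panel might be computed, the sketch below counts reads that span each target interval completely. It is an assumption-laden toy rather than the analysis from the talk: the locus names, coordinates and reads are invented, and a real pipeline would derive the intervals from aligned reads in a BAM file.

```python
from statistics import median


def spanning_coverage(reads, loci):
    """Count, for each locus, the reads that cover it from start to end.

    reads: iterable of (chrom, start, end) alignment intervals.
    loci:  dict mapping a locus name to its (chrom, start, end) interval.
    """
    coverage = {}
    for name, (chrom, start, end) in loci.items():
        coverage[name] = sum(
            1
            for (r_chrom, r_start, r_end) in reads
            if r_chrom == chrom and r_start <= start and r_end >= end
        )
    return coverage


# Hypothetical two-locus panel and a handful of made-up read alignments.
loci = {
    "locus_A": ("chr1", 100, 200),
    "locus_B": ("chr2", 500, 650),
}
reads = [
    ("chr1", 50, 300),   # spans locus_A
    ("chr1", 90, 210),   # spans locus_A
    ("chr1", 150, 400),  # starts inside locus_A, so it does not span it
    ("chr2", 400, 700),  # spans locus_B
]

coverage = spanning_coverage(reads, loci)
panel_median = median(coverage.values())
print(coverage, panel_median)
```

Counting only fully spanning reads is a deliberate choice here: for repeat expansions, a read that stops inside the repeat cannot give a complete repeat count, so it is excluded from the tally.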
Next, Martin moved on to detecting and counting repeat expansion events across six patients with known pathogenic repeat expansions in different alleles. Except for one patient sample with degraded DNA, all samples processed using the Cas9 protocol had over 100x coverage of the target regions. Using the remaining patients as controls for each disease associated allele, the correct diagnosis was determined for each patient, and the repeat expansion counts broadly agreed with traditional diagnostic assays.