CRISPR-Cas9 targeted sequencing - supplementary information (Q-SQK-CS9109)


Background

How Cas9 works

Cas9 is a member of the CRISPR-Cas family of proteins, which can be programmed to cut specific DNA sequences. This property has enabled genome editing and also targeted sequencing. Recent work has shown that CRISPR-Cas9 target enrichment can be effectively coupled with Oxford Nanopore sequencing to generate high coverage of specific loci (Gilpatrick et al. 2020).

Briefly, the Cas9 protein forms a ribonucleoprotein complex (RNP) with two different RNA strands: a target-specific CRISPR RNA (crRNA, also called a probe) and a universal trans-activating CRISPR RNA (tracrRNA). The RNP searches the genomic DNA for the target sequence or region of interest (ROI). Once the target is found, Cas9 cuts both strands, revealing ends suitable for ligation and nanopore sequencing.

Figure 1. (1) The RNP complex bound to the region of interest. (2) The RNP melts the DNA duplex upon hybridisation. The target sequence is a 20 nt section called a protospacer, next to the 3 nt protospacer-adjacent motif (PAM), which has the sequence NGG. The protection of the PAM-distal end by Cas9, which remains bound, provides the directionality for the strand. Probe directionality is shown on the top image.

Note: Cas9 can preferentially remain bound to the PAM-distal end and release the PAM-proximal strand. This biases which ends are available for ligation of the nanopore sequencing adapters towards the region of interest, enabling sequencing of the desired region.
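The protospacer and PAM layout described above can be sketched in code. The following is a minimal illustration, not a probe design tool: it scans a sequence on both strands for a 20 nt protospacer immediately followed by an NGG PAM. The function names are hypothetical, and the example input is the HTT_1 control protospacer from Table 4 joined to its PAM from Table 5.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(seq: str, protospacer_len: int = 20):
    """List (start, strand, protospacer, pam) for each protospacer + NGG site.

    start is the 0-based forward-strand coordinate of the leftmost base of
    the whole site (protospacer plus PAM).
    """
    seq = seq.upper()
    site_len = protospacer_len + 3
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            start = match.start()
            if strand == "-":
                # Map the position on the reverse complement back to the forward strand.
                start = len(seq) - (match.start() + site_len)
            hits.append((start, strand, match.group(1), match.group(2)))
    return hits


# HTT_1 protospacer (Table 4) followed by its PAM (Table 5).
example = "TTTGCCCATTGGTTAGAAGC" + "AGG"
print(find_target_sites(example))
# → [(0, '+', 'TTTGCCCATTGGTTAGAAGC', 'AGG')]
```

A scan like this only locates candidate sites; it says nothing about cut efficiency or off-target risk, which is what tools such as CHOPCHOP score.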

CRISPR-Cas9 for nanopore sequencing

The enrichment protocol works by first protecting all the DNA ends in the sample using dephosphorylation, which limits ligation of the sequencing adapter to those ends. Cas9 cleavage then releases the downstream DNA molecule, exposing phosphate groups that enable ligation to the sequencing adapter following dA-tailing.

Figure 2. The region of interest (ROI) is cleaved at the 5’ side, and the PAM-distal site is protected by the bound Cas9. The cleavage and protection happen on both sides of the ROI. The directionality of the read is determined by the protection of the PAM-distal site by Cas9.

Protospacer sequences matching the forward strand (matching the genome reference), referred to as (+), will result in the release of the downstream DNA (towards the 3’ end), which will contain the ROI (Figure 1). In contrast, a protospacer matching the reverse complement strand, referred to as (-), will release the reverse strand, enabling double-strand enrichment when combined with a (+) probe.

Different targeting strategies can be implemented to enrich ROI based on the desired outcome.

Excision approach method
  • crRNA probes are used to target regions either end of the ROI to excise that region to be sequenced (ideal for particular gene targets)
  • The ROI is <10 kb (dependent on length of input DNA)
  • Both ends of the ROI are known, allowing probes to be designed on either side

‘Cut-and-read’ method*
  • crRNA probe is designed at one end of a target region to read into the unknown (ideal for characterisation of integration sites within a genome)
  • The ROI is <10 kb (dependent on length of input DNA)
  • Only one end of the ROI is known

Table 1. Methods for enrichment of regions of interest.

*Unverified for Q-line

Note: For longer ROIs (>10 kb), a tiling approach can be used. This method involves designing two pools of probes, each containing (+) and (-) probes. The pools should be prepared as two separate libraries, pooled in the final step (before the final clean-up) and loaded together onto the same flow cell. Please contact support@nanoporetech.com if you need further information regarding this method.

For the excision method, we recommend excising a ROI by making four cuts: two upstream of the ROI targeting the (+) strand, and two downstream targeting the (-) strand. Four cuts rather than two provide redundancy in case one or more crRNAs yield incomplete cleavage. This method should provide the highest coverage of a target region, as on-target strands will have sequencing adapters ligated to both ends.

Figure 5. The directionality of the sequenced read when the region of interest (ROI) is cleaved by the excision approach (two (+) and two (–) direction probes).
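The four-cut layout can be checked programmatically before ordering probes. The sketch below is illustrative only: the `Probe` class, its fields and the coordinates are hypothetical, and the check encodes nothing beyond the recommendation above (two (+) probes upstream of the ROI and two (-) probes downstream).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    name: str
    cut_position: int  # reference coordinate of the expected cut (hypothetical field)
    strand: str        # "+" or "-"


def check_excision_design(probes, roi_start, roi_end, cuts_per_side=2):
    """Return a list of problems; an empty list means the layout matches the recommendation."""
    upstream_plus = [p for p in probes if p.strand == "+" and p.cut_position < roi_start]
    downstream_minus = [p for p in probes if p.strand == "-" and p.cut_position > roi_end]
    accepted = set(upstream_plus) | set(downstream_minus)

    problems = []
    if len(upstream_plus) < cuts_per_side:
        problems.append(
            f"need {cuts_per_side} (+) probes upstream of the ROI, found {len(upstream_plus)}"
        )
    if len(downstream_minus) < cuts_per_side:
        problems.append(
            f"need {cuts_per_side} (-) probes downstream of the ROI, found {len(downstream_minus)}"
        )
    for p in probes:
        if p not in accepted:
            problems.append(
                f"{p.name}: ({p.strand}) probe at {p.cut_position} is neither "
                "a (+) probe upstream nor a (-) probe downstream of the ROI"
            )
    return problems


# Hypothetical coordinates, for illustration only.
design = [
    Probe("up_1", 1000, "+"),
    Probe("up_2", 1200, "+"),
    Probe("down_1", 9000, "-"),
    Probe("down_2", 9300, "-"),
]
print(check_excision_design(design, roi_start=2500, roi_end=7500))  # → []
```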

Successful enrichment sequencing runs and the choice of method depend on a few critical parameters:

  • Length of the ROI targeted should be between 5 and 10 kb
  • Quality of the input gDNA; we recommend freshly extracted high molecular weight (HMW) gDNA
  • Whether the sequences either side of the ROI are known
  • Coverage required for a particular application
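As a rough illustration of how two of these parameters map onto the methods in Table 1 and the tiling note, the choice can be written as a small decision function. This is a sketch under stated assumptions: `suggest_strategy` is a hypothetical name, the 10 kb threshold is taken from Table 1 and the note above, and input DNA quality and required coverage are not modelled. The ‘cut-and-read’ method is marked as unverified for Q-line in Table 1.

```python
def suggest_strategy(roi_length_kb: float, both_ends_known: bool) -> str:
    """Suggest an enrichment approach from ROI length and knowledge of its flanks."""
    if roi_length_kb <= 0:
        raise ValueError("ROI length must be positive")
    if roi_length_kb > 10:
        # Longer ROIs: two probe pools prepared as separate libraries (tiling note).
        return "tiling"
    if both_ends_known:
        # Probes can be designed on either side of the ROI.
        return "excision"
    # Only one end is known: read from a single cut into the unknown region.
    return "cut-and-read"


print(suggest_strategy(6, both_ends_known=True))    # → excision
print(suggest_strategy(6, both_ends_known=False))   # → cut-and-read
print(suggest_strategy(25, both_ends_known=True))   # → tiling
```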



Experimental design recommendations

Input DNA

For successful enrichment experiments, we recommend freshly extracted high molecular weight DNA. Our customers have found that extractions using New England Biolabs Monarch or QIAGEN PureGene products generally yield better coverage. Samples with short DNA fragments have more DNA ends, resulting in higher background. Use wide-bore tips when manipulating DNA until the cleavage and dA-tailing step.

crRNA probe design

There are some key guidelines to follow when designing probes for a Cas9 experiment. The method for designing a single probe is the same no matter which overall approach is taken. We recommend that a period of development and testing be undertaken to optimise probe design for the ROI before introducing them to a regulated environment.

Probe spacing and position relative to ROI
We recommend spacing the crRNA probes appropriately to target up to a 5 kb ROI. A minimum of 1 kb either side of the ROI must be added to allow sequencing to start before the ROI. The flanking region can also be extended if the ROI is very small (<3 kb) and could be lost in the library preparation clean-up step. If an ideal target sequence cannot be found, consider expanding the flanking region further.
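The flanking rule can be turned into coordinates. The sketch below assumes simple integer reference coordinates and a hypothetical helper name. For a given ROI, it returns the position at or before which the upstream cuts should fall and the position at or after which the downstream cuts should fall, so that at least 1 kb is left on either side.

```python
def cut_site_limits(roi_start: int, roi_end: int, min_flank: int = 1000):
    """Return (latest upstream cut, earliest downstream cut) for a given ROI.

    Upstream cuts should fall at or before the first value and downstream
    cuts at or after the second, leaving at least min_flank bases either side.
    """
    if roi_start >= roi_end:
        raise ValueError("roi_start must be smaller than roi_end")
    # Clamp at 0 so the upstream limit never becomes a negative coordinate.
    return max(0, roi_start - min_flank), roi_end + min_flank


upstream_limit, downstream_limit = cut_site_limits(10_000, 14_000)
print(upstream_limit, downstream_limit)    # → 9000 15000
print(downstream_limit - upstream_limit)   # shortest excised fragment → 6000
```

For a very small ROI, `min_flank` can be increased so that the excised fragment is long enough to survive the library preparation clean-up step.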

Probe design tools
Probes can be designed manually, but there are a number of tools available to help, for example CHOPCHOP. When using CHOPCHOP, select “nanopore-enrichment” under the ‘For’ dropdown to set the recommended presets. We recommend inputting coordinates rather than gene targets, as a gene-target search may only return results from exons. We also recommend selecting probe sequences with an efficiency score >0.3.

For best coverage, we recommend ordering multiple probes.

Parameter | Recommendation
Mismatch (MM) | Relevant to off-target activity; minimise the number
GC % | 40–80 %
Self-complementarity | Minimise

Table 2. Criteria to consider when designing probes with CHOPCHOP.
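These criteria can be applied to a list of candidate probes as a simple filter and sort. The sketch below is one possible reading, not a CHOPCHOP feature: the dictionary keys and candidate values are hypothetical, GC content and efficiency score are treated as hard cut-offs (40–80 % and >0.3), and off-target site count and self-complementarity are treated as values to minimise.

```python
def rank_candidates(candidates, min_gc=40.0, max_gc=80.0, min_efficiency=0.3):
    """Drop candidates outside the GC or efficiency limits, then sort the rest.

    Sort order: fewest off-target sites, lowest self-complementarity,
    highest efficiency score.
    """
    kept = [
        c for c in candidates
        if min_gc <= c["gc"] <= max_gc and c["efficiency"] > min_efficiency
    ]
    return sorted(
        kept,
        key=lambda c: (c["off_target_sites"], c["self_complementarity"], -c["efficiency"]),
    )


# Hypothetical candidates, for illustration only.
candidates = [
    {"name": "a", "gc": 45, "efficiency": 0.66, "off_target_sites": 1, "self_complementarity": 0},
    {"name": "b", "gc": 35, "efficiency": 0.70, "off_target_sites": 0, "self_complementarity": 0},
    {"name": "c", "gc": 50, "efficiency": 0.25, "off_target_sites": 0, "self_complementarity": 0},
    {"name": "d", "gc": 55, "efficiency": 0.50, "off_target_sites": 0, "self_complementarity": 0},
]
print([c["name"] for c in rank_candidates(candidates)])  # → ['d', 'a']
```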



How and what to order

Ordering Cas9 protein and RNA components

Cas9 protein and RNAs used in the nanopore sequencing method can all be purchased from Integrated DNA Technologies (IDT).

Component | IDT component name
CRISPR-Cas9 nuclease | Alt-R™ S.p. HiFi Cas9 Nuclease V3
CRISPR-Cas9 crRNA | Alt-R™ CRISPR-Cas9 crRNA*
CRISPR-Cas9 tracrRNA | Alt-R® CRISPR-Cas9 tracrRNA

Table 3. IDT components.

*Only the protospacer sequence is needed when ordering from IDT (some tools like CHOPCHOP return the full target sequence, i.e. the protospacer + PAM sequence)
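When a design tool returns the full 23 nt target (protospacer + PAM), the PAM has to be removed before ordering. A minimal sketch, assuming a 20 nt protospacer and an NGG PAM as described in the Background section; the function name is hypothetical.

```python
import re


def protospacer_for_ordering(target: str) -> str:
    """Return the 20 nt protospacer from a protospacer or protospacer + PAM sequence."""
    target = target.strip().upper()
    if not re.fullmatch(r"[ACGT]+", target):
        raise ValueError("expected an A/C/G/T sequence")
    if len(target) == 23 and target.endswith("GG"):
        # Full target site: drop the 3 nt NGG PAM at the 3' end.
        return target[:20]
    if len(target) == 20:
        # Already a bare protospacer.
        return target
    raise ValueError("expected a 20 nt protospacer or a 23 nt protospacer + NGG PAM")


print(protospacer_for_ordering("TTTGCCCATTGGTTAGAAGCAGG"))  # → TTTGCCCATTGGTTAGAAGC
```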

Control crRNA sequences
Control crRNAs can be ordered from IDT (Alt-R™ CRISPR-Cas9 crRNA*, ‘use your own design’) and used in the initial step of the sample prep protocol as a control experiment or as an in-run control with other targets without impacting the coverage of any ROI.

For more information on the probes Oxford Nanopore Technologies used to assess kit performance, see the Expectations and Guidance section.

Name Protospacer sequence
HTT_1 TTTGCCCATTGGTTAGAAGC
HTT_2 TCTTATGAGTCTGCCCACTG
HTT_3 GGACAAAGTTAGGTACTCAG
HTT_4 CTAGACTCTTAACTCGCTTG

Table 4. Protospacer sequences for IDT ordering.

The HTT gene region to use in a BED file for assessing as an in-run control should be Chr4 3072000-3078000 in GRCh38.
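A BED entry for this region could be produced as sketched below. Two assumptions are made and should be checked against your own reference and analysis tool: the coordinates are copied as stated above without any 0-based/1-based conversion, and the chromosome is written as `chr4` (the name must match the contig name in the reference used for alignment). The helper name is hypothetical.

```python
def bed_line(chrom: str, start: int, end: int, name: str) -> str:
    """Format one tab-separated BED record (chrom, start, end, name)."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return "\t".join([chrom, str(start), str(end), name])


# Assumption: contig is named "chr4" in the reference; coordinates used as stated.
print(bed_line("chr4", 3072000, 3078000, "HTT"))
```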

Probe Name HTT_1 HTT_2 HTT_3 HTT_4
Chromosome Chr4 Chr4 Chr4 Chr4
Sense + + - -
Allele Maternal Maternal Maternal Maternal
Location in GRCh38 3072436 3072537 3077287 3079444
GC % 45 50 45 45
Self-complementarity 0 0 0 0
MM0 1 1 1 1
MM1 0 0 0 0
MM2 0 0 0 0
MM3 1 2 2 1
Efficiency score 0.66 0.60 0.64 0.51
PAM AGG AGG AGG AGG

Table 5. Specification according to CHOPCHOP for HTT crRNA probe sequences.
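As a quick sanity check before ordering, the GC % values in Table 5 can be recomputed from the protospacer sequences in Table 4. The sketch below uses a plain G+C count over the 20 nt protospacer; whether CHOPCHOP computes GC % in exactly this way is an assumption, although the results agree with the table.

```python
HTT_PROTOSPACERS = {
    "HTT_1": "TTTGCCCATTGGTTAGAAGC",
    "HTT_2": "TCTTATGAGTCTGCCCACTG",
    "HTT_3": "GGACAAAGTTAGGTACTCAG",
    "HTT_4": "CTAGACTCTTAACTCGCTTG",
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


for name, seq in HTT_PROTOSPACERS.items():
    print(name, len(seq), gc_percent(seq))
# → HTT_1 20 45.0
# → HTT_2 20 50.0
# → HTT_3 20 45.0
# → HTT_4 20 45.0
```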



Expectations and guidance

Cas9 enrichment compared to whole genome sequencing

The Q Cas9 sequencing kit enriches for a ROI by limiting the ability of off-target regions to be sequenced. Whole genome sequencing runs typically feature high sequencing capacity and output, resulting in coverage spread over the whole genome rather than the region of interest. In contrast, Cas9 enrichment experiments have lower sequencing capacity, but a higher percentage of sequencing time is spent on the ROI, boosting the coverage of the region of interest several hundred-fold. This also results in a reduction in coverage of the rest of the genome compared to a whole genome sequencing experiment.

Figure 6. Relative coverage of the whole genome and Cas9 enrichment for the region of interest or background.

Downstream analysis for Cas9 sequencing experiments

The Oxford Nanopore Technologies Q system provides FASTQ output to allow for use of a wide range of downstream analysis tools. The implementation of any analysis pipeline as part of a regulated environment will require additional validation.

Oxford Nanopore Technologies provides downstream analysis tools through its EPI2ME Labs platform. For targeted sequencing using Cas9, we recommend using the wf-cas9 workflow as a starting point for your analysis. This workflow is designed for analysing on-target, off-target and background sequencing from Cas9 native DNA enrichment experiments.

For further information and to access the latest versions, please visit the EPI2ME Labs website: https://labs.epi2me.io/wfindex/



Summary of key results

Target coverage

Probe design
Coverage can be variable between targets even if they have similar characteristics, such as length. This highlights the importance of optimising probe designs for each ROI. We recommend that a period of development and testing be undertaken to optimise probe design for the ROI before introducing them to a regulated environment. More detailed information can be found in the Probe design section of this document.

The implementation of new probes in any process as part of a regulated environment will require additional validation.

Multiple targets
Coverage at a ROI should remain similar whether multiple ROIs are targeted or not since there is limited competition for sequencing time on the flow cell. The Q-SQK-CS9109 protocol is verified for up to 5 targets (20 probes). Data from earlier stages of development may suggest targeting 20 targets (100 probes) or more is possible. The implementation of a large number of probes in any process as part of a regulated environment will require additional validation.

Taking advantage of this, the HTT control probes can be added as an internal control to allow for debugging without impacting coverage of ROIs. The sequences for these probes can be found in the Probe design section of this document.

Off-target and background reads

Non-target DNA observed during sequencing can be split into two categories: off-target and background.

  • Background reads are due to enzymatic inefficiencies in the dephosphorylation step, resulting in ligation of sequencing adapters to non-cut DNA. Though the Oxford Nanopore Q Cas9 Sequencing Kit is optimised to minimise this effect, some inefficiencies are expected, and therefore some background reads will be present in sequencing runs. The extent of the background will depend on the ROIs, the probe designs, and the size of the panel being targeted.
  • Off-target reads are due to mismatched binding of the probes to sites other than the expected ROI. While each crRNA in a panel should result in a cut at the expected site, it may also allow cuts at sites bearing mismatches compared to the target sequence, leading to adapter ligation at those unwanted sites. The chance of this “off-target” activity can be mitigated by the careful design of crRNAs to have a minimum number of mismatches while maintaining high cut efficiency.
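A first step in quantifying non-target reads is to separate aligned reads that overlap a target region from those that do not. The sketch below is a simplified illustration and not the method used by wf-cas9: it does not distinguish off-target from background reads, the function names are hypothetical, the read coordinates are made up, and the `chr4` contig name is an assumption about the reference.

```python
# Target region taken from the in-run control recommendation (GRCh38).
TARGETS = {"HTT": ("chr4", 3072000, 3078000)}


def classify_read(chrom, start, end, targets):
    """Return the name of the first target the alignment overlaps, or None."""
    for name, (t_chrom, t_start, t_end) in targets.items():
        if chrom == t_chrom and start < t_end and t_start < end:
            return name
    return None


def on_target_fraction(reads, targets):
    """Fraction of alignments (chrom, start, end) that overlap any target."""
    if not reads:
        return 0.0
    hits = sum(classify_read(*read, targets) is not None for read in reads)
    return hits / len(reads)


# Hypothetical alignments, for illustration only.
reads = [
    ("chr4", 3072500, 3077200),   # inside the HTT region
    ("chr4", 100, 5000),          # same chromosome, elsewhere
    ("chr11", 2158100, 2160000),  # different chromosome
    ("chr4", 3077900, 3085000),   # partially overlaps the HTT region
]
print(on_target_fraction(reads, TARGETS))  # → 0.5
```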



Verified performance claims


Performance of Q-SQK-CS9109 was verified by Oxford Nanopore Technologies using an excision approach (as per Q-SQK-CS9109 protocol) with a single gene target and with a multi-gene target experiment, set up as detailed below:

Experiment settings | Single gene target experiment | Multi-gene target experiment
Target regions | HTT* (Chr4 3072000–3078000) | HTT* - Chr4 (3072000–3078000); SCA3 - Chr14 (92067100–92075400); SCA6 - Chr19 (13204400–13211100); SCA10 - Chr22 (45791500–45799400); SCA17 - Chr6 (170556800–170566300)
Probe design method | 4x control probes used for excision approach | 4x probes used for excision approach for all targets (20x crRNA probes in total)
Input DNA | Human GRCh38; 5 µg input; N50: 22 kb (based on ONT sequencing using LSK) | Human GRCh38; 5 µg input; N50: 22 kb (based on ONT sequencing using LSK)
Flow cell | Q-FLO-MIN106D | Q-FLO-MIN106D
Kit | Q-SQK-CS9109 | Q-SQK-CS9109
Sequencing software version | Q Sequencing Software 23.06 | Q Sequencing Software 23.06
Assay Sequencing Settings as per | Cas9 PCR-free Targeted Enrichment Sequencing Experiment v1.0 | Cas9 PCR-free Targeted Enrichment Sequencing Experiment v1.0
Analysis tools | EPI2ME Labs: wf-cas9 V1.1.2 | EPI2ME Labs: wf-cas9 V1.1.2

Table 6. Details of performance verification experiments.

*The 4 control probe sequences used to target the HTT gene in the single gene and multi-gene experiments can be found in the Probe design section.

Performance Metrics | Single gene target experiment | Multi-gene target experiment
Prep time (start of sample prep to flow cell loading) | 4 hours | 5 hours
Median coverage of ROI | HTT: 402x | HTT: 681x; SCA3: 651x; SCA6: 558x; SCA10: 629x; SCA17: 531x
Median coverage of defined off-target region* | 0x | 1.5x
Modal Q score (non-aligned) | Q15–16 | Q15–16
Modal speed | 380 bps | 377 bps
Sequencing occupancy | 4.5% | 9.5%

Table 7. Average performance metrics for each experiment.

*The INS gene (Chr11 2158100-2167500), which was not targeted in these experiments, is used as a marker for off-target coverage across the whole genome.
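One way to read Table 7 is to express each ROI's median coverage relative to the median coverage of the off-target marker region. This ratio is an illustrative calculation only and is not a metric defined in this document; in particular, it differs from the comparison against whole genome sequencing described in the Expectations and guidance section. The function name is hypothetical, and the values are taken from the multi-gene target column of Table 7.

```python
def fold_over_off_target(on_target_cov: float, off_target_cov: float):
    """Ratio of on-target to off-target median coverage, or None if undefined."""
    if off_target_cov <= 0:
        # A 0x marker region (as in the single gene experiment) gives no finite ratio.
        return None
    return on_target_cov / off_target_cov


# Median ROI coverage and off-target marker coverage, multi-gene experiment (Table 7).
MULTI_GENE_COVERAGE = {"HTT": 681, "SCA3": 651, "SCA6": 558, "SCA10": 629, "SCA17": 531}
OFF_TARGET_COVERAGE = 1.5

for name, cov in MULTI_GENE_COVERAGE.items():
    print(name, round(fold_over_off_target(cov, OFF_TARGET_COVERAGE)))
# → HTT 454
# → SCA3 434
# → SCA6 372
# → SCA10 419
# → SCA17 354
```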



References

Gilpatrick, T., Lee, I., Graham, J.E. et al. Targeted nanopore sequencing with Cas9-guided adapter ligation. Nat Biotechnol 38, 433–438 (2020). https://doi.org/10.1038/s41587-020-0407-5


Change log

Date Version Change
30 Jan 2025 Q_CSI_revA_30Jan2025 Initial publication

Last updated: 2/4/2025
