Optimisation of library preparation for longer cell-free DNA (cfDNA) libraries

Introduction

The analysis of cell-free DNA (cfDNA) has shown significant potential for a variety of diagnostic applications, including the detection of cancer and the identification of the tissue of origin. For example, recent studies using single-molecule sequencing have highlighted the diagnostic value of cfDNA in maternal (Choy et al. 2022) and oncological (Yu et al. 2023) contexts, finding that the greater number of CpG and SNP sites in long cfDNA molecules significantly enhanced identification of the tissue of origin.

cfDNA predominantly circulates in blood as fragments that are multiples of nucleosome lengths, a phenomenon thought to result from the action of DNases on histone-protected DNA. This leads to a characteristic length profile reflecting the fixed positioning of nucleosomes along the DNA (Figure 1).

Figure 1. Fragment length profile of cfDNA extracted from plasma. The fragment length distribution of cfDNA obtained from plasma samples, measured using capillary electrophoresis, shows the characteristic nucleosome-associated banding pattern. Peaks are observed at ~163 bp and ~307 bp, among others, corresponding to the lengths of DNA protected by nucleosomes. These peaks are hereafter labelled sequentially as Peak 1, Peak 2, etc. (Figure adapted from Pedini et al. 2021.)


In this study we show that the characteristic length profile of cfDNA can be manipulated by performing size selection during library preparation to enrich for longer fragments. We also show that the fragment length distribution of the sample is a reasonable predictor of both the raw read length and aligned read length profiles observed from sequencing.


Methods and results

cfDNA was extracted from 5 ml or 10 ml of plasma using the QIAamp MinElute ccfDNA Midi Kit (QIAGEN) following the manufacturer's instructions. The resulting cfDNA was quantified using a Qubit dsDNA HS Kit (ThermoFisher), and fragment sizes were measured using the Femto Pulse (Agilent) (Table 1 and Figure 2). Libraries were prepared for sequencing with our legacy Human blood cfDNA protocol using cfDNA from the 5 ml and 10 ml extractions (~30 ng and ~60 ng inputs, respectively).

We now recommend using our updated methods:

To enrich for longer cfDNA fragments, we tested lower ratios of AMPure XP Beads (AXP) in the clean-up after end-preparation, from 3x (current recommendation) to 1x, 0.8x and 0.6x (as described in Table 2).
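As a rough illustration of what the bead ratios mean at the bench, the sketch below converts each ratio into a bead volume using the usual convention that the ratio is the volume of beads divided by the volume of sample. The 60 µl sample volume is a hypothetical value chosen for the example and is not taken from the protocol; the function name is also illustrative.

```python
# Minimal sketch: bead volume for a given bead-to-sample ratio.
# ASSUMPTION: the ratio is bead volume / sample volume.
# ASSUMPTION: the 60 ul sample volume is hypothetical, not from the protocol.

def bead_volume_ul(sample_volume_ul: float, ratio: float) -> float:
    """Return the volume of beads (ul) to add for a bead:sample ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_volume_ul * ratio


SAMPLE_VOLUME_UL = 60.0  # hypothetical end-prep reaction volume

for ratio in (3.0, 1.0, 0.8, 0.6):  # ratios tested in this study
    volume = bead_volume_ul(SAMPLE_VOLUME_UL, ratio)
    print(f"{ratio:.1f}x -> {volume:.1f} ul beads")
```

Lower ratios bind shorter fragments less efficiently, which is why they act as a size selection.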

The purified DNA was quantified (Qubit) and analysed by Femto Pulse (Agilent) (Figure 3A), and library preparation was then performed by ligation of sequencing adapters, as described in the legacy protocol. The libraries were sequenced on PromethION R10.4.1 flow cells with the minimum read length set to 20 bp. We recorded the read length and flow cell output (Table 2 and Figure 3B). Reads were aligned using Minimap2. Raw reads and primary alignments were binned into nucleosomal peaks, and the proportion of data in each peak was compared to the fragment lengths of the size-selected material determined by the Femto Pulse (Agilent) (Figure 4).
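The binning step described above can be sketched as follows. This is an illustration only: the bin boundaries are hypothetical values placed around nucleosomal peaks such as those at ~163 bp and ~307 bp, not the boundaries used in this study, and the proportions here are computed by read count among reads falling in Peaks 1-4, which is also an assumption.

```python
from collections import Counter

# ASSUMPTION: hypothetical bin edges in bp (lower inclusive, upper exclusive).
# The boundaries used in the study are not stated in the text.
PEAK_BINS = {
    "Peak 1": (100, 250),
    "Peak 2": (250, 450),
    "Peak 3": (450, 650),
    "Peak 4": (650, 850),
}


def assign_peak(length_bp: int):
    """Return the peak label for a length, or None if outside all bins."""
    for label, (low, high) in PEAK_BINS.items():
        if low <= length_bp < high:
            return label
    return None


def peak_proportions(lengths_bp):
    """Fraction of reads per peak, among reads that fall in any peak."""
    counts = Counter(
        peak for peak in map(assign_peak, lengths_bp) if peak is not None
    )
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in PEAK_BINS}
    return {label: counts[label] / total for label in PEAK_BINS}


# Toy read lengths (bp), invented for illustration only.
example_lengths = [160, 165, 170, 300, 310, 480, 700, 50]
for label, fraction in peak_proportions(example_lengths).items():
    print(f"{label}: {fraction:.2f}")
```

The same function can be applied to fragment lengths, raw read lengths and primary alignment lengths so that the three profiles are compared on identical bins.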

Table 1. cfDNA recoveries from extractions of 5 ml and 10 ml of human plasma. Technical replicates were pooled per volume, with total ng inputs for sequencing libraries indicated by the average yield.


Table 2. Library preparation and sequencing statistics for cfDNA libraries prepared in this study.
* The "Approximate flow cell load (fmol)" is calculated retrospectively from the measured read N50 (bases) and the library yield (ng): mol loaded = mass loaded / (read N50 x 660 Da per base pair).
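The footnote formula can be worked through in a few lines. The sketch below converts nanograms and a read N50 into femtomoles using the 660 Da per base pair figure from the footnote; the function name and the input values are hypothetical and are not taken from Table 2.

```python
# Minimal sketch of the footnote formula:
#   mol loaded = mass loaded / (read N50 x 660 Da per base pair)
# Unit conversion: 1 ng = 1e-9 g and 1 mol = 1e15 fmol, so
#   fmol = ng * 1e6 / (N50 * 660)

def flow_cell_load_fmol(mass_ng: float, read_n50_bp: float,
                        da_per_bp: float = 660.0) -> float:
    """Approximate molar load (fmol) from library mass (ng) and read N50 (bp)."""
    if mass_ng < 0 or read_n50_bp <= 0:
        raise ValueError("mass must be non-negative and N50 must be positive")
    return mass_ng * 1e6 / (read_n50_bp * da_per_bp)


# Hypothetical inputs for illustration only (not values from Table 2).
load = flow_cell_load_fmol(mass_ng=10.0, read_n50_bp=200.0)
print(f"Approximate flow cell load: {load:.1f} fmol")
```

Because the N50 is used in place of the mean fragment length, the result is an approximation. For a fixed mass, a longer N50 gives a lower molar load.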


Figure 2. Fragment length profile of extracted cfDNA. Femto Pulse profile (Panel A) and stacked bar chart (Panel B) showing the distribution of nucleosomal peaks 1-4.


Figure 3. Fragment length and sequencing read length profile of size-selected cfDNA (extracted from 5 ml plasma). Panel A: Fragment length profiles were determined using the Femto Pulse system (Agilent). Panel B: Read length profiles were determined by preparing LSK114 libraries and sequencing on PromethION. Similar profiles were obtained for cfDNA extracted from 10 ml of plasma (data not shown).


Figure 4. Representation of the different nucleosome peaks (1-4) with varying degrees of size selection. cfDNA was extracted from 5 ml of plasma and prepared for sequencing using different ratios of AXP during the library preparation. Fragment length (F), read length (R) and alignment length (A) are shown for each condition.


Discussion

We find that altering the AMPure XP Beads (AXP) ratio enriches for longer fragments, and that the fragment length profile observed after end-prep closely matches both the raw and aligned read lengths obtained (Figures 3 and 4).

Reducing the AXP ratio, and thereby intensifying the size selection, increases the loss of material (see Table 2). Notably, a 0.6x ratio results in significant material loss, which reduces the amount of library loaded onto the flow cell and, in turn, the flow cell output, particularly when starting with 5 ml of plasma. Increasing the starting plasma volume from 5 ml to 10 ml proved effective in counteracting this reduction in output at the 0.6x ratio. If no more than 5 ml of plasma is available, users should exercise caution when using the 0.6x AXP condition.


Summary

Modification of the bead-based clean-ups during library preparation can be an effective way of enriching for longer fragments in cfDNA samples. We observe good correlation between the fragment length observed after size selection and the resulting read lengths obtained in sequencing.


References

Choy L, et al. Single-Molecule Sequencing Enables Long Cell-Free DNA Detection and Direct Methylation Analysis for Cancer Patients, Clinical Chemistry, Volume 68, Issue 9, (2022). https://doi.org/10.1093/clinchem/hvac086

Pedini, P., Graiet, H., Laget, L. et al. Qualitative and quantitative comparison of cell-free DNA and cell-free fetal DNA isolation by four (semi-)automated extraction methods: impact in two clinical applications: chimerism quantification and non-invasive prenatal diagnosis. J Transl Med 19, 15 (2021). https://rdcu.be/duYd9

Yu, S.C.Y., Choy, L.Y.L. & Lo, Y.M.D. ‘Longing’ for the Next Generation of Liquid Biopsy: The Diagnostic Potential of Long Cell-Free DNA in Oncology and Prenatal Testing. Mol Diagn Ther 27, 563–571 (2023). https://doi.org/10.1007/s40291-023-00661-2

Change log

Version Change
v3, 1st July 2024 Updated Figure 4 image
v2, May 2024 Updates including new protocol links
v1, Feb 2024 Initial publication

Last updated: 7/1/2024
