Telo-Seq know-how document
FOR RESEARCH USE ONLY
Contents
Introduction
- Overview of the document
- Understanding telomeres through Telo-Seq
- Telo-Seq protocol overview
- Discontinuation of single-plex Telo-Seq
Prepare
- Input mass and multiplexing
- Fragment distribution
- Sample origin
- Q score filtering
- SUP vs HAC basecalling
- Adaptive sampling
Analyse
- wf-teloseq
- Workflow pathways
- Example wf-teloseq output for a human cell line dataset
- Non-matching reference
- Methylation
- Data availability
Telo-Seq method design considerations and validation
Accurate telomere length estimation
- Fragment length affects chromosomal representation
- Restriction enzyme choice may skew chromosomal arm representation
- Telomere lengths vary by chromosomal arm
- Mean of telomeric read lengths is not the same as the median of medians chromosomal arm telomere lengths
References
Change log
The fully released and supported Telo-Seq multiplex end-to-end protocol is now available and compatible with PromethION, MinION and GridION devices. The full method can be accessed via the documentation space on the Nanopore Community.
The singleplex Telo-Seq end-to-end protocol is no longer supported and the RYI page will no longer be active. Please contact us at support@nanoporetech.com if you still require support with the singleplex method.
Introduction
Overview of the document
This document offers comprehensive guidance on the Telo-Seq method for sequencing telomeres in high molecular weight (HMW) genomic DNA. Telo-Seq is designed to accurately measure telomere length and assign each telomere to a specific chromosome arm. The updated workflow utilises barcoded Telo-adapters for multiplexed Telo-Seq experiments on a single flow cell, along with analysis through wf-teloseq, a bioinformatic pipeline that can be run from the command line or within the EPI2ME Desktop app.
The following key areas are covered:
- The role of telomeres in health and disease: understanding the biological significance of telomeres.
- Telo-Seq method and protocol: an overview and detailed steps of the Telo-Seq method.
- Telomeric enrichment and length estimation: techniques for enriching telomeric sequences and estimating their length.
- Sample input and fragment distribution: considerations for sample preparation and fragment analysis.
- Sequencing setup and run parameters: guidelines for optimal sequencing performance.
- Example sequencing performance and analysis pipeline: expected outcomes and analysis options.
Understanding telomeres through Telo-Seq
Telomeres are essential repetitive DNA sequences located at the ends of linear chromosomes, protecting them from degradation. In humans, they consist of repetitive n(GGTTAG) motifs, ending in a single-stranded 3' G-rich overhang (see Figure 1) (Podlevsky and Chen, 2011). Telomeres gradually shorten with each cell division, and once they reach a critically short length, cells stop dividing and enter senescence, a threshold known as the ‘Hayflick limit’ (Lulkiewicz et al., 2020). Telomeres act as protective padding because they can shorten without affecting gene expression. This shortening process is closely associated with age-related diseases, including cancer, as many cancer cells bypass this limit by reactivating telomerase or using alternative mechanisms to maintain telomere length, allowing unchecked cell growth.
Figure 1. The telomeric 3' overhang. In this example, the overhang starts with ‘GGTTAG’.
In humans there are 22 pairs of autosomal chromosomes and one pair of sex chromosomes (XX or XY), making up 23 chromosome pairs, or 46 chromosomes. Each maternal and paternal chromosome has a telomere on both its P and Q arms (see Figure 2), giving 46 × 2 = 92 individual telomere arms.
Figure 2. Inheritance of parental chromosomes and their contribution to individual telomere arms.
Telo-Seq utilises the unique properties of telomeric DNA, allowing for precise measurements of telomere length at the chromosomal arm level. This method provides significant advantages over traditional sequencing techniques, including improved accuracy and the ability to work with high molecular weight (HMW) genomic DNA. By assigning telomere lengths to specific chromosome arms, Telo-Seq is a useful tool for understanding telomere dynamics.
Telo-Seq protocol overview
Telo-Seq is designed to accurately measure telomere length and assign each telomere to its specific chromosome arm. As illustrated in Figure 3, the step-by-step process is as follows:
1. Ligation of Telo-adapters: Telo-Seq uses the telomeric 3’ overhang to ligate custom barcoded ‘Telo-adapters’ onto the end of each chromosome arm.
2. Restriction digestion: the DNA is subjected to a restriction digestion using EcoRV. The enzyme digests most of the chromosome, leaving the telomere and sub-telomere regions intact.
3. 3’ dA-tailing: after digestion, a 3’ dA-tailing step is performed to exclude the distal end of the fragment from sequencing adapter ligation.
4. Splint annealing: to mitigate dissociation of the splint from the pre-annealed Telo-Adapter, a reannealing step is carried out, ensuring the presentation of a cohesive end for sequencing adapter ligation.
5. Adapter ligation: the cohesive end created by the annealed splint is then ligated with the sequencing adapter, allowing the DNA to be sequenced.
Figure 3. Overview of the Telo-Seq library preparation.
Telo-Seq experiments sequence the “C strand” of the telomere, starting at the outer end of the double-stranded portion and reading inwards through the telomere into the sub-telomere. The single-stranded 3’ overhang of the “G strand” is not sequenced.
Discontinuation of single-plex Telo-Seq
The Telo-Seq protocol has been updated to accommodate multiplexing through barcodes. The previous single-plex approach has been discontinued. Multiplexing provides greater efficiency and cost-effectiveness by allowing multiple samples to be processed simultaneously on a single flow cell, enhancing output. This update responds to feedback from early access users and internal performance evaluations, which showed that multiplexing offers superior performance across different sample types and use cases:
- Increased throughput as up to 12 different samples can be processed concurrently on a single flow cell, reducing the time and cost per sample.
- Better utilisation of the sequencing capacity, yielding more data and greater coverage per sample.
- Barcoding and adapters: custom barcoded Telo-adapters are used to differentiate samples within the multiplex run. Each barcode corresponds to a specific sample, and careful adapter ligation ensures high specificity and minimal barcode crosstalk.
Prepare
Input mass and multiplexing
Telo-Seq offers significant telomeric enrichment compared to standard sequencing methods, enabling precise telomere length measurements. Optimal Telo-Seq performance requires at least 5 µg of HMW DNA per barcode for a 12-plex, which yields enough library for full flow cell occupancy. As shown in Figure 4, increasing the DNA input mass improves telomeric read output. Inputs of less than 5 µg per barcode for a 12-plex yield insufficient library to achieve full pore occupancy on the flow cell, which in turn reduces telomeric read output.
Figure 4. The effect of varying input mass per barcode on Telo-Seq performance. Mean telomeric reads ± SD per barcode against input mass HG002 gDNA extracted with QIAGEN PureGene. Increasing the input DNA mass per barcode leads to improved outputs. Data was obtained from MinION flow cells run for 48 hours on GridION, with all outputs analysed with wf-teloseq. Across all tested input masses, Telo-Seq demonstrated a significant increase in telomeric reads compared to SQK-LSK114.
For accurate telomere length estimation at the individual chromosomal arm level, at least 1000 telomeric reads per barcode are required. For this reason, when processing samples through the multiplex protocol, we recommend a combined input of at least 60 μg across all samples loaded on the flow cell. For example:
- 12 x 5 μg inputs = 60 μg total
- 6 x 10 μg inputs = 60 μg total
- 4 x 15 μg inputs = 60 μg total
- 1 x 15 μg input will not yield sufficient library to fill the flow cell, and will result in reduced telomeric output.
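The input planning above can be sketched as a small check. This is an illustrative helper only: the function name is hypothetical, and the thresholds are simply the figures quoted in this section (5 µg per barcode, 60 µg in total, up to 12 barcodes).

```python
# Hypothetical planning helper; thresholds are the figures quoted in this document.
MIN_TOTAL_UG = 60.0        # recommended minimum combined input per flow cell
MIN_PER_BARCODE_UG = 5.0   # recommended minimum input per barcode
MAX_BARCODES = 12          # maximum number of samples per flow cell


def check_input_plan(inputs_ug):
    """Return a list of warnings for a planned set of per-barcode inputs (in µg)."""
    warnings = []
    if len(inputs_ug) > MAX_BARCODES:
        warnings.append(f"{len(inputs_ug)} barcodes exceeds the {MAX_BARCODES}-plex limit")
    low = [mass for mass in inputs_ug if mass < MIN_PER_BARCODE_UG]
    if low:
        warnings.append(f"{len(low)} barcode(s) below {MIN_PER_BARCODE_UG:g} µg")
    total = sum(inputs_ug)
    if total < MIN_TOTAL_UG:
        warnings.append(f"total input {total:g} µg is below {MIN_TOTAL_UG:g} µg")
    return warnings


print(check_input_plan([5] * 12))  # 12 x 5 µg: no warnings
print(check_input_plan([15] * 4))  # 4 x 15 µg: no warnings
print(check_input_plan([15]))      # 1 x 15 µg: total input too low
```

An empty list means the plan meets the quoted recommendations; it does not guarantee read yield, which also depends on fragment distribution and run time.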
To ensure a minimum of 1000 telomeric reads per barcode, we recommend running the sequencing experiment for 48 hours. Please note, reducing the initial sample input and increasing sequencing time will not result in comparable output. Reducing the initial sample input will cause lower pore occupancy, resulting in reduced sequencing output through accelerated pore loss during the sequencing run.
Fragment distribution
Assessing fragment distribution
We have found input sample fragment distribution to be a critical variable that may impact Telo-Seq performance. Optimal Telo-Seq performance is achieved when >90% of the starting DNA fragments are longer than 10 Kbp, due to the inherent length of telomeres and sub-telomeres. Sequencing >10 Kbp fragments allows for better capture of chromosomal context for alignment and arm assignment. Successful alignment of telomeric reads to a genomic reference requires sufficiently unique sequence in the sub-telomeric regions of the chromosome. Therefore, it is recommended that DNA inputs for Telo-Seq do not contain >10% of fragments shorter than 10 Kbp, as shorter fragments may fail to map to chromosome arms, leading to poor coverage. Fragment distributions can be assessed by pulsed-field gel electrophoresis (PFGE); however, we recommend a quantitative method such as the Agilent Femto Pulse.
Achieving optimal fragment distribution
Several DNA extraction methods have been tested at Oxford Nanopore Technologies. Optimal fragment distributions for Telo-Seq performance have been observed in the following extraction methods:
- QIAGEN PureGene.
- New England Biolabs Monarch HMW extraction kit (following the blood and cell extraction method with a slow agitation speed at 300 rpm).
Extraction methods like QIAGEN DNeasy and QIAGEN Genomic-tip were found to provide less suitable fragment distributions and are therefore not recommended for Telo-Seq. Other extraction methods may be used, but it is important to ensure that >90% of the fragment distribution is longer than 10 Kbp.
Correcting sub-optimal fragment distribution
If the sample has a high percentage (>10%) of fragments below 10 Kbp, consider using the Short Fragment Eliminator Kit (EXP-SFE001) to deplete shorter fragments. The use of EXP-SFE001 has been shown to improve Telo-Seq performance for samples with a large proportion of fragments below 10 Kbp.
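The 90% / 10 Kbp criterion can be checked from any binned fragment size profile. The sketch below assumes the profile has been exported as (fragment length, weight) pairs; that input format and the function names are assumptions for illustration, not an instrument export format. Whether the weight is a molecule count or a mass signal depends on the instrument and should be interpreted accordingly.

```python
def fraction_longer_than(bins, threshold_bp=10_000):
    """Fraction of the total weight in bins whose fragment length exceeds threshold_bp.

    bins: sequence of (fragment_length_bp, weight) pairs (assumed input format).
    """
    bins = list(bins)
    total = sum(weight for _, weight in bins)
    if total <= 0:
        raise ValueError("fragment profile contains no signal")
    above = sum(weight for length, weight in bins if length > threshold_bp)
    return above / total


def needs_short_fragment_depletion(bins, min_fraction=0.90):
    """True if less than min_fraction of the sample is longer than 10 Kbp."""
    return fraction_longer_than(bins) < min_fraction


good_sample = [(5_000, 5), (20_000, 45), (50_000, 50)]
poor_sample = [(5_000, 20), (20_000, 80)]
print(fraction_longer_than(good_sample), needs_short_fragment_depletion(good_sample))
print(fraction_longer_than(poor_sample), needs_short_fragment_depletion(poor_sample))
```

A sample flagged by this check would be a candidate for short fragment depletion as described above.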
Sample origin
Telo-Seq development and validation at Oxford Nanopore Technologies primarily used HMW gDNA extracted from GM24385 cell culture, where the telomere and sub-telomere are an average of 8 Kbp long. While the underlying chemistry should be compatible with any DNA samples containing the repetitive telomeric n(GGTTAG) motif, performance with samples other than those outlined in this document has not been validated internally and results may vary. Some organisms may have significantly longer telomeres or sub-telomeres which could impact chromosomal mapping, or a different repetitive telomeric motif.
It is important to consider the restriction enzyme cut site positions (see Restriction enzyme choice). If processing samples which are non-human in origin, we recommend performing an in-silico digestion of the reference genome to determine theoretical cut sites and verify whether there is any cleavage within the telomere or sub-telomere. Please note, Telo-Seq is not compatible with samples that do not have a repetitive telomeric n(GGTTAG) motif as the barcoded telo-adapters will not hybridise to different motifs effectively.
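A minimal sketch of such an in-silico digestion is shown below. It searches a sequence for the EcoRV recognition site (GATATC, cut as GAT^ATC to leave blunt ends) and reports the length of the terminal fragment. The function names are hypothetical, and the sketch assumes each chromosome arm sequence is supplied as a plain string oriented with the telomere end at index 0; with a real reference, the contigs would need to be read from FASTA and oriented first.

```python
ECORV_SITE = "GATATC"  # EcoRV recognition sequence; blunt cut between GAT and ATC


def find_sites(seq, site=ECORV_SITE):
    """Return the 0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def terminal_fragment_length(arm_seq, site=ECORV_SITE, cut_offset=3):
    """Length of the fragment running from the telomere end to the first cut.

    Assumes arm_seq starts at the chromosome end (telomere first).
    Returns None if the enzyme does not cut the sequence at all.
    """
    sites = find_sites(arm_seq, site)
    if not sites:
        return None
    return sites[0] + cut_offset


# Toy arm: 5 telomeric repeats, 12 bp of sub-telomere, then an EcoRV site.
toy_arm = "TAACCC" * 5 + "ACGT" * 3 + "GATATC" + "ACGT"
print(find_sites(toy_arm))                # [42]
print(terminal_fragment_length(toy_arm))  # 45
```

Comparing the first cut position with the expected telomere and sub-telomere length for each arm indicates whether the enzyme would cleave within the region needed for mapping.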
Q score filtering
The wf-teloseq analysis workflow has a Q score filtering step integrated into the workflow. There is no need to modify the default Q score parameters within MinKNOW when setting up a Telo-Seq experiment or processing the data downstream.
SUP vs HAC basecalling
Whilst the telomere itself is a repetitive polymer of n(GGTTAG), it can contain minor variations within the repeating sequence. For this reason, we recommend using the SUP basecalling model for the best sequencing accuracy and alignment accuracy to each chromosomal arm (see Figure 5). Basecalling and demultiplexing are done post-sequencing using scripts provided with wf-teloseq.
Figure 5. Comparison of mapped read counts across chromosome arms for three HG002 samples. Data shows increased mapped reads when using the super accuracy (SUP) basecalling model versus the high accuracy (HAC) model. Samples were processed as part of a 12-plex GridION sequencing run, and basecalling was performed with v5.0.0 models using Dorado v0.9.1.
Adaptive sampling
Adaptive sampling for Telo-Seq experiments has been trialled at Oxford Nanopore Technologies. No significant improvement in on-target sequencing was observed with adaptive sampling. Therefore, we do not recommend combining adaptive sampling with Telo-Seq.
Analyse
wf-teloseq
The Telo-Seq analysis pipeline, wf-teloseq, is hosted on GitHub. The workflow can be run from the command line or within the EPI2ME Desktop application. Prior to analysis, barcoded samples can be basecalled and demultiplexed following the instructions provided in the protocol Telomere multiplex sequencing (Telo-seq) from DNA using EXP-NBA114, EXP-ULA001, EXP-LFB001 and EXP-AUX003.
Workflow pathways
There are two options to choose from when analysing Telo-Seq data, based on the desired output: global telomere length estimation and individual chromosome arm telomere length estimation.
Pathway 1: Global telomere length estimation
The wf-teloseq pathway 1 provides a mean telomere length estimation of all identified telomeric reads without mapping. The pathway requires a minimum of 300 to 500 unmapped telomeric reads per sample (defined as reads containing repeats of n(TAACCC), as Telo-Seq sequences the “C-strand” of the telomere) for a representative measurement, and does not need a matching genome reference. This is a good option if read count is very low or a matching sample reference is not available. As a guide, 8000 total reads typically correspond to approximately 500 telomeric reads. However, the fraction of telomeric reads is influenced by multiple factors in sample preparation. This analysis pathway takes approximately 4 minutes to run for a 12-plex sample sequenced on a GridION, and 12 minutes if sequenced on a PromethION, when using 16 threads.
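As a rough illustration of what "reads containing repeats of n(TAACCC)" means, the sketch below counts reads that contain a run of consecutive TAACCC copies. It is not the wf-teloseq implementation: the minimum of 10 consecutive repeats and the function names are assumptions chosen for the example, and exact matching is a crude criterion for reads that contain sequencing errors or variant repeats.

```python
import re

# MIN_REPEATS is an illustrative assumption, not the criterion used by wf-teloseq.
MIN_REPEATS = 10
TELO_PATTERN = re.compile(r"(?:TAACCC){%d,}" % MIN_REPEATS)


def is_telomeric(read_seq):
    """True if the read contains at least MIN_REPEATS consecutive TAACCC copies."""
    return TELO_PATTERN.search(read_seq.upper()) is not None


def count_telomeric(reads):
    """Count reads that pass the simple repeat test."""
    return sum(1 for read in reads if is_telomeric(read))


def enough_for_global_estimate(n_telomeric, minimum=300):
    """Compare a count against the lower bound quoted for pathway 1."""
    return n_telomeric >= minimum


reads = [
    "TAACCC" * 12 + "ACGTACGT",  # telomeric repeat followed by other sequence
    "ACGT" * 30,                  # no telomeric repeat
    "TAACCC" * 3 + "GGG",         # too few repeats to count
]
print(count_telomeric(reads))  # 1
```

Using the guide figure above, roughly 500 of 8000 reads (about 6%) would be expected to pass a telomeric filter, although this fraction varies with sample preparation.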
Pathway 2: Individual chromosome arm telomere length estimation for samples with a matched reference
The wf-teloseq pathway 2 may be used to determine counts and telomere lengths of each individual chromosomal arm, as well as the aggregate of those measurements for a global sample telomere length. This analysis requires a reference that matches the sample, with each chromosome arm represented as a contig (for instance, the P and Q arms of chromosome 1 would be two contigs). Sample-specific references for a multi-sample run can be supplied via the sample_sheet parameter in the workflow. For specific chromosomal arm coverage, it is recommended that a minimum of 10x coverage per telomere arm is achieved. Spread across 92 telomere arms (human), 10x coverage may be achieved with a set of 1500 telomeric reads. Achieving approximately 30x coverage (e.g. 3000 filtered telomere reads) is desirable, as it further decreases variance and provides a more robust estimate of telomere length. This analysis pathway takes approximately 1 hour 50 minutes for a 12-sample PromethION run using 20-40 K telomeric reads per sample, and 11 minutes for a GridION run using 1-3 K telomeric reads per sample, when using 16 threads.
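The relationship between read count and per-arm coverage is simple arithmetic, sketched below with a hypothetical helper. It gives the mean coverage only; because arms are not sampled evenly, more reads than the mean suggests are needed before every arm reaches a given depth, which is presumably why 1500 reads are quoted for a 10x minimum.

```python
N_ARMS_HUMAN = 92  # 46 chromosomes x 2 arms


def mean_arm_coverage(telomeric_reads, n_arms=N_ARMS_HUMAN):
    """Mean number of reads per arm if reads were spread evenly across arms."""
    return telomeric_reads / n_arms


for n_reads in (500, 1000, 1500, 3000):
    print(n_reads, round(mean_arm_coverage(n_reads), 1))
# 500 5.4
# 1000 10.9
# 1500 16.3
# 3000 32.6
```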
Example wf-teloseq output for a human cell line dataset
We conducted duplicate 12-plex Telo-Seq experiments using high molecular weight (HMW) genomic DNA (gDNA) derived from three distinct sources. Six barcodes were allocated to GM24385 (HG002), three to GM24631 (HG005), and the remaining three barcodes to peripheral blood mononuclear cell (PBMC) DNA from Bos taurus (cow). An input of 5 µg gDNA was used per barcode. One replicate was sequenced on the GridION platform, while the second replicate was sequenced on the PromethION. This dataset is available via the Oxford Nanopore Technologies Open Data portal.
After analysing a representative barcode dataset with wf-teloseq via mapping to a matching reference, we can see that telomeres have broadly consistent coverage (exceptions being the chr21 maternal and paternal P arms, which have >60 Kbp read lengths) (Figure 6, top) and telomere length distribution (Figure 6, bottom). In this HG002 example, two chromosome arms are identical, which is why 91 chromosome arms are used (Chr13_22_paternal_P is used to indicate both Chr13_paternal_P and Chr22_PATERNAL_P). Sorting by telomere median length shows no significant bias in read count.
Figure 6. Coverage across the maternal and paternal chromosome arms for a sample with matching reference (HG002 v1.01 T2T) analysed using wf-teloseq Pathway 2. Telomere read count (top panel) and median length (bottom panel) per chromosome arm. In this example two chromosome arms are identical which is why 91 chromosome arms are used (Chr13_paternal_P is used to indicate both Chr13_paternal_P and Chr22_PATERNAL_P).
Non-matching reference
When mapping reads to a non-matching reference genome (HG002 cell line data to HG005 reference), we observed a substantial fragmentation of read groups across chromosome arms (Figure 7). This can also be seen in IGV (Figure 8), where reads from multiple chromosome arms appear as distinct groups with different variants incorrectly mapped to a single arm. To quantify this effect, we mapped HG002 sequencing data to both its matching reference (HG002) and a non-matching reference (HG005), then calculated how reads that were grouped together in the matching reference map across multiple chromosome arms in the non-matching reference (see Figure 8 and Figure 9). This demonstrates the importance of using a matching reference when available.
Figure 7. Impact of reference genome selection on chromosome-specific read mapping. Chromosome-arm assigned read counts are shown for the same dataset mapped to a matching versus a non-matching reference genome. HG002 sample DNA was sequenced on a PromethION (1 of 12 samples) and mapped against two different reference genomes (HG002, HG005). The top panel shows read mapping when using the HG002 (matching) reference genome, while the bottom panel displays mapping results using the HG005 (non-matching) reference genome. Both panels are sorted from lowest to highest read counts to facilitate direct comparison. The distribution reveals significant differences in mapping when using a reference genome that matches the sample's genetic background (HG002-to-HG002) versus a non-matching reference (HG002-to-HG005). These differences highlight the importance of reference genome selection in telomere-enriched read mapping for chromosome-arm specific analyses where haplotype-specific telomere lengths are different.
Figure 8. Impact of reference genome selection on chromosome-specific read mapping. IGV screenshot displaying HG002 mapped to the matching HG002 reference (left panel) and to the non-matching HG005 reference (right panel). Reads mapped to HG005 reveal four distinct read groups originating from four different chromosome arms.
Figure 9. Changes in chromosome arm assignment when using matching vs non-matching reference. Top panel: Each bar represents a chromosome arm in the HG002 reference, and the height indicates the proportion of reads from an HG002 sample that mapped to that arm and that also mapped together to a single target chromosome arm when using a non-matching reference (HG005). The data is sorted from highest preservation (left) to lowest preservation (right). Many chromosome arms maintain nearly perfect aggregation (close to 100%), some arms show moderate fragmentation (middle section), and a few arms show significant fragmentation (right side) where only 20-50% of reads stay together. Bottom panel: Number of source (HG002) chromosome arms contributing to each target (HG005) chromosome arm. Each bar represents a chromosome arm in the HG005 reference, showing how many different HG002 chromosome arms contribute more than 15% of reads to that location. Higher values indicate target (HG005) chromosome arms receiving reads from multiple source (HG002) chromosome arms, i.e. where reads mapping to multiple chromosome arms in HG002 map to a single chromosome arm in HG005. While the top panel shows how source reads fragment across targets, the bottom panel reveals where multiple sources converge onto the same target regions.
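The two quantities plotted in Figure 9 can be sketched as follows, assuming per-read arm assignments are available from both alignments as dictionaries of read ID to arm name. This input format and the function names are assumptions for illustration, and the sketch interprets the 15% threshold as a share of the reads landing on the target arm; it is not the code used to produce the figure.

```python
from collections import Counter, defaultdict


def arm_preservation(matched, unmatched):
    """For each source arm, the largest fraction of its reads sharing one target arm.

    matched / unmatched: dict of read_id -> arm name in the matching and
    non-matching reference respectively (assumed input format).
    """
    targets_by_source = defaultdict(Counter)
    for read_id, source_arm in matched.items():
        target_arm = unmatched.get(read_id)
        if target_arm is not None:  # skip reads unmapped in the second reference
            targets_by_source[source_arm][target_arm] += 1
    return {
        source: max(counts.values()) / sum(counts.values())
        for source, counts in targets_by_source.items()
    }


def contributing_sources(matched, unmatched, min_fraction=0.15):
    """For each target arm, how many source arms supply more than min_fraction of its reads."""
    sources_by_target = defaultdict(Counter)
    for read_id, source_arm in matched.items():
        target_arm = unmatched.get(read_id)
        if target_arm is not None:
            sources_by_target[target_arm][source_arm] += 1
    result = {}
    for target, counts in sources_by_target.items():
        total = sum(counts.values())
        result[target] = sum(1 for n in counts.values() if n / total > min_fraction)
    return result


# Toy example: arm A fragments across X and Y; arms A and B converge on X.
matched = {"r1": "A", "r2": "A", "r3": "A", "r4": "A", "r5": "B", "r6": "B"}
unmatched = {"r1": "X", "r2": "X", "r3": "X", "r4": "Y", "r5": "X", "r6": "X"}
print(arm_preservation(matched, unmatched))      # {'A': 0.75, 'B': 1.0}
print(contributing_sources(matched, unmatched))  # {'X': 2, 'Y': 1}
```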
Methylation
Telo-Seq utilises native library preparation, preserving DNA modifications present on the nucleic acids being sequenced. This ensures that sequencing data contains the signals for any modifications, such as methylation, provided these modifications are present in the sample. Users can interrogate these modifications using a read- and motif-centric approach or, if a suitable reference genome is available, a reference-centric approach. However, wf-teloseq does not currently perform methylation analysis directly. Users interested in exploring methylation data would need to process it separately using third-party tools. We welcome researchers interested in this area to get in touch for further discussion or to explore collaborative solutions.
Data availability
A 12-plex Telo-Seq dataset is available via the Oxford Nanopore Technologies Open Data portal. The dataset includes data for two human cell lines (HG002, 6 samples; HG005, 3 samples) and cow peripheral blood mononuclear cells (PBMCs, 3 samples) multiplexed together and sequenced on a PromethION and a GridION Flow Cell. The files included are the raw signal data (.pod5), basecalled output using SUP and HAC models (.bam), and full output of the wf-teloseq analysis workflow.
Telo-Seq method design considerations and validation
Dominant Telo-frames
Telomeres with the repetitive n(GGTTAG) sequence motif can present a 3’ overhang in one of six different frames (as illustrated in Figure 10). In humans, there is evidence suggesting that the n(GGTTAG) frame is dominant (Smoom et al., 2023). To minimise redundancy in oligonucleotides and reagents, internal investigations were conducted to corroborate this and ascertain the impact on Telo-Seq.
Figure 10. The six possible frames of the telomeric n(GGTTAG) 3' overhang, with the first seven bases highlighted in green.
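The six frames are simply the six rotations of the repeat unit, which can be enumerated as below. The function names are hypothetical. The same sketch also shows why C-strand reads are searched for n(TAACCC): the reverse complement of GGTTAG is CTAACC, and TAACCC is one of its rotations.

```python
def overhang_frames(motif="GGTTAG"):
    """Return every rotation of the repeat unit, one per possible frame."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


print(overhang_frames())
# ['GGTTAG', 'GTTAGG', 'TTAGGG', 'TAGGGT', 'AGGGTT', 'GGGTTA']

c_strand_unit = reverse_complement("GGTTAG")
print(c_strand_unit)                               # CTAACC
print("TAACCC" in overhang_frames(c_strand_unit))  # True
```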
A range of samples from various origins were analysed to assess the prevalence of different 3' overhang frames. Custom Telo-Adapters complementary for each of the 6 possible frames were used to specifically capture and report the different frames. The results in Figure 11 demonstrate that the n(GGTTAG) frame is indeed dominant, as reported (Smoom et al., 2023). Therefore, it is recommended that Telo-Seq users utilise only the barcoded Telo-Adapter compatible with n(GGTTAG) which is supplied in the Telo-Seq protocol. Data from the remaining five frames is minimal and will not be captured.
Figure 11. Distribution of telomeric 3' overhangs across different samples, demonstrating that the majority of telomeric output data comes from endings captured by the Telo-Adapter complementary to the GGTTAG frame.
FTU sequencing tether
The Telo-Seq protocol advises the use of the FTU flow cell tether from the Ultra-Long Auxiliary Vials expansion pack (EXP-ULA001). Figure 12 demonstrates that more telomeric reads are obtained when using FTU than when using the FCT flow cell tether supplied with the Sequencing Auxiliary Vials V14 expansion pack (EXP-AUX003).
Figure 12. Telomeric output of FCT and FTU tethered experiments. Mean output ± SD relative to FCT output of a 12-plex sequenced on PromethION for 48 hours.
Restriction enzyme choice
In Telo-Seq experiments the telomeric read length is independent of the length of the telomere itself. Telomeric fragment length is dependent on the input fragment distribution (see the fragment distribution section of this document) as well as the proximity of the restriction digestion site to the telomere on individual chromosomal arms.
Alternative restriction enzymes will yield different fragment distributions for each chromosomal arm due to the different recognition sites of the enzymes. By changing restriction enzyme, the telomeric fragment length changes, but does so independently of the telomere length. This may result in improved capture of the fragment and consequently a greater representation of the corresponding telomere arm. Alternatively, it may result in reduced representation of the telomere arm, particularly if the sub-telomere becomes too short to map to a specific chromosome.
To demonstrate the impact of restriction enzyme choice, a bulk Telo-Adapted human sample was split into aliquots and subjected to restriction digestion with one of four different enzymes selected for their in silico cut site proximity to the telomere. Figure 13 shows how the mean telomeric fragment read length of the restriction enzymes EcoRV, NcoI, SpeI, and ScaI varies for different telomeric arms, and in turn impacts the relative representation of the corresponding telomere arms. The telomere arms shown in the graph were selected for the biggest deviation in the proportionate representation relative to all 92 telomere arms, where a change of restriction enzyme, and therefore telomeric fragment length, can result in more than a 2-fold change in the relative representation.
Figure 13. Mean telomeric fragment read lengths ± SD (bars, left Y axis) and corresponding percentage telomeric reads (lines, right Y axis) across 6 different chromosome arms. As fragment read length varies between different restriction enzymes, it impacts the relative representation of different telomeric arms.
For human samples, EcoRV achieves a balanced telomeric fragment length that also allows sufficient chromosomal context for mapping. However, there are some chromosomal arms where the EcoRV fragments may be over 50 Kbp long as the recognition site is far from the 3’ telomeric overhang.
Accurate telomere length estimation
While Telo-Seq enables both global and chromosome arm-specific telomere length estimation via the wf-teloseq pipeline, we recommend using chromosome arm-level measurement when a reference is available and taking a median of the chromosome arm telomere medians. Global telomere length estimation may not be appropriate for many applications and can produce misleading results. This section draws together points made earlier in this document into a clear outline of why arm-specific measurements are essential for applications that require accurate interpretation of telomeric length.
Fragment length affects chromosomal representation
Telo-Seq relies on HMW DNA to ensure reads span into sub-telomeric regions, which are necessary for mapping to specific chromosome arms. If a sample has a poor fragment length distribution (e.g. fewer than 90% of fragments longer than 10 Kbp), some chromosomal arms will not be captured or will be underrepresented in the data.
Restriction enzyme choice may skew chromosomal arm representation
The restriction enzyme defines where DNA is cut relative to the telomere. Different enzymes produce different fragment lengths for different arms, altering how well each arm is represented. When using individual arm telomere assignment (wf-teloseq pathway 2), arms whose fragments lack sufficient chromosomal context for alignment to the reference genome may not map and will therefore be under-represented, depending on the restriction enzyme and sample. This is highlighted in Figure 13 (see Restriction enzyme choice).
Telomere lengths vary by chromosomal arm
Each of the 92 telomere arms has a unique telomere length distribution. A global average obscures this biological variability and could mask meaningful differences between arms. This is highlighted in Figure 6 (see Example wf-teloseq output for a human cell line dataset) and in Figure 14.
Mean of telomeric read lengths is not the same as the median of medians chromosomal arm telomere lengths
Pooling all telomeric reads into a single estimate disproportionately weights over-sampled chromosomal arms and results in a global telomere length estimate with a wider interquartile range. This is demonstrated in Figure 14, where the global length estimate is very broad.
A more precise telomere length estimation may be obtained by measuring each chromosome arm individually, where multiple measurements of the same chromosomal arm have much tighter interquartile ranges around the median measurement. It is clear in Figure 14 that each of the chromosomal arms has an individual telomere length, with some shorter or longer than the others. At 0.5 K reads, the sampling depth is not sufficient to cover all the arms (0.5 K reads would be split across the 92 arms at ~5 reads each). Doubling the coverage to 1 K reads shows all chromosomal arms are represented. Increasing the coverage again to 17 K reads yields very little change in the median and interquartile range of each individual telomere length. By comparison, the global length estimation does not change for 0.5, 1 or 17 K reads, and the interquartile range remains wide compared to the individual arm measurements.
If a global length estimation is desired, we strongly recommend taking the median telomere length of each individual chromosomal arm and then calculating the median of medians for an equal representation of all the telomeres in the sample.
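The difference between a pooled estimate and a median of medians can be illustrated with a toy example. The arm names and lengths below are invented for illustration and the function names are hypothetical; the point is only that an over-sampled arm dominates the pooled value but counts once in the median of medians.

```python
from statistics import median


def median_of_medians(lengths_by_arm):
    """Median of the per-arm median telomere lengths (each arm counts once)."""
    return median(median(lengths) for lengths in lengths_by_arm.values() if lengths)


def pooled_median(lengths_by_arm):
    """Median over all reads pooled together (over-sampled arms count more)."""
    return median(x for lengths in lengths_by_arm.values() for x in lengths)


# Invented toy data: the first arm is over-sampled and has short telomeres.
lengths_by_arm = {
    "chr1_P": [3000, 3100, 2900, 3000, 3050, 2950],
    "chr2_Q": [6000, 6100],
    "chr3_P": [9000, 9100],
}
print(pooled_median(lengths_by_arm))      # 3075.0
print(median_of_medians(lengths_by_arm))  # 6050.0
```

In this toy case the pooled median is pulled towards the over-sampled short arm, whereas the median of medians reflects all three arms equally.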
Figure 14. Violin plots of telomere length distributions for down-sampled read sets. Plots are trellised vertically for down-sampled sets, and horizontally for telomeric length measurement method. For the individual chromosomal arm telomeric measurement, violins are split in half by parental chromosome. Median and interquartile range is plotted on each violin. Total of 1,000 telomeric reads captures each chromosomal arm at approximately 10X coverage. For global length estimates on the right, all telomeric reads were used, regardless of chromosomal alignment, plotting the median and IQR. In this example, the sub-telomeric regions of two chromosome arms (Chr13_paternal_P and Chr22_PATERNAL_P, grouped under Chr13_paternal_P in this plot) are identical and therefore 91 chromosome arms are shown. The merged chromosome arm (Chr13_paternal_P) telomere length shows a large distribution likely because the two arms have distinct telomere lengths.
References
Lulkiewicz, M., Bajsert, J., Kopczynski, P. et al. Telomere length: how the length makes a difference. Mol Biol Rep 47: 7181–7188 (2020). https://doi.org/10.1007/s11033-020-05551-y.
Smoom, R, et al. Telomouse—a mouse model with human-length telomeres generated by a single amino acid change in RTEL1. Nat Commun 14: 6708 (Oct 2023). https://doi.org/10.1038/s41467-023-42534-6
Podlevsky, J.D. and Chen, J.J.-L. It all comes together at the ends: Telomerase structure, function, and biogenesis. Mutat Res 730(0): 3–11 (2011). https://doi.org/10.1016/j.mrfmmm.2011.11.002
Change log
Version | Change |
---|---|
v4, May 2025 | Know-how document content and data overhaul for the fully released method: Telomere multiplex sequencing (Telo-seq) from DNA using EXP-NBA114, EXP-ULA001, EXP-LFB001 and EXP-AUX003 |
v3, March 2025 | Document reverted to v2: incorrect content had been imported in Jan-Feb 2025 |
v2, Oct 2024 | Addition of reference to article: High resolution long-read telomere sequencing reveals dynamic mechanisms in aging and cancer. |
v1, Nov 2023 | Initial publication |