Characterization of full-length isoforms in single cells with FLT-seq and FLAMES
Matt Ritchie (The Walter and Eliza Hall Institute of Medical Research, Melbourne) outlined the goal of his project, which was to obtain full-length cDNA for sequencing on the Oxford Nanopore PromethION platform, to explore isoform heterogeneity at the single-cell level.
Matt’s team have developed the Full-Length Transcriptome sequencing (FLT-seq) protocol (Tian et al. (2020) bioRxiv; Jabbari & Tan (2019) protocols.io). With this method they have profiled cell lines (n=2), mouse stem cells (n=1), and peripheral blood mononuclear cells (PBMCs) from an individual with chronic lymphocytic leukaemia (CLL) (n=1), in order to investigate isoform heterogeneity in a variety of systems. Alongside FLT-seq, the team have also developed the new toolbox ‘FLAMES’ (Full-Length transcript quantification, Mutation and Splicing analysis for long-read data) (available at github.com/LuyiTian/FLAMES). This analysis pipeline identifies and quantifies isoforms from the input sequencing data and detects genetic variation; when short-read data are integrated, it also performs a joint analysis of splicing. Other types of sequence data, such as ATAC-seq, can also be integrated with the long- and short-read data.
To illustrate the approach, Matt displayed a UMAP visualisation of the clustering of CLL PBMC gene expression. Their differential transcript usage analysis between the cell types/clusters for each of the samples revealed a number of genes with different isoform expression patterns. For example, in the short-read data, differential gene expression of the ribosomal protein gene RPS24 (a housekeeping gene) was observed between normal and cancer cells in the CLL sample. ‘The benefit of our long-read data though is that now we have isoform-specific counts’, and Matt demonstrated how differential expression of the four RPS24 isoforms was apparent between the normal PBMCs and cancerous cells. Matt said that this is just one of many examples, and he could ‘speak at length’ about all the interesting genes that they found.
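The idea behind differential transcript usage can be sketched in a few lines of Python. This is a minimal illustration, not the FLAMES implementation: the function names, cluster labels, isoform labels, and counts are all hypothetical, and a plain chi-square statistic on a cluster-by-isoform count table stands in for whatever test the pipeline actually applies.

```python
def isoform_proportions(counts):
    """Convert {cluster: {isoform: count}} into per-cluster isoform proportions."""
    props = {}
    for cluster, iso_counts in counts.items():
        total = sum(iso_counts.values())
        props[cluster] = {
            iso: (c / total if total else 0.0) for iso, c in iso_counts.items()
        }
    return props


def chi_square_statistic(counts):
    """Chi-square statistic and degrees of freedom for a cluster x isoform table."""
    clusters = list(counts)
    isoforms = sorted({iso for c in counts.values() for iso in c})
    row_totals = {cl: sum(counts[cl].get(i, 0) for i in isoforms) for cl in clusters}
    col_totals = {i: sum(counts[cl].get(i, 0) for cl in clusters) for i in isoforms}
    grand_total = sum(row_totals.values())
    stat = 0.0
    for cl in clusters:
        for iso in isoforms:
            expected = row_totals[cl] * col_totals[iso] / grand_total
            if expected > 0:
                stat += (counts[cl].get(iso, 0) - expected) ** 2 / expected
    dof = (len(clusters) - 1) * (len(isoforms) - 1)
    return stat, dof


# Hypothetical isoform-level counts for one gene in two cell clusters.
toy_counts = {
    "normal": {"isoform_a": 30, "isoform_b": 10},
    "cancer": {"isoform_a": 10, "isoform_b": 30},
}
props = isoform_proportions(toy_counts)
stat, dof = chi_square_statistic(toy_counts)
print(props)
print(stat, dof)
```

A gene can show similar total expression in both clusters while the proportions of its isoforms differ, which is the signal that isoform-specific counts make visible.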
‘The final component of our analysis’ is looking for mutations, and ‘because we have full-length isoform information, we are not just restricted to only looking for mutations in the final exon’. As with the differential isoform expression analysis, Matt looked for variants whose distribution differed between the cell clusters. As an example, a specific mutation in the BCL2 gene (Gly101Val) was found only in CLL cells and not in the normal PBMCs. This is particularly interesting because this mutation is known to reduce binding of the drug that this individual was receiving, which prevents the drug from working effectively and leads to relapse. Matt stated that his future work will involve applying their FLT-seq/FLAMES protocol to additional clinical research samples, to look at isoforms correlated with treatment resistance.
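Whether a variant is unevenly distributed between two cell clusters can be checked with a test on a 2x2 table of cell counts. The sketch below uses a two-sided Fisher's exact test written with the Python standard library only; it is an assumption about how such a comparison could be made, not a description of the test used in FLAMES, and the function name and cell counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    a, b: cells with / without the variant in cluster 1
    c, d: cells with / without the variant in cluster 2
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def table_prob(k):
        # Hypergeometric probability of k variant-carrying cells in cluster 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = table_prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        p
        for p in (table_prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: variant seen in 8 of 10 cells in one cluster, 0 of 10 in the other.
p_value = fisher_exact_two_sided(8, 2, 0, 10)
print(p_value)
```

In real single-cell data, coverage at the variant position differs from cell to cell, so only cells with reads covering the site should be counted in the table.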