Pioneering single-cell sequencing — an interview with Fuchou Tang
Cells that share the same genetic blueprint can play vastly different roles within the body. To control these different functions, we expect epigenetic regulation and isoform expression to differ between cell types, but bulk analysis and legacy sequencing methods can't capture this variation.
We sat down with Fuchou Tang (Biomedical Pioneering Innovation Center, Peking University, China) to learn how he is tackling this shortfall with his cutting-edge single-cell multiomic sequencing methods. Read on to learn about these novel approaches and how they could shed light on complex tissues, germline cell development, and tumorigenesis.
'During my postdoc research, I had already developed the first single-cell transcriptome sequencing technology in the world.'
Fuchou Tang, Biomedical Pioneering Innovation Center, Peking University, China
From pioneering postdoc to principal investigator in single-cell sequencing research
What are your current research focuses?
My lab has two major interests: firstly, developing new single-cell omics sequencing technologies for genomics studies; secondly, using these cutting-edge technologies to study the epigenetic regulation of human germline cell development as well as the tumorigenesis process.
(1) Developing new single-cell omics sequencing technologies for genomics studies:
For this, we are trying to develop single-molecule, single-cell multiomics sequencing technologies. There is still a huge amount of ‘dark matter’ in the different layers of our ‘omics’. For example, it is well known that there are about 20,000 protein-coding genes in the human genome. Nearly all single-cell transcriptome studies treat one gene in a cell as a functional unit. However, we know that these 20,000 protein-coding genes can generate 170,000 different RNA isoforms, which in turn generate 70,000 different proteins. That means, on average, one protein-coding gene can generate 8–9 different RNA isoforms and 3–4 different proteins.
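The averages quoted above follow directly from the three approximate figures given in the answer. A minimal Python check, using only those figures (the variable names are illustrative):

```python
# Approximate figures quoted in the interview.
PROTEIN_CODING_GENES = 20_000
RNA_ISOFORMS = 170_000
PROTEINS = 70_000

# Average number of isoforms and proteins produced per protein-coding gene.
isoforms_per_gene = RNA_ISOFORMS / PROTEIN_CODING_GENES
proteins_per_gene = PROTEINS / PROTEIN_CODING_GENES

print(f"RNA isoforms per gene: {isoforms_per_gene}")
print(f"Proteins per gene: {proteins_per_gene}")
```

The two ratios fall inside the quoted ranges of 8–9 isoforms and 3–4 proteins per gene.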
Different proteins from the same gene can have different or even opposite biological functions. For example, the BCL-X (BCL2L1) gene can generate two different proteins: BCL-X(L) and BCL-X(S). BCL-X(L) represses apoptosis whereas BCL-X(S) promotes apoptosis. So, it is not enough to know whether the BCL-X gene is expressed in an individual cell; you need to know whether the RNA isoform for BCL-X(L), for BCL-X(S), or for both is expressed in that cell.
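The BCL-X example can be made concrete with a toy count table. The sketch below is illustrative only: the read list is invented, and it assumes each full-length read has already been assigned to an isoform by some upstream step. It shows how a gene-level count hides the distinction that an isoform-level count keeps.

```python
from collections import Counter

# Hypothetical full-length reads from one cell, each already assigned to an
# isoform (the assignment step itself is assumed, not shown).
reads = ["BCL-X(L)", "BCL-X(L)", "BCL-X(S)", "BCL-X(L)"]

# Both isoforms come from the same gene but have opposite effects.
ISOFORM_TO_GENE = {"BCL-X(L)": "BCL2L1", "BCL-X(S)": "BCL2L1"}
ISOFORM_EFFECT = {
    "BCL-X(L)": "represses apoptosis",
    "BCL-X(S)": "promotes apoptosis",
}


def gene_level_counts(isoform_reads):
    """Collapse every read onto its gene, as gene-level analyses do."""
    return Counter(ISOFORM_TO_GENE[r] for r in isoform_reads)


def isoform_level_counts(isoform_reads):
    """Keep one count per isoform."""
    return Counter(isoform_reads)


print(gene_level_counts(reads))
for isoform, n in isoform_level_counts(reads).items():
    print(isoform, n, ISOFORM_EFFECT[isoform])
```

At gene level the cell simply "expresses BCL2L1"; only the isoform-level table shows which of the two opposing products dominates.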
More importantly, it is estimated that there are many more currently unknown (novel) RNA isoforms from these 20,000 protein-coding genes in our genome. Only single-molecule, single-cell transcriptome sequencing technologies can systematically resolve this complexity within a cell. So, my lab is trying to systematically develop single-molecule, single-cell sequencing technologies for transcriptome, genome, epigenome (chromatin accessibility, 3D genome structure, etc.), and multiomics analyses.
(2) Using these cutting-edge technologies to study epigenetic regulation:
Germline cells are crucial for transmitting genetic information from generation to generation and keeping a species stable for millions of years. However, at many stages during human embryonic development, germline cells are rare and difficult to access. More importantly, the germ cells are nearly always mixed with other types of cells in human embryos. Even in mouse models, it is extremely difficult to get millions of pure (the same cell type at the same developmental stage) germ cells for bulk sequencing studies.
Only single-cell omics methods can analyse their gene regulation features thoroughly. During the past ten years, with the help of single-cell sequencing technologies, we have made tremendous progress in understanding the epigenetic regulation of human germ cell development.
How did your scientific research start and what led you to these current research focuses?
When I did my postdoc research in Azim Surani's lab, I worked on germline cell development using mouse models. After I set up my own lab, I thought it was natural to move one step further: to directly study the epigenetic regulation of human germline cell development. During my postdoc research, I had already developed the first single-cell transcriptome sequencing technology in the world (2009). So it also felt natural to develop other single-cell omics sequencing technologies in my own lab to facilitate developmental biology studies.
Published research
Can you briefly explain what the techniques scNanoHi-C and scNanoATAC-seq are used to investigate?
The technique scNanoATAC-seq can be used to simultaneously analyse chromatin accessibility and genetic changes (particularly structural variations) in an individual cell, especially at complex genomic regions. scNanoHi-C can be used to analyse haplotype-resolved 3D genome structures in an individual diploid cell (the majority of cells in the human body are diploid).
Notably, scNanoHi-C can routinely identify higher-order (multi-way) chromatin interactions in an individual cell: for example, multiple enhancers simultaneously binding to the same promoter to promote its transcriptional activity, or one enhancer simultaneously binding to multiple promoters to promote their transcriptional activities.
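To illustrate what "multi-way" means in practice, the sketch below treats one long concatemer read as a list of the genomic loci its fragments map to. This is a simplified illustration, not the scNanoHi-C analysis pipeline; the locus labels and function names are hypothetical. A read joining three or more distinct loci records a higher-order interaction, whereas pairwise methods only see the individual pairs.

```python
from itertools import combinations


def pairwise_contacts(fragments):
    """Expand one concatemer read into all pairs of distinct loci it joins."""
    loci = sorted(set(fragments))
    return list(combinations(loci, 2))


def is_multiway(fragments):
    """A read is a multi-way contact if it joins three or more distinct loci."""
    return len(set(fragments)) >= 3


# Hypothetical read: two enhancers captured together with one promoter.
read = ["enhancer_A", "promoter_P", "enhancer_B"]

print(is_multiway(read))
print(pairwise_contacts(read))
```

The pair list alone cannot show that both enhancers touched the promoter in the same molecule at the same time; that information exists only because all three fragments sit on one long read.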
Why did you decide to use Oxford Nanopore technology in this research?
Long reads are essential to analyse higher-order chromatin interactions within an individual cell. Since these methods are based on amplification of genomic DNA fragments within an individual cell, the methylation information is lost in our scNanoATAC-seq and scNanoHi-C methods.
We have since been trying to develop a single-molecule-based DNA methylome sequencing technology.
Read the team's publication, in Cell Research (2023), demonstrating a nanopore sequencing-based method (scNanoCOOL-seq) for combined analysis of genome (copy number variation), methylome, chromatin accessibility, and transcriptome, in the same individual cell.
Can you explain why phasing is important in this research?
It is estimated that about 10–30% of chromatin interactions are in trans, that is, between DNA fragments from different chromosomes. However, the two homologous chromosomes in a cell are usually located at different positions (different chromosome territories) in the nucleus, so they have different neighbouring chromosomes and different trans-chromatin interactions. In other words, trans-chromatin interactions are usually allele-specific. So, you have to phase the genome before you can identify the trans-chromatin interactions (allele-specific chromatin interactions).
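The dependence on phasing can be sketched in a few lines. The example below is a simplified illustration with hypothetical names and data: each contacting fragment carries a chromosome and, if phasing succeeded, a haplotype label. A contact between different chromosomes can be called trans without phasing, but it can only be assigned to a specific pair of homologues when both fragments are phased.

```python
from typing import NamedTuple, Optional


class Fragment(NamedTuple):
    chrom: str
    haplotype: Optional[str]  # "maternal", "paternal", or None if unphased


def classify_contact(a, b):
    """Return (kind, alleles); alleles is None when either end is unphased."""
    kind = "cis" if a.chrom == b.chrom else "trans"
    if a.haplotype is None or b.haplotype is None:
        return kind, None
    return kind, (f"{a.chrom}:{a.haplotype}", f"{b.chrom}:{b.haplotype}")


phased = classify_contact(
    Fragment("chr1", "maternal"), Fragment("chr5", "paternal")
)
unphased = classify_contact(Fragment("chr1", None), Fragment("chr5", "paternal"))

print(phased)
print(unphased)
```

Without the haplotype labels, the second contact is known to be trans but cannot be attributed to either copy of chr1, which is the ambiguity that phasing removes.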
How might looking at the 3D genome in single cells impact scientific research in areas such as cancer?
It is well known that 3D genome structure changes drastically during tumorigenesis and is very likely to contribute to the tumorigenesis process. However, tumour tissues are always a mixture of many different types of cell. Even just the cancer cells in tumour tissues can have different genetic clones and subpopulations. So, it is essential to use single-cell 3D genome structure analysis methods to analyse the chromatin interactions in the tumorigenesis process.
In your paper on scNanoHi-C, it was suggested that extrachromosomal DNA (ecDNA) affects the 3D genome and maybe oncogene expression. Were you surprised by these results?
We are not surprised but excited by these findings. Now we have a concrete way to study the looping and interaction of the enhancers and promoters within an ecDNA molecule. We can also confidently study the interaction of the enhancers in an ecDNA molecule with the promoters within a linear chromosome, especially the higher-order complex interactions. We will definitely investigate them further.
Research highlights and challenges
'We believe that the era of third-generation, single-cell omics sequencing technologies is coming.'
Fuchou Tang, Biomedical Pioneering Innovation Center, Peking University, China
Do you have any highlights or successes that stand out in your research?
We believe that the era of third-generation, single-cell omics sequencing technologies is coming. It will help us to resolve many mysteries of the ‘dark matter’ in different layers of our omics. For example, there are about 14,000 pseudogenes in the human genome and many of them are transcribed in the cells, and the transcripts from many of these pseudogenes are functionally important for the cells. However, short-read platform-based single-cell transcriptome sequencing technologies in general cannot accurately identify the transcripts from the pseudogenes. Third-generation single-cell transcriptome sequencing technologies can reliably resolve this issue.
What have been the main challenges in this research and how have you approached them?
There are two major hurdles when developing single-cell omics sequencing approaches: (1) how to amplify long DNA fragments evenly and efficiently, since most previous amplification methods are suitable for short DNA fragments but not for long ones; and (2) how to computationally handle long reads with higher sequencing error rates (1–10% instead of 0.1%).
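To give a sense of scale for the second hurdle, the sketch below multiplies the quoted error rates by a read length. The 10,000-base read length is an assumption chosen for illustration and does not come from the interview; the error rates are the ones quoted above.

```python
def expected_errors(read_length, error_rate):
    """Expected number of erroneous bases in a read of the given length."""
    return read_length * error_rate


READ_LENGTH = 10_000  # assumed read length in bases, for illustration only

for label, rate in [("0.1%", 0.001), ("1%", 0.01), ("10%", 0.10)]:
    errors = expected_errors(READ_LENGTH, rate)
    print(f"error rate {label}: about {errors:.0f} errors per read")
```

Under this assumption a single long read can carry tens to hundreds of times more errors than the lower rate would produce, which is why analysis software has to tolerate or correct them.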
Future research
'We believe that single-molecule sequencing technologies will revolutionise the single-cell omics field within the next 3 to 5 years and change the genomics field profoundly.'
Fuchou Tang, Biomedical Pioneering Innovation Center, Peking University, China
What are you most excited about in your future research?
We are most excited by the possibility that within the next several years we will be able to see the ‘dark matter’ in our transcriptome, genome, and epigenome, in every individual cell in our body, and understand how each contributes to the gene regulation network in human biology.
Find out more:
Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 6:377–382 (2009). DOI: https://doi.org/10.1038/nmeth.1315
Fan, X. et al. Single-cell RNA-seq analysis of mouse preimplantation embryos by third-generation sequencing. PLoS Biol. 18(12):e3001017 (2020). DOI: https://doi.org/10.1371/journal.pbio.3001017
Fan, X., Yang, C., and Li, W. et al. SMOOTH-seq: single-cell genome sequencing of human cells on a third-generation sequencing platform. Genome Biol. 22:195 (2021). DOI: https://doi.org/10.1186/s13059-021-02406-y
Hu, Y., Jiang, Z., and Chen, K. et al. scNanoATAC-seq: a long-read single-cell ATAC sequencing method to detect chromatin accessibility and genetic variants simultaneously within an individual cell. Cell Res. 33:83–86 (2022). DOI: https://doi.org/10.1038/s41422-022-00730-x
Xie, H. et al. De novo assembly of human genome at single-cell levels. Nucleic Acids Res. 50(13):7479–7492 (2022). DOI: https://doi.org/10.1093/nar/gkac586
Xie, H. et al. Long-read-based single sperm genome sequencing for chromosome-wide haplotype phasing of both SNPs and SVs. Nucleic Acids Res. 51(15):8020–8034 (2023). DOI: https://doi.org/10.1093/nar/gkad532
Liao, Y., Liu, Z., and Zhang, Y. et al. High-throughput and high-sensitivity full-length single-cell RNA-seq analysis on third-generation sequencing platform. Cell Discov. 9:5 (2023). DOI: https://doi.org/10.1038/s41421-022-00500-4
Li, W. and Lu, J. et al. scNanoHi-C: a single-cell long-read concatemer sequencing method to reveal high-order chromatin structures within individual cells. Nat. Methods 20:1493–1505 (2023). DOI: https://doi.org/10.1038/s41592-023-01978-w
Lin, J. and Xue, X. et al. scNanoCOOL-seq: a long-read single-cell sequencing method for multi-omics profiling within individual cells. Cell Res. 33:879–882 (2023). DOI: https://doi.org/10.1038/s41422-023-00873-5