Interview: Identification of structural variation in chimpanzees using optical mapping and nanopore sequencing


Daniela Soto is a Ph.D. candidate in the Integrative Genetics and Genomics program at UC Davis, working in the Dennis Lab where she is using long-read sequencing technologies to unveil the variability landscape of complex genomic regions in great apes. Ahead of her upcoming webinar with Technology Networks, we caught up with Daniela to talk about her current research and how long-read sequencing is helping to identify and characterise structural variation in great apes.

Daniela will be presenting her webinar ‘Identification of structural variation in chimpanzees using optical mapping and nanopore sequencing’ on Thursday 16th July 2020 at 5pm UK time.

What are your current research interests?

Currently, I am interested in complex genomic variation and its impact on species-specific traits in great apes. Additionally, I am interested in bioinformatics reproducibility, workflow automation, and data science.

What first ignited your interest in genomics?

When I finished my master’s degree in biochemical engineering, I knew I wanted to switch gears and work in a field involving data science, coding, and biology. To my excitement, I discovered that there was such a discipline – bioinformatics! I then landed a bioinformatics position in my home country, Chile, and later started a Ph.D. in genomics in the USA.

Can you tell us more about how long-read sequencing technology is changing your field? How has it benefited your work?

Long-read sequencing is allowing us to unveil the variability landscape of complex genomic regions – such as polymorphic structural variants – in great apes. Short reads usually lack enough sequence context to characterize large variants at breakpoint resolution or to resolve their nucleotide sequence. Laborious techniques like BAC clone sequencing therefore had to be employed to study them. With long reads, it has never been easier to detect such variants, which allows us to sequence multiple individuals and ultimately answer questions about the biological impact of these variants.

What impact could a greater understanding of structural variation within chimpanzees have?

Structural variation accounts for more genetic differences between humans and our closest living relatives, chimpanzees, than other sources of variation. Structural variants have also been a hallmark in the evolution of great apes, whose genomes are enriched in large interspersed duplications – some of them associated with regions of copy-number polymorphism. Thus, structural variants have been proposed as candidates to harbor the genetic differences underlying species-specific traits in great apes. To study their role in evolution and disease, a thorough characterization across multiple species and individuals is required. By sequencing two new chimpanzees with long reads, we aim to contribute to this larger effort to thoroughly characterize the landscape of structural variants and their functional impact.

What have been the main challenges in your research and how have you approached them?

One of the main challenges of this research is the validation of positive calls. There is still a need to manually check some of these variants, especially those involving complex rearrangements, such as combinations of multiple variant types. To increase confidence in our calls, we used short reads as an orthogonal data type, along with manual curation.

What’s next for your research? Are you planning on studying other great ape genomes?

Focusing on chimpanzees, we are currently studying large-scale rearrangements using a de novo genome assembly approach. Additionally, to further study the genetic basis of species-specific traits, we are focusing on duplication events private to the human lineage and the genes and regulatory elements within them.

What is your advice for someone getting started?

For someone starting with long reads, my main advice is to keep track of the literature. It is a rapidly evolving field, and tools and approaches change often. And for someone getting started in genomics, my advice is to find passion for what you do. It has not been easy for me to be a foreign Latina woman in the USA, especially with the uncertainty regarding our visas amid COVID-19. But when things are difficult, the passion I have for my research allows me to persevere.

Register to attend Daniela's webinar ‘Identification of structural variation in chimpanzees using optical mapping and nanopore sequencing’ on Thursday 16th July 2020 at 5pm UK time.