Interview: Generating microbial insights from nanopore metagenomic data using an open-source, cloud-based metagenomics tool for researchers


Sara Simmonds is a computational biologist at the Chan Zuckerberg Initiative (CZI), where she works closely with scientists, engineers, and the product team to develop open-source software for metagenomic (CZ ID) and single-cell (CZ CELLxGENE) applications.

We caught up with Sara to discuss her research background, how tools for metagenomics sequencing are helping researchers investigate emerging infectious diseases, and what her next steps are in developing metagenomic analysis platforms. You can hear more about Sara’s work in her webinar ‘Generating microbial insights from nanopore metagenomic data using an open-source, cloud-based metagenomics tool for researchers’.

Could you share your research background and what first ignited your interest in genomics?

My PhD, from UCLA, is in evolutionary biology and ecology. I first became interested in genomics there; this was at the start of the ‘next-generation sequencing’ revolution, which made it possible to sequence DNA genome-wide. Before that, we were laboriously sequencing single genes or multiplexing a small number of genes. Armed with genomic data, I went on to research the evolution of biodiversity in different ecosystems and worked to manage endangered salmon species in the wild.

Could you dive deeper into metagenomics and how this technique has the potential to transform the way we detect and track infectious diseases?

Metagenomic next-generation sequencing (mNGS) is emerging as one of the most effective and efficient techniques for investigating infectious diseases. At a high level, it allows scientists and researchers to analyse all nucleic acids in a sample and agnostically detect the presence of pathogens, without prior knowledge of what's making someone sick. More specifically, sequencing all of the genetic material in a sample gives scientists and researchers a comprehensive 'genetic inventory' of the organisms within it. Once sequencing is complete, they can compare their results to a database of known genetic sequences to identify the disease-causing pathogen. Until recently, mNGS could only be done in specialised laboratories with access to complex data analysis tools, making the technique inaccessible for researchers in low- to middle-income countries (LMICs), who are often disproportionately burdened with treatable and preventable diseases.

How can tools for metagenomics sequencing, like CZ ID, help researchers worldwide investigate novel and emerging infectious diseases?

Created by CZI in collaboration with Chan Zuckerberg Biohub SF (CZ Biohub SF), Chan Zuckerberg ID (CZ ID) is an open-source bioinformatics platform that helps researchers detect, identify, and track infectious diseases worldwide, regardless of computational knowledge or resources. The platform quickly processes sequencing data and generates results that provide actionable information on the pathogens in a given set of samples. With fast, easy access to mNGS results, researchers and scientists worldwide can make data-driven decisions that advance our understanding of infectious diseases and inform treatment and control efforts. Platforms like CZ ID are especially beneficial for infectious-disease researchers in LMICs, who often need greater access to the tools, resources, and technologies to investigate and detect novel and emerging pathogens. We're very excited by the growing reach of CZ ID, which is already being used in over 73 countries to track different diseases affecting local communities, from the surveillance of zoonotic diseases in Madagascar to a study on the root causes of neuroinfectious disease in Vietnam.

CZ ID just released a new metagenomics pipeline for analyzing long-read nanopore data. How will this capability support more infectious disease researchers around the world?

This April, we announced a new metagenomics module for CZ ID to analyse long-read nanopore data. The new capability allows researchers to upload, process, and analyse mNGS nanopore data within the platform. Oxford Nanopore's sequencing technology provides researchers with cost-effective, portable tools to sequence DNA/RNA, and through our collaboration, mNGS nanopore data can now be plugged into CZ ID for metagenomic analysis. This new capability makes mNGS research possible for more infectious disease researchers around the world, especially those who rely on nanopore sequencing to investigate pathogens affecting their communities.

What are the next steps for your work in developing metagenomic analysis platforms?

Our team is adding more functionality to the CZ ID mNGS nanopore module, including data visualisation with a heatmap to compare multiple samples. Later this year, we're launching antimicrobial resistance (AMR) gene analysis using short reads, making CZ ID the first metagenomics tool to integrate microbe detection and AMR gene analysis.