An unprecedented metagenomic landscape of European and North American coastal samples


We have generated the largest long-read environmental dataset we are aware of: ~10 Tb of nanopore sequencing data from the Baltic Sea and the San Francisco Estuary. Both ecosystems are heavily impacted by human activity and are important for commerce, recreation, agriculture, and wastewater treatment for millions of people. The health of these brackish-water ecosystems relies on their microbiomes, so we aim to generate a complete catalogue of microbial genomes, plasmids, and viruses to help elucidate microbiome structure and function. To achieve this goal, we overcame challenges in extracting high-molecular-weight (HMW) DNA from environmental samples and depleted short DNA fragments, resulting in sequencing read N50s of 10–20 kb. Because the samples are so species-rich, we sequenced each one ultra-deep (>200 Gb per sample), which required high-performance computing (HPC) systems for basecalling and assembly. We have assembled much of the data and found hundreds of complete bacterial and archaeal genomes, along with thousands of plasmids, phages, and other mobile elements, as well as environmental DNA from macro-organisms. Among the complete genomes are several from the SAR11, SAR86, and Actinomarina clades. To the best of our knowledge, complete genomes from these clades have not previously been recovered from metagenomes. We are currently basecalling all of our data with the latest Dorado models, which we expect to yield more accurate and complete assemblies. We are also generating full methylation data for all datasets and intend to use it to associate plasmids, phages, and other mobile elements with their hosts.
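For readers unfamiliar with the N50 metric quoted in the abstract, the following is a minimal sketch of how a read N50 is conventionally computed: the length L such that reads of length L or longer contain at least half of all sequenced bases. The function name `n50` and the example read lengths are illustrative assumptions, not values or code from this project.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths (in bases)."""
    if not lengths:
        raise ValueError("need at least one read length")
    ordered = sorted(lengths, reverse=True)  # longest reads first
    half_total = sum(ordered) / 2            # half of all sequenced bases
    running = 0
    for length in ordered:
        running += length
        # The first read that pushes the running total to half of all
        # bases defines the N50.
        if running >= half_total:
            return length


# Hypothetical read lengths in bases, for illustration only
# (not taken from the dataset described in the abstract).
example_lengths = [2_000, 5_000, 8_000, 12_000, 15_000, 20_000]
print(n50(example_lengths))
```

An N50 of 10–20 kb therefore means that at least half of the sequenced bases sit in reads at least that long, which is why depleting short fragments before sequencing raises the value.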


Lauren Lui is a research scientist in the Environmental Genomics and Systems Biology Division at Lawrence Berkeley National Laboratory. She studies how microbial communities participate in biogeochemical element cycling and how they respond to environmental pressures such as climate change. She is developing computational and experimental methods to better interrogate and quantify microbial community members (bacteria, archaea, and viruses) in order to model microbial population dynamics more accurately. Specifically, she is developing methods to improve long-read metagenomic sequencing and assembly, and is using the methylation data afforded by nanopore sequencing to link mobile genetic elements (viruses, plasmids, etc.) with their hosts.

Authors: Lauren Lui