Great Barrier Reef Microbial Genomes Database: enhanced recovery of prokaryote, viral, and T2T eukaryote genomes | LC 25


Biography

Dr Steven Robbins is an environmental microbiologist and bioinformatician at the Australian Centre for Ecogenomics, where he uses meta-omic techniques to characterise the microbial communities associated with corals, marine sponges, and coral reef seawater to clarify their roles in maintaining the health and stability of the Great Barrier Reef.

Steven’s current role as lead data analyst constructing Australia’s Great Barrier Reef Microbial Genomes Database (GBR-MGD) involves employing novel technologies and bioinformatic methods to deliver actionable reef management outcomes.

Abstract

Coral reefs are under unprecedented threat from climate change, and microbes underpin critical reef processes. Large databases of metagenome-assembled genomes (MAGs) have markedly improved our understanding of marine microbes, but these databases seldom cover Australian oceans or coral reefs globally.

These databases have focused on marine prokaryotes. Paradoxically, however, MAGs from most dominant lineages (Pelagibacter, Prochlorococcus, etc.) are largely absent, a phenomenon hypothesised to result from the inability of short reads to resolve strain-diverse populations.

To address these gaps, we established the Great Barrier Reef Microbial Genomes Database (GBR-MGD) by subjecting seawater from across the Great Barrier Reef (GBR) to nanopore-based sequencing, hypothesising that long reads could span strain-variable regions and thereby enhance MAG recovery from ‘difficult’ taxa. Of the >5,000 prokaryote MAGs generated, ~1,500 are near-complete and >350 are circular, including complete Pelagibacter genomes.

Systematic benchmarking against short-read-only data showed that all difficult taxa are reliably recovered (often circularised) from nanopore-based metagenomes, but not from those sequenced using Illumina short reads. We show that this phenomenon results not only from the inability of short reads to resolve strains, but also from platform-specific GC bias.

Our nanopore-based strategy also facilitated the recovery of chromosome-level, telomere-to-telomere picoeukaryote MAGs and >100,000 viruses, including novel clades of marine Crassvirales.

Leveraging the GBR-MGD, we use machine learning to show that a subset of difficult microbial taxa can reliably predict the effects of fisheries management practices on coral reefs. Hence, the GBR-MGD represents an unprecedented holistic resource for marine researchers and policy makers, made possible only through the use of long-read sequencing.

Authors: Steven Robbins