Assembling metagenomes

Analysis of complex metagenomic samples has wide-reaching applications, from recovering the genomes of previously unexplored microbial species, such as those comprising gut microbiomes, to outbreak surveillance. Assembling metagenomes from short reads alone is challenging because short reads are difficult to assign to the correct genome of origin.

Bertrand and colleagues introduced the first hybrid metagenome assembler, OPERA-MS¹. Applying this tool to the gut metagenomes of 28 antibiotic-treated patients, the team demonstrated that integrating long-read nanopore sequence data with short-read data provided a 200-fold improvement in assembly contiguity (Figure 1) and enabled the completion of over 80 plasmid or phage sequences. Such high-quality assemblies are likely to provide comprehensive insight into the gut resistome.
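Assembly contiguity is commonly summarised with the N50 statistic: the contig length at which contigs of that length or longer contain at least half of the assembled bases. The text above does not state which metric underlies the 200-fold figure, so the following Python sketch only illustrates how an N50-based fold change could be computed. The function name and the contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths in base pairs, invented for illustration only.
short_read_contigs = [5_000, 4_000, 3_000, 2_000, 1_000]
hybrid_contigs = [900_000, 60_000, 40_000]

# Ratio of the two N50 values, i.e. the fold change in contiguity.
fold_change = n50(hybrid_contigs) / n50(short_read_contigs)
print(f"short-read N50: {n50(short_read_contigs)} bp")
print(f"hybrid N50: {n50(hybrid_contigs)} bp")
print(f"fold change: {fold_change:.0f}x")
```

Because N50 depends only on the distribution of contig lengths, it says nothing about correctness; in practice it is usually reported alongside measures of misassembly.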

In the context of outbreak surveillance, Liana Kafetzopoulou and her team performed in-field metagenomic sequencing and analysis during a major Lassa fever virus (LASV) outbreak². In 2018, the Nigerian Lassa fever season saw the largest ever upsurge in cases, raising fears of an emergent strain with an increased transmission rate. Because the LASV genome is highly variable, an unbiased metagenomic approach is preferable for genome analysis to targeted amplicon or capture-based whole-genome sequencing methods.

‘Portable metagenomic sequencing of genetically diverse RNA viruses on the MinION…and with no pathogen-specific enrichment, is shown to be a feasible methodology enabling a real-time characterization of potential outbreaks in the field’²

Benefiting from the speed and portability of real-time sequencing with the MinION, the team sequenced 120 infected human samples over 7 weeks at the epicentre of the outbreak. On average, 4.26% of the sequencing reads were LASV, which was sufficient for phylogenetic comparison of at least one genomic segment in 91 of the 120 samples tested. Using the Centrifuge metagenomic classification software, the team also detected a hepatitis A virus co-infection in one sample; the virus accounted for 0.1% of reads, which provided 74% genome coverage at 20x depth. This demonstrated the potential of their simple approach for identifying multiple RNA viruses, even within the same sample. The analysis indicated that rodent hosts, rather than person-to-person transmission, were the main source of the upsurge in cases, and it found no strong evidence of a newly emerging strain. The results were immediately reported to the WHO and the Nigerian authorities, supporting a rapid public health response to the outbreak.
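The figures quoted above (fraction of reads assigned to a virus, genome coverage, and depth) are simple ratios over read counts and alignment positions. The Python sketch below shows one way such values could be derived; it is not the team's pipeline. It assumes that reads have already been aligned to a reference, that "genome coverage" means the fraction of reference positions covered by at least one read, and that "depth" means the mean per-position depth across the whole reference. The function names and the toy alignments are hypothetical.

```python
def read_fraction(taxon_reads, total_reads):
    """Fraction of all sequencing reads assigned to one taxon."""
    return taxon_reads / total_reads


def coverage_stats(alignments, genome_length):
    """Return (breadth, mean_depth) for reads aligned to one reference.

    alignments: iterable of (start, end) intervals, 0-based and half-open.
    breadth: fraction of reference positions covered by at least one read.
    mean_depth: average number of reads covering each reference position.
    """
    depth = [0] * genome_length
    for start, end in alignments:
        # Clip each interval to the reference before counting.
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    return covered / genome_length, sum(depth) / genome_length


# Toy data, invented for illustration: a 100 bp reference and three reads.
toy_alignments = [(0, 50), (25, 75), (25, 75)]
breadth, mean_depth = coverage_stats(toy_alignments, genome_length=100)
print(f"breadth: {breadth:.0%}, mean depth: {mean_depth:.1f}x")

# Hypothetical counts chosen to match the 4.26% average quoted above.
print(f"taxon read fraction: {read_fraction(426, 10_000):.2%}")
```

A per-position list is adequate for viral genomes of a few kilobases; for larger references, dedicated tools that work on sorted alignment files are the usual choice.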

Figure 1: Increase in assembly contiguity as a function of read coverage for a representative short-read assembler (a), a long-read assembler (b), and the hybrid OPERA-MS assembler (c). Unassembled genomes are shown as circles with black borders. Adapted from Bertrand et al. (2019)¹.
  1. Bertrand, D. et al. Hybrid metagenomic assembly enables high-resolution analysis of resistance determinants and mobile elements in human microbiomes. Nat. Biotechnol. 37(8):937-944 (2019).
  2. Kafetzopoulou, L. E. et al. Metagenomic sequencing at the epicenter of the Nigeria 2018 Lassa fever outbreak. Science 363(6422):74-77 (2019).