Complete genomes from metagenomic samples

The advancement of DNA sequencing technologies to allow genomic analysis of samples containing many organisms (metagenomics) has made it possible to obtain genome sequences from unculturable microorganisms — shedding new light on the composition and interaction of complex microbial communities. However, the limitations of traditional short-read sequencing strategies mean that most of these sequenced metagenomes are fragmented.

Many researchers have now applied the long reads provided by nanopore sequencing to generate complete, closed microbial genome assemblies from a range of metagenomic samples.1, 2, 3

Given that nanopore sequencing can generate megabase-length reads and that the genomes of double-stranded DNA (dsDNA) bacteriophages typically range from approximately 3 to 300 kb, Beaulaurier et al.1 reasoned that it would be possible to capture entire viral genomes in single reads, overcoming the assembly challenges associated with short-read sequencing techniques. They developed a novel assembly-free workflow to obtain complete viral genomes from environmental samples.

‘The method requires no amplification or de novo short-read assembly, and so avoids the most common biases inherent in previous approaches’1

Seawater samples were obtained at depths of 25 m, 117 m, and 250 m, and viral particles were enriched using tangential flow filtration. The resulting viral concentrate was subsequently purified and prepared for sequencing on the GridION. To select for full-length genome sequences, all reads containing direct terminal repeats (DTRs), which flank both ends of the genomes of most tailed dsDNA phages, were retained. Reads were then grouped into discrete genome clusters, and the individual reads within each cluster were used to generate a polished genome.
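The DTR-based read selection described above can be illustrated with a minimal Python sketch. It is not the workflow used by Beaulaurier et al.; the function names, the default length bounds, and the mismatch tolerance are assumptions chosen for illustration. It also compares read ends position by position, which ignores the insertions and deletions present in real nanopore reads, where an alignment-based comparison would be needed.

```python
from typing import List, Optional


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_dtr_length(read: str,
                    min_len: int = 32,      # hypothetical lower bound
                    max_len: int = 5000,    # hypothetical upper bound
                    max_mismatch_frac: float = 0.0) -> Optional[int]:
    """Return the largest k for which the first k and last k bases of the
    read agree within the mismatch tolerance, or None if no such k exists.

    Simplification: substitutions only, no indels.
    """
    # The two repeat copies must not overlap, so k is capped at half the read.
    upper = min(max_len, len(read) // 2)
    for k in range(upper, min_len - 1, -1):
        if hamming(read[:k], read[-k:]) <= max_mismatch_frac * k:
            return k
    return None


def keep_dtr_reads(reads: List[str], **kwargs) -> List[str]:
    """Retain only reads whose two ends carry a direct repeat."""
    return [r for r in reads if find_dtr_length(r, **kwargs) is not None]
```

A read built as repeat + genome body + repeat is retained by `keep_dtr_reads`, while a read whose ends share no sequence is discarded. Searching from the longest candidate downwards means the reported length is the longest terminal match found under the chosen tolerance.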

In total, 1,864 high-quality polished draft genomes were obtained, with the 25 m, 117 m, and 250 m samples generating 566, 93, and 1,205 unique assembly-free virus genomes (AFVGs), respectively (Table 1).

Table 1: Sequencing statistics for virus-enriched seawater samples obtained from three different depths. Table adapted from Beaulaurier et al.1

Examining the AFVGs revealed that the virus DTRs ranged from 32 to 4,829 bp in length, with only minor differences in length observed between the three samples. According to the researchers, ‘Such repeats would not be readily resolved via short-read assembly approaches alone, since they would either collapse into a single copy or produce circular misassemblies’.

‘The nanopore sequencing approach described here recovered many more complete virus genome sequences than did short-read sequencing and assembly approaches alone’1

Analysis of the sequence content of the DTRs also allowed inference of the phage packaging strategy, discriminating between cleavage at specific, fixed sites and non-specific cleavage associated with the ‘headful’ packaging mechanism.
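One way to picture this distinction is sketched below in Python. It assumes that cleavage at fixed sites gives the same terminal repeat sequence in every read of a genome cluster, whereas headful packaging gives terminal sequences that vary from read to read. The function names, the 0.8 threshold, and the use of exact string matching are illustrative assumptions rather than the method used in the study; noisy reads would require a tolerant comparison.

```python
from collections import Counter
from typing import Sequence


def dominant_dtr_fraction(dtr_seqs: Sequence[str]) -> float:
    """Fraction of reads in a cluster that share the most common DTR sequence."""
    if not dtr_seqs:
        raise ValueError("at least one DTR sequence is required")
    counts = Counter(dtr_seqs)
    return counts.most_common(1)[0][1] / len(dtr_seqs)


def guess_packaging(dtr_seqs: Sequence[str], threshold: float = 0.8) -> str:
    """Label a cluster from how consistent its terminal repeats are.

    The threshold is a hypothetical cut-off, not a published value.
    """
    if dominant_dtr_fraction(dtr_seqs) >= threshold:
        return "fixed-site"
    return "headful-like"
```

For example, a cluster in which nine of ten reads carry an identical repeat would be labelled `fixed-site`, while a cluster whose reads each end in a different sequence would be labelled `headful-like`.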

Few of the protein-coding regions identified bore high similarity to those in the NCBI RefSeq database, with the largest proportion of novel genes identified in the 250 m sample.

Recently, researchers at the Chinese Academy of Sciences used the long nanopore sequencing reads delivered by the PromethION device to profile the human gut virome.4 To capture both DNA and RNA viruses, both native DNA sequencing and cDNA sequencing were performed, with the former permitting the first insights into the epigenetic modifications of phages. A highly diverse virome was observed across the five samples studied, with only a small number of ‘core’ viruses shared between all samples. In addition, the majority of contigs were not present in existing sequence databases, suggesting the identification of previously unknown genomes.

1. Beaulaurier, J. et al. Assembly-free single-molecule sequencing recovers complete virus genomes from natural microbial communities. Genome Res. 30(3):437–446 (2020).

2. Moss, E.L., Maghini, D.G., and Bhatt, A.S. Complete, closed bacterial genomes from microbiomes using nanopore sequencing. Nat. Biotechnol. [Online ahead of print] (2020).

3. Stewart, R.D. et al. Compendium of 4,941 rumen metagenome assembled genomes for rumen microbiome biology and enzyme discovery. Nat. Biotechnol. 37(8):953–961 (2019).

4. Cao, J. et al. Profiling of human gut virome with Oxford Nanopore technology. bioRxiv 933077 (2020).