Amanda Warr - Going full circle: Assembly of high-quality, single-contig microbial genomes from the rumen microbiome using long-read sequencing
London Calling 2019
Ruminants such as cows and sheep are important livestock species. They convert low nutritional value plant matter into high-quality meat and dairy products. Within a specialised stomach called the rumen, microbes ferment difficult-to-digest plant matter, producing short-chain fatty acids. The composition of the rumen microbial community can affect the animal’s health, feed efficiency and level of methane production. Species in the rumen are typically difficult to culture and, despite its importance, the rumen remains an underexplored environment. DNA sequencing of the contents of the rumen offers the potential to identify microbial species without culture techniques. Here we sequence cow rumen fluid using Oxford Nanopore sequencing. We show that, despite these data coming from a highly complex microbial sample, we can assemble high-quality, single-contig whole genomes and plasmids of known and novel species, including numerous circular contigs. Additionally, we compare and validate the assemblies of these genomes with binned genomes generated from short-read Illumina assemblies. We show that the long-read assembly outperforms the short-read assembly in contiguity and in incorporation of important features such as AMR genes and marker genes.
In the second presentation in this session, Dr. Amanda Warr from The Roslin Institute shared her work on the use of nanopore sequencing to assemble high-quality, single-contig microbial genomes from the rumen. Opening her presentation, Amanda gave a brief background on ruminants, of which there are approximately 200 species, including economically important livestock species such as sheep and cows. These animals have a specialized stomach, called the rumen, which breaks down difficult-to-digest plant matter to provide energy and nutrients. This is achieved with the help of a complex microbial community. Amanda explained how the composition of this community can affect animal health, feed efficiency, and the level of methane production. As a result, the study of rumen microbial composition is gaining importance, not least because the contribution of methane to global warming is 25 times that of carbon dioxide, and farmed ruminants are responsible for 14% of human-associated methane production. Knowledge of rumen microbial composition could help to improve feed conversion efficiency, reduce methane emissions, and improve animal health. It may also contribute to the identification of novel enzymes of relevance to the production of biofuels.
Amanda described how, until recently, the classification of rumen microbiome samples has been very poor. She suggested that if you included all of the relevant databases and ‘were lucky’, you might get at most 15% of reads classified. The gold standard for obtaining genomes is through culture, but this is expensive and time consuming, and many of these rumen microbes cannot be cultured. Metagenomic DNA sequencing therefore offers an attractive solution to characterising the rumen microbiome. However, genome assembly from complex mixed microbial samples using traditional short-read sequencing technology is challenging, resulting in highly fragmented genomes that lack repetitive regions, often including the 16S genes. As a result, Amanda turned to nanopore sequencing, which allows the generation of long and ultra-long sequencing reads, offering the potential to generate highly complete microbial genomes.
Amanda presented results for the sequencing of a rumen sample taken from a beef cow using the MinION. Following assembly using Canu, over 31 circular contigs were obtained. The longest of these contigs, at 3.8 Mb, was for Prevotella copri – representing the first publicly available, single-contig assembly of the species. This assembly compares favourably with the reference genome, which comprises 27 contigs. In addition, the team generated the first assembly for the species Selenomonas ruminantium. Data were also shown for 26 small, circular contigs. These comprised novel and known plasmids (one of which had at least 3 antimicrobial resistance genes) and novel bacteriophages.
Comparison of these assemblies to those generated using short-read sequencing technology revealed that long-read nanopore assemblies provided superior contiguity. They also incorporated a higher number of marker genes (including 16S) and more complete antimicrobial resistance genes. In the Prevotella copri genome, the team identified 25 polysaccharide utilisation loci, which may be of interest in the production of biofuel. Amanda stated that this finding would not have been possible using the short-read data.
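Contiguity comparisons of this kind are commonly summarised with the N50 statistic, although the summary does not say which metric was used in the talk. The following is a minimal Python sketch of how N50 could be computed from a list of contig lengths. The function name `n50` and both example length lists are hypothetical and invented for illustration; they are not data from the presentation.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in base pairs (illustration only, not real data).
single_contig_assembly = [3_800_000]
fragmented_assembly = [400_000, 300_000, 250_000, 200_000, 150_000, 100_000]

print("single-contig N50:", n50(single_contig_assembly))
print("fragmented N50:", n50(fragmented_assembly))
```

A genome assembled as a single contig has an N50 equal to its full length, whereas a fragmented assembly of the same organism reports a much smaller value, which is why the metric is a convenient shorthand for contiguity.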
The initial nanopore sequencing work was achieved using R9.4 (RevC) MinION Flow Cells; however, subsequent work sequencing the rumen microbiomes of five dairy cows was performed using the newer R9.4.1 (RevD) MinION Flow Cells. Amanda showed data highlighting how the RevD flow cell delivered up to four times more data than the best-performing RevC flow cell. Preliminary data comparing the microbiome composition of beef and dairy cows revealed distinct differences, with significantly higher representation of Prevotella in the dairy cows tested.
Summarising her presentation, Amanda reiterated that the rumen microbiome contains an abundance of undiscovered species and that long reads can allow the assembly of whole genomes from complex microbiomes in a single contig.