Long-read metagenomic sequencing of faecal samples to study convergent dietary adaptation in ant-eating mammals


Sophie began by providing some background to her work on myrmecophagy, a diet derived largely from ingesting ants and/or termites. She described myrmecophagy as a classic example of convergent evolution, defined as the independent evolution of similar phenotypes in different lineages. She listed the five mammals displaying myrmecophagy, three of which form the subject of her work: the aardvark, the ground pangolin and the southern aardwolf. She outlined that the main aim of her project was to uncover how these three species have adapted to a myrmecophagous diet. To do this, she followed the ConvergeAnt project approach, which integrates data from three disciplines: genomics, metagenomics and morphology. Sophie noted that the main focus of her talk was the metagenomics facet, with a particular interest in the role of the gut microbiome.

The first question she wanted to address was whether the gut microbiomes of myrmecophagous species are similar in terms of taxonomy and gene content. Further to this, she wanted to assess the presence of symbiotic bacteria that aid the digestion of prey and of chitin, which she pointed out is the main component of the insect exoskeleton. If such bacteria were present, Sophie stressed the importance of comparing them across the three myrmecophagous species and subsequently analysing their gene content. She then summarised her aims: first to identify the microbial taxa present in the gut microbiome of myrmecophagous species, then to reconstruct their genomes, and finally to study chitin degradation pathways.

Sophie then described her research process, starting with the collection of faecal samples in French Guiana and South Africa. She explained that the faecal samples were obtained either from roadkill dissections or by following the animals. For the aardvark, for example, they looked for burrows and located nearby faeces. Following collection, they added ethanol and stored the samples at -20 °C. Back in the lab, Sophie proceeded with DNA extraction prior to long-read sequencing, which, she stated, requires high molecular weight DNA. However, in Sophie’s case, a single DNA extraction yielded small fragments and degraded DNA, so two successive extractions were needed to obtain high molecular weight DNA. Sophie postulated that the first extraction recovered extracellular DNA, while the second led to cell lysis, giving access to the bacterial intracellular DNA. To optimise the protocol further, she added a purification step following the extraction.

Sophie went on to talk about nanopore sequencing. She loaded 300–600 ng of DNA per flow cell and used one flow cell per sample. Each sample was run on a MinION Mk1C device for 48 hours. Across the three species, she sequenced three samples from the aardvark, three samples from the pangolin and five samples from the aardwolf. Sophie touched upon her bioinformatics pipeline, which uses Guppy for basecalling and Porechop for adapter removal. MetaMaps was used to filter host and human reads. Because MiniSeq+H contains more than 12,000 microbial genomes, it was used for the taxonomic profiling of the metagenomes.

Next, Sophie talked through some of her results from the taxonomic profiling, which showed the main bacterial phyla present in the myrmecophagous gut metagenomes. She first directed her focus to the abundant unclassified reads, which did not map to any of the genomes in the database. She noted that the incompleteness of reference databases poses a problem in these types of analyses, as much of the microbial diversity is still unknown. Nonetheless, Sophie was still able to find bacterial phyla that are expected in the gut microbiome, namely Bacteroidetes, Firmicutes and Proteobacteria. Guided by the relevant literature, she then looked more closely at the microbial taxa that may be involved in chitin degradation. Sophie gave some examples, including the Bacteroidetes genus Chitinophaga. These types of microbes were found in all the metagenome samples, suggesting that myrmecophagous species may use such bacteria to aid in the digestion of their prey.

After identifying the suspected microbial taxa, Sophie explained that she wanted to look at their genomes to further understand the mechanisms involved in chitin degradation. To do this, she performed de novo assembly of the reads using two long-read metagenome assemblers: MetaFlye and Raven. Sophie performed one assembly per sample and used the tool anvi’o to compare the assemblies in terms of key metrics and number of genes. She displayed her results, showing that the Raven assemblies were more contiguous, albeit slightly shorter, than the MetaFlye assemblies.
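The contrast between a more contiguous assembly and a longer one can be illustrated with a contiguity metric such as N50. The sketch below is only an illustration of that idea, not the comparison anvi’o performs: the function names and the toy contig lengths are assumptions made up for this example, and the talk did not specify which metrics were reported.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def summarise(contig_lengths):
    """Collect a few basic assembly metrics in a dictionary."""
    return {
        "contigs": len(contig_lengths),
        "total_length": sum(contig_lengths),
        "n50": n50(contig_lengths),
    }


# Toy contig lengths (hypothetical, not data from the talk).
fragmented = [100, 80, 60, 40, 20]   # longer in total, more fragmented
contiguous = [150, 120]              # shorter in total, more contiguous

print(summarise(fragmented))
print(summarise(contiguous))
```

In this toy case the second assembly has the higher N50 despite its smaller total length, which is the kind of pattern described for Raven relative to MetaFlye.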

Sophie moved on to her approach to finding genes in her assemblies with anvi’o, which uses Prodigal to search for ORFs within the contigs and HMMER to search for single-copy genes that are specific to bacteria, archaea and eukaryotes. Sophie then showed her results, with more genes being identified in the MetaFlye assemblies than in the Raven assemblies. Next, in order to obtain metagenome-assembled genomes (MAGs), she clustered the contigs belonging to the same genome together (genome binning) using CONCOCT. Sophie revealed that she was able to retrieve more than 100 genome bins for the aardvark and aardwolf. Subsequently, she selected potentially good MAGs, defined as bins that exceeded 70% completion with a maximum of 20% redundancy. Sophie mentioned that there is scope to manually refine the bins to lower the redundancy to around 10%, which she added is the accepted threshold. Overall, Sophie reported up to 30 good MAGs across the species. She then estimated the taxonomy of those genome bins using the SCG taxonomy module within anvi’o.
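The selection step described above amounts to a simple threshold filter on per-bin statistics. The following is a minimal sketch under stated assumptions: the `Bin` class, the `select_mags` function and the example bins are hypothetical and are not part of anvi’o or CONCOCT; only the two thresholds (more than 70% completion, at most 20% redundancy) come from the talk.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    name: str
    completion: float   # percent of expected single-copy genes found
    redundancy: float   # percent of single-copy genes found more than once


def select_mags(bins, min_completion=70.0, max_redundancy=20.0):
    """Keep bins that exceed the completion threshold and do not
    exceed the redundancy threshold."""
    return [
        b for b in bins
        if b.completion > min_completion and b.redundancy <= max_redundancy
    ]


# Hypothetical bins, for illustration only.
bins = [
    Bin("bin_001", completion=92.0, redundancy=15.0),
    Bin("bin_002", completion=65.0, redundancy=5.0),    # too incomplete
    Bin("bin_003", completion=88.0, redundancy=34.0),   # too redundant
    Bin("bin_004", completion=74.0, redundancy=9.0),
]

selected = select_mags(bins)
print([b.name for b in selected])

# A stricter pass using the ~10% redundancy threshold mentioned in the talk.
strict = select_mags(bins, max_redundancy=10.0)
print([b.name for b in strict])
```

Tightening `max_redundancy` to 10 shows which bins would still need manual refinement before meeting the stricter threshold.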

Sophie then described how she determined whether the microbes harbour chitinase genes. To do this, she stated that they used the dbCAN2 tool on the genome bins to look for GH18 genes, which encode a family of enzymes that degrade chitin or related polysaccharides. In the sequences of these genes, Sophie looked for the chitinolytic domain, which comprises seven amino acids, four of which are conserved and essential for chitinolytic activity. Subsequently, she used BLAST to determine whether these genes were similar to known bacterial chitinases. Sophie reported that up to 15 genome bins for each organism contained at least one GH18 gene, and that each genome bin had a total length of around 3–4 Mb with around 4,000 genes, a completion of around 90% and a redundancy of around 15%.

Looking at the taxonomy estimated for the genome bins using anvi’o, Sophie found that the taxa are consistent with those already reported in the literature. She pointed out an interesting finding: the representation of taxa that include species that have been isolated from the environment but have not yet been reported as part of the gut microbiome. She went on to explain that one of their hypotheses was that bacteria can be recruited from the environment into the gut microbiome, and she therefore wants to investigate this further by sequencing soil samples from the habitats of the myrmecophagous species. Sophie then compared her data from the ground pangolin with gut microbiome data from a Malayan pangolin generated in a previous study. She found an overlap of several species, all sharing chitinolytic genes. Sophie then looked at the sequences of these genes and identified the chitinolytic domain, although a few of them harboured mutations. She suggested that such mutations could abolish the ability to degrade chitin but might allow other functions, which raised further questions. This led her on to present a phylogenetic analysis of these sequences. Firstly, Sophie directed her attention to the sequences presenting an active chitinolytic domain, which revealed that not all of these were similar to known bacterial chitinases. Some, for instance, possessed a peptidoglycan-binding domain; whether it can bind chitin remains to be determined.
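The search for a short conserved domain in protein sequences, as described above, can be sketched as a pattern scan. This is an illustration only and not the method used in the talk. The pattern `PLACEHOLDER_MOTIF` is a hypothetical stand-in that merely mirrors the description (seven positions, four of them fixed); it is not the actual GH18 motif, which should be taken from the literature. The example sequences are also made up.

```python
import re

# Hypothetical placeholder: 7 positions, 4 fixed residues, 3 free ('.').
# This is NOT the real GH18 chitinolytic motif; substitute the published one.
PLACEHOLDER_MOTIF = "D.D.D.E"


def find_motif(protein, pattern=PLACEHOLDER_MOTIF):
    """Return (start, matched_text) for every occurrence of the pattern,
    including overlapping ones (0-based start positions)."""
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(protein)]


# Made-up sequences for illustration.
intact = "MKDADGDLEAA"    # contains the placeholder pattern
mutated = "MKDADGNLEAA"   # one fixed position changed (D -> N)

print(find_motif(intact))
print(find_motif(mutated))
```

A sequence in which one of the fixed positions is substituted no longer matches, which is how a scan of this kind would flag domains carrying mutations at conserved residues.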

Sophie concluded the presentation by outlining her next steps. Starting with the bioinformatics pipeline, she talked about removing redundant MAGs prior to placing them in the tree of life to assess their taxonomy. Sophie explained that this analysis would help her identify MAGs that are specific to and abundant in certain host species; downstream, they could then ascertain their taxonomy and whether they carry chitinase genes. Sophie said her work will help to shed light on whether bacterial species carry the same genes across the different myrmecophagous species and, more generally, whether the same mechanisms are involved in the adaptation to this diet. This is an interesting question to pursue because, as Sophie noted, the adaptation to myrmecophagy in mammals involves different morphological and genomic adaptations. Sophie also expressed her desire to combine long-read and short-read data. In short, she wants to use the Bonito basecaller for improved accuracy, then assemble the long reads with MetaFlye and finally polish with short reads. For strain separation, Sophie intends to use Strainberry after assembly. She touched upon wanting to sequence in the field with the Mk1C, which would allow fresh samples to be analysed and could improve DNA quality and the ability to obtain long fragments. Finally, she mentioned her desire to compare her results with non-myrmecophagous vertebrate species.

Authors: Sophie Teullet