Tackling the teak genome

Combining extreme durability, strength and resistance to pests, chemicals and water, teak (Tectona grandis L. f.) is one of the world’s most highly valued timber species1. Unfortunately, illegal logging and climate change are diminishing natural teak populations. While teak plantations are expanding to meet the demand for this desirable material, the genetic diversity of such plantations is inherently low. As a result, there is a clear need to characterise the genetic diversity of natural teak populations to assist conservation efforts. In addition, previous studies have shown that there is significant scope for the genetic improvement of teak for timber production, which could be further supported through genetic characterisation1.

To support conservation and breeding strategies, researchers from India sequenced seedlings from six dominant teak trees collected from disparate geographical regions across the country1.

Five of the samples were subjected to low-coverage (15x) genome sequencing using short-read technology, while the sixth, chosen as the reference, was sequenced at high coverage (151x) using a combination of traditional short-read technology and long-read nanopore sequencing on the MinION1.

The nanopore sequencing delivered 782,591 reads and a total yield of 2.7 Gb, which corresponds to approximately 7x genome coverage.
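The relationship between read count, yield and coverage quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the genome size it prints is merely the value implied by the rounded figures in this case study (2.7 Gb at roughly 7x), not a measurement reported here.

```python
# Back-of-the-envelope check of the nanopore figures quoted in the text.
# Inputs are the rounded values from this case study, so the results are
# approximate; the function names are illustrative, not from any tool.

def mean_read_length(total_bases: float, read_count: int) -> float:
    """Average read length in bases."""
    return total_bases / read_count


def implied_genome_size(total_bases: float, coverage: float) -> float:
    """Genome size implied by coverage = total bases / genome size."""
    return total_bases / coverage


TOTAL_BASES = 2.7e9   # 2.7 Gb total nanopore yield
READ_COUNT = 782_591  # number of nanopore reads
COVERAGE = 7          # approximate nanopore coverage

mean_len = mean_read_length(TOTAL_BASES, READ_COUNT)
genome_size = implied_genome_size(TOTAL_BASES, COVERAGE)

print(f"Mean read length: {mean_len:,.0f} bases")
print(f"Implied genome size: {genome_size / 1e6:,.0f} Mb")
```

With these inputs the mean read length comes out at roughly 3.5 kb, and the implied genome size is a little under 400 Mb.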

The team used a hybrid assembly strategy, integrating the short- and long-read data with the MaSuRCA tool. Because it incorporated long nanopore reads, the resulting assembly of 2,993 contigs had a substantially higher contig N50 (277,872 bp) than previous assemblies of members of the Lamiaceae family (Table).
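For readers unfamiliar with the metric, contig N50 is the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the contig lengths are invented toy values, not data from the teak assembly.

```python
# Minimal N50 calculation. The example lengths are toy values chosen
# for illustration and are not taken from the teak assembly.

def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that takes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


toy_contigs = [80, 70, 50, 40, 30, 20]  # total 290, half is 145
print(n50(toy_contigs))  # 80 + 70 = 150 >= 145, so this prints 70
```

A higher N50 means that more of the assembly sits in long contigs, which is why adding long reads tends to raise it.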

The researchers used the assembly data to characterise the diverse array of repeat elements, which together made up 11.18% of the genome. They also identified 615 genes that encode proteins involved in the durability of teak. Further, the new reference genome allowed the identification of simple sequence repeats (SSRs) across all teak samples, which can be used as genotypic markers for future conservation and tree improvement programmes.
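To illustrate what an SSR looks like in sequence data, the sketch below scans a DNA string for short motifs repeated in tandem. It is a simple regular-expression approach under assumed thresholds (motifs of 2 to 6 bases, at least 4 copies); the case study does not say which tool or parameters the researchers used, and the function name and test sequence are hypothetical.

```python
import re

# Illustrative SSR scan using a back-referencing regular expression.
# Thresholds and names are assumptions for this example only; this is
# not the method described in the cited study.

def find_ssrs(sequence: str, min_unit: int = 2, max_unit: int = 6,
              min_repeats: int = 4) -> list[tuple[int, str, int]]:
    """Return (start, motif, copies) for each tandem repeat found."""
    # Lazy quantifier so the shortest repeating motif is preferred,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


toy_sequence = "GGC" + "AT" * 5 + "CCGT" + "GAA" * 4 + "TTC"
print(find_ssrs(toy_sequence))  # [(3, 'AT', 5), (17, 'GAA', 4)]
```

Because the number of copies at a given SSR locus often differs between individuals, such loci are useful as genotypic markers.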

Validating the use of long reads in the characterisation of large plant genomes, the researchers stated:

‘Genomic applications to genetic resource conservation and breeding in forest trees is expected to harness [many] benefits due to the long-read sequencing technologies’.

Table: Comparative analysis of genome sizes, assembly and annotation of Lamiaceae species. The inclusion of long-read nanopore sequencing significantly increased the contiguity of the Tectona grandis assembly when compared with other genome assemblies, which were predominantly created using short-read sequencing technologies. Table adapted from Yasodha et al.1

This case study was taken from the plant white paper.

  1. Yasodha, R. et al. Draft genome of a high value tropical timber tree, Teak (Tectona grandis L. f): insights into SSR diversity, phylogeny and conservation. DNA Research, dsy013 (2018).