Todd Michael: Unraveling the mysteries of CBD and THC content with a chromosome-resolved Cannabis genome


Closing this year’s Nanopore Community Meeting on a high, Todd Michael, Professor and Director of Informatics at the J. Craig Venter Institute (JCVI), provided an entertaining overview of his team’s research on the cannabis genome. The potential medical benefits of cannabis are of increasing interest to researchers working across a range of disease areas, and demand is growing rapidly for non-psychoactive varieties with high concentrations of cannabidiolic acid (CBDA) and low concentrations of the highly psychoactive delta-9-tetrahydrocannabinolic acid (THCA). The enzymes that produce these compounds, CBDA synthase and THCA synthase, compete for a common precursor. Copy number variation and sequence variation at the loci encoding these enzymes have both been proposed as potential explanations for differences in the ratio of the two compounds. However, according to Todd: ‘our understanding of the underlying genomic architecture of the CBDA and THCA synthase loci has been limited by the repetitive nature of the cannabis genome’.

To fully characterise the cannabis genome, including its repetitive regions, the JCVI team performed whole-genome sequencing using long-read nanopore technology. They chose a cannabis strain with high (15%) CBDA and low (0.3%) THCA content. High-molecular-weight DNA was extracted from young leaf tissue using a modified CTAB method. In total, the team generated 26 Gb of data, providing approximately 36x coverage of the 735 Mb genome. The reads were assembled with a pipeline comprising minimap, miniasm, racon and pilon, and the assembly was then compared to a genetic map, allowing the genome to be resolved into 10 chromosomes.
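The coverage figure quoted above follows from dividing the total sequenced bases by the genome size. As a rough illustration only, the short Python sketch below redoes that arithmetic from the rounded numbers reported in the talk; the function name `mean_coverage` and the unit constants are hypothetical and not part of the JCVI workflow.

```python
# Minimal sketch: estimate mean sequencing depth from total yield and genome size.
# The inputs are the rounded figures quoted in the talk summary; the helper name
# is an illustrative assumption, not something used by the JCVI team.

GB = 1e9  # bases per gigabase
MB = 1e6  # bases per megabase


def mean_coverage(total_bases: float, genome_size: float) -> float:
    """Return the average number of times each genome position is sequenced."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


depth = mean_coverage(26 * GB, 735 * MB)
print(f"Estimated mean coverage: {depth:.1f}x")
```

With these rounded inputs the estimate comes out at roughly 35x, in line with the approximately 36x reported; the small difference is what one would expect if the 26 Gb yield was itself rounded.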

Further analysis allowed the resolution of 14 CBDA and THCA synthase cassettes, 21 of which reside in two linked tandem arrays, nestled among a complex array of transposable elements on chromosome 9. Todd suggested that these transposons are potentially responsible for copy number variation of the synthase genes seen between different cultivars.

The team next used nanopore sequencing to generate full-length transcripts to support gene prediction and identify which synthase genes are expressed. Interestingly, only one of the CBDA synthase gene loci is expressed. Todd stated that they have now observed the same pattern in two other high-CBDA varieties.

The data also allowed them to clarify the lineage of the cultivar under investigation, which turned out to be a marijuana variety, even though these varieties are traditionally classed as having very low levels of CBDA. Todd suggested that the CBDA in this variety is likely a recent introgression from hemp, resulting from breeders trying to increase the levels of CBDA.

Wrapping up, Todd commented that, using long-read nanopore sequencing, the team at JCVI has generated the most contiguous cannabis assembly to date, delivering new insights into the genetics of the synthase genes and their evolution.