Genome assembly and annotation of Macadamia tetraphylla

Macadamia is a genus of evergreen nut trees belonging to the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. Catherine et al. reported the M. integrifolia genome using next-generation sequencing (NGS) technology. However, the lack of a high-quality assembly for M. tetraphylla hinders progress in biological research and breeding programs.

In this study, we report a high-quality genome sequence of M. tetraphylla generated with Oxford Nanopore Technologies (ONT) sequencing. The assembly is 750.54 Mb, close to the genome size estimated by flow cytometry and k-mer analysis, with a contig N50 of 1.18 Mb. Repetitive sequences represent 58.57% of the genome sequence, strikingly higher than in M. integrifolia.
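For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths for illustration only (not from the M. tetraphylla assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 total
```

Under this definition, a contig N50 of 1.18 Mb means that contigs of at least 1.18 Mb together account for at least half of the assembled sequence.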

A total of 31,571 protein-coding genes were annotated, with an average length of 6,055 bp; 92.59% of these genes were functionally annotated. The genome sequence of M. tetraphylla will provide new insights into the breeding of novel strains and the genetic improvement of agronomic traits.

Authors: Ying-Feng Niu, Guo-Hua Li, Shu-Bang Ni, Xi-Yong He, Cheng Zheng, Zi-Yan Liu, Li-Dan Gong, Guang-Hong Kong, Jin Liu