Species boundaries and molecular markers for classification among 16SrI phytoplasmas informed by genome analysis

Phytoplasmas are diverse plant-pathogenic bacteria that greatly impact agriculture worldwide. The current classification system for these uncultivated bacteria is based on the restriction fragment length polymorphism (RFLP) analysis of their 16S rRNA genes. With the increased availability of phytoplasma genome sequences, the classification system can now be refined.

This work examined 11 strains that belong to the 16SrI group within the genus 'Candidatus Phytoplasma' and investigated the possible species boundaries.

We found that the RFLP classification method is problematic due to intragenomic variation of the 16S rRNA genes and uneven weighting of different nucleotide positions. Importantly, our results based on molecular phylogeny, differentiation in chromosomal segments and gene content, and divergence in homologous sequences all supported classifying these strains into multiple operational taxonomic units equivalent to species. Strains assigned to the same species share >97% genome-wide average nucleotide identity (ANI) and >78% of their protein-coding genes. In comparison, strains assigned to different species share <94% ANI and <75% of their genes. These findings suggested the existence of barrier(s) against homologous recombination between species, and supported the proposal that 95% ANI could serve as a cutoff for distinguishing species in bacteria.
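The 95% ANI cutoff can be illustrated with a minimal sketch that groups strains into putative species-level units whenever their pairwise ANI reaches the cutoff. This is not the procedure used in the study, which also drew on phylogeny, gene content, and sequence divergence. The strain labels and ANI values below are made up for illustration, and the function name `group_by_ani` and the single-linkage grouping rule are assumptions.

```python
from itertools import combinations

ANI_CUTOFF = 95.0  # percent; the species-level cutoff discussed in the text


def group_by_ani(strains, ani, cutoff=ANI_CUTOFF):
    """Single-linkage grouping: two strains are linked if their ANI >= cutoff.

    `ani` maps an unordered strain pair (stored in either order) to percent ANI.
    Returns a sorted list of sorted strain groups.
    """
    parent = {s: s for s in strains}

    def find(s):
        # Follow parent pointers to the group representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(strains, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical strains and ANI values, NOT data from the study.
strains = ["A1", "A2", "B1", "B2"]
ani = {
    ("A1", "A2"): 98.5,
    ("B1", "B2"): 97.4,
    ("A1", "B1"): 92.0,
    ("A1", "B2"): 91.8,
    ("A2", "B1"): 92.3,
    ("A2", "B2"): 92.1,
}
groups = group_by_ani(strains, ani)
print(groups)
```

With these invented values, the within-group pairs exceed 97% ANI and the between-group pairs fall below 94%, so the cutoff separates them cleanly. Real data with ANI values close to 95% would need the additional lines of evidence described above.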

Critical examination of these results and the raw sequencing reads also allowed us to identify one genome that was possibly mis-assembled by mixing two sequencing libraries containing different phytoplasmas. This finding provided a cautionary tale for working on uncultivated bacteria.

Based on the new understanding of phytoplasma divergence and the current genome availability, we developed four molecular markers that could be used for multilocus sequence analysis (MLSA). Because the selected markers are short yet highly informative and are distributed evenly across the chromosome, they provided a cost-effective system that is robust against recombination.
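As a rough sketch of how MLSA combines several loci, the per-marker alignments can be concatenated per strain and compared by the proportion of differing sites. The marker names (`m1`, `m2`), strain labels, and sequences below are hypothetical placeholders, only two markers are shown instead of four for brevity, and the uncorrected p-distance is an assumed choice of metric rather than the method reported in the study.

```python
def concatenate_markers(alignments, marker_order):
    """Join per-marker aligned sequences into one concatenated sequence per strain."""
    strains = sorted(alignments[marker_order[0]])
    return {s: "".join(alignments[m][s] for m in marker_order) for s in strains}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical aligned marker sequences, NOT data from the study.
alignments = {
    "m1": {"s1": "ACGTAC", "s2": "ACGTAC", "s3": "ACGAAC"},
    "m2": {"s1": "GGTTAA", "s2": "GGTTAT", "s3": "GCTTAT"},
}
concat = concatenate_markers(alignments, ["m1", "m2"])
for a, b in [("s1", "s2"), ("s1", "s3"), ("s2", "s3")]:
    print(a, b, round(p_distance(concat[a], concat[b]), 4))
```

Spreading markers across the chromosome means that a single recombination event is unlikely to affect all loci at once, which is why the concatenated signal is expected to be more robust than any one marker alone.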

Finally, examination of the effector gene distribution further confirmed the rapid gains and losses of these genes, as well as the involvement of potential mobile units (PMUs) in their molecular evolution. Future improvements in the taxon sampling of phytoplasma genomes will allow us to expand the analysis further, and thus contribute to phytoplasma taxonomy and diagnostics.

Authors: Shu-Ting Cho, Hung-Jui Kung, Weijie Huang, Saskia A Hogenhout, Chih-Horng Kuo