Comparative genomic analysis of Mycobacterium tuberculosis reveals evolution and genomic instability within Uganda I sub-lineage

Introduction Tuberculosis (TB) is the leading cause of morbidity and mortality among infectious diseases globally, responsible for an estimated 10.0 million new cases and 1.3 million deaths annually, with Africa contributing a quarter of these cases in 2019. Classifying Mycobacterium tuberculosis (MTB) strains is important for understanding their geographical predominance and pathogenicity.

Several studies have classified MTB using different methods, including RFLP, spoligotyping, MIRU-VNTR and SNP-set-based phylogeny. The SNP-set-based classification has been found to be concordant with the region of difference (RD) analysis used in the MTB complex classification system. In Uganda, the Uganda genotype of MTB is the most common cause of pulmonary tuberculosis (PTB), accounting for up to 70% of isolates.

Methods Sequenced MTB genomes were retrieved from NCBI, and others were obtained from local sequencing projects. Variants were called and annotated with snippy, a tool for rapid haploid variant calling and core genome alignment. The snippy outputs were used to classify the isolates into Uganda genotypes and non-Uganda genotypes based on a 62-SNP set. The Uganda genotype isolates were then analysed with a 413-SNP set and subjected to a pan-genome-wide association analysis.
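
The SNP-set classification step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the marker positions and alleles are placeholders rather than the real 62-SNP or 413-SNP panels, the input is assumed to be a tab-separated variant table with POS and ALT columns, and the function names (load_snps, marker_hits, classify) are hypothetical.

```python
import csv
import io

# Hypothetical marker panels: these positions and alleles are placeholders,
# NOT the real 62-SNP or 413-SNP sets used in the study.
MARKERS = {
    "Uganda I": {(1000, "A"), (2000, "T"), (3000, "G")},
    "Uganda II": {(4000, "C"), (5000, "G")},
}


def load_snps(tab_text):
    """Parse a tab-separated variant table into a set of (position, alt allele).

    Assumes the table has a header row with POS and ALT columns.
    """
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    return {(int(row["POS"]), row["ALT"]) for row in reader}


def marker_hits(snps, markers=MARKERS):
    """Count how many markers of each panel are present in the isolate."""
    return {label: len(snps & panel) for label, panel in markers.items()}


def classify(snps, markers=MARKERS, min_fraction=1.0):
    """Return the single label whose panel is sufficiently covered.

    An isolate matching no panel, or more than one, is 'unclassified'.
    """
    hits = marker_hits(snps, markers)
    matches = [
        label
        for label, panel in markers.items()
        if hits[label] / len(panel) >= min_fraction
    ]
    return matches[0] if len(matches) == 1 else "unclassified"


# Made-up variant table for one isolate carrying all three placeholder
# Uganda I markers plus one unrelated SNP.
example_tab = (
    "CHROM\tPOS\tREF\tALT\n"
    "ref\t1000\tG\tA\n"
    "ref\t2000\tC\tT\n"
    "ref\t3000\tA\tG\n"
    "ref\t9999\tC\tT\n"
)

if __name__ == "__main__":
    snps = load_snps(example_tab)
    print(marker_hits(snps))
    print(classify(snps))
```

Reporting the per-panel hit counts alongside the final label makes partially matching isolates visible: an isolate carrying only some markers of a panel stays unclassified while still showing how many markers it shares with that panel.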

Results Six Uganda genotype isolates could not be classified as either Uganda I or Uganda II genotypes based on the 62-SNP set. Using the 413-SNP set, these six isolates were found to carry only one of the seven SNPs that classify the Uganda I genotype. They also carried both missense and frameshift mutations within the ctpH gene, whereas the remaining Uganda I isolates with a mutation in this gene carried only a missense mutation.

Conclusion Among the Uganda genotype genomes, Uganda I genomes are unstable. We used publicly available datasets to perform mapping, variant calling, mixed-infection analysis and pan-genome analysis to investigate and compare the evolution of the Uganda genotype.

Authors: Stephen Kanyerezi, Patricia Nabisubi