In:
Frontiers in Microbiology, Frontiers Media SA, Vol. 14 (2023-10-09)
Abstract:
Around 10% of the coding potential of Mycobacterium tuberculosis is constituted by two poorly understood gene families, the pe and ppe loci, thought to be involved in host-pathogen interactions. Their repetitive nature and high GC content have hindered sequence analysis, leading to their exclusion from whole-genome studies. Understanding the genetic diversity of the pe/ppe families is essential to facilitate their potential translation into tools for tuberculosis prevention and treatment. Methods: To investigate the genetic diversity of the 169 pe/ppe genes, we performed a sequence analysis across 73 long-read assemblies representing seven different lineages of M. tuberculosis and M. bovis BCG. Individual pe/ppe gene alignments were extracted, and diversity and conservation were studied across the different lineages. Results: The pe/ppe genes were classified into three groups based on the level of protein sequence conservation relative to H37Rv, finding that >50% were conserved, with indels in the pe_pgrs and ppe_mptr sub-families being major drivers of structural variation. Gene rearrangements, such as duplications and gene fusions, were observed between pe and pe_pgrs genes. Inter-lineage diversity revealed lineage-specific SNPs and indels. Discussion: The high level of pe/ppe gene conservation, together with the lineage-specific findings, suggests that these genes are phylogenetically informative. However, structural variants and gene rearrangements differing from the reference were also identified, with potential implications for pathogenicity. Overall, improving our knowledge of these complex gene families may provide insights into pathogenicity and inform the development of much-needed tools for tuberculosis control.
Type of Medium:
Online Resource
ISSN:
1664-302X
DOI:
10.3389/fmicb.2023.1244319
DOI:
10.3389/fmicb.2023.1244319.s001
Language:
Unknown
Publisher:
Frontiers Media SA
Publication Date:
2023
ZDB-ID:
2587354-4