Postgenomic analysis of bacterial pathogens repertoire reveals genome reduction rather than virulence factors

In the pregenomic era, the acquisition of pathogenicity islands via horizontal transfer was proposed as a major mechanism in pathogen evolution. Much effort has been expended to look for contiguous blocks of virulence genes that are present in pathogenic bacteria but absent in closely related nonpathogenic species. However, some of these virulence factors were found in nonpathogenic bacteria. Moreover, and contrary to expectation, pathogenic bacteria were found to lack genes (antivirulence genes) that are characteristic of nonpathogenic bacteria. The availability of complete genome sequences has led to a new era of pathogen research. Comparisons of genomes have shown that the most pathogenic bacteria have reduced genomes, with fewer ribosomal RNA genes and disorganized operons; they lack transcriptional regulators but have more genes that encode protein toxins, toxin-antitoxin (TA) modules, and proteins for DNA replication and repair, when compared with less pathogenic close relatives. These findings call into question the paradigm of virulence by gene acquisition and put forward the notion of a genomic repertoire of virulence.


INTRODUCTION
The beginning of microbiology can be traced back to the late 19th century with the development of the germ theory of disease, which states that many diseases are caused by specific microorganisms. One of the initial arguments against the germ theory of disease was the presence of microbes in healthy carriers. The theory gained gradual acceptance with the crucial distinction between disease-causing microbes and commensal microbes as explained by Pasteur, who was supported by Koch in Germany and Lister in England. The germ theory of disease culminated in the guiding dogma of infectious disease that demonstrated the relationship between infection by a particular microorganism and observable symptoms according to the following criteria: (i) the microorganism should be found in all patients with the disease in question and not found in healthy subjects; (ii) this microorganism should be cultivated outside the body of the host; (iii) the organism must be isolated from a diseased organism and grown in pure culture; and (iv) the introduction of the cultured microorganism into a healthy organism should cause disease [1]. These criteria were grouped, although they have never been formulated in this way, under the terms of Koch's postulates to identify the pathogens.
The awareness that came from an understanding of Koch's postulates resulted in fewer incidences of major acute infectious diseases. Nevertheless, two experiments questioned the validity of this dogma. First, Roux and Yersin found that pure supernatants from cultures of Corynebacterium diphtheriae (freed from the living organism by filtration) contained a toxin. The inoculation of the cell-free supernatant into an experimental model gave the same symptoms as those obtained with the inoculation of the living organism [2]. This experiment pointed out the role played by toxins in the development of disease. These seminal findings, together with the almost simultaneous discovery of tetanus toxin, led rapidly to the discovery of humoral immunity. The denaturation of diphtheria toxin provided the basis for widespread immunization and serotherapy as ways to prevent disease [3]. Second, Pasteur noted weakened pathogenicity when Pasteurella multocida, the agent of fowl cholera, was cultured in axenic medium. In contrast, fresh cultures of P. multocida maintained their pathogenicity [4]. This laid the foundation for the use of an attenuated microbe to protect against infection, that is, the development of a vaccine. The chickens that received the attenuated microbe were immune to fowl cholera and did not become ill after receiving an injection of the virulent P. multocida; injection of the attenuated germ protected them against subsequent infection. This experiment also showed that by adapting to culture in axenic medium, P. multocida lost fitness in the chicken host. Altogether, these experiments revealed that the existence of protein toxins and fitness tradeoffs in alternative environments greatly influenced the development of pathogenicity.
The overwhelming desire of humans to combat infectious disease and discover new methods to prevent infection has driven a considerable proportion of research on prokaryotes. A number of experiments were conducted with the goal of identifying 'virulence factors'. The genetic approaches used to identify virulence factors were biased by the search for genes that were unique to pathogenic bacteria rather than also identifying genes that were missing from pathogens. With the advent of whole-genome sequencing, bacterial pathogens were among the first organisms to be sequenced, with Haemophilus influenzae leading the way in 1995 [5]. The application of genomic methods to agents of infectious disease has led to considerable increases in our knowledge of bacterial pathogenesis. Indeed, genomics revealed unexpected evolutionary mechanisms that have transformed opportunistic or commensal bacteria into pathogens capable of invading and surviving in host tissues, resulting in disease. For example, comparison of genome sequences from Rickettsia conorii and Rickettsia prowazekii [6], and Mycobacterium leprae and Mycobacterium tuberculosis with other Mycobacterium spp. [7,8], showed that genomes from the most powerful killers of mankind have not increased in size by the acquisition of genes associated with virulence, but have instead been drastically reduced. Thus, gene acquisition and gene loss are complementary forces shaping the evolution of bacterial pathogenesis. The present review examines the current knowledge about pathogenicity in the light of Rickettsia, Mycobacteria, Shigella and Yersinia genomics.

PREGENOMIC ERA: VIRULENCE FACTORS AND ADDITIONAL GENES
Microbial pathogens are generally believed to have evolved from ancestral strains that were either free-living or commensal inhabitants of their hosts. Underlying this belief is the assumption that microbes acquired traits that enabled them to cause infectious disease. Many bacterial virulence determinants were found encoded as clusters of genes on the chromosomes of pathogenic bacteria [9]; such clusters are known as 'pathogenicity islands' (PAIs) [10,11]. PAIs encode a diversity of toxins that enable host invasion [12][13][14][15], such as the toxic shock syndrome toxin of Staphylococcus aureus and the toxins of Clostridium difficile and Bordetella pertussis [16][17][18][19]. PAIs of Gram-negative bacteria such as Yersinia spp., Salmonella spp., Escherichia spp. and Helicobacter pylori were found to encode types III and IV secretion systems that inject toxic effector molecules directly into host cells [15,20][21][22][23][24]. As PAIs encode such virulence functions, they were thought to have profoundly affected the emergence of pathogenicity [25] by being a prerequisite for the pathogenic lifestyle. Indeed, the induction of virulence factors was sufficient to convert a common soil bacterium into a parasite that can grow in the cytoplasm of a mammalian cell [26]. Deletion or site-directed mutagenesis of these virulence factors led to strains that were less pathogenic [27,28].
PAIs were first described as clusters of genes that are present on the chromosome of pathogenic bacteria but absent from nonpathogenic strains of the same or closely related species [28]; they were later also found on large virulence plasmids and in the genomes of bacteriophages. The enteroinvasive Escherichia coli (EIEC) carries virulence plasmids that encode toxins responsible for invasive enteric disease, a Shigella-like dysentery [29,30]. Interestingly, PAIs also carry genes that are associated with mobile genetic elements; examples include insertion sequences in the Shigella PAI SHI-1 and a cryptic integrase in the PAIs of the uropathogenic E. coli strain 536. These findings suggest that PAIs can be transferred via mobile genetic elements such as integrated plasmids, conjugative transposons or bacteriophages [31]. Consequently, the horizontal acquisition of virulence genes was proposed as a major mechanism in pathogen evolution. Horizontal transfer of PAIs is supported by their G + C content and codon usage, which differ from those of the core genome (e.g. the G + C content of PAIs in uropathogenic E. coli is 41%, well below the 51% mean G + C content of the genome). Broad-host-range plasmids that encode antimicrobial resistance genes illustrate the evolution of pathogenicity through horizontal gene acquisition [32][33][34]. Horizontal transfer of PAI genes is a powerful engine of evolution with the potential to radically alter the phenotypic profile of an organism in a single step, a process described as 'evolution in quantum leaps'.
In the pregenomic era, strains could be defined as pathogens if their genomes contained PAIs with uncharacteristic G + C content and codon usage, indicating acquisition through horizontal transfer [31]. These virulence traits were predicted to increase the fitness of their bacterial carriers, which would increase the transmission of such traits by enabling bacteria to colonize and survive in novel niches. However, it has been found that these genetic elements are also widespread among nonpathogenic microbes [31]. For example, in E. coli, PAIs coding for fimbrial adhesins can be detected in commensal strains of the normal intestinal microflora. Likewise, genes encoding siderophores were found in pathogenic Yersinia [35] and in nonpathogenic Salmonella [36]. These findings indicate that the evolution of pathogenicity may not strictly rely on the acquisition of canonical 'virulence factors' (e.g. toxins). They further call for a review of what defines a 'virulence factor', in order to eliminate factors that do not have a major impact on pathogenicity. Toxins seem to be the only factors with direct implications for pathogenicity.
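As a rough illustration of how regions with atypical G + C content can be screened for, the following Python sketch slides a window along a genome sequence and reports windows whose G + C fraction departs from the genome-wide mean. The function names, window size, step and deviation threshold are illustrative assumptions, not values taken from the studies cited above, and codon usage is not modeled.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def atypical_windows(genome: str, window: int = 5000, step: int = 2500,
                     min_deviation: float = 0.08):
    """Yield (start, end, gc) for windows whose G + C fraction differs
    from the genome-wide mean by at least min_deviation.

    window, step and min_deviation are hypothetical defaults chosen
    for illustration only.
    """
    genome_gc = gc_fraction(genome)
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        end = min(start + window, len(genome))
        gc = gc_fraction(genome[start:end])
        if abs(gc - genome_gc) >= min_deviation:
            yield start, end, gc
```

Windows flagged in this way are only candidates for horizontally acquired DNA; as noted above, such regions are also widespread among nonpathogenic microbes.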

GENOMIC ERA
The advent of whole-genome sequencing has facilitated progress in understanding pathogens and how to best treat their infections. Much genomic analysis has been performed with the avowed purpose of identifying virulence factors that transform harmless bacteria into organisms capable of invading and surviving in host tissues, often resulting in disease. However, the comparison of pathogenic bacteria with closely related nonpathogens revealed few or no additional genes in the pathogenic genomes.

Reductive evolution in Rickettsia spp.
Rickettsia are obligate intracellular alphaproteobacteria. The genus comprises highly pathogenic species, such as R. prowazekii, the arthropod-borne agent responsible for epidemic typhus [37], and the less pathogenic species Rickettsia typhi and R. conorii, which cause murine typhus and human Mediterranean spotted fever, respectively [38,39]. The genus presents a new paradigm that renders the previous concepts about pathogenicity outdated. Indeed, motility has been considered a virulence factor in invasive infections [40], and actin comet tail formation has been shown to be an important virulence determinant in Listeria spp. and Shigella spp. [41,42]. Interestingly, both R. conorii and R. typhi present an intracellular motility based on actin polymerization [43], whereas the more virulent R. prowazekii is nonmotile. RickA and Sca2 proteins are involved in actin-based motility in Rickettsia [44,45]. Rickettsia prowazekii neither contains rickA nor expresses a functional Sca2 protein, whereas rickA, Sca2 and actin-based motility are commonly found in other Rickettsia spp. without identified pathogenicity [45,46].
Whole-genome analysis of Rickettsia spp. did not identify virulence factors and even raised doubts about the existence of such factors (Figure 1). The genome of R. prowazekii (834 open reading frames [ORFs]) seems to be a subset of the larger R. conorii genome (1374 ORFs) [6], with few additional genes, none of which is directly identifiable as a virulence determinant [47]. Moreover, the proportion of genes of hypothetical function that could potentially be involved in pathogenicity is smaller in the genome of R. prowazekii than in the genome of R. conorii (13% versus 15%). Altogether, the genome of R. prowazekii seems not to contain genes obviously associated with pathogenicity. The nearly perfect colinearity between the two genomes allowed the clear identification of gene alterations and gene remnants in the R. conorii genome. Indeed, Rickettsia spp. seem to have lost many genes coincident with their emergence as pathogens [48][49][50]. Many genes coding for proteins that synthesize amino acids have been lost from the R. prowazekii genome [51]. Translation capacities were predicted to have decreased, and translation regulation factors were affected [48]. Despite its genome being highly reduced, R. prowazekii retained genes encoding recombination and repair proteins, which are most likely needed for protection from the host immune response [52]. Interestingly, it has been shown that the loss of essential genes that regulate the bacterial response to environmental changes [53] could be a possible mechanism for the development of pathogenicity in R. prowazekii and Rickettsia rickettsii, the agent of Rocky Mountain spotted fever [54]. Thus, Rickettsia illustrates a new mechanism for pathogenicity: when regulation of invasion, replication and transmission processes is altered, virulence can emerge.
Gene decay in pathogenic Mycobacterium spp.
Mycobacterium spp. are phenotypically diverse. The genus includes pathogens, such as M. leprae, M. tuberculosis and Mycobacterium ulcerans (the etiologic agents of leprosy, tuberculosis and Buruli ulcer, respectively), and the free-living environmental species Mycobacterium smegmatis and Mycobacterium marinum that rarely cause infection. Comparative genomic analyses revealed that pathogenic Mycobacterium species are undergoing reductive evolution, perhaps following their switch from an environmental to a host-adapted or obligate intracellular lifestyle. The genome of M. leprae is considerably reduced in size at 3.27 Mb and possesses a low G + C content of 57.8% compared with other Mycobacterium spp. Moreover, only 49.5% of the M. leprae genome was found to contain protein-coding sequences. The genome contains an exceptionally large number of genes that are homologous to functional genes in M. tuberculosis but that contain inactivating mutations leading to their pseudogenization [8,55]. Inactivation of genes into pseudogenes appears to be a major mechanism of genome downsizing in M. leprae as an obligate intracellular pathogen.
Gene deletion has been directly implicated in the emergence of pathogenicity (Figure 2). Comparative genomic analysis has revealed that M. ulcerans arose from M. marinum by horizontal transfer of a plasmid that carries a cluster of genes for mycolactone production, followed by extensive pseudogene formation, genome rearrangements and gene deletion [56,57]. Mycobacterium ulcerans lacks one of the major contributors to mycobacterial virulence, the ESX loci, which encode type VII secretion systems [58]. The ESX loci are required to mediate the export of protein families and specific effectors such as the 6 kDa early secretory antigenic target and the ESX-1 secretion-associated protein A [59]. Interestingly, the disruption of the mce1 operon in M. tuberculosis enhanced the virulence of the mutant strain and led to aberrant granuloma formation in mice [60]. The mce1 operon mutant was unable to stimulate the proinflammatory response by macrophages that would otherwise induce organized granuloma formation and control the infection without killing the organism. These observations are consistent with the suggestion that the mce1 operon products are associated with the induction of a proinflammatory response. The loss of the mce1 operon renders the bacterium more virulent, as the mutant killed mice more rapidly than did the wild-type strain.

Black holes in Shigella species and EIEC
Bacteria of the genus Shigella are Gram-negative rods that cause bacillary dysentery. The genetic mapping and chromosomal sequences of E. coli K-12 and Shigella spp. demonstrate a high degree of colinearity, with the organization and arrangement of the majority of genes being identical, and the genomes are >90% homologous [61]. Nevertheless, the pathogenicity and epidemiology of E. coli and Shigella are radically different. Indeed, E. coli, with the exception of a few pathogenic clones, are commensals of the human intestine. Evolutionary studies demonstrated that Shigella are a group of pathogenic E. coli [62] (Figure 3). It has been assumed that Shigella traits were gained via horizontal acquisition of a virulence plasmid [63]. However, Shigella strains have been distinguished from E. coli by chromosome-associated characteristics, such as lysine decarboxylase (LDC) activity, which is lacking in Shigella but present in >90% of E. coli strains [64,65]. Introduction of the gene encoding LDC, cadA, into Shigella flexneri produced a transformant, S. flexneri BS529 (pCAD+), that was able to express LDC activity; however, this strain had significantly reduced activity of enterotoxins. The enterotoxin inhibitor was identified as cadaverine, a product of the reaction catalyzed by LDC. Overall, induction of LDC (achieved by introducing the cadA gene) attenuated the virulence of this transformed S. flexneri strain.
Figure 3: Shigella evolved by the acquisition of a plasmid (represented by an empty circle) and PAIs. A major step in the evolution of Shigella was the loss of antivirulence genes that enhanced its pathogenicity. Likewise, the pathogenicities of the enterohemorrhagic (EHEC), uropathogenic (UPEC), enteropathogenic (EPEC) and enterotoxigenic (ETEC) E. coli have been enhanced by the loss of antivirulence genes, for example, melA that encodes alpha-galactosidase activity and yjcV that encodes a permease protein for the D-allose transport system.
Comparison of the S. flexneri 2a and laboratory E. coli K-12 genomes revealed a large deletion encompassing the LDC gene locus in Shigella. Maurelli et al. defined a mechanism whereby virulent Shigella spp. and EIEC strains enhanced the activity of their enterotoxins by deleting cadA and thus not producing cadaverine. These findings suggest that the creation of 'black holes' (genome deletions) is a pathway that complements gene acquisition in the evolution of bacterial pathogens. Thus, the lack of LDC activity in Shigella spp. and EIEC is an example of the enhancement of virulence potential by gene loss. The cadA gene may be an 'antivirulence' gene for Shigella according to the definition of 'pathoadaptive mutation' via gene loss [66][67][68]. Another example of virulence enhancement by gene loss concerns the cryptic prophage DLP12, which is present in nonpathogenic E. coli strains and absent in Shigella spp. The prophage encodes a protease activity that degrades the Shigella outer membrane protein VirG, a virulence factor encoded on the large plasmid that is crucial for the pathogen to invade host cells and spread. The most plausible scenario is that Shigella evolved from the E. coli complex via the acquisition of a plasmid containing genes essential for cell invasion. Massive gene deletions followed, thereby increasing its virulence. Therefore, Shigella, like other pathogenic bacteria, has no more virulence genes than closely related nonpathogenic bacteria. The formation of these 'black holes', deletions of genes that are detrimental to a pathogenic lifestyle, provides an evolutionary pathway that enables a pathogen to enhance virulence.

Antivirulence factors of Yersinia pestis
Yersinia pestis, the causative agent of enzootic plague, has been responsible for devastating epidemics throughout history [69,70]. Yersinia pestis is very closely related to the foodborne pathogen Yersinia pseudotuberculosis on the basis of DNA hybridization methods and sequence identity of 16S ribosomal RNA (99.7% identity) and housekeeping genes [71]. Although almost genetically identical, the two organisms produce markedly different diseases, a difference that has been associated with lineage-specific acquisition of PAIs and virulence factors. Many Y. pestis-specific chromosomal regions exhibit features of PAIs [72,73], including the iron transport system encoded by the yersiniabactin region of the high-pathogenicity island [35,74]. Yersinia pestis was found to possess two additional virulence-associated plasmids, pPCP1 and pMT1, that carry genes required for flea-borne transmission and hematogenous dissemination: the type III secretion system [75], the plasminogen activator and the murine toxin [76]. Therefore, the acquisition of additional genes and subsequent gain of function contributed significantly to the evolution of high virulence in Y. pestis [77].
Genomic approaches have shown that loss of function, via gene loss and pseudogenization, has also played a significant role in the evolution of Y. pestis [73,78]. Yersinia pestis has lost genes that repress biofilm synthesis (rcsA) and enhance biofilm degradation (nghA); expression of these genes in Y. pestis inhibited biofilm formation in the flea foregut [79,80]. These experiments suggest that Y. pestis has adapted to its flea vector by losing genes whose products reduce biofilm formation. These genes are considered antivirulence genes because their inactivation contributed to efficient flea-borne transmission. Yersinia pestis has also lost the lpxL gene, which mediates the acylation of lipid A on bacterial LPS, a modification that activates Toll-like receptor 4 [81]. The expression of a functional lpxL gene in Y. pestis induced an appropriate immune response in a mouse model without any sign of disease, whereas the wild-type strain lacking lpxL resulted in 100% mortality, indicating that loss of lpxL enables the pathogen to evade the host innate immune defense. These genome comparisons thus show that the genome of Y. pestis lacks many antivirulence genes that, if present, would interfere with host infection.

MASSIVE COMPARATIVE GENOMICS
The development of modern sequencing technology has resulted in an exponential increase in the number of available genome sequences, offering an unprecedented opportunity to elucidate the mechanisms underlying bacterial specialization and the development of pathogenicity. The massive comparative analysis of 317 bacterial genomes revealed that the genomes of pathogenic bacteria tend to be smaller than those of related free-living nonpathogenic bacteria [82]. Obligate intracellular bacteria, including those associated with disease, share numerous genomic characteristics despite their distant phylogenetic relatedness. Gene degradation has been shown to be a common feature of obligate intracellular bacteria. Indeed, when bacterial lineages make the transition from a free-living or facultative host-associated lifestyle to a lifestyle of permanent host association (endosymbiotic and parasitic lifestyles), they pass through a population bottleneck that results in genome downsizing, low G + C content and few and disrupted ribosomal RNA operons [82]. Moreover, the genomes exhibit an accumulation of pseudogenes, random gene loss and the loss of transcriptional regulators. Patterns that had previously been observed in a few pathogens could now be further tested by comparisons between a large number of bacterial genomes with different lifestyles. Pathogenic species from different phyla display marked similarities in patterns of gene content as a result of convergent evolution [82]. Comparative genomics provided evidence for the reductive evolution of obligate intracellular pathogens and introduced gene loss as a complementary evolutionary pathway to gene acquisition and gain of function (Figure 4). These 'pathoadaptive mutations' are selected because they maximize fitness in the host niche [68], possibly at the expense of fitness in the ancestral niche [4].
Indeed, gene loss seems to confer a fitness advantage in the new host environment, and the loss of transcriptional regulators seems to have triggered the development of pathogenicity [82].

PAIRWISE COMPARISON AND PATHOGENICITY
Specific features of pathogenicity were unraveled, and virulence determinants were recognized, by comparing genomes from pairs of closely related species with different effects on hosts (e.g. highly pathogenic bacteria versus bacteria of low pathogenicity) [83]. Genetic changes that are strong candidates for pathoadaptive mutations (including gene loss events) would be those that underlie traits absent in pathogenic species but commonly expressed in closely related commensals. Analysis of gene repertoires revealed that the majority of genes in highly pathogenic bacteria have orthologs in closely related bacteria that are less pathogenic. Rickettsia prowazekii had the highest proportion of reciprocal best-match genes, with 94% of its genes matching those in Rickettsia africae, compared with only 10% for Treponema denticola and Treponema pallidum. Genomes of highly pathogenic bacteria were enriched for genes that encode replication, recombination and repair functions, whereas these genomes were depleted in genes encoding metabolic functions and transcriptional regulators when compared with closely related bacteria (Figures 5 and 6). The loss of these genes may be possible because the host can provide many nutrients, but their loss may also enhance virulence, as in Shigella. Highly pathogenic bacteria exhibit a mean of 49 ± 45 genes that have no BLAST hit in the closely related congeneric species, compared with 87 ± 73 in bacteria of low pathogenicity (Figure 7). These findings suggest that the core genome of the genus includes most genes in highly pathogenic bacteria, whereas the pan-genome is expanded by gene repertoires of the bacteria with low pathogenicity. The genomes of highly pathogenic bacteria are usually subsets of those of less pathogenic and commensal species.
The lower proportion of lineage-specific genes in the genomes of the highly pathogenic bacteria indicates that genes are not acquired via horizontal gene transfer as often as in bacteria with low pathogenicity. Finally, less pathogenic bacteria have quantitatively more plasmids than close relatives that are highly pathogenic (Figure 7).
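The two gene-repertoire measures used above (reciprocal best matches between a pair of genomes, and genes with no hit in the relative) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the input is taken to be a list of (query, subject, bit score) tuples already parsed from a similarity search in each direction, and the function names are hypothetical; the thresholds and exact pipeline of the cited analysis [83] are not reproduced here.

```python
def best_hits(hits):
    """Map each query gene to its highest-scoring subject gene.

    hits: list of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


def genes_without_hit(genes_a, a_vs_b):
    """Genes of genome A that have no hit at all in genome B."""
    queries_with_hit = {query for query, _, _ in a_vs_b}
    return sorted(set(genes_a) - queries_with_hit)
```

Dividing the number of reciprocal best-hit pairs by the number of genes in a genome gives a proportion comparable in spirit to the percentages quoted above, while the genes without any hit approximate the lineage-specific set.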
Comparisons of the predicted lengths of proteins encoded by lineage-specific genes illuminated the dynamics of gene loss. The studied genomes display marked similarities in the frequency distribution of protein length. The mean length of ORFs was 922 ± 47 bp in the genomes of bacteria of low pathogenicity versus 944 ± 70 bp in the highly pathogenic species. The ORF lengths depicted in Figure 8 showed no significant difference between the two groups. About 2.7% of the ORFs in highly pathogenic bacteria are >2500 bp compared with 2.2% in the genomes of bacteria of low pathogenicity. ORFs with lengths greater than 2500 bp are mainly involved in replication, recombination and repair, and in secondary metabolite biosynthesis, transport and catabolism, including the polyketide synthases. This analysis shows that the loss of genes does not depend on protein length (both long and short genes can be lost) but rather on selection pressures unique to the environmental niche. These data highlight massive genome reduction in pathogenic bacteria and suggest that gene loss is probably dependent on function and not length.
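The summary statistics quoted in this paragraph (mean ORF length with its spread, and the share of ORFs longer than 2500 bp) are simple to compute from a list of ORF lengths. The Python sketch below is illustrative only: the function name and dictionary keys are assumptions, and it summarizes a single list of lengths, whereas the published figures may have been averaged across genomes.

```python
from statistics import mean, stdev


def orf_length_summary(lengths_bp, long_cutoff=2500):
    """Summarize a non-empty list of ORF lengths given in base pairs.

    Returns the mean, the sample standard deviation and the fraction
    of ORFs strictly longer than long_cutoff.
    """
    n = len(lengths_bp)
    return {
        "mean_bp": mean(lengths_bp),
        "sd_bp": stdev(lengths_bp) if n > 1 else 0.0,
        "fraction_long": sum(1 for l in lengths_bp if l > long_cutoff) / n,
    }
```

Running the same summary on the ORFs of a highly pathogenic genome and on those of a less pathogenic relative gives the two sets of values to compare; a formal test of the difference is outside the scope of this sketch.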

TOXINS AND TA MODULES: THE PLAUSIBLE VIRULENCE-ASSOCIATED PROTEINS
The expression of toxin genes constitutes an evolutionary milestone in the development of pathogenicity. These genes can be located on the bacterial chromosome, plasmids or phages. For example, the diphtheria toxin produced by C. diphtheriae is encoded by a temperate phage [84]. The injection of animals with sterile filtrates from liquid C. diphtheriae cultures caused death with a pattern of lesions characteristic of diphtheria [2]. Likewise, tetanus and botulism are caused by extracellular neurotoxins that are produced by the spore-forming bacteria Clostridium tetani and Clostridium botulinum, respectively [85,86]. These toxins are zinc-dependent metalloproteinases that catalyze the cleavage of specific protein targets involved in the release of neurotransmitters [87]. The tetanus toxin blocks the release of the inhibitory neurotransmitters glycine and gamma-amino-butyric acid in the central nervous system. This leaves excitatory nerve impulses unopposed and results in muscle spasms. The botulinum toxin blocks the release of acetylcholine from the presynaptic axonal terminals of the motor endplate, thereby rendering the muscle unable to contract [88]. These toxins are among the most potent poisons known. Administration of minute quantities of botulinum or tetanus neurotoxin is sufficient to cause death from relentless descending paralysis or intense global muscle spasm, respectively. Altogether, toxins are macromolecules that, when introduced into an organism, can impair physiological functions and thereby lead to disease or death; these macromolecules can often be modified to generate a protective immune response.
TA loci likely play a major role in the development of pathogenicity. Comparison of the most dangerous pandemic bacteria with their closest non-epidemic relatives showed an abundance of TA loci in the bacterial pathogens despite the convergent reductive evolution of these genomes [83,89] (Figure 9). TA loci were initially identified as plasmid stabilization factors [90]. They code for two components: a toxin that inhibits cell growth and an antitoxin that contains a DNA-binding motif and autoregulates transcription of the TA operon [91]. Highly pathogenic bacteria were found to harbor these loci on the chromosome [89,92]. Many of these bacteria, including M. tuberculosis, M. leprae and Streptococcus pyogenes, have a slow growth rate, which is associated with few and disrupted ribosomal RNA operons [82]. They generate persistent or recurrent infections that resist antibiotics. Interestingly, experiments with the model organism E. coli have shown that the TA loci are required for persistence, and that increased persistence is associated with a decreased growth rate of a bacterial culture [93]. These findings raise the possibility that TA genes contribute to pathogenicity by enabling cells to persist in the presence of antibiotics and to resist environmental and nutritional stresses. Indeed, studies on Rickettsia showed that the persistence of pathogenic bacteria also depends on TA loci [94,95]. Exposure to chloramphenicol resulted in the death of Rickettsia and the release of VapC toxin into the cytoplasm of infected host cells, leading to the formation of lytic plaques. Further studies are needed to test the general function of TA loci in the development of pathogenicity. The presence of TA loci in the reduced genomes of pathogens will enable a better understanding of the mechanism of pathogenicity and the identification of novel antimicrobial agents.

PATHOGEN DETECTION
The new strategy of process biology, which integrates genomics and modern informatics tools, changed the way we view reality and helped to elaborate unprecedented concepts about pathogenicity. Indeed, the full appreciation of the richness associated with high-throughput sequencing requires new techniques to harness large-scale data at the DNA level. Therefore, significant progress has been made in informatics to improve the analysis of genomic data and to bring order to its bewildering complexity. The development of analysis tools for genome assembly, whole-genome alignment, ORF detection, functional annotation and whole-genome comparison led to major findings that would have remained largely overlooked without these methods. New advances from genomics compel researchers to adopt a critical, comprehensive approach to identifying microbial pathogens. The modern approach to detecting pathogens consists of identifying gene repertoires associated with virulence rather than looking for additional 'virulence factors'. Therefore, a bacterium is likely to be pathogenic if it has a tiny genome with low G + C content, few and/or disrupted rRNA operons, toxin or TA operons, and a large part of the genome encoding replication functions at the expense of transcriptional regulation. Genome comparison of these species with their phylogenetically related, less pathogenic bacteria can help to detect antivirulence genes. Finally, transcriptome profiling provides information on differential expression of genes and allows functional predictions to be made.
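The repertoire-based criteria listed above can be written down as a simple checklist that compares a candidate genome with a less pathogenic relative. The Python sketch below is a hedged illustration only: the class, field and function names are hypothetical, the comparisons are relative rather than based on published cut-offs (the text gives none), and the output is a list of criteria met, not a validated prediction of pathogenicity.

```python
from dataclasses import dataclass


@dataclass
class GenomeProfile:
    """Summary features of one annotated genome (hypothetical fields)."""
    size_mb: float
    gc_fraction: float
    rrna_operons: int
    toxin_or_ta_loci: int
    replication_repair_fraction: float
    transcription_regulator_fraction: float


def repertoire_flags(candidate: GenomeProfile, relative: GenomeProfile):
    """List the repertoire criteria that the candidate meets when
    compared with a closely related, less pathogenic genome."""
    checks = {
        "smaller genome": candidate.size_mb < relative.size_mb,
        "lower G + C content": candidate.gc_fraction < relative.gc_fraction,
        "fewer rRNA operons": candidate.rrna_operons < relative.rrna_operons,
        "toxin or TA loci present": candidate.toxin_or_ta_loci > 0,
        "more replication/repair genes":
            candidate.replication_repair_fraction
            > relative.replication_repair_fraction,
        "fewer transcriptional regulators":
            candidate.transcription_regulator_fraction
            < relative.transcription_regulator_fraction,
    }
    return [name for name, met in checks.items() if met]
```

Disrupted rRNA operons and antivirulence genes are not captured by these counts; detecting them requires the gene-level comparison with the related genome described above.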

CONCLUSION
Comparative genomics has led to a dramatic revision of the paradigm of virulence by showing that most highly pathogenic bacteria have small genomes. Alongside microbial evolution by gene acquisition, genomic studies report the existence of a complementary but inverse pathway that may enable a low pathogenic bacterium to evolve toward a highly pathogenic lifestyle. Pathoadaptive mutations include the deletion of genes to form 'black holes'. Indeed, the emergence of high pathogenicity is driven by gene loss, which is preceded by the horizontal acquisition of PAIs. Genomics holds the potential for elucidating the mechanisms of pathogenicity and for discovery of new antimicrobial therapies.
Figure 9: TA modules. Hierarchical clustering of TA module content as identified in Georgiades and Raoult [83]. Genes and bacteria were clustered using an average linkage method (MeV program). The colour scale represents the count of TA modules. Some highly pathogenic bacteria cluster together, rather than with their close relatives.

Key Points
The genetic approaches used to elucidate the mechanism of pathogenicity were biased by the search for 'virulence factors' that were unique to pathogenic bacteria. Comparative genomics revealed that genomes of the most pathogenic bacteria have undergone a reductive evolution. Antivirulence genes gave insights into the evolution of highly pathogenic bacteria via gene loss. Toxins and TA modules constitute the plausible virulence-associated proteins. Pathogenic bacteria have a genomic repertoire of virulence characterized by a tiny genome with low G + C content and few and/or disrupted rRNA operons. This repertoire contains toxins and TA modules and a large proportion of genes encoding replication functions at the expense of transcriptional regulation.