All the physical linkage groups constituting the genome of Leishmania infantum have been identified for the first time by hybridization of specific DNA probes to pulsed field gradient-separated chromosomes. The numerous co-migrating chromosomes were individualised using the distinctive size polymorphisms which occur among strains of the L.infantum/L.donovani complex as a tool. A total of 244 probes, consisting of 41 known genes, 66 expressed sequence tags (ESTs) and 137 anonymous DNA sequences, were assigned to a specific linkage group. We show that this genome comprises 36 chromosomes ranging in size from 0.35 to ∼3 Mb. This information enabled us to compare the genome structure of L.infantum with those of the three other main Leishmania species that infect man in the Old World, L.major, L.tropica and L.aethiopica. The linkage groups were consistently conserved in all species examined. This result is in striking contrast to the large genetic distances that separate these species and suggests that conservation of the chromosome structure may be critical for this human pathogen. Finally, the high density of markers obtained during the present study (with a mean of 1 marker/130 kb) will speed up the construction of a detailed physical map that would facilitate the genetic analysis of this parasite, for which no classical genetics is available.
The protozoan parasite Leishmania causes a broad spectrum of diseases affecting ∼15 million people, mainly in tropical and subtropical areas. Increased incidence of the disease leishmaniasis can be related to various causes, including HIV/Leishmania co-infection. The genetic study of this parasite has long been hampered by the lack of any manipulable recombination system. The construction of a physical map for the Leishmania genome and the localization on this map of a large set of markers would be of great help for the definition of chromosome structure and subsequently for the study of the genetics of this parasite. As the chromosomes of this organism do not condense significantly during its life cycle, they could only be separated with the advent of pulsed field gel electrophoresis (PFGE).
Attempts at separating all the chromosomes of Leishmania by PFGE have only been partially successful, due to a lack of knowledge about chromosome number and ploidy and also because of the concentration of numerous chromosomal bands in a limited size class interval. Due to these limitations, estimates of the chromosome number of different Leishmania species have varied from 23 to 96 (1–4), giving estimated genome sizes between 23 and 134 Mb. This confusion was also due to a lack of comparative mapping analysis between different Leishmania species, which vary widely in geographic distribution and pathogenic behaviour. The nucleotide divergence between the main species of this genus has been estimated at 13–25% (5). This corresponds to a long evolutionary time and questions the degree of synteny conservation among different species.
The structure of the chromosomes of Leishmania seems comparable with that of other protozoa, with a central core of conserved single or low copy sequences and long stretches of subtelomeric and telomeric repeated sequences (6). These subtelomeric regions are responsible for size variations between homologous chromosomes, which appear so frequent that virtually every Leishmania isolate displays a distinctive karyotype (7). Therefore, the identification of each chromosome relies upon obtaining a specific set of single copy markers. Such markers have recently been developed for the six smallest chromosomes of L.infantum (8). They allowed a comparison of the organization of these chromosomes between distant Leishmania species. Surprisingly, this analysis revealed a conservation of the linkage groups for these six chromosomes, without any evident rearrangement (8). It remained to be investigated whether the results obtained for these chromosomes, which represent only ∼6% of the total nuclear DNA, were applicable to the whole genome.
In this study we used the systematic hybridization of specific markers onto the molecular karyotypes to identify all the chromosomes of Leishmania. Besides their usefulness in future mapping analyses of individual chromosomes, these markers allowed us to compare the chromosome contents of different species of this parasite.
Materials and Methods
The Leishmania strains used for the definition of the linkage groups in the L.donovani complex were: L.infantum LEM1317, a clone of LEM356 (MHOM/FR/82/LEM356), LEM1163, a clone of strain LEM75 (MHOM/FR/78/LEM75), LEM1284, a clone of LEM251 (MCAN/FR/81/LEM251), and LEM267 (MCAN/FR/81/LEM267); L.donovani LEM1651, a clone of LEM138 (MHOM/IN/00/DEVI), and LEM1448, a clone of LEM536 (MHOM/SA/81/JEDDAH-KA). The inter-species comparison was made with: L.major LEM1958, a clone of LEM62 (MHOM/YEM/76/LEM62), and LV39-Clone5 (MRHO/SU/00/LV39-CL5); L.tropica LEM1661, a clone of LEM579 (MHOM/GR/00/LEM579), and LEM1909, a clone of LEM408 (MHOM/AF/82/K0061); L.aethiopica LEM1660, a clone of LEM144 (MHOM/ET/72/L100). All strains were cultivated on blood agar (NNN) medium supplemented with RPMI 1640 (Gibco). Their identity was checked by the examination of isoenzyme profiles (15 isoenzyme systems) immediately prior to this study, at the reference Laboratoire d'Ecologie Médicale et Pathologie Parasitaire in Montpellier.
PFGE and hybridizations
Chromosomal DNA agarose blocks were prepared and processed for PFGE on home-made devices as described previously (7). PFGE was performed at 15°C in 1 or 1.5% agarose gels with 0.5× TBE (1× TBE = 89 mM Tris-HCl, 89 mM boric acid, 2 mM EDTA, pH 7.4) running buffer. The voltages and pulse conditions necessary to obtain a fine resolution for every chromosomal size class are described in the legend to Figure 1. The size markers were chromosomes from Saccharomyces cerevisiae strain AB1380, Hansenula wingei (BioRad) or Schizosaccharomyces pombe (BioRad). After staining with ethidium bromide, gels were blotted onto nylon filters (Hybond N+; Amersham) by alkaline transfer (9). The probes were labelled by random primed synthesis and hybridised as described previously (8).
Three different sources of probes were used to establish the linkage groups. First, we constructed a series of genomic libraries consisting of pBluescript vectors carrying inserts of small size (100–500 bp), in order to limit the presence of repeated sequences. The clones from these libraries were called ISA (for the libraries made from L.infantum LEM75 DNA) or ST (for the libraries made from L.infantum LEM1317 or LEM189). The source DNA was either the total nuclear content or gel-purified chromosomes. Gel-purified chromosomes were used either to avoid under-representation of the smallest chromosomes in a total DNA library or to increase the number of markers in a specific size class. A total of 10 different libraries were used, made from the six smallest chromosomes (ST1–ST6 series), a mixture of the eight intermediate sized chromosomal bands (ST7–ST9 series) and the total DNA (STP and ISA series). To avoid repeated cloning of the same inserts, these libraries were made with a series of different restriction enzymes (PstI, HinpII, TaqI, HhaI and ClaI) and a limited number of clones were isolated from each library. Either PCR-amplified products or inserts obtained from restriction digests were used for labelling.
A second source of probes was ESTs sequenced from an L.major cDNA library made in λZAP and provided by J. Ajioka and J. Blackwell (Cambridge University, UK; M. Levick et al., submitted for publication). The inserts were purified after PCR amplification with M13/M13 reverse primers.
Finally, we used 41 gene probes isolated in different laboratories from different Leishmania species. The corresponding genes are listed below in order of increasing size of the identified chromosomes: L.major mini-exon (10); L.major actin (D. F. Smith, Imperial College, London, UK, unpublished data); minisatellite sequence LiSTIR1 (11); the B729 fragment of dihydrofolate reductase-thymidylate synthase (DHFR-TS) from L.major (12); the L.major β-tubulin gene family (13); L.mexicana cysteine protease lmcpb (14); L.donovani phosphoribosyl pyrophosphate synthetase (PRS) (15); surface protease gp63 from L.major (16); histone H2B from L.enriettii (17); L.major surface antigen PSA-2 (18); L.donovani ornithine decarboxylase (ODC) (19); α-tubulin from L.enriettii (20); L.chagasi kinesin (21); L.major developmentally regulated cDNAs 2, 7, 14 and 16 (22); L.donovani S-adenosylhomocysteine hydrolase (SAH) (23); L.donovani inosine 5′-monophosphate dehydrogenase (IMPDH) (24); L.mexicana cysteine protease lmcpa (25); L.infantum histone H2A (26); L.mexicana CDC2-related kinase lmmcrk1 (27); hypoxanthine-guanine phosphoribosyl transferase (HGPRT) from L.donovani (28); the L.tarentolae H region implicated in drug resistance (29); L.major ribosomal protein S8 (30); the L.major homolog of yeast Silent Regulator 2 (hSIR2) (31); L.major heat shock protein 70-related gene hsp70.4 (32); adenine phosphoribosyl transferase (APRT) from L.donovani (33); the Trypanosoma brucei ribosomal DNA cluster (34); Drosophila melanogaster hsp70 (35); L.mexicana cysteine protease lmcpc (36); the L.major homolog of hsp100 (clpB) (37); L.major hsp70-related hsp70.1 (38); L.donovani S-adenosylmethionine decarboxylase (SAMdc) (B. Ullman, unpublished results); L.amazonensis hsp83 (39); the P-glycoprotein implicated in multi-drug resistance (MDR) from L.donovani (40); the E13 fragment from the frequently amplified LD1/CD1 region from L.donovani (41); L.mexicana pyruvate kinase (PK) (42); L.enriettii Pro-1 glucose transporter (43); L.amazonensis N-acetylglucosamine 1-phosphate transferase (NAGT) (44); L.mexicana CDC2-related kinase lmmcrk3, homologous to the T.brucei tbcrk3 gene (45).
The complete molecular karyotype of L.infantum LEM1317
Clone LEM1317 of L.infantum was previously selected for these studies because of its relatively simple karyotype. To gain the maximal information from the PFGE separation of its chromosomes we had to use four different running conditions (Fig. 1). A total of 25 chromosomal bands, ranging in size from 0.35 to ∼3 Mb, were resolved with very variable staining intensities, suggesting the presence of more than one chromosome in many bands, as well as of different sized homologues of the same chromosome. The number of bands may be approximate, since some wide bands may be more or less resolved according to the electrophoretic conditions. Two facts indicate that no chromosome of a larger size exists in LEM1317. First, under the conditions shown in Figure 1D the S.pombe chromosomes were able to enter the gel, with the 3.5 Mb chromosome 3 being separated from the two others (data not shown) and no compression zone was seen in this region for LEM1317. Second, we never found any DNA probe tested against the LEM1317 karyotype that would hybridise only to the gel slot. We concluded that the 25 bands observed under these four electrophoretic conditions contain all the nuclear DNA molecules of L.infantum.
Use of inter-strain polymorphisms to define linkage groups
We have previously defined six physical linkage groups for the six smallest chromosomes of Old World Leishmania (8). This was possible because of the clear separation of the six chromosomes in most strains. This approach proved impracticable above 550 kb, as the patterns observed were more complex in this size class, probably due to the presence of many chromosomes of a similar size. The use of densitometric measurements to estimate the number of chromosomes per band was also impossible, because of the size variations between homologous chromosomes in the same strain (6) and of the as yet uncertain ploidy of Leishmania. To define the chromosomal contents of each band in the LEM1317 karyotype we selected five strains (three of L.infantum and two of the closely related species L.donovani) previously shown to display highly polymorphic karyotypes due to widely divergently sized homologues in all size classes (7). In all cases previously analysed these size variations were attributed to amplification/deletion mechanisms and not to inter-chromosomal rearrangements (6,8; our unpublished results). In none of these strains was the karyotype clearer than in LEM1317 (Fig. 2). The chromosome size polymorphisms were used here as a tool, allowing identification of each chromosome from its distinctive inter-strain hybridization pattern with specific probes. An example is shown in Figure 2, where a chromosomal band of 850 kb in LEM1317 was resolved as three different patterns corresponding in the final numbering to chromosomes 21, 22 and 23. The pairs of different sized homologues observed in some strains both hybridized with all the probes defining the corresponding linkage groups.
A total of 244 probes were hybridised on these six karyotypes, consisting of 41 genes, 66 ESTs and 137 anonymous DNA markers. Using this method we identified 36 different physical linkage groups (Table 1). All these chromosomes were clearly identifiable by their specific hybridization patterns on the six karyotypes. One exception was chromosomes 32 and 33, which remain closely associated in all six strains. The fact that they constitute independent linkage groups was established using polymorphic L.braziliensis strains, where these chromosomes migrate at close but different positions (data not shown). Subtle but consistent differences between these two chromosomes were also visible in L.tropica and L.aethiopica strains (data not shown). The existence of additional unidentified chromosomes is very unlikely for two reasons: first, individual chromosomes in the karyotype always consist of a band with a low staining intensity and all the intensely staining bands have been resolved using a high number of probes (e.g. 30 for the band containing chromosomes 21, 22 and 23); second, no band remained unidentified in the karyotypes of the six polymorphic L.infantum/L.donovani strains used for definition of the linkage groups (not shown).
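The logic of resolving co-migrating chromosomes through inter-strain polymorphism can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the procedure actually used: the probe names and all band sizes are invented (only the 850 kb band in LEM1317 comes from the text), each probe is assumed to hit exactly one band per strain, and band sizes are assumed to be called without measurement error.

```python
from collections import defaultdict

# Hypothetical data: band size (kb) hit by each probe in three strains,
# in the order LEM1317, LEM1163, LEM1284. All values are invented for
# illustration, except that every probe hits the 850 kb band in LEM1317.
hybridization = {
    "probeA": (850, 900, 820),
    "probeB": (850, 900, 820),
    "probeC": (850, 780, 870),
    "probeD": (850, 780, 870),
    "probeE": (850, 950, 1000),
}

def group_by_pattern(patterns):
    """Group probes that share an identical hybridization pattern."""
    groups = defaultdict(list)
    for probe, pattern in patterns.items():
        groups[pattern].append(probe)
    return sorted(sorted(members) for members in groups.values())

# Using all strains, the single 850 kb band splits into three groups.
linkage_groups = group_by_pattern(hybridization)

# Using LEM1317 alone, every probe appears to lie on one "chromosome".
single_strain = group_by_pattern(
    {probe: (sizes[0],) for probe, sizes in hybridization.items()}
)

print(linkage_groups)
print(single_strain)
```

With real data, a pattern comparison would also have to tolerate sizing error and accommodate strains in which two different sized homologues both hybridize with the same probe.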
These results also enabled us to establish the haploid genome size of L.infantum by summing up the sizes of the 36 heterologous chromosomes in LEM1317: this gave a value of 35.5 Mb.
Comparison of the L.infantum linkage groups among pathogenic species
In order to compare the physical linkage groups across different species, we selected representative cloned strains of the species L.tropica, L.major and L.aethiopica. We chose two L.major and two L.tropica strains with polymorphic karyotypes and a single L.aethiopica strain (Fig. 3). Of the 244 loci defined in L.infantum, 241 gave a good hybridisation signal on these three species, the difference arising from three probes with a low level of inter-species sequence conservation (indicated in italics in Table 1). We observed that all probes belonging to a specific chromosome defined in L.infantum remained linked to the same chromosomal band in these three species. Moreover, specific size variations of most chromosomes were observed among the two L.major and between the two L.tropica strains (Fig. 3B). This enabled us to conclude that the linkage groups defined in each band correspond to specific chromosomes. Although this comparison was not made between L.aethiopica isolates, it seems highly unlikely that the intensely staining bands containing more than one linkage group in this species may be the product of an inter-chromosomal sequence exchange giving rise to two chromosomes of exactly the same size. We concluded that the L.infantum linkage groups were conserved in these four species. Figure 4 shows a diagrammatic representation of the molecular karyotypes of one representative clone of each of these four species, with localization of each of the 36 linkage groups. The extensive inter-strain chromosomal size polymorphisms should be noted, as well as the shuffling in the chromosome order to which they lead. The haploid genome size estimates were comparable in all species examined (35.7 Mb for LEM1958, 36.5 Mb for LEM1909 and 35 Mb for LEM1660).
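The conservation test described above amounts to checking that all probes assigned to one L.infantum chromosome hit the same band in a strain of another species. A minimal Python sketch follows, with hypothetical probe names and band sizes, and assuming one band per probe (different sized homologues would require a set of bands per probe).

```python
from collections import defaultdict

def broken_linkage_groups(chromosome_of, band_of):
    """Return reference chromosomes whose probes hit more than one band."""
    bands = defaultdict(set)
    for probe, chromosome in chromosome_of.items():
        if probe in band_of:  # probes without a signal are skipped
            bands[chromosome].add(band_of[probe])
    return sorted(c for c, hit in bands.items() if len(hit) > 1)

# Hypothetical assignment of probes to L.infantum chromosomes.
chromosome_of = {"p1": 21, "p2": 21, "p3": 22, "p4": 22, "p5": 23}

# Hypothetical band sizes (kb) in a strain of another species.
# Chromosomes 21 and 22 co-migrate here, which does not break linkage.
conserved = {"p1": 900, "p2": 900, "p3": 900, "p4": 900, "p5": 1020}

# Same strain, but with probe p4 moved to another band, as a
# translocation would produce.
rearranged = dict(conserved, p4=1020)

print(broken_linkage_groups(chromosome_of, conserved))
print(broken_linkage_groups(chromosome_of, rearranged))
```

As in the text, co-migration of two linkage groups in one band cannot by itself be distinguished from a single chromosome; that distinction relies on comparing polymorphic strains of the same species.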
For the first time, we have resolved the complete karyotype of several species of Leishmania into 36 chromosomes using specific single or low copy DNA probes. This enabled us to define unambiguously the haploid genome size of this organism as 36 ± 1 Mb (depending on the strains considered). The approach we followed made use of the size variations that affect homologous chromosomes in different strains of the same species. This proved essential for the correct assignment of probes to a specific linkage group. Many imprecise localizations were noted when we compared our data with published results. For example, the existence in an L.tropica strain of two widely different sized bands for the PSA-2/gp46 antigen family was interpreted as suggesting the presence of these genes on two heterologous chromosomes (46). We actually found that they mapped on a single chromosome (no. 12 in Table 1) whose linkage group is perfectly conserved but which varies greatly in size, so that many strains contain two different versions of it. Another example is the localization of hsp70, a β-tubulin cluster, a histone H2A cluster and the ribosomal DNA region on a single 1.3 Mb chromosome in L.infantum (47). We found that they mapped to four different linkage groups of similar sizes (chromosomes 27, 28, 29 and 32). Thus the use of a single strain to identify a linkage group may lead to the confusion of a chromosome with a chromosomal band or of two homologues with two different chromosomes. It can be expected that the availability of a set of probes for each of the 36 chromosomes identified in this study and of reference karyotypes will allow a definitive chromosomal assignment for any new locus to be cloned.
The method used in the present work did not allow consideration of the chromosomal regions containing highly dispersed repetitive DNA present on many chromosomes. Nevertheless, we were able to map repeated loci when they were present on two to four heterologous chromosomes. We found nine such probes, defining a total of 22 loci (indicated by an asterisk in Table 1). It is noteworthy that when a single chromosome bears several of these markers (chromosomes 11, 19, 21, 22, 29 and 32) the other copies of each of these markers are borne by different chromosomes (see Table 1). This suggests that these duplications are not large and that the Leishmania genome is probably not very redundant.
An important result of the present work was the finding of complete conservation in linkage group constitution in four different Leishmania species. These four taxa were studied previously in terms of isoenzyme variation (48) and nuclear restriction site divergence (5). Both techniques showed that they were separated by significant genetic distances. These biochemical estimates agreed with extrinsic criteria, such as the geographic distribution and the clinical disease, which served as the initial criteria for species separation. The conservation in linkage group constitution, for which we did not observe any exception, is therefore surprising. It could be explained either by a very low level of recombination between heterologous chromosomes or by specific constraints on the chromosomal contents. In this respect one can speculate that it may be related to the transcriptional mechanism in Leishmania and related trypanosomatids, which seems to be essentially polycistronic (49). It is possible that new transcription units, such as those expected from a chromosome translocation, would not be selectively advantageous. Indeed, a few examples of translocations have recently been shown among rodent malaria parasite species (for which transcription does not appear to be polycistronic), although the density of probes used was only about a third of that obtained in the present study (50).
A consequence of these results is that the extreme genome plasticity observed in Leishmania (51) is not caused by large inter-chromosomal rearrangements. It has been shown that amplification/deletion events in subtelomeric repetitive regions are responsible for the size variation of chromosomes 1 and 5 in L.infantum and chromosome 2 in L.major (6,10,11; our unpublished results). The present work suggests that this may be a general mechanism for chromosome size variability in Leishmania, as observed in other protozoan parasites (reviewed in 52).
The strategy used and the number of probes localized in this study are essential for the future construction of a detailed map of the Leishmania genome. With the genome size that we have defined we can estimate a density of 1 marker/130 kb (with a mean of 7 markers/chromosome and assuming a random distribution of the probes). This estimate may be even higher if we consider the size of the subtelomeric repeated regions, which can constitute up to 25% of the chromosome length (6). These markers should help to speed up the construction of long range restriction maps of individual chromosomes and to validate the construction of cloned contigs. The four karyotypes shown in Figure 4 may serve as a reference for future mapping studies. It is noteworthy that several of the remaining clones we examined exhibited much more complex karyotypes with diffuse or compressed bands and are therefore less helpful in this respect. Moreover, since all chromosomes are now characterized by their marker contents, we suggest that the numbering of the chromosomes of Leishmania established in this study also serve as a reference for future studies.
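The arithmetic behind the marker density figures can be made explicit with a short Python sketch. It takes the 1 marker/130 kb estimate and the 25% upper bound for subtelomeric repeats as given in the text. It also assumes, for illustration only, that all markers lie in the conserved chromosome core, which is plausible for single or low copy probes but is not stated as such.

```python
PROBES = 244                  # probes assigned to linkage groups
CHROMOSOMES = 36              # linkage groups identified
DENSITY_KB = 130              # kb per marker, as estimated in the text
SUBTELOMERIC_FRACTION = 0.25  # upper bound quoted in the text

# Mean number of markers per chromosome (the text rounds this to 7).
markers_per_chromosome = PROBES / CHROMOSOMES

# Assumption: markers lie only in the chromosome core, so the spacing
# shrinks in proportion to the fraction of DNA that is not subtelomeric.
core_kb_per_marker = DENSITY_KB * (1 - SUBTELOMERIC_FRACTION)

print(round(markers_per_chromosome, 1))  # 6.8
print(core_kb_per_marker)                # 97.5
```

Under these assumptions the effective spacing in the core would approach 1 marker per 100 kb, consistent with the statement that the density may be higher than the whole-genome figure.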
We thank all our colleagues who generously provided the gene probes. This work was supported by a grant from the Ministère de la Recherche et de l'Enseignement Supérieur (GREG: Groupement d'étude sur le Génome, contract 61/94) and by the UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases (contract no. 940415). C.R. is the recipient of a training grant from the Ministère de la Recherche et de l'Enseignement Supérieur.