DNA sequence analysis of the nadA gene of Ehrlichia chaffeensis revealed a 942 bp open reading frame with the capacity to encode 314 amino acids. The amino acid sequence of the E. chaffeensis quinolinate synthetase A (NAD A) has 53.6% identity and 82% similarity to the NAD A of the cyanelle of Cyanophora paradoxa. Portions of the homologous genes of E. canis and E. muris were also sequenced. The amino acid sequences of the NAD A of E. canis and E. muris have 89.2% and 93.2% homology, respectively, to the NAD A of E. chaffeensis. We propose that the nadA gene may be an excellent candidate for a genetic tool for the phylogenetic study of ehrlichiae.
Nicotinamide adenine dinucleotide (NAD) plays a central role in cellular metabolism as a cofactor in more than 300 oxidation-reduction reactions. The de novo synthesis of NAD proceeds via quinolinate (pyridine-2,3-dicarboxylate). Quinolinate is biosynthesized from L-aspartate and dihydroxyacetone phosphate by a quinolinate synthetase complex. The complex consists of two enzymes, quinolinate synthetases A and B, which are the products of the genes nadA and nadB. The nadA genes of Escherichia coli, Salmonella typhimurium, and the cyanelle of Cyanophora paradoxa have been sequenced previously. In this study we sequenced the nadA genes of Ehrlichia chaffeensis, E. canis, and E. muris. E. chaffeensis is the etiologic agent of an emerging infectious disease, human monocytic ehrlichiosis, and E. canis causes canine monocytic ehrlichiosis. A recent report demonstrated that E. canis also causes human infection. E. muris, another recently identified Ehrlichia sp., was isolated from a vole in Japan. Among the genetically diverse ehrlichiae, these three species are closely related.
Materials and methods
E. chaffeensis (Arkansas strain) and E. canis (Oklahoma strain) were provided by Jacqueline Dawson (CDC, Atlanta, GA). E. muris was obtained from Yasuko Rikihisa (Ohio State University, Columbus, OH). Ehrlichiae were cultivated in DH82 cells at 37°C in EMEM containing 10% bovine serum.
DNA cloning and sequencing
E. chaffeensis genomic DNA was partially digested with XbaI and cloned into λZAP II phage vector (Stratagene, La Jolla, CA). The genomic library of E. chaffeensis was screened using canine anti-E. chaffeensis serum. The recombinant bacterial phages were converted into phagemids in vivo according to the instructions of the manufacturer. DNA sequencing was performed with an ABI 377 sequencer (Perkin Elmer, Foster City, CA).
Finding the unknown portion of DNA sequences of the E. chaffeensis nadA gene
The 5′ end of the E. chaffeensis nadA gene which was not present in the identified clone was obtained using a Promoter Finder Kit (Clontech, Palo Alto, CA).
To detect the nadA genes of E. canis and E. muris, genomic DNA of E. canis and E. muris was digested with EcoRI and hybridized with the E. chaffeensis nadA gene. The PCR-amplified E. chaffeensis nadA gene was labeled with digoxigenin using a Dig labeling and detection kit (Boehringer Mannheim, Indianapolis, IN). DNA hybridization was performed at 60°C.
Polymerase chain reaction amplification
PCR was used to amplify the nadA gene of E. chaffeensis and the homologous genes of E. canis and E. muris. The PCR amplification consisted of 30 cycles of 30 s at 94°C, 1 min at 55°C, and 1 min at 72°C.
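As a quick check on the cycling program, the sketch below adds up the programmed hold times. The step names are illustrative labels; the text does not mention an initial denaturation or final extension, so none is assumed, and ramp times between temperatures are ignored.

```python
# Cycling program as stated in the text: (step label, seconds, temperature in deg C).
# The labels are illustrative; only the times and temperatures come from the text.
CYCLE_STEPS = [
    ("denaturation", 30, 94),
    ("annealing", 60, 55),
    ("extension", 60, 72),
]
N_CYCLES = 30


def total_hold_seconds(steps, n_cycles):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return n_cycles * sum(seconds for _, seconds, _ in steps)


total = total_hold_seconds(CYCLE_STEPS, N_CYCLES)
print(f"{total} s of hold time, about {total / 60:.0f} min")
```

The actual run time on a thermal cycler would be longer because of heating and cooling between steps.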
Cloning and sequencing of the nadA gene of E. chaffeensis
In the course of cloning and sequencing the 120 kDa protein gene of E. chaffeensis, a recombinant λ phage clone designated λ5 was identified by its reaction with canine anti-E. chaffeensis serum. The λ5 recombinant phage was converted into phagemid pλ5. Sequence analysis of the 6.5 kb DNA insert in pλ5 revealed two open reading frames (ORFs). The first is a small 400 bp ORF with a truncated 5′ end. The second encodes the 120 kDa immunodominant protein of E. chaffeensis. The two ORFs are separated by approximately 1.5 kb of non-coding DNA sequence. The insert DNA and the vector (pBluescript) each contain a unique ClaI restriction endonuclease cleavage site. pλ5 was digested with ClaI and religated to remove the insert DNA downstream of the small ORF. In the resulting plasmid, pXC, the small ORF was fused in-frame with the β-galactosidase gene of the pBluescript plasmid vector. A 19 kDa β-galactosidase fusion protein was observed on Coomassie blue-stained SDS-PAGE of IPTG-induced E. coli containing plasmid pXC. The fusion protein did not react with canine or rabbit anti-E. chaffeensis sera (data not shown). This result indicated that the protein product of the small ORF is not an immunodominant protein of E. chaffeensis.
Database searching demonstrated that the amino acid sequence deduced from the small ORF has 52.0% identity to the quinolinate synthetase A of the cyanelle of Cyanophora paradoxa. This result led us to find the missing portion of the small ORF. Two primers (PXCR7: TGT CGA TCC AAT GAA ATG AGC and PXCR6: CAA ACG CAT ATG TGG GCA), which were directed upstream of the gene, were designed from the small ORF and were used to determine the unknown upstream sequence of the gene using the promoter finding method. The upstream sequences of the E. chaffeensis gene were amplified from a PvuII promoter finder library by PCR. Sequence analysis of a 1 kb PCR product revealed two ORFs, which were 450 bp and 600 bp in length. The 600 bp ORF sequence overlapped with the previously sequenced small ORF. DNA alignment of the 600 bp ORF and the small ORF resulted in a 942 bp ORF with the capacity to encode 314 amino acids (ORF314). The predicted protein has a molecular size of 35.2 kDa. The amino acid sequence deduced from ORF314 was compared with the protein sequences in the SwissProt database using the FastA program with Wisconsin sequence analysis software (Genetics Computer Group, Inc., Madison, WI). Database searching demonstrated that the E. chaffeensis amino acid sequence has 53.6% identity and 84% similarity to the amino acid sequence of NAD A of the Cyanophora paradoxa cyanelle (Fig. 1), 36.1% identity to the NAD A of S. typhimurium, and 34.6% identity to E. coli NAD A. Therefore, we considered that the ORF314 represents a nadA gene of E. chaffeensis. No homology was found in the database for the amino acid sequence deduced from the 450 bp ORF, which is 24 bp upstream of the ORF314 gene. GenBank accession number for the E. chaffeensis nadA gene is U90899.
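The relationship between the ORF length, the number of encoded residues, and the predicted molecular size can be checked with simple arithmetic. The sketch below is illustrative only: the mean residue mass of 110 Da is a common rule of thumb and is an assumption here, not the method used to obtain the 35.2 kDa figure.

```python
ORF_LENGTH_BP = 942          # from the text
REPORTED_MASS_KDA = 35.2     # from the text
MEAN_RESIDUE_MASS_DA = 110   # assumption: rule-of-thumb average residue mass


def codon_count(orf_length_bp):
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_length_bp // 3


residues = codon_count(ORF_LENGTH_BP)
rough_mass_kda = residues * MEAN_RESIDUE_MASS_DA / 1000
implied_residue_mass_da = REPORTED_MASS_KDA * 1000 / residues

print(f"{residues} residues")
print(f"rough mass estimate: {rough_mass_kda:.1f} kDa")
print(f"mean residue mass implied by the reported size: {implied_residue_mass_da:.1f} Da")
```

The rough estimate falls close to the reported value, which is what one would expect for a protein of average composition.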
Sequencing the nadA gene of E. canis and E. muris
The homologous nadA genes of E. canis and E. muris were detected by DNA hybridization with the digoxigenin-labeled PCR product of the nadA gene of E. chaffeensis. The nadA gene of E. canis is located on an approximately 10 kb EcoRI fragment, and the nadA gene of E. muris is located on an approximately 8.0 kb EcoRI fragment (Fig. 2). Portions of the nadA gene of E. canis and E. muris were amplified by PCR using the primers derived from the E. chaffeensis nadA gene. A 228 bp DNA fragment was amplified from E. canis with a primer pair consisting of forward primer λ5F1: TGT GAA TAG GAT GTG CAT GTG and reverse primer PXCR6. This primer pair amplifies a DNA fragment corresponding to nucleotides 1001–1231 of the E. chaffeensis nadA gene. The PCR product of E. canis DNA was sequenced. A primer (CANISNADR: CGA AGC ACA ACC ACC ATC CAA) was designed from the E. canis DNA sequence and was used to amplify the unknown upstream sequence of the E. canis nadA gene using the PromoterFinder method. A 580 bp PCR product was amplified with this primer from the PromoterFinder library of E. canis constructed with DraI.
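Basic properties of the primers named so far can be tabulated with a short script. This is a minimal sketch: the Wallace rule (2 °C per A/T, 4 °C per G/C) is a rough approximation chosen here for illustration, not a method stated in the text, and the dictionary key `lambda5F1` is an ASCII stand-in for λ5F1.

```python
# Primer sequences as printed in the text (5' to 3'), spaces included.
PRIMERS = {
    "PXCR7": "TGT CGA TCC AAT GAA ATG AGC",
    "PXCR6": "CAA ACG CAT ATG TGG GCA",
    "lambda5F1": "TGT GAA TAG GAT GTG CAT GTG",
    "CANISNADR": "CGA AGC ACA ACC ACC ATC CAA",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove spacing and normalise case."""
    return seq.replace(" ", "").upper()


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq):
    """Wallace-rule melting temperature estimate in deg C (approximation)."""
    s = clean(seq)
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement, i.e. the sequence of the opposite strand (5' to 3')."""
    return clean(seq).translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(clean(seq))} nt, GC {gc_fraction(seq):.0%}, "
          f"Tm ~{wallace_tm(seq)} deg C, revcomp {reverse_complement(seq)}")
```

Estimates of this kind are only a guide when comparing primers against the annealing temperature of a cycling program.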
A DNA fragment of 789 bp was amplified from E. muris DNA with a primer pair consisting of forward primer NADAF2 (ACG TCA TTT GGC TCA GGA) and reverse primer PXCR6. The NADAF2 and PXCR6 primer pair amplifies nucleotides 443–1232 of the E. chaffeensis nadA gene, which constitutes 87% of the E. chaffeensis nadA gene. The PCR products of E. canis and E. muris were sequenced. GenBank accession numbers for the E. canis and E. muris nadA genes are U90900 and U90901, respectively. The amino acid sequences deduced from these portions of the nadA genes of E. canis and E. muris were compared with the NAD A of E. chaffeensis. The homology of amino acid sequences among E. chaffeensis, E. canis, and E. muris was determined over the region corresponding to amino acids 87–258 of the E. chaffeensis NAD A. The homology of the amino acid sequences of NAD A is 93.2% between E. chaffeensis and E. muris, 89.2% between E. chaffeensis and E. canis, and 91.5% between E. canis and E. muris.
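A pairwise comparison of this kind can be sketched as a percent-identity calculation over an existing alignment. The function below is a minimal illustration that skips gapped columns. The peptide strings are invented toy data, not Ehrlichia sequences, and the text does not state whether the reported homology values are identities or similarities, so this is not a reconstruction of the original calculation.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy peptides, for illustration only.
toy_a = "MKTAYIAKQR"
toy_b = "MKTAYLAKQR"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")

# Number of residues in the comparison window 87-258 (inclusive) named in the text.
window_residues = 258 - 87 + 1
print(f"{window_residues} residues in the comparison window")
```

Applying the same function to each of the three species pairs would give a small distance-like table of the sort summarized above.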
Hydropathy analysis of the predicted NAD A of E. chaffeensis
The hydropathy plot indicated that the NAD A of E. chaffeensis is predominantly hydrophobic, although several segments in the middle of the protein are hydrophilic (Fig. 3). For comparison we also analyzed the hydropathy of the NAD A of E. coli, S. typhimurium, and the cyanelle using the amino acid sequences in the database (data not shown). Hydropathy plots showed that the NAD A proteins from all these organisms are hydrophobic. The hydropathy pattern of E. chaffeensis resembles that of the cyanelle, and the pattern of E. coli resembles that of S. typhimurium. Both the N- and C-termini of the NAD A of E. coli and S. typhimurium are hydrophilic, as previously reported for S. typhimurium. However, the NAD A of E. chaffeensis and the cyanelle do not have hydrophilic N- and C-termini. These results suggest that the NAD A of E. chaffeensis is a cytoplasmic protein.
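A hydropathy profile of this kind is typically a sliding-window average of per-residue hydropathy values. The sketch below uses the Kyte-Doolittle scale and a window of 9 residues. Both choices are assumptions for illustration, since the text does not name the scale or window used, and the test peptide is invented.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean hydropathy for each full window along the sequence."""
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical peptide: a hydrophobic stretch followed by a hydrophilic one.
toy_profile = hydropathy_profile("IIIIIRRRRR", window=5)
print([round(value, 2) for value in toy_profile])
```

Windows with positive averages correspond to hydrophobic segments and negative averages to hydrophilic ones, so hydrophilic termini would appear as negative values at the two ends of the profile.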
On the basis of amino acid homology of the ORF314 with the NAD A of the Cyanophora paradoxa cyanelle, E. coli, and S. typhimurium, we considered the gene for the ORF314 of E. chaffeensis as the nadA gene. We also sequenced portions of the nadA genes of E. canis and E. muris. Comparison of amino acid sequences of NAD A of E. chaffeensis, E. canis, and E. muris demonstrated that the nadA genes are conserved among the species in this clade of the genus Ehrlichia. E. muris is closer to E. chaffeensis than to E. canis. The high homology of the nadA gene in closely related organisms has been demonstrated previously in E. coli and S. typhimurium. The amino acid sequences of the NAD A show 91% identity between E. coli and S. typhimurium. Not only are the amino acid sequences highly homologous among the NAD A in different organisms but also the sizes of the NAD A are very close. The NAD A of E. chaffeensis, the cyanelle, E. coli, and S. typhimurium are 314, 333, 349, and 365 amino acids [2, 4, 8], respectively. Some regions of the NAD A amino acid sequences are conserved among all species (Fig. 1). These regions may serve as the characteristic motifs of the enzyme. The small size and relatively conserved nature of the nadA among the species of Ehrlichia, as we have demonstrated, make it a good genetic tool for the phylogenetic study of ehrlichiae.
We have also sequenced 500 bp upstream and 1 kb downstream of the E. chaffeensis nadA gene. A 450 bp ORF encoding 150 amino acids was located 24 bp upstream of the nadA gene. The function of the protein encoded by this ORF is unknown. We did not find any ORF in any of the three reading frames within 1.5 kb downstream of the nadA gene. The 120 kDa protein gene of E. chaffeensis is located approximately 1.5 kb downstream of the nadA gene. The gene arrangement of E. chaffeensis possibly differs from that of S. typhimurium. In S. typhimurium, nadA and pnuC, a gene encoding a pyridine nucleotide uptake protein, occur in an operon with transcription initiating upstream of nadA. The product of pnuC is an essential component for transporting nicotinamide mononucleotide across the cytoplasmic membrane in the pyridine nucleotide scavenging system of S. typhimurium. Analysis of the DNA sequence upstream and downstream of the nadA gene of E. chaffeensis did not reveal any sequence homologous to the pnuC gene of S. typhimurium.
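Scanning a flanking region for ORFs in the three forward reading frames can be sketched as follows. The criteria are assumptions for illustration: an ORF is taken to start at ATG and end at the first in-frame stop codon, with a caller-chosen minimum length. The text does not state the criteria actually applied, and the test sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Return (frame, start, end) for ATG-to-stop ORFs in the three forward frames.

    start is the 0-based index of the A of ATG; end is the index just past the
    stop codon. min_codons counts coding codons and excludes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos + 3))
                start = None  # resume searching after the stop codon
    return orfs


# Hypothetical toy sequence containing one short ORF in frame 2.
toy_orfs = find_orfs("CCATGAAATTTTAGGG")
print(toy_orfs)
```

Raising `min_codons` filters out short chance ORFs. Covering the reverse strand as well would require running the same scan on the reverse complement.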
It was unexpected that NAD A would be highly homologous between ehrlichiae and the cyanelle. Ehrlichiae and cyanelles are not known to be related to each other phylogenetically. Moreover, judged superficially by their ecologic niches and phenotypic characteristics, ehrlichiae and cyanelles would not appear to be related. Ehrlichiae are Gram-negative, obligately intracellular bacteria. Evolutionarily, ehrlichiae must have had a free-living bacterial ancestor that subsequently adapted to intracellular living, most likely lost some genes whose products are readily obtained from the host cell, and finally became totally dependent on the host cell and thus obligately intracellular. The genome of E. chaffeensis is 1.2 Mb, which is less than half the size of the genome of the free-living bacterium E. coli. We do not know whether ehrlichiae are beneficial to their natural host or purely parasitic. Cyanelles, on the other hand, are the photosynthetic organelles of Cyanophora paradoxa. Although cyanelles are genuine plastids, they resemble cyanobacteria in morphology, in the biochemical organization of their photosynthetic apparatus, and in the presence of peptidoglycan. Among eukaryotes, peptidoglycan is found only in cyanelle-containing organisms. Biological investigation of cyanelle peptidoglycan revealed amino acids and amino sugars typical of Gram-negative bacteria. DNA sequence analysis of the 16S rRNA genes of the cyanelle and cyanobacteria revealed that cyanelles are closely related to cyanobacteria. Cyanelles are found not only in different species of the Glaucocystophyceae family but also in other eukaryotic cells that are unrelated evolutionarily and systematically. For example, a cyanelle has been detected in an amoeboid ‘host’, Paulinella chromatophora. The existence of cyanelles in systematically diverse species indicates repeated invasions of heterotrophically living cells by cyanobacteria-like organisms in the evolutionary past.
It will be interesting to investigate the evolutionary relationship between the obligately intracellular bacteria and cyanelles. They may have diverged from a common free-living bacterial ancestor that invaded eukaryotic cells, or was captured by predatory eukaryotic cells, and subsequently adapted to an intracellular lifestyle, becoming a parasitic obligately intracellular bacterium, an endosymbiont in some cases, or an organelle of the eukaryotic host cell.
We are grateful to Patricia Crocquet-Valdes for her inspiring discussions, to Josie Ramirez for expert assistance in the preparation of the manuscript and to Tom Bednarek for preparation of the illustrations. This study was supported by a research grant from the National Institute of Allergy and Infectious Diseases (AI31431).