Ants provide remarkable examples of equivalent genotypes developing into divergent and discrete phenotypes. Diploid eggs can develop either into queens, which specialize in reproduction, or workers, which participate in cooperative tasks such as building the nest, collecting food, and rearing the young. In contrast, the differentiation between males and females generally depends upon whether eggs are fertilized, with fertilized (diploid) eggs giving rise to females and unfertilized (haploid) eggs giving rise to males. To obtain a comprehensive picture of the relative contributions of gender (sex), caste, developmental stage, and species divergence to gene expression evolution, we investigated gene expression patterns in pupal and adult queens, workers, and males of two species of fire ants, Solenopsis invicta and S. richteri. Microarray hybridizations revealed that variation in gene expression profiles is influenced more by developmental stage than by caste membership, sex, or species identity. The second major contributor to variation in gene expression was the combination of sex and caste. Although workers and queens share equivalent diploid nuclear genomes, they have highly distinctive patterns of gene expression in both the pupal and the adult stages, as might be expected given their extraordinary level of phenotypic differentiation. Overall, the difference in the proportion of differentially expressed genes was greater between workers and males than between workers and queens or queens and males, consistent with the fact that workers and males share neither gender nor reproductive capability. Moreover, between-species comparisons revealed that the greatest difference in gene expression patterns occurred in adult workers, a finding consistent with the fact that adult workers most directly experience the distinct external environments characterizing the different habitats occupied by the two species. Thus, much of the evolution of gene expression in ants may occur in the worker caste, despite the fact that these individuals are largely or completely sterile. Analyses of gene expression evolution revealed a combination of positive selection and relaxation of stabilizing selection as important factors driving the evolution of such genes.
Morphological, physiological, and behavioral differences can arise from sequence differences in coding genes or from differences in the level of expression of genes. Examples of the effect of variation in gene expression include tissue differentiation within individuals (e.g., Enard et al. 2002; Khaitovich et al. 2006; Yang et al. 2006), sexual dimorphism in organisms with otherwise almost equivalent genomes (Ellegren and Parsch 2007), and many of the phenotypic differences between closely related species (Enard et al. 2002; Ranz et al. 2003; Rifkin et al. 2003; Cassone et al. 2008). Several studies have demonstrated that such variation in gene expression is an important substrate for adaptive evolution (Ortiz-Barrientos et al. 2006; Whitehead and Crawford 2006a, 2006b; Rebeiz et al. 2009; Babbitt et al. 2010).
An important question is to determine the relative contribution of differences in coding gene sequences and level of gene expression on phenotypic evolution. Both environmental differences and genetic variation (e.g., in transcription factors and regulatory motifs) can translate into gene expression and phenotypic differences. An effective way to investigate the environmental contributions to gene expression variation is to study organisms in which divergent and discrete phenotypes (polyphenism) can develop from an identical genotype. Environmentally induced differences in gene expression associated with polyphenism have been reported in horned beetles (Snell-Rood et al. 2010), termites (Scharf et al. 2003; Weil et al. 2009), honey bees (Evans and Wheeler 2001; Judice et al. 2006; Sen Sarma et al. 2007), bumble bees (Pereboom et al. 2005), wasps (Sumner et al. 2006; Hoffman and Goodisman 2007; Hunt and Goodisman 2010), and ants (Gräff et al. 2007). These studies found evidence that differential gene expression patterns mediate the decoupling between an otherwise identical genotype and the resulting distinct alternative phenotypes, but much remains to be learned about the mechanisms, causes, and evolution underlying variation in gene expression within- and between-species.
The aim of this study is to investigate gene expression patterns in queens, workers, and males of two closely related ant species to obtain a comprehensive picture of the relative contributions of sex, caste, and species divergence to gene expression evolution. The organizing principle of ant societies is reproductive division of labor, whereby one or a few female individuals (the queens) specialize in reproduction, whereas the others (the workers) participate in cooperative tasks such as building the nest, collecting food, rearing the young, and defending the colony. This reproductive division of labor provides numerous fitness benefits and is one reason for the tremendous ecological success of ants (Hölldobler and Wilson 1990). With the exception of a few atypical lineages (Julian et al. 2002; Volny and Gordon 2002; Helms Cahan and Keller 2003; Fournier et al. 2005), the marked behavioral and morphological differences between queen and worker ants stem from differences in their developmental pathways unrelated to genotype, as any fertilized (diploid) egg has the potential to develop into either caste depending upon environment-induced differentiation in gene expression (reviewed in Schwander et al. 2010). In contrast, the differentiation between males and females generally depends upon whether eggs are fertilized, with fertilized eggs giving rise to females and unfertilized (haploid) eggs giving rise to males.
We studied the relative contribution of sex, caste, and species divergence to gene expression in two species of fire ants, Solenopsis invicta and S. richteri. Our choice of these two species was based on several criteria. First, fire ants are advanced eusocial insects in which the extraordinary degree of caste dimorphism between females is as extreme as found among social animals (including obligate worker sterility enforced by anatomical absence of reproductive organs). Caste differentiation within Solenopsis occurs between the first and second larval instars (Petralia and Vinson 1979), and the observation that the loss of a queen induces many diploid larvae to switch their development from workers to queens is consistent with an environmental caste determination system (Vargo and Fletcher 1986). The evident lack of a genotypic mechanism for caste determination therefore makes Solenopsis an ideal system to investigate gene expression changes associated with radical social evolution. Second, S. invicta and S. richteri are closely related (Pitts et al. 2005), being able to hybridize and even form relatively fit hybrids in nature (Ross et al. 1987; Ross and Robertson 1990; Shoemaker et al. 1996). Third, a microarray containing 21,715 expressed sequence tags (ESTs) representing more than 10,000 different transcripts with diverse molecular functions was recently developed for S. invicta (Wang et al. 2007; Wurm et al. 2009; Wurm et al. 2010).
In addition to studying the contributions of sex, caste, and species divergence to gene expression differences, we analyzed gene expression variation between the pupal and the adult stages. Fire ants and other Hymenoptera are advanced holometabolous insects with extraordinary phenotypic differences among the life stages. Thus, patterns of gene expression are expected to differ substantially from one developmental stage to the next. Finally, we also evaluated the role of selection in driving the evolution of caste-specific and species-specific differences in gene expression.
Materials and Methods
Colonies of S. invicta were collected in Oglethorpe County, Georgia in June 2005 (n = 3) and May 2007 (n = 5). Colonies of S. richteri were collected in Henderson County, Tennessee in June 2005 (n = 2) and May 2007 (n = 5). All collected colonies were returned to the laboratory and maintained in a rearing room under standard conditions (Jouvenaz et al. 1977). Allozyme electrophoretic analyses were performed on a subset of ten females from each colony to confirm that all colonies were either pure S. invicta or pure S. richteri (the two species hybridize in their invasive, USA ranges). We also performed allozyme analyses of 30 additional colonies collected from each of these same sites to confirm that each area contained only the respective pure parental species. Five highly informative or fully diagnostic loci were employed in these analyses. The same subset of ten females per study colony was used to confirm by allozyme electrophoresis that all colonies were of the monogyne social form (single queen per colony), a conclusion supported also by the discovery of only a single wingless, reproductive queen per nest (see Ross et al. 2003 for explanation of expected genotype distributions in colonies of the two forms). Three (S. invicta) or two (S. richteri) polymorphic loci were used for these latter analyses. Relevant allozyme methods are detailed in Shoemaker et al. (1996) and Ross et al. (1997).
The samples comprise three castes (workers, queens, and males) collected at two separate developmental stages (pupae and adults) for both species. Only large (major) workers were selected for the worker samples. We made every effort to minimize environmental noise in gene expression by sampling all study individuals from laboratory colonies reared under standard conditions for at least two weeks after field collection. Also, samples were harvested at specific developmental mileposts to minimize temporal variation in gene expression within each of the two developmental stages: all pupae were sampled at the first appearance of “pink eyes,” typically within 24 h of pupation, and adults were sampled within 1 h of emergence from the pupal stage. These timepoints were chosen because they make it possible to control most reliably for age and developmental state, variation in which would introduce unwanted noise in our experimental design.
RNA/DNA Extraction and Preparation
We randomly sampled, and eventually pooled, 1–6 nest mate individuals (average = 4.6 and median = 5) for each replicate of the 12 specific categories or nodes of interest (two developmental stages × 3 castes × 2 species; supplementary table S1, Supplementary Material online). Each pooled sample was promptly snap-frozen in liquid nitrogen in tubes containing 1 g of 1.4-mm ceramic beads (Quackenbush) and then homogenized in Trizol (Invitrogen) using a Retsch bead shaker. Both RNA and DNA were extracted according to the manufacturer's protocol. The quality of the total RNA was assessed visually by gel electrophoresis of a 2 μl aliquot in a 1.5% agarose gel. Only samples with no signs of RNA degradation were retained and used for subsequent microarray analyses.
Total RNA was amplified (Ambion MessageAmp II kit) and then an ∼6-μg aliquot of this amplified RNA was mixed with 2 μl of random 9mers (2 μg/μl), 0.5 μl of Alien mRNA Spike mix (Stratagene), and water for a final volume of 17.9 μl. The mix was incubated for 10 min at 70 °C, then held for 5 min on ice. Reverse transcription was performed for 2 h at 50 °C after adding 6 μl of 5 × first-strand buffer, 3 μl of 0.1 M dithiothreitol, 0.6 μl of 50X aminoallyl-deoxynucleotide triphosphate mix (25 mM deoxyadenosine triphosphate, 25 mM deoxycytidine triphosphate, 25 mM deoxyguanosine triphosphate, 15 mM deoxythymidine triphosphate, and 10 mM aminoallyl-deoxyuridine triphosphate), 0.5 μl of RNAse inhibitor (15 U/μl, Invitrogen), and 2 μl of SuperScript III reverse transcriptase (200 U/ml, Invitrogen). The residual RNA was hydrolyzed by adding 15 μl of 0.1 M NaOH and incubating for 10 min at 70 °C. Subsequently, the pH was returned to neutrality by the addition of 15 μl of 0.1 M HCl. The purification of the aminoallyl-labeled cDNA and the coupling to Cy3 or Cy5 dyes (GE Healthcare, Little Chalfont, UK) were done using a modified Qiaquick PCR Purification Kit (QIAGEN) protocol. The combined Cy3- and Cy5-labeled probes were purified using the Qiaquick PCR Purification Kit (QIAGEN) and eluted in 82 μl of elution buffer. We finally added 16.5 μl of 20x standard saline citrate (SSC), 3.0 μl of yeast transfer RNA (10 μg/μl), 1.8 μl of polyA cDNA (2 μg/μl; Sigma), and 1.8 μl of 10% sodium dodecyl sulfate (SDS), then denatured this reaction mix at 95 °C for 45 s and hybridized to the microarray slides at 64 °C overnight (18–22 h). Excess probe was removed by washing the slides at room temperature twice for 5 min in a 2x SSC/0.1% SDS solution, twice for 1 min in a 0.2x SSC solution, 1 min in a 0.1x SSC solution, and twice for 5 min in a 0.1x SSC/0.1% Triton solution.
Each experimental sample was hybridized against a common reference RNA (which consists of RNA extracted from hundreds of S. invicta individuals of different castes [i.e., workers, queens, and males] and of different stages [i.e., adults, pupae, and larvae] to ensure maximum coverage of the transcriptome) labeled with the appropriate alternate dye on our custom-made spotted cDNA microarray (Wang et al. 2007; Wurm et al. 2009). Every sample was equivalent to an independent biological replicate (i.e., each colony was used only once as a source for a given caste and stage; see supplementary table S1, Supplementary Material online) that was randomly labeled with either Cy3 or Cy5. After hybridization, slides were scanned using an Agilent Technologies Scanner and the signal intensities for each spot were extracted (low-quality spots were flagged and removed prior to analyses) using GenePix software.
We used the software package limma in R (R Development Core Team 2008) to perform adaptive background correction, within-array print-tip loess normalization and between-array normalization of the intensities (Smyth and Speed 2003; Smyth 2004; Ritchie et al. 2007). To increase the power of the between-array normalization, we included data from (1) 28 additional arrays hybridized with S. invicta and S. richteri cDNA from worker, queen, or male larvae (excluded from the current study due to insufficient sampling) and (2) 46 arrays hybridized with hybrid S. invicta X richteri cDNA from various life stages and castes (Ometto L, Ross KG, Shoemaker D and Keller L, in preparation), all of which underwent the same experimental protocols as the samples used in the present study. Stringent quality control criteria were used to filter out spots with unreliable foreground intensity for each microarray slide (modified from Lemos et al. 2008). The criteria were (((F635 Median > CTRL635 + 2 × CTRL635SD) OR (F532 Median > CTRL532 + 2 × CTRL532SD)) AND (((F635 Median-B635) > 2 × B635 SD) OR ((F532 Median-B532) > 2 × B532 SD)) AND ((%>B635 + 2SD > 70) OR (%>B532 + 2SD > 70)) AND ((SNR635 > 2) AND (SNR532 > 2))), where Fxxx and Bxxx are the foreground and background fluorescence intensities, SNRxxx is the signal to noise ratio, %>Bxxx+2SD is the percentage of pixels above background mean + 2 × standard deviations (SD), CTRLxxx and CTRLxxxSD are the average and SD of the foreground intensities for eight negative control probes spotted onto the microarray (Human beta actin PCR product, Stratagene), and xxx = 532 and 635 for Cy3 and Cy5, respectively.
We estimated the relative expression of each clone for each node (caste/species/stage), and the significance of the differential expression of such clones among nodes, using the Bayesian approach implemented in the program Bayesian Analysis of Gene Expression Levels (BAGEL; Meiklejohn and Townsend 2006). To increase the power of the BAGEL estimations, we also included in the analysis the additional 28 and 46 arrays detailed above. A false discovery rate (FDR) was estimated using BAGEL by analyzing a randomized version of the original data set and comparing the number of positives obtained from the randomized and observed data sets. Based on these analyses, a clone was considered to be differentially expressed between two nodes if the associated Bayesian posterior probability was P < 0.001, which represents a value with an acceptably low FDR (i.e., FDR ≈ 5% when comparing S. invicta castes, and FDR ≈ 10% when comparing species; supplementary table S2, Supplementary Material online). We define caste-biased genes as those that were significantly over or underexpressed in one caste compared with both of the other castes for a given developmental stage (supplementary fig. S1, Supplementary Material online). Separate analyses were performed for S. invicta and S. richteri.
The analysis of gene expression involved multiple samples from two separate species, thus necessitating two main caveats. The first is the possibility of unequal power in detecting differentially expressed genes across comparisons, for example, because of unequal sample sizes. A thorough statistical analysis revealed, however, no substantial difference between the power estimated for our comparisons (Materials and Methods, Supplementary Material online). The second relates to the use of a microarray developed for S. invicta to assess gene expression in both S. invicta and S. richteri. A study by Ranz et al. 2003 showed that the use of a microarray developed for Drosophila melanogaster to compare gene expression of D. melanogaster and D simulans did not lead to biases in the estimation of gene expression levels. Given that the overall nuclear genomic divergence between these two fly species appears greater than that between S. invicta and S. richteri (Ometto L and Shoemaker D, unpublished data), our microarray analyses likely are not confounded by spurious results due to interspecific sequence divergence. Furthermore, Gilad et al. (2005) showed that single- and multispecies arrays performed equally well when imposing a cutoff in gene expression differences of 1.5-fold, close to the GEL50 (ratio of gene expression level beyond which 50% of differences were significant at P < 0.001) and well below the GEL95 ratio thresholds (supplementary table S3, Supplementary Material online). In our analyses, we obtained equivalent patterns of gene expression differences when the GEL95 ratio threshold or the threshold of 1 was used (supplementary table S4, Supplementary Material online). This suggests a negligible effect of the possible sequence mismatches due to interspecific divergence on the levels of microarray hybridization. However, to investigate the possibility that such sequence divergence could result in reduced microarray hybridization, and hence observed expression levels, we calculated the overall level of expression for each caste and species (supplementary table S5, Supplementary Material online). The lack of a consistent difference in expression levels between the species suggests that sequence divergence has negligible effects on levels of microarray hybridization. Finally, because our analyses focus mainly on between-caste, between-sex, and between-stage differences, they should be little affected by interspecific differences in levels of microarray hybridization.
We analyzed our results considering both individual clones and contigs because many contigs are in fact represented by more than one clone spotted onto the microarray (Wang et al. 2007; Wurm et al. 2009; Wurm et al. 2010). In these cases, if one clone for a given contig was differentially expressed (or caste-biased), then that particular contig was also considered differentially expressed or caste-biased. Because the clone-based and contig-based results are very similar (supplementary table S4, Supplementary Material online), we report only results based on contigs. This also makes intuitive sense if we assume that any one gene is likely represented by a single contig but perhaps by multiple clones. Thus, we use contig synonymously with gene throughout the remainder of the text.
Analysis of Determinants of Gene Expression Variation
We estimated pairwise gene expression distances between each of the twelve nodes by summing the squared differences between the relative expression values (estimated by BAGEL) across all clones. The resulting pairwise distance matrix was used to construct a distance tree using the neighbor joining method (PHYLIP v3.67; Felsenstein 2005). The resulting tree reflects the similarities among nodes in overall gene expression patterns. We also performed principal components analyses on the gene expression data using R (R Development Core Team 2008) to evaluate the relative importance of caste, developmental stage, and species status on the observed gene expression levels across all genes.
Analysis of Caste-Biased Gene Expression Evolution
The mode of evolution of gene expression was assessed by computing the ratio RX, which is the variability in gene expression between species over the variability in gene expression within S. invicta. The variability in gene expression between species for a contig represented by n clones on the microarray was calculated as , where and are the relative expression values estimated for S. invicta and S. richteri, respectively, by BAGEL for clone c in the developmental stage and caste of interest. Similarly, variability in gene expression within S. invicta was estimated as , where the mean and the standard deviation of the log2 ratios between the normalized dye intensities of the experimental sample and reference are calculated across all S. invicta samples for the caste and developmental stage of interest. In order to correct for the unequal sample sizes across nodes, we multiplied SDc by , where m is the sample size and N = 6 (for convenience, we normalized to the most common sample size). It follows that .
Our data set contains a mixture of mitochondrial and nuclear genes, as well as transposable elements. Because genes of the different classes likely experience distinct mutational and selective regimes, we limited our analyses to nuclear genes. To this end, we manually annotated 3,316 of the transcripts with a valid BLASTx hit (Nicolas M, Wang J, Ometto L, unpublished data and personal communication) by verifying the presence of functional domains using InterPro (Hunter et al. 2009). We then assigned these genes to a functional category based on the annotation, distinguishing between nuclear and mitochondrial genes and identifying putative viral and transposable elements.
Values of VX, DX, and RX were estimated separately for each nuclear gene. All values were then partitioned according to whether the gene was overexpressed in workers, queens, or males, or, alternatively, was not biased in its expression in any caste. We also partitioned the data based on whether or not the gene was differentially expressed between species in a given caste or developmental stage. We then compared average VX, DX, and RX for these groups of genes using the nonparametric Wilcoxon test to infer possible variation or divergence in the rate of evolution of gene expression.
We also tested for a correlation between the caste specificity of a gene's expression and its variation in expression (VX and DX). As a proxy for caste specificity, we used the degree of caste overexpression, which, for a gene represented by n clones on the microarray, we estimated as where is the relative expression value estimated for S. invicta by BAGEL for clone c and the superscript refers to the three castes (caste1 represents the caste for which we want to estimate cbx). We assume that the larger the value of cbx for a particular gene, the closer that gene is to being wholly caste-specific.
We performed microarray hybridization experiments on 66 samples representing all three castes and two developmental stages in two fire ant species (supplementary table S1, Supplementary Material online). After discarding microarray spots of poor quality, 22,855 clones remained for subsequent analyses, which correspond roughly to 14,467 putatively unique transcripts or genes.
A gene expression–based distance tree shows that developmental stage was the major determinant of overall gene expression profile, followed by sex, female caste, then species identity (fig. 1). These results were confirmed by principal components analyses: the first component (PC1), on which developmental stage loads heavily, explains more than 40% of the variability in gene expression across samples (fig. 2; supplementary fig. S2, Supplementary Material online). The second and third components, most strongly related to sex and female caste, respectively, together account for ∼34% of the variability in gene expression. Finally, PC4 and PC5, both associated with species, together account for only ∼11% of the variability in gene expression.
In accord with the results of the expression-based distance tree and principal components analyses, Bayesian analyses of gene expression levels revealed that a substantial proportion of genes were significantly overexpressed in one caste compared with the other two at a given developmental stage. The percentage of genes overexpressed in one caste was 22% at the pupal stage in S. invicta, 18% at the adult stage in S. invicta, 16% at the pupal stage in S. richteri, and 14% at the adult stage in S. richteri (supplementary table S6 and supplementary fig. S1, Supplementary Material online; fig. 3). In pupae of both species, the percentage of overexpressed genes was highest in males, intermediate in workers, and lowest in queens (χ2 pairwise goodness of fit tests, all comparisons P < 10–10). In adults, the difference was even stronger between queens and the two other castes, whereas there was no statistical difference between males and workers (for both species, P < 10–10 for differences between queens and males and between queens and workers, and P > 0.15 for the differences between workers and males).
A very similar pattern was found when considering genes significantly underexpressed within a single caste compared with the other two castes within a stage (fig. 3). In pupae of both species, there were many fewer genes underexpressed in both queens and workers compared with males (all P < 10–10; supplementary table S6, Supplementary Material online). In contrast, in adults, there were fewer genes underexpressed in queens than in both males and workers (all P < 10–10), but more genes underexpressed in workers than males (all P < 10–10).
The combined analysis of caste-biased gene expression in both developmental stages provided evidence for a core set of genes important in defining caste identity across pupal and adult ontogeny. Overall, there were significantly more genes than expected by chance that exhibited either caste overexpression or caste underexpression in both developmental stages (supplementary table S6, Supplementary Material online; cumulative hypergeometric distribution, all P < 10–9 except for queen underexpressed genes in S. richteri, where P = 0.047). Although we only analyzed two developmental stages, this result hints at the presence of genes whose characteristic expression marks each caste throughout development
Two other lines of evidence further suggest a reasonable amount of conservation in caste-biased gene expression patterns during the divergence of the study species. First, 56% of all genes with caste-biased expression in S. invicta were also caste-biased in S. richteri (cumulative hypergeometric distribution, P < 10–10). Second, as in the single-species analyses, the number of queen overexpressed genes shared between S. invicta and S. richteri was significantly lower than the numbers of worker or male overexpressed genes shared between the species (all P < 10–10; supplementary table S6, Supplementary Material online).
Gene Expression Differences Between Species
It is likely that selective pressures acting differently between the two study species vary in intensity among the castes and developmental stages. Therefore, we compared gene expression patterns between S. invicta and S. richteri separately for each of the three castes and each of the two developmental stages. Overall, 884 genes out of 14,467 (6.1%) were significantly differentially expressed between pupae of the two species in at least one caste (supplementary table S7, Supplementary Material online). The number of such genes was very similar for the queen and worker castes (χ2 pairwise goodness of fit test, P = 0.299), and both values were significantly higher than for males (1.4- and 1.3-fold higher, respectively, P < 0.001 in both cases; fig. 4; supplementary table S7, Supplementary Material online). However, a high percentage of such pupa-expressed genes showed a caste-specific pattern of differentiation as they were differentially expressed between the species in only one caste. For instance, of the genes differentially expressed between heterospecific pupal workers, 35% were differentially expressed in this caste but not in heterospecific pupal queens or males. The corresponding values were 38% for queen pupae and 68% for male pupae (figs. 4 and 5), which suggests a larger male-specific differentiation in pupal gene expression between species than worker- or queen-specific differentiaton. On the other hand, 43% of the genes differentially expressed between heterospecific pupal workers were also differentially expressed between heterospecific queens (38% for the reciprocal comparison; fig. 5), also suggesting a comparable female-specific differentiation.
The overall number of genes differentially expressed between adults of the two species (1,549, corresponding to 10.7% of the analyzed genes) was higher than for pupae. In contrast to pupae, the difference between species was greatest in workers (2.3- and 2.6-fold for worker-male and worker-queen comparisons, respectively, both P < 10–10), and there was a marginally significant difference between males and queens (P = 0.030). As in pupae, there was a high percentage of genes in adults differentially expressed between the species that showed this pattern in only one caste (fig. 4).
Overall, worker and male overexpressed genes were differentially expressed between species more frequently than expected by chance (supplementary table S7, Supplementary Material online). Notably, about 18% of the worker overexpressed genes in either adult S. invicta or S. richteri (supplementary tables S6 and S7, Supplementary Material online) were also differentially expressed between adult workers of the two species (cumulative hypergeometric distribution, P < 10–10). Similar trends appear for male overexpressed genes (interspecific difference: ∼13%, P < 10–10) and, to a lesser extent, for queen overexpressed genes (interspecific difference: ∼6%, P = 0.031) in the adult stage. In pupae, the enrichment in caste overexpressed genes among the species-specific differentially expressed ones is significant only for male genes (∼6%, P < 10–10). Equivalent patterns are found when analyzing the number of genes that are either caste overexpressed or caste underexpressed but not when analyzing caste underexpressed genes alone. Thus, caste overexpressed genes appear to have differentiated much more than caste underexpressed genes during species divergence.
Gene Expression Evolution
We estimated both the intraspecific (within S. invicta) variation in gene expression, VX, and the interspecific differentiation in gene expression, DX, for all genes. Stabilizing and directional selection are predicted to affect differently the patterns of gene expression variation within and between species. Similar to coding sequences, relatively low DX and VX values may indicate the presence of strong functional constraints and the action of purifying selection. In contrast, high DX values may suggest either relaxed selection (when VX is correspondingly high) or diversifying (positive) selection (when VX is low). We estimate the ratio under the rationale that diversifying selection should lead to a disproportionate increase of DX compared with VX, as reported in Drosophila for male-biased genes (Meiklejohn et al. 2003). Because neutral expectations for the actual values of DX, VX, and RX are not necessarily straightforward, we focus instead on the relative differences.
We partitioned all genes into four classes to compare the relative patterns of selection on gene expression in the three castes for each developmental stage: (1) worker overexpressed, (2) queen overexpressed, (3) male overexpressed, and (4) not overexpressed in any caste. In both pupae and adults, the RX values were significantly higher for the three classes of overexpressed genes than for the genes that were not overexpressed in a given caste (P < 0.001 for all castes in both pupae and adults; fig. 6A and supplementary table S8, Supplementary Material online). Furthermore, VX was lower and DX was higher for the three classes of genes being overexpressed in one caste than for those that were not (with the exception of VX in pupae, where worker overexpressed genes had the highest value; supplementary table S8, Supplementary Material online). This pattern is consistent with caste overexpressed genes being under stronger diversifying (positive) selection between the two lineages and/or being subject to less selective constraints than those genes that are equally expressed across castes (Mank and Ellegren 2009).
In pupae, the average RX for male overexpressed genes was higher than that for both worker and queen overexpressed genes (P < 10–10; fig. 6A). In adults, the average RX for worker overexpressed genes was slightly larger than that for male overexpressed genes (P = 0.025), and both values were greater than for queen overexpressed genes (P < 0.01). These results suggest that diversifying selection over both developmental stages plays a more important role in shaping the pattern of expression of worker and, especially, male overexpressed genes than of queen overexpressed genes. The estimates of VX and DX further confirm that such positive selection may have been more important for male overexpressed genes than for queen and, perhaps, worker overexpressed genes over both stages (supplementary table S8, Supplementary Material online). For instance, male overexpressed genes had the lowest average VX values (P < 0.05 both in adults and in pupae), yet a relatively high DX value compared with queen overexpressed genes (P < 0.05; worker overexpressed genes have the highest DX among adults, P < 0.005; in pupae, averages are similar across castes, P > 0.2). These combined results, suggesting low variation within species but relatively high differentiation (and correspondingly high RX values) in gene expression levels between species, are the hallmarks of adaptive evolution associated with lineage divergence. Finally, the findings that worker overexpressed genes have both high DX and VX, which in pupae result in RX values lower than for the other caste overexpressed genes (P < 0.005), suggest that worker overexpressed genes likely experienced relaxed selection on gene expression relative to queen and male overexpressed genes in both developmental stages.
We were interested in further evaluating the mode of evolution of gene expression in genes specifically differentially expressed between S. invicta and S. richteri (see e.g., Khaitovich et al. 2004;,Jordan et al. 2005). Genes differentially expressed between the species had significantly larger RX values than genes showing no such expression differences in both pupae and adults (all P < 10−15; fig. 6B and supplementary table S9, Supplementary Material online). Moreover, the relatively low RX values for the latter genes stem from both larger intraspecific variation in gene expression (higher VX) and lower interspecific differentiation in gene expression (lower DX) compared with genes differentially expressed between the species, in all three castes and in both developmental stages (all P < 10–10; supplementary table S9, Supplementary Material online). Although low DX is expected for genes lacking species-specific expression, their high VX is not. Focusing on genes with species-specific expression in the adult stage, RX was significantly lower for genes differentially expressed in workers than genes differentially expressed in queens and males (P < 10–10), consistent with relaxed diversifying selection on genes exhibiting species-specific expression in the worker caste.
Our results show clear differences in the patterns of gene expression between developmental stages, among castes, and between the two species of fire ant that we studied. Both the principal components analyses and the gene expression–based distance tree revealed that variation in gene expression profiles is influenced more by developmental stage than by caste membership, sex, or species identity. The finding of a very strong effect of major developmental stage is consistent with previous studies of other holometabolous insects with pronounced phenotypic differences in the life stages (Drosophila, vespine wasps), which showed major shifts in gene expression networks among larvae, pupae, and adults (Arbeitman et al. 2002; Hoffman and,Goodisman 2007). Our comparisons also reveal that many genes are differentially expressed between castes within just a single developmental stage. These transient differences suggest that phenotypic variation between castes results from the transcriptional deployment of distinct sets of genes at specific developmental stages (Lebo et al. 2009).
The second major contributor to variation in gene expression was the combination of sex and caste. Sex was the stronger determinant of gene expression, as clearly illustrated by the deeper split between females and males than between workers and queens in their gene expression profiles in the expression-based distance tree (fig. 1). Although worker and queen fire ants share equivalent diploid nuclear genomes, they have very distinct patterns of gene expression in both developmental stages that we studied, as might be predicted given the extraordinary level of dimorphism between the two female castes. For instance, in adults, there are more genes differentially expressed between queens and workers than between queens and males (supplementary fig. S1, Supplementary Material online). The percentage of differentially expressed genes is even higher between males and workers, consistent with the fact that individuals of these two classes share neither gender nor reproductive capability. Thus, queens are more similar to the other two castes in terms of gene expression than workers are to males, a pattern evident also from the fact that significantly more genes are overexpressed in workers and males than in queens, both in the pupal and in the adult stages. A similar pattern is observed when considering genes that are underexpressed, with many more genes being underexpressed in males and workers compared with queens. These findings suggest that both gene activation and silencing characterize caste differentiation and function in fire ants. Such caste-specific patterns of gene expression possibly reduce antagonistic pleiotropic effects that can arise in the absence of chromosomal sex determination and genotypic caste determination (Mank et al. 2008).
Our results further indicate that the proportion of genes underexpressed relative to the proportion overexpressed is higher in adult queens than adult workers and males. This finding is consistent with results reported in the honey bee, Apis mellifera (Evans and Wheeler 2001). However, it is important to note that the adult queens used in our experiments were only recently emerged from the pupa, so it is likely that many genes associated with sexual maturation and reproduction become more highly expressed as queens age, translating into a higher number of genes specifically up-regulated in older queens.
Because fire ant workers do not reproduce and thus have only an indirect fitness component, gene expression may be under more relaxed selection in this caste than in queens and males (Linksvayer and Wade 2009). If we assume that regulation of gene expression has a genetic basis and often is under purifying selection in reproductively competent individuals, genes expressed only in workers may be predicted to exhibit both a higher level of within-species polymorphism and a greater between-species divergence than genes also expressed in males and queens. Consistent with this prediction, both intraspecific variability (VX) and interspecific differentiation (DX) in gene expression were higher for worker overexpressed genes than for queen and male overexpressed genes, although the differences were not always statistically significant (supplementary table S8, Supplementary Material online). To further test these predictions, we can analyze variation in gene expression as a function of the degree of caste-biased gene expression, cbx, where larger values of cbx indicate greater caste specificity in expression (supplementary table S10, Supplementary Material online). An analysis of variance test reveals that both the caste and the degree of caste-biased expression, cbx, significantly affect VX (all P < 10–10) and DX (all P < 10–4), and, more importantly, a significant interaction exists between cbx and caste (P < 10–10 for both VX and DX). Specifically, VX and DX increase disproportionately more with cbx for worker overexpressed genes than for queen or male overexpressed genes (supplementary fig. S3, Supplementary Material online). This is consistent with the hypothesis that as genes become more worker overexpressed, variability within species as well as divergence between species in their expression patterns increase in parallel. Faster evolution of genes showing overexpression in one caste has been reported also in wasps (Hunt and Goodisman 2010), suggesting that caste polyphenism generally is an important determinant of regulatory gene evolution.
The gene expression–based distance tree and the principal components analyses suggest that the degree of gene expression divergence between S. invicta and S. richteri varies according to caste. In adults, worker- and male-biased genes are more frequently expressed in a species-specific manner than are queen-biased genes. This pattern is consistent with the fact that adult workers most directly experience the distinct external environments characterizing the different ranges occupied by the two species, namely tropical and subtropical habitats for S. invicta and somewhat more temperate habitats for S. richteri (Trager 1991). This is because fire ant workers forage in the external environment, whereas queens and males are largely confined to the nests, which comprise highly buffered environments that presumably are quite similar between the species. Similarly, fire ant pupae are entirely confined within the nest, where they are cared for by the workers. Consistent with this view, the divergence in gene expression between species (Dx) is significantly lower for genes expressed in the pupal stage than in the adult stage (P < 0.0001), and the number of such differentially expressed pupal genes is similar among the three castes. The total number of genes exhibiting significant interspecific differentiation (1,549) represents about 10% of the genes we studied. This is much lower than the approximately 50% of genes found to be differentially expressed between the morphologically highly similar D. melanogaster and D. simulans (Ranz et al. 2003), which belong to the same species group and diverged about 2 Ma (Russo et al. 1995). Our finding of relatively low regulatory gene divergence is consistent with S. invicta and S. richteri being close relatives (members of the same species group; Pitts et al. 2005) that are able to produce relatively fit hybrids (Shoemaker et al. 1996). The modest transcriptome divergence we observed evidently is not closely linked to morphological differences between the two species, as they differ in very few diagnostic morphological characters, with their males effectively indistinguishable (Pitts 2002). Instead, traits linked to the physiology and/or behavior may more closely reflect the transcriptional differentiation we detected between S. invicta and S. richteri.
Compared with the relatively low overall differentiation in gene expression between the two fire ant species, the degree of interspecific differentiation in caste-biased gene expression is unexpectedly high. For instance, about 41% of the genes with caste overexpression in adult S. invicta are not similarly overexpressed according to caste in adult S. richteri. In comparison, only 20% of D. melanogaster sex-biased genes were found not to be similarly sex-biased in D. simulans (Ranz et al. 2003). These data suggest a higher evolvability of regulatory elements involved in caste determination than those involved in sex determination during cladogenesis (Mank and Ellegren 2009; Hunt and Goodisman 2010).
An important point to consider in this and other comparative genomic surveys of social Hymenoptera is the different ploidy state of females (diploid) versus males (haploid). Because recessive mutations may be immediately visible to selection in the haploid state, deleterious mutations should be purged more efficiently and beneficial mutations fixed more rapidly in males, barring female sex–limited expression. Several features of our data are consistent with high levels of positive selection on genes expressed in males. In particular, the patterns of expression of genes overexpressed in one caste or differently expressed between species (as measured by RX) suggest stronger positive selection on gene expression in males than in workers or queens. This is particularly evident in the pupal stage, in which spermatogenesis occurs (e.g., Passera and Keller 1992), as there are many more genes overexpressed in male pupae than in worker or queen pupae. Such genes are ideal targets for positive diversifying selection (Ellegren and Parsch 2007) and, indeed, a large fraction of the male overexpressed pupal genes are significantly differentially expressed between the two species. Male overexpressed genes may also evolve at enhanced rates due to sexual selection (e.g., Jagadeeshan and Singh 2005), although sexual selection generally is assumed to be minimal in male ants (Feldhaar et al. 2008).
Our results highlight the role of differential gene expression in species where distinct phenotypes are not merely the result of sexual dimorphism. In fire ants, workers have a considerable number of genes that are specifically up- or down-regulated compared with males and queens. Moreover, workers also have more genes significantly differentially expressed between species than do the other castes. Thus, much of the evolution of gene expression in ants may occur in the worker caste despite the fact that these individuals are largely or completely sterile. This can be explained by a combination of factors, including adult workers experiencing the most diverse environments and exhibiting the broadest behavioral repertoires, and both queens and males having lost ancestral Hymenopteran somatic-specific traits related to feeding and self-maintenance (see also Toth and Robinson 2007). Workers therefore constitute the extended somatic phenotype of the sexual castes.
We thank Christine La Mendola and the Lausanne Genomic Technologies Facility for technical support, Jeffrey Townsend for statistical advice, and John Wang and the Keller laboratory for helpful discussions. This work was supported by grants from the Swiss National Science Foundation and ERC (to L.K.), from USDA NRICGP (to D.S. and K.G.R.), and from the Roche Research Foundation (to L.O.). L.K. thanks Stanislas Leibler who hosted him during his sabbatical at Rockefeller.