-
PDF
- Split View
-
Views
-
Cite
Cite
Julie Jeukens and others, Candidate Genes and Adaptive Radiation: Insights from Transcriptional Adaptation to the Limnetic Niche among Coregonine Fishes (Coregonus spp., Salmonidae), Molecular Biology and Evolution, Volume 26, Issue 1, January 2009, Pages 155–166, https://doi.org/10.1093/molbev/msn235
Close - Share Icon Share
Abstract
In the past 40 years, there has been increasing acceptance that variation in levels of gene expression represents a major source of evolutionary novelty. Gene expression divergence is therefore likely to be involved in the emergence of incipient species, namely, in a context of adaptive radiation. In the lake whitefish species complex (Coregonus clupeaformis), previous microarray experiments have led to the identification of candidate genes potentially implicated in the parallel evolution of the limnetic dwarf lake whitefish, which is highly distinct from the benthic normal lake whitefish in life history, morphology, metabolism, and behavior, and yet diverged from it only ∼15,000 years before present. The aim of the present study was to address transcriptional divergence for six candidate genes among lake whitefish and European whitefish (Coregonus lavaretus) species pairs, as well as lake cisco (Coregonus artedi) and vendace (Coregonus albula). The main goal was to test the hypothesis that parallel phenotypic adaptation toward the use of the limnetic niche in coregonine fishes is accompanied by parallelism in candidate gene transcription as measured by quantitative real-time polymerase chain reaction. Results obtained for three candidate genes, whereby parallelism in expression was observed across all whitefish species pairs, provide strong support for the hypothesis that divergent natural selection plays an important role in the adaptive radiation of whitefish species. However, this parallelism in expression did not extend to cisco and vendace, thereby infirming transcriptional convergence between limnetic whitefish species and their limnetic congeners for these genes. As recently proposed (Lynch 2007a. The evolution of genetic networks by non-adaptive processes. Nat Rev Genet. 8:803–813), these results may suggest that convergent phenotypic evolution can result from nonadaptive shaping of genome architecture in independently evolved coregonine lineages.
Introduction
Ever since the pioneering work of Britten and Davidson (1971) and King and Wilson (1975) on the mechanisms and evolutionary role of regulatory processes, there has been increasing acceptance that variation in levels of gene expression represents a major source of evolutionary novelty, which in turn can lead to phenotypic divergence through natural selection (True and Haag 2001; Wray et al. 2003; Nuzhdin et al. 2004). Gene expression divergence is therefore likely to be involved in the emergence of incipient species, namely, in a context of adaptive radiation (Johnson and Porter 2000). Microarray analysis is being increasingly used as a means to investigate this phenomenon by screening the genome for divergence at the transcriptional level (White 2001; Gibson 2002; Feder and Mitchell-Olds 2003; Clark 2006), thus making it possible to pinpoint the genes that are subject to selection in ongoing speciation events (West-Eberhard 2005).
In lake whitefish (Coregonus clupeaformis), microarray experiments have led to the identification of candidate genes potentially implicated in the repeated independent evolution, or parallel evolution, of the limnetic dwarf lake whitefish, which is highly distinct from the benthic normal lake whitefish in life history, morphology, metabolism, and behavior, and yet diverged from it only ∼15,000 YBP (Bernatchez 2004). More specifically, transcription profiling by means of a cDNA microarray (Rise et al. 2004; von Schalburg et al. 2005) demonstrated parallelism in transcription divergence for over 100 genes among dwarf–normal pairs of lake whitefish in the white muscle and liver (Derome et al. 2006; St-Cyr et al. 2008).
The divergent dwarf lake whitefish matures earlier and has a shorter lifespan compared with the normal lake whitefish (Bernatchez and Dodson 1990; Bernatchez and Giroux 2000; Rogers et al. 2002). Its limnetic lifestyle is associated with higher metabolic costs (Trudel et al. 2001) and a largely zooplanktonic diet (Bernatchez et al. 1999). The highest differentiation between sympatric dwarf and normal lake whitefish occurs in lakes with reduced habitat and prey availability that could increase competition for resources (Bernatchez and Dodson 1990, 1991; Landry et al. 2007). A genetic basis has been demonstrated for many traits that differ between dwarf and normal lake whitefish, namely, swimming behavior (Rogers et al. 2002), growth (Rogers and Bernatchez 2005), morphology, and life history (Rogers and Bernatchez 2007). Many of the quantitative trait loci (QTL) underlying those traits had outlier genetic differentiation (FST) values relative to neutral expectations, which supports their implication in the adaptive divergence and ecological speciation of dwarf and normal lake whitefish (Rogers and Bernatchez 2007). Moreover, recent studies supported intrinsic and extrinsic postzygotic barriers to reproduction between them (Rogers and Bernatchez 2006; Rogers et al. 2007). The lake whitefish is therefore a powerful system to investigate phenomena such as parallel adaptive evolution and ecological speciation. It also has the advantage of being part of one of the most extensive vertebrate adaptive radiation in the Northern Hemisphere, among which cases of ecological convergence are pervasive (Lindsey 1981; Bernatchez 2004).
This study focuses on the comparative analysis of transcription for six candidate genes among incipient species of the North American lake whitefish, the European whitefish (Coregonus lavaretus), and two closely related competitors of limnetic whitefish species: the lake cisco, Coregonus artedi, found in North America and the vendace, Coregonus albula, which occurs only in Eurasia (fig. 1). The European whitefish lineage, which diverged from C. clupeaformis ∼500,000 YBP (Bernatchez et al. 1991; Bernatchez and Dodson 1994), has also gone through postglacial (less than 15,000 YBP) independent events of intralacustrine diversification (Douglas et al. 1999; Østbye, Naesje, et al. 2005; Østbye et al. 2006) and was included here as natural replicate of the North American radiation. Specimens of European whitefish used are from two distinct mitochondrial clades, the North European and South European, which likely diverged during Pleistocene glacier advances, 150,000–350,000 YBP (Bernatchez and Dodson 1994; Østbye, Bernatchez, et al. 2005). The cisco is a limnetic fish that feeds on zooplankton (Bernatchez and Giroux 2000) and often occurs with the normal whitefish in Quebec, whereas it is never found in sympatry with the dwarf whitefish (Doyon et al. 1998). It is therefore recognized as a competitor of limnetic lake whitefish (Bernatchez 2004). The vendace is also a limnetic zooplanktivorous species and considered the main competitor of limnetic European whitefish morphs (Bohn and Amundsen 2001). The vendace and cisco diverged from the whitefish lineage approximately 1 and 2.5 Ma, respectively (Bernatchez et al. 1991).
Coregonine species or morphs (Coregonus spp.) and their phylogenetic relationships (modified from Bernatchez 2004). The numbers of lakes sampled per species are in parentheses. Limnetic fishes are identified by wave symbols.
The aim of this study was to document the extent of transcriptional divergence in lake whitefish and three of its congeners, thereby gaining insight into the transcriptomics of adaptive radiation. To this end, we tested the hypothesis that parallel phenotypic adaptation toward the use of the limnetic niche in coregonine fishes is accompanied by parallelism in candidate gene transcription profiles. More specifically, the three objectives were to 1) confirm the results of previous microarray experiments (Derome and Bernatchez 2006; Derome et al. 2006; St-Cyr et al. 2008) by measuring candidate gene expression using quantitative real-time polymerase chain reaction (PCR); 2) test the hypothesis that the parallel evolution of a limnetic whitefish within and among continental lineages is accompanied by parallelism in candidate gene transcription; and 3) assess the phylogenetic extent of candidate gene transcription parallelism by testing the hypothesis that ecological convergence between American and European limnetic whitefishes and two limnetic congeners (cisco and vendace) is accompanied by convergence in candidate gene transcription. We predicted that candidate gene expression divergence between limnetic and benthic fishes would be in the expected direction according to previous microarray experiments or predictions based on both gene function and species-specific features. In addition, we expected all benthic–limnetic pairs to show similar patterns of expression divergence, therefore supporting the parallel evolution of limnetic coregonine fishes.
Materials and Methods
Sample Collection
North American lake whitefish samples from Cliff Lake (46°23′51″N, 69°15′05″W) and Indian Pond (46°15′2″N, 69°18′05″W, St John River drainage, Maine) were collected in June 2003, whereas samples from lake whitefish held in controlled conditions at the LARSA (46°46′48.21″N, 71°16′27.76″W, Laboratoire Régional des Sciences Aquatiques, Université Laval, Quebec, Canada; dwarf and normal whitefish were originally sampled in Témiscouata Lake [47°41′N, 68°47′W] and Aylmer Lake [45°54′N, 71°20′W], respectively) were collected in September 2004. European whitefish samples were collected in September 2005 in two Norwegian locations, the Pasvik river catchment (69°7′5.19″N, 28°59′59.59″E) and lake Stuorajávri (69°39′2.98″N, 25° 4′7.62″E) and in December 2005 in two Swiss lakes, Zurich (47°15′48.81″N, 8°37′17.98″E) and Lucerne (46°58′54.92″N, 8°28′52.20″E). Lake cisco from Lac des Trente-et-un-Milles and Lac Florent (46°16′27.96″N, 75°52′44.09″W, Gatineau region, Quebec, Canada) were sampled in August 2003 and 2004, respectively. Vendace samples were collected in September 2005 in the Pasvik river catchment, Norway. Eight to 16 adult individuals were sampled per population, half males and half females. Considering the potential allometric scaling of both total and specific RNA (Burness et al. 1999), size differences between species or morphs were minimized. Fish were caught alive with gillnets and dissected in the field. Fork length and sex were recorded, and tissue samples were immediately frozen in liquid nitrogen and later stored at −80 °C. For all populations, approximately 300 mg of white muscle and liver were sampled from each fish. In addition, in the case of European populations, one complete eyeball was also collected. This tissue was no longer available for North American populations at the time of the experiments.
Selection of Candidate Genes
A set of six candidate genes deriving from three different tissue types was selected: three of them based on previous microarray experiments (Derome and Bernatchez 2006; Derome et al. 2006; St-Cyr et al. 2008) and the other three based on functional predictions.
First, the liver was chosen for its implication in a large array of physiological processes (St-Cyr et al. 2008). Two candidate genes were selected from this tissue: anionic trypsin and carboxylesterase. These genes are associated with energetic metabolism and detoxification (St-Cyr et al. 2008) and could be related to metabolic and dietary differences between dwarf and normal lake whitefish. In previous microarray experiments, anionic trypsin was overexpressed in dwarf lake whitefish populations by 86% on average (St-Cyr et al. 2008). This highly conserved enzyme specifically cleaves peptide bonds and has both digestive and enzyme activator functions (Swanson et al. 1967; Tang et al. 1992). Although generally associated to the pancreas, it is also expressed in liver in salmon (Lilleeng et al. 2007). Carboxylesterase was overexpressed in dwarf populations by 58% on average (St-Cyr et al. 2008). It is implicated in biotransformation and detoxification (Clement 1984; Imai et al. 2006) and has been shown to be a useful biomarker for insecticide hazard assessment in fishes (Kuster 2005).
Second, white muscle was selected for its implication in growth and swimming activity (Derome et al. 2006), two functions that differ greatly between dwarf and normal lake whitefish and are likely to have evolved through divergent selection (Trudel et al. 2001; Rogers and Bernatchez 2005). In this tissue, beta parvalbumin was significantly overexpressed in dwarf populations by 49% on average (Derome et al. 2006). Parvalbumins are calcium-binding proteins implicated in muscle contraction regulation and, more specifically, in a regulatory network that accelerates muscle relaxation between brief contractions (Gerday 1982; Rall 1996; Berchtold et al. 2000). Derome et al. (2006) also highlighted the overrepresentation of genes related to energetic metabolism among genes overexpressed in the dwarf lake whitefish and suggested that lactate dehydrogenase (LDH) could be an important candidate gene of that functional group. Indeed, LDH is expressed in the white muscle and has been widely studied as an indicator of anaerobic metabolism, namely, between pelagic and benthic fishes in a swimming capacity comparative framework (Somero and Childress 1990).
Third, two candidate opsin genes expressed in the eye were chosen for their potential to reflect the known difference in depth preference between limnetic and benthic whitefish, for which a genetic basis was identified (Rogers et al. 2002). Cone opsin genes underlie spectral sensitivity tuning by changes in gene expression in cichlid fishes (Carleton and Kocher 2001). Fuller et al. (2005) also demonstrated a genetic and environmental variation in opsin gene expression in the bluefin killifish (Lucania goodei). Moreover, there is evidence for relatively rapid evolution of these genes through positive selection in salmonids (Dann et al. 2004). For this study, short-wavelength sensitive, type 1 (SWS1) and long-wavelength sensitive (LWS) were selected. Achieving a full understanding of visual adaptation in fish is a complex undertaking because the precise visual tasks of the fish are rarely known. Nevertheless, it has long been recognized that two main types of visual pigments are exploited: 1) pigments with an absorption maximum corresponding to the transmission maximum of the water and 2) pigments with an absorption maximum that is offset from the transmission maximum of the water (Lythgoe 1968), which enable vision through contrast detection. Consequently, red-green–absorbing LWS (500–570 nm) corresponds to the best adapted pigment for color vision in freshwater, whereas ultraviolet-absorbing SWS1 (360–430 nm) is a good “offset” candidate (Lythgoe 1968, 1984). In the C. lavaretus complex, the benthic morph is mainly found in shallow littoral zones (Kahilainen et al. 2003), whereas the limnetic morph and vendace tend to dwell deeper, especially during the day (Mehner et al. 2005; Mehner 2006; Kahilainen et al. 2007). Therefore, limnetic fishes, which inhabit deeper zones where light is increasingly dominated by long wavelengths, should overexpress LWS and underexpress SWS1 compared with benthic littoral fishes.
RNA Isolation and Reverse Transcription
RNA from the white muscle and eyes was isolated according to the Trizol Reagent protocol (Gibco BRL, Carlsbad, CA) and with the RNeasy Kit (Qiagen, Mississauga, Canada) for liver samples. Muscle and liver tissue homogenization was performed with a TissueLyser system (Qiagen), whereas complete eyeballs were disrupted using a rotor–stator homogenizer (DIAX-100, Heidolph Instruments, Schwabach, Germany). All samples were supplemented with Superase-In RNase inhibitor (Ambion, Austin, TX), quantified with a GeneQuant spectrometer (Pharmacia, New York, NY), and treated with DNase I (Invitrogen, Carlsbad, CA) for 45 min (1 unit/μg RNA for the muscle and the eye and 2 units/μg RNA for the liver). RNA integrity of 2–4 samples per tissue per lake of origin was assessed using a 2100 Bioanalyzer (Agilent, Palo Alto, CA). Reverse transcription was performed using 1 μg of treated RNA per sample according to the High-Capacity cDNA Archive Kit protocol (Applied Biosystems, Foster City, CA) using random primers.
Primer Design
cDNA from 10 individual lake whitefish (5 normals and 5 dwarfs) was amplified using primers designed from either microarray expressed sequence tags (ESTs, Salmo salar and Oncorhynchus mykiss, Rise et al. 2004; von Schalburg et al. 2005) or salmonid cDNA sequences available in GenBank (table 1). Amplification conditions were as follows: 94 °C for 5 min; 35 x (denaturation at 95 °C for 45 s, annealing at 56.5 °C for 30 s, and elongation at 72 °C for 60 s); and 72 °C for 5 min. Sequencing of the PCR products was performed directly after PCR purification with the QIAquick Gel Extraction Kit (Qiagen). Generated sequence information was used to design primers and fluorescent Taqman minor groove binder (MGB) probes for real-time PCR with Primer Express 2.0 (table 1; Applied Biosystems).
Sequencing Primers, Real-Time PCR Primers, and Taqman MGB Probe for Each Candidate Gene
| Gene | Sequencing Primers | Product | GenBank | Real-Time PCR Primers | Taqman MGB Probe |
| (5′–3′) | Size (bp)a | Accession No.a | (5′–3′) | (5′–3′) | |
| Trypsin | GTGCTGGTCAGCCAGTCAT | 501 | EU400606 | TCACCCTGGCAAGAGTCCTT | CCTCCAGGTATCCAGC |
| CGAGCACAACATCAAGGTGA | AATGATCACAAACGCCATGTTC | ||||
| Carboxylesterase | AGCTTCTGAGGGAGGAGGAC | 556 | EU400607 | GTATCAGCTCCTCCCCGTTTG | TGAGCCAGTGCGAGC |
| ACAGGCAATCAGCGATGTC | GATGGCATATTGGGCCAATTT | ||||
| Parvalbumin | GGCTGAATGCGAACATAAGA | 511 | EU400608 | AAGGTCTGGAGGAACAGCTTGA | ATCCTCCTCAATGAAGC |
| GGATGAGGATATCGCTGCTG | TACTTCGTTGACCAGGACAAGAGT | ||||
| LDH-A | ACAAGATCGTGGGCGATAAG | 527 | EU400609 | TGGGCGAGAAGCTACACCTT | CAGCTGTCACGGCTG |
| CCAGGAGGTGTAACCCTTCA | AGTCTCCGTGCTCTCCAATGA | ||||
| Opsin (SWS1) | TTGCTGGAACACCCCTAAAC | 429 | EU400610 | CACCTTGTGTGCAATGGAATCTT | CATGGGTTCAATAGCAG |
| ACCAGTCAGGTCCACAGGAG | GGGACCAAGCTGACACCAA | ||||
| Opsin (LWS) | GTTCGGAGGCAATGAGGAT | 457 | EU400611 | AACCCTGGGTATGCATTCCA | CTCTGGCTGCCGCC |
| AGGTGCCACAGAGGAAACC | TGGCGCTCTTGGCAAAGTA |
| Gene | Sequencing Primers | Product | GenBank | Real-Time PCR Primers | Taqman MGB Probe |
| (5′–3′) | Size (bp)a | Accession No.a | (5′–3′) | (5′–3′) | |
| Trypsin | GTGCTGGTCAGCCAGTCAT | 501 | EU400606 | TCACCCTGGCAAGAGTCCTT | CCTCCAGGTATCCAGC |
| CGAGCACAACATCAAGGTGA | AATGATCACAAACGCCATGTTC | ||||
| Carboxylesterase | AGCTTCTGAGGGAGGAGGAC | 556 | EU400607 | GTATCAGCTCCTCCCCGTTTG | TGAGCCAGTGCGAGC |
| ACAGGCAATCAGCGATGTC | GATGGCATATTGGGCCAATTT | ||||
| Parvalbumin | GGCTGAATGCGAACATAAGA | 511 | EU400608 | AAGGTCTGGAGGAACAGCTTGA | ATCCTCCTCAATGAAGC |
| GGATGAGGATATCGCTGCTG | TACTTCGTTGACCAGGACAAGAGT | ||||
| LDH-A | ACAAGATCGTGGGCGATAAG | 527 | EU400609 | TGGGCGAGAAGCTACACCTT | CAGCTGTCACGGCTG |
| CCAGGAGGTGTAACCCTTCA | AGTCTCCGTGCTCTCCAATGA | ||||
| Opsin (SWS1) | TTGCTGGAACACCCCTAAAC | 429 | EU400610 | CACCTTGTGTGCAATGGAATCTT | CATGGGTTCAATAGCAG |
| ACCAGTCAGGTCCACAGGAG | GGGACCAAGCTGACACCAA | ||||
| Opsin (LWS) | GTTCGGAGGCAATGAGGAT | 457 | EU400611 | AACCCTGGGTATGCATTCCA | CTCTGGCTGCCGCC |
| AGGTGCCACAGAGGAAACC | TGGCGCTCTTGGCAAAGTA |
Product sizes and GenBank accession numbers are given for the lake whitefish (Coregonus clupeaformis) partial cDNA sequences generated with the sequencing primers.
Sequencing Primers, Real-Time PCR Primers, and Taqman MGB Probe for Each Candidate Gene
| Gene | Sequencing Primers | Product | GenBank | Real-Time PCR Primers | Taqman MGB Probe |
| (5′–3′) | Size (bp)a | Accession No.a | (5′–3′) | (5′–3′) | |
| Trypsin | GTGCTGGTCAGCCAGTCAT | 501 | EU400606 | TCACCCTGGCAAGAGTCCTT | CCTCCAGGTATCCAGC |
| CGAGCACAACATCAAGGTGA | AATGATCACAAACGCCATGTTC | ||||
| Carboxylesterase | AGCTTCTGAGGGAGGAGGAC | 556 | EU400607 | GTATCAGCTCCTCCCCGTTTG | TGAGCCAGTGCGAGC |
| ACAGGCAATCAGCGATGTC | GATGGCATATTGGGCCAATTT | ||||
| Parvalbumin | GGCTGAATGCGAACATAAGA | 511 | EU400608 | AAGGTCTGGAGGAACAGCTTGA | ATCCTCCTCAATGAAGC |
| GGATGAGGATATCGCTGCTG | TACTTCGTTGACCAGGACAAGAGT | ||||
| LDH-A | ACAAGATCGTGGGCGATAAG | 527 | EU400609 | TGGGCGAGAAGCTACACCTT | CAGCTGTCACGGCTG |
| CCAGGAGGTGTAACCCTTCA | AGTCTCCGTGCTCTCCAATGA | ||||
| Opsin (SWS1) | TTGCTGGAACACCCCTAAAC | 429 | EU400610 | CACCTTGTGTGCAATGGAATCTT | CATGGGTTCAATAGCAG |
| ACCAGTCAGGTCCACAGGAG | GGGACCAAGCTGACACCAA | ||||
| Opsin (LWS) | GTTCGGAGGCAATGAGGAT | 457 | EU400611 | AACCCTGGGTATGCATTCCA | CTCTGGCTGCCGCC |
| AGGTGCCACAGAGGAAACC | TGGCGCTCTTGGCAAAGTA |
| Gene | Sequencing Primers | Product | GenBank | Real-Time PCR Primers | Taqman MGB Probe |
| (5′–3′) | Size (bp)a | Accession No.a | (5′–3′) | (5′–3′) | |
| Trypsin | GTGCTGGTCAGCCAGTCAT | 501 | EU400606 | TCACCCTGGCAAGAGTCCTT | CCTCCAGGTATCCAGC |
| CGAGCACAACATCAAGGTGA | AATGATCACAAACGCCATGTTC | ||||
| Carboxylesterase | AGCTTCTGAGGGAGGAGGAC | 556 | EU400607 | GTATCAGCTCCTCCCCGTTTG | TGAGCCAGTGCGAGC |
| ACAGGCAATCAGCGATGTC | GATGGCATATTGGGCCAATTT | ||||
| Parvalbumin | GGCTGAATGCGAACATAAGA | 511 | EU400608 | AAGGTCTGGAGGAACAGCTTGA | ATCCTCCTCAATGAAGC |
| GGATGAGGATATCGCTGCTG | TACTTCGTTGACCAGGACAAGAGT | ||||
| LDH-A | ACAAGATCGTGGGCGATAAG | 527 | EU400609 | TGGGCGAGAAGCTACACCTT | CAGCTGTCACGGCTG |
| CCAGGAGGTGTAACCCTTCA | AGTCTCCGTGCTCTCCAATGA | ||||
| Opsin (SWS1) | TTGCTGGAACACCCCTAAAC | 429 | EU400610 | CACCTTGTGTGCAATGGAATCTT | CATGGGTTCAATAGCAG |
| ACCAGTCAGGTCCACAGGAG | GGGACCAAGCTGACACCAA | ||||
| Opsin (LWS) | GTTCGGAGGCAATGAGGAT | 457 | EU400611 | AACCCTGGGTATGCATTCCA | CTCTGGCTGCCGCC |
| AGGTGCCACAGAGGAAACC | TGGCGCTCTTGGCAAAGTA |
Product sizes and GenBank accession numbers are given for the lake whitefish (Coregonus clupeaformis) partial cDNA sequences generated with the sequencing primers.
Quantitative Real-Time RT-PCR
Target cDNAs were amplified in triplicate by real-time reverse transcription PCR (RT-PCR) using an ABI PRISM 7000 thermocycler according to manufacturer's instructions (Applied Biosystems). Amplification conditions were as follows: a 50 °C activation step (2 min), a 95 °C activation step (10 min), and 40 cycles of 95 °C denaturation (15 sec) and 60 °C anneal/extension (1 min; default conditions). An endogenous control (Human Euk 18S rRNA, Applied Biosystems) was used to normalize for the total amount of RNA in each sample (1 per tissue per individual). The comparative CT method was used for relative quantitation of candidate gene expression (Applied Biosystems). This calculation method eliminates the need for a standard curve and yields expression measures that are relative to an arbitrary reference sample (calibrator), for which expression for all candidate genes will equal 1. Real-time PCR was performed on treated RNA samples from every tissue to assess the amount of contaminating genomic DNA. DNase treatment efficiency was deemed adequate when contaminating DNA amplification occurred outside the dynamic range of the assay, that is, after 34 PCR cycles. Reproducibility of the results was assessed by replication of real-time PCR measurements of the same samples for 1) two independent runs with the same cDNA sample, 2) two independent runs with the same cDNA sample but different experimenters, and 3) two independent cDNA samples extracted from the same tissue sample. Reproducibility was high for all three comparisons with the slope of all linear regressions close to 1 and R2 ≥ 0.95 (results not shown).
Real-time PCR amplicons for all genes and species were sequenced to assess 1) gene identity and 2) polymorphism. More specifically, for two random individuals per species, candidate gene amplicons were amplified from cDNA and cloned according to the TOPO TA Cloning Kit protocol (pCR2.1-TOPO vector, TOP10 chemically competent cells, Invitrogen). Transformants were directly amplified and sequenced according to manufacturer's instructions. Direct sequencing without a cloning step was tried but yielded poor-quality sequence data, as all six real-time PCR amplicons are less than 70 bp long. Results obtained for carboxylesterase, parvalbumin, SWS1, and LWS confirmed gene identity and were devoid of polymorphism (results not shown). For trypsin, the primers and probe sequences were also conserved among species. However, the primers for this gene appeared to amplify other sequences. This is not surprising as trypsin is a very conserved and multicopy gene. The unwanted PCR products are very unlikely to have been measured by real-time PCR because of low identity with the probe (56% and 63%). Indeed, in the real-time PCR, hybridization occurs at 60 °C, whereas basic melting temperature calculations for the 16-bp trypsin probe with 6 and 7 mismatches yield values as low as 27 and 21 °C, respectively. For LDH-A, amplicon sequencing revealed a single nucleotide polymorphism (SNP) at position 7 of the real-time PCR probe, thus resolving two different alleles. Both alleles were present in coregonine species of both North America and Europe: one allele appeared to be very common in North America (5/6 sequences), whereas the other was more common in Europe (5/6 sequences). This feature was taken into consideration in the interpretation of expression data for LDH-A. Overall, sequence analysis of amplicons confirmed their identity across species.
Statistical Analyses
Because microarray and real-time PCR technologies are based on very different processes, expression levels and fold changes are not expected to be identical between both methods (Morey et al. 2006). Therefore, microarray and real-time PCR results were compared only in terms of significant expression differences between groups, as well as their direction.
First, for each candidate gene, one-tailed paired t-tests for unequal variances between groups were performed in R 2.2.1 (The R Foundation for Statistical Computing 2005) to test the null hypothesis that the data did not follow the trends in expression differences expected from microarray results or functional predictions (see Selection of candidate genes). More specifically, the mean relative gene expression for each North American and European limnetic whitefish, lake cisco and vendace, population was compared with that of its sympatric benthic whitefish counterpart (the cisco was compared with the mean of both natural North American benthic populations). In addition, the data from the cisco and vendace were removed, and a second test was performed in order to detect whitefish-specific trends.
Second, populations were grouped into the six coregonine fishes presented in figure 1, which were treated as six separate species for the sake of analyses. In doing so, each ecological type of interest for the study's objectives, that is, benthic whitefish, limnetic whitefish, and limnetic competitor, was represented twice: once for North America and once for Europe. This allowed intercontinental comparisons while greatly reducing the risk of combining samples collected during different years and seasons (see Sample collection). Then, two multivariate mixed linear models were constructed in order to assess the species effect on candidate gene expression while keeping other parameters constant. The first model included genes from the liver and muscle, and the second model included genes from the eye only. The latter were modeled separately because of the different experimental designs (eye tissue unavailable for North American populations, see Sample collection). To construct the models, the MIXED procedure in SAS v.9.1 was used according to the multivariate approach developed by Wright (1998). This procedure allows random variables and missing values and assumes normally distributed homoskedastic residuals. In order to meet this assumption, all gene expression values were log2 transformed. Species, sex, and the species × sex interaction were included in both models as fixed independent variables while lake of origin was included as a random variable. Models were estimated with the restricted maximum likelihood method. A dummy variable containing gene names (var) was used to transform dependent variable columns containing expression data for each gene into a single vector of dependent variables (tall format). The “repeated” statement was then used with individual fish as subject in order to make the model multivariate (Wright 1998). The P values for fixed effects were obtained through a type 3 test of fixed effect (F statistic) and with the Wald test for covariance parameters (Z statistic) in the case of the random lake effect. The P values for simple fixed effects (divided per candidate gene) were obtained through a test of effect slices (F statistic, “slice” option). The P values for differences in gene expression least squares means (t-test) were corrected for multiple testing with the QVALUE software using the smoother method and a false discovery rate of 0.05 (Storey 2002).
The effect of body size on gene expression could not be included in the previous models because of its interdependency with the species effect (species are defined in part by their size differences). This variable was therefore treated separately in order to assess its effect relative to the species effect. To do so, the MIXED procedure in SAS was used to construct two types of models: 1) a multivariate model identical to the one presented above, but with fork length as fixed effect instead of species; 2) a multivariate model identical to the one presented above, but with fork length and fork length2 instead of species, to account for potential nonlinear effects. This was done for both the liver/muscle model and the eye model, with the maximum likelihood method. Akaike information criterion (AIC) and Schwarz Bayesian information criterion were compared between species and size models. These criteria are different expressions of the likelihood that describe the goodness of fit of the estimated statistical model, taking into account the number of parameters included.
Results
Relative Candidate Gene Expression and Direction of Differences
Gene Expression in Liver
In accordance to what was found by St-Cyr et al. (2008), the results showed that trypsin and carboxylesterase were both overexpressed in liver in all North American limnetic relative to benthic lake whitefish populations, including those held in controlled conditions (table 2). Moreover, the same observation was made in the case of European limnetic compared with benthic whitefish populations, both from Norwegian as well as Swiss lakes. However, this parallel overexpression pattern did not extend to the limnetic cisco and vendace, where both trypsin and carboxylesterase were underexpressed, compared with their benthic whitefish counterparts (table 2).
Mean Relative Expression and Standard Deviation (SD) for Each Population and Candidate Gene
| Lake and Populationa | Gene | Trypsin | Carboxylesterase | Parvalbumin | LDH-A | Opsin SWS1 | Opsin LWS | ||||||
| Meanb | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | ||
| Cliff | Normal | 2.81 | 3.17 | 0.63 | 0.28 | 0.34 | 0.35 | 1.23 | 0.59 | n/a | n/a | n/a | n/a |
| Dwarf | 7.40 | 14.25 | 0.83 | 0.16 | 1.55 | 1.26 | 1.47 | 0.68 | n/a | n/a | n/a | n/a | |
| Indian | Normal | 1.33 | 2.49 | 0.67 | 0.12 | 2.61 | 1.26 | 1.09 | 0.33 | n/a | n/a | n/a | n/a |
| Dwarf | 29.25 | 29.29 | 0.83 | 0.15 | 0.90 | 1.02 | 1.27 | 0.21 | n/a | n/a | n/a | n/a | |
| Larsa | Normal | 8.09 | 18.24 | 0.19 | 0.16 | 4.00 | 1.41 | 1.03 | 0.36 | n/a | n/a | n/a | n/a |
| Dwarf | 24.12 | 34.23 | 0.55 | 0.28 | 5.29 | 3.77 | 1.61 | 0.44 | n/a | n/a | n/a | n/a | |
| Zurich | Benthic | 0.02 | 0.01 | 0.16 | 0.09 | 24.29 | 17.34 | 0.49 | 0.36 | 2.12 | 1.85 | 1.23 | 0.86 |
| Limnetic | 0.47 | 0.65 | 0.24 | 0.15 | 48.62 | 54.99 | 0.86 | 1.01 | 0.77 | 0.76 | 1.38 | 0.85 | |
| Lucerne | Benthic | 0.07 | 0.12 | 0.20 | 0.14 | 35.82 | 38.19 | 1.70 | 1.02 | 0.82 | 0.71 | 1.46 | 0.71 |
| Limnetic | 14.40 | 25.85 | 0.73 | 0.91 | 21.32 | 15.67 | 0.94 | 0.81 | 0.33 | 0.19 | 1.87 | 0.62 | |
| Pasvik | Benthic | 6.62 | 10.47 | 0.78 | 0.28 | 41.80 | 27.85 | 1.49 | 0.79 | 0.87 | 0.92 | 1.82 | 0.48 |
| Limnetic | 8.05 | 8.32 | 0.84 | 0.97 | 42.29 | 17.84 | 1.75 | 1.94 | 1.33 | 0.88 | 2.20 | 0.40 | |
| Stuorajávri | Benthic | 1.71 | 1.90 | 0.44 | 0.16 | 5.20 | 4.08 | 1.05 | 0.34 | n/a | n/a | n/a | n/a |
| Limnetic | 5.17 | 9.16 | 0.71 | 0.27 | 32.18 | 13.44 | 0.89 | 0.20 | n/a | n/a | n/a | n/a | |
| Cisco | 0.64 | 1.06 | 0.33 | 0.14 | 3.62 | 2.53 | 1.14 | 0.24 | n/a | n/a | n/a | n/a | |
| Vendace | 0.64 | 0.60 | 0.01 | 0.00 | 44.22 | 38.05 | 1.49 | 0.66 | 0.09 | 0.06 | 3.30 | 0.97 | |
| Predictionc | L > B | L > B | L > B | L > B | L < B | L > B | |||||||
| P value, all datad | 0.05 | 0.32 | 0.15 | 0.29 | 0.12 | 0.07 | |||||||
| P value, whitefish speciesd | 0.02 | 0.005 | 0.19 | 0.29 | 0.24 | 0.03 | |||||||
| Lake and Populationa | Gene | Trypsin | Carboxylesterase | Parvalbumin | LDH-A | Opsin SWS1 | Opsin LWS | ||||||
| Meanb | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | ||
| Cliff | Normal | 2.81 | 3.17 | 0.63 | 0.28 | 0.34 | 0.35 | 1.23 | 0.59 | n/a | n/a | n/a | n/a |
| Dwarf | 7.40 | 14.25 | 0.83 | 0.16 | 1.55 | 1.26 | 1.47 | 0.68 | n/a | n/a | n/a | n/a | |
| Indian | Normal | 1.33 | 2.49 | 0.67 | 0.12 | 2.61 | 1.26 | 1.09 | 0.33 | n/a | n/a | n/a | n/a |
| Dwarf | 29.25 | 29.29 | 0.83 | 0.15 | 0.90 | 1.02 | 1.27 | 0.21 | n/a | n/a | n/a | n/a | |
| Larsa | Normal | 8.09 | 18.24 | 0.19 | 0.16 | 4.00 | 1.41 | 1.03 | 0.36 | n/a | n/a | n/a | n/a |
| Dwarf | 24.12 | 34.23 | 0.55 | 0.28 | 5.29 | 3.77 | 1.61 | 0.44 | n/a | n/a | n/a | n/a | |
| Zurich | Benthic | 0.02 | 0.01 | 0.16 | 0.09 | 24.29 | 17.34 | 0.49 | 0.36 | 2.12 | 1.85 | 1.23 | 0.86 |
| Limnetic | 0.47 | 0.65 | 0.24 | 0.15 | 48.62 | 54.99 | 0.86 | 1.01 | 0.77 | 0.76 | 1.38 | 0.85 | |
| Lucerne | Benthic | 0.07 | 0.12 | 0.20 | 0.14 | 35.82 | 38.19 | 1.70 | 1.02 | 0.82 | 0.71 | 1.46 | 0.71 |
| Limnetic | 14.40 | 25.85 | 0.73 | 0.91 | 21.32 | 15.67 | 0.94 | 0.81 | 0.33 | 0.19 | 1.87 | 0.62 | |
| Pasvik | Benthic | 6.62 | 10.47 | 0.78 | 0.28 | 41.80 | 27.85 | 1.49 | 0.79 | 0.87 | 0.92 | 1.82 | 0.48 |
| Limnetic | 8.05 | 8.32 | 0.84 | 0.97 | 42.29 | 17.84 | 1.75 | 1.94 | 1.33 | 0.88 | 2.20 | 0.40 | |
| Stuorajávri | Benthic | 1.71 | 1.90 | 0.44 | 0.16 | 5.20 | 4.08 | 1.05 | 0.34 | n/a | n/a | n/a | n/a |
| Limnetic | 5.17 | 9.16 | 0.71 | 0.27 | 32.18 | 13.44 | 0.89 | 0.20 | n/a | n/a | n/a | n/a | |
| Cisco | 0.64 | 1.06 | 0.33 | 0.14 | 3.62 | 2.53 | 1.14 | 0.24 | n/a | n/a | n/a | n/a | |
| Vendace | 0.64 | 0.60 | 0.01 | 0.00 | 44.22 | 38.05 | 1.49 | 0.66 | 0.09 | 0.06 | 3.30 | 0.97 | |
| Predictionc | L > B | L > B | L > B | L > B | L < B | L > B | |||||||
| P value, all datad | 0.05 | 0.32 | 0.15 | 0.29 | 0.12 | 0.07 | |||||||
| P value, whitefish speciesd | 0.02 | 0.005 | 0.19 | 0.29 | 0.24 | 0.03 | |||||||
n/a, not applicable
Normal and dwarf populations: Coregonus clupeaformis; benthic and limnetic populations: Coregonus lavaretus.
Expression values are relative to an arbitrary reference sample, for which expression for all candidate genes is equal to 1.
The mean relative gene expression for each limnetic whitefish, cisco and vendace, population was compared with that of its sympatric benthic whitefish counterpart (the cisco was compared with the mean of both natural American benthic populations). Pairs matching the predicted parallel direction in expression difference are in bold characters. L: limnetic fishes; B: benthic fishes.
For every gene, one-tailed paired t-tests were used to test the null hypothesis that there is no overall difference in expression between benthic and limnetic fishes. P values are given for all data and whitefish species only (α = 0.05).
Mean Relative Expression and Standard Deviation (SD) for Each Population and Candidate Gene
| Lake and Populationa | Gene | Trypsin | Carboxylesterase | Parvalbumin | LDH-A | Opsin SWS1 | Opsin LWS | ||||||
| Meanb | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | ||
| Cliff | Normal | 2.81 | 3.17 | 0.63 | 0.28 | 0.34 | 0.35 | 1.23 | 0.59 | n/a | n/a | n/a | n/a |
| Dwarf | 7.40 | 14.25 | 0.83 | 0.16 | 1.55 | 1.26 | 1.47 | 0.68 | n/a | n/a | n/a | n/a | |
| Indian | Normal | 1.33 | 2.49 | 0.67 | 0.12 | 2.61 | 1.26 | 1.09 | 0.33 | n/a | n/a | n/a | n/a |
| Dwarf | 29.25 | 29.29 | 0.83 | 0.15 | 0.90 | 1.02 | 1.27 | 0.21 | n/a | n/a | n/a | n/a | |
| Larsa | Normal | 8.09 | 18.24 | 0.19 | 0.16 | 4.00 | 1.41 | 1.03 | 0.36 | n/a | n/a | n/a | n/a |
| Dwarf | 24.12 | 34.23 | 0.55 | 0.28 | 5.29 | 3.77 | 1.61 | 0.44 | n/a | n/a | n/a | n/a | |
| Zurich | Benthic | 0.02 | 0.01 | 0.16 | 0.09 | 24.29 | 17.34 | 0.49 | 0.36 | 2.12 | 1.85 | 1.23 | 0.86 |
| Limnetic | 0.47 | 0.65 | 0.24 | 0.15 | 48.62 | 54.99 | 0.86 | 1.01 | 0.77 | 0.76 | 1.38 | 0.85 | |
| Lucerne | Benthic | 0.07 | 0.12 | 0.20 | 0.14 | 35.82 | 38.19 | 1.70 | 1.02 | 0.82 | 0.71 | 1.46 | 0.71 |
| Limnetic | 14.40 | 25.85 | 0.73 | 0.91 | 21.32 | 15.67 | 0.94 | 0.81 | 0.33 | 0.19 | 1.87 | 0.62 | |
| Pasvik | Benthic | 6.62 | 10.47 | 0.78 | 0.28 | 41.80 | 27.85 | 1.49 | 0.79 | 0.87 | 0.92 | 1.82 | 0.48 |
| Limnetic | 8.05 | 8.32 | 0.84 | 0.97 | 42.29 | 17.84 | 1.75 | 1.94 | 1.33 | 0.88 | 2.20 | 0.40 | |
| Stuorajávri | Benthic | 1.71 | 1.90 | 0.44 | 0.16 | 5.20 | 4.08 | 1.05 | 0.34 | n/a | n/a | n/a | n/a |
| Limnetic | 5.17 | 9.16 | 0.71 | 0.27 | 32.18 | 13.44 | 0.89 | 0.20 | n/a | n/a | n/a | n/a | |
| Cisco | 0.64 | 1.06 | 0.33 | 0.14 | 3.62 | 2.53 | 1.14 | 0.24 | n/a | n/a | n/a | n/a | |
| Vendace | 0.64 | 0.60 | 0.01 | 0.00 | 44.22 | 38.05 | 1.49 | 0.66 | 0.09 | 0.06 | 3.30 | 0.97 | |
| Predictionc | L > B | L > B | L > B | L > B | L < B | L > B | |||||||
| P value, all datad | 0.05 | 0.32 | 0.15 | 0.29 | 0.12 | 0.07 | |||||||
| P value, whitefish speciesd | 0.02 | 0.005 | 0.19 | 0.29 | 0.24 | 0.03 | |||||||
| Lake and Populationa | Gene | Trypsin | Carboxylesterase | Parvalbumin | LDH-A | Opsin SWS1 | Opsin LWS | ||||||
| Meanb | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | Mean | SD | ||
| Cliff | Normal | 2.81 | 3.17 | 0.63 | 0.28 | 0.34 | 0.35 | 1.23 | 0.59 | n/a | n/a | n/a | n/a |
| Dwarf | 7.40 | 14.25 | 0.83 | 0.16 | 1.55 | 1.26 | 1.47 | 0.68 | n/a | n/a | n/a | n/a | |
| Indian | Normal | 1.33 | 2.49 | 0.67 | 0.12 | 2.61 | 1.26 | 1.09 | 0.33 | n/a | n/a | n/a | n/a |
| Dwarf | 29.25 | 29.29 | 0.83 | 0.15 | 0.90 | 1.02 | 1.27 | 0.21 | n/a | n/a | n/a | n/a | |
| Larsa | Normal | 8.09 | 18.24 | 0.19 | 0.16 | 4.00 | 1.41 | 1.03 | 0.36 | n/a | n/a | n/a | n/a |
| Dwarf | 24.12 | 34.23 | 0.55 | 0.28 | 5.29 | 3.77 | 1.61 | 0.44 | n/a | n/a | n/a | n/a | |
| Zurich | Benthic | 0.02 | 0.01 | 0.16 | 0.09 | 24.29 | 17.34 | 0.49 | 0.36 | 2.12 | 1.85 | 1.23 | 0.86 |
| Limnetic | 0.47 | 0.65 | 0.24 | 0.15 | 48.62 | 54.99 | 0.86 | 1.01 | 0.77 | 0.76 | 1.38 | 0.85 | |
| Lucerne | Benthic | 0.07 | 0.12 | 0.20 | 0.14 | 35.82 | 38.19 | 1.70 | 1.02 | 0.82 | 0.71 | 1.46 | 0.71 |
| Limnetic | 14.40 | 25.85 | 0.73 | 0.91 | 21.32 | 15.67 | 0.94 | 0.81 | 0.33 | 0.19 | 1.87 | 0.62 | |
| Pasvik | Benthic | 6.62 | 10.47 | 0.78 | 0.28 | 41.80 | 27.85 | 1.49 | 0.79 | 0.87 | 0.92 | 1.82 | 0.48 |
| Limnetic | 8.05 | 8.32 | 0.84 | 0.97 | 42.29 | 17.84 | 1.75 | 1.94 | 1.33 | 0.88 | 2.20 | 0.40 | |
| Stuorajávri | Benthic | 1.71 | 1.90 | 0.44 | 0.16 | 5.20 | 4.08 | 1.05 | 0.34 | n/a | n/a | n/a | n/a |
| Limnetic | 5.17 | 9.16 | 0.71 | 0.27 | 32.18 | 13.44 | 0.89 | 0.20 | n/a | n/a | n/a | n/a | |
| Cisco | 0.64 | 1.06 | 0.33 | 0.14 | 3.62 | 2.53 | 1.14 | 0.24 | n/a | n/a | n/a | n/a | |
| Vendace | 0.64 | 0.60 | 0.01 | 0.00 | 44.22 | 38.05 | 1.49 | 0.66 | 0.09 | 0.06 | 3.30 | 0.97 | |
| Predictionc | L > B | L > B | L > B | L > B | L < B | L > B | |||||||
| P value, all datad | 0.05 | 0.32 | 0.15 | 0.29 | 0.12 | 0.07 | |||||||
| P value, whitefish speciesd | 0.02 | 0.005 | 0.19 | 0.29 | 0.24 | 0.03 | |||||||
n/a, not applicable
Normal and dwarf populations: Coregonus clupeaformis; benthic and limnetic populations: Coregonus lavaretus.
Expression values are relative to an arbitrary reference sample, for which expression for all candidate genes is equal to 1.
The mean relative gene expression for each limnetic whitefish, cisco and vendace, population was compared with that of its sympatric benthic whitefish counterpart (the cisco was compared with the mean of both natural American benthic populations). Pairs matching the predicted parallel direction in expression difference are in bold characters. L: limnetic fishes; B: benthic fishes.
For every gene, one-tailed paired t-tests were used to test the null hypothesis that there is no overall difference in expression between benthic and limnetic fishes. P values are given for all data and whitefish species only (α = 0.05).
Gene Expression in White Muscle
The parvalbumin gene was overexpressed in all limnetic morphs and species compared with benthic populations except in Indian Pond and Lake Lucerne, where the opposite pattern was observed (table 2). Therefore, our results are not in full agreement with previous microarray experiments, which revealed parallel overexpression among North American limnetic lake whitefish and cisco populations compared with benthic lake whitefish populations for this gene (Derome and Bernatchez 2006; Derome et al. 2006). The LDH-A gene was overexpressed in all limnetic compared with benthic whitefish populations except in Lakes Lucerne and Stuorajávri (table 2). No difference between cisco and vendace compared with benthic whitefish was observed.
Gene Expression in Eye
The LWS opsin gene showed clear evidence of parallel divergence, as it was overexpressed in all limnetic populations, including vendace, compared with benthic whitefish populations (table 2). SWS1, on the other hand, was underexpressed in all limnetic populations compared with benthic whitefish populations with the exception of limnetic whitefish from the Pasvik river catchment (table 2).
Multivariate Mixed Linear Models
The aim of the multivariate models was to assess the species effect on candidate gene expression while keeping other parameters constant. The results of these models generally corroborated the patterns reported in table 2 but added more statistical resolution to the species, sex, and lake of origin effects on candidate gene expression as well as their relative contributions.
Liver and White Muscle Tissues
The species × sex interaction was not statistically significant (P = 0.09) and therefore removed from the fixed effects in order to facilitate result interpretation. Both species and sex had a highly significant effect on candidate gene expression but varied according to genes (table 3). More specifically, the species effect was significant for trypsin, carboxylesterase, and parvalbumin but not for LDH-A. The sex effect was particularly strong on carboxylesterase level of expression and also significant for trypsin (table 3). Here, carboxylesterase was overexpressed in males compared with females for all species, whereas no such pattern was observed for trypsin (results not shown). Trypsin and carboxylesterase showed the strongest patterns of parallelism among whitefish species through significant overexpression in limnetic lake whitefish and European whitefish (fig. 2). Lake whitefish exhibited mean fold changes of 8.2 and 1.7, whereas European whitefish exhibited mean fold changes of 9.7 and 1.5 in trypsin and carboxylesterase expression, respectively. Moreover, for trypsin, cisco and vendace were significantly different neither from one another nor from benthic whitefish species. Thus, they only differed from limnetic whitefish species (fig. 2). For carboxylesterase, cisco and vendace were significantly different from one another, with the cisco being similar to all whitefish species while the vendace showed significant underexpression compared with all other species. For parvalbumin, it is clear from figure 2 that the significant species effect on gene expression presented in table 3 is entirely attributable to the difference between American and European species, independently of species identity. Finally, no differences in LDH-A expression were found, suggesting expression stasis among coregonine species investigated in this study.
SAS MIXED Procedure Multivariate Mixed Linear Model Results for Candidate Genes from Liver and White Muscle Tissues
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 1.59 | 0.06 | |
| Species (fixed) | 53.36 | <0.0001 | |
| Trypsin | 15.50 | <0.0001 | |
| Carboxylesterase | 108.88 | <0.0001 | |
| Parvalbumin | 16.58 | <0.0001 | |
| LDH-A | 1.38 | 0.23 | |
| Sex (fixed) | 17.00 | <0.0001 | |
| Trypsin | 6.89 | 0.009 | |
| Carboxylesterase | 62.83 | <0.0001 | |
| Parvalbumin | 0.28 | 0.59 | |
| LDH-A | 3.10 | 0.08 |
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 1.59 | 0.06 | |
| Species (fixed) | 53.36 | <0.0001 | |
| Trypsin | 15.50 | <0.0001 | |
| Carboxylesterase | 108.88 | <0.0001 | |
| Parvalbumin | 16.58 | <0.0001 | |
| LDH-A | 1.38 | 0.23 | |
| Sex (fixed) | 17.00 | <0.0001 | |
| Trypsin | 6.89 | 0.009 | |
| Carboxylesterase | 62.83 | <0.0001 | |
| Parvalbumin | 0.28 | 0.59 | |
| LDH-A | 3.10 | 0.08 |
The P value for the random lake effect was obtained through a Wald test for covariance parameters (Z statistic). The P values for the fixed effects were obtained through a type 3 test of fixed effect (F statistic). Individual P values for each candidate gene were obtained from F statistics using the slice option of the least squares means statement (α = 0.05).
SAS MIXED Procedure Multivariate Mixed Linear Model Results for Candidate Genes from Liver and White Muscle Tissues
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 1.59 | 0.06 | |
| Species (fixed) | 53.36 | <0.0001 | |
| Trypsin | 15.50 | <0.0001 | |
| Carboxylesterase | 108.88 | <0.0001 | |
| Parvalbumin | 16.58 | <0.0001 | |
| LDH-A | 1.38 | 0.23 | |
| Sex (fixed) | 17.00 | <0.0001 | |
| Trypsin | 6.89 | 0.009 | |
| Carboxylesterase | 62.83 | <0.0001 | |
| Parvalbumin | 0.28 | 0.59 | |
| LDH-A | 3.10 | 0.08 |
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 1.59 | 0.06 | |
| Species (fixed) | 53.36 | <0.0001 | |
| Trypsin | 15.50 | <0.0001 | |
| Carboxylesterase | 108.88 | <0.0001 | |
| Parvalbumin | 16.58 | <0.0001 | |
| LDH-A | 1.38 | 0.23 | |
| Sex (fixed) | 17.00 | <0.0001 | |
| Trypsin | 6.89 | 0.009 | |
| Carboxylesterase | 62.83 | <0.0001 | |
| Parvalbumin | 0.28 | 0.59 | |
| LDH-A | 3.10 | 0.08 |
The P value for the random lake effect was obtained through a Wald test for covariance parameters (Z statistic). The P values for the fixed effects were obtained through a type 3 test of fixed effect (F statistic). Individual P values for each candidate gene were obtained from F statistics using the slice option of the least squares means statement (α = 0.05).
Least squares means of log2 (relative candidate gene expression) for 6 coregonine species and 4 candidate genes expressed in the liver or white muscle. Error bars represent standard error. Ccl: Coregonus clupeaformis (lake whitefish); Cla: Coregonus lavaretus (European whitefish); Car: Coregonus artedi (lake cisco); Cal: Coregonus albula (vendace). Dotted lines separate North American (left) from European (right) species. The P values correspond to t-tests for differences in least squares means (*P < 0.05, **P < 0.01, and ***P < 0.001). The corresponding q values for P values <0.05 all remained significant with a false discovery rate = 0.05.
Eye Tissue
The species × sex interaction was nonsignificant (P = 0.34) and therefore removed from the fixed effects. The species effect on gene expression was highly significant overall as well as for both the SWS1 and LWS candidate genes considered separately (table 4). The sex effect on gene expression was nonsignificant. SWS1 was significantly underexpressed in vendace compared with benthic and limnetic European whitefish, and LWS was overexpressed in vendace compared with both whitefish species (fig. 3). It is noteworthy that the two expression patterns tended to mirror each other, in a trend corresponding to first expectations, that is, limnetic fishes should overexpress LWS and underexpress SWS1.
SAS MIXED Procedure Multivariate Mixed Linear Model Results for Candidate Genes from the Eye
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 0.81 | 0.21 | |
| Species (fixed) | 29.80 | <0.0001 | |
| SWS1 | 28.81 | <0.0001 | |
| LWS | 4.48 | 0.01 | |
| Sex (fixed) | 0.07 | 0.93 | |
| SWS1 | 0.05 | 0.82 | |
| LWS | 0.14 | 0.70 |
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 0.81 | 0.21 | |
| Species (fixed) | 29.80 | <0.0001 | |
| SWS1 | 28.81 | <0.0001 | |
| LWS | 4.48 | 0.01 | |
| Sex (fixed) | 0.07 | 0.93 | |
| SWS1 | 0.05 | 0.82 | |
| LWS | 0.14 | 0.70 |
The P value for the random lake effect was obtained through a Wald test for covariance parameters (Z statistic). The P values for the fixed effects were obtained through a type 3 test of fixed effect (F statistic). Individual P values for each candidate gene were obtained from F statistics using the slice option of the LSMEANS statement (α = 0.05).
SAS MIXED Procedure Multivariate Mixed Linear Model Results for Candidate Genes from the Eye
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 0.81 | 0.21 | |
| Species (fixed) | 29.80 | <0.0001 | |
| SWS1 | 28.81 | <0.0001 | |
| LWS | 4.48 | 0.01 | |
| Sex (fixed) | 0.07 | 0.93 | |
| SWS1 | 0.05 | 0.82 | |
| LWS | 0.14 | 0.70 |
| Variable | Gene | Z or F Valuea | P Valuea |
| Lake (random) | 0.81 | 0.21 | |
| Species (fixed) | 29.80 | <0.0001 | |
| SWS1 | 28.81 | <0.0001 | |
| LWS | 4.48 | 0.01 | |
| Sex (fixed) | 0.07 | 0.93 | |
| SWS1 | 0.05 | 0.82 | |
| LWS | 0.14 | 0.70 |
The P value for the random lake effect was obtained through a Wald test for covariance parameters (Z statistic). The P values for the fixed effects were obtained through a type 3 test of fixed effect (F statistic). Individual P values for each candidate gene were obtained from F statistics using the slice option of the LSMEANS statement (α = 0.05).
Least squares means of log2 (relative candidate gene expression) for 3 coregonine species and 2 candidate genes expressed in the eye. Error bars represent standard error. Cla: Coregonus lavaretus (European whitefish); Cal: Coregonus albula (vendace). The P values correspond to t-tests for differences in least squares means (*P < 0.05, **P < 0.01, ***P < 0.001). The corresponding q values for P values <0.05 all remained significant with a false discovery rate = 0.05.
Effect of Body Size on Candidate Gene Expression
According to a comparison of information criteria between species-based and fork length–based models (table 5), our data were clearly better explained by the species fixed effect than by the body size effect. Indeed, the smallest delta AIC between species-based and fork length–based models was equal to 367.4 for the liver/muscle model and 8.2 for the eye model. Both values are greater than eight, which indicates that fork length–based models are unlikely (eye model) or very unlikely (liver/muscle model) to better fit the data compared with species-based models (Burnham and Anderson 2002).
Information Criteria for Multivariate Mixed Models of Candidate Gene Expression with Different Focal Fixed Effect Variables
| Model | Focal Variables | −2log (Likelihood) | AICa | BICb |
| Liver/muscle | Species | 2629.8 | 2707.8 | 2715.5 |
| FL | 3063.9 | 3109.9 | 3114.4 | |
| FL, FL2 | 3021.2 | 3075.2 | 3080.5 | |
| Eye | Species | 525.0 | 549.0 | 538.1 |
| FL | 545.3 | 563.3 | 555.1 | |
| FL, FL2 | 535.2 | 557.2 | 547.3 |
| Model | Focal Variables | −2log (Likelihood) | AICa | BICb |
| Liver/muscle | Species | 2629.8 | 2707.8 | 2715.5 |
| FL | 3063.9 | 3109.9 | 3114.4 | |
| FL, FL2 | 3021.2 | 3075.2 | 3080.5 | |
| Eye | Species | 525.0 | 549.0 | 538.1 |
| FL | 545.3 | 563.3 | 555.1 | |
| FL, FL2 | 535.2 | 557.2 | 547.3 |
FL, Fork length; BIC, Bayesian information criterion.
Akaike criterion; smaller values indicate a better fit.
Schwarz criterion; smaller values indicate a better fit.
Information Criteria for Multivariate Mixed Models of Candidate Gene Expression with Different Focal Fixed Effect Variables
| Model | Focal Variables | −2log (Likelihood) | AICa | BICb |
| Liver/muscle | Species | 2629.8 | 2707.8 | 2715.5 |
| FL | 3063.9 | 3109.9 | 3114.4 | |
| FL, FL2 | 3021.2 | 3075.2 | 3080.5 | |
| Eye | Species | 525.0 | 549.0 | 538.1 |
| FL | 545.3 | 563.3 | 555.1 | |
| FL, FL2 | 535.2 | 557.2 | 547.3 |
| Model | Focal Variables | −2log (Likelihood) | AICa | BICb |
| Liver/muscle | Species | 2629.8 | 2707.8 | 2715.5 |
| FL | 3063.9 | 3109.9 | 3114.4 | |
| FL, FL2 | 3021.2 | 3075.2 | 3080.5 | |
| Eye | Species | 525.0 | 549.0 | 538.1 |
| FL | 545.3 | 563.3 | 555.1 | |
| FL, FL2 | 535.2 | 557.2 | 547.3 |
FL, Fork length; BIC, Bayesian information criterion.
Akaike criterion; smaller values indicate a better fit.
Schwarz criterion; smaller values indicate a better fit.
Discussion
The analysis of RT-PCR profiles for candidate genes provided new insights into the evolution of coregonine fishes and to our knowledge has not been applied so far to other nonmodel organisms in studies of adaptive radiation. Here, our aim was to document transcriptional divergence among members of the coregonine adaptive radiation by testing the hypothesis that parallel phenotypic adaptation toward the use of the limnetic niche in coregonine fishes was accompanied by parallelism in candidate gene transcription. Results obtained for candidate genes from the liver and the eye, where parallelism in expression was generally observed across all whitefish species pairs, provide strong support for the hypothesis that divergent natural selection plays an important role in the functional evolution associated with these organs during the course of coregonine adaptive radiation. Indeed, random processes are very unlikely to have generated such a pattern (Schluter and Nagel 1995). However, the parallelism in expression for both candidate genes from the liver did not extend to cisco and vendace, thereby infirming transcriptional convergence between limnetic whitefish species and their limnetic congeners. Evidence for parallelism in expression was less pronounced in white muscle because the expected overexpression of parvalbumin and LDH-A in limnetic whitefish species compared with benthic species was not observed in 2 out of 6 lakes.
A first general analysis of the raw data demonstrated significant parallelism in expression divergence between benthic and limnetic whitefish populations for three candidate genes. Moreover, the multivariate mixed models results showed that even taking into account other key sources of variation such as the lake of origin and sex, it was still possible to detect a strong species signal in the expression data. This demonstrates the efficiency of our approach, especially considering that all benthic and limnetic whitefish species pairs diverged within the past 15,000 years only. Obviously, we are not arguing that the candidate genes selected for this study are necessarily the most important ones underlying the adaptive radiation of coregonine fishes (see Rogers and Bernatchez 2005; Derome and Bernatchez 2006; Derome et al. 2006; Rogers and Bernatchez 2007; St-Cyr et al. 2008). Nevertheless, focusing on one or a few candidate genes is the only way of properly dissecting expression differences, and quantitative real-time PCR is a very sensitive method to investigate such differences.
Microarray Result Validation
The correlation between microarray and real-time PCR results is often considered to be high (Dallas et al. 2005). However, this correlation may not be as strong when studying closely related species that have not been exposed to chemical or physical stress, in which case among individual variability is high and fold change in transcription is relatively low (e.g., Derome et al. 2006; Harr et al. 2006; Morey et al. 2006). In the present study, real-time PCR results did not confirm microarray data for all candidate genes. In particular, it is noteworthy that the pattern obtained for parvalbumin was more similar to that obtained using microarrays for beta parvalbumin SSPRVB1 (GenBank accession number) rather than that of beta parvalbumin AF538283, which was the candidate gene for this study (Derome et al. 2006). Nevertheless, real-time PCR amplicon sequencing confirmed that beta parvalbumin AF538283 was in fact the gene amplified by our real-time PCR primers. Considering that beta parvalbumin is part of a multigene family (Gerday 1982) and that there were more than one parvalbumin EST spotted on the microarray used by Derome et al. (2006), imperfect hybridization specificity between paralogs and their respective ESTs on the array could explain this discrepancy.
Liver and White Muscle Tissue Gene Expression
Liver isolated trypsin and carboxylesterase showed the clearest patterns of both intra- and intercontinental parallelism in expression divergence between sympatric benthic and limnetic whitefish species. However, the results for these two genes neither confirmed transcriptional convergence among limnetic whitefish species, cisco and vendace, nor matched expectations that could have been derived from phylogenetic considerations. Indeed, the vendace was either more similar to the cisco (trypsin) or completely different from all species (carboxylesterase) while it is phylogenetically more closely related to both European and lake whitefish than to the cisco (Bernatchez et al. 1991). In the case of parvalbumin, the clear difference between samples from different continents, which appears to be independent of species identity, is more suggestive of environmental differences, such as climate or limnological factors, than of genetic differences among species, although this remains hypothetical. It could also be caused by faster degradation of this gene's transcript compared with those of other candidate genes, as all samples from Europe are 1 or 2 years more recent compared with North American samples. Finally, LDH-A expression levels tended to be highly conserved among all species surveyed here despite the fact that the real-time PCR amplicon for this candidate gene was the only one showing clear polymorphism, especially between continents. This suggests that our approach is conservative in cases of unwanted SNPs in the probe region, as we might have committed a type II error in this specific case. Indeed, differences in LDH activity have previously been demonstrated between benthic lake whitefish and cisco (Guderley et al. 1986), which could also have been caused by differential expression of another LDH isoform or different posttranscriptional regulatory processes (reviewed in Schulte 2004). Nevertheless, as it has been suggested by previous microarray studies, the differing metabolic strategies between species are likely to imply many genes in the metabolic pathway (Derome and Bernatchez 2006; Derome et al. 2006; St-Cyr et al. 2008).
The multivariate mixed linear model for the liver and white muscle candidate genes also revealed that sex had a significant effect on the expression of carboxylesterase and trypsin but not of parvalbumin and LDH-A. This is in line with recent results obtained from expression QTL (eQTL) mapping of white muscle and brain transcriptomes, which clearly demonstrated a sex-biased transcriptional genetic architecture in lake whitefish (Derome et al. 2008; Whiteley et al. 2008). Similarly, recent QTL mapping in the threespine stickleback (Gasterosteus aculeatus) revealed that selection on sex-limited expression of genes may have contributed to shape phenotypic evolution in this species complex (Albert et al. 2008). Changes in sex-biased gene expression are increasingly believed to play a major role in adaptive divergence between species but have received little empirical investigation (Ellegren and Parsch 2007). Interestingly, for all species surveyed here, carboxylesterase was consistently overexpressed in males compared with females. This could be associated with differential feeding ecology and/or toxin metabolism between sexes. For instance, significant differences in hepatic metallothionein concentrations were previously observed between lake whitefish males and females (Cooley et al. 2002), supporting differential detoxification capacities between sexes.
Eye Tissue Gene Expression
Results obtained for the two genes extracted from the eye generally confirmed the predicted SWS1 underexpression and LWS overexpression in limnetic fishes compared with benthic whitefish. However, the strongest pattern observed in the multivariate analysis was the clear difference between the vendace and both limnetic and benthic whitefish. This is consistent with the observation that the dwarf (limnetic) lake whitefish is intermediate between normal (benthic) lake whitefish and cisco with reference to the transcription profile of key candidate genes (Derome and Bernatchez 2006). Likewise, our results could be linked to the general observation that the vendace is superior when in competition with its limnetic whitefish counterpart (Bohn and Amundsen 2001) given that in this case, its opsin gene expression pattern appears better fit for a limnetic lifestyle. Indeed, the LWS opsin gene is crucial for vision in the water column, whereas the SWS1 opsin gene quickly becomes useless with increasing depth (Lythgoe 1968, 1984). Of course, environmental variation in opsin gene expression with respect to depth conditions cannot be ruled out, as our data may reflect the depth at which fish from each population were caught. For instance, the Pasvik limnetic population was sampled much closer to the surface compared with both limnetic populations from Switzerland and does not show the expected SWS1 underexpression. In addition, it could be argued that clear parallelism in expression is a lot more difficult to observe in this case because of the reduced data set. Nevertheless, the fact that the limnetic group was represented by more than one lake together with the quality of the available information on opsin gene function, fish ecology and the link between the two allow a relatively thorough interpretation of our data on eye tissue gene expression.
Insights from Transcriptional Adaptation among Coregonine Fishes: Parallel and Convergent Evolution
With respect to the two candidate genes expressed in liver, our study revealed patterns that are in agreement with two different, yet closely related, evolutionary processes: parallelism and convergence. Both are defined as the repeated independent evolution of similar features and provide strong evidence for adaptation through natural selection. However, parallelism implies homologous features among closely related lineages, whereas convergence occurs through changes in different antecedent features among more distantly related lineages (Futuyma 1997). West-Eberhard (2005) proposed the idea that environmental induction can initiate a reorganization of ancestral phenotypes toward the emergence of new phenotypes, which can subsequently undergo natural selection fueled by either standing genetic variation or subsequent mutation and recombination. This hypothesis, which seems parsimonious in cases of recurrent phenotypes, is also consistent with variation in gene expression as a key source of phenotypic novelty (West-Eberhard 2005). Within this framework, the genes implicated in parallel evolution among closely related lineages should be the same because of the shared genetic and developmental constraints (Stern 2000; Schluter et al. 2004; West-Eberhard 2005). Indeed, evidence can be found for such a pattern in stickleback species pairs (Schluter et al. 2004; Colosimo et al. 2005) and experimental populations of Escherichia coli (Woods et al. 2006), among others. As for more distantly related lineages, it has recently been demonstrated that nonadaptive processes could play a key role in the evolution of genetic networks (Lynch 2007a), thus providing genomic architectural features subsequently utilized by adaptive processes as evolutionary resources (Lynch 2007b). Therefore, independently evolved lineages are expected to harbor homologous phenotypic characters that diverge in their genetic evolutionary trajectories (True and Haag 2001). Clearly, the impact of these nonadaptive processes on the discrepancy between convergent phenotypes and their respective genetic basis is expected to increase with time since divergence, which could partly explain the incongruities we observed in this study in patterns of gene expression between limnetic forms of whitefish, cisco and vendace. Of course, there are other possible explanations. For instance, because they have been exploiting the limnetic niche for a longer period of time, cisco and vendace may have reached a better alternative in terms of trypsin and carboxylesterase expression levels. As for both limnetic whitefish species, the observed parallelism could be a by-product of selection on another functionally related gene product. In any case, clear patterns of parallelism such as the ones observed for trypsin and carboxylesterase expression levels are highly inconsistent with any random process.
Conclusion
Our study contributes to a better understanding of population divergence and adaptive radiation in a gene expression framework, which is increasingly considered as being important in evolutionary processes (Schulte 2001; True and Haag 2001; Nuzhdin et al. 2004; Schulte 2004). It also points to the role of natural selection in the adaptive radiation of coregonine fishes. Three major conclusions can be drawn from this study. First, real-time PCR results do not always confirm microarray data, particularly in a situation where the expected fold change between groups is relatively small and the microarray used is not completely specific. This highlights the importance of corroborating microarray results when subsequent research on candidate genes is intended. Second, this study demonstrates that phenotypic adaptation toward the use of the limnetic niche in whitefish species is accompanied by parallelism in candidate gene transcription representing diverse biological functions. Indeed, results obtained for the liver showed parallel divergence in expression for two candidate genes in benthic/limnetic species pairs from three distinct whitefish lineages in their natural environments as well as from a controlled environment. Parallelism in gene expression across this range provides strong support to the hypothesis that divergent natural selection plays an important role in shaping patterns of gene expression involved in the adaptive radiation of whitefish species. Third, the phylogenetic extent of such parallelism appears to be limited to sister species. In fact, the results of this study suggest that the candidate genes selected do not take part in the phenotypic convergence between limnetic whitefish morphs, cisco and vendace. As such, a plausible explanation could be that convergent evolution following nonadaptive shaping of genome architecture in independently evolved lineages has occurred in this group of fish.
Funding
Natural Sciences and Engineering research Council of Canada postgraduate scholarship to J.J.; Natural Sciences and Engineering research Council of Canada Discovery grant to L.B.
We are grateful to N. Derome, G. Côté, K. Giguère, R. C. Lévesque, and J. Lake for their advice and help in the laboratory. We also thank J. Blais and H. Crépeau for their help with statistical analyses. We thank P. Vonlanthen and all Swiss fishermen from lakes Zurich and Lucerne for their valuable contribution to fish sampling. We also thank J. Blais and S. Renaut for their constructive comments on earlier versions of the manuscript.
References
Author notes
Diethard Tautz, Associate Editor


