-
PDF
- Split View
-
Views
-
Cite
Cite
Chen-Hung Kao, Zhao-Bang Zeng, Modeling Epistasis of Quantitative Trait Loci Using Cockerham's Model, Genetics, Volume 160, Issue 3, 1 March 2002, Pages 1243–1261, https://doi.org/10.1093/genetics/160.3.1243
Close -
Share
Abstract
We use the orthogonal contrast scales proposed by Cockerham to construct a genetic model, called Cockerham's model, for studying epistasis between genes. The properties of Cockerham's model in modeling and mapping epistatic genes under linkage equilibrium and disequilibrium are investigated and discussed. Because of its orthogonal property, Cockerham's model has several advantages in partitioning genetic variance into components, interpreting and estimating gene effects, and application to quantitative trait loci (QTL) mapping when compared to other models, and thus it can facilitate the study of epistasis between genes and be readily used in QTL mapping. The issues of QTL mapping with epistasis are also addressed. Real and simulated examples are used to illustrate Cockerham's model, compare different models, and map for epistatic QTL. Finally, we extend Cockerham's model to multiple loci and discuss its applications to QTL mapping.
GENES interact when they express their effects; i.e., the effects of genotypes at one locus depend on what genotypes are present at other loci. Interaction (epistasis) between genes affecting qualitative trait variation has been demonstrated for a long time since Gregor Mendel in 1865. Although the evidence of epistasis between genes controlling quantitative traits [quantitative trait loci (QTL)] has been reported by traditional techniques, such as variance component analyses (Brim and Cockerham 1961; Lee et al. 1968; Stuber and Moll 1971), epistasis between individual QTL generally has been difficult to discern by traditional techniques. The recent advances in molecular biology have allowed fine-scale genetic marker maps of various organisms to be constructed for the study of individual QTL. Using such maps, statistical methods for estimating the positions and effects of individual QTL (QTL mapping) have been proposed (Lander and Botstein 1989; Jansen 1993; Zeng 1994; Kao et al. 1999; Sen and Churchill 2001). The problem of epistasis has been considered in some QTL mapping studies (e.g., Stuber et al. 1992; Cheverud and Routman 1995; Doebley et al. 1995; Cockerham and Zeng 1996; Kao et al. 1999; Goodnight 2000; Zeng et al. 2000), but not sufficiently, and many theoretical and statistical issues involved with epistasis have not been discussed. Here, we discuss a genetic model, called Cockerham's model, in relation to QTL mapping with epistasis. We also investigate the model properties under linkage disequilibrium.
Fisher (1918) first partitioned genetic variance into components corresponding to additive, dominance, and epistatic variances using the least-squares principle. Cockerham (1954) further partitioned the epistatic variance into components using orthogonal contrasts. Kempthone (1957) and Hayman and Mather (1955) adopted the same epistasis model. Hayman and Mather (1955) and Mather (1967) proposed other epistasis models for modeling epistasis. Van Der Veen (1959) reviewed the genetic models of digenic epistasis published by then and summarized them into three categories:
The pure-line-metric or F∞-metric model (F∞ denotes the population derived from selfing F2 individuals for t generations as t → ∞): The parameters in the F∞-metric model show orthogonality with respect to genotypic frequencies of an F∞ population under linkage equilibrium.
The F2-metric model (corresponding to Cockerham's model): The parameters in the F2-metric model are mutually orthogonal with respect to genotypic frequencies of an F2 population under linkage equilibrium.
The mixed-metric model (corresponding to Hayman and Mather's model): The mixed-metric model is a mixture of the Cockerham's model and F∞-metric model, and it can be transformed to the F2-metric model by subtraction of the mean.
In this article, we start from the traditional partition of genetic variance into variance components using Cockerham's (1954) orthogonal contrasts, then lead up to a definition of the genetic parameters for genetic effects, and present Cockerham's epistasis model. The properties of Cockerham's model in modeling and mapping epistatic genes are investigated when genes are in linkage equilibrium and disequilibrium. The differences between Cockerham's model and the other models are compared, and the advantages of Cockerham's model are discussed. It shows that Cockerham's model is a more appropriate model than the other models for the study of epistasis between genes and QTL mapping in the populations, such as F2 and backcross. Real and simulated examples are used to illustrate Cockerham's model, compare different genetic models in the analysis of epistasis between genes, and map for epistatic QTL. Finally, we generalize Cockerham's model to multiple loci and discuss its applications to QTL mapping.
COCKERHAM'S GENETIC MODEL
Cockerham (1954) used eight orthogonal contrast scales to partition the genetic variance contributed by two genes into eight components and to define the genotypic value of a genotype to find the correlation between relatives in a population. His definition of genotypic value using the orthogonal scales leads the way to construct a genetic model, which is called Cockerham's model, for modeling epistasis and defining gene effects in a population. In this section, the orthogonal contrast scales are introduced to present Cockerham's model, and the genetic parameters of Cockerham's model are defined. The similarities and differences between Cockerham's model and alternative models are compared, and their variance component structures are presented.
The eight orthogonal contrast scales (W's) for the F2 population
| Genotype . | AABB . | AABb . | AAbb . | AaBB . | AaBb . | Aabb . | aaBB . | aaBb . | aabb . |
|---|---|---|---|---|---|---|---|---|---|
| G | G22 | G21 | G20 | G12 | G11 | G10 | G02 | G01 | G00 |
| P | 1/16 | ⅛ | 1/16 | ⅛ | ¼ | ⅛ | 1/16 | ⅛ | 1/16 |
| W1 | 1 | 1 | 1 | 0 | 0 | 0 | −1 | −1 | −1 |
| W2 | −½ | −½ | −½ | ½ | ½ | ½ | −½ | −½ | −½ |
| W3 | 1 | 0 | −1 | 1 | 0 | −1 | 1 | 0 | −1 |
| W4 | −½ | ½ | −½ | −½ | ½ | −½ | −½ | ½ | −½ |
| W5 | 1 | 0 | −1 | 0 | 0 | 0 | −1 | 0 | 1 |
| W6 | −½ | ½ | −½ | 0 | 0 | 0 | ½ | −½ | ½ |
| W7 | −½ | 0 | ½ | ½ | 0 | −½ | −½ | 0 | ½ |
| W8 | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ |
| Genotype . | AABB . | AABb . | AAbb . | AaBB . | AaBb . | Aabb . | aaBB . | aaBb . | aabb . |
|---|---|---|---|---|---|---|---|---|---|
| G | G22 | G21 | G20 | G12 | G11 | G10 | G02 | G01 | G00 |
| P | 1/16 | ⅛ | 1/16 | ⅛ | ¼ | ⅛ | 1/16 | ⅛ | 1/16 |
| W1 | 1 | 1 | 1 | 0 | 0 | 0 | −1 | −1 | −1 |
| W2 | −½ | −½ | −½ | ½ | ½ | ½ | −½ | −½ | −½ |
| W3 | 1 | 0 | −1 | 1 | 0 | −1 | 1 | 0 | −1 |
| W4 | −½ | ½ | −½ | −½ | ½ | −½ | −½ | ½ | −½ |
| W5 | 1 | 0 | −1 | 0 | 0 | 0 | −1 | 0 | 1 |
| W6 | −½ | ½ | −½ | 0 | 0 | 0 | ½ | −½ | ½ |
| W7 | −½ | 0 | ½ | ½ | 0 | −½ | −½ | 0 | ½ |
| W8 | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ |
G's and P's denote the genotypic values and expected genotypic frequencies for the nine genotypes of two unlinked genes, A and B.
The eight orthogonal contrast scales (W's) for the F2 population
| Genotype . | AABB . | AABb . | AAbb . | AaBB . | AaBb . | Aabb . | aaBB . | aaBb . | aabb . |
|---|---|---|---|---|---|---|---|---|---|
| G | G22 | G21 | G20 | G12 | G11 | G10 | G02 | G01 | G00 |
| P | 1/16 | ⅛ | 1/16 | ⅛ | ¼ | ⅛ | 1/16 | ⅛ | 1/16 |
| W1 | 1 | 1 | 1 | 0 | 0 | 0 | −1 | −1 | −1 |
| W2 | −½ | −½ | −½ | ½ | ½ | ½ | −½ | −½ | −½ |
| W3 | 1 | 0 | −1 | 1 | 0 | −1 | 1 | 0 | −1 |
| W4 | −½ | ½ | −½ | −½ | ½ | −½ | −½ | ½ | −½ |
| W5 | 1 | 0 | −1 | 0 | 0 | 0 | −1 | 0 | 1 |
| W6 | −½ | ½ | −½ | 0 | 0 | 0 | ½ | −½ | ½ |
| W7 | −½ | 0 | ½ | ½ | 0 | −½ | −½ | 0 | ½ |
| W8 | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ |
| Genotype . | AABB . | AABb . | AAbb . | AaBB . | AaBb . | Aabb . | aaBB . | aaBb . | aabb . |
|---|---|---|---|---|---|---|---|---|---|
| G | G22 | G21 | G20 | G12 | G11 | G10 | G02 | G01 | G00 |
| P | 1/16 | ⅛ | 1/16 | ⅛ | ¼ | ⅛ | 1/16 | ⅛ | 1/16 |
| W1 | 1 | 1 | 1 | 0 | 0 | 0 | −1 | −1 | −1 |
| W2 | −½ | −½ | −½ | ½ | ½ | ½ | −½ | −½ | −½ |
| W3 | 1 | 0 | −1 | 1 | 0 | −1 | 1 | 0 | −1 |
| W4 | −½ | ½ | −½ | −½ | ½ | −½ | −½ | ½ | −½ |
| W5 | 1 | 0 | −1 | 0 | 0 | 0 | −1 | 0 | 1 |
| W6 | −½ | ½ | −½ | 0 | 0 | 0 | ½ | −½ | ½ |
| W7 | −½ | 0 | ½ | ½ | 0 | −½ | −½ | 0 | ½ |
| W8 | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ | −¼ | ¼ |
G's and P's denote the genotypic values and expected genotypic frequencies for the nine genotypes of two unlinked genes, A and B.
Definition of genetic parameters
| Solution . | Parameter definition . | Notation . |
|---|---|---|
| E0 | Mean | μ |
| E1 | Additive effect of locus A | a1 |
| E2 | Dominance effect of locus A | d1 |
| E3 | Additive effect of locus B | a2 |
| E4 | Dominance effect of locus B | d2 |
| E5 | Additive × additive effect of loci A and B | iaa |
| E6 | Additive × dominance effect of loci A and B | iad |
| E7 | Dominance × additive effect of loci A and B | ida |
| E8 | Dominance × dominance effect of loci A and B | idd |
| Solution . | Parameter definition . | Notation . |
|---|---|---|
| E0 | Mean | μ |
| E1 | Additive effect of locus A | a1 |
| E2 | Dominance effect of locus A | d1 |
| E3 | Additive effect of locus B | a2 |
| E4 | Dominance effect of locus B | d2 |
| E5 | Additive × additive effect of loci A and B | iaa |
| E6 | Additive × dominance effect of loci A and B | iad |
| E7 | Dominance × additive effect of loci A and B | ida |
| E8 | Dominance × dominance effect of loci A and B | idd |
Definition of genetic parameters
| Solution . | Parameter definition . | Notation . |
|---|---|---|
| E0 | Mean | μ |
| E1 | Additive effect of locus A | a1 |
| E2 | Dominance effect of locus A | d1 |
| E3 | Additive effect of locus B | a2 |
| E4 | Dominance effect of locus B | d2 |
| E5 | Additive × additive effect of loci A and B | iaa |
| E6 | Additive × dominance effect of loci A and B | iad |
| E7 | Dominance × additive effect of loci A and B | ida |
| E8 | Dominance × dominance effect of loci A and B | idd |
| Solution . | Parameter definition . | Notation . |
|---|---|---|
| E0 | Mean | μ |
| E1 | Additive effect of locus A | a1 |
| E2 | Dominance effect of locus A | d1 |
| E3 | Additive effect of locus B | a2 |
| E4 | Dominance effect of locus B | d2 |
| E5 | Additive × additive effect of loci A and B | iaa |
| E6 | Additive × dominance effect of loci A and B | iad |
| E7 | Dominance × additive effect of loci A and B | ida |
| E8 | Dominance × dominance effect of loci A and B | idd |
can also be represented in a different form as Table 3. Note that the marginal means of the three genotypes, , , and , for locus A are μ + a1 − d1/2, μ + d1/2, and μ − a1 − d1/2, respectively, as the segregation ratio is 1:2:1. There are similar forms for locus B. The grand mean is equivalent to μ.
Linkage disequilibrium:The coded variables in Cockerham's
model (the scales in Table 1) are orthogonal and contrast to each other when the ratio of genotypic frequencies is 1:2:1:2:4:2:1:2:1 (genes are unlinked) in an F2 population. Therefore, the definition of the genetic parameters in Table 2 is appropriate for interpreting the gene effects and the genetic variance can be partitioned (Equation 12) as if genes are unlinked. If there is segregation distortion and/or linkage, the ratio will deviate from 1:2:1:2:4:2:1:2:1 (Table 6) and there will be covariances between some genetic effects (Equation 34). To take linkage disequilibrium into account in using Cockerham's model, we introduce statistical parameters to contrast with genetic parameters in interpreting gene effects when genes are in linkage disequilibrium (see next section).
The equation form for the mixed-metric model, which is a mixture of Cockerham's model and the F∞-metric model with the first part of marginal effects from the F∞-metric model and the latter part of epistatic effects from Cockerham's model, is trivial (not shown), and it is tabulated in Table 5. The coded variables of the mixed-metric model are orthogonal, but not contrasts. Except
for μ, the solutions of the genetic parameters of the mixed-metric model are the same as those of Cockerham's model. The solution of μ is not equal to . By subtracting d1/2 + d2/2, the mixed-metric model will become Cockerham's model. In Table 5, the marginal means of one locus involve the dominance effect of another locus, which deviates from the one-locus analysis. For example, the marginal mean of genotype AA, , is μ + a1 + d2/2. Except for μ, the solutions of the genetic parameters of the mixed-metric model are the same as those of Cockerham's model.
MODELING QUANTITATIVE TRAITS
Genotypic frequencies (P's) in terms of allele frequencies (p's) and the linkage disequilibrium coefficient (D)

Genotypic frequencies (P's) in terms of allele frequencies (p's) and the linkage disequilibrium coefficient (D)

However, when genes are linked, is not a measure of the difference between the two homozygote means as the ratio is no longer 1:2:1:2:4:2:1:2:1. Likewise, the LSE of the genetic parameters are appropriate estimates of the nine gene effects when genes are unlinked, but they are not appropriate estimates when genes are linked. To remedy this problem, statistical parameters of gene effects are introduced for interpretation in contrast to genetic parameters of gene effects.
Given genotypic values G's, the quantities, β's, will have different values according to different strengths of linkage (ratios of genotypic frequencies). On the contrary, the genetic parameters, E's, will not change according to different strengths of linkage. Therefore, we define β's as statistical parameters to contrast with the genetic parameters of gene effects. The genetic parameters can be obtained directly from Cockerham's model, but the statistical parameters cannot. However, there exists a one-to-one relationship between the two kinds of parameters as shown below. It allows that once the genetic parameters are obtained from the model the statistical parameters can be obtained by transformation.
QTL MAPPING USING COCKERHAM'S MODEL
In this section, we apply Cockerham's model to construct a statistical epistasis model to map for epistatic QTL and analyze epistasis between QTL. The problems when epistasis is present and ignored in QTL mapping are also investigated. By taking epistasis into account in QTL mapping, the accuracy of estimation and power of detection can be improved.
When epistasis is present and ignored in QTL mapping, the genetic variance contributed by epistasis is not controlled in the model and becomes a part of the genetic residual. Thus, the sampling variances of the effects are inflated, and the power of detecting QTL decreases. If epistasis is taken into account, the epistatic variance can be controlled, and the power will increase. The increase in power depends on the size of the epistatic effect. The larger the epistatic effect that can be controlled in mapping, the larger the increase in power that can be gained. In conclusion, by taking epistasis into account in QTL mapping, the chance of finding more QTL and the accuracy of estimating QTL positions and effects can be improved.
ADVANTAGES OF COCKERHAM'S MODEL
Cockerham's model has several advantages in the study of epistasis as compared to the F∞-metric and mixed-metric models. When genes are in linkage equilibrium, the advantages include the following:
The genetic variance can be partitioned into eight independent components (Equation 12), and there is no genetic covariance. Each component is contributed by its corresponding genetic parameter. This is a desirable property in modeling. On the contrary, the F∞-metric does not have such a property (Equation 18).
The marginal means of one locus do not involve the parameters of another locus and the epistasis parameters, which would make Cockerham's model readily interpretable (Table 3). The marginal means of locus A are (a1 − d1/2), d1/2, and (−a1 − d1/2), which correspond to the one-locus analysis (differing by a constant d1/2) despite epistasis. For the F∞-metric model, the marginal means of locus A are (a1 + d2/2 + iad/2), (d1 + d2/2 + idd/2), and (−a1 + d2/2 − iad/2), which are confounded by the genetic parameter of dominance effect of locus B (d2) and their epistasis parameters, iad and idd. In the mixed-metric model, the marginal means of locus A are a1 + d2/2, d1 + d2/2, and −a1 + d2/2, which are confounded by the genetic parameters of dominance effect of locus B (d2). Both the F∞-metric and mixed-metric models do not follow the definition in the one-locus analysis.
The difference between the two homozygote means, (G2. − G0.)/2[(G.2 − G.0)/2], estimates the genetic parameter a1 (a2) of locus A (B), and the departure of the heterozygote mean to the midpoint between the two homozygote means, (2G1. − G2. − G0.)/2[(2G.1 − G.2 − G.0)/2], estimates the genetic parameter d1 (d2) of locus A (B). They follow the same definition of additive and dominance effects in the one-locus analysis. In the F∞-metric model, they estimate a1 + iad/2 (a2 + ida/2) and d1 + idd/2 (d2 + idd/2) and violate the definition in the one-locus analysis.
With the orthogonal property, the estimation of one genetic (marginal or epistatic) effect will not be affected by the presence or absence of other genetic effects in the model. Essentially, when epistasis is
present and ignored, the estimation of the marginal effects and the location of epistatic QTL is still asymptotically unbiased and not affected by epistasis. This advantage ensures that QTL mapping can be first performed without taking epistasis into account without causing a problem under Cockerham's model. The F∞-metric model does not have such property (see qtl mapping using cockerham's model).
The means of trait LBIL in the F2 population from Doebley et al. (1995)
| . | QA . | . | ||
|---|---|---|---|---|
| AA | Aa | aa | Mean | |
| QB | 101.6a | 83.62 | 47.80 | 77.21 |
| BB | 8b | 20 | 11 | 39 |
| 66.50 | 47.55 | 54.57 | 54.19 | |
| Bb | 22 | 42 | 21 | 85 |
| 61.11 | 40.94 | 17.98 | 36.37 | |
| bb | 3 | 24 | 10 | 37 |
| 74.52 | 54.09 | 44.08 | 55.67 | |
| Mean | 33 | 86 | 42 | 161 |
| . | QA . | . | ||
|---|---|---|---|---|
| AA | Aa | aa | Mean | |
| QB | 101.6a | 83.62 | 47.80 | 77.21 |
| BB | 8b | 20 | 11 | 39 |
| 66.50 | 47.55 | 54.57 | 54.19 | |
| Bb | 22 | 42 | 21 | 85 |
| 61.11 | 40.94 | 17.98 | 36.37 | |
| bb | 3 | 24 | 10 | 37 |
| 74.52 | 54.09 | 44.08 | 55.67 | |
| Mean | 33 | 86 | 42 | 161 |
LBIL, average length of vegetative internodes in the primary lateral branch. QA and QB represent unlinked genes UMC107 and BV302, respectively.
Trait mean.
Sample size.
The means of trait LBIL in the F2 population from Doebley et al. (1995)
| . | QA . | . | ||
|---|---|---|---|---|
| AA | Aa | aa | Mean | |
| QB | 101.6a | 83.62 | 47.80 | 77.21 |
| BB | 8b | 20 | 11 | 39 |
| 66.50 | 47.55 | 54.57 | 54.19 | |
| Bb | 22 | 42 | 21 | 85 |
| 61.11 | 40.94 | 17.98 | 36.37 | |
| bb | 3 | 24 | 10 | 37 |
| 74.52 | 54.09 | 44.08 | 55.67 | |
| Mean | 33 | 86 | 42 | 161 |
| . | QA . | . | ||
|---|---|---|---|---|
| AA | Aa | aa | Mean | |
| QB | 101.6a | 83.62 | 47.80 | 77.21 |
| BB | 8b | 20 | 11 | 39 |
| 66.50 | 47.55 | 54.57 | 54.19 | |
| Bb | 22 | 42 | 21 | 85 |
| 61.11 | 40.94 | 17.98 | 36.37 | |
| bb | 3 | 24 | 10 | 37 |
| 74.52 | 54.09 | 44.08 | 55.67 | |
| Mean | 33 | 86 | 42 | 161 |
LBIL, average length of vegetative internodes in the primary lateral branch. QA and QB represent unlinked genes UMC107 and BV302, respectively.
Trait mean.
Sample size.
EXAMPLES
In this section, real and simulated data were used to illustrate Cockerham's model, compare the differences between Cockerham's model and other models, verify the properties in statistical estimation, and map for epistatic QTL.
Real data: Doebley et al. (1995) crossed two corn inbred lines, Teosinte-M1L × Teosinte-M3L, to generate 183 F2 progeny, and they concluded that two unlinked markers UMC107 (QA) and BV302 (QB) are the candidate QTL for trait LBIL (average length of vegetative internodes in the primary lateral branch) in QTL analysis. Among the 183 progeny, 21 individuals have a missing trait and one individual has a missing genotype. Therefore, only the 161 individuals with complete trait and genotype information were used in the analysis. The observed allele frequencies are , , , and . The genotypic frequencies are 0.050, 0.137, 0.019, 0.124, 0.261, 0.149, 0.068, 0.130, and 0.062 for genotypes AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, and aabb, respectively, which significantly deviate from the expected frequencies for two unlinked genes. The small sample size of AAbb in 3 individuals is responsible for the deviation.
The observed genotypic means () and sample sizes (nij's) of the data are listed in Table 7. If all eight genetic parameters (full model) are considered, the estimated genetic parameters by Cockerham's model, the F∞-metric model, and the mixed-metric model are listed in Table 9. In Table 9, except for μ, the estimates of the eight genetic parameters by Cockerham's model and the mixed-metric model are the same. Cockerham's model and the F∞-metric model have different estimates of marginal effects, but the same estimates of epistatic effects (see Cockerham's genetic model for the reasons). The estimates of a1 and d1 are 15.11 (P value 0.0008) and −3.92 (P value 0.5035), respectively, for Cockerham's model, and they are 24.25 (P value 0.0008) and 5.15 (P value 0.5617), respectively, for the F∞-metric model. The estimates of a2 and d2 are 19.46 (P value 0.0001) and −5.66 (P value 0.3336), respectively, for Cockerham's model, and they are 17.59 (P value 0.0001) and 3.40 (P value 0.3336), respectively, for the F∞-metric model. Very likely, the marginal effects of QA and QB are mostly additive, and their dominance effects are not significant. The estimate of iaa is 2.68 (P value 0.7054). Analytically, it means that the additive effects of QB (QA) in the background of AA (BB) and aa (bb), which are and , differ by 2.68, and this difference is not statistically significant at the 5% level (Figure 1a). The estimate of iad is −18.28 (P value 0.0411). Analytically, it means that the dominance effects of QB in the background of AA and aa, which are and , are significantly different at the 5% level. The significance of additive-by-dominance interaction can be illustrated by Figure 1b. In Figure 1b, the cross between the two lines tells that genotype Bb performs better than BB in the background of aa, but it does worse in the background of AA. The estimate of ida, 3.75, is not significant (P value 0.6725) as illustrated by the three nearly parallel lines in Figure 1c. The estimate of idd, −18.13, is not statistically significant at the 5% level (P value 0.1227), although it shows that there is a cross between lines in Figure 1d. The proportion of the genetic variance in the total variance (model R2) is 23.66% (Table 8).
The estimates of the statistical parameters are , , , , , , , , and β8 = −38.88 following the definitions, or they can be obtained by using Equations A1, A2, A3, A4, A5, A6, A7, A8 and A9 by plugging in observed genotypic frequencies in Table 7 and the nine estimated genetic parameters in Table 9. Although the values of the statistical and genetic parameters are expected to be very close for unlinked genes, they are very different based on this data set. The difference occurs because the observed segregation ratio deviates from the expected segregation ratio.
If only the significant effects, a1, a2, and iad (reduced model), are considered for Cockerham's model, the estimates of a1, a2, and iad are 15.27 (SD 4.13, P value
0.0003), 19.13 (SD 4.04, P value < 0.0001), and −18.44 (SD 8.27, P value <0.0271), respectively, which are very close to the estimates in the full model. This shows that the estimation of one genetic parameter in Cockerham's model will not be affected by the presence or absence of other genetic parameters due to its orthogonal property. However, the F∞-metric model does not have such a property. If the reduced model is considered for the F∞-metric model, the estimates of a1, a2, and iad become 24.48 (SD 6.28, P value 0.0001), 19.12 (SD 4.04, P value < 0.0001), and −18.44 (SD 8.27, P value 0.0271), respectively. The estimate of a2 changes from 17.59 in the full model to 19.12 in the reduced model due to the confounding of ida/2 = 3.75/2 (estimated in the full model) by Equation 46. Both models have the same model R2 = 0.2121.
If only the significant effects are taken into account in calculating the variance components, the additive effect of QA, a1, contributes ~34.05% to the total genetic variance (Equation 12), the additive effect of QB, a2, contributes ~52.04% to the total genetic variance, and the epistatic effect, iad, contributes ~13.90% to the total genetic variance under Cockerham's model. There is no genetic covariance between effects for unlinked loci. The mixed-metric model has the same genetic variance structure as Cockerham's model. The genetic variance and covariance components under the F∞-metric model can be obtained using Equation 18.
Simulation: Assume that a quantitative trait is affected by two unlinked epistatic QTL. The first QTL, QA, is located at 52 cM on the first chromosome, and the second QTL, QB, is located at 93 cM on the second chromosome. There are 11 15-cM equally spaced markers on each chromosome. The additive and dominance effects
| Source . | d.f. . | Sum of square . | Mean square . | F value . | P value . |
|---|---|---|---|---|---|
| QA | 2 | 16995.08 | 8497.54 | 6.89 | 0.0014 |
| QB | 2 | 19227.48 | 9613.74 | 7.80 | 0.0006 |
| QA × QB | 4 | 10921.70 | 2730.42 | 2.21 | 0.0701 |
| Error | 152 | 187440.72 | 1233.16 | ||
| Total | 160 | 245527.72 |
| Source . | d.f. . | Sum of square . | Mean square . | F value . | P value . |
|---|---|---|---|---|---|
| QA | 2 | 16995.08 | 8497.54 | 6.89 | 0.0014 |
| QB | 2 | 19227.48 | 9613.74 | 7.80 | 0.0006 |
| QA × QB | 4 | 10921.70 | 2730.42 | 2.21 | 0.0701 |
| Error | 152 | 187440.72 | 1233.16 | ||
| Total | 160 | 245527.72 |
R-square is 0.2366.
| Source . | d.f. . | Sum of square . | Mean square . | F value . | P value . |
|---|---|---|---|---|---|
| QA | 2 | 16995.08 | 8497.54 | 6.89 | 0.0014 |
| QB | 2 | 19227.48 | 9613.74 | 7.80 | 0.0006 |
| QA × QB | 4 | 10921.70 | 2730.42 | 2.21 | 0.0701 |
| Error | 152 | 187440.72 | 1233.16 | ||
| Total | 160 | 245527.72 |
| Source . | d.f. . | Sum of square . | Mean square . | F value . | P value . |
|---|---|---|---|---|---|
| QA | 2 | 16995.08 | 8497.54 | 6.89 | 0.0014 |
| QB | 2 | 19227.48 | 9613.74 | 7.80 | 0.0006 |
| QA × QB | 4 | 10921.70 | 2730.42 | 2.21 | 0.0701 |
| Error | 152 | 187440.72 | 1233.16 | ||
| Total | 160 | 245527.72 |
R-square is 0.2366.
| . | Parameter estimate . | Standard error . | Test for parameter = 0 . | P value . | ||||
|---|---|---|---|---|---|---|---|---|
| . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . |
| μ . | 56.87 . | 57.13 . | 2.92 . | 7.07 . | 19.48 . | 8.08 . | 0.0001 . | 0.0001 . |
| a1 | 15.11 | 24.25 | 4.47 | 7.07 | 3.41 | 3.30 | 0.0008 | 0.0008 |
| d1 | −3.92 | 5.15 | 5.84 | 8.05 | −0.67 | 0.58 | 0.5035 | 0.5617 |
| a2 | 19.46 | 17.59 | 4.42 | 7.07 | 4.40 | 2.49 | 0.0001 | 0.0140 |
| d2 | −5.66 | 3.40 | 5.84 | 8.87 | −0.97 | 0.38 | 0.3336 | 0.7022 |
| iaa | 2.68 | 2.68 | 7.07 | 7.07 | 0.38 | 0.38 | 0.7054 | 0.7054 |
| iad | −18.28 | −18.28 | 8.87 | 8.87 | −2.06 | −2.06 | 0.0411 | 0.0411 |
| ida | 3.75 | 3.75 | 8.85 | 8.85 | 0.42 | 0.42 | 0.6725 | 0.6725 |
| idd | −18.13 | 18.13 | 11.68 | 11.68 | −1.55 | −1.55 | 0.1227 | 0.1227 |
| . | Parameter estimate . | Standard error . | Test for parameter = 0 . | P value . | ||||
|---|---|---|---|---|---|---|---|---|
| . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . |
| μ . | 56.87 . | 57.13 . | 2.92 . | 7.07 . | 19.48 . | 8.08 . | 0.0001 . | 0.0001 . |
| a1 | 15.11 | 24.25 | 4.47 | 7.07 | 3.41 | 3.30 | 0.0008 | 0.0008 |
| d1 | −3.92 | 5.15 | 5.84 | 8.05 | −0.67 | 0.58 | 0.5035 | 0.5617 |
| a2 | 19.46 | 17.59 | 4.42 | 7.07 | 4.40 | 2.49 | 0.0001 | 0.0140 |
| d2 | −5.66 | 3.40 | 5.84 | 8.87 | −0.97 | 0.38 | 0.3336 | 0.7022 |
| iaa | 2.68 | 2.68 | 7.07 | 7.07 | 0.38 | 0.38 | 0.7054 | 0.7054 |
| iad | −18.28 | −18.28 | 8.87 | 8.87 | −2.06 | −2.06 | 0.0411 | 0.0411 |
| ida | 3.75 | 3.75 | 8.85 | 8.85 | 0.42 | 0.42 | 0.6725 | 0.6725 |
| idd | −18.13 | 18.13 | 11.68 | 11.68 | −1.55 | −1.55 | 0.1227 | 0.1227 |
Except for μ, the mixed-metric model gives the same estimates of the parameters a1, d1, a2, d2, iaa, iad, ida, and idd as those of Cockerham's model. The estimate of μ by the mixed-metric model is 61.66 with SD 5.79.
| . | Parameter estimate . | Standard error . | Test for parameter = 0 . | P value . | ||||
|---|---|---|---|---|---|---|---|---|
| . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . |
| μ . | 56.87 . | 57.13 . | 2.92 . | 7.07 . | 19.48 . | 8.08 . | 0.0001 . | 0.0001 . |
| a1 | 15.11 | 24.25 | 4.47 | 7.07 | 3.41 | 3.30 | 0.0008 | 0.0008 |
| d1 | −3.92 | 5.15 | 5.84 | 8.05 | −0.67 | 0.58 | 0.5035 | 0.5617 |
| a2 | 19.46 | 17.59 | 4.42 | 7.07 | 4.40 | 2.49 | 0.0001 | 0.0140 |
| d2 | −5.66 | 3.40 | 5.84 | 8.87 | −0.97 | 0.38 | 0.3336 | 0.7022 |
| iaa | 2.68 | 2.68 | 7.07 | 7.07 | 0.38 | 0.38 | 0.7054 | 0.7054 |
| iad | −18.28 | −18.28 | 8.87 | 8.87 | −2.06 | −2.06 | 0.0411 | 0.0411 |
| ida | 3.75 | 3.75 | 8.85 | 8.85 | 0.42 | 0.42 | 0.6725 | 0.6725 |
| idd | −18.13 | 18.13 | 11.68 | 11.68 | −1.55 | −1.55 | 0.1227 | 0.1227 |
| . | Parameter estimate . | Standard error . | Test for parameter = 0 . | P value . | ||||
|---|---|---|---|---|---|---|---|---|
| . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . | Cockerham . | F∞ . |
| μ . | 56.87 . | 57.13 . | 2.92 . | 7.07 . | 19.48 . | 8.08 . | 0.0001 . | 0.0001 . |
| a1 | 15.11 | 24.25 | 4.47 | 7.07 | 3.41 | 3.30 | 0.0008 | 0.0008 |
| d1 | −3.92 | 5.15 | 5.84 | 8.05 | −0.67 | 0.58 | 0.5035 | 0.5617 |
| a2 | 19.46 | 17.59 | 4.42 | 7.07 | 4.40 | 2.49 | 0.0001 | 0.0140 |
| d2 | −5.66 | 3.40 | 5.84 | 8.87 | −0.97 | 0.38 | 0.3336 | 0.7022 |
| iaa | 2.68 | 2.68 | 7.07 | 7.07 | 0.38 | 0.38 | 0.7054 | 0.7054 |
| iad | −18.28 | −18.28 | 8.87 | 8.87 | −2.06 | −2.06 | 0.0411 | 0.0411 |
| ida | 3.75 | 3.75 | 8.85 | 8.85 | 0.42 | 0.42 | 0.6725 | 0.6725 |
| idd | −18.13 | 18.13 | 11.68 | 11.68 | −1.55 | −1.55 | 0.1227 | 0.1227 |
Except for μ, the mixed-metric model gives the same estimates of the parameters a1, d1, a2, d2, iaa, iad, ida, and idd as those of Cockerham's model. The estimate of μ by the mixed-metric model is 61.66 with SD 5.79.
of QA are a1 = 3 and d1 = 1. QB has additive effect a2 = 1 and no dominance effect. The additive-by-additive epistatic effect is iaa = 2, and the other three epistatic effects are assumed to be zero. With these parameter settings, the marginal effects of QA and QB contribute 76 and 8% to the total genetic variance, and epistasis contributes 16% to the total genetic variance. The heritability of the quantitative trait is assumed to be 0.2, or equivalently the environmental variance is 25. The sample size is 200, and the number of simulated replicates is 500. When using the statistical model in Equation 35 for QTL mapping, a stepwise selection procedure (Kao et al. 1999) was adopted to detect QTL and analyze epistasis, and the critical value for claiming significance was chosen as , where k is the number of parameters in testing.
The simulation results are shown in Table 10. When epistasis is ignored in QTL mapping, the powers to detect QA and QB are 1.0 and 0.238 (; ), respectively. The mean estimates of positions of QA and QB are 51.25 with standard deviation (SD) 7.73 and 89.63 with SD 24.19, respectively. The means of the estimated additive and dominance effects of QA are 2.9941 (SD 0.5969) and 0.9816 (SD 0.9018). The mean estimate of the additive effect of QB (from significant replicates) is 1.8567 (SD 0.4196), which is poorly estimated. If the mean of the estimated effect of QB is calculated on the basis of all 500 replicates, it is 1.1214 (SD 0.7956), which is much closer to the true value. This corresponds to the theoretical proof of asymptotical unbiasedness for marginal effect in estimation if epistasis is present and ignored under Cockerham's
Simulation results of mapping epistatic QTL in the F2 population
| . | Without epistasisa . | With epistasisb . | ||
|---|---|---|---|---|
| . | QA . | QB . | QA . | QB . |
| Power | 1.000 | 0.238 | 1.000 | 0.500 |
| Position | 51.25 (7.73) | 89.63 (24.19) | 50.99 (7.95) | 90.46 (18.67) |
| 0.0125 (0.3701) | 0.0091 (0.3635) | |||
| a | 2.9941 (0.5696) | 1.8567 (0.4196) | 2.9658 (0.5700) | 1.3314 (0.6368) |
| d | 0.9816 (0.9018) | 1.0024 (0.8697) | ||
| iaa | 1.9897 (0.9948) | |||
| σ2 | 24.26 (3.29) | 24.02 (3.50) | ||
| h2 | 0.2158 (0.0445) | 0.2257 (0.0494) | ||
| . | Without epistasisa . | With epistasisb . | ||
|---|---|---|---|---|
| . | QA . | QB . | QA . | QB . |
| Power | 1.000 | 0.238 | 1.000 | 0.500 |
| Position | 51.25 (7.73) | 89.63 (24.19) | 50.99 (7.95) | 90.46 (18.67) |
| 0.0125 (0.3701) | 0.0091 (0.3635) | |||
| a | 2.9941 (0.5696) | 1.8567 (0.4196) | 2.9658 (0.5700) | 1.3314 (0.6368) |
| d | 0.9816 (0.9018) | 1.0024 (0.8697) | ||
| iaa | 1.9897 (0.9948) | |||
| σ2 | 24.26 (3.29) | 24.02 (3.50) | ||
| h2 | 0.2158 (0.0445) | 0.2257 (0.0494) | ||
Numbers in parentheses are standard deviations. The critical values for claiming significance are χ21,0.05/20 = 9.14 and χ22,0.05/20 = 11.98. Two unlinked QTL, QA and QB, are simulated on the chromosomes with 11 15-cM equally spaced markers. QA is placed at 52 cM with additive effect 3 and dominance effect 1. QB is placed at 93 cM with additive effect 1 (a1 = 3, d1 = 1, a2 = 1; VA/VG = 76%, VB/VG = 16%). The mean is 0, and the additive-by-additive epistatic effect is 2 (μ = 0, iaa = 2; Vi/VG = 16%). The heritability is 0.2 (h2 = 0.2, σ2 = 25). Th sample size is 200 and the simulated replicates are 500.
QTL mapping with epistasis ignored.
QTL mapping with epistasis taken into account.
Simulation results of mapping epistatic QTL in the F2 population
| . | Without epistasisa . | With epistasisb . | ||
|---|---|---|---|---|
| . | QA . | QB . | QA . | QB . |
| Power | 1.000 | 0.238 | 1.000 | 0.500 |
| Position | 51.25 (7.73) | 89.63 (24.19) | 50.99 (7.95) | 90.46 (18.67) |
| 0.0125 (0.3701) | 0.0091 (0.3635) | |||
| a | 2.9941 (0.5696) | 1.8567 (0.4196) | 2.9658 (0.5700) | 1.3314 (0.6368) |
| d | 0.9816 (0.9018) | 1.0024 (0.8697) | ||
| iaa | 1.9897 (0.9948) | |||
| σ2 | 24.26 (3.29) | 24.02 (3.50) | ||
| h2 | 0.2158 (0.0445) | 0.2257 (0.0494) | ||
| . | Without epistasisa . | With epistasisb . | ||
|---|---|---|---|---|
| . | QA . | QB . | QA . | QB . |
| Power | 1.000 | 0.238 | 1.000 | 0.500 |
| Position | 51.25 (7.73) | 89.63 (24.19) | 50.99 (7.95) | 90.46 (18.67) |
| 0.0125 (0.3701) | 0.0091 (0.3635) | |||
| a | 2.9941 (0.5696) | 1.8567 (0.4196) | 2.9658 (0.5700) | 1.3314 (0.6368) |
| d | 0.9816 (0.9018) | 1.0024 (0.8697) | ||
| iaa | 1.9897 (0.9948) | |||
| σ2 | 24.26 (3.29) | 24.02 (3.50) | ||
| h2 | 0.2158 (0.0445) | 0.2257 (0.0494) | ||
Numbers in parentheses are standard deviations. The critical values for claiming significance are χ21,0.05/20 = 9.14 and χ22,0.05/20 = 11.98. Two unlinked QTL, QA and QB, are simulated on the chromosomes with 11 15-cM equally spaced markers. QA is placed at 52 cM with additive effect 3 and dominance effect 1. QB is placed at 93 cM with additive effect 1 (a1 = 3, d1 = 1, a2 = 1; VA/VG = 76%, VB/VG = 16%). The mean is 0, and the additive-by-additive epistatic effect is 2 (μ = 0, iaa = 2; Vi/VG = 16%). The heritability is 0.2 (h2 = 0.2, σ2 = 25). Th sample size is 200 and the simulated replicates are 500.
QTL mapping with epistasis ignored.
QTL mapping with epistasis taken into account.
The three orthogonal contrast scales (W's) for the backcross population
| Genotype . | AABB . | AABb . | AaBB . | AaBb . |
|---|---|---|---|---|
| G | G22 | G21 | G12 | G11 |
| P | ¼ | ¼ | ¼ | ¼ |
| W1 | ½ | ½ | −½ | −½ |
| W2 | ½ | −½ | ½ | −½ |
| W3 | ¼ | −¼ | −¼ | ¼ |
| Genotype . | AABB . | AABb . | AaBB . | AaBb . |
|---|---|---|---|---|
| G | G22 | G21 | G12 | G11 |
| P | ¼ | ¼ | ¼ | ¼ |
| W1 | ½ | ½ | −½ | −½ |
| W2 | ½ | −½ | ½ | −½ |
| W3 | ¼ | −¼ | −¼ | ¼ |
G's and P's denote the genotypic values and expected genotypic frequencies for the four genotypes of two unlinked genes, A and B.
The three orthogonal contrast scales (W's) for the backcross population
| Genotype . | AABB . | AABb . | AaBB . | AaBb . |
|---|---|---|---|---|
| G | G22 | G21 | G12 | G11 |
| P | ¼ | ¼ | ¼ | ¼ |
| W1 | ½ | ½ | −½ | −½ |
| W2 | ½ | −½ | ½ | −½ |
| W3 | ¼ | −¼ | −¼ | ¼ |
| Genotype . | AABB . | AABb . | AaBB . | AaBb . |
|---|---|---|---|---|
| G | G22 | G21 | G12 | G11 |
| P | ¼ | ¼ | ¼ | ¼ |
| W1 | ½ | ½ | −½ | −½ |
| W2 | ½ | −½ | ½ | −½ |
| W3 | ¼ | −¼ | −¼ | ¼ |
G's and P's denote the genotypic values and expected genotypic frequencies for the four genotypes of two unlinked genes, A and B.
model (Equation 42). The estimates of environmental variance and heritability are 24.26 (SD 3.29) and 0.2158 (SD 0.0445), respectively. If epistasis is considered, the powers to detect QA and QB are 1.0 and 0.5, respectively. The power of detecting QB improves from 0.238 without epistasis to 0.5 with epistasis. The mean estimates of positions of QA and QB are 50.99 (SD 7.95) and 90.46 (SD 18.67), respectively. The estimated additive and dominance effects of QA have means 2.9658 (SD 0.5700) and 1.0024 (SD 0.8697), respectively. The mean of the estimated additive effect for QB is 1.3314 (SD 0.6368) from significant replicates, and it is 1.0447 (SD 0.6941) from all replicates. The mean of the estimated epistatic effects is 1.9897 (SD 0.948). The estimates of environmental variance and heritability are 24.02 (SD 3.50) and 0.225 (SD 0.0494).
CONCLUSION AND DISCUSSION
We use the orthogonal contrast scales proposed by Cockerham (1954) to define gene effects and to construct a genetic model, called Cockerham's model, for the study of epistasis between genes. The properties of Cockerham's model in modeling and mapping epistatic genes are investigated, and its variance component structure is also derived when genes are in linkage equilibrium and disequilibrium. The differences between Cockerham's model and other models in analyzing epistasis and mapping epistatic QTL are also compared. There are several advantages of using Cockerham's model in modeling epistasis of genes because of its orthogonal property. The advantages can benefit the study of QTL mapping. The issues of QTL mapping when epistasis is involved are also discussed. Real and simulated examples are used to illustrate Cockerham's model, verify its statistical properties, and map for epistatic QTL.
Parameterization of epistasis: Different types and degrees of epistasis can be also quantified by Cockerham's model. For example, if a1 = a2 = d1 = d2 = 3iaa/2 = 3iad/2 = 3ida/2 = 3idd/2, the genes show classical complementary interaction with a 9:7 ratio among different genotypic values, and the marginal effects of A and B
contribute 42.86% each and the epistatic effects contribute 14.28% to the total genetic variance. If a1 = a2 = d1 = d2 = −iaa/2 = −iad 2 = −ida/2 = −idd/2, the genotypes show classical duplicate interaction with a 15:1 ratio. The contributions of marginal effects of A and B and epistatic effects to the total genetic variance are 20, 20, and 60%, respectively. Other classical epistasis, such as recessive, dominant, and suppression, can also be quantified by Cockerham's model. The parameterization of epistasis can facilitate the study of epistasis in quantitative trait analyses.
Cockerham's model distinguishes itself from other models by the property of orthogonal contrast and will show the advantages in partitioning variance, genetic interpretation, statistical estimation, and QTL mapping over the other models as described in this article. When Cockerham's multiple-locus model is used to form the base of a multiple interval mapping (MIM) model (Kao et al. 1999) for QTL mapping, the likelihood of the MIM model is a 3m (2m) normal mixture for the F2 (backcross) population, and the general formulas by Kao and Zeng (1997) can be applied to obtain the MLE and the asymptotic variance-covariance matrix by using the coded variables to set up the genetic design matrix D and the conditional independence property to construct the conditional probability matrix Q. The orthogonalized MIM model can facilitate the search of epistatic QTL, enhance the resolution of QTL mapping, and help outline a better QTL mapping strategy.
Acknowledgement
This article is dedicated to the memory of Dr. C. Clark Cockerham. The authors are grateful to Dr. Christopher Basten for his suggestions and two anonymous reviewers for helpful comments. C.-H.K. was supported by grants NSC90-2313-B-001-019 from the National Science Council, Taiwan, Republic of China. Z.-B.Z. was funded by GM45344 from the National Institutes of Health and no. 9801300 from the U.S. Department of Agriculture, Plant Genome Program.
APPENDIX A
The expected normal equations can be expressed as a function of genotypic values, genotypic frequencies, and genetic parameters.
APPENDIX B
When the genotypic frequencies (Pij's) are expressed in terms of allele frequencies (p's) and the linkage disequilibrium coefficient (D) as shown in Table 6, and the left-hand sides of the expected normal equations in appendix a are replaced with statistical parameters, they can be expressed as the following equations.
APPENDIX C
Footnotes
Communicating editor: M. A. F. Noor
LITERATURE CITED

















