Abstract

The modular variation of organismal form during evolution or development can be viewed as the result of dissociated local developmental processes. Whereas studies of modularity are often experimental, morphological integration is a more descriptive concept applying to groups of correlated phenotypic characters. Using simple path models, this paper shows that the classic underlying assumption of modularity (high correlations within modules, lower correlations between modules) is met only when local developmental factors have effects on the traits that are high relative to the variation of all factors' effects (i.e., allometry). Accordingly, many classic approaches to morphological integration are meaningful only when local as well as common growth factors are nearly isometric. We show that morphometric modules might instead be defined more generally as sets of variables with non-zero within-module covariances, even when the covariances due to common factors have been removed, so that the residual between-module covariances are all near zero. Although it is still inherently unreliable to identify modules from phenotypic covariances in nonexperimental data, patterns of integration can sometimes be estimated based on prior identification of the modules themselves. We outline a simple approach for this case using Wright-style factor analysis and demonstrate the relation of its algebra to the more familiar approach via partial least squares.

Modularity is a property of complex structures or processes that are experimentally or conceptually separable into several “near-decomposable” (Simon, 1962, 1969) modules. This notion of modularity is entertained in a wide range of disciplines (e.g., biology, psychology, computer science, economics), for which diverse operational criteria of what is a module have been devised. Some authors also reasoned that it is useful to construe the general concept of modularity as the decomposition of the explanatory task itself (e.g., von Dassow and Munro, 1999; Callebaut, 2005). For biological modules, Bolker (2000:773) outlined three general requisites: (1) a module is a biological entity (a structure, a process, or a pathway) characterized by more internal than external integration; (2) modules are biological individuals that can be delineated from their context, and the behavior or function of which reflects the integration of their sum; and (3) a module can be delineated from other entities with which it interacts in some way. Modules of a biological system consequently exhibit some degree of independence as they vary over development, evolution, populations, or experimentally induced conditions. Further definitions and examples of biological modules can be found in Raff (1996), Wagner (1996), von Dassow and Munro (1999), Winther (2001), Schlosser and Wagner (2004), Brandon (2005), and Callebaut and Rasskin-Gutman (2005). Related concepts are the notion of “quasi-independent” characters by Lewontin (1978) and the idea of “gene nets” proposed by Bonner (1988).

There is wide agreement among biologists that modularity is a prerequisite for the hierarchical phenotypic organization of higher organisms as well as for the appearance of complex adaptations. Quantitative genetic models show that traits serving a common functional role tend to evolve a common genetic basis that is more or less independent from other gene nets (Riedl, 1978; Cheverud, 1994, 1996; Wagner, 1996; Wagner and Altenberg, 1996). Conversely, genetic and developmental interdependencies specifically constrain the direction of evolution because such correlated traits frequently evolve together (Lande, 1979; Cheverud, 1994). Wagner and Altenberg thus underscored the importance of a modular genotype-phenotype map to increase evolvability—the speed and capacity of evolution.

The notion of modularity—recognized as the dissociability of developmental processes—was already implicit at the founding of experimental embryology by Wilhelm Roux in the late 19th century (Raff, 1996). In his attempt to unveil “developmental mechanics,” Roux (1894) perturbed individual components or particular processes in an embryo, expecting that other parts of the embryo would proceed in their development. The first explicit definition of dissociability was provided by Joseph Needham in 1933 in the first extended treatise on this topic.

In the development of an animal embryo, proceeding normally under optimal conditions, the fundamental processes are seen as constituting a perfectly integrated whole. They fit in each other in such a way that the final product comes into being by means of a precise co-operation of reactions and events. But it seems to be a very important, if perhaps insufficiently appreciated, fact, that these fundamental processes are not separable only in thought; that on the contrary they can be dissociated experimentally or thrown out of gear with one another. This conception of out-of-gearishness still lacks a satisfactory name, but in the absence of better words, dissociability or disengagement will be used in what follows. (Needham, 1933:180.)

Needham, as a “chemical embryologist,” provided a series of examples about the experimental dissociation of processes like growth, cell division and differentiation, hormone activities, or metabolic processes. Heterochrony, too, is a frequently cited example of dissociation and refers to the temporal dissociation of growth and development or of particular developmental processes (Gould, 1977; Alberch et al., 1979; Zelditch, 2001; Mitteroecker et al., 2005). Ernst Haeckel (1866), who invented heterochrony as a class of exceptions to his theory of recapitulation, contrasted it to heterotopy, which can be regarded as a form of regional dissociation (Gould, 1977; Zelditch and Fink, 1996; Mitteroecker et al., 2004a).

Needham regarded it as important—even surprising—that the fundamental developmental processes in an organism that are usually “seen as constituting a perfectly integrated whole” are in fact separable. Evolutionary biologists and paleontologists, in contrast, often emphasize variations among, and differences in, patterns of change (Winther, 2001). For instance, cladistics, as a central method in systematics, formally assumes that all measured phenotypic traits are uncorrelated and that they provide independent evidence for phylogenetic history. In the 1950s Everett Olson and Robert Miller, two paleontologists, therefore devoted a series of papers and one book to yet another “surprise”:

It seemed evident to the writers, as it has to others, that character changes occurring in evolution of species could not be considered to be independent of each other and that studies which did not consider this dependency ignored a significant aspect of change. Not only should the interrelationships of changing characters be a primary point of interest, but the nature and intensity of the relationships should remain evident at all stages of their study. (Olson and Miller, 1958:1.)

Since then, the morphometric study of those interrelationships among characters is usually referred to as morphological integration and is based on the comparison of covariances or correlations among morphological measurements from specimens of a single or several distinct natural populations. The fundamental idea is that within the set of measured variables there exist subsets (“ρ-sets” in Olson and Miller's terminology) exhibiting relatively high mutual phenotypic correlations, whereas correlations among variables pertaining to different sets are comparatively low. Elements of those subsets are often assumed to share functional, developmental, or evolutionary properties.

Even though this concept of morphological integration is usually attributed to Olson and Miller, the very same idea had already been formulated more than 20 years earlier in a German article in Biometrika by the Russian Paul Terentjev (1931:26):

Nehmen wir ein Objekt als eine Gesamtheit von Merkmalen und berücksichtigen wir alle denkbaren Kombinationen ihrer korrelativen Zusammenhänge. Schon die oberflächliche Beobachtung zeigt, dass die Gleichmässigkeit solcher Korrelation der Merkmale einen ziemlich seltenen Fall bildet. In der Regel gruppieren sich die Merkmale in einigen “Gesellschaften”, die “Korrelationsplejaden” genannt werden können. Die Hin- und Herbezüge zwischen den Gliedern der verschiedenen Plejaden zeigen die Neigung zum Mindestmass. [Let us take an object as a totality of characteristics and consider all their mutual correlations. Even a superficial observation shows that the evenness of those correlations would be a quite rare case. Typically, traits are grouped into some “societies” that can be called “correlation pleiades”. The relationships among elements from different pleiades show a tendency to be minimal.]

Terentjev argued that traits are usually grouped according to their correlations in “correlation pleiades” where correlations among elements of different pleiades are low. In his 1931 paper, he proceeds by applying a battery of statistical and graphical techniques to identify such correlation pleiades in measurements on frogs. In contrast to Terentjev, Olson and Miller (1951) explicitly related such sets of correlated variables to biological hypotheses—that is, to a grouping of the variables (F-sets) according to their developmental or functional relationships. Chernoff and Magwene (1999:319) consequently defined morphological integration as “the correspondence of patterns of covariation among traits to a priori or a posteriori biological hypotheses.” Yet, in their 1958 book, Olson and Miller discarded the a priori construction of F-sets to derive hypotheses a posteriori from ρ-sets only. This reformulation of the original approach has not remained uncriticized (see Chernoff and Magwene, 1999). Also, the original statistical maneuvers used by Olson and Miller have been regarded as complicated and partially inadequate (e.g., Bock, 1960; Van Valen, 1965).

In the wake of Olson and Miller's seminal treatise, a series of statistically more elaborate methods to study morphological integration has meanwhile been suggested (for reviews see, e.g., Chernoff and Magwene, 1999; Mitteroecker, 2007). These methods vary in their purpose and strategy. Whereas several techniques aim to test empirically a priori formulated models of integration, often based on specific measures of integration, other methods are designed to identify modules in an exploratory fashion. Another bundle of techniques compares integration patterns among different taxa or patterns attributable to different causes. Finally, some approaches, intended to model integration and modularity of several anatomical parts, usually incorporate both a priori knowledge and empirical morphometric data. All these developments seem to have taken place without recognition of Terentjev's work. Still, the concept of correlation pleiades has been taken up by several Russian biologists, particularly botanists (Berg, 1960; Koloslava, 1971; Rostava, 1985), and it has influenced research on the modularity of plants (for a recent review see Murren, 2002). Recently, some authors have also addressed morphological integration in the context of phylogenetic inference (e.g., O'Keefe and Wagner, 2001; Strait, 2001; Athreya and Glantz, 2003).

Modularity and morphological integration appear as complementary concepts, and the current literature frequently equates groups of correlated variables to modules. When considered in more detail, however, the two concepts exhibit practical and conceptual differences. In biology, modularity typically denotes the (experimental) dissociability of developmental, functional, or evolutionary processes. Morphological integration, in contrast, is about the decomposition of morphological traits according to their (phenotypic) covariance structure compared across adults and/or subadults of a single taxon or across several taxa. Hence, modularity is typically regarded as a property of processes, whereas integration is a specific pattern of variation and covariation among parts that these processes brought about. Berg (1959) wrote: “The presence of correlations between, for instance, the dimensions of some parts of an organism and the absence of correlations between the dimensions of these and the other parts of the same organism indicate the independence of certain developmental processes with respect to other processes within the organism.” The fundamental assumption in most studies of morphological integration is hence that ρ-groups or correlation pleiades match or reflect the dissociation of local developmental processes.

In the present paper we attempt to formalize the intuitive relationship between morphological integration and modularity by a heuristic use of path models simulating particular developmental systems. We will show that groups of correlated characters do not reflect developmental modules unless certain conditions are met. Yet we demonstrate that under given notions of modularity, the pattern of morphological integration can be estimated. Furthermore, we will explore statistical properties of several published methods to assess morphological integration, and we will discuss the conditions under which they may reflect biological modularity. We will thereby focus on methods for testing, identifying, and measuring integration; the comparison of integration patterns will be the subject of a forthcoming paper. The following three sections present these arguments in an algebraic form, whereas the subsequent examples illustrate most of them via modeled covariance structures. The final discussion summarizes the arguments and their consequences for the morphometric assessment of modularity.

Path Models of Developmental Processes

In the early 20th century, Sewall Wright (1918, 1921, 1934) introduced the formalism of path models to quantify causal relationships among observed variables (for a recent review on path models and causality, see Pearl, 2000). Frequently applied in the social sciences, path models are less often encountered nowadays in biology and morphometrics. They allow for a conceptualization and computation of the effects of different (measured) factors on several response variables.

Consider first the simple path model of Figure 1, where it is assumed that the variable A affects B and C. Without loss of generality, take A to have a mean of 0 and a variance of 1. The path coefficient from A to B, i.e., the causal effect of A upon B, is b, and the path from A to C is c. Furthermore, both B and C are assigned error variables representing unexplained variance; these have means of zero and are uncorrelated with all other variables. Algebraically, this model may be expressed as B = bA+e1 and C = cA+e2, where the e's are “errors” around the two predictions. The variance of B is then b2+Var(e1) and that of C is c2+Var(e2). As the error terms are uncorrelated among themselves and with A, and as Var(A) = 1, the covariance between B and C is simply bc. The correlation between B and C is  

\mathrm{Corr}(B, C) = \frac{bc}{\sqrt{\left(b^{2} + \mathrm{Var}(e_1)\right)\left(c^{2} + \mathrm{Var}(e_2)\right)}} \qquad (1)
thus depending on the error variances as well.

Figure 1

A simple path model where the variable A affects B and C via the path coefficients b and c, respectively. Additional uncorrelated error terms are denoted by e1 and e2.

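This covariance algebra is easy to check numerically. The following sketch (with arbitrary illustrative values for b, c, and the error variances, chosen by us) simulates the path model of Figure 1 and recovers Cov(B, C) = bc and the correlation of Equation (1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000            # large sample, so estimates lie close to theory
b, c = 0.8, 0.5        # hypothetical path coefficients

A = rng.normal(0.0, 1.0, n)        # common cause with Var(A) = 1
e1 = rng.normal(0.0, 0.6, n)       # error terms, uncorrelated with A
e2 = rng.normal(0.0, 0.9, n)
B = b * A + e1
C = c * A + e2

# Theory: Cov(B, C) = b*c, and
# Corr(B, C) = b*c / sqrt((b^2 + Var(e1)) * (c^2 + Var(e2)))
cov_bc = np.cov(B, C)[0, 1]
corr_bc = np.corrcoef(B, C)[0, 1]
```

With these values the theoretical covariance is 0.4 and the theoretical correlation 0.4/√1.06, and the sample estimates approach them as n grows.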

In Sewall Wright's classic expositions, the deviation of a variable from its mean is expressed as a multiple of its standard deviation. Instead of covariances, then, he addresses correlation coefficients, and path coefficients describe the explained variance for the standardized responsive variables. In his approach, the three variables A, B, and C in Figure 1 would all have unit variance and the correlation (instead of the covariance) is bc. This requires that Var(e1) = 1−b2 and Var(e2) = 1−c2 so that the denominator in Equation (1) cancels out.

Accordingly, both correlations and covariances can be interpreted in the light of path models. In traditional morphometrics, which is frequently applied to variables having different units, analyses are often based on correlations (see, e.g., Blackith and Reyment, 1971; Marcus, 1990). Similarly, most published methods for studying morphological integration rely on correlations. In geometric morphometrics, instead, the variables all possess the same units and analysis proceeds via covariance matrices (Bookstein, 1991). Therefore, most of the following discussion is about covariances instead of correlations—variables are left unstandardized. Also, the implicit error model when working with covariances is more consistent with other morphometric models than that of correlations. Nonetheless, the arguments of this paper apply to correlations without substantial change, and hence to the published methods on integration, whenever the conventional error model is assumed.

Common and Local Factors

The morphology of an organism is determined by a variety of factors (sensu lato) that unfold their morphogenetic effects during ontogeny. In defining different kinds of integration, Cheverud (1996) distinguished among developmental, functional, genetic, and evolutionary causes, all of which lead to statistical associations among morphometric variables. Others are particularly concerned to differentiate between genetic developmental factors and epigenetic factors (e.g., Müller and Olsson, 2003) such as tissue interactions or environmental factors. In the following approach, these diverse kinds of developmental factors will not be specifically distinguished—they are assumed simply to affect the measured morphological variables in an additive way. Empirical identification of the sources of these effects can be achieved only by appropriate experimental designs. Yet, certainly many developmental factors do not affect the organism in a quantitative and additive way. We know, for example, that developmental events are often threshold-dependent (such as tissue induction) and may cause the emergence of entirely new structures instead of solely quantitative variation. Studies of morphological integration, however, are typically based on the comparison of adult or postnatal specimens from closely related species. The observed interindividual differences are assumed to be due to differences in such quantitative and additive factors, insofar as more fundamental developmental processes were identical in all compared organisms.

Fundamental for the following arguments will be the distinction between common and local factors. Global or common factors (such as circulating growth hormones, early developmental events, or environmental influences) are those affecting all the variables under study. In contrast, local factors affect only a subset of the given variables. Many genes, for example, are expressed only in specific tissues or organs; the effect of many tissue interactions might be primarily local. Also, structural or topographical separation of anatomical parts will contribute to the isolation from developmental events at other sites. In the following models, in order to keep the arguments as clear as possible, we assume that local factors are non-overlapping and not nested. However, it is straightforward to extend these models to more complex and realistic factor structures—the crucial properties and the ensuing conclusions remain the same.

Let us assume that nine variables X1, …, X9 are influenced by one common factor A and two local factors M and N affecting four and five variables, respectively (Fig. 2). The factors are assumed to be mutually uncorrelated, as indicated by the absence of any direct paths relating the factors in the path model. The covariance between two variables Xi and Xj due to the common factor is aiaj, so that the contribution of A to the covariance matrix Σ is the outer product aat, where a is a column vector containing all path coefficients a1, a2, …, a9 from A to the variables X1 through X9. The local factors contribute only to the respective parts of the covariance matrix, ΣI = Σ1…4,1…4 and ΣII = Σ5…9,5…9. The additional covariances due to the local factors are mmt and nnt, where mt = (m1,m2,m3,m4,0,0,0,0,0) and nt = (0,0,0,0,n1,n2,n3,n4,n5), so that  

\Sigma = \mathbf{a}\mathbf{a}^{t} + \mathbf{m}\mathbf{m}^{t} + \mathbf{n}\mathbf{n}^{t} + \Psi \qquad (2)

where Ψ is the diagonal covariance matrix of individual error variances.

In the present approach, the factor structure of the path model shall represent known or hypothesized developmental processes that cause a particular covariance structure among the phenotypic variables in a sample of organisms. The subsets of variables that are separately influenced by local factors (X1, …, X4 and X5, …, X9 in Fig. 2) will be called morphometric modules or simply modules in the remaining text. They are subject to separate and dissociated developmental control and thus may vary to some degree independently. The “near-decomposable” character of the modules can be characterized effectively by the residual covariance structure when the effect of the common factor(s) is removed from the raw covariances:  

\Sigma - \mathbf{a}\mathbf{a}^{t} = \begin{pmatrix} \mathbf{m}^{*}\mathbf{m}^{*t} & \mathbf{0} \\ \mathbf{0} & \mathbf{n}^{*}\mathbf{n}^{*t} \end{pmatrix} + \Psi \qquad (3)

where m* = (m1, …, m4)t and n* = (n1, …, n5)t are the subvectors of nonzero entries of m and n, and 0 is a matrix of zeros. That is, the within-module variables exhibit a non-zero covariance structure even when the common factors have been eliminated so that the cross-block covariance matrix ΣI,II becomes a zero matrix. Modular shape variation, therefore, is the residual variation after common factors have been correctly accounted for (see also Hansen, 2003). Modules are those blocks of variables that exhibit a zero cross-block covariance structure after adjustment for global factors. (For an observed sample, of course, these covariances are not exactly zero and (3) is only an approximate equality.) After accounting for common factors, the variables within one module remain correlated but are uncorrelated with those from other modules. Note that there exist several different approaches to account for factors; see the next section for more details.
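This decomposition can be illustrated numerically. The sketch below (with hypothetical path coefficients invented for illustration, not taken from any data set) builds the covariance structure Σ = aat + mmt + nnt + Ψ for the model of Figure 2 and verifies that removing the common factor and error variances leaves a block-diagonal residual:

```python
import numpy as np

# Hypothetical path coefficients for the model of Figure 2
a = np.full(9, 0.5)                                    # common factor A
m = np.array([0.7, 0.6, 0.8, 0.5, 0, 0, 0, 0, 0.0])    # local factor M
n = np.array([0, 0, 0, 0, 0.6, 0.7, 0.5, 0.8, 0.6])    # local factor N
Psi = np.diag(np.full(9, 0.1))                         # error variances

# Sigma = a a^t + m m^t + n n^t + Psi
Sigma = np.outer(a, a) + np.outer(m, m) + np.outer(n, n) + Psi

# Removing the common factor (and the error variances) leaves a
# block-diagonal residual: m* m*^t and n* n*^t on the diagonal blocks
resid = Sigma - np.outer(a, a) - Psi
cross_block = resid[:4, 4:]     # covariances between modules I and II
```

The cross-block residuals are exactly zero here because the model contains no shared local factor, while the within-module residuals remain non-zero (e.g., m1·m2 = 0.42 for the first two variables).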

Estimating Common and Local Factors

There exists an extensive literature on the statistical estimation of unmeasured factors underlying measured variables. Such techniques, usually referred to as exploratory factor analysis, were invented in the course of early 20th century attempts by Karl Pearson, Charles Spearman, and others to measure human intelligence (Johnson and Wichern, 1998). They assumed that a correlated set of measured variables X1,X2,…,Xp could be usefully modeled as linearly dependent on a smaller number k of common factors F1,F2,…,Fk plus p additional sources of variation e1,e2,…,ep that are uncorrelated with each other and with the factor scores so that  

\mathbf{X} = \mathbf{F}\mathbf{A}^{t} + \mathbf{E} \qquad (4)
where X is the n × p data matrix, F the n × k matrix of factor scores, A the p × k matrix of factor loadings, E the n × p matrix of residuals, and the superscript t denotes a matrix transpose. The residual term is usually taken to be the sum of two uncorrelated parts, the specific part and measurement error. The specific part contains that part of Xi unaccounted for by the factors and not due to measurement error (Reyment and Jöreskog, 1993). The coefficient aij is called the loading of the ith variable on the jth factor—the matrix A is called the matrix of factor loadings. The expected covariance matrix is then  
\Sigma = \mathbf{A}\Phi\mathbf{A}^{t} + \Psi
where Φ is the covariance matrix of factor scores and Ψ is the diagonal covariance matrix of residuals. In order to estimate the factor loadings A, most approaches assume that the factors are uncorrelated and of the same unit variance, so that Φ = I and  
\Sigma = \mathbf{A}\mathbf{A}^{t} + \Psi

Factor analysis provides estimates of A best accounting for the covariances (or correlations) among the variables under a wide range of meanings for the idea of “best.” One classic criterion for fitting the factor model to the data is to minimize the sum of squares of all the elements of S − AAt − Ψ, where S is the sample covariance matrix. If Ψ were known, one could fit AAt to S − Ψ by an eigenvalue decomposition as familiar from principal component analysis. In practice, Ψ is not known and thus is estimated from the data, e.g., as the squared multiple correlation coefficient.

The iterated principal factor method avoids an explicit estimation of Ψ: Start with any first guess Â, perhaps the first principal component of S, and compute Ψ̂ = diag(S − ÂÂt). The new estimate Â is then u1√λ1, where u1 and λ1 are the first eigenvector and first eigenvalue, respectively, of S − Ψ̂. Update Ψ̂ and repeat these steps until convergence. This classic method has meanwhile been superseded by other, maximum likelihood–based approaches (for reviews see, e.g., Harman, 1976; Bartholomew and Knott, 1999), but for the present purpose we will stay with principal factor analysis because it closely relates to other methods in the context of morphological integration.
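The iteration just described can be sketched as follows (a minimal one-factor implementation; the function name and the convergence settings are ours):

```python
import numpy as np

def principal_factor(S, tol=1e-9, max_iter=2000):
    """One-factor iterated principal factor analysis: alternate between
    estimating the residual variances Psi = diag(S - a a^t) and taking
    a as the first eigenvector of S - Psi, scaled by the square root
    of the first eigenvalue."""
    vals, vecs = np.linalg.eigh(S)
    a = vecs[:, -1] * np.sqrt(vals[-1])       # initial guess: scaled PC1
    if a[0] < 0:
        a = -a
    for _ in range(max_iter):
        Psi = np.diag(np.diag(S - np.outer(a, a)))
        vals, vecs = np.linalg.eigh(S - Psi)
        a_new = vecs[:, -1] * np.sqrt(vals[-1])
        if a_new[0] < 0:                      # eigenvector sign is arbitrary
            a_new = -a_new
        if np.max(np.abs(a_new - a)) < tol:
            return a_new
        a = a_new
    return a
```

Applied to a covariance matrix generated by a single factor plus uncorrelated error variances, the iteration recovers the factor loadings.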

It is well known that when neither the factors nor the additional variation are measurable, a unique solution to Equation (4) is not possible without additional constraints. This indeterminacy is due to the fact that any set of factors may be transformed linearly into another set of factors having the same fit. In practice, an initial factor solution is rotated until some formal criterion or a better scientific interpretability of the factors is achieved. A satisfactory solution is often called a simple structure: According to Thurstone (1947), a variable should not depend on all factors but only on a small sublist of them. Moreover, one factor should only be connected with a small sublist of the variables. These criteria closely resemble the definition of modularity in Figure 2 and Equation (3), yet without any further constraints or scientific input such a simple structure is more a convenient representation than a biologically meaningful factor. The application of exploratory factor analysis to morphometric data (e.g., in Sokal, 1961, or Gould and Garwood, 1969) cannot identify modules from morphometric measurements alone. See also Zelditch (1987) for a discussion of exploratory and confirmatory factor analysis in that context.

Figure 2

Extended path model with one common factor A affecting all nine variables X1, …, X9 and two more local factors M and N influencing four and five variables, respectively. Variable-specific error variance is not shown here.


Wright-Style Factor Analysis

Sewall Wright (1932) suggested a distinctive conceptual approach to the analysis of covariance structures, even though its computation is largely identical to principal factor analysis. Wright required that his factors be “consistent”; that is, that the loadings of a factor simultaneously be the coefficients in the corresponding path diagram of their scientific relevance. In his 1932 paper, he applied his method to six measures on Leghorn chicken bones, identifying one common factor and three local factors. As his data consisted of measured distances, he specifically interpreted his results as primary and secondary size factors.

Wright begins his analysis by iteratively computing a first estimate of the vector of path coefficients a for a common factor where  

a_i = \frac{\sum_j \sigma_{ij}\, a_j}{\sum_j a_j^{2}} \qquad (5)
when σii, the variance of the ith variable, is taken as ai2 (Bookstein, 1991:40). Bookstein et al. (1985) explained that, at convergence, the vector a is also identical to the first eigenvector (scaled by the square root of the first eigenvalue) of the correlation matrix when the respective diagonal elements are replaced by ai2. This, in turn, is the limit of convergence for the principal factor algorithm described above, so that Wright's first step in estimating the common factor is numerically identical to that approach.

Both algorithmically and conceptually, the following steps distinguish Wright's method from other factor analysis approaches. Based on the pattern of residual correlations after subtracting the effect of the common factor, he explicitly formulated hypotheses about sets of variables that are separately affected by specific or secondary factors. He specified sets or blocks of variables that are each responsive to one such group factor. Those correlations among variables pertaining to the same group are sequestered from the analysis of an updated estimate of the common factor (as had been the diagonal elements of Σ in the step before). For his data of six variables, Wright ended up with one common size factor (estimated only from the correlations between the specified subsets of variables) and three local size factors (estimated from the residual correlations within the subsets of variables), each influencing two variables. In Bookstein's reformulation of Wright's algorithm, the vector of loadings on the first common factor a is the scaled first eigenvector of the appropriately sequestered correlation matrix; it follows that the products aiaj are a least squares fit to the elements σij for all i and j not belonging to the same subset of variables; that is, to all between-block correlations or covariances (for a proof see Mitteroecker, 2007).

Wright's algorithm to extract and rotate factors conveniently enables the incorporation of empirical or prior knowledge about morphological modules in order to estimate common factors. In (5) or in Bookstein's algorithm, all covariances σik within one module will be replaced by aiak. The iterative estimation of a is therefore solely based on the between-module covariances. Local factors are computed subsequently from the within-module covariances after the covariance attributable to the common factor has been subtracted, so that  

\mathbf{m}^{*}\mathbf{m}^{*t} \approx \Sigma_{I} - \mathbf{a}_{I}\mathbf{a}_{I}^{t} - \Psi_{I} \qquad (6)
When inferring notions of modularity not from the data, as Wright did, but from prior biological hypotheses, this algorithm still provides a tool to calibrate (Martens and Naes, 1989) vague notions of common and local factors.
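The sequestering idea can be sketched as follows (a minimal one-common-factor implementation under our reading of Bookstein's reformulation; the function name, module list, and iteration count are illustrative):

```python
import numpy as np

def wright_common_factor(S, modules, n_iter=500):
    """Estimate the loadings a of a single common factor from the
    between-module covariances only. All within-module entries of S
    (including the diagonal) are 'sequestered': replaced at each step
    by the current fitted products a_i * a_k."""
    p = S.shape[0]
    same = np.zeros((p, p), dtype=bool)
    for idx in modules:                  # e.g. [[0, 1, 2, 3], [4, 5, 6, 7, 8]]
        same[np.ix_(idx, idx)] = True
    vals, vecs = np.linalg.eigh(S)
    a = vecs[:, -1] * np.sqrt(vals[-1])  # initial guess: scaled PC1
    for _ in range(n_iter):
        S_seq = np.where(same, np.outer(a, a), S)
        vals, vecs = np.linalg.eigh(S_seq)
        a = vecs[:, -1] * np.sqrt(vals[-1])
        if a.sum() < 0:                  # eigenvector sign is arbitrary
            a = -a
    return a
```

Local factor loadings then follow from the within-module residuals, as in the text. Note that for a two-module design only the between-module products aiaj are determined by this fit, not the block-by-block scale (see the discussion of scaling below).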

Partial Least Squares

For a two-block or two-module design, the variables can be ordered so that the between-module covariances are all contained in one off-diagonal block of the covariance matrix (the cross-block covariance matrix). Sampson et al. (1989) and Bookstein et al. (1996) employ a singular value decomposition (SVD) of this matrix to arrive at a two-block partial least squares (PLS) analysis (Wold, 1966). Let ΣI,II be the cross-block covariance matrix between the first and second blocks of variables XI and XII and decompose this matrix such that  

\Sigma_{I,II} = \mathbf{U}\Lambda\mathbf{V}^{t} \qquad (7)
where U and V are the matrices of left and right singular vectors and Λ is a diagonal matrix of singular values in descending order of absolute value. The first pair of these vectors, u1 and v1, maximizes Cov(XIu1,XIIv1); that is, the covariance between the scores along the left and right first singular vectors. The outer product u1v1t scaled by the first singular value λ1 is the rank-1 least squares fit of ΣI,II. It follows that  
\mathbf{a}_{I}\,\mathbf{a}_{II}^{t} = \lambda_1\,\mathbf{u}_1\mathbf{v}_1^{t}
and further that  
\mathbf{a}_{I} \propto \mathbf{u}_1, \qquad \mathbf{a}_{II} \propto \mathbf{v}_1
where aI are the elements of a corresponding to the first block and aII those for the second block. The first common factor, therefore, differs from the two-block PLS solution only by virtue of this separate scaling block by block. For more blocks of variables, this identity is no longer exact. There is no single generalization of the two-block PLS algorithm; instead there exists a range of generalizations that yield slightly different results (Bookstein et al., 2003). Our numerical experiments show that Wright's algorithm yields results that are very close to all these PLS generalizations.
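The two-block identity can be illustrated with a cross-block covariance matrix generated by hypothetical common-factor loadings (values invented for illustration):

```python
import numpy as np

# Cross-block covariance matrix generated by one common factor, so it is
# exactly of rank one: Sigma_I_II = a_I a_II^t
a_I = np.array([0.6, 0.5, 0.8, 0.4])
a_II = np.array([0.7, 0.3, 0.9, 0.5, 0.6])
Sigma_I_II = np.outer(a_I, a_II)

U, s, Vt = np.linalg.svd(Sigma_I_II)
u1, v1, lam1 = U[:, 0], Vt[0, :], s[0]
if u1[0] < 0:                       # the joint sign of u1, v1 is arbitrary
    u1, v1 = -u1, -v1

# The rank-1 fit lam1 * u1 v1^t reproduces a_I a_II^t exactly; u1 and v1
# equal the common-factor loadings only up to a separate scale per block.
fit = lam1 * np.outer(u1, v1)
```

Here u1 and v1 come out proportional to aI and aII, respectively (each normalized to unit length by the SVD), while their product, scaled by the first singular value, restores the cross-block covariances.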

PLS has recently been suggested as a method to model the pattern of integration among two or more blocks of morphometric variables (Rohlf and Corti, 2000; Bookstein et al., 2003). The numerical identity between Wright's factor approach and PLS when both are applied to a two-block design and their close resemblance for more blocks is somewhat surprising inasmuch as these approaches were invented for rather different purposes. PLS is a method for multivariate prediction or calibration optimizing the covariance between the blockwise scores. Wright's approach is a kind of factor analysis modeling correlations among the original variables. Their numerical similarity, however, sheds light on the interpretation of PLS when applied to study morphological integration and modularity. The PLS loading vectors may serve as (separately scaled) loadings on a common factor assuming a specific local factor structure (i.e., a specific notion of modularity). When they are visualized or interpreted as one joint shape deformation—when they serve as singular warps—they have to be rescaled appropriately because the amount of shape change depicted by one (unit-length) PLS loading vector does not necessarily correspond to the associated amount of shape change depicted by the other (unit-length) loading vector.

Scaling of Common Factor Loadings

For the common factor a in the two-block case, the products aiaj, for all i,j from different blocks, minimize the squared deviations from σij. There is, however, no unique criterion for scaling aI and aII, or equivalently for u and v in the PLS algorithm, because

aI(aII)t = (f aI)(g aII)t whenever fg = 1,

where f and g are two otherwise arbitrary scalars. As the least squares algorithm is not based on any within-block covariance, the separate scales of these vectors cannot be meaningfully constrained. In PLS with any number of blocks, they are by convention standardized to vector length one, while in Wright's iterative algorithm, when applied to a two-block design, the relative scaling is affected by the initial guess of a. The common factor loadings can thus be interpreted only relative to the other loadings of the same block but not to those of the other block(s). A joint visualization of the unit length PLS loading vectors as done in Bookstein et al. (2003) is not guaranteed to be meaningful. For PLS, this problem remains for multiple-block designs, whereas Wright's algorithm scales the parts of a correctly whenever more than two subsets of variables are used in the algorithm.

For three-block PLS, a simple solution for scaling the loading vectors is obtained by extending the rationale behind the SVD approach for two blocks. Regard the three PLS vectors u, v, and w for a three-block design as approximations of the three cross-block covariance matrices when scaled appropriately. Let r1,2, r1,3, and r2,3 be the regression coefficients of the cross-block covariances on their respective estimates through the PLS vectors, so that r1,2 is the regression coefficient of the elements σI,IIij on the corresponding elements uivj, and so forth. In two-block PLS, this regression coefficient is equal to the covariance between the PLS scores. This property holds true for more than two blocks and allows another computation of these coefficients:

r1,2 = Cov(XIu, XIIv) = utΣI,IIv,   r1,3 = Cov(XIu, XIIIw) = utΣI,IIIw,   r2,3 = Cov(XIIv, XIIIw) = vtΣII,IIIw.

The separate scaling factors du, dv, and dw for the PLS vectors relate to those regression coefficients or covariances as

dudv = r1,2,   dudw = r1,3,   dvdw = r2,3,

and the solution to these equations is

du = √(r1,2 r1,3/r2,3),   dv = √(r1,2 r2,3/r1,3),   dw = √(r1,3 r2,3/r1,2).   (7)
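Given the three pairwise score covariances, the scaling factors follow directly from these equations. A minimal NumPy sketch (the simulated block-wise scores are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated block-wise PLS scores sharing one underlying factor.
n = 1000
f = rng.normal(size=n)
s1 = 2.0 * f + 0.3 * rng.normal(size=n)
s2 = 1.5 * f + 0.3 * rng.normal(size=n)
s3 = 0.8 * f + 0.3 * rng.normal(size=n)

# Pairwise covariances between the block-wise scores.
r12 = np.cov(s1, s2)[0, 1]
r13 = np.cov(s1, s3)[0, 1]
r23 = np.cov(s2, s3)[0, 1]

# Solve d_u d_v = r12, d_u d_w = r13, d_v d_w = r23.
d_u = np.sqrt(r12 * r13 / r23)
d_v = np.sqrt(r12 * r23 / r13)
d_w = np.sqrt(r13 * r23 / r12)

# The defining equations hold exactly by construction.
assert np.isclose(d_u * d_v, r12)
assert np.isclose(d_u * d_w, r13)
assert np.isclose(d_v * d_w, r23)
```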

If u, v, and w are vectors of shape coordinates, the composite vector (duut|dvvt|dwwt) can then be visualized and interpreted jointly as one shape deformation.

The loadings of the first principal component of the block-wise common factor scores may serve as approximations of the scaling factors that are applicable for any number of blocks. Let S be the n × q matrix of block-wise common factor scores (XIuI | XIIuII | … | Xquq), where q is the number of blocks or modules and ui the first singular vector for the ith block, and decompose it as S = CΓDt, where D is the q × q matrix of right singular vectors or eigenvectors and Γ the diagonal matrix of singular values. The elements of the first column of D, denoted d1, …, dq, now contain the weightings for the PLS vectors so that

(d1(uI)t | d2(uII)t | … | dq(uq)t)t

is the scaled loading vector for the common factor.
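A sketch of this computation (NumPy; the number of blocks, the block scalings, and the noise level are invented): the first right singular vector of the centered score matrix recovers the relative scales of the blocks.

```python
import numpy as np

rng = np.random.default_rng(2)

# Block-wise common-factor scores for q = 4 blocks, sharing one factor f
# with different (assumed) scales per block.
n, q = 300, 4
f = rng.normal(size=n)
scales = np.array([2.0, 1.2, 0.7, 0.4])
S = np.outer(f, scales) + 0.2 * rng.normal(size=(n, q))
S = S - S.mean(0)

# SVD of S: the first right singular vector d1 weights the blocks.
C, G, Dt = np.linalg.svd(S, full_matrices=False)
d1 = Dt[0] * np.sign(Dt[0, 0])  # fix the arbitrary overall sign

# d1 is (nearly) proportional to the true relative block scales.
assert np.allclose(d1 / d1[0], scales / scales[0], atol=0.2)
```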

Factor Scores

In his published work, Wright did not compute scores for his factors, whereas the computation of scores is implicit in the PLS algorithm. The computation of scores allows one to compare specimens or other aspects of the data such as group means or regressions. In exploratory factor analysis, scores are often estimated by a weighted least squares method or through regression, whereas PLS scores are computed as linear combinations XIU and XIIV (and similarly for additional blocks). These linear combinations can be conceived as the result of an unweighted least squares procedure where the error variances are treated as equal and where the factors are orthogonal (Johnson and Wichern, 1998). It is thus convenient to compute scores for the common factor simply as linear combinations Xas where as = a*/|a*| and |…| denotes the vector norm. As is customary in PLS studies on morphological integration, it might also be interesting to plot XIu versus XIIv and similarly for more dimensions or blocks.
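As a sketch of this computation (NumPy; the loading vector, sample size, and noise level are invented for illustration), the common factor scores are simply the projection of the centered data onto the normalized loading vector:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data driven by one common factor with hypothetical loadings a*.
a_star = np.array([0.9, 0.7, 0.5, 0.8, 0.6, 0.4])
n = 400
f = rng.normal(size=n)                               # true factor scores
X = np.outer(f, a_star) + 0.2 * rng.normal(size=(n, 6))
X = X - X.mean(0)

# Scores as the linear combination X a_s with a_s = a*/|a*|.
a_s = a_star / np.linalg.norm(a_star)
scores = X @ a_s

# The computed scores recover the simulated factor up to noise and scale.
r = np.corrcoef(scores, f)[0, 1]
assert r > 0.95
```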

Wright (1932) did not extract more than one common factor for his data, whereas the estimation of subsequent factors is ubiquitous in both factor analysis and PLS. Several published variants of PLS, however, differ in how dimensions are removed from the data in order to estimate subsequent dimensions. In Wright's algorithm and many other factor analysis approaches, further factors (like local factors) are computed from the residual covariances/correlations after the effect attributable to the first factor has been subtracted from the raw covariances/correlations (“matrix deflation”). Alternatively, factor scores or PLS latent variable scores can be regressed out from the data, or dimensions might be projected out (as in two-block PLS when performed by an SVD). Although the first approach is consistent with the idea of path models, orthogonal or uncorrelated factors are more convenient for further statistical procedures.

Clustering Variables by Observed Correlations

In order to identify correlation pleiades, Terentjev (1931) plotted the distribution of all pairwise correlation coefficients and interpreted the lower range of them as the correlations between pleiades and the higher correlations as those among traits within the same pleiade. He then graphed the higher range of correlations in a “correlation ring” that visually supports the identification of clusters of variables. In Olson and Miller's (1951, 1958) approach for exploring morphological integration, ρ-sets are constructed from the correlation matrix of morphological measures. Formally, they define ρ-groups as “all maximal classes of measures such that within each class all pairs have ρ greater than the selected level” (1958:46). Of course, Olson and Miller recognized that such groups often overlap and that some variables may have “bonds” (i.e., high correlations) to several groups.

The rationale underlying both approaches is that all within-module correlations are higher in absolute value than the correlations between modules. (Terentjev as well as Olson and Miller took all correlations as positive values.) But this is a very demanding assumption, particularly in applications to shape coordinates. In Figure 2, the modeled correlation ri,j is aiaj if i,j are from different modules, and akal+mkml if k,l are from the same module. Evidently, rk,l exceeds ri,j only if the path coefficients m for the local factors are non-zero and of the same sign as the common factor loadings (compare also Cheverud, 1994; Gromko, 1995; Pigliucci, 2006). Furthermore, for

|akal + mkml| > |aiaj|

to hold for all corresponding pairs of correlations, the path coefficients a of the common factor need to be relatively homogeneous as compared to the local effects. If the path coefficients of all factors are equal, then all within-module correlations are indeed higher than the ones between—but this implies that all factors induce isometric growth. For real data this is very unlikely; owing to growth allometries and the geometry of the measurements (see below), not all variables are affected to the same degree by a developmental factor. Shape coordinates, in particular, cannot all increase together. It might then happen that two variables from different modules are relatively strongly affected by a common factor, inducing high covariance between those variables, while variables within the same module are only marginally affected by common factors, so that their correlation is lower even in the presence of local factors that increase it. The success of Olson and Miller's approach hence depends on the variation of all the path coefficients. This variation is, in fact, an estimate of the allometry of a growth factor: the more the coefficients vary, the more allometric is the shape or form change induced by this factor.
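The dependence of the classic criterion on the local path coefficients can be illustrated numerically. In this sketch (NumPy; all coefficient values are invented), within- and between-module covariances are built directly from the path model of Figure 2: with strong, homogeneous local factors every within-module covariance exceeds every between-module covariance in absolute value, whereas with weak local factors the two ranges overlap.

```python
import numpy as np

# Two modules of four variables.  Between-module covariances are ai*aj;
# within-module covariances are ak*al + mk*ml (Figure 2).  All coefficient
# values below are purely illustrative.
a = np.array([1.2, -0.8, 0.5, -1.5, 0.9, -0.3, 1.1, 0.6])
iu = np.triu_indices(4, 1)

def within_between(m1, m2):
    Sigma = np.outer(a, a)
    Sigma[:4, :4] += np.outer(m1, m1)
    Sigma[4:, 4:] += np.outer(m2, m2)
    within = np.abs(np.concatenate([Sigma[:4, :4][iu], Sigma[4:, 4:][iu]]))
    between = np.abs(Sigma[:4, 4:].ravel())
    return within, between

# Strong, homogeneous local factors: every within-module covariance
# exceeds every between-module covariance in absolute value.
w_strong, b = within_between(np.full(4, 2.0), np.full(4, 2.0))
assert w_strong.min() > b.max()

# Weak local factors: the two ranges overlap and the criterion fails.
w_weak, b = within_between(np.full(4, 0.3), np.full(4, 0.3))
assert w_weak.min() < b.max()
```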

An approach exploiting the same classic assumptions was presented by Leamy (1977) and Cheverud (1982). They applied Q-mode cluster analysis to morphometric variables and to their loadings in multidimensional scaling or principal component analysis. Functionally related measures, so they argued, would end up in one cluster. Q-mode cluster analysis seeks to cluster variables in an n-dimensional space, often called dual space, based on a particular distance function. In dual space, variables are usually taken as vectors from the origin, and the angle α between two vectors is one typical distance function between two variables. The correlation r between two variables can be expressed in terms of this angle as r = cos α and, when the variables are standardized, the Euclidean distance d between two variables in dual space is d = 2 sin(α/2), so that these two distance functions are closely related. The assertion that variables within the same module cluster in a Q-mode cluster analysis is therefore equivalent to Olson and Miller's classic assertion that variables within modules are mutually highly correlated and less correlated with variables in other modules. The loadings of the variables on a subset of principal components (scaled by the square root of the respective eigenvalue) relate to a projection of full dual space onto fewer dimensions.
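The relation between correlation and dual-space distance is easy to check numerically (NumPy; the two simulated variables are invented for illustration). After standardizing the variables and scaling them to unit length, the Euclidean distance between them equals 2 sin(α/2) with r = cos α:

```python
import numpy as np

rng = np.random.default_rng(5)

# Two correlated variables, standardized and viewed as vectors in
# n-dimensional dual space (the simulated data are invented).
n = 200
x = rng.normal(size=n)
y = 0.6 * x + 0.8 * rng.normal(size=n)
x = (x - x.mean()) / x.std()
y = (y - y.mean()) / y.std()

r = np.corrcoef(x, y)[0, 1]
alpha = np.arccos(r)          # angle between the two variable vectors

# Scale to unit length; the Euclidean distance then equals 2 sin(alpha/2).
d = np.linalg.norm(x - y) / np.sqrt(n)
assert np.isclose(d, 2 * np.sin(alpha / 2))
```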

Partial Correlations

In Terentjev's analysis of 22 measurements on 203 living frogs (Rana ridibunda), he selected the snout-anus distance as being representative for overall size and computed partial correlations conditioned on this size measure. Terentjev found that the variables clustered much more clearly based on the partial correlations than on the raw correlations. Because allometric size is usually a dominant factor in morphometric data, this approach is similar to partialing out a common factor. As we set out already (Equation (3)), a modular structure of the covariance matrix is more apparent from these residual covariances or correlations than from the raw correlations. Similarly, Zelditch (1987, 1988) elaborately compared a one-factor model of allometric size with more complex models of morphological integration. Hallgrímsson et al. (2002) tested whether the correlations of traits within tentative modules are higher than expected for integration due to size alone.

For one correlation pleiade that Terentjev identified, he proceeded by searching for subpleiades within that group of traits. He partialed out another variable that had the highest average correlation to all other variables in this group—the variable that he regarded as the most effective indicator for that pleiade. The partial correlations that were conditional on both size and this indicator exhibited two subpleiades, one for the bony elements of the head and one for the bony elements of the limbs. Terentjev's method obviously relates to the factor approach that we presented above when taking his size variable and his indicator variable as rough common factor scores estimates—which they were apparently intended to be.

Magwene (2001) drew conclusions about modularity by grouping variables according to the elements of the inverted covariance matrix (and several derived statistics), which relate to partial correlations conditional on all other variables. As indicated by Equation (3), within-module covariances conditional on the variables outside the module relate to the residual covariances in (6) after the common factors have been partialed out. But when correlations are also conditional on variables from the same module, the partial correlations will be near zero because, by definition, the variables within one module are highly correlated, so that even the local factors are removed. In Magwene's example, the chicken data from Wright, this approach happened to work because all modules consisted of two variables only. The pairwise correlations conditional on all variables are then simultaneously conditional only on the variables outside of the modules, because there are no further variables within a module. But for any data with more than two variables per module this approach must yield uninterpretable results.

Geometric Properties of Morphometric Variables

The above discussion as well as the majority of published methods for studying integration implicitly assume a null model of no integration for which the correlations among the variables are zero. Non-zero correlations are expected for integrated variables or those within one module. We have just shown that these assumptions are met only for specific cases. It turns out that the geometric properties of morphometric variables make this classic model of integration even more problematic. For p landmarks in k dimensions, shape space is of pk − k − k(k−1)/2 − 1 dimensions—the matrix of shape variables thus has k + k(k−1)/2 + 1 degrees of freedom fewer than it has variables. As a consequence, the pk shape coordinates are necessarily correlated. This is true to an even greater extent when describing shape or form by all possible linear distances. For details on the joint distribution of morphometric variables see Bookstein (1991) or Dryden and Mardia (1998).

Closely adjacent structures in an organism are likely to covary in shape and size even if they do not share a common genetic or developmental basis. Roth (1996) argued that regardless of any genetic contingency, structures that occupy adjacent regions in space must be mutually accommodating in volume (“spatial packaging”). Similarly, Enlow et al. (1969) emphasized that in a functional assemblage of bones, certain key dimensions must necessarily correspond between these bones in order to provide proper fit (the “counterpart principle”). Such an association between phenotypic correlation and spatial contiguity, which is also called “Pearson's rule of neighbourhood” (Whiteley and Pearson, 1899; Lewenz and Whiteley, 1902), has been documented for a range of organisms. Sawin et al. (1970), for example, reported an approximately linear decline in the correlation among dimensions of rabbit bones with their spatial distance.

Most statistical and morphometric maneuvers (also those suggested above) do not take into account the spatial structure of the measured variables or their geometric properties and may not be able to separate interesting signals from geometric necessities. In particular, covariances among variables from two distant parts would likely be lower than covariances among more adjacent variables within one part—but this would be true for any spatially coherent parts even in the absence of dissociated genetic control.

Conversely, completely integrated shape variables do not necessarily exhibit non-zero correlations. Imagine a configuration of four landmarks constituting a square that is sheared into a parallelogram. When these forms are superimposed and projected into tangent space, the covariances among Procrustes shape coordinates are all non-zero—as one would expect for this (geometrically) completely integrated shape change (the uniform shape change involves all landmarks). Alternatively, the same forms may be described by the four side lengths and the two diagonals. The side lengths all remain identical across the forms (and hence are mutually of covariance 0); only the diagonals change in a perfectly correlated fashion. Yet an interpretation in terms of one module consisting of the two diagonals would be greatly misleading. In general, for any landmark configuration there exist geometrically integrated shape changes that leave some variables uncorrelated. Conversely, for a given shape transformation that is large scale and thus involves all parts of that shape, there exist sets of descriptors that are arbitrarily correlated. Imagine an unmeasured theoretical space of all possible shape descriptors, so that the space of actual measured variables can be regarded as a projection of it. As in PCA, the full space can be rotated and projected so that all or some resulting linear combinations are uncorrelated. The pattern of observed correlation is hence also a function of the measures chosen to describe differences among particular forms (compare also Bookstein, 1991: chapter 6.3).
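This landmark example can be made concrete in code (a NumPy sketch; the deformation is modeled, as the text describes, by parallelograms whose four sides keep their length, i.e., rhombi with varying vertex angle):

```python
import numpy as np

def dist(p, q):
    # Euclidean distance between two 2-D points.
    return float(np.hypot(*(p - q)))

# A unit square deformed into parallelograms whose four sides keep their
# length (rhombi), parameterized by the angle theta at the first vertex.
thetas = np.linspace(np.pi / 2 - 0.2, np.pi / 2 + 0.2, 25)

sides, diag1, diag2 = [], [], []
for th in thetas:
    c, s = np.cos(th), np.sin(th)
    A, B = np.array([0.0, 0.0]), np.array([1.0, 0.0])
    C, D = np.array([1 + c, s]), np.array([c, s])
    sides.append([dist(A, B), dist(B, C), dist(C, D), dist(D, A)])
    diag1.append(dist(A, C))
    diag2.append(dist(B, D))

sides = np.array(sides)

# The four side lengths do not vary at all across the deformation ...
assert np.allclose(sides.var(axis=0), 0)
# ... while the two diagonals change in an almost perfectly (negatively)
# correlated fashion.
r = np.corrcoef(diag1, diag2)[0, 1]
assert r < -0.99
```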

Measures of Integration

As morphological integration and modularity are each a matter of degree, several authors have suggested different methods to quantify the extent to which two or more sets of variables are integrated in morphometric data. Such indices can be compared among groups or may serve to identify modules when comparing the actual index to those of other hypothesized modularizations or even random partitionings of variables. Measures of overall integration (without specifying groups of variables) are not considered here.

Average Correlations and Covariances

Cheverud (1982) suggested comparing the average correlation or average squared correlation within the tentative modules to the average (squared) correlation between the modules. Similarly, Klingenberg and Zaklan (2000) used the average squared covariances to quantify integration among two sets of variables (see also Rohlf and Corti, 2000).

For the path model in Figure 2, the submatrix of covariances within the first module is a sum of the two outer products ΣI = aI(aI)t + mI(mI)t, where aI = (a1,a2,a3,a4)t and mI = (m1,m2,m3,m4)t (for simplicity we ignore individual error variances). The mean of the elements of the outer product aI(aI)t, denoted as M(aiaj), equals the squared mean of the elements of a, denoted as M2(ai), and accordingly for m and n, so that

M(σIij) = M2(aIi) + M2(mi)

and similarly for the second module. In words, the average covariance within one module equals the squared mean of the respective path coefficients of the common factor plus the squared mean of the path coefficients of the local factor. The submatrix of Σ containing the covariances between the two modules, denoted as ΣI,II, depends only on the common factor so that ΣI,II = aI(aII)t. The mean of these matrix elements is

M(σI,IIij) = M(aIi)M(aIIi).

The square root of the average between-module covariance is thus an estimate of the average path coefficient of the common factor. When taking M(aIi) ≈ M(aIIi), the difference between the average within-module covariance and the average between-module covariance is an estimate of the squared average path coefficient of the local factors:

M(σIij) − M(σI,IIij) ≈ M2(mi).
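These identities for the average covariances can be verified directly (NumPy; the path coefficients below are invented for illustration):

```python
import numpy as np

# Illustrative path coefficients for two modules of four variables each.
aI = np.array([1.0, 1.4, 0.8, 1.2])    # common factor, module I
aII = np.array([0.9, 1.1, 1.3, 0.7])   # common factor, module II
m = np.array([0.5, 0.9, 0.7, 0.6])     # local factor, module I

# Model covariances (error variances ignored, as in the text).
SigmaI = np.outer(aI, aI) + np.outer(m, m)   # within module I
SigmaI_II = np.outer(aI, aII)                # between the modules

# M(sigmaI_ij) = M2(aI_i) + M2(m_i)
assert np.isclose(SigmaI.mean(), aI.mean() ** 2 + m.mean() ** 2)
# M(sigmaI,II_ij) = M(aI_i) M(aII_i)
assert np.isclose(SigmaI_II.mean(), aI.mean() * aII.mean())
```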

The situation is slightly more complicated with the squared covariances/correlations. It turns out that

M((aiaj)2) = (M2(ai) + Var(ai))2,

or, in words, the mean of the squared elements of the outer product aat is a function of both the mean and the variance of the elements of a. This gives

M((σIij)2) = (M2(aIi) + Var(aIi))2 + 2M2(aIimi) + (M2(mi) + Var(mi))2,   (8)

similarly for the second module, and

M((σI,IIij)2) = (M2(aIi) + Var(aIi))(M2(aIIi) + Var(aIIi)).   (9)

Equation (8) and Equation (9) show that averages of the squared covariances are a function of both the means and the variation of the path coefficients. The variation of the path coefficients is a measure of the allometry induced by this growth factor. Although average covariances or correlations can be related to average path coefficients, that is, to the average effect of factors, the comparisons of average squared covariances/correlations are quite difficult to interpret.
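Equations (8) and (9) can likewise be checked numerically (NumPy; the same style of invented path coefficients as above):

```python
import numpy as np

# Illustrative path coefficients for two modules of four variables each.
aI = np.array([1.0, 1.4, 0.8, 1.2])    # common factor, module I
aII = np.array([0.9, 1.1, 1.3, 0.7])   # common factor, module II
m = np.array([0.5, 0.9, 0.7, 0.6])     # local factor, module I

S_within = np.outer(aI, aI) + np.outer(m, m)
S_between = np.outer(aI, aII)

# Equation (9): mean squared between-module covariance.
rhs9 = (aI.mean() ** 2 + aI.var()) * (aII.mean() ** 2 + aII.var())
assert np.isclose((S_between ** 2).mean(), rhs9)

# Equation (8): mean squared within-module covariance, with a cross term.
rhs8 = ((aI.mean() ** 2 + aI.var()) ** 2
        + 2 * (aI * m).mean() ** 2
        + (m.mean() ** 2 + m.var()) ** 2)
assert np.isclose((S_within ** 2).mean(), rhs8)
```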

Cheverud et al. (1989) introduced a related approach to evaluate models of morphological integration based on the element-wise matrix correlation between an empirical correlation matrix and a "theoretical matrix" or "connectivity matrix" that formalizes a priori notions of integration. A matrix correlation is the sum of the products of corresponding matrix elements

ΓRT = Σ(i<j) tij rij,

where tij and rij are the i,jth elements of the theoretical and the correlation matrices, respectively, and the sum runs over all pairs of the M variables. For the typical theoretical matrices, tij is 1 when the ith and jth variables both pertain to the same module or functional unit and is 0 otherwise. It follows that rijtij is rij within the modules and zero between them, so that the matrix correlation ΓRT is the sum of all within-module correlations. Further, Cheverud and colleagues suggested testing the matrix correlation by a quadratic assignment test, also called Mantel's test (Mantel, 1967). This algorithm jointly permutes the rows and columns of one matrix, recomputing the matrix correlation each time. The proportion of permuted matrix correlations exceeding the observed one is a measure of the strength of association between the two matrices. In this application, the original matrix correlation is higher than the permuted ones only if the between-module correlations are lower on average than the ones within. This Mantel's test is therefore a measure of the difference between the average correlation within the modules and the average correlation between them. But it is not a valid significance test as it permutes the variables instead of the cases (cf. Good, 2000; Mantel's original paper compared n × n distance matrices and not p × p covariance matrices).
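A minimal version of this quadratic assignment procedure can be sketched as follows (NumPy; the correlation matrix and module sizes are invented, with deliberately high within-module correlations so that nearly all permutations fall below the observed matrix correlation):

```python
import numpy as np

rng = np.random.default_rng(6)

# A correlation matrix with two clear 4-variable modules (invented values).
p = 8
R = np.full((p, p), 0.1)
R[:4, :4] = 0.7
R[4:, 4:] = 0.7
np.fill_diagonal(R, 1.0)

# Theoretical matrix: 1 for pairs within a hypothesized module, 0 between.
T = np.zeros((p, p))
T[:4, :4] = 1.0
T[4:, 4:] = 1.0

iu = np.triu_indices(p, 1)

def matrix_corr(R, T):
    # Sum of products of corresponding off-diagonal elements.
    return (R[iu] * T[iu]).sum()

obs = matrix_corr(R, T)

# Quadratic assignment: jointly permute rows and columns of R.
n_perm = 999
count = 0
for _ in range(n_perm):
    perm = rng.permutation(p)
    if matrix_corr(R[np.ix_(perm, perm)], T) >= obs:
        count += 1
p_value = (count + 1) / (n_perm + 1)
assert p_value < 0.1
```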

The above approaches assume that the average correlation or covariance among functionally or developmentally related variables—those within the same module—is higher than the average correlation between such modules, and equivalently for the squared correlations. As both the within and the between average covariances depend solely on the average path coefficients, this basic assumption is met only if at least the local factors have positive average path coefficients. But this is not guaranteed: path coefficients can take on mixed signs. Only for measured interlandmark distances and factors that include a marked size increase will the coefficients necessarily have a positive central tendency. Certainly, this assumption is not met for shape variables and specifically not for shape coordinates.

Trace Correlation and Correlation among Singular Warp Scores

Recently, two multivariate methods to quantify the degree of integration have been proposed. Klingenberg et al. (2003) applied the trace correlation, which was introduced by Hooper (1959) as one multivariate extension of the coefficient of determination among two sets of variables (see also Mardia et al., 1979:170–171, and Hallgrímsson et al., 2006). The trace correlation ranges from 0, when no part of the variation in one block can be explained by the other block, to 1, when everything can be explained. It is therefore a measure of the amount of variation that is attributable to the common factors: When all variation within one block can be predicted by the other block, there are no local factors and no modular residual variance (at least for that partitioning of the variables). When, in turn, no variation at all can be predicted across the blocks, the variation is completely driven by local factors—the highest degree of modularity. Another published measure of integration among two or more blocks of variables is the correlation between the first PLS scores (or singular warp scores) of those blocks (e.g., in Bookstein et al., 2003; Bastir and Rosas, 2005; Gunz and Harvati, 2007).

Both suggested measures can be directly related to the path model of common and local factors, but it turns out that they possess several undesirable properties. First, the trace correlation r2T is—unlike the univariate R2—not symmetric; that is, r2T(X, Y) ≠ r2T(Y, X), except when both blocks have the same number of variables. Perhaps most seriously, the trace correlation depends on the number of predictor variables. Similarly, the correlation between PLS scores increases with the number of variables and decreases with the number of cases (Fig. 3). When using these two methods to compare the integration among several anatomical regions, those regions should be captured by the same number of variables; when comparing integration in different taxa, the groups should have the same sample size. Yet a significance test for the correlation among PLS scores, as suggested by Rohlf and Corti (2000), is expected to be independent of the variable count.
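The dependence on the number of variables is easy to reproduce (a NumPy sketch along the lines of the simulation in Fig. 3; the block sizes and replicate counts are arbitrary): two completely independent random blocks yield a first-PLS-score correlation that grows with p.

```python
import numpy as np

rng = np.random.default_rng(7)

def pls_score_corr(n, p):
    # Two completely independent random blocks of p variables each.
    X = rng.normal(size=(n, p))
    Y = rng.normal(size=(n, p))
    X, Y = X - X.mean(0), Y - Y.mean(0)
    U, s, Vt = np.linalg.svd(X.T @ Y / (n - 1))
    return abs(np.corrcoef(X @ U[:, 0], Y @ Vt[0])[0, 1])

n = 100
small = np.mean([pls_score_corr(n, 2) for _ in range(20)])
large = np.mean([pls_score_corr(n, 40) for _ in range(20)])

# Despite zero true association, the first-PLS-score correlation
# grows markedly with the number of variables per block.
assert large > small
```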

Figure 3

Simulation illustrating the properties of trace correlation and correlation among partial least squares (PLS) scores. For a given p (plotted along the abscissa), two blocks of p random variables were generated with n = 100 cases each—the two blocks or modules are thus statistically completely independent. For each p the trace correlation and the correlation between the first PLS scores was computed among the two blocks. The data simulation was repeated 50 times and the results averaged for each p. Even though the two blocks of variables are constructed to be completely independent in each simulation, the figure demonstrates that both measures increase with the number of variables and that the trace correlation is 1 when p ≥ (n−1).


Example 1: Estimating Common Factor Loadings

We present two different examples to further explore some of the methods described above. In this first example, we model the covariances among 30 variables via additive global and local factors according to Equation (3). As shown in Figure 4, we assume one common factor and three non-overlapping local factors that affect modules of 10 variables each. The actual path coefficients—the ‘causal’ effects of the factors on each variable—are drawn from a normal distribution, and then specific variance of 1 is added to each variable. Because the following methods to estimate the common factor are applied to a covariance matrix without error, they are expected to estimate the true path coefficients exactly.
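A covariance matrix of this structure can be constructed directly from the path coefficients (a NumPy sketch; the coefficient distributions follow the text, the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)

p, q = 30, 3                   # 30 variables in 3 modules of 10
a = rng.normal(0, 1, size=p)   # common-factor path coefficients

# Sigma = a a^t + sum of local-factor terms + specific variance of 1.
Sigma = np.outer(a, a) + np.eye(p)
for k in range(q):
    m = np.zeros(p)
    m[10 * k:10 * (k + 1)] = rng.normal(0, 1, size=10)
    Sigma += np.outer(m, m)

# The result is a valid (positive definite) covariance matrix ...
assert np.all(np.linalg.eigvalsh(Sigma) > 0)
# ... whose between-module covariances stem from the common factor alone.
assert np.allclose(Sigma[:10, 10:20], np.outer(a[:10], a[10:20]))
```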

Figure 4

Path model for the covariance structures used in the examples. A total of 30 variables X1, …, X30 are influenced by a common factor A and three local factors B, C, and D.


Figure 5a confirms that the Wright-style factor solution correctly matches the true common factor path coefficients. The loadings of the first three-block PLS vectors do not adequately reflect the true path coefficients when all three vectors are taken together. Each vector separately exhibits a linear relationship with the true common factor—they are identical up to separate scaling (Fig. 5b). When the scaling factors for the PLS solution are estimated by Equation (7), they exactly match the path coefficients of the common factor. Figure 5c compares the true path coefficients with the loadings on the first principal component that could be regarded as an approximation of a single common factor in the absence of any further local factors (i.e., when the covariance matrix is of rank one). Figure 5d presents the loadings of a Wright-style factor analysis with the wrong notion of modularity (module 1: X1, …, X5 and X26, …, X30; module 2: X6,…, X15; module 3: X16, …, X25). In both cases, the wrong assumptions lead to incorrect estimation of the common factor.

Figure 5

Scatterplots of the true common factor path coefficients against their estimations by (a) Wright-style factor analysis and also scaled partial least squares analysis (PLS) when using the correct notion of modules, (b) unscaled PLS, (c) principal component analysis, and (d) Wright-style factor analysis based on the wrong modules. The filled gray circles denote the variables from the first module, the black ones from the second module, and the open circles from the third module.


Example 2: Identifying Modules

Olson and Miller as well as Terentjev assumed that the absolute values of within-module correlations exceed those of the correlations between modules. Based on the same assumption, other authors argued that modules should be seen as clusters of morphometric variables in a Q-mode cluster analysis. As a measure of integration, several authors use the difference between the average covariance or correlation (or their mean squares) within all modules and the average covariance/correlation among the modules. If the average within-module covariance is assumed to exceed the average covariance between the modules, a Mantel's test between the covariance/correlation matrix and a design matrix specifying the modules is frequently used as a measure of modularity. To further explore these approaches we modeled three different covariance structures that all correspond to the path model in Figure 4.

  • Model 1: The path coefficients for the common factor as well as for the three local factors have a mean of 0 and a standard deviation of 1—the average effect of the factors is small as compared to their variation or allometry.

  • Model 2: Same as model 1 but the coefficients have a mean of 3. The path coefficients are higher on average than in model 1 but their variation is identical.

  • Model 3: The coefficients of the common factor range evenly from 2 to 7.8, modeling a markedly allometric growth gradient. This model consists of no local factors, so that no notion of modules is warranted. For the exemplary analysis the first 10 values are nonetheless treated as belonging to the first tentative module, the second 10 to the second, and the highest 10 coefficients to the third.

Figure 6a to 6c show frequency plots of the within-module covariances (dark gray) and the between-module covariances (light gray) separately for each model. As we showed algebraically, the two distributions of covariances differ clearly in model 2, where the local path coefficients are high relative to their variation (i.e., allometry), whereas the distributions are identical in model 1. When assuming the same modules for model 3 (although this is formally incorrect because model 3 has no local factors), the distribution of within-module covariances consists of three clusters and differs from the distribution of between-module covariances.

Figure 6

Frequency plots for the within-module covariances (dark gray) and between-module covariances (light gray) in the three different models (a–c). (d) the frequency plot for model 2 after the effects of allometry have been removed.


The allometries of the factors—the variation of their effects—complicate the identification of modules from phenotypic covariances. It is therefore expected that after the removal of allometric size (the consequences of overall size on shape), estimated either as a regression on overall size or as the first principal component, modules are more clearly apparent in the covariance structure. When doing so, it turns out that the distributions of within-module covariances and between-module covariances become narrower; i.e., their variation decreases, but their central tendencies remain unaltered. For model 2 there is thus nearly no overlap between the two distributions after removing allometry (Fig. 6d), and within-module covariances clearly exceed the between-module residual covariances. When removing allometry from model 1, the distributions become narrower as well, but their central tendencies do not change—they remain completely overlapping. The success of removing allometry depends, of course, on the overall size increase caused by the common factors and on the amount of shape variation they induce relative to the local factors.

Also, the average covariances fulfill the classic expectations only for model 2, owing to its positive average path coefficients (average covariance within/between: 19.70/9.47; average squared covariance within/between: 436.14/113.39). In model 1 the average squared covariance within (1.44) exceeds that between (0.70), whereas this is not the case for the average covariances (−0.12 versus −0.03). For model 3, which does not include any local factors, both the average covariance within (26.6) and the average squared covariance within (985.10) exceed the respective values between the assumed “modules” (22.67, 599.53). A Mantel test (1000 permutations) identifies modules for models 2 and 3 but not for model 1 (P = 0.826, 0.001, and 0.001 for models 1–3, respectively).
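The matrix-correlation and permutation logic behind such a Mantel-style test can be sketched as follows; the `mantel` function and the toy block-structured correlation matrix are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def mantel(corr, conn, n_perm=1000):
    """Correlation between the off-diagonal elements of a correlation matrix
    and a 0/1 connectivity matrix, with a permutation null obtained by
    shuffling variables (rows and columns jointly)."""
    iu = np.triu_indices_from(corr, k=1)
    obs = np.corrcoef(corr[iu], conn[iu])[0, 1]
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(corr))
        if np.corrcoef(corr[np.ix_(perm, perm)][iu], conn[iu])[0, 1] >= obs:
            count += 1
    return obs, (count + 1) / (n_perm + 1)

# Toy correlation matrix: two 4-variable modules, r = 0.7 within, 0.2 between.
modules = np.repeat([0, 1], 4)
conn = (modules[:, None] == modules[None, :]).astype(float)
corr = np.where(conn == 1, 0.7, 0.2)
np.fill_diagonal(corr, 1.0)
r, pval = mantel(corr, conn)
print(r, pval)
```

With such a cleanly block-structured matrix the matrix correlation is high and the permutation P value small; as argued in the main text, however, this statistic reflects only the difference of average correlations and inherits the corresponding assumptions.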

For models 1 and 2, Figure 7 plots the loadings of all 30 variables on the first three eigenvectors. Clusters of variables corresponding to modules would be apparent in these plots if they existed. Because the path coefficients are not sufficiently separated from zero, we see no modules for model 1, whereas in model 2 the variables clearly cluster according to the modules. Because model 3 consists of a single factor, only one meaningful eigenvector exists, and dual space is one-dimensional. In this space the variables are ordered according to their common-factor path coefficients, that is, along the growth gradient, and so they cluster “correctly” here, even though the notion of modules is formally incorrect.

Figure 7

Loadings of all 30 variables on the first three eigenvectors (dual space) for the first two models (a and b). Because there is only one meaningful eigenvector for model 3, dual space is one-dimensional here (c).


Of the three models, therefore, these classic approaches correctly identify only model 2, with its positive average path coefficients. The modules in model 1 cannot be identified, and even in the absence of any local factors a modular structure is incorrectly attributed to model 3. A growth gradient or a strongly allometric common factor may thus induce a covariance structure that appears modular when assessed by these indices. Similar results ensue when correlations instead of covariances are analyzed for the three simulations, but they depend partly on the amount of error variance added to the variables.

Discussion

Modularity has become a key concept in the contemporary study of evolution and development. Morphological modules can be construed as parts of an organism that are separately affected by local genetic, developmental, evolutionary, or environmental factors so that they exhibit some degree of independence as they vary over ontogenetic or evolutionary time scales, individuals, or experimentally induced conditions. While developmental biologists typically intend to demonstrate the independence of organismal parts or processes under experimental conditions, comparative biologists and paleobiologists often seek to identify groups of correlated characters from adult specimens. Evidently, those two approaches differ fundamentally in their design. Although both intend to identify dissociated local developmental factors, in an experimental setting one explicitly controls some of those factors (e.g., through manipulations of the genotype or through direct manipulation of developmental pathways) and observes their effects. Even when factors are not under direct control, they might still be observable, such as levels of growth hormones, alleles of relevant genes, etc. Most classical studies of morphological integration, however, do not have any information on such factors, so that the factors must be inferred from (adult) phenotypic variables.

When the developmental factors are known, their quantitative effects can be estimated by regression analysis, and conclusions about modularity can be drawn from those results directly. Recent approaches based on the effects of quantitative trait loci (QTL) appear promising in that context (Cheverud et al., 1997; Mezey et al., 2000; Klingenberg et al., 2004). Hallgrímsson et al. (2004, 2006) explored morphological integration in different transgenic mice. Badyaev and Foresman (2004) compared patterns of integration in species raised under different environments. Smith (1996), Mitteroecker et al. (2004b, 2005), and Bastir et al. (2006) analyzed regional differences of developmental timing and drew conclusions about regional dissociation and modularity. When the effects of observed factors are discrete traits, such as the presence or absence of specific anatomical features, no statistical inference is needed at all.

When factors are completely unknown, they sometimes can be inferred from phenotypic variables by factor analysis if there is some prior notion of modularity. Among the variety of published approaches to factor analysis, Sewall Wright's (1932) classic method is a simple and effective way to estimate common and local factors based on a specific a priori assumption of modularity. As we have shown, partial least squares analysis yields the same results as Wright's method when the PLS loading vectors are scaled correctly. We have outlined two methods to estimate those scaling factors in order to enable a meaningful joint visualization and interpretation of PLS loadings as common factors.
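The PLS side of this equivalence can be sketched as below, assuming a hypothetical one-common-factor model for two blocks of variables. The singular vectors of the cross-block covariance matrix are the unit-length PLS loading vectors; to read them as Wright-style common-factor loadings they would still need the rescaling discussed above:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical two blocks of variables sharing a single common factor f.
n = 500
f = rng.normal(size=n)
X1 = np.outer(f, [1.0, 0.8, 1.2]) + 0.5 * rng.normal(size=(n, 3))
X2 = np.outer(f, [0.9, 1.1]) + 0.5 * rng.normal(size=(n, 2))

# Two-block PLS: singular value decomposition of the cross-block covariance.
X1c, X2c = X1 - X1.mean(0), X2 - X2.mean(0)
C12 = X1c.T @ X2c / (n - 1)
u, s, vt = np.linalg.svd(C12)
u1, v1 = u[:, 0], vt[0]   # first pair of PLS loading vectors (unit length)

# PLS scores; their correlation is a common measure of integration.
r = np.corrcoef(X1c @ u1, X2c @ v1)[0, 1]
print(abs(r))
```

In this construction the loading vectors recover the direction of the (assumed) common-factor path coefficients within each block, and the score correlation is high because both blocks are driven by the same factor.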

Several measures of integration among specified blocks of variables have been proposed in the literature. In general, all these measures describe different aspects of a multivariate, multiple-factor distribution of variables by one single number; it is difficult to pass from such a single measure back to any relevant biological or statistical property. Table 1 summarizes the assumptions that some of those methods require to be interpretable in terms of factor models. A meaningful comparison of average correlations or covariances between and within tentative modules assumes high average effects of local factors relative to their allometry. A matrix correlation between an empirical correlation matrix and a connectivity matrix equals the sum of all within-module correlations and thus rests on the same assumptions. A Mantel test of this matrix correlation is a measure of the average difference between within-module and between-module correlations, but it is not a valid significance test. Trace correlation and the correlation among PLS scores are consistent with the factor model; i.e., they may reflect modularity, but their actual values depend on sample size and the number of variables. They should thus be used only to compare modules with the same number of variables or groups with the same sample size.
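For concreteness, Hooper's (1959) trace correlation can be computed as below; the simulation with two independent blocks is a hypothetical example chosen to make the sample-size dependence visible:

```python
import numpy as np

rng = np.random.default_rng(3)

def trace_correlation(X1, X2):
    """Hooper's trace correlation between two blocks of variables:
    sqrt(trace(S11^-1 S12 S22^-1 S21) / p), i.e., the square root of the
    average squared canonical correlation, with p the number of X1 columns."""
    X1c, X2c = X1 - X1.mean(0), X2 - X2.mean(0)
    n = len(X1)
    S11 = X1c.T @ X1c / (n - 1)
    S22 = X2c.T @ X2c / (n - 1)
    S12 = X1c.T @ X2c / (n - 1)
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S12.T)
    return np.sqrt(np.trace(M) / X1c.shape[1])

# Two independent 5-variable blocks: the population trace correlation is zero,
# yet the sample estimate is biased upward, and the bias shrinks with n.
small_n = trace_correlation(rng.normal(size=(50, 5)), rng.normal(size=(50, 5)))
large_n = trace_correlation(rng.normal(size=(5000, 5)), rng.normal(size=(5000, 5)))
print(small_n, large_n)
```

Even with no integration at all, the small-sample estimate is substantially inflated, which is why comparisons of this statistic require equal numbers of variables and equal sample sizes.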

Table 1.

Overview of some published morphometric methods addressing morphological integration and modularity. The first column contains the main references, the second column briefly describes the methods, and the third column states their implicit assumptions so that their results can be interpreted in the light of modular factor models.

Publications | Method | Assumptions
Terentjev (1931), Olson and Miller (1951), Leamy (1977), Cheverud (1982) | Clustering standardized variables according to their population correlation coefficient or, equivalently, to their (projected) distance in dual space. | Path coefficients (i.e., effects) of the local factors must be of the same sign as the common factors and high relative to the variation of all coefficients (i.e., allometry of local and common factors).
Cheverud (1982) | Comparison of the average correlation or average squared correlation within the modules to the average (squared) correlation between the modules. | Comparing average correlations requires that the average path coefficients for the local factors differ from zero. Average squared correlations are difficult to interpret as they are functions of both the mean and the variation (allometry) of path coefficients. The same is true for average squared covariances as, e.g., in Klingenberg and Zaklan (2000).
Cheverud (1989) | Matrix correlation between a population correlation matrix and a connectivity matrix that codes the assignment of variables to modules. | The matrix correlation equals the sum of all within-module correlations, and the suggested Mantel's test is a measure of the difference between the average correlation within modules and the average correlation between modules (but no significance test). It thus has the same assumptions as Cheverud (1982).
Magwene (2001) | Clustering variables according to their pairwise partial correlation coefficients conditional on all other variables (elements of the inverted correlation matrix). | Requires that all modules consist of two variables only.
Rohlf and Corti (2000), Bookstein et al. (2003) | Correlation coefficient among partial least squares scores (i.e., singular warp scores) as a measure of integration among two modules. | When the correlation between PLS scores is compared among different modules or different samples, numbers of variables as well as sample sizes need to be equal.
Klingenberg et al. (2003) | Trace correlation as a measure of integration among two modules. | When the trace correlation is compared among different modules or different samples, numbers of variables (p) as well as sample sizes (n) need to be equal and n must be larger than p.
Bookstein et al. (2003) | Joint visualization of PLS vectors (singular warps) as one single-shape deformation. | The vectors have to be scaled appropriately (see main text) before they can be visualized as one shape deformation. The same is true for Wright's (1932) algorithm when applied to two local factors only.
The current paper (based on Wright, 1932, and Bookstein et al., 1985) | Estimation of common and local factors to describe and visualize integration among modules as well as dissociated local shape variability. | Requires a reasonable a priori assumption of the actual modules. Estimation of local factors further assumes additivity of the major developmental effects on the morphological variables.

When neither the factors nor the modules are known, it is impossible to estimate factors of biological relevance with confidence, due to the indeterminacy of factor solutions. It is frequently assumed that groups of correlated characters (ρ-groups or correlation pleiades) equate to modules: groups of characters influenced by dissociated developmental processes. To test this assumption we simulated modular developmental systems by path models consisting of local dissociated factors and assessed the induced covariance structures. We found that groups of correlated variables reflect modules only when the effects of local factors on the measured variables are high relative to the variability of the effects of all factors. The variation of effects, i.e., the variation of path coefficients or factor loadings in a path model, can be interpreted as the allometry induced by that factor. The classic approaches to morphological integration are therefore meaningful only in the near vicinity of isometric growth factors. This assumption is not very realistic, and it is certainly not met for shape variables. In the presence of strongly allometric common factors, the identification of morphological modules based on observed correlations or covariances is very unreliable. For instance, a strong global growth gradient induces a covariance structure that would be regarded as modular by most current methods. It is therefore helpful to remove common factors from the data before interpreting differences in correlations among variables in terms of newly discovered modularity. Modules appear more clearly in this residual covariance structure than in the raw correlations/covariances. The customary practice of removing allometric size from the data prior to statistical analysis is one step along these lines, but other common factors may remain.
The same residual covariance structure can be arrived at by computing partial correlations/covariances conditional on a measure of overall size. However, partial correlations conditional on all other variables, that is, statistics based on the inverse correlation or covariance matrix, are meaningless except when all modules consist of two variables only. When the observed shape variation can be explained sufficiently by common factors alone (the most parsimonious explanation), no further notions of modularity appear useful.
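The distinction can be illustrated with a hypothetical three-variable example in which a single common factor drives all pairwise correlations; partial correlations conditional on all other variables (obtained from the inverse correlation matrix) then shrink markedly even though no modules exist:

```python
import numpy as np

rng = np.random.default_rng(4)

def partial_correlations(X):
    """Pairwise partial correlations of the columns of X, each conditional on
    all remaining variables, from the inverse of the correlation matrix."""
    P = np.linalg.inv(np.corrcoef(X, rowvar=False))
    d = np.sqrt(np.diag(P))
    R = -P / np.outer(d, d)
    np.fill_diagonal(R, 1.0)
    return R

# Three variables driven by one shared factor: pairwise correlations are high,
# but conditioning on the third variable (a proxy for the factor) reduces them.
f = rng.normal(size=2000)
X = np.column_stack([f + 0.5 * rng.normal(size=2000) for _ in range(3)])
pairwise = np.corrcoef(X, rowvar=False)[0, 1]
partial = partial_correlations(X)[0, 1]
print(pairwise, partial)
```

Because each remaining variable acts as a proxy for the common factor, conditioning on it absorbs part of the shared variation; this is why such statistics cannot separate common-factor effects from genuine within-module integration unless every module has only two variables.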

It follows from these thoughts that modularity is a notion about the amount and pattern of residual covariance when the contribution of common factors to the covariances has been accounted for. In a multivariate language, one can thus define modules as blocks of variables that exhibit non-zero within-block covariance matrices even when the covariances due to common factors have been subtracted. In contrast, the cross-block covariances (those among variables from different modules) are near zero after partialing out the common factors (Equation (3)).
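In code, this definition reads as follows; the six-variable model and its known, unit common-factor loadings are illustrative assumptions rather than the paper's simulation:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated model: one common factor acting on all 6 variables plus one
# local factor per assumed module (variables 0-2 and 3-5).
n = 2000
common = rng.normal(size=n)
la, lb = rng.normal(size=n), rng.normal(size=n)
X = (np.outer(common, np.full(6, 1.0))
     + np.column_stack([la, la, la, lb, lb, lb])
     + 0.3 * rng.normal(size=(n, 6)))
S = np.cov(X, rowvar=False)

# Subtract the covariance contribution of the (here, known) common factor:
# for a unit-variance factor with loading vector lam, it is lam lam^T.
lam = np.full(6, 1.0)
S_resid = S - np.outer(lam, lam)

modules = np.repeat([0, 1], 3)
within = np.mean([S_resid[i, j] for i in range(6) for j in range(i)
                  if modules[i] == modules[j]])
between = np.mean([abs(S_resid[i, j]) for i in range(6) for j in range(i)
                   if modules[i] != modules[j]])
print(within, between)
```

After the common-factor contribution is subtracted, the within-module residual covariances remain substantial while the between-module residuals are near zero, which is exactly the pattern the proposed definition of a morphometric module requires.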

To enhance the identification of modules, some authors included further geometric information in their analyses of morphological integration. Zelditch and colleagues (Zelditch et al., 1992; Zelditch and Fink, 1995) considered the spatial scale of shape deformations during ontogeny. Although this still appears to be a promising direction for further research, their actual methodology was based on partial warps (Bookstein, 1989), which, as is now commonly agreed, lack any biological meaning (Rohlf, 2002). Klingenberg and Zaklan (2000) and Klingenberg et al. (2001) applied several techniques specifically to purely asymmetric shape variation (fluctuating asymmetry [FA]; see also Mardia et al., 2000). Because FA is generated by random perturbations of developmental processes that are assumed to arise locally and independently, they argue that integrated FA in two parts of a structure can occur only if the parts share the effects of the same perturbations. This interesting approach certainly permits further insights into the diverse origins of integration but is subject to the same methodological problems as those raised in the present paper.

Cheverud (1982) incorporated the concept of morphological integration into a quantitative genetic framework and computed “G-sets” deriving from genetic correlation matrices, which in turn rely on heritability estimates of the assessed traits. He compared such groups of genetically correlated traits with groups derived from phenotypic and environmental correlations or with functional hypotheses (see also Bailey, 1956; Leamy, 1977; Atchley et al., 1981; Klingenberg and Leamy, 2001). Cheverud's work led to a series of novel theoretical and empirical insights, yet the discussed problems of identifying modules remain the same whether the methods are applied to phenotypic or to genetic correlations.

An important issue that most published studies on morphological integration (including ours) tend to neglect is the geometric nature of morphometric variables. Usually, neither landmark coordinates nor interlandmark distances can be uncorrelated, even when the measured structures are under wholly independent biological regulation. Conversely, for any geometrically integrated shape deformation some variables will not change their values at all. Although these properties do not matter when modeling or estimating patterns of integration, they massively complicate the morphometric identification of modules and the measurement of integration. A promising direction for further research would thus be to account for the specific geometric properties of morphometric variables in the methodology of morphological integration and modularity.

Acknowledgments

We are grateful to Philipp Gunz, Christian Huber, Johann Kim, Dennis Slice, and James Rohlf for thoughtful comments on earlier versions of the manuscript and to Benedikt Hallgrímsson, Norman McLeod, Roderic Page, and one anonymous reviewer for their constructive suggestions during the review process. We also thank Katrin Schaefer, Horst Seidler, and Gerhard Weber for their support. This study was supported by grant GZ 200.033/1-VI/I/2004 of the Austrian Council for Science and Technology to Horst Seidler, EU FP6 Marie Curie Research Training Network MRTN-CT-019564, and a fellowship from the Konrad Lorenz Institute for Evolution and Cognition Research to P.M.

References

Alberch
P.
Gould
S. J.
Oster
G. F.
Wake
D. B.
Size and shape in ontogeny and phylogeny
Paleobiology
 , 
1979
, vol. 
5
 (pg. 
296
-
317
)
Atchley
W.
Rutledge
J.
Cowley
D.
Genetic components of size and shape. II. Multivariate covariance patterns in the rat and mouse skull
Evolution
 , 
1981
, vol. 
35
 (pg. 
1037
-
1055
)
Athreya
S.
Glantz
M. M.
Impact of character correlation and variable groupings on modern human population tree resolution
Am. J. Phys. Anthropol.
 , 
2003
, vol. 
122
 (pg. 
134
-
146
)
Badyaev
A. V.
Foresman
Evolution of morphological integration. I. Functional units channel stress-induced variation in shrew mandibles
Am. Nat.
 , 
2004
, vol. 
163
 (pg. 
868
-
879
)
Bailey
D. W.
A comparison of genetic and environmental influences on the shape of the axis in mice
Genetics
 , 
1956
, vol. 
41
 (pg. 
207
-
222
)
Bartholomew
D. J.
Knott
M.
Latent variable models and factor analysis
 , 
1999
London
Arnold
Bastir
M.
Rosas
A.
Hierarchical nature of morphological integration and modularity in the human posterior face
Am. J. Phys. Anthropol.
 , 
2005
, vol. 
128
 (pg. 
26
-
34
)
Bastir
M.
Rosas
A.
O'Higgins
P.
Craniofacial levels and the morphological maturation of the human skull
J. Anat.
 , 
2006
, vol. 
209
 (pg. 
637
-
654
)
Berg
R. L.
The ecological significance of correlation pleiades
Evolution
 , 
1960
, vol. 
14
 (pg. 
171
-
180
)
Blackith
R. E.
Reyment
R. A.
Multivariate morphometrics
 , 
1971
London, New York
Academic Press
Bock
W.
A critique of “Morphological Integration.”
Evolution
 , 
1960
, vol. 
14
 (pg. 
130
-
132
)
Bolker
J. A.
Modularity in development and why it matters in Evo-Devo
Am. Zool.
 , 
2000
, vol. 
40
 (pg. 
770
-
776
)
Bonner
J. T.
The evolution of complexity by means of natural selection
 , 
1988
Princeton, New Jersey
Princeton University Press
Bookstein
F.
Morphometric tools for landmark data: Geometry and biology
 , 
1991
Cambridge, UK, New York
Cambridge University Press
Bookstein
F.
Chernoff
B.
Elder
R. L.
Humphries
J. J. M.
Smith
G. R.
Strauss
R. E.
Morphometrics in evolutionary biology: The geometry of size and shape change, with examples from fishes
 , 
1985
 
Academy of Sciences of Philadelphia Special Publication 15
Bookstein
F.
Sampson
P. D.
Streissguth
A. P.
Barr
H. M.
Exploiting redundant measurement of dose and developmental outcome: New methods from the behavioral teratology of alcohol
Dev. Psychol.
 , 
1996
, vol. 
32
 (pg. 
404
-
415
)
Bookstein
F. L.
Principal warps: Thin plate splines and the decomposition of deformations
IEEE Trans. Pattern Anal. Machine Intelligence
 , 
1989
, vol. 
11
 (pg. 
567
-
585
)
Bookstein
F. L.
Gunz
P.
Mitteroecker
P.
Prossinger
H.
Schaefer
K.
Seidler
H.
Cranial integration in Homo: Singular warps analysis of the midsagittal plane in ontogeny and evolution
J. Hum. Evol.
 , 
2003
, vol. 
44
 (pg. 
167
-
187
)
Brandon
R. N.
Callebaut
W.
Rasskin-Gutman
D.
Evolutionary modules: Conceptual analyses and empirical hypotheses
Modularity: Understanding the development and evolution of natural complex systems
 , 
2005
Cambridge, Massachusetts
MIT Press
(pg. 
51
-
60
)
Callebaut
W.
Callebaut
W.
Rasskin-Gutman
D.
The ubiquity of modularity
Modularity: Understanding the development and evolution of natural complex systems
 , 
2005
Cambridge, Massachusetts
MIT Press
(pg. 
3
-
28
)
Callebaut
W.
Rasskin-Gutman
D.
Modularity: Understanding the development and evolution of natural complex systems
 , 
2005
Cambridge, Massachusetts
MIT Press
Chernoff
B.
Magwene
P. M.
Morphological integration: Forty years later
Morphological integration
 , 
1999
Chicago
University of Chicago Press
(pg. 
319
-
354
)
Cheverud
J.
Phenotypic, genetic, and environmental morphological integration in the cranium
Evolution
 , 
1982
, vol. 
36
 (pg. 
499
-
516
)
Cheverud
J.
Developmental integration and the evolution of pleiotropy
Am. Zool.
 , 
1996
, vol. 
36
 (pg. 
44
-
50
)
Cheverud
J.
Routman
E. J.
Irschick
D. K.
Pleiotropic effects of individual gene loci on mandibular morphology
Evolution
 , 
1997
, vol. 
51
 (pg. 
2004
-
2014
)
Cheverud
J.
Wagner
G. P.
Dow
M. M.
Methods for the comparative analysis of variation patterns
Syst. Zool.
 , 
1989
, vol. 
38
 (pg. 
201
-
213
)
Cheverud
J. M.
Quantitative genetic and developmental constraints on evolution by selection
J. Theor. Biol.
 , 
1984
, vol. 
110
 (pg. 
155
-
171
)
Dryden
I. L.
Mardia
K. V.
Statistical shape analysis
 , 
1998
New York
John Wiley and Sons
Enlow
D. H.
Moyers
R. E.
Hunter
W. S.
McNamara
J. A.
Jr.
A procedure for the analysis of intrinsic facial form and growth
Am. J. Orthod.
 , 
1969
, vol. 
56
 (pg. 
6
-
23
)
Good
P.
Permutation tests: A practical guide to resampling methods for testing hypotheses
 , 
2000
2nd edition
New York
Springer
Gould
S. J.
Ontogeny and phylogeny
 , 
1977
Cambridge
Harvard University Press
Gould
S. J.
Garwood
R. A.
Levels of integration in mammalian dentitions: An analysis of correlations in Nesophontes micrus (Insectivora) and Oryzomys couesi (Rodentia)
Evolution
 , 
1969
, vol. 
23
 (pg. 
276
-
300
)
Gromko
M. H.
Unpredictability of correlated response to selection: Pleiotropy and sampling interact
Evolution
 , 
1995
, vol. 
49
 (pg. 
685
-
693
)
Gunz
P.
Harvati
K.
The Neanderthal “chignon”: Variation, integration, and homology
J. Hum. Evol.
 , 
2007
, vol. 
52
 (pg. 
262
-
274
)
Haeckel
E.
Generelle Morphologie der Organismen
 , 
1866
Berlin
Georg Reimer
Hallgrímsson
B.
Brown
J. J. Y.
Ford-Hutchinson
A. F.
Sheets
D. S.
Zelditch
M. L.
Jirik
F. R.
The brachymorph mouse and the developmental-genetic basis for canalization and morphological integration
Evol. Dev.
 , 
2006
, vol. 
8
 (pg. 
61
-
73
)
Hallgrímsson
B.
Willmore
K.
Cooper
D. M. L.
Craniofacial variability and modularity in macaques and mice
J. Exp. Zool. Part B
 , 
2004
, vol. 
302
 (pg. 
207
-
225
)
Hallgrímsson
B.
Willmore
K.
Hall
B. K.
Canalization, developmental stability, and morphological integration in primate limbs
Am. J. Phys. Anthropol.
 , 
2002
, vol. 
119
 (pg. 
131
-
158
)
Hansen
T. F.
Is modularity necessary for evolvability? Remarks on the relationship between pleiotropy and evolvability
Biosystems
 , 
2003
, vol. 
69
 (pg. 
83
-
94
)
Harman
H. H.
Modern factor analysis
 , 
1976
Chicago
University of Chicago Press
Hooper
J. W.
Simultaneous equations and canonical correlation theory
Econometrica
 , 
1959
, vol. 
27
 (pg. 
245
-
256
)
Johnson
R. A.
Wichern
D. W.
Applied multivariate statistical analysis
 , 
1998
Upper Saddle River, New Jersey
Prentice-Hall
Klingenberg
C. P.
Badyaev
A. V.
Sowry
S. M.
Beckwith
N. J.
Inferring developmental modularity from morphological integration: Analysis of individual variation and asymmetry in bumblebee wings
Am. Nat.
 , 
2001
, vol. 
157
 (pg. 
11
-
23
)
Klingenberg
C. P.
Leamy
L. J.
Quantitative genetics of geometric shape in the mouse mandible
Evolution
 , 
2001
, vol. 
55
 (pg. 
2342
-
2352
)
Klingenberg
C. P.
Leamy
L. J.
Cheverud
J. M.
Integration and modularity of the quantitative trait locus effects on geometric shape in the mouse mandible
Genetics
 , 
2004
, vol. 
166
 (pg. 
1909
-
1921
)
Klingenberg
C. P.
Mebus
K.
Auffray
J. C.
Developmental integration in a complex morphological structure: How distinct are the modules in the mouse mandible? Evol
Dev.
 , 
2003
, vol. 
5
 (pg. 
522
-
31
)
Klingenberg
C. P.
Zaklan
S. D.
Morphological integration between developmental compartments in the Drosophilawing
Evolution
 , 
2000
, vol. 
54
 (pg. 
1273
-
1285
)
Koloslava
L. D.
On the question of the evolution of correlation pleiades (on the example of two species of Veronica) (in Russian)
J. Gen. Biol.
 , 
1971
, vol. 
32
 (pg. 
409
-
416
)
Lande
R.
Quantitative genetic analysis of multivariate evolution, applied to brain:body size allometry
Evolution
 , 
1979
, vol. 
33
 (pg. 
402
-
416
)
Leamy
L.
Genetic and environmental correlations of morphometric traits in randombred house mice
Evolution
 , 
1977
, vol. 
31
 (pg. 
357
-
369
)
Lewenz
M. A.
Whiteley
M. A.
Data for the problem of evolution in man
A second study of variability and correlation of the hand. Biometika
 , 
1902
, vol. 
1
 (pg. 
345
-
360
)
Lewontin
R. C.
Adaptation
Sci. Am.
 , 
1978
, vol. 
239
 (pg. 
156
-
169
)
Magwene
P. M.
New tools for studying integration and modularity
Evolution
 , 
2001
, vol. 
55
 (pg. 
1734
-
1745
)
Mantel
N.
The detection of disease clustering and a generalized regression approach
Cancer Res.
 , 
1967
, vol. 
27
 (pg. 
209
-
220
)
Marcus
L. F.
Rohlf
F. J.
Bookstein
F. L.
Traditional morphometrics
Proceedings of the Michigan Morphometrics Workshop
 , 
1990
Ann Arbor, Michigan
University of Michigan Museums
(pg. 
77
-
122
)
Mardia
K. V.
Bookstein
F.
Moreton
I.
Statistical assessement of bilateral symmetry of shapes
Biometika
 , 
2000
, vol. 
87
 (pg. 
285
-
300
)
Mardia
K. V.
Kent
J. T.
Bibby
J. M.
Multivariate analysis
 , 
1979
London
Academic Press
Martens
H.
Naes
T.
Multivariate calibration
 , 
1989
Chichester
John Wiley and Sons
Mezey
J. G.
Cheverud
J.
Wagner
G. P.
Is the genotype-phenotype map modular? A statistical approach using mouse quantitative trait loci data
Genetics
 , 
2000
, vol. 
156
 (pg. 
305
-
311
)
Mitteroecker
P.
Evolutionary and developmental morphometrics of the hominoid cranium
 , 
2007
Vienna
Department of Anthropology University of Vienna
 
Ph.D. Thesis
Mitteroecker
P.
Gunz
P.
Bernhard
M.
Schaefer
K.
Bookstein
F.
Comparison of cranial ontogenetic trajectories among great apes and humans
J. Hum. Evol.
 , 
2004
, vol. 
46
 (pg. 
679
-
697
)
Mitteroecker
P.
Gunz
P.
Bookstein
F. L.
Heterochrony and geometric morphometrics: A comparison of cranial growth in Pan paniscus versus Pan troglodytes
Evol. Dev.
 , 
2005
, vol. 
7
 (pg. 
244
-
258
)
Mitteroecker
P.
Gunz
P.
Weber
G. W.
Bookstein
F. L.
Regional dissociated heterochrony in multivariate analysis
Ann. Anat.-Anat. Anz.
 , 
2004
, vol. 
186
 (pg. 
463
-
470
)
Müller
G. B.
Olsson
L.
Hall
B. K.
Olson
W.
Epigenesis and epigenetics
Keywords and concepts in evolutionary developmental biology
 , 
2003
Cambridge, Massachusetts
Harvard University Press
(pg. 
114
-
123
)
Murren
C. J.
Phenotypic integration in plants
Plant Species Biol.
 , 
2002
, vol. 
17
 (pg. 
89
-
99
)
Needham
J.
On the dissociability of the fundamental processes in ontogenesis
Biol. Rev.
 , 
1933
, vol. 
8
 (pg. 
180
-
223
)
O'Keefe
F. R.
Wagner
P. J.
Inferring and testing hypotheses of cladistic character dependence by using character compatibility
Syst. Biology
 , 
2001
, vol. 
50
 (pg. 
657
-
675
)
Olson
E. C.
Miller
R. L.
A mathematical model applied to a study of the evolution of species
Evolution
 , 
1951
, vol. 
5
 (pg. 
325
-
338
)
Olson
E. C.
Miller
R. L.
Morphological integration
 , 
1958
Chicago
University of Chicago Press
Pigliucci
M.
Genetic variance-covariance matrices: A critique of the evolutionary quantitative genetics research program
Biol. Philos.
 , 
2006
, vol. 
21
 (pg. 
1
-
23
)
Raff
R.
The shape of life: Genes, development, and the evolution of animal form
 , 
1996
Chicago
University of Chicago Press
Reyment
R. A.
Jöreskog
K. G.
Applied factor analysis in the natural sciences
 , 
1993
Cambridge, UK
Cambridge University Press
Riedl
R. J.
Order in living organisms
 , 
1978
New York
John Wiley and Sons
Rohlf
F. J.
Forey
P.
Macleod
N.
Geometric morphometrics and phylogeny
Morphology, shape and phylogenetics
 , 
2002
London
Taylor & Francis
(pg. 
175
-
193
)
Rohlf
F. J.
Corti
M.
The use of two-block partial least-squares to study covariation in shape
Syst. Biol.
 , 
2000
, vol. 
49
 (pg. 
740
-
753
)
Rostova
N. S.
Comparative analysis of correlation structure (in Russian)
Analysis of biological systems by mathematical methods
 , 
1985
Leningrad
Leningrad University
(pg. 
37
-
54
)
Roth
V. L.
Cranial Integration in the Sciuridae
Amer. Zool.
 , 
1996
, vol. 
36
 (pg. 
14
-
23
)
Roux
W.
The problems, methods, and scope of developmental mechanics
Biological lsectures at the Marine Biology Laboratory
 , 
1894
Ginn, Boston
Woods Hole
(pg. 
149
-
190
)
Sampson, P. D., A. P. Streissguth, H. M. Barr, and F. L. Bookstein. 1989. Neurobehavioral effects of prenatal alcohol: Part II. Partial least squares analysis. Neurotoxicol. Teratol. 11:477-491.
Sawin, P. B., R. R. Fox, and H. B. Latimer. 1970. Morphogenetic studies of the rabbit. XLI. Gradients of correlation in the architecture of morphology. Am. J. Anat. 128:137-145.
Schlosser, G., and G. P. Wagner. 2004. Modularity in development and evolution. University of Chicago Press, Chicago.
Simon, H. A. 1962. The architecture of complexity. Proc. Am. Phil. Soc. 106:467-482.
Simon, H. A. 1969. The sciences of the artificial. MIT Press, Cambridge, Massachusetts.
Smith, K. K. 1996. Integration of craniofacial structures during development in mammals. Am. Zool. 36:70-79.
Sokal, R. R. 1961. Variation and covariation of characters of alate Pemphigus populi-transversus in eastern North America. Evolution 16:227-245.
Strait, D. S. 2001. Integration, phylogeny, and the hominid cranial base. Am. J. Phys. Anthropol. 114:273-297.
Terentjev, P. V. 1931. Biometrische Untersuchungen über die morphologischen Merkmale von Rana ridibunda Pall. (Amphibia, Salientia). Biometrika 23:23-51.
Thurstone, L. L. 1947. Multiple-factor analysis. University of Chicago Press, Chicago.
Van Valen, L. 1965. The study of morphological integration. Evolution 19:347-349.
Von Dassow, G., and E. Munro. 1999. Modularity in animal development and evolution: Elements of a conceptual framework for EvoDevo. J. Exp. Zool. B 285:307-325.
Wagner, G. P. 1996. Homologues, natural kinds and the evolution of modularity. Am. Zool. 36:36-43.
Wagner, G. P., and L. Altenberg. 1996. Complex adaptations and the evolution of evolvability. Evolution 50:967-976.
Whiteley, M. A., and K. Pearson. 1899. Data for the problem of evolution in man. I. A first study of the variability and correlation of the hand. Proc. R. Soc. 65:126-151.
Winther, R. G. 2001. Varieties of modules: Kinds, levels, origins, and behaviors. J. Exp. Zool. B 291:116-129.
Wold, H. 1966. Estimation of principal components and related models by iterative least squares. Pages 391-420 in Multivariate analysis (P. R. Krishnaiah, ed.). Academic Press, New York.
Wright, S. 1918. On the nature of size factors. Genetics 3:367-374.
Wright, S. 1921. Correlation and causation. J. Agric. Res. 20:557-585.
Wright, S. 1932. General, group and special size factors. Genetics 17:603-619.
Wright, S. 1934. The method of path coefficients. Ann. Math. Stat. 5:161-215.
Zelditch, M. L. 1987. Evaluating models of developmental integration in the laboratory rat using confirmatory factor analysis. Syst. Zool. 36:368-380.
Zelditch, M. L. 1988. Ontogenetic variation in patterns of phenotypic integration in the laboratory rat. Evolution 42:28-41.
Zelditch, M. L. 2001. Beyond heterochrony: The evolution of development. Wiley-Liss, New York.
Zelditch, M. L., F. L. Bookstein, and B. Lundrigan. 1992. Ontogeny of integrated skull growth in the cotton rat Sigmodon fulviventer. Evolution 46:1164-1180.
Zelditch, M. L., and W. L. Fink. 1995. Allometry and developmental integration of body growth in the piranha, Pygocentrus nattereri (Teleostei: Ostariophysi). J. Morph. 223:341-355.
Zelditch, M. L., and W. L. Fink. 1996. Heterochrony and heterotopy: Stability and innovation in the evolution of form. Paleobiology 22:241-254.