Chorotypes—statistically significant groups of coincident distribution areas—constitute biogeographic units that are fuzzy by nature. This quality has been referred to in the literature but has not been analyzed in depth or methodologically developed. The present work redefines chorotypes as fuzzy sets from a pragmatic perspective and basically focuses on the methodological and interpretative implications of this approach. The amphibian fauna in the Iberian Peninsula was used as an example to explore the fuzzy nature of chorotypes. The method on which this article is based is a widely used technique to define chorotypes. This method involves the fuzziness that is inherent to the identification between degree of similarity and degree of membership and includes a probabilistic analysis of the classification for the objective delimitation of chorotypes. The main innovation of this paper is a procedure to analyze chorotypes as fuzzy biogeographic units. A set of fuzzy parameters to deal with the biogeographic interpretation of fuzzy chorotypes is also described. A computer program has been developed and is freely available. History may be related to the degree of fuzziness of chorotypes. In our example, with amphibian distributions in Iberia, less fuzzy chorotypes could have a historical explanation, and the internal fuzziness of chorotypes increases with their distance to hypothetical Pleistocene refugia.
Baroni-Urbani et al. (1978) defined a “chorotype” as a distribution pattern, followed by one or several species, which can be operatively recognized within an area. Similar definitions are found in Vargas et al. (1997) and in Zunino and Zullini (2003). This concept has also been named “distributional type” (Dzwonko and Kornaś 1978), “biotic element” (Birks 1987, Hausdorf 2002), “chorological category” (Zunino and Zullini 2003), and “chorological type” (Ferrer-Castan and Vetaas 2003, Ojeda et al. 2005). In recent times, the term chorotype has also been used to name a typical region for a “biochore” (“group of similar biotopes, the largest division of animal and plant ecosystems”): According to this, a chorotype would be a biochore of lower rank that is chosen in the same way as a type species is designated for a genus in taxonomy (Westermann 2000). The subject concerning this work deals with chorotypes as defined by Baroni-Urbani et al. (1978).
Chorotypes may result from ecological causes, that is, differential responses to environmental conditions shared by several species, or from historical causes, that is, past events that restricted or biased certain groups of species to different parts of the Earth (Real, Olivero, et al. 2008). If chorotypes are detected, then it is not necessary to invoke a different cause for explaining each species distribution, but some factors may be responsible for the distributions shared by each group of species, and the overall environmental interpretation may be more comprehensible (Márquez et al. 1997). Because of this, chorotypes enhance the search for phenomena that influence the spatial variation of biodiversity from the perspective of a global process in which multiple species are involved with different degrees of interrelationship (Real et al. 2002). In this regard, numerous studies have been published that have objectively detected and analyzed chorotypes worldwide, for example, for algae (Báez et al. 2005), vascular plants (Birks 1976, Myklestad and Birks 1993, Márquez et al. 1997, Gómez-González et al. 2004, Teneb et al. 2004, Finnie et al. 2007), insects (Baroni-Urbani and Collingwood 1976; Baroni-Urbani et al. 1978, Hausdorf and Henning 2003), fish (Carmona et al. 1999), amphibians (Vargas and Real 1997, Flores et al. 2004), reptiles (Real et al. 1997, Vargas and Real 1997, Real, Márquez, et al. 2008), birds (Muñoz et al. 2003, Real, Márquez, et al. 2008, Real, Olivero, et al. 2008), and mammals (Vargas et al. 1997, Sans-Fuentes and Ventura 2000, Real et al. 2003, Real, Márquez, et al. 2008).
In a new approach, chorotypes have been recently used for the biogeographic deconstruction of biodiversity. Marquet et al. (2004) proposed the deconstruction of biodiversity patterns (as a “turning to their roots” or a “disaggregation” to make apparent what is hidden in them) according to the attributes of species, as a way of gaining a better insight into the causes of biodiversity trends. Huston (1994) also suggested that it is useful to decompose biological diversity into components with consistent biogeographic patterns and then analyze the processes that influence each of them. Chorotypes, as consistent biogeographic responses among subsets of species, aided Real, Olivero, et al. (2008) in tracking down the different spatial responses of waterbird species richness in Europe to environmental energy, after Bárcena et al. (2004) had detected, for this group, a general trend shaped by this factor (for a deconstruction of vascular plant species richness in Europe according to chorotypes, see Finnie et al. 2007). Chorotypes have also been proposed by Thuiller et al. (2005) as a way to explore the relationships between ecological and distributional properties of species and their projected sensitivity to climate change.
All environmental factors may affect all species, although not necessarily to the same extent. If a chorotype is interpreted as a biogeographic response to an environmental factor, then the degree to which the distribution of a species belongs to such a chorotype could be a measure of the degree to which this species is affected by the factor. This is an approach that can clearly benefit from the tools of fuzzy logic.
A fuzzy set is a class of objects with a continuum of grades of membership, so that such a set is characterized by a membership function that assigns to each object a real number in the interval [0, 1] (Zadeh 1965). The difference between the logic behind fuzzy sets and the traditional probabilistic approach is that in probability something is either true or false, whereas in fuzzy set theory the membership describes the degree to which an element meets the definition of the set: It is neither absolutely true nor false but true to some extent. Thus, the fuzzy membership values indicate that each class exists for each object to some degree (Brown 1998). The fuzzy sets approach recognizes that, in some cases, none of the classes is appropriate for a certain object, whereas other objects may be reasonably classified into two or more categories (Townsend and Walsh 2001). As chorotypes may partly overlap, each species could be defined as having a degree of membership in the different chorotypes described. In this sense, the distributions included in a chorotype constitute a group that is fuzzy by nature, so that the principles and rules of fuzzy logic may be easily applied to them (Real, Olivero, et al. 2008).
The fuzzy logic approach has become frequent in studies belonging to different biological disciplines in the last two decades. Fuzzy sets have been applied to taxonomy (e.g., Pappas 2006), where even the fuzzy nature of taxa, including species, has been postulated (Hall 1997, Turner 1999). Within the environmental sciences, fuzzy sets have formed the core of studies aimed at determining priority areas for conservation (e.g., Stoms et al. 2002, Estrada et al. 2008) and have provided tools for the management of ecosystems (e.g., Freyer et al. 2000, Adriaenssens et al. 2004, Prato 2005). In biogeography, two main fields have strongly benefited from fuzzy logic: land classification according to biotic parameters and investigation into the spatial distribution of species. Fuzzy classifications of land have been used to identify plant associations (e.g., Brown 1998, Olano et al. 1998), identify animal communities (e.g., Tepavčevic and Vujić 1996, Eyre et al. 2003), and delimit ecosystems using remote sensing imagery (e.g., Townsend and Walsh 2001, Arnot et al. 2004). On the other hand, distribution modeling with methods based on fuzzy logic has been applied to animals and plants (e.g., Robertson et al. 2004, Gevrey et al. 2006, Van der Broekhoven et al. 2006, Real et al. 2009). Chorotypes are in an intermediate position between biogeographic land area classification and species distribution modeling. Chorotypes are the result of classifying species distributions according to how they cover the same area and are defined as types of distributions that can be submitted to environmental modeling in the same way as species distributions (e.g., Real, Olivero, et al. 2008).
A widely used technique for the detection of chorotypes is described in Márquez et al. (1997) (see also Real et al. 2002, Real, Olivero, et al. 2008). It is based on a quantitative classification according to similarities between objects (in this case species distributions), which involves the fuzziness that is inherent in the identification between degree of similarity and degree of membership (Salski 2007). A later probabilistic analysis of the classification, which follows the approach of McCoy et al. (1986), permits the objective detection of chorotypes. The results of this method can be, however, expressed in possibilistic nonprobabilistic ways, such that the degree to which a certain species belongs to every chorotype is expressed, thus allowing a fuzzy approach to the classification results. Our objectives are 1) to improve the method to detect chorotypes with a procedure to handle chorotypes as fuzzy biogeographic units and 2) to explore the interpretative possibilities and implications of this new approach with an example based on the distribution of amphibians in the Iberian Peninsula.
MATERIAL AND METHODS
Species and Study Area
Amphibian distribution data were taken from the Atlas and Red Book of Amphibians and Reptiles of Spain (Pleguezuelos et al. 2004) and from the Atlas of Amphibians and Reptiles of Portugal (Loureiro et al. 2008). The 27 species inhabiting the Iberian Peninsula were used (see Fig. 1), all of which have been traditionally considered indigenous species. Recent phylogeographic studies, however, state that the current Iberian distribution of Discoglossus pictus and the northeastern populations of Hyla meridionalis are the result of a modern dispersion from France, where they were probably anthropogenically introduced at the end of the 19th century (Martínez-Solano 2004, Recuero et al. 2007). These populations have been considered in our analyses because their century-long occurrence has resulted in populations that are now integrated into Iberian ecosystems. Nevertheless, the northeastern range of H. meridionalis is currently segregated from its southwestern range in Iberia, and both ranges have been analyzed as different distributions. This choice has been made because a common history and similar environmental involvement cannot be assumed for both ranges, whereas keeping them together could create significant noise in the search for common biogeographic patterns among amphibians from Iberia.
The exploration of chorotypes was made on 50×50-km UTM squares. All squares at this resolution were surveyed to build the atlases (Pleguezuelos et al. 2004, Loureiro et al. 2008), and so the problem of false absences is minimized. For discussion purposes, chorotypes were also explored separately in Portugal and in Spain.
Real et al. (1992) proposed a probabilistic procedure for recognizing biotic boundaries based on species distributions. This procedure was adapted in Márquez et al. (1997) for its use in detecting chorotypes and was later modified in Muñoz et al. (2003). This method allows the simultaneous detection of continuous and discrete biogeographic patterns that could coexist and even partially overlap within the study area, which permits the interpretation of chorotypes as fuzzy sets.
A matrix of geographic similarities between the distributions of each pair of amphibian species in Iberia was obtained using the Baroni-Urbani and Buser (1976) index:

$$S_{ab} = \frac{\sqrt{C \times D} + C}{\sqrt{C \times D} + A + B - C}$$

where A is the number of grids in which the species a is present, B is the number of grids in which the species b is present, C is the number of grids where both species a and b are present, and D is the number of grids from which both species a and b are absent. A similarity value 1 means completely coinciding distributions, and 0 means completely noncoinciding distributions. This coefficient takes into account shared absences, that is, grids in the Iberian Peninsula outside the distribution area of both species, and so the similarities were considered in relation to the study area and not only in relation to the two combined distribution areas (Real et al. 1992). This characteristic makes the similarity value increase with the study area if no new presences are added, which is in tune with the fuzzy concept of similarity. Shared absences are important because they could be due to either ecological or historical reasons that should be taken into account (Baroni-Urbani and Buser 1976). However, this index gives more importance to shared presences, and the possibility that two distributions are considered similar only because of their shared absences is avoided by multiplying shared absences by shared presences. In addition, Baroni-Urbani and Buser's (1976) index has a table of critical values (Baroni-Urbani and Buser 1976), as required by the method described below to search for clusters.
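As a rough illustration of the index, the following Python sketch computes it for two presence/absence vectors over the same grid squares. It assumes the commonly cited form of the Baroni-Urbani and Buser coefficient, (√(C·D) + C) / (√(C·D) + A + B − C); the function name and the toy vectors are hypothetical and are not taken from RMACOQUI.

```python
from math import sqrt

def baroni_urbani_buser(a, b):
    """Similarity between two presence/absence vectors (1 = present, 0 = absent)."""
    if len(a) != len(b):
        raise ValueError("both distributions must cover the same grid squares")
    A = sum(1 for x in a if x)                            # squares with species a
    B = sum(1 for y in b if y)                            # squares with species b
    C = sum(1 for x, y in zip(a, b) if x and y)           # shared presences
    D = sum(1 for x, y in zip(a, b) if not x and not y)   # shared absences
    denominator = sqrt(C * D) + A + B - C
    # Two empty distributions give a zero denominator; treat them as dissimilar.
    return (sqrt(C * D) + C) / denominator if denominator else 0.0

# Toy distributions over six grid squares (hypothetical data).
species_a = [1, 1, 1, 0, 0, 0]
species_b = [1, 1, 0, 0, 0, 0]
similarity = baroni_urbani_buser(species_a, species_b)

# Enlarging the study area with squares where both species are absent
# raises the similarity, as described in the text.
similarity_larger_area = baroni_urbani_buser(species_a + [0] * 4, species_b + [0] * 4)
```

Because shared absences enter only through the product C × D, two distributions with no shared presences score 0 regardless of how many squares they are jointly absent from.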
An important requisite of chorotypes is to maximize similarity within groups. Because of this, an agglomerative method of classification, the unweighted pair-group method using arithmetic averages (UPGMA), was used. Among the phenetic classification methods, average linkage produces less distortion in relation to the original similarities than complete or single linkages, and UPGMA produces less distortion than other average linkages like WPGMA and UPGMC (Sneath and Sokal 1973). Kreft and Jetz (2010) found that UPGMA was consistently the best-performing clustering algorithm for a biogeographical classification across a set of different methods including seven agglomerative hierarchical clustering methods (UPGMA, UPGMC, WPGMA, WPGMC, Ward's method, single linkage, and complete linkage), an iterative algorithm (neighbor-joining trees), and a divisive hierarchical method (DIANA algorithm). Our result was expressed as a dendrogram.
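The averaging rule behind UPGMA can be sketched in a few lines of plain Python. This is a minimal illustration on a toy similarity matrix, not the implementation used in this study; the function name `upgma` and the format of the merge history are assumptions made for the example.

```python
def upgma(sim):
    """Agglomerative UPGMA on a symmetric similarity matrix (list of lists).

    Returns the merge history as (left_members, right_members, similarity).
    """
    clusters = {i: (i,) for i in range(len(sim))}
    # Average similarity between current clusters, keyed by a pair of cluster ids.
    avg = {frozenset((i, j)): sim[i][j]
           for i in clusters for j in clusters if i < j}
    merges = []
    next_id = len(sim)
    while len(clusters) > 1:
        pair = max(avg, key=avg.get)          # most similar pair of clusters
        i, j = sorted(pair)
        s = avg.pop(pair)
        left, right = clusters.pop(i), clusters.pop(j)
        merges.append((left, right, s))
        # Size-weighted mean, so every original pairwise similarity counts equally.
        for k in clusters:
            s_ik = avg.pop(frozenset((i, k)))
            s_jk = avg.pop(frozenset((j, k)))
            avg[frozenset((next_id, k))] = (
                (len(left) * s_ik + len(right) * s_jk) / (len(left) + len(right)))
        clusters[next_id] = left + right
        next_id += 1
    return merges

# Toy similarity matrix for four distributions (hypothetical values).
toy_sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
history = upgma(toy_sim)
```

Each entry of `history` corresponds to one node of the dendrogram: the two branches that merge and the average similarity at which they do so.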
As the null model is that similarities between distributions are not different from those expected at random, we used the table of critical values presented in Baroni-Urbani and Buser (1976) to perform exact randomization tests (Sokal and Rohlf 1981, p. 788) where the observed similarity values were compared with all the possible outcomes (see also Real and Vargas 1996). Values of the similarity index higher than 95% of outcomes were considered “significant similarities” (+), values lower than 95% of outcomes were “significant dissimilarities” ( − ), and the rest were considered values expected at random (0). These “significances” do not reflect the statistical probability of committing a type II inference error, that is, the probability of making a false inference about a population on the basis of what is found in a sample, but the exact mathematic probability of finding a lower or higher similarity value in the population.
To detect chorotypes, we examined the dendrogram to identify branches that exhibited significant positive within-branch shared distributions and which were significantly disjoint from adjoining branches. For each merging of branches in the dendrogram, let A be the set of species in the left branch, and let B be the set of species in the right branch. The Cartesian product A×A represents the pairwise comparison of every species in set A with every other species in set A. If there are $n_A$ species in set A, there will be $(n_A^2 - n_A)/2$ comparisons in the product because the product would be symmetrical and we are only interested in one triangle. Similarly for set B, the product B×B compares every species in set B with every other species in set B, and there are $(n_B^2 - n_B)/2$ pairwise comparisons. Finally, let the product A×B represent the comparison of every species in set A with every species in set B; there are $n_A \times n_B$ comparisons.
If set A represents a chorotype, we would expect to find primarily significant similarities in the product A×A, with relatively few significant dissimilarities. If set A is distinct from set B, we would expect to find primarily significant dissimilarities or similarities expected at random in product A×B, with relatively few significant similarities.
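The bookkeeping of comparisons described above can be sketched as follows, assuming that the significance classes are stored in a symmetric matrix of "+", "-", and "0" symbols (a hypothetical layout chosen for the example). The sketch only counts the comparisons in A×A, B×B, and A×B; it does not compute any index derived from them.

```python
from collections import Counter
from itertools import combinations, product

def tally_comparisons(sig, set_a, set_b):
    """Count '+', '-' and '0' comparisons within and between two branches."""
    within_a = Counter(sig[i][j] for i, j in combinations(set_a, 2))  # one triangle of AxA
    within_b = Counter(sig[i][j] for i, j in combinations(set_b, 2))  # one triangle of BxB
    between = Counter(sig[i][j] for i, j in product(set_a, set_b))    # all of AxB
    return within_a, within_b, between

# Toy significance matrix for five distributions (hypothetical values).
sig = [
    ["+", "+", "+", "-", "0"],
    ["+", "+", "0", "-", "-"],
    ["+", "0", "+", "0", "-"],
    ["-", "-", "0", "+", "+"],
    ["0", "-", "-", "+", "+"],
]
A, B = [0, 1, 2], [3, 4]
within_a, within_b, between = tally_comparisons(sig, A, B)
```

In this toy case branch A behaves as expected for a chorotype candidate: its internal comparisons are mostly "+", and no "+" appears in A×B.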
represent the number of species in set A or B that have at least one significant similarity with another species in the other set.
where dp is the predominance of significant similarities within set A, dm is the predominance of significant dissimilarities in set A, and dp′ is the predominance of significant similarities in product A×B.
where IH is the index of internal homogeneity and distinctness. Division by rescales the IH to [ − 1, 1] (without such a rescaling, IH is equal to DW(A×A) in Márquez et al. 1997). In this way, we computed the IH values for every branch of the dendrogram. A cluster was considered “chorotypical” if: 1) IH = 1, that is, all the criteria were completely met or else 2) IH was positive, higher than those of the other clusters including the distributions involved, and statistically significant. Our statistical approach is based on challenging the null hypothesis that + similarities between the distributions grouped in the tested cluster are not more frequent than + similarities between such cluster and the most similar branch of the dendrogram, using a G-test of independence (Sokal and Rohlf 1981), which only is applicable if the cluster includes at least three species and A×B includes at least six comparisons. Distributions that did not fulfil either of these conditions did not belong to any chorotypical cluster.
The abovementioned procedure drastically reduces the number of clusters tested with the G-test of independence (2 out of 55 branches in our case). Nevertheless, we dealt with the probability of erroneously rejecting one of the true null hypotheses due to the familywise error rate involved in multiple hypotheses testing (Benjamini and Hochberg 1995, Hausdorf and Henning 2003) by controlling the false discovery rate (FDR) using the procedure proposed by Benjamini and Hochberg (1995) under an FDR value of q = 0.05. This procedure orders the tested clusters by decreasing significance (increasing P value), with i being the position of each cluster in this ordered list, and accepts only the clusters up to the highest position i whose P value is lower than i × q/V, where V is the total number of tested clusters.
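The step-up rule can be sketched in Python as follows. This is a generic illustration of the Benjamini and Hochberg (1995) procedure with hypothetical P values, not output from our analysis; it uses the standard "less than or equal to" comparison, which differs from a strict inequality only when a P value falls exactly on the threshold.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of the tests accepted under FDR level q."""
    V = len(p_values)
    # Positions of the tests ordered by increasing P value.
    order = sorted(range(V), key=lambda k: p_values[k])
    last = 0
    for i, k in enumerate(order, start=1):
        if p_values[k] <= i * q / V:
            last = i                      # highest position meeting the threshold
    # Every test up to that position is accepted, even if some of them
    # individually exceeded their own threshold (step-up behaviour).
    return sorted(order[:last])

# Hypothetical P values for four tested clusters.
accepted = benjamini_hochberg([0.03, 0.02, 0.035, 0.9], q=0.05)
```

In this toy example the two smallest P values exceed their own thresholds, yet they are accepted because the third-ranked P value (0.035) is below 3 × 0.05/4.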
The explained criteria identify the least fuzzy chorotypical clusters, as all the conditions to be a chorotype are maximized. However, applying the concepts of fuzzy set theory, IH represents the degree to which a cluster meets the definition of the chorotypical cluster, which is neither absolutely true nor false, but true to some extent. Thus, every cluster with IH values higher than 0 fulfils to a certain degree the conditions to be a chorotypical cluster. Certain fuzzy chorotypical clusters may be nested within others, although it is always possible to identify the nonnested fuzzy chorotypes. Nevertheless, we applied the following fuzzy parameters to the least fuzzy chorotypes identified as mentioned above, although they may be equally applied to all fuzzy chorotypical clusters.
The average of the similarities (Sij) between a certain species distribution (di) and all n distributions (dj) included in a chorotypical cluster (ch) was considered a measure of the degree of membership of di in such a chorotype:

$$\mu_{ch}(d_i) = \frac{1}{n} \sum_{j=1}^{n} S_{ij}$$
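A minimal Python sketch of this averaging is given below. The similarity matrix is a toy example, and the function name is hypothetical; the sketch also assumes that, when di is itself a member of the cluster, its self-similarity (1) enters the average, which is one possible reading of the definition.

```python
def membership_degree(sim, i, cluster):
    """Mean similarity between distribution i and the n distributions in the cluster."""
    return sum(sim[i][j] for j in cluster) / len(cluster)

# Toy similarity matrix for four distributions (hypothetical values).
toy_sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
cluster = [0, 1]  # indices of the distributions in one chorotypical cluster

# Degree of membership of every distribution, inside or outside the cluster.
memberships = [membership_degree(toy_sim, i, cluster) for i in range(len(toy_sim))]
```

Distributions outside the cluster still receive a nonzero degree of membership whenever they share some area with its members, which is what makes the chorotype a fuzzy set.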
Starting with this measure, various fuzzy parameters were computed (after Zadeh 1965, Dubois and Prade 1980; Kosko 1986; Kuncheva 2001) to describe the fuzzy nature of every chorotype. The importance of a chorotype in the context of all the species distributions considered depends on how many species have a high degree of membership in it. The parameter that measures the importance of a fuzzy chorotype is cardinality. Relative cardinality and height, which are, respectively, the average and the maximum degree of membership observed in a fuzzy chorotype, provide a context to evaluate the membership of any particular distribution. N being the total number of species distributions considered in the study and μch(di) the degree of membership of distribution di in chorotype ch, these parameters may be expressed as follows:

$$\mathrm{Card}(ch) = \sum_{i=1}^{N} \mu_{ch}(d_i)$$

$$\mathrm{RelCard}(ch) = \frac{\mathrm{Card}(ch)}{N}$$

$$\mathrm{Height}(ch) = \max_{i} \mu_{ch}(d_i)$$
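These three parameters follow directly from their verbal definitions (sum, mean, and maximum of the degrees of membership). The Python sketch below uses a hypothetical membership vector; the function names are illustrative only.

```python
def cardinality(mu):
    """Fuzzy cardinality: sum of the degrees of membership."""
    return sum(mu)

def relative_cardinality(mu):
    """Average degree of membership over all N distributions."""
    return sum(mu) / len(mu)

def height(mu):
    """Maximum degree of membership observed in the fuzzy set."""
    return max(mu)

# Hypothetical degrees of membership of four distributions in one chorotype.
mu_ch = [0.9, 0.8, 0.3, 0.1]

card = cardinality(mu_ch)
rel_card = relative_cardinality(mu_ch)
h = height(mu_ch)
# Index of the distribution whose membership equals the height.
most_representative = mu_ch.index(h)
```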
The species whose membership degree in a chorotype coincided with its height may be considered the species most representative of the chorotype. The union of two chorotypes ch1 ∪ ch2 is the smallest fuzzy set containing both ch1 and ch2, whereas their intersection ch1 ∩ ch2 is the largest fuzzy set which is contained in both ch1 and ch2 (Zadeh 1965). The cardinality of the union between two chorotypes measures how much, taken together, they comprise the species degrees of membership. The cardinality of the intersection indicates pairwise coincidences in the species membership, and thus, how fuzzy the limit between these two chorotypes is. The degrees of membership of a distribution di in the union (ch1 ∪ ch2) and in the intersection (ch1 ∩ ch2) of two chorotypes are

$$\mu_{ch_1 \cup ch_2}(d_i) = \max[\mu_{ch_1}(d_i), \mu_{ch_2}(d_i)]$$

$$\mu_{ch_1 \cap ch_2}(d_i) = \min[\mu_{ch_1}(d_i), \mu_{ch_2}(d_i)]$$
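The fuzzy overlap, in reality a fuzzy application of Jaccard's (1901) similarity index, measures the proportion of the union that is in the intersection:

$$\mathrm{Overlap}(ch_1, ch_2) = \frac{\mathrm{Card}(ch_1 \cap ch_2)}{\mathrm{Card}(ch_1 \cup ch_2)}$$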
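A modification of this overlap provides us with a measure of how much a chorotype is included in another:

$$\mathrm{Inclusion}(ch_1 \text{ in } ch_2) = \frac{\mathrm{Card}(ch_1 \cap ch_2)}{\mathrm{Card}(ch_1)}$$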
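The relationships between two fuzzy chorotypes can be sketched in Python as follows. The union and intersection use the maximum and minimum operators of Zadeh (1965), and the overlap is the fuzzy Jaccard ratio described above. The inclusion measure is written here as the cardinality of the intersection divided by the cardinality of the included chorotype, which is an assumption based on the usual fuzzy subsethood measure; the membership vectors are hypothetical.

```python
def fuzzy_union(mu1, mu2):
    """Degree of membership in the union: elementwise maximum."""
    return [max(a, b) for a, b in zip(mu1, mu2)]

def fuzzy_intersection(mu1, mu2):
    """Degree of membership in the intersection: elementwise minimum."""
    return [min(a, b) for a, b in zip(mu1, mu2)]

def fuzzy_overlap(mu1, mu2):
    """Proportion of the union that is in the intersection (fuzzy Jaccard)."""
    return sum(fuzzy_intersection(mu1, mu2)) / sum(fuzzy_union(mu1, mu2))

def fuzzy_inclusion(mu1, mu2):
    """Degree to which the first fuzzy set is included in the second (assumed form)."""
    return sum(fuzzy_intersection(mu1, mu2)) / sum(mu1)

# Hypothetical degrees of membership of four distributions in two chorotypes.
mu_ch1 = [0.9, 0.8, 0.3, 0.1]
mu_ch2 = [0.2, 0.4, 0.7, 0.6]

union_card = sum(fuzzy_union(mu_ch1, mu_ch2))
intersection_card = sum(fuzzy_intersection(mu_ch1, mu_ch2))
overlap = fuzzy_overlap(mu_ch1, mu_ch2)
inclusion_1_in_2 = fuzzy_inclusion(mu_ch1, mu_ch2)
inclusion_2_in_1 = fuzzy_inclusion(mu_ch2, mu_ch1)
```

Unlike the overlap, the inclusion is not symmetric: the smaller fuzzy set is always at least as included in the larger one as the other way round.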
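The fuzzy entropy of a chorotype is the degree of similarity between “belonging-and-not-belonging” and “belonging-or-not-belonging” to the chorotype (Kosko 1986) and is a measure of fuzziness:

$$\mathrm{Entropy}(ch) = \frac{\mathrm{Card}(ch \cap ch^{c})}{\mathrm{Card}(ch \cup ch^{c})}$$

where $ch^{c}$ is the complement of ch, in which every distribution di has a degree of membership $1 - \mu_{ch}(d_i)$.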
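Kosko's (1986) entropy can be sketched in Python by combining a fuzzy set with its complement, whose degrees of membership are one minus those of the set. The function name and the membership vectors below are hypothetical.

```python
def fuzzy_entropy(mu):
    """Ratio between 'belonging-and-not-belonging' and 'belonging-or-not-belonging'."""
    complement = [1.0 - m for m in mu]
    both = sum(min(m, c) for m, c in zip(mu, complement))    # ch AND not-ch
    either = sum(max(m, c) for m, c in zip(mu, complement))  # ch OR not-ch
    return both / either

crisp = fuzzy_entropy([1.0, 1.0, 0.0, 0.0])      # no ambiguity at all
maximal = fuzzy_entropy([0.5, 0.5, 0.5, 0.5])    # every membership is ambiguous
intermediate = fuzzy_entropy([0.9, 0.8, 0.3, 0.1])
```

The entropy is 0 for a crisp set and reaches 1 when every degree of membership equals 0.5, so higher values indicate fuzzier chorotypes.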
A key question is how to map the fuzziness of chorotypes, so that their fuzzy nature may be, on the one hand, made observable throughout the study area, and on the other hand, considered in the search for historical and environmental causes that might underlie a chorotype. The difficulty lies in the fact that elements of a chorotype are species distributions, and because of this, the degree of membership in a chorotype refers to these and not to locations in space. In every square into which the study area was divided, we represent the maximum degree of membership shown by any species reported in that square in a given chorotype. Fuzzy membership in every chorotype is consequently associated with every point in the space. Expressed as a formula, the maximum degree of membership in a given chorotype (MMDch) in a certain square is

$$MMD_{ch} = \max_{i} [p_i \times \mu_{ch}(d_i)]$$
where pi is either equal to 1 if the species distribution di includes that grid or it is equal to 0 if the reported distribution di does not include that grid.
The fuzzy nature of chorotypes has also been taken into consideration as a criterion to deconstruct biodiversity, such that a measure of fuzzy species richness of each chorotype (FSRch) could be computed in every square of the study area with the following formula:

$$FSR_{ch} = \sum_{i=1}^{N} p_i \times \mu_{ch}(d_i)$$
In essence, this is to weight every distribution—when computing the species richness—according to every species degree of membership in a given chorotype.
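Both mapped quantities can be sketched in Python for a handful of grid squares. The presence vectors, square labels, and function names below are hypothetical; each presence vector holds the pi values (1 or 0) of the distributions in one square, in the same order as the degrees of membership.

```python
def max_membership_degree(present, mu):
    """MMD of a square: highest membership among the species reported there."""
    return max((p * m for p, m in zip(present, mu)), default=0.0)

def fuzzy_species_richness(present, mu):
    """FSR of a square: species richness weighted by degree of membership."""
    return sum(p * m for p, m in zip(present, mu))

# Hypothetical degrees of membership of four distributions in one chorotype.
mu_ch = [0.9, 0.8, 0.3, 0.1]

# Presence (1) or absence (0) of each distribution in three grid squares.
squares = {
    "sq1": [1, 0, 1, 0],
    "sq2": [0, 0, 1, 1],
    "sq3": [0, 0, 0, 0],
}

mmd = {name: max_membership_degree(p, mu_ch) for name, p in squares.items()}
fsr = {name: fuzzy_species_richness(p, mu_ch) for name, p in squares.items()}
```

Squares "sq1" and "sq2" both hold two species, but their FSR values differ because the species in "sq1" belong to the chorotype to a higher degree.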
The MMDch can be seen as a local measure of the chorotype's height, whereas the FSRch can be seen as a local measure of the chorotype's cardinality (i.e., respectively, the chorotype's height and cardinality by only considering the species that were reported at a given site). As a basis to discuss the biogeographic meaning of these representations, we calculated geographical Spearman correlations between the species richness of each chorotypical cluster (SRch), MMDch and FSRch for each chorotype, and geographical Spearman correlations between chorotypes for each of these parameters.
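For readers who wish to reproduce this kind of comparison, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values receiving their average rank. The plain-Python sketch below uses hypothetical per-square values and illustrative function names; it does not compute significance.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values in four grid squares: crisp richness and MMD.
sr_values = [0, 1, 2, 3]
mmd_values = [0.0, 0.9, 0.9, 0.95]
rho = spearman(sr_values, mmd_values)
```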
Chorotype detection and the associated fuzzy approach are entirely performed with the RMACOQUI v1.0 software, which is freely available on request to the authors of this article.
Comparison with a Nonfuzzy Method to Obtain Chorotypes
To compare our results with those of a nonfuzzy method, we used the procedure proposed by Hausdorf and Henning (2003) for the objective detection of biotic elements, which are equivalent to our chorotypes. The model-based Gaussian clustering method, as implemented in the “prabclust” function in the R-package PRABCLUS v2.1-4, was used because it also provides a decision about the number of meaningful clusters and about ranges that cannot be assigned adequately to any biotic elements (see Hausdorf and Henning 2003 for details about this method). Prabclust performs a metric multidimensional scaling on a matrix of distances between distributions. Following Hausdorf and Henning (2003), we used the Kulczynski distance (dK, Shi 1993):

$$d_K = 1 - \frac{1}{2}\left(\frac{C}{A} + \frac{C}{B}\right)$$
where A is the number of grids in which the species a is present, B is the number of grids in which the species b is present, and C is the number of grids where both species a and b are present. Before performing the Gaussian clustering, prabclust makes an estimation of noise that detects the ranges that are not assigned to any group. The “hprabclust” function that implements a model-based hierarchical clustering (introduced as experimental in the PRABCLUS manual) was also applied to our data set.
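The Kulczynski distance can be sketched in Python from the same counts. The sketch assumes the usual form of the coefficient, one minus the mean of C/A and C/B; the function name and toy vectors are hypothetical, and the code is independent of the prabclus package.

```python
def kulczynski_distance(a, b):
    """Kulczynski distance between two presence/absence vectors."""
    A = sum(1 for x in a if x)                    # squares with species a
    B = sum(1 for y in b if y)                    # squares with species b
    if A == 0 or B == 0:
        raise ValueError("each species must be present in at least one square")
    C = sum(1 for x, y in zip(a, b) if x and y)   # shared presences
    return 1.0 - 0.5 * (C / A + C / B)

# Toy distributions over six grid squares (hypothetical data).
species_a = [1, 1, 1, 0, 0, 0]
species_b = [1, 1, 0, 0, 0, 0]
d_k = kulczynski_distance(species_a, species_b)

# Shared absences do not enter the formula, so enlarging the study area
# with empty squares leaves the distance unchanged.
d_k_larger_area = kulczynski_distance(species_a + [0] * 4, species_b + [0] * 4)
```

This insensitivity to shared absences is one practical difference from the Baroni-Urbani and Buser index used to detect chorotypes.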
We identified 16 clusters of the dendrogram whose IH are positive for amphibians in the Iberian Peninsula (Fig. 1). These can be interpreted as biogeographic patterns with different levels of fuzziness, some of them nested within others. Among them, a maximum of 11 nonnested fuzzy chorotypes may be distinguished. The four significant chorotypes (Table 1; Figs. 1 and 2) represent the least fuzzy distribution patterns, where the conditions to be a chorotype are maximized. Their chorotypical clusters included 22 distributions. The 13 distributions of chorotypical Cluster 1 include widespread species, although the highest concentration of them occurs in the western half of the Iberian Peninsula, mainly in the south-west, between the Central Mountain Range and the Guadalquivir River. Chorotype 2 is based on the distribution of Alytes dickhilleni, in southeastern Spain, and chorotypical Cluster 3 comprises the distributions of two species occurring in the eastern half of Iberia. On the other hand, the six species of chorotypical Cluster 4 are mainly distributed over the northern half of the Iberian Peninsula, and all of them have been reported to the north of the northwestern mountain ranges.
| Rana dalmatina and Mesotriton alpestris | 0.888 | — | — |
| Discoglossus pictus and N-E Hyla meridionalis | 0.684 | — | — |
| Calotriton asper and Rana pyrenaica | 0.684 | — | — |
Note: P is the probability associated with the G-test (degrees of freedom = 1).
The remaining six distributions were classified into three clusters that were not significantly chorotypical (Table 1; Figs. 1 and 3). Both Rana dalmatina and Mesotriton alpestris show partially overlapping distributions north-east of the Cantabrian Mountains and are associated with Chorotype 4. The other four distributions occur in the north-east, around the Pyrenees (see Fig. 3).
The degree to which each amphibian distribution in the Iberian Peninsula belongs to each chorotype is shown in Fig. 1. Representations of fuzzy chorotypes in space are shown in Fig. 2. Table 2 contains the set of parameters that describe chorotypes as fuzzy sets: fuzzy cardinality, relative cardinality, height and entropy, and parameters to quantify the fuzzy relationships between chorotypes: pairwise union cardinality, pairwise intersection cardinality and fuzzy overlap, and the degree of inclusion of each chorotype in the others. Chorotype 1 obtained the maximum cardinality, followed by chorotype 4, and then by chorotype 3. The highest degree of fuzziness (entropy) corresponded to chorotype 4, whereas the monospecific chorotype 2 was the least fuzzy.
The fuzziest chorotype 4 showed also the highest union and intersection cardinalities (both with chorotype 1) and the highest overlap (with chorotype 3). On the other hand, the least fuzzy chorotype 2 showed also the lowest union and intersection cardinalities (with chorotypes 3 and 1, respectively) and the lowest overlap (with chorotype 4). Fuzziness was thus highest in the limits between chorotype 4 and both chorotypes 1 and 3 (see Table 2 and Fig. 2).
MMDch is clearly related to the areas where at least one species of the chorotypical cluster is present (Fig. 2). A noticeable similarity was also observed between FSRch and SRch maps within the area where the chorotypical cluster is present (Fig. 2). The relationship between FSRch and SRch approximated a positive linear trend (see Fig. 4). The correlation between FSRch and SRch was always significant, and it was higher than 0.800 in chorotypes where SRch was above 2 (Table 3). MMDch rapidly increased as SRch increased from 0 to 1 (except in chorotype 1, where SRch values below 1 do not exist), and it then stabilized around the value that corresponded to the chorotype's height (Fig. 4). The correlation between MMDch and SRch was always positive and statistically significant (Table 3).
Note: Subscript “ch” was replaced with the corresponding number when referred to a single chorotype. All correlations are significant (P < 0.01).
Correlations between the SRch of different chorotypes and between their MMDch were lower than 0.250 and often negative, whereas for FSRch, the pairwise correlation values between chorotypes ranged between 0.350 and 0.629 except for the negative correlation between chorotypes 2 and 4 (see Table 4).
|MMD||0.030 n.s.||0.040 n.s.||0.048 n.s.|
Note: , Significant correlation (P < 0.01); , significant correlation (P < 0.05); n.s., nonsignificant correlation (P > 0.05).
Figure 5 represents the chorotypes detected in Portugal and in Spain based on separate analyses.
The biotic elements detected using the model-based Gaussian and hierarchical clustering methods are represented on the first two axes of the multidimensional scaling plot (Fig. 6). In the Gaussian clustering, 12 distributions were considered noise and thus remained ungrouped in the plot; the four biotic elements grouped species differently from chorotypes: The southwestern biotic element (upper-left in the plot of Fig. 6) included 4 of the 13 distributions of chorotypical Cluster 1; both the northwestern biotic element (lower-left) and the northern-widespread biotic element (lower-right) combined distributions from chorotype 1, and the fourth biotic element included two man-introduced distributions that were ungrouped in our output. Maps with the species richness of each biotic element reflect the combinations described (Fig. 6). However, the hierarchical clustering produced biotic elements that were more similar to the chorotypes: A first biotic element coincided completely with chorotypical Cluster 1, a second one included some of the distributions in chorotypical Cluster 3, and the remaining distributions were left ungrouped (Fig. 6).
Methodological Framework of Fuzzy Chorotypes
Fuzzy methods in biogeography often start with the definition of linguistic rules (i.e., fuzzy rules). Such is the case of fuzzy distribution modeling, where fuzzy rules describe a priori the relationship between species and the environment (e.g., Van der Broekhoven et al. 2006), often in the framework of environmental envelope techniques (e.g., Robertson et al. 2004). There are, nonetheless, alternative methodologies that do not use linguistic rules for the model construction; instead, a fuzzy output with the form of degree of membership is obtained starting from a crisp species distribution. Examples of this are the self-organizing map (SOM) models, based on neural networks (Gevrey et al. 2006), and the favorability models, based on binary logistic regressions (Real et al. 2006). Our approach and that of the Fuzzy-C-Means clustering algorithm (FCM, Bezdek et al. 1984) provide a fuzzy partition starting from a collection of crisp data, the presences and absences, based on the perception that relationships between these species ranges are not crisp (Salski 2007).
FCM derives fuzzy memberships based on the minimization of dissimilarities (distances) between objects (e.g., sample points) and the mean values of classes located in multivariate space (see, e.g., Brown 1998, Arnot et al. 2004, Eyre et al. 2003). In principle, FCM could be used to search for fuzzy chorotypes. This technique, however, would present limitations when applied to such a goal: 1) with FCM, it is necessary to know a priori the number of clusters (thus, it would be necessary to predefine a number of chorotypes); 2) all classified elements are forced to be included in a cluster (unclassified distributions would then not be allowed); and 3) a degree of partition fuzziness is also required (i.e., a fuzzifier: the weighting exponent m), so that different chorotypes could be obtained, with the same classification method, depending on this decision. The first problem could be solved by comparing different classifications obtained with different numbers of clusters by means of the partition efficiency indicators (Roubens 1982), whereas both the first and the second limitations are solved by a modification of FCM: the Possibilistic C-Means algorithm (PCM, Krishnapuram and Keller 1993). PCM avoids the FCM's need to specify the number of clusters to be produced and also accepts unclassified elements, but it is still necessary to consider a priori by which degree of overlap two clusters are distinguishable as different groups. This is not necessary with our method because the number of chorotypes derives from the structure of the data set.
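The three FCM requirements listed above can be seen in a minimal sketch of the standard algorithm. This is plain Python written for illustration only, not the software developed for this study; the function name fuzzy_c_means, its arguments, and the toy points are assumptions. The number of clusters and the fuzzifier m both have to be supplied before running it, and the memberships of each object sum to 1, so no object can remain unclassified.

```python
import math


def fuzzy_c_means(points, init_centers, m=2.0, n_iter=100, tol=1e-9):
    """Illustrative fuzzy c-means (hypothetical helper, not the authors' code).

    points       -- list of equal-length numeric tuples (the objects)
    init_centers -- starting centers; their count fixes the number of
                    clusters, which FCM needs a priori
    m            -- fuzzifier (weighting exponent), must be > 1
    Returns (centers, memberships); memberships[k][i] is the degree of
    membership of object k in cluster i.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    centers = [list(v) for v in init_centers]
    dim = len(points[0])

    def memberships(current):
        power = 2.0 / (m - 1.0)
        rows = []
        for p in points:
            d = [math.dist(p, v) for v in current]
            if min(d) == 0.0:
                # Object lies exactly on a center: it gets full membership there.
                hits = [x == 0.0 for x in d]
                rows.append([1.0 / sum(hits) if h else 0.0 for h in hits])
            else:
                rows.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
        return rows

    for _ in range(n_iter):
        u = memberships(centers)
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]  # memberships raised to the fuzzifier
            total = sum(w)
            new_centers.append(
                [sum(wk * p[j] for wk, p in zip(w, points)) / total for j in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(centers)


# Toy data (hypothetical): two tight groups plus one point halfway between them.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1), (5.5, 5.5),
              (10, 10), (10, 11), (11, 10), (11, 11)]
toy_centers, toy_u = fuzzy_c_means(toy_points, init_centers=[(0, 0), (11, 11)])
```

In this toy run the intermediate point is split between the two clusters rather than left ungrouped, which is the behavior described in limitation 2.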
Ordination techniques may also be used for the detection of biogeographic patterns by locating distributions into a multidimensional space. In principle, an ordination plot can show a picture of vague or fuzzy limits between groups of distributions, but the existence of these groups needs to be explored by means of partitioning procedures according to different clustering criteria (Birks 1987). Myklestad and Birks (1993), for example, used two-way indicator species analysis (Hill 1979), a divisive classification technique, to search for floristic elements along the axes of a correspondence analysis of Salix species distributions in Europe. The model-based Gaussian clustering (Hausdorf and Hennig 2003) incorporates a testing procedure to make objective partitions between clusters within a multidimensional scaling plot. Figure 6 shows that the output of the model-based Gaussian clustering is very different from the chorotypes for Iberian amphibians proposed in this paper, whereas the hierarchical clustering produced biotic elements more similar to chorotypes. The use of a hierarchical classification thus determined the convergence between chorotypes and biotic elements.
Some published distribution patterns defined with other taxa using hierarchical classification methods show coincidences with our chorotype pattern for amphibians. For example, the distribution of chorotype 1 is found in patterns of Iberian pteridophytes (Márquez et al. 1997), of Iberian Cytiseae (Fabaceae) species (Gómez-González et al. 2004), and of European vascular plants (Finnie et al. 2007); chorotype 2 appears also in patterns of Iberian Cytiseae species and of European vascular plants; and both chorotypes 3 and 4 coincide with two chorotypes for pteridophytes. The biogeographical patterns of Iberian reptiles and amphibians were analyzed by Sillero et al. (2009) by hierarchically classifying species potential occurrences based on remote sensing variables. Three of the chorotypes they found included all species in chorotypical Cluster 1, A. dickhilleni was classified alone as in chorotypical Cluster 2, and two other chorotypes involved most species in chorotypical Cluster 4. On the other hand, the model-based Gaussian clustering also showed two biotic elements for Iberian Helicoidea land snails (Hausdorf and Hennig 2006) similar to our chorotypes 3 and 4.
In any case, the main strength of our method, compared with other available classification and ordination approaches, is that it provides criteria to interpret the complexity of a biogeographical pattern considering its fuzziness, and also to establish the limits for being a chorotype, based on the distributions of species. In this way, our method recognizes the least fuzzy clusters where conditions to be a chorotype are maximized (i.e., in our example, the four significant chorotypes) and also the maximum number of chorotypes possible throughout a fuzzy biogeographical pattern (in the example, 16 chorotypes considering nested and nonnested clusters and 11 chorotypes considering only nonnested clusters).
The Fuzzy Nature of Chorotypes
The resultant biogeographic pattern constitutes an evidently fuzzy model of amphibian diversity in the Iberian Peninsula (see Fig. 2): 1) there is a partial overlap between the chorotypes, 2) the geographic limits of chorotypes are vague despite being statistically significant because the overlap between the distributions of species composing a chorotype is not perfect, and 3) unclassified distributions also overlap partially with others and with certain chorotypes. Any numerical classification of vegetation or fauna distributions should allow for some degree of overlap and even allow for leaving some elements unclassified (De Cáceres et al. 2009). Because of this, considerable idealization is required to describe these chorotypes by using mechanistic models, whereas a more realistic interpretation of complex ecological and biological systems is enabled by using a fuzzy approach (Setnes et al. 1997, Olano et al. 1998).
The degree of fuzziness of a chorotype, that is, the entropy, is a sign of biogeographic complexity and of having blurred geographic limits. Fuzziness cannot be simply deduced from the species number of the chorotypical cluster, from the cardinality, or from the height of a chorotype. Chorotype 1, for example, could be expected to be the fuzziest chorotype because it represents the most widespread distributions, obtained the highest cardinality, and overlapped strongly with other chorotypes, but it is not, as several species had either very high or very low membership degrees in chorotype 1. In contrast, chorotype 4 shows the highest fuzziness (with an entropy value of 0.425), which comes together with a low height, and with the highest intersection cardinalities and fuzzy overlap with two other chorotypes. On the other hand, the lowest fuzziness is shown by chorotype 2, which has a single-species chorotypical cluster (A. dickhilleni), the highest height, and the lowest cardinality, unions, intersections, and overlap with the other chorotypes.
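To make the point about entropy concrete, the sketch below computes three of the fuzzy parameters mentioned above from a list of membership degrees. It assumes the usual fuzzy-set definitions (cardinality as the sum of memberships, height as the maximum membership) and Kosko's ratio form of fuzzy entropy, that is, the cardinality of the intersection of a set with its complement divided by the cardinality of their union; the formulas actually applied in this study are those given in the methods. The function names and membership values are hypothetical.

```python
def cardinality(memberships):
    """Sum of membership degrees (assumed definition)."""
    return sum(memberships)


def height(memberships):
    """Highest membership degree in the fuzzy set (assumed definition)."""
    return max(memberships)


def entropy(memberships):
    """Kosko-style fuzzy entropy in [0, 1] (assumed definition).

    0 means a crisp set (all degrees are 0 or 1);
    1 means maximal fuzziness (all degrees equal 0.5).
    """
    if not memberships:
        raise ValueError("at least one membership degree is required")
    with_complement = sum(min(mu, 1.0 - mu) for mu in memberships)
    union_complement = sum(max(mu, 1.0 - mu) for mu in memberships)
    return with_complement / union_complement


# Hypothetical chorotypes: one with polarized memberships, one with
# intermediate memberships throughout.
polarized = [0.95, 0.9, 0.9, 0.95, 0.1, 0.05]
intermediate = [0.6, 0.5, 0.4, 0.5, 0.6, 0.4]
```

With these illustrative values, the polarized set has the higher cardinality but the lower entropy, which mirrors the contrast described for chorotype 1.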
To find exactly where the fuzziness is highest within the overall pattern, pairwise relationships between chorotypes provide valuable information. According to the pairwise intersections, overlapping and mutual inclusion values, the fuzziest limits are those located between chorotype 4 and both chorotypes 1 and 3 (see Table 2). Fuzziness, however, can be found inside and outside significant chorotypes, and so it is not restricted to the limits between them. Distributions that are not included in any chorotypical cluster are sometimes grouped together so that, although they do not constitute significant chorotypes, they represent patterns in which species gradually replace each other (e.g., D. pictus, the northeastern distribution of H. meridionalis, C. asper, and R. pyrenaica, Fig. 1; see also marine waterbirds in Real, Olivero, et al. 2008). These “extrachorotypical” clusters may be seen as “gradual patterns” or “very-fuzzy-chorotypes”, and it can be worth analyzing them as complementary patterns to chorotypes in which every distribution also has a degree of membership. Inside significant chorotypes, other “infrachorotypical” clusters can be detected that can be treated as “subchorotypes.” For example, chorotypical Cluster 1 is composed of three main clusters of the dendrogram (Fig. 1) that are worth considering in the same way as gradual patterns to get a deeper insight into the overall biogeographic pattern. These gradual patterns and subchorotypes are among the 11 nonnested fuzzy chorotypes that can be distinguished, at most, when a high level of fuzziness is allowed.
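The pairwise parameters referred to above can be sketched as follows. The code assumes the standard minimum and maximum operators for fuzzy intersection and union, overlap as the intersection cardinality divided by the union cardinality, and inclusion of A in B as the intersection cardinality divided by the cardinality of A. These are common fuzzy-set conventions, and the exact formulas behind Table 2 may differ. Function names and membership values are hypothetical.

```python
def intersection_cardinality(a, b):
    """Cardinality of the fuzzy intersection (element-wise minimum)."""
    if len(a) != len(b):
        raise ValueError("both chorotypes must be scored on the same species")
    return sum(min(x, y) for x, y in zip(a, b))


def union_cardinality(a, b):
    """Cardinality of the fuzzy union (element-wise maximum)."""
    if len(a) != len(b):
        raise ValueError("both chorotypes must be scored on the same species")
    return sum(max(x, y) for x, y in zip(a, b))


def fuzzy_overlap(a, b):
    """Shared membership relative to total membership (assumed definition)."""
    return intersection_cardinality(a, b) / union_cardinality(a, b)


def fuzzy_inclusion(a, b):
    """Degree to which fuzzy set a is contained in b (assumed definition)."""
    return intersection_cardinality(a, b) / sum(a)


# Hypothetical membership degrees of the same five species in two chorotypes.
chorotype_a = [0.9, 0.8, 0.6, 0.2, 0.1]
chorotype_b = [0.2, 0.5, 0.6, 0.7, 0.9]
overlap_ab = fuzzy_overlap(chorotype_a, chorotype_b)
```

Species with intermediate membership in both sets, such as the third one in this toy example, contribute most to the intersection and therefore to the fuzziness of the limit between the two chorotypes.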
An apparent paradox of fuzzy membership serves to illustrate the biogeographic meaning of fuzziness. Some species can have high degrees of membership in more than one chorotype. For example, two species of chorotypical Cluster 1, Salamandra salamandra (its membership degree being 0.663) and H. arborea (its membership degree being 0.650), also have high degrees of membership in chorotype 4 (0.537 and 0.569, respectively). Similarly, the species of chorotypical Cluster 4 A. obstetricans (its membership degree being 0.644) also has a high degree of membership in chorotype 3 (0.501). In fact, these species play a pivotal role between chorotype 4 and the others, which contributes strongly to blurring the limits between this chorotype and both chorotypes 1 and 3. A more extreme example of this pivotal role arises from the definition of amphibian chorotypes separately in Spain and in Portugal (Fig. 5). In the Spanish pattern, Pleurodeles waltl was included in the widespread chorotypical cluster, but it had, however, a higher membership degree in the western chorotype; another species of the same cluster, Triturus marmoratus, also had a higher membership degree in the northern chorotype. In the Portuguese and in the Iberian patterns, however, these pivotal distributions were relocated, so that they were grouped with the species of the chorotypical clusters in which they belonged most in Spain. The contribution of fuzzy logic to stabilize this extent-dependent species oscillation for chorotype interpretation is discussed in the last paragraph of this article.
A more biogeographic interpretation of the chorotypes' fuzziness may be obtained by analyzing their respective species worldwide ranges: Two out of the six species in the fuzziest chorotype 4 are Iberian endemics (Chioglossa lusitanica and R. iberica), and a third species spreads also in western France (T. marmoratus), whereas the other three spread either throughout central Europe (A. obstetricans) and Great Britain (Lissotriton helveticus) or beyond the European boundaries (R. temporaria). Five of the 13 species that configure chorotype 1 are Iberian endemics, and two more, H. meridionalis (i.e., its nonintroduced populations) and P. waltl are endemic to Iberia and northern Africa. This chorotype includes also species that are widespread throughout Iberia and the surrounding territories (Pelobates cultripes, Pelophylax perezi), and also species that spread eastward (Bufo calamita) and far beyond the European boundaries (B. bufo). Finally, the only species in the least fuzzy chorotype 2 is endemic (A. dickhilleni), and one of the two species in chorotype 3 is endemic too (D. jeanneae). Pearson's correlation coefficient between the entropy and the proportion of endemic species was −0.964 (P < 0.05), and it was −0.958 (P < 0.05) when distributions restricted to Iberia and closely surrounding areas (i.e., southern France and northern Africa) were included too. Thus, at least in our example, fuzziness had an inverse relationship to endemicity. This might constitute an indicator of the relative importance of history in shaping the biogeographic pattern described by a chorotype if, as proposed by Webb and Gaston (2000) (see also Araújo et al. 2008), narrow ranging species are less likely to be in equilibrium with current climate conditions: When comparing chorotypes, lower fuzziness would indicate that history played a stronger role in a chorotype's configuration.
Interesting conclusions may be obtained by comparing the fuzzy chorotype concept with Morrone's (1994) areas of endemism, biogeographic units that are supposed to be related to historical events such as vicariance (Hausdorf 2002).
Fuzzy Chorotypes and History
If it were possible to relate the origin of chorotypes to a historical hypothesis based on glacial refugia, then an overall pattern with stronger historical antecedents could be expected to show higher fuzziness in areas far from refugia than in areas close to refugia. After the last glaciation, and facing newly available lands, species that had shared refugial areas may have shown different dispersal patterns driven by the variety of possible responses to ecological factors. As a result of this, even species groups that usually share distributions may show certain degrees of divergence over space.
Fuzziness in the most important chorotypes detected for amphibians in the Iberian Peninsula increases eastward (Fig. 2). Additionally, the six distributions that did not constitute chorotypes represent the fuzziest component in the overall biogeographic pattern, and four of them are eastern distributions (Fig. 3). In contrast, two chorotypes included all the species with a preferably westerly distribution in the study area, although the highest species richness was reached, by each chorotype, at a different geographic latitude.
The simultaneous existence of both chorotypes and ungrouped distributions in the same region has been interpreted as either reflecting a historical origin (Real et al. 1997) or being the result of different strategies in the species use of space (Real, Olivero, et al. 2008). History is potentially an important factor behind amphibian distributions in the Iberian Peninsula. Geological events in the Tertiary period and Pleistocene glaciations are considered decisive historical events in Europe whose effects on the current vertebrate distributions are still detectable (Lomolino et al. 2006). All the freshwater systems in Europe disappeared or were strongly modified in the Pleistocene and their later recolonization is so recent that lakes and river systems are relatively “young” and quite impoverished compared with tropical systems (Lévêque and Balian 2005). Although the Iberian Peninsula was considered to have been a refugium for fauna and flora during the glaciations, phylogeographical studies show that Iberia was, in fact, a bio-climatically complex region within which refugial areas were heterogeneously distributed (Gómez and Lunt 2006). The main Pleistocene glacial refugia for amphibians were located in the west of the Iberian Peninsula, specifically in the southwest (Algarve, southern Portugal) and the central-west (central Portugal and the westerly Central Range), but also around the Cantabrian (north) and the Baetic (southeast) Mountains.
A possible relationship between fuzziness and distance to glacial refugia is highly feasible in relation to chorotype 1, which contains 50% of Iberian endemic distributions in its chorotypical cluster. The phylogeographical analyses available for two of them, D. galganoi (García-París and Jockusch 1999, Martínez-Solano 2004) and L. boscai (Martínez-Solano et al. 2006), place the glacial refugia of these Iberian endemics in both the central-west and southwestern areas. These are the northerly and southerly limits of the areas where chorotype 1 showed the highest species richness (see SRch and FSRch in Fig. 2), that is, where this chorotype is less fuzzy.
In the case of the fuzziest chorotype 4, the relationship between refugia and the current chorotype distribution is not as clear as in chorotype 1. Chorotype 4 is mainly distributed in the north-west of Iberia, though most of the species in its chorotypical cluster show also some southern populations in the west. It is there, in the central-west of the Iberian Peninsula, where phylogeographical studies locate the possible glacial refugia of its two endemic species, C. lusitanica (Alexandrino et al. 2000, Alexandrino et al. 2005) and R. iberica (Martínez-Solano et al. 2005), and also of A. obstetricans (Martínez-Solano et al. 2004). The current main distributions of these species would then be a result of a northward spread from their glacial refugia, with which they are still connected across Portugal.
The biogeographic pattern in the eastern half of Spain is fuzzier probably as a result of a multiplicity of historical antecedents and responses to the environment (see Fig. 3). Regarding chorotype 3, phylogeographical studies suggest a Pleistocene origin in a Baetic glacial refuge, in the south-east, for D. jeanneae (García-París and Jockusch 1999, Martínez-Solano 2004). Among the ungrouped species, Iberia represents for R. dalmatina and M. alpestris a marginal region of worldwide ranges that reach central Europe and the Balkan Peninsula; D. pictus and the northern ranges of H. meridionalis seem to be a very recent anthropogenic introduction from France (Martínez-Solano 2004, Recuero et al. 2007); and for the Pyrenean endemic Calotriton asper (and probably for R. pyrenaica), it has been suggested that its current range is actually a recent interglacial refugium that was colonized from peripheral areas (Pleguezuelos et al. 2004). The only nonfuzzy element in the east is chorotype 2, the Baetic endemic A. dickhilleni, which probably occurs close to a Pleistocene refuge (Martínez-Solano 2004). In this way, the amphibian eastern distributions in Iberia seem to be linked to a heterogeneous historical scenario that does not affect the species as a group and in which relatively recent processes are often involved, hence the extreme fuzziness they represent in the global pattern.
Fuzzy Chorotypes: A Criterion to Deconstruct Biodiversity Patterns
History and ecological factors (including climate, topography and human influence) could have driven, or still be driving, the configuration of the species current distributions. These factors could be seen as geographic attractors dynamically delimiting the behavior of biodiversity over time and throughout space. In such a complex scenario, chorotypes would be the geographical responses of species, depending on the past and present attractors acting in a certain area. The fuzzy membership of a species in a chorotype would then be a measure of how much such a species has been attracted toward a certain biogeographic response. In this way, the fuzzy approach helps to connect chorotypes with these biogeographic attractors.
In previous studies, the environmental and historical factors behind chorotypes have been explored using two main approaches: either by analyzing the area with at least one species of the chorotypical cluster or by analyzing the response of the species richness of each chorotypical cluster to the environmental conditions (e.g., Real, Olivero, et al. 2008). The second option constituted a criterion for the biogeographic deconstruction of biodiversity, as a way of gaining insight into the causes of biodiversity trends (Marquet et al. 2004, Real, Olivero, et al. 2008). Now the challenge is to integrate the fuzzy nature of chorotypes in their explanatory analysis. In principle, chorotype fuzziness is based on degrees of membership that do not describe spatial units (in contrast to both presence/absence and the species number), but they describe species distributions. By computing the maximum membership degree (MMDch) and the fuzzy species richness (FSRch) of every location, chorotype fuzziness is transferred to space and can thus be related to the spatial trends of possible explanatory factors.
MMDch values are actual degrees of membership of a certain distribution at a given position in space. The area where at least one species of the chorotypical cluster occurs is not only visually well delimited by them, but the degree of membership represented inside this area also corresponds to one of these species (see Fig. 1). The MMDch, thus, closely resembles the area with at least one species of the chorotype, but the former introduces a fuzzy component mainly related to the membership of distributions that did not configure the chorotypical cluster itself. We propose the MMDch as the fuzzy alternative for the area with at least one species of the chorotype to perform historical and ecological analyses within the study area (see Real, Olivero, et al. 2008).
FSRch, unlike MMDch, quantifies a kind of weighted species richness, that is, a fuzzy alternative for the crisp species richness, as expressed in the name chosen for FSRch. Visually, the general structure of SRch is preserved in the map of FSRch, but fuzziness is added inside and outside the area where at least one species of the chorotypical cluster is present. Although a linear relationship is conserved between SRch and FSRch, multiple FSRch values can exist for any SRch value (see Fig. 4). As a result of this, the biogeographically deconstructed pattern for amphibian diversity in the Iberian Peninsula is blurred by including all those species that are susceptible to being affected by any geographical attractor. This is the reason for the higher correlation between chorotypes according to FSRch than according to SRch (these pairwise correlations were higher in highly intersected chorotypes: compare Tables 2 and 4). This consequence of including fuzziness in the species richness deconstruction is not unexpected but indicates the close interrelation between biogeographic responses, despite being significantly different according to a statistical criterion.
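As a rough illustration of how fuzziness is transferred to space, the sketch below derives SRch, MMDch, and FSRch for a few grid cells. It assumes that MMDch is the highest membership degree among the species present in a cell and that FSRch is the sum of the membership degrees of those species, whereas SRch counts only the species of the chorotypical cluster. The function name, cell labels, species codes, and values are all hypothetical.

```python
def spatial_parameters(presences, memberships, cluster):
    """Return {cell: (SRch, MMDch, FSRch)} for one chorotype.

    presences   -- dict: cell -> collection of species present in that cell
    memberships -- dict: species -> membership degree in the chorotype
    cluster     -- set of species forming the chorotypical cluster
    The three definitions are assumptions stated in the accompanying text.
    """
    result = {}
    for cell, species in presences.items():
        present = set(species)
        degrees = [memberships[s] for s in present]
        sr = len(present & cluster)        # crisp richness of the cluster
        mmd = max(degrees, default=0.0)    # maximum membership degree
        fsr = sum(degrees)                 # fuzzy (weighted) richness
        result[cell] = (sr, mmd, fsr)
    return result


# Hypothetical chorotype: sp_a and sp_b form the chorotypical cluster,
# sp_c and sp_d only have partial membership.
toy_memberships = {"sp_a": 0.9, "sp_b": 0.7, "sp_c": 0.4, "sp_d": 0.1}
toy_cluster = {"sp_a", "sp_b"}
toy_presences = {
    "cell_1": {"sp_a", "sp_b"},
    "cell_2": {"sp_a", "sp_c", "sp_d"},
    "cell_3": {"sp_a"},
    "cell_4": {"sp_d"},
    "cell_5": set(),
}
toy_maps = spatial_parameters(toy_presences, toy_memberships, toy_cluster)
```

In this toy example cell_2 and cell_3 share the same SRch but differ in FSRch, and cell_4 receives a nonzero FSRch although no species of the chorotypical cluster occurs there.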
The deconstruction of biodiversity also benefits from one of the most relevant advantages of the fuzzy approach to chorotypes, which is illustrated by the comparative examination of Figs. 1–5. As a result of the choice of the geographical extent of the analysis, some species are grouped in different chorotypical clusters in the separate analyses of Iberia, Portugal, and Spain. With the fuzzy approach, however, biogeographical patterns that look different from each other using a crisp perspective tend to converge. Dissimilarities in the comparison of SRch maps between Iberia, Spain, and Portugal disappear almost completely when FSRch maps are compared. This suggests that patterns that could be derived from data sets showing differences (e.g., nonidentical extents, different data sources, different lattices, different scale, or different taxonomic criteria), or even from different classification algorithms, could be more similar, and thus more consistent, using fuzzy logic than using a crisp approach.
This work was made possible thanks to the Ministerio de Educación y Ciencia, Spain (CGL2006-09567/BOS); the Ministerio de Ciencia e Innovación, Spain, and European Regional Development Fund, European Union (CGL2009-11316); and the Consejería de Innovación, Ciencia y Empresa, Junta de Andalucía, Spain (P05-RNM-00935).
We thank Ramón Hidalgo for his work to develop the software for the methodology presented here, Dr. Dave Roberts for his useful comments and his help with the mathematical notation, and Dr. Adrian Paterson, Prof. John Birks, and anonymous referees for their valuable remarks about a previous version of the manuscript.