## Abstract

The direction of covariation in behavioral traits (a behavioral syndrome) is typically the same in males and females, although intersexual differences in life history could lead to intersexual heterogeneity in syndrome structure. We explored whether a behavioral syndrome was the same in both sexes. We recorded 3 behaviors in wild blue tits Cyanistes caeruleus, 2 of which were metrics of nest defense: 1) nest defense of a female around the time her eggs hatch (hatching defense), 2) nest defense of the male and female parent when their offspring were 16 days old (nestling defense), and 3) handling aggression. We used repeated records of these behaviors collected on 392 females and 363 males during 2007–2012 in a hierarchical mixed model to separate the between-individual from the residual (co)variances. We find that 1) hatching defense is not repeatable across breeding seasons (but highly repeatable within years); 2) nestling defense intensity is 38% repeatable in both males and females; 3) contrary to our expectation, females that defended their nestlings more intensively were less aggressive when being handled (negative between-individual correlation); but 4) nestling defense and handling aggression showed a positive, although nonsignificant, between-individual correlation in males. These correlations were completely masked on the phenotypic level by a low residual correlation. Individuals may hence display repeatable and correlated aggressive behavior in different contexts, but the direction of this syndrome may differ between the sexes. Intersexual heterogeneity in animal personality and syndrome structure remains poorly understood, and our findings hence urge future work to explicitly consider this heterogeneity.

## INTRODUCTION

The concept of animal personality has become firmly established in behavioral ecology over the past decade (Wilson 1998; Dingemanse and Réale 2005; Réale et al. 2007; Schuett et al. 2010; Stamps and Groothuis 2010). Evidence for between-individual variation, and hence individual consistency (repeatability), is considered the hallmark of a behavior capturing an aspect of animal personality (Bell et al. 2009). Repeatable behaviors covering potentially different aspects of personality tend to be correlated and are then referred to as behavioral syndromes (Sih, Bell, and Johnson 2004a). Moreover, such syndromes remain stable across situations (Sih, Bell, and Johnson 2004a). For example, aggressive individuals are also bold, and this relationship holds both when a predator is present and when it is absent. The existence of behavioral syndromes forces one to place a focal behavior, which by itself may seem disadvantageous, within a multivariate and multiple-context framework in order to properly understand the selective forces acting on it. For example, an individual may have an increased risk of being killed if it is bold when a predator is near, but such risk taking may be offset by benefits accrued through being aggressive when predators are not present (Sih, Bell, and Johnson 2004a). To understand whether such processes occur in the wild, an important first step is to study whether aspects of animal personality covary across different contexts.

An adaptive view of behavioral syndromes poses that these result from positive, self-enforcing feedback driven by variation between individuals in their residual (i.e., future) reproductive value (Wolf et al. 2008). Individuals with a lot of future value to lose should avoid risk, whereas those with poorer prospects should be willing to take risks. This difference in risk-taking applies across contexts and thereby leads to syndromes. Thus, individuals with a low residual reproductive value should be aggressive and readily attack a predator in defending their offspring, whereas individuals with a high residual reproductive value should be nonaggressive and shy. Another, more proximate view to explain the existence of syndromes is based on assuming that individuals differ in “hard-wired” properties described by, for example, coping styles (Koolhaas et al. 1999; Sih, Bell, and Johnson 2004a). A proactive coping style represents individuals that are bolder, more aggressive, and more explorative, whereas the reactive coping style describes individuals that are more passive (i.e., less aggressive, shy, and less explorative). Stress hormones are among the most likely proximate mechanisms inducing links between behaviors (Cockrem and Silverin 2002; Carere et al. 2003; Cockrem 2007). For example, Cockrem (2007) proposed that bolder individuals are characterized by a lower corticosterone level (i.e., a lower stress response). In great tits Parus major, bold individuals have a low response to stress (Carere and van Oers 2004). Furthermore, a stressed individual is expected to reduce its parental care (Wingfield et al. 1997; Silverin 1998; Wingfield and Sapolsky 2003).

Life-history considerations underlie variation in personality (Wolf et al. 2008), and life-history trade-offs and roles often differ between the sexes (Schuett et al. 2010). In species with biparental care, there is an inherent conflict between parents over the amount of parental effort they are willing to make (Lessells and McNamara 2012). In general, males are selected to make their female partner work harder than she wants (Kokko and Jennions 2008). Increases in current parental effort entail a reduction in residual reproduction through lowered survival or future fecundity. Hence, life-history differences between the sexes are expected to be reflected in intersex heterogeneity in behavioral types (Schürch and Heg 2010; Dammhahn 2012). One could therefore expect males and females to differ in personality as well as in the direction of syndromes between multiple behaviors. However, few studies have investigated these issues. There is evidence that repeatability of behavior is higher in males than in females (reviewed in Schuett et al. 2010, cf. Bell et al. 2009), which implies that the “strength” of personality differs between the sexes. Nevertheless, most studies on behavioral syndromes concern 1 sex or pooled sexes. When both sexes are investigated, studies typically find that the direction of the behavioral syndrome is the same in both. For example, an aggression–sociality syndrome in comb-footed spiders Anelosimus studiosus was the same in males and females although selection on this syndrome differed between the sexes (Pruitt and Riechert 2009). Similarly, the correlational structure in an aggression–activity syndrome (which also included other behavioral traits) was the same in male and female reef fish Parapercis cylindrica, which was probably because a high cross-sex genetic correlation constrained the correlation structure (Sprenger et al. 2012). In general, however, intersex heterogeneity in behavioral syndromes has not been studied extensively to date.

Nest defense is defined as a behavior that decreases the probability that a predator will harm the offspring while simultaneously increasing the probability of injury or death to the parents (Montgomerie and Weatherhead 1988). Parents engaged in defending their offspring are hence subject to a trade-off between risking themselves and protecting their young. The intensity of nest defense is thus commonly considered an indication of the parental care or investment made by a parent. Classical explanations for variation in nest defense are based on the notion that the underlying value of the offspring may vary, which determines variation in parental defense (Montgomerie and Weatherhead 1988; Rytkönen 2002). Another perspective, however, is that interindividual differences in the intensity of nest defense reflect differences in personality across individuals (Gosling 2001; Réale et al. 2007). This latter notion requires that the intensity of nest defense is repeatable (i.e., it must have between-individual variation). Repeatable nest defense aggression has, for example, been found in Ural owls Strix uralensis (R = 52%, Kontiainen et al. 2009) and in male western bluebirds Sialia mexicana (R = 80%, Duckworth 2006). Furthermore, nest defense can be part of a behavioral syndrome. For example, more explorative great tits are also more active during nest defense (Hollander et al. 2008), and male western bluebirds that are more aggressive to conspecifics have a higher intensity of nest defense but a lower feeding rate during incubation (Duckworth 2006).

In this paper, we study a wild population of individually marked blue tits Cyanistes caeruleus during 6 consecutive breeding seasons. We focus on whether a potential behavioral syndrome exists between aspects of aggressive behavior measured in the context of defending the brood versus defending self, and whether this putative syndrome has the same direction in both sexes or whether correlations differ in sign or magnitude. We first test if the intensity by which an individual defends its brood during different stages in the breeding cycle forms an aspect of its personality. That is, is nest defense a repeatable property of an individual? We then explore whether the intensity of nest defense is correlated to the individual’s behavioral response in another context and hence part of a behavioral syndrome. Throughout, we ask whether syndrome structure differs between sexes by analyzing males and females separately. We assay 3 behavioral traits. 1) The hatching defense intensity of a female around the day her eggs hatch. 2) The nestling defense intensity of both individual parents when their nestlings are 16 days old. These 2 behaviors characterize aggressive behavior to an observer when the current reproduction is under threat. In addition, we measure 3) aggression of an individual when handled. This latter measure captures aggressive behavior toward an observer when the individual’s self (and hence its future reproduction) is under threat. We have found earlier that handling aggression is a repeatable behavior (Kluen et al. 2014). We here hypothesize that the 2 nest defense behaviors also are repeatable and thus capture an aspect of blue tit personality. We further hypothesize that all these 3 behaviors form a behavioral syndrome characterized by positive correlations because we primarily expect that aggressive individuals behave aggressively independently of the context. 
However, because males and females face different life-history decisions and may be in conflict over current investment, the direction and magnitude of this syndrome could differ between sexes.

## METHODS

A nest-box breeding blue tit population was studied during the breeding seasons of 2007–2012. The study area (10 km²) was close to the city of Tammisaari in Southwestern Finland (60°01′N, 23°31′E). Nest-boxes were visited at 5–8 day intervals to check for their occupancy. Some blue tit pairs produce 2 clutches in a breeding season, but in this paper we only considered first clutches. The laying date of the first egg in a clutch was established on the basis of finding an incomplete clutch and assuming that 1 egg per day was laid. The final clutch size was established during the regular visits. The expected hatching date of the clutch was based on the assumption of an incubation period of 12 days (expected hatching day = laying date + final clutch size + 12) (Kluen et al. 2011). The nest was visited each day starting 1 day before the expected hatching date in order to establish the real hatching date, defined as the date when at least 1 egg had hatched. The hatching date was then defined as day 0 of the nestlings’ age (chick days).
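The date arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the back-calculation of the laying date (first egg laid `eggs_found - 1` days before the visit) is our reading of the "1 egg per day" assumption, not a rule stated in the text.

```python
from datetime import date, timedelta

INCUBATION_DAYS = 12  # incubation period assumed in the text


def laying_date(visit_date: date, eggs_found: int) -> date:
    """Back-calculate the date of the first egg from an incomplete clutch.

    Assumption (ours): with 1 egg laid per day, a nest holding `eggs_found`
    eggs on `visit_date` received its first egg `eggs_found - 1` days earlier.
    """
    return visit_date - timedelta(days=eggs_found - 1)


def expected_hatching_date(first_egg: date, clutch_size: int) -> date:
    """Expected hatching day = laying date + final clutch size + 12."""
    return first_egg + timedelta(days=clutch_size + INCUBATION_DAYS)


def chick_age(hatching: date, today: date) -> int:
    """Nestling age in chick days, with the hatching date as day 0."""
    return (today - hatching).days


# Hypothetical nest: 4 eggs found on 4 May, final clutch of 10 eggs.
first_egg = laying_date(date(2010, 5, 4), eggs_found=4)
hatching = expected_hatching_date(first_egg, clutch_size=10)
first_daily_check = hatching - timedelta(days=1)  # daily visits start 1 day earlier
print(first_egg, hatching, first_daily_check)
```

Under these assumptions, the nestling defense assay for this hypothetical nest would fall 16 chick days after the observed (not the expected) hatching date.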

Males and females were caught in the nest-box while feeding their offspring when these were around 9 days old. Tarsus length was measured with a sliding calliper (accuracy, 0.1 mm), and body mass was measured with a 20-g Pesola spring balance (accuracy, 0.1 g). Birds were ringed (if unringed), permitting recognition of individuals across years. Age (yearling or older) was estimated based on plumage characteristics, and sex was determined based on presence or absence of a brooding spot (Svensson 1992).

### Behaviors assayed

We collected information on the following behaviors:

1) Hatching defense: The nest defense behavior of the female was recorded each time the nest was checked for hatching, starting 1 day before the expected hatching date and repeated daily up to and including the day the clutch hatched. The intensity of the female’s hatching defense when the nest-box was opened was classified on an interval scale from 1 (low) to 4 (high); the scores are illustrated in Supplementary Figure S1. Hatching defense score 1 (Supplementary Figure S1a) denoted a female that attempted to hide in the nest, facing away from the observer, typically sticking her head under her clutch or into the nest material. Hatching defense score 2 (Supplementary Figure S1b) denoted a female that faced the observer but did not cover all her eggs. Hatching defense score 3 (Supplementary Figure S1c) denoted a female that faced the observer and covered her eggs by spreading her wings and fluffing her feathers. Hatching defense score 4 (Supplementary Figure S1d) denoted a female that attacked the observer, typically making a loud hissing noise and audibly flapping her wings. The differences between these classes were very distinct and permitted accurate classification of all encountered female nest defense behaviors. Blue tits commonly delay their hatching (described by Kluen et al. 2011 for this population), and we thus obtained repeated observations for some females within the same year. Multiple observations of hatching defense of the same individual within a year were used to measure its within-year repeatability. However, to measure across-year repeatability, we used only the first measure of female hatching defense in each year, in order to avoid a possible effect of habituation of the bird to the consecutive visits of the observer.

2) Parental nestling defense: When the nestlings were 16 days old, 1 randomly chosen nestling was taken out of the nest immediately after the observer arrived at the nest-box. This nestling was held upright by its legs and kept by the observer at shoulder height close to the nest-box. The nestling was clearly visible and able to flap its wings. The observer made a hissing noise that simulated the distress call blue tit fledglings make. The number of attacks of each parent and the closest distance of the attack (scored by eye as 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.8, 1, 1.5, 2, 3, or 5 m) were noted during a 2-min assay period. Blue tit parents were sometimes away from the nest site (presumably foraging) but made their presence at the nest site known by alarm calls. The assay of a parent bird started only when the parent was present. An attack was scored when a bird flew straight toward the observer and changed its flight direction abruptly at a certain distance. In general, the “decoy nestling” was quiet, although a nestling could make an occasional distress call that would evoke a parental response. We do not believe that the activity of the nestlings was a strong driver of variance in nestling defense between broods. Sexes were distinguished by their coloration; blue tit males are more colorful than females in terms of blueness of their cap and tail, intensity of yellow coloration, and color contrast of their head (black, white, yellow). The accuracy of sexing was checked by placing a temporary mark on either the male or the female (randomly chosen) when they were caught around day 9, in order to check whether the sex determined visually during the nestling defense assay at day 16 was correct. The mark was either a small piece of red duct tape around the ring or the ring was colored with a marker. This was done in 2007 and 2008 on a subset of all individuals. 
In total, 18 individuals with a marked ring were recorded during the nestling defense assay, and all of these were sexed correctly. The nestling defense assay was performed mainly by 2 experienced observers (E.K. and J.E.B.) during all years, except for 7 assays performed by an experienced assisting person in 2011. The nestling was placed back in the nest after the assay.

3) Handling aggression: Handling aggression described how aggressive (struggling, flapping wings, pecking, or biting with the bill) the individual was while morphometric measurements were taken of tarsus, head, wing, and tail size (see Brommer and Kluen 2012 for details). The response of an individual to being handled and measured was noted on an interval scale from 1 (lowest handling aggression score) to 5 (highest score). Handling aggression score 1 denoted an individual that stayed completely passive during all measurements and a score of 5 was an individual that did not decrease in aggressiveness during the morphometric measures. Scores 2, 3, and 4 denoted behavior that was initially aggressive toward being handled but which calmed down during the sequence of measurements, where a score of 3 denoted an individual calming down approximately in the middle of the measuring protocol (after measuring the head length), and 2 and 4 denoted intermediate levels.

### Data included and analyses

Only data of ringed individuals were considered in order to permit assigning behavior to an individual across years. In general, the first measure of a behavior on an individual taken in a given year was included unless stated otherwise. Repeated behavioral measures on an individual thereby concern the same behaviors quantified at different breeding seasons, except for the within-season repeatability of hatching defense that concerned behavior of the same individual during a single breeding season. When one of the behaviors was not recorded during a particular instance, it was considered a missing value. We hence included all information available in order to maximize power.

In the analysis, we treated each behavioral metric as an approximately normally distributed variable and assumed that it followed the standard hierarchical structure of a linear mixed model. That is, for measures z of traits x and y taken at instance t on a specific individual n, we assume that

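(1)
$z_{x,n,t} = \mu_x + i_{x,n} + \varepsilon_{x,n,t}, \quad z_{y,n,t} = \mu_y + i_{y,n} + \varepsilon_{y,n,t},$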

where individual n’s intrinsic value i for traits x and y is considered as a deviation from the trait’s fixed-effect mean value μx and μy, respectively. The distribution of i across individuals is assumed to be normal, with a mean of zero and a between-individual trait-specific variance VI. Residuals are denoted as ε and are random draws for each trait and individual n at instance t from a normal distribution with a mean of zero and a trait-specific residual variance VR. Equation 1 was fitted as a hierarchical mixed model in which i was considered a random effect. In the multivariate mixed model, we assumed that i and ε stemmed from multivariate normal distributions, such that, in addition to the above variances, all pairwise covariances were also estimated. For example, for the 2 traits listed in Equation 1, cov(ix, iy) and cov(εx, εy) were estimated. Each estimated covariance was scaled to a correlation following the standard definition of a correlation, that is, r(ix, iy) = cov(ix, iy)/√(VI,x × VI,y). Fixed effects included in the linear mixed models of behavior measured across years were year, the individual’s age (as a 2-level factor; yearling vs. 2 years or older), and the identity of the observer. For the analysis of within-year repeatability of female hatching defense, we included 2 factors: the sequence of the measure (first, second, etc.), in order to test whether females changed their behavior when measured multiple times, and whether some of the eggs had hatched or not. We assumed that habituation was absent when measures were taken 1 year apart. All statistics were performed in ASReml-R (Butler et al. 2009) in R version 3.0 (R Development Core Team 2012), which fits the linear mixed model using residual (restricted) maximum likelihood (REML).
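The hierarchical structure described above (each measure is a trait mean plus an individual-specific deviation plus a residual) can be illustrated by simulation. The sketch below is not the authors' analysis, which used REML in ASReml-R; it only generates data under the stated model with hypothetical parameter values and shows that the correlation in the raw data is weaker than the between-individual correlation once uncorrelated residual variation is added. All names and values are assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlated_pair(rng, var_x, var_y, r):
    """Draw one pair from a zero-mean bivariate normal with correlation r."""
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return math.sqrt(var_x) * a, math.sqrt(var_y) * (r * a + math.sqrt(1.0 - r * r) * b)


def simulate(n_ind, n_obs, mu, v_i, r_i, v_r, r_r, seed=1):
    """Simulate z = mu + i + e for two traits, n_obs measures per individual."""
    rng = random.Random(seed)
    zx, zy = [], []
    for _ in range(n_ind):
        # One intrinsic value per individual and trait (random effect i).
        i_x, i_y = correlated_pair(rng, v_i[0], v_i[1], r_i)
        for _ in range(n_obs):
            # A fresh residual for every measurement instance.
            e_x, e_y = correlated_pair(rng, v_r[0], v_r[1], r_r)
            zx.append(mu[0] + i_x + e_x)
            zy.append(mu[1] + i_y + e_y)
    return zx, zy


# Hypothetical parameter values, chosen only for illustration.
V_I, R_I = (0.5, 0.5), -0.8   # between-individual variances and correlation
V_R, R_R = (1.0, 1.0), 0.0    # residual variances and correlation
zx, zy = simulate(5000, 3, (3.0, 1.5), V_I, R_I, V_R, R_R)

cov_i = R_I * math.sqrt(V_I[0] * V_I[1])
cov_r = R_R * math.sqrt(V_R[0] * V_R[1])
expected_r_p = (cov_i + cov_r) / math.sqrt((V_I[0] + V_R[0]) * (V_I[1] + V_R[1]))
observed_r_p = pearson(zx, zy)
print(round(expected_r_p, 3), round(observed_r_p, 3))
```

With these values the between-individual correlation is −0.8, whereas the correlation expected in the pooled observations is only about −0.27, because the uncorrelated residuals dilute it.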

Repeatability was calculated on the basis of the REML estimates of the above-specified variance components in a model where the fixed effect was only a constant. We calculated the repeatability of a given trait (Rtrait) as:

(2)
$R_{trait} = \frac{V_{I,trait}}{V_{I,trait} + V_{R,trait}} = \frac{V_{I,trait}}{V_{P,trait}},$

where VP is the REML phenotypic variance. The standard error of the repeatability was calculated following the delta method (Lynch and Walsh 1998) as implemented in ASReml-R. The statistical test for repeatability was based on a likelihood ratio test (LRT) comparing the likelihood of the linear mixed model with the individual’s ID as a random effect to that of a general linear model without this random effect, both fitted by REML. The LRT statistic was calculated as −2 times the difference in log-likelihood between these models and was tested against a χ² distribution with 1 degree of freedom. Following recommendations by Nakagawa and Schielzeth (2010), individuals with only one measure were retained in the model. Putative differences between the sexes were tested using an approximate t-test directly on the estimates. The test statistic was calculated as the difference in repeatability divided by the square root of the sum of the squared standard errors (Zar 1999) and was compared with Student’s t distribution, with the degrees of freedom arbitrarily set at 100.
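The three calculations described above (repeatability from variance components, the LRT against a χ² distribution with 1 degree of freedom, and the approximate t statistic for a sex difference) can be sketched as follows. This is a minimal illustration with hypothetical function names, not the ASReml-R code used in the study; it relies on the identity that, for 1 degree of freedom, the upper-tail χ² probability equals `erfc(sqrt(x / 2))`.

```python
import math


def repeatability(v_i: float, v_r: float) -> float:
    """R = V_I / (V_I + V_R), i.e., between-individual over phenotypic variance."""
    return v_i / (v_i + v_r)


def lrt_statistic(loglik_full: float, loglik_reduced: float) -> float:
    """-2 times the difference in log-likelihood (reduced minus full model)."""
    return -2.0 * (loglik_reduced - loglik_full)


def chi2_pvalue_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def approx_t(est_a: float, se_a: float, est_b: float, se_b: float) -> float:
    """Difference between two estimates over the root of the summed squared SEs."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical variance components and log-likelihoods, for illustration only.
print(repeatability(0.4, 0.6))
print(lrt_statistic(loglik_full=-100.0, loglik_reduced=-102.5))

# Inputs taken from the rounded values in Tables 1 and 2.
print(round(chi2_pvalue_df1(1.47), 3))   # hatching defense across years
print(round(chi2_pvalue_df1(7.75), 4))   # hatching defense within 2010
print(round(approx_t(0.37, 0.065, 0.29, 0.071), 2))  # handling aggression, sexes
```

Because the inputs are rounded table values, the results can differ slightly in the last digit from the statistics reported in the text.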

A behavioral syndrome concerns between-individual covariance across traits (Dingemanse et al. 2012; Brommer 2013; Dingemanse and Dochtermann 2013). Thus, only behavioral metrics that are repeatable can form a behavioral syndrome, as there cannot be between-individual covariance in the absence of between-individual variance. We thus constructed the above-described multivariate linear mixed model only for those metrics that were significantly repeatable. Our hypothesis that all behavioral metrics form a syndrome was tested by an LRT comparing the unconstrained multivariate mixed model to a model where all covariances were constrained to be zero. The LRT statistic was tested against a χ² distribution with the degrees of freedom equal to the number of constrained covariances.

In the analyses, the square root of the number of attacks was used as a metric for an individual’s nestling defense in order to approximate a normal distribution.

## RESULTS

### Univariate analysis of the behavior measured

Behaviors were measured in more than 350 individuals of each sex, although repeated measures were obtained for only about 30% of these individuals (detailed in Table 1). Variation in hatching defense was limited; most observations belonged to scores 2 and 3 (Supplementary Figure S1e), where the female faced the observer but did not attack. During the nestling defense assay, individuals typically (>70% of cases) attacked the observer at least once (Supplementary Figure S1a and b). The modal handling aggression score was 2 in females and 3 in males (Supplementary Figure S1c and d). Males had higher handling aggression scores than females (Table 1; Supplementary Figure S1).

Table 1

Descriptive statistics of the behavioral measures taken, sex-specific sample sizes and repeatability (R)

| Behavioral metric | Sex | Nind | Nobs/ind: 1 | 2 | 3 | 4+ | Mean ± SD | R ± SE |
|---|---|---|---|---|---|---|---|---|
| Hatching defense | Female | 449 | 325 | 73 | 41 | 10 | 2.64±0.75 | 0.078±0.066ᵃ |
| Nestling defense | Female | 377 | 268 | 67 | 36 | 6 | 1.83±1.43 | **0.38±0.067**ᵇ |
| Nestling defense | Male | 353 | 255 | 66 | 23 | 9 | 1.73±1.39 | **0.38±0.067**ᶜ |
| Handling aggression | Female | 392 | 276 | 75 | 33 | 8 | 2.77±1.11 | **0.37±0.065**ᵈ |
| Handling aggression | Male | 363 | 258 | 72 | 20 | 13 | 3.27±0.95 | **0.29±0.071**ᵉ |

One measure per individual per year was included. Hatching defense is displayed only by females and is scored on an interval scale of 1–4 (Supplementary Figure S1). Nestling defense is the square root of the number of attacks an individual makes during 2 min when defending its 16-day-old offspring. Handling aggression is scored on an interval scale of 1–5. For each sex, the number of individuals (Nind) is given, and “Nobs/ind” denotes how many individuals had 1, 2, 3, or 4 or more (4+) observations. The frequency distributions of the variables are plotted in Figure 1. Repeatability was tested by an LRT, which is presented below the table, and statistically significant repeatability is printed in bold.

ᵃ χ² = 1.47; P = 0.23.

ᵇ χ² = 26.55; P < 0.001.

ᶜ χ² = 30.77; P < 0.001.

ᵈ χ² = 28.88; P < 0.001.

ᵉ χ² = 15.86; P < 0.001.

A female’s hatching defense was not repeatable between years (Table 1). However, a female’s hatching defense was highly (22–46%) and significantly repeatable within years (Table 2). The repeatability of the intensity of nestling defense at day 16 was high and significant in both sexes. Handling aggression score was also repeatable (Table 1). Repeatability clearly did not differ between the sexes (Table 1, nestling defense: t100 = 0.012, P = 0.50; handling aggression: t100 = 0.84, P = 0.80).

Table 2

Annual repeatability (R) of hatching defense

| Year | Nobs | Nind | R ± SE | LRT | P |
|---|---|---|---|---|---|
| 2007 | 278 | 73 | 0.46±0.07 | 68.55 | <0.0001 |
| 2008 | 218 | 91 | 0.31±0.08 | 15.80 | 0.0001 |
| 2009 | 275 | 116 | 0.39±0.07 | 27.53 | <0.0001 |
| 2010 | 204 | 84 | 0.25±0.09 | 7.75 | 0.0054 |
| 2011 | 276 | 105 | 0.33±0.07 | 29.67 | <0.0001 |
| 2012 | 199 | 87 | 0.22±0.09 | 8.87 | 0.0029 |

Results are based on a linear mixed model. For each year, the number of observations (Nobs) of hatching defense and the number of individuals (Nind) are presented. In the analyses, fixed effects for “habituation” (how many times hatching defense has been scored on the same individual) and for whether some eggs had hatched or not were included. However, these fixed effects were never significant and are thus not reported. LRT is tested against a chi-square distribution with 1 degree of freedom and gives the significance of the repeatability.

Within the subset of individuals that attacked to defend their nestling, individuals that attacked more also came closer to the observer during their attacks (Female: rS = −0.67, n = 384, P < 0.0001; Male: rS = −0.58, n = 404, P < 0.0001). Parents who did not attack typically stayed >3 m from the observer. Hence, the number of attacks is likely to adequately capture the intensity and risk taking of parental nestling defense.

### Multivariate analysis of aspects of personality

Because hatching defense was not repeatable across years, we do not consider it an aspect of blue tit personality and therefore consider here only the potential relationship between nestling defense (square root of the number of attacks during 2 min) and handling aggression. The mixed model separated the between-individual (co)variances from the residual (co)variances because only the former properly describe the behavioral syndrome (Dingemanse et al. 2013).

Both handling aggression and nestling defense showed significant yearly differences and differences between observers (Table 3). For handling aggression, annual and between-observer differences were relatively small compared with the overall mean (effect ± SE: year ≤ −0.17±0.14, observer ≤ 0.06±0.12, Supplementary Table S1). For nestling defense, annual fluctuations and differences between observers were relatively strong compared with the overall mean (effect ± SE: year ≤ 0.73±0.14, observer ≤ 0.49±0.13, Supplementary Table S1). Yearling females and males did not differ significantly in their personality traits from individuals older than 1 year (Table 3). Contrary to our expectation, the individual-level correlation rI between handling aggression and nestling defense was significantly negative in females (Table 4; Figure 1c). In males, this correlation was not significant but showed a positive trend (Table 4; Figure 1d). Strikingly, the handling aggression–nestling defense relationship was absent on the phenotypic level in both sexes (Table 4; Figure 1a and b). A phenotypic correlation was absent despite a clear between-individual correlation because the phenotypic correlation combines the individual-level correlation and the residual correlation, and the latter was close to zero (Table 4; Figure 1e and f).
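The masking of the between-individual correlation at the phenotypic level can be checked numerically from the (co)variance components. The sketch below assumes, in line with the mixed model described in the Methods, that phenotypic variances and covariances are the sums of the between-individual and residual components; the input numbers are the rounded between-individual and residual estimates reported in Table 4, and the function names are hypothetical.

```python
import math


def correlation(cov: float, var_x: float, var_y: float) -> float:
    """Scale a covariance to a correlation: cov / sqrt(var_x * var_y)."""
    return cov / math.sqrt(var_x * var_y)


def phenotypic_correlation(v_i, cov_i, v_r, cov_r) -> float:
    """Phenotypic correlation from summed individual and residual components."""
    v_p_x = v_i[0] + v_r[0]
    v_p_y = v_i[1] + v_r[1]
    return correlation(cov_i + cov_r, v_p_x, v_p_y)


# (handling aggression, nestling defense) components, rounded as in Table 4.
females = dict(v_i=(0.45, 0.77), cov_i=-0.19, v_r=(0.68, 1.20), cov_r=0.078)
males = dict(v_i=(0.34, 0.69), cov_i=0.083, v_r=(0.51, 1.13), cov_r=-0.049)

for label, c in (("females", females), ("males", males)):
    r_i = correlation(c["cov_i"], *c["v_i"])   # between-individual level
    r_r = correlation(c["cov_r"], *c["v_r"])   # residual level
    r_p = phenotypic_correlation(**c)          # phenotypic level
    print(label, round(r_i, 2), round(r_r, 2), round(r_p, 2))
```

For females, a between-individual correlation of roughly −0.3 shrinks to a phenotypic correlation below 0.1 in absolute value, because most of the phenotypic variance is residual and the residual covariance is small and of opposite sign. Values computed from rounded inputs may differ slightly from the tabulated estimates.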

Table 3

The fixed effects and their statistical significance in univariate linear mixed models for female and male blue tit handling aggression and nestling defense (square root of the number of attacks during 2 min)

| Trait | Variable | Effect | SE | df | Wald χ² | P |
|---|---|---|---|---|---|---|
| Handling aggression: Females | Intercept | 3.12 | 0.15 | | 3067.88 | <0.0001 |
| | Year | −0.17 | 0.14 | | 49.81 | <0.0001 |
| | Age (*yearling*) | 0.05 | 0.09 | | 0.27 | 0.5299 |
| | Observer | 0.06 | 0.12 | | 22.83 | 0.0004 |
| Nestling defense: Females | Intercept | 1.48 | 0.19 | | 739.07 | <0.0001 |
| | Year | 0.73 | 0.20 | | 15.03 | 0.0102 |
| | Age (*yearling*) | 0.13 | 0.12 | | 1.34 | 0.2470 |
| | Observer | −0.49 | 0.13 | | 14.38 | 0.0008 |
| Handling aggression: Males | Intercept | 3.41 | 0.13 | | 5258.00 | <0.0001 |
| | Year | −0.14 | 0.14 | | 22.00 | <0.0001 |
| | Age (*yearling*) | 0.13 | 0.08 | | 3.73 | 0.0534 |
| | Observer | 0.04 | 0.11 | | 19.76 | 0.0014 |
| Nestling defense: Males | Intercept | 1.40 | 0.18 | | 645.24 | <0.0001 |
| | Year | 0.69 | 0.20 | | 26.35 | 0.0001 |
| | Age (*yearling*) | −0.17 | 0.12 | | 2.04 | 0.1535 |
| | Observer | −0.33 | 0.13 | | 10.27 | 0.0059 |

Significance of effects was tested with a Wald χ² test, and significant variables are printed in bold. Contrasts are provided in bracketed italics. The variable Age denotes the contrast of yearlings to those older than 1 year. All fixed-effect contrasts are reported in Supplementary Table S1.

Table 4

Random effect (co)variances from the bivariate mixed model on handling aggression (HA) and nestling defense (ND) of females and males, respectively

| Level | Trait | Females: HA | Females: ND | Males: HA | Males: ND |
|---|---|---|---|---|---|
| Phenotypic | HA | 1.45±0.084 | −0.078±0.046 | 0.85±0.057 | 0.027±0.048 |
| | ND | −0.12±0.069 | 1.97±0.13 | 0.033±0.059 | 1.82±0.12 |
| Individual | HA | 0.45±0.084 | **−0.33±0.14**ᵃ | 0.34±0.069 | 0.17±0.15ᵇ |
| | ND | −0.19±0.082 | 0.77±0.15 | 0.083±0.071 | 0.69±0.14 |
| Residual | HA | 0.68±0.071 | 0.087±0.077 | 0.51±0.057 | −0.065±0.08 |
| | ND | 0.078±0.070 | 1.20±0.13 | −0.049±0.061 | 1.13±0.13 |

The fixed effects of this model are as reported in Table 3 and Supplementary Table S1. Results are presented in matrix form with the variances on the diagonal, the covariances in the lower left-hand corner, and the correlations in the upper right-hand corner. All estimates are given with their standard error. The covariance structure is presented for the phenotypic, between-individual, and residual levels. An LRT was used to test whether the between-individual covariance/correlation was significant, and a significant correlation is printed in bold.

ᵃ LRT: χ² = 4.59, df = 1; P = 0.032.

ᵇ LRT: χ² = 0.99, df = 1; P = 0.3.
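
As a rough consistency check on Table 4, the between-individual correlations and the P values of the LRT footnotes can be recomputed from the tabulated estimates. The sketch below is illustrative only and is not the mixed-model analysis itself. The function names are hypothetical, the inputs are the rounded values from the table, and the P values assume a plain chi-square reference distribution with 1 df, as stated in the footnotes.

```python
import math


def correlation(cov, var_x, var_y):
    """Convert a covariance and two variances into a correlation."""
    return cov / math.sqrt(var_x * var_y)


def chi2_pvalue_df1(stat):
    """Upper-tail probability of a chi-square statistic with 1 df.

    For 1 df, P(X > stat) equals erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Between-individual level, rounded estimates from Table 4.
r_ind_females = correlation(-0.19, 0.45, 0.77)   # HA-ND covariance, HA and ND variances
r_ind_males = correlation(0.083, 0.34, 0.69)

# LRT statistics from the table footnotes.
p_females = chi2_pvalue_df1(4.59)
p_males = chi2_pvalue_df1(0.99)

print(f"females: r = {r_ind_females:.2f}, P = {p_females:.3f}")
print(f"males:   r = {r_ind_males:.2f}, P = {p_males:.3f}")
```

Because the inputs are rounded, the recomputed correlations match the tabulated ones only approximately. A negative covariance necessarily gives a negative correlation, so the female between-individual correlation is negative.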

Figure 1

Plots of the relationship between handling aggression and nestling defense (square root of the number of attacks during 2 min) in females and males on 3 different levels. These plots are for illustration only; the actual analysis is based on a hierarchical multivariate mixed model (see text). Panels a and b show the observed measures, representing covariance on the phenotypic level; a small random value was added to the handling aggression score to avoid data points being plotted on top of each other. Panels c and d illustrate the individual-level relationship using the mixed-model-derived individual-specific values i (Equation 1), estimated as Best Linear Unbiased Predictors to which the sex-specific intercept (Supplementary Table S2) was added. Lastly, panels e and f show the residual values, presenting the relationship on the residual level. Statistics from the multivariate mixed model are presented in Table 2. Sample sizes underlying these plots are described in Table 1.

## DISCUSSION

Parental defense of offspring is, by definition, a risky behavior, and the intensity of nest defense displayed by an individual is typically interpreted as a metric of the investment it is willing to make in its offspring (Montgomerie and Weatherhead 1988). Here, we show that the intensity with which a blue tit parent defends its nestlings close to the time of fledging is a highly (38%) repeatable trait in both sexes. This finding fits with previous work demonstrating that nest defense intensity can be repeatable and hence an aspect of avian personality (Duckworth 2006; Kontiainen et al. 2009). However, the intensity of a female’s nest defense is negatively associated with how aggressively she behaves when being handled, as quantified by our handling aggression score. In males, we find some evidence of the positive association between nest defense and handling aggression that we expected, but this correlation was not significant. We expected that repeatable aggressive behavior in 1 context (e.g., nest defense) would correlate positively with aggressive behavior and stress response in another context, such as handling, and that this syndrome would be the same in both sexes. Our study therefore underlines that one must be careful when generalizing aspects of personality across different contexts. In addition, our findings illustrate how the direction of a putative behavioral syndrome can be opposite in the 2 sexes. They thereby show the importance of analyzing personality in each sex separately and call for an investigation of sex-specific syndrome structure whenever possible.

The repeatability of nestling defense suggests that individuals differ in their propensity to expose themselves to risk. Nestling defense concerns the propensity of individuals to protect their current reproductive value (their offspring), and life-history theory predicts that individuals with a high residual (future) reproductive value should avoid risking themselves in defending their current offspring (Winkler 1987; Hakkarainen et al. 1994). Such life-history considerations are also pivotal to current theory on the existence of behavioral syndromes, because theory views syndromes as the outcome of a process in which individuals differ in their propensity to take risks as determined by their residual (future) reproductive value (Wolf et al. 2007). From this perspective, one could perhaps consider handling aggression as the behavior displayed when an individual is caught and thus risks its future reproductive value (assuming its offspring will survive in its absence). One possible interpretation, therefore, is that females with a high handling aggression score have a strong propensity to protect themselves and their future: they avoid the risk of harming themselves when their offspring are under threat (low nestling defense) but defend themselves vigorously when they are themselves under threat (high handling aggression). Conversely, other females have a propensity to emphasize their current offspring over their future value (high nestling defense and low handling aggression). Dissecting this line of argumentation further requires, however, careful manipulation of current versus future reproductive value (cf. Nicolaus et al. 2012). Although the handling aggression–nestling defense correlation is not statistically significant in males, its positive sign does suggest that the above verbal argumentation does not apply to male blue tits. Instead, it seems that male blue tits with an “aggressive personality” are aggressive both when defending their descendants (high nestling defense) and when defending themselves (high handling aggression). Possibly, intersex differences in behavioral syndrome structure arise because the 2 sexes weight current versus future reproductive value differently when taking risks.

The relationship between handling aggression and nestling defense was not apparent on all levels. In particular, there was no phenotypic correlation (i.e., correlation on the level of the measurements) between these behavioral traits in either sex, despite clear evidence of individual-level covariances. The correlation on the phenotypic level is an average of the individual- and residual-level correlations, weighted by the proportions of phenotypic variance explained by the individual- and residual-level variances (Dingemanse et al. 2012; Brommer 2013). In our case, the absence of any residual covariance masks the individual covariance when the correlation between handling aggression and nestling defense is studied on the phenotypic level. This scenario, where the phenotypic correlation underestimates the magnitude of the individual-level correlation, is likely to be typical because residuals are random deviations of the phenotypic (i.e., measured) value z from the individual’s intrinsic value i (Equation 1). One would therefore not a priori expect a covariance in the residuals of 2 independently measured behavioral traits (Brommer 2013). Most studies on animal personality quantify behavioral syndrome correlations on the phenotypic level, even though researchers agree that behavioral syndromes concern correlations between traits on the between-individual level. In principle, correlations on the phenotypic, between-individual, and residual levels need not align. Hence, phenotypic correlations may be misleading (Dingemanse et al. 2012). It has been argued that the data required to partition phenotypic covariances into hierarchically lower level components, such as between-individual and residual covariances, may be outside the reach of behavioral ecologists (Garamszegi and Herczeg 2012). Our study demonstrates that one way to amass sufficient repeated measures is to integrate field-based behavioral assays into a long-term, individual-based study protocol, provided that sufficient individuals survive and return to breed from 1 breeding season to the next.
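
The masking described above can be illustrated numerically with the estimates in Table 4. The sketch below assumes the standard decomposition of a phenotypic correlation, r_P = r_I·√(R_x·R_y) + r_e·√((1 − R_x)(1 − R_y)), where R_x and R_y are the repeatabilities of the 2 traits. The function names are hypothetical. Repeatabilities are computed here from the tabulated individual and residual variances, and the female between-individual correlation is taken as negative, in line with the negative covariance reported for females. It is a back-of-the-envelope check, not the mixed-model analysis itself.

```python
import math


def repeatability(var_individual, var_residual):
    """Proportion of variance attributable to between-individual differences."""
    return var_individual / (var_individual + var_residual)


def phenotypic_correlation(rep_x, rep_y, r_individual, r_residual):
    """Weighted combination of individual- and residual-level correlations."""
    individual_part = r_individual * math.sqrt(rep_x * rep_y)
    residual_part = r_residual * math.sqrt((1.0 - rep_x) * (1.0 - rep_y))
    return individual_part + residual_part


# Females: handling aggression (HA) and nestling defense (ND), Table 4 values.
rep_ha_f = repeatability(0.45, 0.68)
rep_nd_f = repeatability(0.77, 1.20)
r_p_females = phenotypic_correlation(rep_ha_f, rep_nd_f, -0.33, 0.087)

# Males.
rep_ha_m = repeatability(0.34, 0.51)
rep_nd_m = repeatability(0.69, 1.13)
r_p_males = phenotypic_correlation(rep_ha_m, rep_nd_m, 0.17, -0.065)

print(f"females: R_ND = {rep_nd_f:.2f}, implied r_P = {r_p_females:.3f}")
print(f"males:   R_ND = {rep_nd_m:.2f}, implied r_P = {r_p_males:.3f}")
```

With repeatabilities near 0.4, the individual-level correlation enters the phenotypic correlation with a weight of only about 0.4, whereas the near-zero residual correlation carries a weight of about 0.6. This is why the phenotypic correlations in Table 4 are close to zero in both sexes.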

One critical aspect of our method of quantifying parental nest defense intensity and linking this metric to a specific individual is the accuracy with which we can sex adult blue tits based on their appearance in the field. Although blue tits are sexually dimorphic, the variation within males and within females does overlap. We have intentionally largely refrained from putting conspicuous markings (e.g., color rings) on the birds in order to avoid any side effects of such markings on predation risk or mate choice. For the small number of marked individuals that were recorded during the nestling defense assay (n = 18), sexing during the assay was always correct. In addition, large inaccuracy in sexing is not consistent with our finding of significant repeatability in nest defense intensity. Field-based sexing of blue tits is relatively easy when both parents come close in order to attack and can thereby be compared under the same light conditions. When both parents stay at some distance and do not attack, the accuracy of sexing them is not important because both parents have the same nest defense intensity (zero attacks). The potential error in assigning sex to a particular parent, and thus in assigning nest defense intensity to a particular individual, is largest in those situations where 1 parent attacks aggressively whereas the other stays far away. Such cases, however, are relatively uncommon (personal observations).

A second aspect of our measure of nestling defense is that it assumes that female and male parents act independently, whereas they may be interacting. Nestling defense is an assay of the behaviors of 2 individuals taken at the same time, and the intensity of defense by 1 individual may be affected (positively or negatively) by how the other individual behaves. A powerful framework for the analysis of such reciprocal interactions exists (Moore et al. 1997; cf. Wilson et al. 2009), but we have at present insufficient data to address this possibility quantitatively. If present, these interactions would probably act to increase between-individual variance (and hence the repeatability) of nestling defense when not modeled because individuals have different partners and their behavior is hence differentially affected by their partners.

We find that the intensity of a female’s nest defense around the day her clutch hatches is not repeatable between years, although it is clearly repeatable within years. It is not uncommon for researchers to consider behavioral metrics that show repeatability within 1 year as aspects of an individual’s personality (Garamszegi et al. 2009). Our results, however, underline the importance of quantifying the interannual repeatability of behavior and thus of long-term individual-based studies. Such studies are needed to distinguish a personality trait from other behavioral variation, as is also generally recommended (Sih, Bell, Johnson, and Ziemba 2004b; Bell 2007; Réale et al. 2007). Strong within-year repeatability combined with an absence of between-year repeatability in the same behavior could signal that the behavior is determined by environmental conditions that change randomly from year to year but are relatively stable from day to day within each year. From a life-history perspective, such annual fluctuations could determine the female’s assets (reproductive value) for the breeding season and hence cause low interannual repeatability of personality (cf. Dammhahn 2012). In addition, an individual may simply repeat the behavior it happened to display the first time it was assayed, because it then knows from experience that this behavior was successful in avoiding predation (Knight and Temple 1986).

## CONCLUSIONS

Our results provide a worked case study of how a between-individual correlation between behaviors (i.e., a behavioral syndrome) can be masked on the phenotypic level by a low residual correlation. Standard statistical assumptions regarding the nature of residuals imply that such masking may be a common feature (Brommer 2013). Our findings hence urge careful interpretation of a low phenotypic correlation between multiple behaviors: a syndrome may be hidden. Whenever possible, repeated measures should thus be collected in order to partition out the between-individual (co)variances. We further find clear evidence that female aggression during handling forms a behavioral syndrome with the intensity of her nestling defense. This syndrome is characterized by a negative correlation between these behaviors. In males, this syndrome is not statistically significant and hence is either absent or, judging from the sign of the estimated correlation, oriented in the opposite direction compared with females. In general, intersexual differences in personality are poorly understood in animals, despite the potential importance of such heterogeneity for understanding the ecological and evolutionary dynamics of natural populations (Schuett et al. 2010). Males and females often have different life histories and hence intrinsic differences in current versus future reproductive values. In species with biparental care, intersexual conflict will exacerbate life-history differences (Kokko and Jennions 2008). As life-history differences between individuals underlie the formation of behavioral syndromes, intersex differences in the direction of syndromes could be a general phenomenon. Apart from empirical studies on this phenomenon, there is also a need to develop theory to pinpoint under which conditions and to what extent we would expect intersex differences in the direction and magnitude of the correlations that characterize behavioral syndromes. Our findings hence draw attention to the need to consider intersexual differences more explicitly in the general study of animal personality.

## SUPPLEMENTARY MATERIAL

Supplementary material can be found at http://www.beheco.oxfordjournals.org/.

## FUNDING

EU-ERASMUS fund (N.F.); Blue tit research was funded in part by the Academy of Finland (1131390 to J.E.B.).

We thank the land owners for permission to work on their property. Two anonymous reviewers and the associate editor are thanked for their constructive comments which improved the paper. Author contributions for design: J.E.B., E.K.; measures: J.E.B., E.K., and N.F.; analysis and writing: N.F. prepared a first data compilation and draft, J.E.B. carried out additional analysis and writing, E.K. commented. In addition to the authors, J. Kekkonen, L. Kurvinen, and M. de Heij contributed information through many hours of field work.

## REFERENCES

Bell AM. 2007. Future directions in behavioural syndromes research. Proc Roy Soc B. 274:755–761.

Bell AM, Hankison SJ, Laskowski KL. 2009. The repeatability of behaviour: a meta-analysis. Anim Behav. 77:771–783.

Brommer JE. 2013. On between-individual and residual (co)variances in the study of animal personality: are you willing to make the individual gambit? Behav Ecol Sociobiol. 67:1027–1032.

Brommer JE, Kluen E. 2012. Exploring the genetics of nestling personality traits in a wild passerine bird: testing the phenotypic gambit. Ecol Evol. 2:3032–3044.

Butler DG, Cullis BR, Gilmour AR, Gogel BJ. 2009. ASReml-R Reference manual version 3.0. State of Queensland, Australia: Department of Primary Industries and Fisheries.

Carere C, Groothuis TG, Möstl E, Daan S, Koolhaas JM. 2003. Fecal corticosteroids in a territorial bird selected for different personalities: daily rhythm and the response to social stress. Horm Behav. 43:540–548.

Carere C, van Oers K. 2004. Shy and bold great tits (Parus major): body temperature and breath rate in response to handling stress. Physiol Behav. 82:905–912.

Cockrem JF. 2007. Stress, corticosterone responses and avian personalities. J Ornithol. 148:169–178.

Cockrem JF, Silverin B. 2002. Sight of a predator can stimulate a corticosterone response in the great tit (Parus major). Gen Comp Endocrinol. 125:248–255.

Dammhahn M. 2012. Are personality differences in a small iteroparous mammal maintained by a life-history trade-off? Proc Roy Soc B. 279:2645–2651.

Dingemanse NJ, Dochtermann NA. 2013. Quantifying individual variation in behaviour: mixed-effect modelling approaches. J Anim Ecol. 82:39–54.

Dingemanse NJ, Dochtermann NA, Nakagawa S. 2012. Defining behavioural syndromes and the role of ‘syndrome deviation’ in understanding their evolution. Behav Ecol Sociobiol. 66:1543–1548.

Dingemanse NJ, Réale D. 2005. Natural selection and animal personality. Behaviour. 142:1165–1190.

Duckworth RA. 2006. Behavioral correlations across breeding contexts provide a mechanism for a cost of aggression. Behav Ecol. 17:1011–1019.

Garamszegi LZ, Eens M, Török J. 2009. Behavioural syndromes and trappability in free-living collared flycatchers Ficedula albicollis. Anim Behav. 77:803–812.

Garamszegi LZ, Herczeg G. 2012. Behavioral syndromes, syndrome deviation and the within- and between-individual components of phenotypic correlations: when reality does not meet statistics. Behav Ecol Sociobiol. 66:1651–1658.

Gosling SD. 2001. From mice to men: what can we learn about personality from animal research? Psychol Bull. 127:45–86.

Hakkarainen H, Korpimäki E. 1994. Nest defence of Tengmalm’s owls reflects offspring survival prospects under fluctuating food conditions. Anim Behav. 48:843–849.

Hollander FA, van Overveld T, Tokka I, Matthysen E. 2008. Personality and nest defence in the great tit (Parus major). Ethology. 114:405–412.

Kluen E, de Heij ME, Brommer JE. 2011. Adjusting the timing of hatching to changing environmental conditions has fitness costs in blue tits. Behav Ecol Sociobiol. 65:2091–2103.

Kluen E, Siitari H, Brommer JE. 2014. Testing for between-individual correlations of personality and physiological traits in a wild bird. Behav Ecol Sociobiol. 68:205–213.

Knight RL, Temple SA. 1986. Why does intensity of avian nest defense increase during the nestling cycle? The Auk. 103:318–327.

Kokko H, Jennions MD. 2008. Parental investment, sexual selection and sex ratios. J Evol Biol. 21:919–948.

Kontiainen P, Pietiäinen H, Huttunen K, Karell P, Kolunen H, Brommer JE. 2009. Aggressive Ural owl mothers recruit more offspring. Behav Ecol. 20:789–796.

Koolhaas JM, Korte SM, De Boer SF, Van Der Vegt BJ, Van Reenen CG, Hopster H, De Jong IC, Ruis MA, Blokhuis HJ. 1999. Coping styles in animals: current status in behavior and stress-physiology. Neurosci Biobehav Rev. 23:925–935.

Lessells CM, McNamara JM. 2012. Sexual conflict over parental investment in repeated bouts: negotiation reduces overall care. Proc Biol Sci. 279:1506–1514.

Lynch M, Walsh B. 1998. Genetics and analysis of quantitative traits. Sunderland (MA): Sinauer.

Montgomerie RD, Weatherhead PJ. 1988. Risks and rewards of nest defence by parent birds. Q Rev Biol. 63:167–187.

Moore AJ, Brodie ED, Wolf JB. 1997. Interacting phenotypes and the evolutionary process: I. direct and indirect genetic effects of social interactions. Evolution. 51:1352–1362.

Nakagawa S, Schielzeth H. 2010. Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biol Rev. 85:935–956.

Nicolaus M, Tinbergen JM, Bouwman KM, Michler SP, Ubels R, Both C, Kempenaers B, Dingemanse NJ. 2012. Experimental evidence for adaptive personalities in a wild passerine bird. Proc Biol Sci. 279:4885–4892.

Pruitt JN, Riechert SE. 2009. Sex matters: sexually dimorphic fitness consequences of a behavioural syndrome. Anim Behav. 78:175–181.

R Development Core Team. 2012. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Réale D, Reader SM, Sol D, McDougall PT, Dingemanse NJ. 2007. Integrating animal temperament within ecology and evolution. Biol Rev. 82:291–318.

Rytkönen S. 2002. Nest defence in great tits Parus major: support for parental investment theory. Behav Ecol Sociobiol. 52:379–384.

Schuett W, Tregenza T, Dall SRX. 2010. Sexual selection and animal personality. Biol Rev. 85:217–246.

Schürch R, Heg D. 2010. Life history and behavioral type in the highly social cichlid Neolamprologus pulcher. Behav Ecol. 21:588–598.

Sih A, Bell A, Johnson JC. 2004a. Behavioral syndromes: an ecological and evolutionary overview. Trends Ecol Evol. 19:372–378.

Sih A, Bell A, Johnson JC, Ziemba RE. 2004b. Behavioral syndrome: an integrative overview. Q Rev Biol. 79:241–277.

Silverin B. 1998. Behavioural and hormonal responses of the pied flycatcher to environmental stressors. Anim Behav. 55:1411–1420.

Sprenger D, Dingemanse NJ, Dochtermann NA, Theobald J, Walker SP. 2012. Aggressive females become aggressive males in a sex-changing reef fish. Ecol Lett. 15:986–992.

Stamps JA, Groothuis TGG. 2010. Ontogeny of animal personality: relevance, concepts and perspectives. Biol Rev. 85:301–325.

Svensson L. 1992. Identification guide to European passerines. Stockholm: L. Svensson (privately published).

Wilson AJ, Gelin U, Perron MC, Réale D. 2009. Indirect genetic effects and the evolution of aggression in a vertebrate system. Proc Biol Sci. 276:533–541.

Wilson DS. 1998. Adaptive individual differences within single populations. Phil Trans R Soc B. 353:199–205.

Wingfield JC, Hunt K, Breuner C, Dunlap K, Fowler GS, Freed L, Lepson J. 1997. Environmental stress, field endocrinology, and conservation biology. In: Clemmons JR, Buchholz R, editors. Behavioral approaches to conservation in the wild. London (UK): University Press. p. 95–131.

Wingfield JC, Sapolsky RM. 2003. Reproduction and resistance to stress: when and how. J Neuroendocrin. 15:711–714.

Winkler DW. 1987. A general model for parental care. Am Nat. 130:526–543.

Wolf M, van Doorn GS, Weissing FJ. 2008. Evolutionary emergence of responsive and unresponsive personalities. Proc Natl Acad Sci USA. 105:15825–15830.

Zar JH. 1999. Biostatistical analysis. 4th ed. London (UK): Prentice-Hall International.

## Author notes

Handling editor: Wolfgang Forstmeier