Specialized visual learning of facial signals of quality in the paper wasp, Polistes dominula

Some primates and one species of paper wasp recognize faces using specific processing strategies to extract individual identity information from conspecific faces. Explanations for the evolution of face specialization typically focus on the complexity associated with individual recognition because all currently identified species with face specialization use faces for individual recognition. In the present study, we show an independent evolution of face specialization in a paper wasp species with facial patterns that signal quality rather than individual identity. Quality signals are simpler to process than individual identity signals because quality signals do not require simultaneous integration across multiple stimuli or learning and memory. Therefore, the results of the present study suggest that the complexity of processing may not be the key factor favouring the evolution of specialization. Instead, the predictable location of socially important signals relative to other anatomical features may allow easy categorization of features, thereby favouring specialized visual processing. Given that visual quality signals are found in many taxa, specific-processing mechanisms for social signals may be widespread. © 2014 The Linnean Society of London, Biological Journal of the Linnean Society, 2014, 113, 992–997.


INTRODUCTION
Communication requires coordination between information production by senders and reception by receivers (Bradbury & Vehrencamp, 2011). It is well documented that the signal form is strongly shaped by receiver sensory and cognitive abilities such that signals evolve to be conspicuous and detectable by receivers (Endler et al., 2005). However, less is known about the evolution of signal reception. Theory predicts that signal reception should evolve to efficiently extract information from sender phenotypes (Enquist & Arak, 1993), although little empirical work has examined which aspects of social stimuli influence receiver cognition. In particular, it is unclear whether receivers commonly possess cognitive adaptations for processing signals and what aspects of signals influence the evolution of receiver processing.
One of the best-studied examples of signals shaping receiver cognition is the specialized processing associated with individual recognition of faces (Parr, 2011; Sheehan & Tibbetts, 2011). Although frequently connected in the literature, individual recognition and face-learning are two distinct phenomena. Individual recognition refers to an ability to remember particular individuals based on their unique phenotypes (Tibbetts, Sheehan & Dale, 2008). Specialized face-learning refers to the phenomenon where differences between images of faces (e.g. individual identity, emotions, etc.) are processed using cognitive mechanisms distinct from those used in general pattern recognition (Yovel & Kanwisher, 2004). In both primates and wasps, conspecifics learn individual identity via specialized, face-selective cognitive mechanisms.
A result of face specialization is that the alteration of natural face-images (e.g. turning faces upside down) leads to reduced discrimination, even when the manipulations do not alter the information in the images. For example, the paper wasp Polistes fuscatus learns to discriminate images of conspecific faces faster than simple patterns or other natural images (Sheehan & Tibbetts, 2011), suggesting that this species is particularly attuned to differences in conspecific faces. Interestingly, slight alterations to normal face-images (i.e. removal of the antennae) lead to decreased rates of learning, showing that discrimination of the face-images depends on detecting a face within the image (Sheehan & Tibbetts, 2011). The convergent evolution of face-specific visual learning mechanisms across disparate taxa is striking, although the factors that favour face specialization are poorly understood.
The foremost hypothesis for the evolution of face-specific learning is that specialization is favoured because learning individual identity from faces is such a complex task (Leopold & Rhodes, 2010; Parr, 2011). Individual recognition of faces is complex because observers must attend to and integrate information across multiple features to extract the relevant information (e.g. the second-order configuration of facial features in humans) (Parr, 2011). In addition, learning and remembering many unique individuals is considered to be cognitively taxing (Tibbetts & Dale, 2007). The convergent evolution of specialized mechanisms for the individual recognition of faces in primates and wasps is consistent with the complexity hypothesis because individual recognition in both taxa requires integration of information from multiple stimuli and flexible learning and memory. Additionally, there are no known examples of face specialization in taxa that lack individual recognition. Therefore, the demands of processing and remembering complex identity information could favour the evolution of dedicated processing solutions.
The predictability of signalling stimuli has also been hypothesized to favour specialized learning abilities (Gould & Marler, 1984). This hypothesis was originally developed in the context of avian imprinting (Bateson, 1966) because imprinting is based on specific configurations of adult features that are predictable across generations. The predictable arrangement of certain features facilitates categorization and visual search (Avarguès-Weber et al., 2010) because individuals focus on the most informative visual features (Wolfe, 1994; Yang & Zelinsky, 2009). Accordingly, stimuli that maintain a predictable configuration over evolutionary time may favour the evolution of learning mechanisms that are specialized for extracting information embedded within the expected stimulus configuration (Bateson, 1966).
Faces have a common structure, and so the location of variation is predictable, which may favour the evolution of processing mechanisms that are attuned to the predicted stimulus arrangements. This could give rise to improved learning of features showing normal facial configurations in primates (Pascalis & Kelly, 2009) and wasps (Sheehan & Tibbetts, 2011). Although categorization and specialization are related, it is important to note that they are distinct. Many stimuli that are easily categorized are not learned via specialized mechanisms but rather use general visual processing mechanisms (e.g. animals can readily categorize novel classes of stimuli) (Wu et al., 2013). The predictability hypothesis posits that specialized cognitive mechanisms will arise to process stimuli that are predictable and straightforward to categorize if the information present in the stimuli is socially or sexually important.
The complexity and predictability hypotheses can be distinguished by testing for face specialization in species that differentiate among conspecifics but lack individual recognition. This scenario occurs in some animals with quality signals, where variation in a particular trait signals information about the bearer's quality but not their individual identity (Bradbury & Vehrencamp, 2011).
If signal complexity favours specialized visual learning, quality signals are not expected to be learned in a specialized manner because quality signals are much less complex than identity signals. Individual identity traits are relatively complex because they are composed of multiple traits that vary independently (e.g. eyes, nose) and must be integrated for recognition. By contrast, quality signals vary along a single, continuous axis (Dale, 2006) (Fig. 1), and so a single template can be used to assess all individuals in a population. Furthermore, quality signals do not require learning and memory, whereas individual recognition depends on learning and memory.
If signal predictability favours specialized visual learning, quality signals are predicted to be learned in a specialized manner because quality-signalling stimulus variation occurs within signals that have predictable bounds relative to the overall structure of the face. The informative features of both quality and identity signals are predictable. For example, in Polistes wasps, variation is constrained to certain facial areas and there is a predictable range of possible variants (Sheehan & Tibbetts, 2010).
In the present study, we use the paper wasp P. dominula to test the complexity and predictability hypotheses. Polistes dominula has a well-studied quality signal that consists of variation in the black markings on the yellow clypeus (Tibbetts & Dale, 2004). Although there is abundant evidence showing that P. dominula differentiates among conspecifics based on black facial markings that signal high versus low agonistic ability (Tibbetts & Lindsay, 2008; Tibbetts & Izzo, 2010), experimental tests have shown that P. dominula does not recognize individuals. Unlike P. fuscatus, which learns and remembers the identity of conspecifics, P. dominula shows no evidence of learning the identity of individuals with whom it has previously interacted (Sheehan & Tibbetts, 2010).
We test whether P. dominula use face-specific mechanisms to distinguish among individuals based on variation in their quality signals by comparing wasps' abilities to learn pictures of normal faces and faces that are experimentally altered by digitally removing the antennae (Fig. 2A). In our previous work, we demonstrated face-specific learning of individual identity signals in P. fuscatus using the same method (Sheehan & Tibbetts, 2011). Comparing learning of normal and altered face-images provides a particularly good test of face specialization because manipulated faces are composed of the same colours and patterns as normal faces, although alteration may prevent the perceptual system from identifying the stimuli as a face (Chittka & Dyer, 2012).
If the complexity of the discrimination task is the main driver of the evolution of specialized face learning, face specialization will only occur in species that extract complex information from faces. Therefore, P. dominula is not expected to use face-specific mechanisms to differentiate between simple quality signals, and wasps should learn to discriminate between pairs of normal faces and pairs of antennae-less faces equally well. Alternatively, if the predictable location of signal information favours face specialization, we expect that P. dominula will exhibit specialized face learning because the P. dominula quality signal occurs in a predictable location, the centre of the clypeus.

MATERIAL AND METHODS
Training procedures for P. dominula followed those previously described for P. fuscatus and P. metricus (Sheehan & Tibbetts, 2011; Tibbetts & Sheehan, 2013). Wasps were trained in a negatively reinforced T-shaped maze, with one arm leading to a reward (in this case, the absence of an electric shock). All wasps used in training were wild-caught foundresses captured near Ann Arbor, Michigan, in the early stage of the nesting cycle. We trained 12 wasps to discriminate pairs of normal faces and 12 wasps to discriminate pairs of antennae-less faces, although one wasp trained on normal faces escaped in the middle of training and so was excluded from the analysis. We used three pairs of unmanipulated face images and the manipulated versions of the same face images as our stimuli (Fig. 2A). Each wasp was exposed to one image pair during training. Importantly, removing the antennae does not alter the appearance of the quality-signalling black marks on the face of P. dominula. The correct image and the location of the non-electrified portion of floor switched from right to left in a pre-determined pseudo-random order. Wasps could learn the location of the non-electrified portion of the maze by the location of the correct image within the maze. Each wasp was trained to distinguish between a single pair of images over the course of 50 consecutive trials. A wasp made a 'choice' when it entered one of the arms of the maze. All wasps were used for one training session on either normal or antennae-less faces and were naïve to our maze set-up prior to testing.
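The pre-determined pseudo-random side order described above can be illustrated with a short sketch. This is a hypothetical example in Python, not the schedule used in the study: the function name side_schedule, the fixed seed, the equal number of left and right trials, and the cap on consecutive repeats of the same side are all assumptions introduced here for illustration.

```python
import random


def side_schedule(n_trials=50, max_run=3, seed=0):
    """Return a list of 'L'/'R' giving the side of the correct image per trial.

    Assumptions (not stated in the paper): sides are balanced across the
    session and the same side never repeats more than `max_run` times in a row.
    """
    if n_trials % 2:
        raise ValueError("n_trials must be even for a balanced schedule")
    rng = random.Random(seed)  # fixed seed, so the order is pre-determined
    sides = ["L"] * (n_trials // 2) + ["R"] * (n_trials // 2)
    while True:
        rng.shuffle(sides)
        # Find the longest run of identical consecutive sides.
        longest = run = 1
        for prev, cur in zip(sides, sides[1:]):
            run = run + 1 if cur == prev else 1
            longest = max(longest, run)
        if longest <= max_run:  # accept only schedules without long runs
            return list(sides)


if __name__ == "__main__":
    print("".join(side_schedule()))
```

Fixing the seed makes the sequence reproducible across wasps, and limiting run length is one common way to discourage a simple side preference from being rewarded; whether the original schedule applied such constraints is not stated.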
We examined rates of learning using a binomial logistic regression, which accounts for the rate of change in the number of correct choices over the course of 50 trials (Hartz, Ben-Shahar & Tyler, 2001). The dependent variable was whether or not wasps made a correct choice. The independent variables were trial and the interaction between trial and image type (normal or antennae-less faces) (Fig. 2A). The variable of critical importance in testing differences in learning is the interaction term (Hartz et al., 2001). We also considered the number of individuals that reached a specified criterion within a given block of ten trials, as has been done in other studies (Kendrick et al., 1996). We set our criterion as 70% correct or greater.
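The block criterion can be made concrete with a small sketch. The block size of ten trials and the 70% threshold come from the text; the function name first_block_at_criterion and the coding of choices as 1 (correct) or 0 (incorrect) are illustrative assumptions, and the example data are invented purely to show the calculation.

```python
def first_block_at_criterion(choices, block_size=10, criterion=0.7):
    """Score one wasp's session against the block criterion.

    `choices` is a list of 1 (correct) / 0 (incorrect), one entry per trial,
    in trial order. Returns (proportions, first_block), where `proportions`
    is the proportion correct in each block and `first_block` is the 1-based
    index of the first block at or above `criterion` (None if never reached).
    """
    if len(choices) % block_size:
        raise ValueError("number of trials must be a multiple of block_size")
    proportions = [
        sum(choices[i:i + block_size]) / block_size
        for i in range(0, len(choices), block_size)
    ]
    first_block = next(
        (b for b, p in enumerate(proportions, start=1) if p >= criterion),
        None,
    )
    return proportions, first_block


if __name__ == "__main__":
    # Made-up session of 50 trials: 5, 7, 6, 9 and 10 correct per block.
    session = ([1] * 5 + [0] * 5 + [1] * 7 + [0] * 3 + [1] * 6 + [0] * 4
               + [1] * 9 + [0] * 1 + [1] * 10)
    print(first_block_at_criterion(session))
```

Counting, for each image type, how many wasps have a first_block at or before a given block gives the kind of cumulative tally summarized in Fig. 2C.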

RESULTS
Consistent with face-specific learning of a quality signal, P. dominula wasps learned normal conspecific faces far more rapidly and accurately than the same images without antennae (Fig. 2B) (generalized estimating equation, image type × trial: Wald χ² = 40.7, d.f. = 1, P < 0.0001). By the second block of trials, over half of the wasps trained to discriminate normal face images had reached criterion (Fig. 2C), whereas none of the wasps trained to discriminate antennae-less face images reached criterion.

DISCUSSION
Discriminating among conspecifics based on their quality signals depends on face-specific learning mechanisms in P. dominula because wasps trained to distinguish normal faces learned these faces faster and more accurately than wasps trained to distinguish faces without antennae. Although antennae are not variable across individuals, they likely provide an essential cue that facilitates face discrimination. Without this cue, P. dominula apparently do not register images as faces and have difficulty learning to discriminate between them.

The clear difference in learning abilities when wasps were shown the same images with and without antennae provides striking evidence for specialized visual learning in P. dominula (Chittka & Dyer, 2012). Similar deficits in learning altered faces occur in primates, where inverting faces leads to decreased performance in discrimination tasks (Yovel & Kanwisher, 2004; Adachi, Chou & Hampton, 2009). In both P. dominula and P. fuscatus wasps, it appears that images are first categorized as faces based on the presence of features common to all conspecific faces, such as antennae. Then, the quality or identity information present on the faces is assessed. In both species, image discrimination is reduced when the face is altered (Sheehan & Tibbetts, 2011; present study).
Face learning in P. dominula and P. fuscatus likely reflects independent origins of face-specific learning. The two species' signals arose independently and the common ancestor of Polistes likely lacked either type of signal (Tibbetts, 2004). Furthermore, other species, such as P. metricus, which lack signals, also lack specialized face learning (Sheehan & Tibbetts, 2011).
The occurrence of face-specific processing of a quality signal suggests that the predictable placement of social information rather than the complexity of information processing favours the evolution of specialized visual learning. The cognitive demands of assessing quality signals are much lower than those of recognizing individuals (Dale, 2006), yet specialized face learning is found in species with both signal types. This suggests that the complexity of information processing is not the major driver of face specialization. Although quality and identity signals differ in processing complexity, they share an important trait: both occur in a fixed, predictable location relative to other facial features within a species (Fig. 1).
The results of the present study suggest that the common first-order configuration of relevant facial information across conspecifics is the crucial feature favouring specialized learning. Indeed, previous studies have shown that other insects such as honeybees can use the configuration of features to discriminate among human face-like stimuli (Avarguès-Weber et al., 2010). Thus, it is not surprising that insects use the configuration of conspecific features to process social information. Because the quality signal is a singular feature, processing the information present in the quality signal per se does not require configural processing. Nevertheless, P. dominula wasps appear to be sensitive to the configuration of normal wasp facial features when discriminating between faces. The configuration of facial features in P. dominula likely provides species information, which may prime the wasps for locating the quality-signalling stimulus. The predictability of informative stimuli relative to other fixed body features may facilitate the evolution of specialized cognitive subroutines and efficient extraction of important social information.
One interesting aspect of the results is that P. dominula performed so poorly on antennae-less faces (Fig. 2). A similar treatment reduced rates of face-learning in P. fuscatus, although P. fuscatus learned the antennae-less face images, whereas P. dominula did not (Sheehan & Tibbetts, 2011). One possible explanation for why the two species show different magnitudes of response to antennae removal is that the facial pattern differences are much more subtle in P. dominula than in P. fuscatus. Perhaps P. dominula are able to better locate and attend to the relatively minor differences in black clypeus patterns by first categorizing the images as faces. Indeed, a recent comparative analysis of visual abilities across Polistes suggests that differentiating between small colour markings may be a challenging task for paper wasps (Sheehan, Jinn & Tibbetts, 2014).
The results of the present study indicate that specialized processing may be much more widespread than commonly considered because specialization plays an important role in information extraction from quality signals as well as individual identity signals. Visual displays are widespread across many animal taxa and are important in both social and sexual contexts (Baird, 2013; Tyers & Turner, 2013; Gluckman, 2014). Our results suggest that specialized visual learning mechanisms for efficiently extracting relevant information from displays may be similarly widespread. Although our work has focused on facial signals, it is plausible that other animals may use specialized mechanisms to learn other features, such as tails, wings, dewlaps, etc., depending on the location of relevant stimuli. Research on additional taxa will shed light on the evolution of cognitive strategies for extracting important social information.