Virtual simulated environments provide multiple ways of testing cognitive function and evaluating problem solving with humans (e.g., Woollett et al. 2009). The use of such interactive technology has increasingly become an essential part of modern life (e.g., autonomously driving vehicles, global positioning systems (GPS), and touchscreen computers; Chinn and Fairlie 2007; Brown 2011). While many nonhuman animals have their own forms of "technology", such as chimpanzees who create and use tools, in captive animal environments the opportunity to actively participate with interactive technology is not often made available. Exceptions can be found in some state-of-the-art zoos and laboratory facilities (e.g., Mallavarapu and Kuhar 2005). When interactive technology is available, captive animals often selectively choose to engage with it. This enhances the animal’s sense of control over their immediate surroundings (e.g., Clay et al. 2011; Ackerman 2012). Such self-efficacy may help to fulfill basic requirements in a species’ daily activities using problem solving that can involve foraging and other goal-oriented behaviors. It also assists in fulfilling the strong underlying motivation for contrafreeloading and exploration expressed behaviorally by many species in captivity (Young 1999). Moreover, being able to present nonhuman primates virtual reality environments under experimental conditions provides the opportunity to gain insight into their navigational abilities and spatial cognition. It allows for insight into the generation and application of internal mental representations of landmarks and environments under multiple conditions (e.g., small- and large-scale space) and subsequent spatial behavior. This paper reviews methods using virtual reality developed to investigate the spatial cognitive abilities of nonhuman primates, and great apes in particular, in comparison with that of humans of multiple age groups. 
We make recommendations about training, best practices, and also pitfalls to avoid.
Exploration of novel environments can be highly rewarding, even when presented virtually (Polizzi di Sorrentino et al. 2014). In some cases, nonhuman animal exploration of the technology is of equal value in its novelty, and can also be enriching. Under controlled experimental conditions, we can present interactive virtually realistic ‘built’, ‘naturalistic’, and other types of environments to explore through technology. These provide a multitude of ways to test cognitive processes and problem solving, most typically applied in evaluating humans (e.g., Woollett et al. 2009). Interactive virtual technology, such as autonomously driving vehicles, global positioning systems (GPS), touchscreen computers, and virtual reality, has increasingly become an essential part of modern life (Chinn and Fairlie 2007; Brown 2011).
Opportunities to utilize interactive technology are not as widely available for captive animals despite the abundant potential that exists in interactive systems on iPads, as well as virtual reality (Ackerman 2012; Perdue et al. 2012; Dolins et al. 2014). While many nonhuman animals, and in particular some nonhuman primate species, are notable for being able to create and use tools (e.g., chimpanzees, bonobos, orangutans, and capuchins; Whiten et al. 2005; Fragaszy et al. 2004; Visalberghi et al. 1995), in most captive animal environments interactive technology is not provided. There are a few exceptions in some state-of-the-art zoos and laboratory facilities (e.g., Mallavarapu and Kuhar 2005; Martin et al. 2014; for films created to entertain captive primates, see details about the visual media art at http://rachelmayeri.com/about/). In a few zoo settings, interactive technology is used for testing cognitive and perceptual faculties of captive chimpanzees and other primate species (e.g., Edinburgh Zoo, Edinburgh, Scotland; Lincoln Park Zoo, Chicago, Illinois; and the National Zoo, Washington, D.C.). When interactive technology is made available, it may enhance the animal’s sense of control over, and engagement with, their immediate surroundings (e.g., Clay et al. 2011; Sambrook and Buchanan-Smith 1997). It also helps to fulfill the requirements of a species’ daily activity budget, in which problem solving typically involves foraging behaviors. As such, it creates a parallel in fulfilling the underlying need for contrafreeloading and exploration expressed behaviorally by many species in captivity (Young 1999; 2003).
The authors of this article make a strong recommendation that more facilities, laboratories and zoos, provide interactive technology for captive animals. Even rats have been tested using joysticks in virtual reality (Doucet et al. 2016). If species as biologically diverse from each other as chimpanzees and rats can manipulate joysticks and touchscreens (e.g., Aronov and Tank, 2014), so too could many other species in a multitude of settings. For example, captive pachyderms, cetaceans, Psittaciformes (parrots and cockatoos), and corvids may experience environmental enrichment and cognitive stimulation by interacting with technology that presents virtual environments and virtual social counterparts.
Perceived and actual control over some aspects of one’s own environment is considered beneficial for the subjective well-being of humans (Bandura 1993; Owusu-Ansah 2008) and equally so for captive nonhuman animals, in particular nonhuman primates (Badihi et al. 2007; Buchanan-Smith 2010). For example, the opportunity to use interactive technology with which to explore an environment was given to Kanzi, a bonobo at the Ape Cognition and Conservation Initiative facility in Des Moines, Iowa. Efforts were made to provide a mobile interactive device with which Kanzi could explore sections of the facility where he lives that he cannot access, and sometimes cannot even see: the “human space” (space limited to only the humans) as opposed to the “ape space” (space where the apes live). While this is not using virtual reality as described in this article, it has important implications for how such interactive technology can be used and applied to address both theoretical research questions and the psychological well-being and welfare of captive primates.
In this article, we present the potential positive outcomes and reasons for using interactive virtual reality in testing nonhuman primates in captivity. We discuss the possible difficulties in presenting virtual reality to nonhuman animals and pitfalls to avoid when using this methodology. Then, as an example, we present the summary of a study that employed virtual reality testing of nonhuman primates in comparison with that of humans of varying age groups.
Issues of Translating Perception of Two-Dimension to Three-Dimension in Virtual Reality
In virtual environments, when navigating between locations, an individual must be able to discriminate what is meant to be perceived as three-dimensional (3D) space from the actual two-dimensional (2D) presentation. Motion parallax is used to increase depth perception and is one of several depth cues used in visual perception (Nawrot and Stroyan 2009). Motion parallax occurs when we move our heads or bodies and nearby objects appear to move more rapidly across our field of vision than distant objects. A cat’s side-to-side head motion before pouncing is an example of motion parallax: it provides the cat with additional depth cues to ensure accuracy prior to the leap forward.
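The relationship between distance and apparent motion can be illustrated with a small worked example. The sketch below is not taken from the study; it assumes the simplest geometric case, in which an observer translates sideways at constant speed and a stationary object lies directly perpendicular to the direction of motion, so that its angular speed is approximately the observer's speed divided by the object's distance. The function name and the example values are hypothetical.

```python
import math


def angular_speed_deg_per_s(observer_speed_m_s: float, distance_m: float) -> float:
    """Angular speed (deg/s) of a stationary object seen by a moving observer.

    Simplifying assumption: the object lies perpendicular to the observer's
    direction of travel, so angular speed = speed / distance (in radians/s).
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return math.degrees(observer_speed_m_s / distance_m)


# A head movement of 0.1 m/s, with one object 0.5 m away and another 5 m away.
near = angular_speed_deg_per_s(0.1, 0.5)
far = angular_speed_deg_per_s(0.1, 5.0)
print(f"near object: {near:.2f} deg/s, far object: {far:.2f} deg/s")
```

Under this assumption, the object ten times closer sweeps across the visual field ten times faster, which is the cue the visual system can exploit to judge relative depth.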
Human, chimpanzee, and bonobo visual perception is very similar (Matsuzawa 1990; Fagot and Tomonaga 1999; Kano and Tomonaga 2009). Interestingly, the four chimpanzees we tested on the interactive virtual reality platform (Dolins et al. 2014) showed what appeared to be motion parallax, by rocking back and forth during testing trials (A. Cowey, personal communication). Unlike the chimpanzees, Kanzi, a bonobo, did not rock back and forth but did increase saccadic eye movements, very similar to those of humans tested on the same virtual reality platform.
Point of view (POV) is limited in virtual reality compared to real, natural environments but not necessarily compared to “built” environments. However, the distortions of close-up visual images in virtual reality can limit the ecological validity of the situation and problem-solving context. We tested human and nonhuman ape participants using a large, flat-screen monitor, which did not present an immersive experience of the virtual environment. In this version of virtual reality, the 2D visual stimuli appear as 3D. Immersive environments may more realistically convey such primary depth cues as stereopsis, motion parallax, and occlusion through the presentation of separate offset images to the left and right eye using either projection systems or headsets. Stereopsis is the perception of depth and three-dimensional structure that arises when the brain combines the two retinal images. While it is true that conventional single-image 3D displays cannot achieve true stereopsis, neither motion parallax nor occlusion depends on binocular vision, as is easily seen by shutting one eye and moving one’s head from left to right. As Doucet et al. (2016) wrote about testing nonhuman animal navigation in virtual environments, “flat computer monitors are sufficient for animals with frontally positioned eyes and stereovision (i.e., primates),” as compared to animals with wide-set eyes (“laterally positioned”), such as birds and rodents (p. 91). Immersive virtual environments also have the potential to cause motion sickness to a greater degree, due to the “delay between user action and change on the visual display” (Doucet et al. 2016, p. 91).
From the participant’s POV, objects further away move more slowly and can be occluded by objects nearby. What is more essential than stereopsis to heightened immersion is how well the subject identifies with the camera’s point of view. In immersive environments, POV changes automatically through head and eye movements. In non-stereoscopic 3D displays, such as the ones used in these experiments, the translation and rotation of the viewpoint depends on joystick manipulations, an admittedly less intuitive interface at first but one to which the subject can learn to accommodate.
To emulate reality virtually, we have adopted a “first person” point of view. That is, the scene is viewed from the avatar’s point of view and changes as the avatar’s invisible head is translated and rotated through joystick controls. This is arguably more realistic than a “third person” POV, where the POV is somewhere above and behind the avatar. The ability to imagine oneself in such a space, seeing through the avatar’s “eyes” paired with evocative sensory stimuli, can create conditions similar to being in an immersive virtual environment or in the real environment (Mitchell 2002). Imagination is a powerful aspect of all cognition and when presented with concomitant visual and auditory stimuli that support the imagined and projected experiences, it can be a powerful exploratory experimental tool.
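To make the idea of a joystick-driven first-person viewpoint concrete, the following is a minimal sketch of how a viewpoint might be translated and rotated on a flat ground plane from two joystick axes. It is an illustration only: the software used in the study is not described at this level of detail, and the `Pose` class, the `update_pose` function, the axis conventions, and the default speeds are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    """Hypothetical first-person viewpoint on a flat ground plane."""
    x: float = 0.0
    y: float = 0.0
    heading_rad: float = 0.0  # 0 means facing along the +x axis


def update_pose(pose: Pose, forward_axis: float, turn_axis: float, dt: float,
                move_speed: float = 1.0,
                turn_speed: float = math.radians(45)) -> Pose:
    """Advance the viewpoint by one time step of length dt (seconds).

    forward_axis and turn_axis are joystick deflections in [-1, 1].
    The viewpoint first rotates, then moves along its new heading.
    """
    heading = pose.heading_rad + turn_axis * turn_speed * dt
    step = forward_axis * move_speed * dt
    return Pose(pose.x + step * math.cos(heading),
                pose.y + step * math.sin(heading),
                heading)


# Pushing the stick fully forward for one second moves the viewpoint one unit ahead.
pose = update_pose(Pose(), forward_axis=1.0, turn_axis=0.0, dt=1.0)
print(pose)
```

In a scheme of this kind the scene is redrawn from the updated pose on every frame, so the view changes only in response to the subject's own joystick movements.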
Overwhelmingly, the limited number of studies with nonhuman animals that have used virtual reality as a research tool have implemented a virtual reality system that is interactive but not always immersive (as in a cave or with headsets). Moreover, some of the most cited virtual reality studies of the past 19 years, from the lab of Professor Eleanor Maguire (University College London), were conducted using interactive, nonimmersive virtual reality with human subjects. No criticism has been leveled at her studies as not fulfilling the requirements of using “virtual reality”. The definition as well as the apparatus of virtual reality, therefore, as it has been and is currently used in multiple publications, is sufficiently flexible for comprehensive application to humans and other animal species (e.g., Maguire et al. 1997; Pine et al. 2002; Maguire et al. 2006; Woollett et al. 2009; Aronov and Tank 2014; Dolins et al. 2014; and Doucet et al. 2016). Interactive VR of this type provides a change of point of view and motion feedback as the subject navigates through the environment relative to the static objects, walls, ceilings, buildings, etc.
For the purposes of this article and for future studies employing virtual reality with nonhuman and human subjects, we define interactive virtual reality as the presentation of 2D stimuli that appear to be 3D. That is, the 3D-appearing stimuli are projected onto a 2D plane, and the subject’s actions within the environment follow the orientation and perspective of the observer’s gaze relative to the virtual physical parameters of that environment. In the virtual environment presented in the study described in this article, the appearance of the 3D stimuli provides pictorial depth cues such as linear perspective, while also allowing for occlusion of objects, and these shift with the perspective framework of the viewer by orientation, distance, and position. Moreover, from the viewer’s perspective framework, the viewer is part of the scene that they observe and with which they interact. As such, in the interactive virtual reality system implemented for use with nonhuman and human primates in the study described, the environment parameters and the actions of the observers followed the position, distance, and orientation of the observer’s gaze. Thus, although an immersive environment was not employed, the virtual system fulfills the requirements for virtual reality.
Interactive technology has been used to investigate nonhuman primates’ cognitive abilities, and is a practical and tangible approach to create both built and naturalistic conditions that maintain higher ecological validity. Yet, these environments also provide flexibility and a high degree of experimental control over variables. The results can be illuminating, particularly when used comparatively across multiple species. The capacity to present the exact same environments under almost the same experimental conditions allows for clear comparisons of similarities and differences in spatial behavior, problem-solving, and spatial memory. It also presents a unique opportunity to examine the generation and use of internal mental representations of landmarks and environments under differing conditions (e.g., small- and large-scale space; fewer and greater numbers of landmarks; open and closed space; etc.).
Virtual Control over the Environment for Captive Primates: Robo-Bonobo
Technology has assisted even nonhuman primates, such as captive bonobos, to explore their environment and to interact with humans in unusual and interesting ways. Robo-Bonobo was a telepresence robot designed for ape use (see Figure 1 for a photograph of the Robo-Bonobo bot). The purpose of the Robo-Bonobo bot was to allow apes to explore areas outside their enclosures (in the human spaces) that were not visible or available to them. Under video guidance the bot could be directed down hallways, into offices, and even outside on suitable terrain. By setting up various environments to be navigated, we planned to conduct experiments on spatial cognition and navigation, and to explore Theory of Mind issues (de Waal 2016; Krupenye et al. 2016). Could the apes watch and remember, for example, where objects were hidden outside their enclosures, and could they, after a period of time, direct the robot to these same locations? Could they guide the robot to make correct “Sally-Anne” choices in Theory of Mind experiments (the Sally-Anne task is used in developmental psychology to assess an individual’s ability to attribute false beliefs/Theory of Mind to others; Tager-Flusberg 2007)? Apes are naturally curious, and the ability to explore inaccessible parts of their environment was thought to be an excellent enrichment activity in itself. Using the bot to follow caregivers would also allow the apes to keep informed of local events important to them and to interact at a greater range with human visitors. We anticipated that the apes would quickly adapt the bot to the customary “chase” games that they love to play with visitors.
The Robo-Bonobo bot consisted of a two-level platform, 50 cm square, supported by two 20 cm drive wheels and a supporting caster in front. The bot was controlled through an industrial-strength joystick, which was connected wirelessly to the bot’s onboard computer.
The joystick was available inside the ape enclosure and could steer the bot in any direction. Speed was constant at approximately 4 km/h. The bot worked best on flat and level surfaces but could navigate a smooth 10-degree incline. An animatronic chimpanzee head was mounted on the top platform and could emit ape vocalizations under joystick button control. A second button controlled a forward facing water gun used for interactive “chase” play with visitors. A video camera mounted on the top platform streamed pictures to a monitor mounted outside the ape enclosure near the joystick. To ensure that the bot did not collide with persons or objects while under ape control, detection sensors were mounted around the platform periphery and stopped all bot motion when obstacles were detected.
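The control behavior described above (steering in any direction at a constant speed, with all motion halted when an obstacle is detected) can be summarized in a short sketch. This is not the bot's actual control software, which is not documented here; the function name, the joystick axis convention, and the small deadzone around the stick's center are assumptions made for illustration. Only the constant speed of approximately 4 km/h and the stop-on-obstacle rule come from the description.

```python
import math


def drive_command(joystick_x: float, joystick_y: float, obstacle_detected: bool,
                  speed_kmh: float = 4.0, deadzone: float = 0.2) -> tuple:
    """Return a hypothetical (vx, vy) velocity command in km/h.

    The safety rule takes priority: any detected obstacle stops all motion.
    Otherwise the bot moves at constant speed in the joystick's direction.
    The deadzone (an assumption) ignores very small stick deflections.
    """
    if obstacle_detected:
        return (0.0, 0.0)
    magnitude = math.hypot(joystick_x, joystick_y)
    if magnitude < deadzone:
        return (0.0, 0.0)
    return (speed_kmh * joystick_x / magnitude,
            speed_kmh * joystick_y / magnitude)


print(drive_command(0.0, 1.0, obstacle_detected=False))  # full speed straight ahead
print(drive_command(0.0, 1.0, obstacle_detected=True))   # stopped by the safety rule
```

Checking the obstacle sensors before anything else means that no joystick input, however vigorous, can override the stop.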
Kanzi, one of the bonobos at the ape facility, was trained on the Robo-Bonobo bot and was beginning to learn how to navigate it successfully. Unfortunately, the Robo-Bonobo bot was destroyed in a 2008 flood at the Des Moines facility. Since then, many companies have marketed telepresence robots allowing joystick control and video streaming. The next incarnation of the Robo-Bonobo bot will likely build on these readily available platforms. We expect this version to include the capacity for two-way video interaction so that caregivers and apes can see each other. In addition, by mounting a lexigram keyboard next to the joystick (a lexigram is a pictogram that refers to a word or phrase known by the bonobos), we could give the apes the ability to “talk” to the caregivers and seek assistance. For example, Kanzi might navigate the bot into a particular caregiver’s office and click on the lexigram “apple” for the caregiver to hear.
Training Non-Human Primates for Testing Using Virtual Reality: the Challenge of Joystick and Visual Coordination
Typically, when nonhuman primate species are tested on cognitive tasks such as judging amounts or number line representation, lexigram recall and use in communication, navigation or types of cause-and-effect problem solving, they are likely to be given objects to handle or given a view of the pieces of a problem. They are then required to respond in order to achieve the final solution and obtain a reward. For virtual tasks using joystick manipulation (such as for navigation and testing spatial cognitive abilities), subjects are requested to hold the joystick and shift it to alter the position of the cursor while simultaneously watching the movement on a flat screen monitor. In this way, they can achieve a solution to a task and obtain a food reward. Training such coordination involves a number of steps for an ape or monkey. Once they have achieved this coordination, however, the type of cognitive task presented can be varied, and can build in complexity.
Given that manipulating a joystick and coordinating it with the cursor on a screen is not typical of most species’ natural behavioral repertoires, including humans, learning this coordination may require many trials. Individual nonhuman primates vary as to what types of tasks they enjoy, and their interest is often correlated with being motivated when successful on a task; this applies also to joystick training. If they become frustrated in the training and/or the testing, their interest and motivation may diminish. Cognitive tasks are best completed when nonhuman primate subjects are willing volunteers. Their interest in solving the task and their attention to the parts of the problem are fundamental to the learning process.
In training a nonhuman primate to use virtual reality in an experimental task, the physical set-up of the joystick in relation to that of the monitor is important. An appropriate set-up will maximize the subject’s attention and potential coordination of the joystick with movement of the cursor on the screen. The ape or monkey should be able to sit in a position where they can manipulate the joystick with full range, while also visually following the cursor moving on the screen in relationship to their hand movement (Richardson et al. 1990; Washburn and Rumbaugh 1992; Dolins et al. 2014). This coordination is key to test subjects being successful.
It is optimal to use a large monitor, and a large, curved-screen monitor is better still. The greater width affords a wider POV and widens the visual scene. Such experimental tasks are more ecologically valid in terms of the realistic way in which primate vision works, affording a wider field of vision than would normally be available on a typical television screen or small computer monitor. The quality of the image is also important. The higher the quality of the digital image, the more likely it is that the 3D depth cues will be supported, albeit on a 2D screen.
To test the apes, as in the study described in this article (Dolins et al. 2014), we used a modified Logitech joystick mounted inside a pre-installed “food box”. For this task, the joystick handle was modified and reduced in size for the apes’ hands: for the chimpanzees’ hands it was made smaller (and should also be smaller for testing monkeys); for the bonobos, it was shaped into a “T” with the use of PVC pipe. These adjustments were made to facilitate the comfort of the subject’s grip. Commercial joysticks are not made with nonhuman primate ergonomic grips in mind: chimpanzees and bonobos have significantly longer fingers than humans, while their thumbs are substantially shorter in relation to their other digits. Thus, the goal is to provide the subject with a grip that allows their fingers to manipulate the joystick with ease.
Training an ape or monkey to use the joystick (for virtual navigation or any virtual task) has four main phases (Figure 2). In all four phases, providing food and verbal rewards is essential; the use of a clicker during training is also possible.
We used a specially designed software program modeled on the SIDE task of the NASA/LRC Computerized Test System for training apes to use the joystick in coordinating movement of the cursor on the monitor (Richardson et al. 1990). This program assists in training the individual primate to manipulate the cursor towards multiple goals across “dead-space”. As the training progresses, the size and number of goals decrease.
The first phase of training (Figure 2) is to familiarize the subject with handling the joystick and shifting it in all cardinal directions. The program presents a simple training regimen with thick, wide green borders along the four sides of a square against a black background. Each border is a goal, so that when the cursor touches one of the walls, a positive sound is emitted (a bell) and the subject is provided with a preferred food reward that they do not normally receive in their daily diet (e.g., grapes, orange slices, blueberries, juice, etc.).
When the subject had achieved the criterion of 80% successful trials out of a specified number (e.g., 16 of 20 trials), we introduced the next step of training. We typically trained in blocks of trials, or until the ape became tired of the session, demonstrating less attentiveness and/or a desire to end the session (by moving to the door of the room, or gesturing to finish).
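A criterion of this kind is simple to check automatically during a session. The sketch below assumes that the criterion is evaluated over the most recent block of trials; whether the original training software used the most recent trials or fixed, non-overlapping blocks is not stated, so this choice, along with the function and parameter names, is an assumption. The default values correspond to the 16-of-20 example.

```python
def met_criterion(trial_outcomes: list, block_size: int = 20,
                  required_successes: int = 16) -> bool:
    """Return True if the most recent block of trials meets the criterion.

    trial_outcomes holds one boolean per completed trial (True = success).
    Integer counts are compared directly, so 16 of 20 counts as exactly 80%.
    """
    if len(trial_outcomes) < block_size:
        return False  # not enough trials yet to evaluate a full block
    recent = trial_outcomes[-block_size:]
    return sum(recent) >= required_successes


session = [True] * 16 + [False] * 4
print(met_criterion(session))  # True: 16 of the last 20 trials were successful
```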
In the second phase of the training program, the borders can be made thinner so that, over training trials, the subject has to push or pull the joystick for a longer duration and distance from the start point in the middle of the screen to the border. In this phase, subjects should become familiar with moving the joystick and attending to movement on the monitor simultaneously. The experimenter should use rewards to actively encourage subjects to look at the screen while moving the joystick, and can point to the screen, and to the cursor specifically, to encourage movement in a desired direction. When subjects reached criterion, we moved to the third phase.
In the third phase, the borders can be narrowed as well as thinned (Figure 2). These changes require the subject to manipulate the joystick with more precision from the start point to localize one of the four border-goals. In the fourth training phase, the program allows for the number of green borders to be reduced from four to three to two and finally to one. With each reduction, the likelihood the subject will connect with a border-goal by chance alone decreases. This should serve to enhance their attentiveness to the goal object(s) on the screen. Reducing the width and number of borders trains the apes and monkeys to move towards where goal objects are on the screen, with greater flexibility (C. Menzel; personal communication).
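The effect of removing borders on chance performance can be illustrated with simple arithmetic. The sketch below uses a deliberately crude model that is not part of the original training program: it assumes the subject pushes the joystick in one of the four cardinal directions at random and that any remaining border spans its whole side of the screen. Under these assumptions the chance of reaching a goal is the number of borders divided by four. Narrowing the borders, as in the third phase, would lower these values further.

```python
from fractions import Fraction


def chance_hit_probability(num_borders: int, directions: int = 4) -> Fraction:
    """Probability of reaching a border-goal with a random cardinal push.

    Assumes (for illustration only) equally likely directions and borders
    that each span one full side of the screen.
    """
    if not 0 <= num_borders <= directions:
        raise ValueError("num_borders must be between 0 and directions")
    return Fraction(num_borders, directions)


# Fourth training phase: borders reduced from four to one.
for n in (4, 3, 2, 1):
    print(f"{n} border(s): chance = {chance_hit_probability(n)}")
```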
There are several difficulties that may arise with this type of training of apes and monkeys. Preferences for unidirectional joystick movement can be difficult to eliminate. Subjects may intuitively pull the joystick towards them. The physical set-up of the testing apparatus and where the subject sits in relation to the joystick and screen may also influence unidirectional movement of the joystick. Figure 3A is a photograph of Kanzi, showing him using the joystick while in training on the virtual reality program. Modifications that may alleviate this issue are to ensure the subjects are positioned where they get most movement of their chosen hand and arm in manipulating the joystick in all four cardinal directions. Food and verbal rewards encourage these behaviors.
Use of virtual reality with nonhuman primates is most efficient when individuals are tested alone and with minimal distractions in the testing area (Evans et al. 2008). It is also important to keep the individual being trained or tested engaged with the task (Evans et al. 2008). Verbal encouragement and sufficient break times for the trainees help. For example, in training apes, we found it important to allow and even encourage frequent breaks for social interactions (e.g., play) with the experimenters or for the ape to occupy their time with some other task of their choosing (e.g., manipulating an object or looking through the glass at other apes nearby). This assists in maintaining their interest when they return to the experimental task.
Enhanced Ecological Validity and Experimental Control Using Virtual Reality
Some of the major issues in the study of spatial cognition with nonhuman animals are identifying what types of specific visuo-spatial information form the basis of spatial strategies and internal representations, and how spatial scale, spatial complexity, and experience may interact with these factors. These are often difficult to address in field research, although with the use of GPS data field procedures are becoming increasingly effective. However, using a controlled experiment in virtual reality, we can examine and identify navigational strategies; the types, locations, and numbers of landmarks animals attend to; and the influence of virtual ecological factors on their navigational efficiency and spatial problem-solving. We can also address what types of strategies nonhuman animals generate in response to different types of environmental complexity and size. Additionally, we can indirectly assess the types of internal spatial representations they have generated and on which their spatial strategies are based.
There is a trade-off between ecological validity and experimental control when conducting studies with captive and wild primates. In many free-ranging spatial-foraging studies, it is difficult to assess which landmarks are salient to the navigating animal (e.g., Garber and Dolins 2010). Captive studies have limited ecological validity and are limited in spatial scale. This leads to difficulties in generalizing the results to the wild population of that species. The virtual reality software program helps to overcome some of these methodological difficulties, presenting simulations of first-person perspective environments with varied landmark features, scale, and complexity in which the viewer can take actions and be part of the scene in relation to, and together with, the virtual objects. Virtual reality has the potential to present a higher degree of ecologically valid spatial conditions than typical experimental designs. The viewer-based perspective that changes with actions enhances the subject’s experience. The researcher also has greater experimental control over landmarks and geometric features and thus greater precision in assessing subjects’ attention to the various spatial cues available (e.g., De Lillo and James 2012). There is great flexibility in generating virtual environments with variations according to scale and complexity, and number and types of 2D and 3D landmarks and geometric features. Moreover, virtually simulated environments afford presentation of either naturalistic or built environments.
Traveling animals visit multiple foraging, sleeping, and resting locations. They face a different set of navigational challenges when locations and landmarks are visible and distant or not. Internal spatial representations of differing scales (e.g., small- and large-scale) may generate either topological (encoding of exaggerated distance, angle, and direction, with corrections re-adjusted at known sites, nodes, during navigation) or metric representations (encoding of actual distance, angle, and direction among multiple landmarks) (Maguire et al. 1997; Byrne and Janson 2007; Dolins 2009; Dolins and Mitchell 2010; Garber and Dolins 2010; Healy and Braithwaite 2010; Asensio et al. 2011). Moreover, wild primates’ knowledge of foraging sites must include updated ecological information based on the seasonal availability of fruit and amount available for consumption (taking into account group size) (Janmaat et al. 2013). Being able to evaluate their spatial memory for varied sites is difficult in field conditions but highly testable using virtual reality with captive populations.
The study described and summarized in this article uses virtual reality as a novel method to investigate the spatial cognitive abilities of captive chimpanzees (Pan troglodytes) and one bonobo (Pan paniscus); for the original and detailed presentation of this study, see Dolins et al. (2014). One of the key issues in establishing virtual reality as a viable method to investigate nonhuman and human primate spatial cognition comparatively is whether the nonhuman primates will perceive the virtual space correspondingly to their human counterparts. As such, we tested the performance of humans of varying age groups, chimpanzees, and one bonobo with the aim of evaluating their relative ability to navigate in virtual space, and their attention to, and discrimination of, two different types of landmarks in environments of increasing complexity.
Important to the efficacy of virtual reality as an experimental method is the ability to evaluate whether the nonhuman animal (primates, in this instance) will perceive the actual two-dimensional virtual space as a three-dimensional space within which to navigate. Thus, in the summarized study we present in this article, the question we aimed to address was whether there was a significant degree of ecological validity of the visuo-spatial experience presented in virtual environments for nonhuman primates. Virtual environments are, by default, presentations of 3D visual stimuli on a 2D plane, although perceived and utilized by most humans as if they were a 3D space populated with objects, geometry, topographical features, and landmarks. Chimpanzees’ perception of 3D objects presented in a 2D format (on a computer monitor) has been demonstrated in visual search tasks of images of 3D objects presented against a set visual ground (Imura and Tomonaga 2008). The chimpanzees’ performance paralleled that of the humans, with the chimpanzees exhibiting visual search patterns commensurate with the distribution of the 3D depth cues against the background. They showed perception of the ground dominance effect, which is defined for both species as using ground, walls, and ceiling as anchors for visually investigating forward-perceived features (Bian et al. 2005). Compared to humans, chimpanzees’ eye movements display patterns of shifting fixation more regularly and rapidly, and to more locations on the stimulus (Kano and Tomonaga 2009). Overall, it appears that there is close similarity in chimpanzees’ and humans’ visual perceptual strategies and eye movements on visuo-spatial information presented in a 2D format, and these are likely very similar for bonobos, as close relations of both other ape species tested.
We presented a series of virtual simulated environments of increasing complexity (numbers of landmarks and choice points), relative scale, and open and maze environments to the subjects. Our aim was to determine how efficiently chimpanzees and bonobos could navigate, and the degree of correspondence in their performance in virtual environments, compared to that of humans. We measured performance for all three species as the actual distance traveled from start to goal compared to an optimally generated distance. We additionally determined where the chimpanzees and bonobo would fall in comparison with a human developmental framework. Depending on the task, chimpanzee and bonobo intellectual abilities have been projected to parallel early human age trajectories (3 to 8 years old; Rumbaugh and Washburn 2003). For comparison, we tested children ranging from 3 to 12 years old, in age groups of 3 to 4, 5 to 6, and 11 to 12 years. We also tested adult humans (38–48 years old).
Specifically, our goals were to determine whether 1) chimpanzees, a bonobo, and humans of varying age groups would show parallel performance in navigating in the virtual environments measured by length of travel path, success in localizing the goal, decisions at choice points, and latency to achieve the goal; 2) chimpanzees, a bonobo, and humans would discriminate between positive (“go”) and negative (“don’t go”) landmarks in the virtual environments; and 3) whether increasing complexity (via number of landmarks) and size of environment would impact the performance of the chimpanzees, bonobo, and humans.
We predicted that as individuals gained experience with the directional cues in virtual space, even in more complex environments, their latency and path length would decrease, and decisions at choice-points would become more accurate, with fewer instances of backtracking to localize the goal.
The methods described here are a summary of those published in Dolins et al. (2014). We present the main methodological points but refer the reader to that paper for more detail.
The general methods we employed in the virtual test environments required participants to attend to directional information provided by the landmarks (positive or negative) to successfully localize the goal. Two landmark types, positive and negative, were presented in each environmental design (maze or open space) with the goal randomized per trial. The goal of this study was to test reliance on landmarks and not recall of pathways by kinesthetic feedback (right and left turns). In these environments, it was not necessary for participants to learn the geometric format to localize the goal (for what would be considered a metric strategy), however, doing so would increase their chance of distance reduction in path length (Dolins 2009; Garber and Dolins 2014; Dolins et al. 2014). The visual environment also presented a relatively homogenous set of walls and floors, so that the landmarks were the most salient cues providing direction to the goal (Lipman 1991; Dolins and Mitchell 2010).
We tested participants on tasks providing virtual environments of increasing complexity. These included T-mazes with consistent start and randomized goal locations, and open space designs (goal present randomly in one of eight locations or goal hidden behind one or two barriers). The same two landmarks were presented throughout all tests and served as directional cues. There was a positive landmark, a 2D blue square indicating “go”, and a negative landmark, a 2D brown triangle indicating “don’t go”. The goal stimulus in all environments was a 2D image of a tree and a 3D green ball. The cursor connecting with either goal object caused a ring tone to be emitted, signaling successful completion of that trial and the task.
The nonhuman participants comprised four adult chimpanzees (Lana, Mercury, Panzee, and Sherman), who were housed at the Language Research Center, Georgia State University, and one bonobo, Kanzi, who was housed at the Ape Cognition and Conservation Initiative, Des Moines, Iowa. These individuals were trained and tested using the virtual maze and open space environments in their familiar laboratory setting. The chimpanzees and bonobo have long-term experience with cognitive and perceptual tasks using joysticks and computers, and have lived in a language-rich environment since birth (Rumbaugh and Washburn 2003). Three of the four chimpanzees and the bonobo were symbol-referent (lexigram) trained on the lexigram board and understood a great deal of spoken English. All of the apes were volunteers on the virtual reality task, and the trials ended when they requested it or when they showed no more interest in continuing.
The 16 human participants comprised 12 children (equal numbers of males and females, four in each age group), aged 3 to 4, 5 to 6, and 11 to 12 years, and four adults (two males, 43 and 49 years old, and two females, 38 and 48 years old). They were all presented with the same experimental virtual reality designs as those presented to the chimpanzees and bonobo. All human participants were tested in their homes or a familiar environment. Each parent or guardian signed consent forms for their child to take part in the study, and during testing a parent or guardian was always present.
All animal care, housing and testing procedures complied fully with Georgia State University’s and the Ape Cognition and Conservation Initiative’s Animal Care and Use Committee and with that of the USDA regulations on animal care and welfare. All research reported in this manuscript adheres to the University of Michigan principles for the ethical treatment of nonhuman animals. Testing of all human participants complied fully with the ethical standards set by the United Kingdom Home Office.
The testing apparatus consisted of an Apple computer, 47-inch flat screen monitor, and a modified Logitech joystick (the handle was altered to be ergonomic for the apes’ hand; Figure 3B) to test the chimpanzees and bonobo; human participants had an unmodified Logitech joystick. Food rewards were given to the chimpanzees and bonobo. The humans were given verbal rewards and a small token at the end of their testing session (children received colorful stickers or pencils; adults received gift certificates to a national bookstore chain).
The virtual reality program was created specifically for this study and was written in C++ and OpenGL. It allows for the presentation of virtual environments of varying scales, populated with varying numbers of two types of landmarks, as well as a start object and goal site (see Dolins et al. 2014 for more details). It also offers high-quality visual environments for conducting experiments. Its design is flexible: for example, the length and location of T-junctions in mazes can be set, and large open spaces can be generated that vary in scale and in the placement of barriers/walls, goal and landmarks. On a frame-by-frame basis, the program automatically records the position and orientation of the cursor, which designates the navigation path of the subject. Performance is recorded with millisecond resolution as the path taken (in X, Y coordinates), latency from start to goal, and overall distance traveled. The program also records the sequence of movements so that the exact path can be replayed. In this version of the program, it is possible to include an avatar, which looks human rather than like a nonhuman ape. However, avatars were not used in testing for the present study.
The virtual reality program affords sequences of different environmental designs (e.g., repeated trials and series of mazes in selected or randomized order) to be presented in automated trials. For automatic testing of animals, the computer can be connected to an automated food/treat delivery device, to provide a reward for every successful trial. We did not use this method, but handed each chimpanzee and bonobo a treat after each successful trial. In testing, when participants reached criterion performance (success on 80% of all trials for one environmental design), the program automatically shifted to the next environmental design. These were pre-set in randomized or specific order depending on training or testing requirements.
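As an illustration only, the criterion rule described above can be sketched in Python. This is not the C++ program used in the study. The function names `reached_criterion` and `next_design_index`, the `min_trials` parameter, and the choice to evaluate the criterion over all trials completed so far in a design are assumptions made for the sketch.

```python
from typing import Sequence

CRITERION_PERCENT = 80  # success rate stated in the text


def reached_criterion(outcomes: Sequence[bool], min_trials: int = 20) -> bool:
    """Return True once enough trials are done and at least 80% succeeded.

    `min_trials` is a hypothetical parameter; the text reports 16/20 for training.
    Integer arithmetic avoids floating-point edge cases at exactly 80%.
    """
    n = len(outcomes)
    if n < min_trials:
        return False
    return 100 * sum(outcomes) >= CRITERION_PERCENT * n


def next_design_index(current: int, outcomes: Sequence[bool], n_designs: int) -> int:
    """Advance through a pre-set order of designs when criterion is met."""
    if reached_criterion(outcomes) and current + 1 < n_designs:
        return current + 1
    return current
```

Under these assumptions, a participant with 16 successes in 20 trials on the first design would be moved to the second design, whereas 15 successes in 20 trials would leave the design unchanged.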
The start position in all virtual environments always opened up facing north. The virtual cardinal directions were designated according to the following: north = ‘joystick up’, south = ‘joystick down’, east = ‘joystick right’, and west = ‘joystick left’.
We presented participants with a minimum of 10–20 training trials on the virtual T-mazes and open space designs. Training trials on mazes first presented a straight-alley maze (one alleyway, fixed start position, goal visible) and then a straight-alley maze with landmarks (one alleyway, fixed start position, goal visible, two positive landmarks proximate to the goal). Training trials on the virtual open space design presented an open arena surrounded by boundary walls (with no additional visible barriers), a fixed start position and the goal located randomly in one of eight locations around the perimeter of the walls. On a straight-alley maze, when chimpanzee, bonobo and human participants reached criterion (16/20 or 80%), we moved to the testing phase.
We tested participants on three types of virtual T-mazes. The 1T-, 2T-, and 3T-mazes each had a fixed start location but a randomized goal location on each trial, although always at the end of a distal alley. The 1T-maze presented one choice-point, the 2T-maze presented three choice-points, and the 3T-maze presented five choice-points.
We presented four types of virtual open space designs in testing. In the open space environments, the goal was hidden either behind one of two opaque barriers designated by positive or negative landmarks, or behind various walls that had to be navigated around. In this way, the goal was not visible from the start location. Navigators’ movements were unrestricted in open space environments. In contrast, the multi-alley maze environments presented fixed alleyways on every trial, with a random goal location per trial (the goal was not visible from the start position).
In the first open space test design, one 3D barrier was located to occlude the view of the goal. Two visible, positive landmarks were located on the wall, on either side of the barrier. The “barrier + 2 landmarks + goal” array (One-Barrier design) was located randomly on each trial, in one of four locations. Increasing the complexity, the next open space design presented two barriers, with the goal behind one barrier with two associated positive landmarks; the other barrier had two associated negative landmarks. The two barriers were randomly located on two of the four walls: in conditions A and B, the barriers were in visual and spatial proximity to one another; in conditions C and D, the barriers were divided along opposite walls. In the next design, increasing in complexity, large visual barriers created a complex set of alleyways. In these environments, the start location was fixed while the goal position was randomized per trial. The final design was the same construction but the start location was randomized per trial while the goal position remained fixed.
Human and ape participants volunteered to be tested. The chimpanzees and bonobo were asked to position themselves in front of a Plexiglas workstation (see Figures 3A–C and 4) where they could see the computer monitor while also reaching a joystick that they could manipulate from a sitting position. The joystick was located in a food port for the safety of the ape and the equipment. At the start of each session, a technician initiated the virtual reality program sequence, which automatically progressed through a set of pre-specified tests until either all the trials were finished or the participant declared or showed their desire to end the testing session. The experimenter initiated the sequence of trials and positioned herself where she was unable to see the screen and therefore unable to influence performance. For the chimpanzees and bonobo, a research technician assisted by giving food rewards at the conclusion of each successful trial. Verbal reinforcements were also given.
Data collected via the virtual reality program were automatically recorded in text files for each participant and trial. These data files contained movements in X, Y coordinates recorded per millisecond. We measured latency to localize the goal, path length traveled, and paths/path directions selected at choice-points (T-junctions in the mazes).
We assessed performance in the different species and age groups across environment types. We also evaluated the degree of reliance on landmarks to localize the goal. We used the shortest path length calculated via Euclidean distances (taking into account barriers and alley structure from the start to goal location for each environment type), here referred to as the “optimal path”. For each trial a participant completed, we calculated the path length traversed in that environment, referred to as the “actual path length” or distance traveled.
Participants’ “actual routes” were compared with that of a generated “optimal path” for each particular environment design, taking into account the placement of barriers. This computation was done using a purpose-written software program allied with the VR program generated data. The total length of the subject’s route for each trial was determined by measuring the sum total of distances between the test’s output of X, Y data point coordinates. The subject’s route distance was then compared to the distance of the optimal path, that is, the shortest distance from the starting point to the goal site.
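The route-length computation described above, summing the distances between successive recorded X, Y points, can be sketched as follows. This is a minimal Python illustration, not the purpose-written analysis software; the function name `path_length` and the representation of a route as a list of `(x, y)` tuples are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def path_length(points: Sequence[Point]) -> float:
    """Sum the Euclidean distances between consecutive (x, y) samples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical route: a 3-4-5 diagonal followed by a straight segment of 6 units.
route = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(path_length(route))  # 5 + 6 = 11.0 virtual units
```

A route with zero or one recorded point has length 0 under this definition.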
Comparisons across participants provided a measure of travel efficiency (“shortest path ratio”). This was calculated as the ratio of (length optimal path)/(length animal’s path) (Menzel et al. 2002; Menzel and Menzel 2007; Dolins 2009; Dolins et al. 2014). Units of measurement for travel are in virtual meters and do not correspond to actual distances in real space. The closer the ratio was to 1.0, the more proximal that trial performance was to optimal; the closer the ratio was to 0.0, the less optimal the trial performance.
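As a worked illustration of this ratio, the short Python sketch below computes it for invented path lengths. The function name `shortest_path_ratio` and the example values are assumptions and are not data from the study.

```python
def shortest_path_ratio(optimal_length: float, actual_length: float) -> float:
    """Return (length of optimal path) / (length of path actually traveled)."""
    if actual_length <= 0:
        raise ValueError("actual path length must be positive")
    return optimal_length / actual_length


# Hypothetical trial: the optimal path is 20 virtual meters and the
# participant traveled 25 virtual meters, giving a ratio of 20 / 25 = 0.8.
ratio = shortest_path_ratio(20.0, 25.0)
```

A trial in which the participant followed the optimal path exactly would give a ratio of 1.0, and longer detours push the ratio toward 0.0.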
Maps generated for each trial by the analysis program overlaid the subject’s route data points with the given environmental design. These routes were examined pixel-by-pixel in the visual displays, relating all information to the maps (100 pixels per cell in a 35 × 35 cell design). The data generated included the following: decisions at choice points (defined as correct/incorrect), numbers of errors, latencies, touches/collisions onto objects in the virtual environments, and information about each virtual environment (e.g., where landmarks were located in relation to choice points and distance from start to goal).
We used a Euclidean metric and the Pythagorean Theorem to estimate the shortest possible path lengths (see Dolins et al. 2014, for more details, and Garber 1989; Menzel et al. 2002, pp. 607–608; Menzel and Menzel 2007).
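For readers who wish to approximate a barrier-aware shortest path on a cell-based map, one common approach is a graph search over free cells. The sketch below uses Dijkstra's algorithm on a small grid; it is a substitute technique offered for illustration and is not the method used in the study, which is described in Dolins et al. (2014). The grid encoding (`.` for free cells, `#` for barriers), the eight-direction movement rule, and the function name `grid_shortest_path_length` are all assumptions.

```python
import heapq
import math
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column)


def grid_shortest_path_length(grid: List[str], start: Cell, goal: Cell) -> float:
    """Dijkstra search over free cells; returns math.inf if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])

    def free(r: int, c: int) -> bool:
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] != "#"

    moves = [(-1, 0), (1, 0), (0, -1), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]
    best: Dict[Cell, float] = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return dist
        if dist > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if not free(nr, nc):
                continue
            # Diagonal steps may not cut across the corner of a barrier.
            if dr and dc and not (free(r + dr, c) and free(r, c + dc)):
                continue
            new_dist = dist + math.hypot(dr, dc)  # 1 or sqrt(2) by Pythagoras
            if new_dist < best.get((nr, nc), math.inf):
                best[(nr, nc)] = new_dist
                heapq.heappush(queue, (new_dist, (nr, nc)))
    return math.inf
```

Because movement is restricted to eight directions, this approximation can overestimate the true Euclidean shortest path in open areas; it is exact only where the optimal route follows grid-aligned or 45-degree segments.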
We used a linear Mixed Model to analyze the data, and applied various filters for environment types, participant groups and interactive effects. The Model afforded a comparison of travel efficiency (shortest path ratio) using participant type (adult humans, teens, young children, and chimpanzees), with the chimpanzees as the comparison group. This permitted a determination of differences between participant types from the comparison group based on shortest path ratio.
The results presented here are a summary of those published in Dolins et al. (2014), with summary data from one bonobo tested. We refer the reader to the Dolins et al. (2014) paper for more details of the results.
We evaluated the data by the shortest path ratio: the closer to 1, the shorter the path length, the more efficient navigation performance. All participants, humans, chimpanzees, and the one bonobo, were successful in localizing the goal in all of the environment types (open space + barriers and mazes), with a few exceptions. The 3T-maze and complex (multi-alley) maze environments were challenging for the 3- to 6-year-olds and for two chimpanzees, and some of the adult humans stated that these maze types were more challenging compared to others. Kanzi, the one male bonobo tested, like Panzee, an adult female chimpanzee, excelled in all of the environment types.
The average shortest path ratio varied across species and age groups (see Figures 5–7 as examples). The average chimpanzees’ travel efficiency was most like the 3- to 6-year-old children (3 to 4-year-olds). Panzee and Kanzi, both apes, were the most successful individual participants across the three species, with consistent shortest path ratios to the goal in all of the environment types. Panzee outperformed the two younger groups of human participants averaged across all environment types on travel efficiency, and used shorter distances to travel to the goal. Kanzi’s performance was equivalent to or better than Panzee’s across trials in some of the environment types.
Participants in all three species exhibited decreasing travel accuracy with increasing degree of complexity of environment type. The shortest path ratio decreased as the complexity of mazes and open space environments increased; complexity was defined by the number of choice points per environment and distance from start to goal.
Travel efficiency (shortest path ratio) varied by age and species. The younger children traveled the longest distances to reach the goals and made the most errors, while the 11- to 12-year-olds and the human adults had the shortest paths, other than Panzee and Kanzi. Individually, Kanzi and Panzee exhibited shortest path ratios consistently. As a group, however, the chimpanzees’ average travel behavior displayed less efficiency and greater distance than that of the older children and adult humans. However, most of the 3- to 6-year-old children tested were not able to navigate in the most complex mazes (3T and Complex mazes), and so did not provide comparable data with the apes and the older humans.
With Panzee’s data set as the comparative filter for analysis, across environment type and using the shortest path ratio as the dependent variable, her performance in the virtual spatial tasks showed greater accuracy than that of the other chimpanzees and humans overall. The borderline significant difference in performance between Panzee and the humans favored Panzee when compared to the two younger groups of children (3 to 4 and 5 to 6-year-olds).
Increasingly, interactive technology, such as virtual reality, has become more prevalent (Chinn and Fairlie 2007) and has been found to have significant relevance in research with nonhuman animals. The opportunity for nonhuman animals to interact with dynamic forms of technology provides unique opportunities to test their cognitive and perceptual capabilities, and also provides a form of environmental enrichment for captive animals (e.g., Martin et al. 2014; Doucet et al. 2016).
This article reviews methods developed for investigating the spatial cognitive abilities of nonhuman primates, and apes in particular, using virtual reality to present virtually realistic environments, and it presents a very brief summary of results from Dolins et al. (2014), demonstrating that apes are capable of successfully navigating in virtual space. In addition, we have discussed the opportunities for using virtual methods to afford captive apes additional control for exploring and learning about their environments, and for interacting with humans in positive ways. We have also made recommendations as to best practices and pitfalls to avoid involved in training and testing captive apes when presenting them with the opportunities to interact with virtual reality.
The summary of the results above, investigating the use of virtual reality to compare the spatial cognitive abilities of four captive chimpanzees, one bonobo, twelve children and four adult humans, tasked them with navigating in increasingly complex virtual environments to localize the goal. Our objective was to determine how nonhuman primates’ performance compared with that of humans on travel efficiency, distance traveled, and their attention to landmarks to localize the goal in relation to the quality of decision-making at choice-points. From the parallels with the humans, and the successful performance of the nonhuman apes, we determined that virtual reality is a viable mode for testing chimpanzee and bonobo spatial cognitive abilities compared to that of humans and other primate species (e.g., monkeys). All three species tested were found to discriminate effectively between positive and negative landmarks and demonstrated attention to visuo-spatial features during navigation. Assessing path efficiency revealed that all three species and all age groups used relatively efficient, distance-reducing routes during navigation to localize the goal site.
Interestingly, in the most complex mazes (3T and Complex mazes), the humans’ performance was less accurate compared to two of the apes, one female chimpanzee, Panzee, and one male bonobo, Kanzi. Older children and adults reported the more complex environments as being “challenging” and “difficult”.
All participants’ performance showed a decrease in the shortest path ratio score as the mazes and open space environments increased in complexity (an increasing number of turns, a goal that was not visible until the last turn, and an increased total distance). However, the younger children’s performance decreased more so; they appeared to have paid insufficient attention to the directional cues, for example, when choosing to turn at choice points, irrespective of the landmark information. This suggests that the three ape species perceived the virtual environments sufficiently similarly to respond in equivalent ways, although age was a factor in successful performance.
The results suggest that the use of virtual reality to test captive primates’ problem-solving ability, and in particular, chimpanzees and bonobos, affords high validity. It also provides a verifiable method for direct cross-species comparisons to investigate spatial cognition, developmental trajectories and other cognitive capabilities. Thus, our findings imply that chimpanzees, bonobos, and humans learn and respond to virtual environments similarly. They discriminate between landmark types to make navigational decisions, and they alter their spatial strategies in response to environmental challenges of increasing complexity and size.
In summary, assessing nonhuman primates’ spatial strategies in their natural habitat carries many difficulties, such as determining which visuo-spatial landmarks are salient, and how the navigator perceives and represents their environment, as individual locales or as multiple locales linked within a whole. Virtual reality presenting virtually simulated environments affords an experimental situation where there is some degree of control over variables (landmarks, size of environment, complexity, etc.) but also maintains greater ecological validity. Virtual reality also provides significant flexibility in presenting different types of simulated environments, from built to naturalistic, and from simple to complex, with varying numbers and types of landmarks. Virtual environments can also incorporate auditory information and test how perception of aural patterns influences animals’ behavior in a number of experimental contexts.
As a novel methodological approach, the use of interactive virtual reality affords the ability to investigate and address issues of the integration of visuo-spatial information in the spatial strategies of human and nonhuman primates, comparatively. The nonhuman apes’ navigational performance in multiple virtual environment types was comparable to that of the humans tested, and suggests that, similar to the humans, the apes’ perception of the 2D virtual simulations was as a “space” within which to navigate.
Virtual reality provides a unique method to further research on how animals perceive, organize, and utilize encoded and recalled information. Virtual reality also provides a comparative window with which to examine the cognitive mechanisms that animals use to interpret perceptual details and successfully solve adaptive problems important for survival. For example, it can be used to investigate foraging behavior, memory, perception of amount and quality of food sites, spatial and temporal patterns of behavior, feeding competition, as well as Theory of Mind (Normand et al. 2009; Sayers and Menzel 2012; Janmaat, Ban and Boesch 2013; de Waal 2016).
Virtual reality is inherently flexible, and affords the ability to test equivalent real-life problems across species, and without the need for language. Thus, virtual reality allows for a distinct insight into the cognitive, developmental, and evolutionary origins of adaptive behaviors of nonhuman and human animals, comparatively.
Data presented in this article are generated from an ongoing project that has been supported in part by internal grants from The University of Michigan-Dearborn. Data collection at the Language Research Center was supported in part by the Leakey Foundation, the Wenner-Gren Foundation, and HD-056352, HD-38051, and HD-060563 from the NIH. The contents of this article do not necessarily represent the official views of these funding agencies. We also thank Charles Menzel and Jeanine Stefanucci for suggestions for this article, and we thank the reviewers for their careful comments that helped improve this article.
FLD, Luddite extraordinaire, dedicates this paper to GAG, who is not a Luddite and excels in navigating, for a great ape.