What makes an integrated object in visual working memory (WM)? Past evidence suggested that WM holds all features of multidimensional objects together, but struggles to integrate color–color conjunctions. This difficulty was previously attributed to a challenge in same-dimension integration, but here we argue that it arises from the integration of 2 distinct objects. To test this, we examined the integration of distinct different-dimension features (a colored square and a tilted bar). We monitored the contralateral delay activity, an event-related potential component sensitive to the number of objects in WM. The results indicated that color and orientation belonging to distinct objects in a shared location were not integrated in WM (Experiment 1), even following a common fate Gestalt cue (Experiment 2). These conjunctions were better integrated in a less demanding task (Experiment 3), and in the original WM task, but with a less individuating version of the original stimuli (Experiment 4). Our results identify the critical factor in WM integration at same- versus separate-objects, rather than at same- versus different-dimensions. Compared with the perfect integration of an object’s features, the integration of several objects is demanding, and depends on an interaction between the grouping cues and task demands, among other factors.
Visual working memory (WM) is the mechanism responsible for the dynamic storing and updating of visual representations (for a recent review, see Luck and Vogel 2013). It is a temporary store holding a small number of the currently active visual representations (usually estimated at only 3–4 items; Cowan 2001), protected from interruptions. The robust connections WM has with factors such as fluid intelligence, problem solving, and attentional control on the one hand (e.g., Cowan et al. 2005; Fukuda et al. 2010), and with psychiatric disorders such as schizophrenia and Alzheimer's disease on the other hand (e.g., Gold et al. 2003; Parra et al. 2011) corroborate the central role of visual WM and its representations in guiding behavior. One important way of overcoming the strict capacity limit of WM is to chunk several items into one group, making the issue of what can constitute a WM unit a fundamental research question. The goal of the present work is to show that grouping objects into an integrated representation is not as pervasive as previously claimed, and to highlight the factors that contribute to WM-integration, or disrupt it.
Luck and Vogel (1997) (see also Vogel et al. 2001) demonstrated a full integration of the different features of an object in WM, by showing that adding to-be-remembered features to an item did not damage performance. Subjects were equally accurate when memorizing only the orientation of the presented bars, or simultaneously memorizing the orientation, color, size, and presence of a gap on the bars, a result that can be easily explained if WM stores all the features of an object as one unit. They also found that accuracy for maintaining a color–color conjunction (i.e., stimuli made of a small colored square inside a larger colored square) was the same as for a single colored square, suggesting that perfect integration exists for same-dimension conjunctions as well (Luck and Vogel 1997; Vogel et al. 2001). However, several attempts to replicate the perfect integration of color–color conjunctions were unsuccessful (Olson and Jiang 2002; Wheeler and Treisman 2002; Delvenne and Bruyer 2004; Parra et al. 2009), leading researchers to claim that integration of features from the same dimension comes at a certain cost, or that same-dimension (e.g., color–color) conjunctions simply would not be represented as a single object in WM.
The different status of color–color conjunctions was recently supported by electrophysiological evidence relying on the contralateral delay activity (CDA), an event-related potential (ERP) component reflecting the online activity of visual WM (Vogel and Machizawa 2004). The CDA is a negative slow wave found at posterior electrodes starting ∼300 ms after stimulus presentation, whose amplitude is a marker of the number of items that are maintained in visual WM (Vogel and Machizawa 2004; Vogel et al. 2005; McCollough et al. 2007; Ikkai et al. 2010; Luria et al. 2010). Critically for the issue of WM integration, the CDA was shown to be sensitive to the number of objects that are represented in visual WM, rather than to the number of features composing the items. Luria and Vogel (2011) (see also Woodman and Vogel 2008) recently showed that the CDA amplitude for one object carrying 2 relevant features (e.g., a colored tilted bar, whose color and orientation had to be maintained) was the same as for one object carrying only one relevant feature (e.g., a black tilted bar whose color never changed, thus carrying only orientation information), and lower than the CDA amplitude for the same 2 features when they were carried by 2 separate objects rather than one (e.g., a colored square next to a tilted bar).
When monitoring the CDA of color–color conjunctions, Luria and Vogel (2011) found a somewhat different pattern relative to the different features of a single object. The CDA amplitude associated with a color–color conjunction was initially larger than that of a single colored square, but throughout the retention interval it lowered and was eventually the same as that of one colored square. This suggested that color–color integration in WM is possible, but that unlike the binding of the different features of one object, it is gradual and more demanding.
The common explanation for the differences between the integration of an object's features and the integration of color–color conjunctions is that same-dimension conjunctions have a special status in WM, and are somehow more difficult to bind (e.g., Olson and Jiang 2002; Wheeler and Treisman 2002). However, we suggest an alternative explanation for the pattern of the CDA (Luria and Vogel 2011) and behavioral costs (e.g., Olson and Jiang 2002): the difficulty in such integration could be attributed to the fact that the 2 colors belong to 2 different independent objects (the smaller square and the larger one), which also appeared separately in the control conditions of the above-mentioned experiments (for a similar reasoning, see Kim and Kim 2011). In its most basic form, an object must include a shape and a color, and must occupy a spatial location. If a stimulus includes more than one color or shape, as is the case with color–color conjunctions, it can be potentially perceived as more than one object. It is possible that while the integration of the different features of one object is perfect and immediate (Luck and Vogel 1997; Vogel et al. 2001; Luria and Vogel 2011), the integration of several distinct objects into one coherent object is demanding, gradual, and nonobligatory. This would mark the basic factor that determines WM-integration of different features not as to whether they belong to the same dimension or to different ones, but rather as to whether they belong to the same object or to different ones.
The somewhat counterintuitive prediction of this view is that 2 different-dimension objects would also be difficult to integrate (similarly to color–color conjunctions), in sharp contrast to the efficient integration of the same features when presented within a single object. Thus, we argue that visual WM should have difficulties integrating a tilted bar (carrying orientation information) laid on top of a colored square (carrying color information), unlike the perfect integration of the orientation and color of a single bar (e.g., Luck and Vogel 1997). This is because, while a multidimensional object such as a colored bar can only be perceived as a single object, 2 overlapping items can be perceived either as one complex item or as 2 items, one on top of the other. Thus, the fate of such ambiguous stimuli could be either integration or individuation. Understanding these situations is crucial because most real-life situations include complex items that are more similar to ambiguous overlapping items than to simple objects.
The goal of the present study was to examine the hypothesis that the same features that are perfectly integrated when they belong to a single object (e.g., the color and orientation of a colored bar) will be difficult to integrate in WM when they belong to 2 different objects (e.g., the color and orientation of a black bar that is placed on top of a colored square). To isolate the WM representation from factors preceding and following WM maintenance, we monitored the CDA as an online marker of visual WM, exploiting the fact that this component is specifically sensitive to the number of objects in WM rather than to the number of features that compose each item (Woodman and Vogel 2008; Luria and Vogel 2011). In the first experiment, we tested whether a colored square and an oriented bar grouped by a shared location would be integrated in WM. To this end, we compared the CDA amplitude of 4 objects (i.e., 2 colored squares and 2 oriented bars) in 2 locations with that of 2 and 4 separate objects. Participants were asked to encode and maintain the colors and orientations, and hence the shared location was task-irrelevant. If a shared location induces an integration of the comprising objects, the CDA amplitude in the conjunction condition (4 objects in 2 locations) should be similar to the amplitude produced by 2 objects. This would disprove our alternative explanation, suggesting instead that WM can efficiently group different-dimension features even when they are presented across distinct objects.
Alternatively, if WM fails to integrate distinct objects when they only share a location, the CDA amplitude in the conjunction condition should be higher than that of 2 objects (or at least take time to reach the amplitude of 2 objects, like color–color conjunctions; Luria and Vogel 2011), supporting our claim that when the 2 features (which were previously shown to be perfectly integrated when belonging to a single object) belong to different objects they are not easily integrated in WM, even though the features are from 2 different dimensions.
If, as we suggest, WM does not necessarily integrate distinct objects, it is important to understand the factors that contribute to integration or disrupt it. To that end, we conducted additional experiments, which examined the potential roles of grouping-cue strength (Experiment 2), the task characteristics (Experiment 3), and their joint influence (Experiment 4), in the integration of distinct objects in visual WM.
Materials and Methods
All participants gave informed consent following the procedures of a protocol approved by the Ethics Committee at Tel Aviv University. They were either Tel Aviv University students who received course credit or 40 NIS (approximately $10) per hour for participation, or volunteers. All subjects had normal or corrected-to-normal visual acuity and normal color vision. Experiments 1 and 3 each included 10 participants in the final analysis (5 females, mean age 24.6 in Experiment 1, and 5 females, mean age 27.3 in Experiment 3), and Experiments 2 and 4 each included 15 participants in the final analysis (8 females, mean age 25.1 in Experiment 2, and 5 females, mean age 26.9 in Experiment 4). In all 3 electrophysiological experiments, subjects with a rejection rate higher than 35% were replaced (one in Experiment 2 and one in Experiment 4). Another participant in Experiment 2 was replaced due to a noisy recording.
Experiment 1: Stimuli and Procedure
We used the bilateral version of the change detection task (e.g., Vogel and Machizawa 2004). The stimuli were colored squares and tilted bars. We used 4 highly discriminable colors (red, yellow, green, and cyan) and 4 highly discriminable orientations (0°, 45°, 90°, and 135°). From a viewing distance of ∼60 cm, each square subtended ∼1.3° × 1.3° of visual angle. The bars were ∼0.2° wide, and ∼1.3° in length. The exact stimuli were randomly selected at the beginning of each trial, with the restriction that any stimulus could appear at most once (on each side). Stimuli appeared inside a 7.4° × 13.7° rectangle (one on each side of fixation). Inside each rectangle, the exact positions of the stimuli were randomized on each trial, with the constraint that the distance between the centers of any 2 stimuli would be no less than 4°.
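The placement constraint described above can be illustrated with a minimal sketch. Only the rectangle size and the 4° minimum center-to-center distance come from the text; the rejection-sampling scheme, the function name `sample_positions`, and the fixed seed are assumptions made for illustration, and the sketch ignores the physical extent of the stimuli near the rectangle's edges.

```python
import math
import random

# Values taken from the text: rectangle size and minimum spacing,
# all in degrees of visual angle.
REGION_W, REGION_H = 7.4, 13.7
MIN_DIST = 4.0


def sample_positions(n_items, rng, max_tries=10_000):
    """Draw n_items (x, y) centers inside the rectangle.

    Hypothetical helper: whole-set rejection sampling is an assumption,
    not necessarily how the original experiment placed its stimuli.
    """
    for _ in range(max_tries):
        pts = [(rng.uniform(0, REGION_W), rng.uniform(0, REGION_H))
               for _ in range(n_items)]
        # Accept the set only if every pair of centers is far enough apart.
        if all(math.dist(p, q) >= MIN_DIST
               for i, p in enumerate(pts) for q in pts[i + 1:]):
            return pts
    raise RuntimeError("could not place items with the required spacing")


# Example: four separate items on one side (as in the 4O-4L condition).
positions = sample_positions(4, random.Random(0))
```

Resampling the whole set, rather than one item at a time, keeps every accepted layout equally likely under the constraint, at the cost of more rejected draws.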
Each trial started with the presentation of a fixation point (“+”) in the middle of the screen for 500 ms. Then, 2 arrow-cues were presented for 200 ms above and below fixation, indicating the to-be-attended side for the upcoming trial (right or left, with an equal probability). Participants were instructed to memorize only the stimuli presented on the side indicated by the arrows. After a random interval (300, 400, or 500 ms, from cue offset), the memory array was presented for 200 ms, followed by a retention interval (during which only the fixation cross was presented) of 900 ms and then the test array (see Fig. 1A). The test array remained visible until a response was emitted. Participants made an unspeeded response via button press (using the “Z” and “/” keys on a computer keyboard, indicating “same” and “different,” respectively) to indicate whether the test array included only old items or one new item (with an equal probability for same and different trials; the test array at the uncued side was always identical to the memory array). On change trials, one of the items in the cued side was replaced by a new item from the same category (i.e., a color was replaced by a different color or a bar by a bar with a different orientation). Color-changes and orientation-changes were equally probable. On the other half of the trials, the test array was identical to the memory array.
The experiment included 3 possible conditions that were randomly intermixed within each block. In the 2O–2L condition (2 objects in 2 locations), one color and one bar were presented, each at a different location. In the 4O–4L condition (4 objects in 4 locations), 2 colors and 2 bars were presented, each at a different location. The 4O–2L condition (4 objects in 2 locations) included 2 conjunctions: 2 colors and 2 bars were presented in 2 shared locations, to create 2 sets of a tilted bar on top of a colored square. Participants started with a practice block of 12 trials, followed by 15 blocks, each consisting of 60 trials. The first block was considered practice, and the remaining 14 blocks (840 trials) were analyzed.
Experiment 2: Stimuli and Procedure
Experiment 2 was identical to Experiment 1, except as noted below. The items in the memory array moved for 1000 ms covering ∼1.5° of visual angle, and then remained stationary for 100 ms before disappearing (for a total of 1100 ms, see Fig. 1B). The items moved in a straight line and always stayed on the same side of the screen throughout their trajectory. The experiment included 4 possible conditions. In the 2O–2L condition (2 objects in 2 locations), one color and one bar were presented, each moving in a different direction. In the 4O–4L condition (4 objects in 4 locations), 2 colors and 2 bars were presented, each moving in a different direction. The 4O–2L condition (4 objects in 2 locations) included 2 “common fate” conjunctions: 2 colors and 2 bars were presented in 2 shared locations, to create 2 sets of a tilted bar on top of a colored square, and the items in each set moved together. The 4L-to-2L condition included 4 items that moved separately and then met to create 2 conjunctions (i.e., 2 sets of a tilted bar on top of a colored square).
Experiment 3: Stimuli and Procedure
The prime and probe were tilted black bars on top of colored squares. We used the red and green squares and horizontal and vertical bars from Experiment 1. The exact color and orientation of the stimuli were randomly selected at the beginning of each trial.
Each trial started with the presentation of a fixation point (“+”) in the middle of the screen for 250 ms. After a blank interval of 250 ms, the prime appeared, and participants were instructed to ignore it. After an interval of 500 ms, the probe appeared for 1000 ms, and subjects were asked to make a speeded response to the orientation of its bar by pressing the “Z” or “/” key on a computer keyboard (indicating “horizontal” and “vertical,” respectively).
The repetition versus alternation of orientation (horizontal or vertical bar) and color (red or green square) from prime to probe were manipulated independently, creating 4 possible and equally probable conditions that were randomly intermixed within each block: orientation repetition and color repetition, orientation repetition and color alternation, orientation alternation and color repetition, and orientation alternation and color alternation. Participants started with a practice block of 12 trials, followed by 8 experimental blocks, each consisting of 40 trials.
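The 2 × 2 prime-to-probe design above can be sketched as a trial list. The text states only that the 4 conditions were equally probable and intermixed within each 40-trial block; the exact balancing within a block (10 trials per cell) and the helper name `make_block` are assumptions.

```python
import itertools
import random


def make_block(n_trials=40, rng=None):
    """Return a shuffled list of (orientation_relation, color_relation) pairs.

    Hypothetical helper: assumes each of the 4 cells occurs equally often
    within a block, which the text does not state explicitly.
    """
    rng = rng or random.Random()
    # Cross orientation repetition/alternation with color repetition/alternation.
    cells = list(itertools.product(("repeat", "alternate"), repeat=2))
    if n_trials % len(cells):
        raise ValueError("n_trials must be a multiple of 4 for exact balancing")
    block = cells * (n_trials // len(cells))
    rng.shuffle(block)
    return block


block = make_block(40, random.Random(1))
```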
Experiment 4: Stimuli and Procedure
Experiment 4 was identical to Experiment 2, except as noted below. Instead of tilted bars, we used outlined shapes. There were 4 familiar shapes (a square, a circle, a star, and a triangle), presented in black, and subtending ∼1.3° × 1.3° of visual angle. Instead of colored squares, we used amorphous (“cloud”-like) patches of color. There were 4 light colors that were easily discriminable (yellow, light green, light blue, and pink). The colored patches subtended ∼2° × 2° of visual angle. After a stationary display of 100 ms, the memory array moved for 1000 ms, and then remained stationary for 100 ms. The 4 possible conditions were the same as in Experiment 2. In the 2O–2L condition (2 objects in 2 locations), one color and one shape were presented, each moving in a different direction. In the 4O–4L condition (4 objects in 4 locations), 2 colors and 2 shapes were presented, each moving in a different direction. The 4O–2L condition (4 objects in 2 locations) included 2 “common fate” conjunctions: 2 colors and 2 shapes were presented in 2 shared locations, to create 2 sets of a shape on top of a colored patch, and the items in each set moved together. The 4L-to-2L condition included 4 items that moved separately and then met to create 2 conjunctions (i.e., 2 sets of a shape on top of a colored patch).
Experiments 1, 2, and 4: EEG Recording
The electroencephalography (EEG) was recorded inside a shielded Faraday cage using a Biosemi ActiveTwo EEG recording system (Biosemi B.V., Amsterdam, The Netherlands). Data were recorded from 64 scalp-electrodes at locations of the extended 10–20 system, as well as from 2 electrodes placed on the left and right mastoids. The horizontal electrooculogram (EOG) was recorded from electrodes placed 1 cm to the left and right of the external canthi to detect horizontal eye movement, and the vertical EOG was recorded from an electrode beneath the left eye to detect blinks and vertical eye movements. The single-ended voltage was recorded between each electrode site and the common mode sense/driven right leg electrodes. Data were digitized at 256 Hz.
Offline signal processing and analysis were performed using EEGLAB Toolbox (Delorme and Makeig 2004), ERPLAB Toolbox (Lopez-Calderon and Luck 2014), and custom Matlab scripts. All electrodes were referenced to the average of the left and right mastoids. Artifact detection was performed using a peak-to-peak analysis, based on a sliding window 200-ms wide with a step of 100 ms. Trials containing activity exceeding 80 µV at the EOG electrodes or 100 µV at the analyzed electrodes (P7, P8, PO7, PO8, PO3, and PO4) were excluded from the averaged ERP waveforms. This procedure resulted in a mean rejection rate of 5.5% in Experiment 1, 11.9% in Experiment 2, and 9.8% in Experiment 4.
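The sliding-window peak-to-peak criterion can be sketched as follows. This is an illustrative re-implementation in plain Python, not the ERPLAB routine used in the study: the function names are hypothetical, and details such as rounding the window to whole samples and adding a final window flush with the epoch end are assumptions. The window width, step, thresholds, and 256-Hz sampling rate are taken from the text.

```python
def exceeds_peak_to_peak(signal_uv, srate_hz, threshold_uv,
                         window_ms=200, step_ms=100):
    """Return True if any sliding window has a peak-to-peak range above threshold."""
    if not signal_uv:
        return False
    # Convert window and step from milliseconds to whole samples (assumption).
    win = max(1, round(window_ms * srate_hz / 1000))
    step = max(1, round(step_ms * srate_hz / 1000))
    last_start = max(0, len(signal_uv) - win)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the end of the epoch is covered
    for start in starts:
        segment = signal_uv[start:start + win]
        if max(segment) - min(segment) > threshold_uv:
            return True
    return False


def reject_trial(eog_channels, analyzed_channels, srate_hz=256):
    """Apply the 80 µV (EOG) and 100 µV (analyzed electrodes) thresholds."""
    return (any(exceeds_peak_to_peak(ch, srate_hz, 80) for ch in eog_channels)
            or any(exceeds_peak_to_peak(ch, srate_hz, 100)
                   for ch in analyzed_channels))
```

Because the criterion is evaluated within short windows, a slow drift across the epoch does not trigger rejection unless it changes by more than the threshold within 200 ms.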
The continuous data were segmented into epochs from −200 to +1100 ms relative to onset of the memory array in Experiment 1, −200 to +2000 ms in Experiment 2, and −200 to +2100 ms in Experiment 4. The epoched data were then low-pass filtered using a noncausal Butterworth filter (12 dB/oct) with a half-amplitude cutoff point at 30 Hz. Only trials with a correct response emitted after at least 200 ms and at most 2000 ms after presentation of the test array were included in the analysis. The minimum number of trials per condition per subject following the rejection procedure was 190 in Experiment 1, 130 in Experiment 2, and 130 in Experiment 4.
Experiments 1, 2, and 4: CDA Analysis
Separate average waveforms for each condition were then generated, and difference waves were constructed by subtracting the average activity recorded at the electrodes ipsilateral to the memorized array from the average activity recorded at electrodes contralateral to the memorized array. For statistical purposes, in Experiment 1 we used the average activity between 300 and 1000 ms, time locked to onset of the memory array.
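As a minimal sketch of the computation just described, the contralateral-minus-ipsilateral difference wave and its mean amplitude in a measurement window could be computed as below. The function names and the synthetic waveforms are hypothetical; the 256-Hz sampling rate, the epoch start of −200 ms, and the 300 to 1000 ms window for Experiment 1 are taken from the text.

```python
def difference_wave(contra_uv, ipsi_uv):
    """Subtract the ipsilateral from the contralateral average, sample by sample."""
    if len(contra_uv) != len(ipsi_uv):
        raise ValueError("waveforms must have the same number of samples")
    return [c - i for c, i in zip(contra_uv, ipsi_uv)]


def mean_amplitude(wave_uv, srate_hz, epoch_start_ms, win_start_ms, win_end_ms):
    """Average the wave between two latencies given relative to array onset."""
    # Convert latencies to sample indices counted from the start of the epoch.
    first = round((win_start_ms - epoch_start_ms) * srate_hz / 1000)
    last = round((win_end_ms - epoch_start_ms) * srate_hz / 1000)
    window = wave_uv[first:last + 1]
    if not window:
        raise ValueError("measurement window lies outside the epoch")
    return sum(window) / len(window)


# Synthetic example (not real data): flat waveforms over a 334-sample epoch.
contra = [-2.0] * 334   # e.g., PO8 on attend-left trials
ipsi = [-1.0] * 334     # e.g., PO7 on attend-left trials
cda = mean_amplitude(difference_wave(contra, ipsi), 256, -200, 300, 1000)
```

Which electrode counts as contralateral depends on the cued side, so the two averages are formed per trial type before subtraction.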
The CDA can be monitored not only during the retention interval (when items are not visible), but also during visual tracking (e.g., Drew and Vogel 2008; Drew et al. 2011; Drew et al. 2012) and when the items are stationary but remain visible (Tsubomi et al. 2013). We could thus examine the dynamic influence of common fate (in Experiments 2 and 4) on visual WM representations both when the potent visual cue was perceptually available and during the maintenance stage. In these experiments, we analyzed the average activity in 2 time ranges, both time locked to onset of the memory array: one for the Tracking CDA (between 300 and 1000 ms in Experiment 2, and between 400 and 1000 ms in Experiment 4), and one for the Memory CDA (between 1300 and 2000 ms in Experiment 2, and between 1400 and 2100 ms in Experiment 4).
For ease of description, we will only present the results from the PO7/PO8 electrodes because that is where the CDA amplitude is most evident. However, we analyzed the results over neighboring electrodes (P7/P8 and PO3/PO4) and found similar patterns of activity.
Experiments 1, 2, and 4: Statistical Analysis
In all 3 experiments, we conducted a one-way analysis of variance (ANOVA) with condition as a within-subject variable on the CDA mean amplitude as a dependent variable, and another one-way ANOVA with condition as a within-subject variable on accuracy as a dependent measure. All of these tests revealed a significant effect of condition, all Fs > 5, all Ps < 0.05. We do not further report them, instead focusing on the results of the planned comparisons (contrasts) between the different conditions.
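For a planned comparison between 2 within-subject conditions, the F statistic with 1 and n − 1 degrees of freedom equals the square of the paired t statistic when the error term is computed from that contrast alone. The sketch below illustrates this relation; the function name is hypothetical, and whether the reported contrasts used exactly this error term is an assumption, since the text does not say.

```python
import math
import statistics


def paired_contrast_f(cond_a, cond_b):
    """Return (F, (df1, df2)) for a two-condition within-subject contrast.

    Assumes a contrast-specific error term, so that F equals the squared
    paired t statistic with n - 1 error degrees of freedom.
    """
    if len(cond_a) != len(cond_b):
        raise ValueError("both conditions need one value per subject")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # Paired t: mean difference divided by its standard error.
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t * t, (1, n - 1)


# Made-up per-subject values, for illustration only.
f_value, df = paired_contrast_f([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0])
```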
Experiment 3: Data Analysis
Trials with reaction times (RTs) that were more than 2 standard deviations from the mean were removed from RT analyses. We then calculated mean RTs for correct trials and accuracy rates as a function of the repetition versus alternation of orientation and color, and submitted these variables to two-by-two ANOVAs.
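The RT trimming rule can be sketched as below. The text does not specify whether the mean and standard deviation were computed per subject, per condition, or over all trials, nor whether the sample or population standard deviation was used; this sketch assumes a single list of RTs and the sample standard deviation, and the function name is hypothetical.

```python
import statistics


def trim_rts(rts_ms, n_sd=2.0):
    """Keep RTs within n_sd sample standard deviations of the mean."""
    mean = statistics.mean(rts_ms)
    sd = statistics.stdev(rts_ms)  # sample SD (an assumption)
    return [rt for rt in rts_ms if abs(rt - mean) <= n_sd * sd]


# Made-up RTs in milliseconds, for illustration only.
kept = trim_rts([500, 510, 490, 505, 495, 500, 1500])
```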
In Experiment 1, we sought to test the integration of different-dimension features belonging to different objects. We examined the WM representation of colored squares and oriented bars sharing a location, using both behavioral performance and the CDA as an electrophysiological marker for the number of representations in visual WM. There is quite robust evidence for a perfect and immediate WM-integration of color and orientation when these features belong to a single object (e.g., Luck and Vogel 1997; Vogel et al. 2001; Luria and Vogel 2011). However, we claim that when these features belong to different objects they would not mandatorily or immediately integrate in WM, similarly to different objects carrying same-dimension information (i.e., color–color conjunctions).
To investigate the integration of different-dimension conjunctions, we compared the CDA amplitude in the 4O–2L condition, including 4 objects in 2 shared locations, to the CDA amplitudes of 2 and 4 separate objects in the 2O–2L and 4O–4L conditions. If a shared location produces a perfect integration of the 4 objects into 2 objects, the CDA amplitude in the 4O–2L condition should be similar to the CDA amplitude of the 2O–2L condition. This would suggest that WM efficiently integrates different-dimension features even when they belong to separate objects. Conversely, if WM does not integrate objects based only on their shared location, the CDA amplitude in the 4O–2L condition should be similar to the CDA amplitude of the 4O–4L condition. This would support the hypothesis of a difficulty in integrating different objects in WM, even for different-dimension conjunctions.
The CDA waveforms for the different conditions are presented in Figure 2A. Our results indicate that a color and an orientation belonging to different objects and sharing a location were not integrated into a bound representation in WM. The CDA amplitude in the 4O–2L condition (4 objects in 2 locations; −1.82 µV) was higher than in the 2O–2L condition (−0.77 µV), F(1, 9) = 19.20, mean squared error (MSE) = 0.29, P < 0.005, and did not differ from the CDA amplitude in the 4O–4L condition (−1.98 µV), F(1, 9) = 1.09, MSE = 0.12, P = 0.32. Replicating previous findings, the CDA amplitude in the 4O–4L condition was higher than in the 2O–2L condition, F(1, 9) = 41.24, MSE = 0.18, P < 0.0005.
The accuracy for the different conditions is presented in Figure 2B. Mirroring the electrophysiology data, we found that accuracy for the 4O–2L condition (0.93) was lower than in the 2O–2L condition (0.98), F(1, 9) = 13.69, MSE = 0.00, P < 0.005, and did not differ from the 4O–4L condition (0.93), F(1, 9) < 1. Accuracy in the 2O–2L condition was higher than in the 4O–4L condition, F(1, 9) = 26.28, MSE = 0.001, P < 0.001.
Both the CDA and behavioral results suggest that objects in the 4O–2L condition were not integrated in WM, supporting our claim that different objects are difficult to integrate in WM even when carrying different-dimension features. This sheds new light on previous results concerning color–color conjunctions (e.g., Olson and Jiang 2002; Delvenne and Bruyer 2004), suggesting that the lack of integration in these cases could have originated from the fact that the 2 colors belonged to 2 distinct objects. Interestingly, the shared location, which was strong enough to gradually integrate color–color conjunctions (Luria and Vogel 2011), did not change the WM representations of color-orientation conjunctions at all, suggesting that different-dimension conjunctions are even more difficult to integrate than same-dimension conjunctions, an option we further examine in the following experiments.
Recently, Luria and Vogel (2014) found that given a strong enough grouping cue, subjects would immediately and perfectly integrate color–color conjunctions: when these same-dimension conjunctions moved according to the Gestalt principle of common fate, they were immediately integrated, similarly to the features of a single object. Thus, perhaps all that is needed for different-object integration to occur is a strong enough grouping cue. However, it is possible that when it comes to different objects (unlike the different features of a single object), grouping cues are not enough for an integration to occur. Instead, an interaction of several factors is necessary, and hence our different-dimension conjunctions should be represented separately in WM even following a stronger grouping cue, an option we tested in the following experiment.
The goal of Experiment 2 was to test whether the potent Gestalt principle of common fate, previously demonstrated to immediately integrate color–color conjunctions (Luria and Vogel 2014), would be enough to cause WM to maintain our color-orientation conjunctions as an integrated object in WM. If even common fate does not induce a perfect integration for our different-dimension objects, the results would strengthen the hypothesis that different objects are difficult to integrate in WM, and mark different-dimension objects as harder to integrate than same-dimension conjunctions.
Since the color and orientation in Experiment 1 belonged to 2 distinct objects, placing them at the same location could have been interpreted as independent figure and ground. In the 4O–2L condition (4 items in 2 groups) of the current experiment, the color and orientation not only shared a location but also moved together for one second before disappearing, giving a stronger grouping cue. The 4O–2L condition was compared with conditions in which the items moved separately, to test whether the shared location would cause the items to be integrated in WM and represented as 2 objects (producing a CDA amplitude similar to that of 2 separate objects in the 2O–2L condition), or remain unintegrated and represented as 4 objects (producing a CDA amplitude similar to that of 4 separate objects in the 4O–4L condition).
We included an additional control condition (the 4L-to-2L condition), in which 4 objects started at separate locations but then met to create 2 shared-location conjunctions such as those in the 4O–2L condition of Experiment 1. The final visual input in this condition was identical to that of the 4O–2L condition of the current experiment, but its history was different, allowing us to isolate the joint movement aspect of our 4O–2L condition from its shared location aspect that was also available in Experiment 1. The CDA was monitored both during the dynamic display (Tracking CDA) and during the retention interval (Memory CDA).
Electrophysiological Results: Tracking CDA
The CDA waveforms for the different conditions are presented in Figure 2C. During the presentation of the memory array, we found evidence for partial, although far from perfect, integration following a common fate cue. While the CDA amplitude in the 4O–2L condition (4 objects moving together in 2 groups; −1.63 µV) was lower than in the 4O–4L condition (−2.08 µV), F(1, 14) = 4.99, MSE = 0.31, P < 0.05, it was also higher than in the 2O–2L condition (−0.67 µV), F(1, 14) = 23.04, MSE = 0.30, P < 0.0005. Similarly to Experiment 1, the shared-location cue in the 4L-to-2L condition (4 objects meeting to form 2 groups) also did not lead to an integrated representation of color and orientation in WM. The CDA amplitude in the 4L-to-2L condition (−2.39 µV) was higher than in the 2O–2L condition, F(1, 14) = 43.05, MSE = 0.51, P < 0.00005, and in the 4O–2L condition, F(1, 14) = 8.07, MSE = 0.54, P < 0.05, and did not differ from the CDA amplitude in the 4O–4L condition, F(1, 14) = 1.83, MSE = 0.39, P = 0.20. The CDA amplitude in the 4O–4L condition was higher than in the 2O–2L condition, F(1, 14) = 65.20, MSE = 0.23, P < 0.000005.
Electrophysiological Results: Memory CDA
We found an identical pattern of results for the Memory CDA as for the Tracking CDA. The CDA amplitude in the 4O–2L condition (−1.27 µV) was lower than the CDA amplitude in the 4O–4L condition (−1.77 µV), F(1, 14) = 5.30, MSE = 0.35, P < 0.05, but higher than in the 2O–2L condition (−0.46 µV), F(1, 14) = 18.60, MSE = 0.26, P < 0.001. The CDA amplitude in the 4L-to-2L condition (−1.72 µV) was higher than in the 2O–2L condition, F(1, 14) = 37.40, MSE = 0.32, P < 0.00005, and in the 4O–2L condition, F(1, 14) = 4.97, MSE = 0.3, P < 0.05, and was not significantly different from the CDA amplitude in the 4O–4L condition, F < 1. The CDA amplitude in the 4O–4L condition was higher than in the 2O–2L condition, F(1, 14) = 56.01, MSE = 0.23, P < 0.000005.
The accuracy for the different conditions is presented in Figure 2D. We found no evidence for an object benefit produced by a common fate cue, replicating the pattern of the proximity cue in Experiment 1. Accuracy was higher in the 2O–2L condition (0.94) than in the 4O–2L condition (0.89), F(1, 14) = 15.76, MSE = 0.001, P < 0.005, in the 4L-to-2L condition (0.89), F(1, 14) = 16.79, MSE = 0.001, P < 0.005, and in the 4O–4L condition (0.88), F(1, 14) = 14.34, MSE = 0.002, P < 0.005. Accuracy did not differ among the other 3 conditions, all Fs < 1.
The results thus far clearly speak against the creation of a fully integrated visual WM representation for different objects following a grouping cue. Namely, although the CDA amplitude in the 4O–2L condition was lower than the amplitude in the 4O–4L condition (indicating that the common fate cue had some effect), it was also significantly higher than the amplitude in the 2O–2L condition, indicating imperfect integration (we will return to this point in the Discussion). This supports our claim that the limiting factor in WM integration is whether the features compose distinct objects or belong to a single object. When comparing the results of the first 2 experiments with previous research on color–color conjunctions, we found that our color-orientation conjunctions were actually less prone to integration: color–color conjunctions were previously found to gradually integrate when sharing a location (Luria and Vogel 2011), and immediately integrate following a common fate cue (Luria and Vogel 2014). We suggest that this finding highlights the delicate interaction of several factors in different-objects integration in WM.
One of these factors could be the specific task used, and this idea receives support from a comparison of our results with past research involving different paradigms. While the potent Gestalt principle of common fate did not lead to an integration of the different objects in Experiment 2, there are numerous studies reporting grouping following Gestalt organization in different tasks. Gestalt principles have been claimed to act preattentively and reflexively (e.g., Duncan 1984), since they influence the allocation of attention and emerge regardless of task-relevance (e.g., Moore and Egeth 1997; Lamy et al. 2006). The influence of grouping cues such as proximity and common fate has been widely demonstrated through object-based attention paradigms, generally demonstrating faster or more accurate responses for grouped objects compared with the same number of ungrouped objects (e.g., Baylis and Driver 1992; Egly et al. 1994; Beck and Palmer 2002). This suggests that the paradigm might influence the outcomes of the grouping, and hence the goal of the next experiment was to examine whether an easier task would indeed produce more integration of our different-objects conjunctions.
In Experiment 3 we wanted to find evidence for an integration of the color-orientation conjunctions of Experiment 1 and 2 in a different task, and thus show that the lack of integration was not the result of specifically problematic stimuli. Here, we employed a behavioral object review paradigm (see Fig. 3A), in which it has been previously found that a shared location was enough for items to be encoded into a single object file (e.g., Kahneman et al. 1992; van Dam and Hommel 2010). For example, van Dam and Hommel (2010) used shared-location conjunctions of colors and orientations similar to those of the 4O–2L condition in Experiment 1, presenting a task-irrelevant prime conjunction followed by a probe conjunction whose orientation was to be reported. They found that partial repetitions of features from the prime to the probe (e.g., a horizontal bar on top of a green color, followed by a horizontal bar on top of a red color) elicited slower responses compared with complete repetitions or complete alternations of the features. This was interpreted as the encoding of both the color and the orientation of the prime to one object file, which had to be accessed and updated on partial repetition trials.
We used the object review paradigm for 3 reasons. First, it was extensively used to demonstrate the integration of different features or items in a single object file (e.g., Kahneman et al. 1992; Henderson 1994; Gordon and Irwin 1996; Hommel 1998; van Dam and Hommel 2010; Zmigrod and Hommel 2010). Second, the object review paradigm is easier than the change detection paradigm, since only one item has to be monitored in each trial (instead of 2 or 4 in Experiments 1 and 2), and the task simply requires reporting which of 2 possible features appeared. Third, the object review paradigm puts less emphasis on the independence of the items, because only one dimension has to be monitored. In contrast, in the change detection paradigm the features may change independently, which encourages their individuation. Thus, if a pattern of grouping (i.e., an interaction between orientation repetition and color repetition) emerges, it would suggest that the same stimuli which were represented separately in WM when presented in the change detection paradigm are capable of producing a pattern of integration in a different task, marking task demands as a crucial factor in different-objects integration [Note that we actually used a subset of the stimuli from Experiment 1 (two possible colors, red and green, and two possible orientations, vertical and horizontal), since the object review paradigm in its prominent form involves binary feature values].
Mean RTs for correct trials are displayed in Figure 3B. A two-by-two ANOVA with Orientation repetition and Color repetition as within-subject variables on mean RT as a dependent measure revealed no main effect of Orientation or of Color, both Fs < 1. Importantly, replicating previous work (e.g., van Dam and Hommel 2010), the interaction between these factors was significant, F1, 9 = 16.258, MSE = 69, P < 0.005, such that repeating only one feature of the prime (i.e., orientation repetition and color alternation, or orientation alternation and color repetition) led to worse performance compared with complete repetitions (i.e., orientation repetition and color repetition) or complete alternations (i.e., orientation alternation and color alternation). The same analysis conducted on accuracy as a dependent measure did not yield any significant effects. Accuracy rates were 0.95 in the orientation repetition and color repetition condition, 0.95 in the orientation repetition and color alternation condition, 0.94 in the orientation alternation and color repetition condition, and 0.94 in the orientation alternation and color alternation condition.
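In a two-by-two within-subject design, the interaction reduces to a single contrast per subject (partial-repetition cells minus complete-repetition and complete-alternation cells), and the interaction F with 1 numerator degree of freedom equals the squared one-sample t of those contrast scores. The following is a minimal sketch of that computation. The function names are hypothetical and the RT values are made-up placeholders for illustration only, not data from Experiment 3.

```python
import math
from statistics import mean, stdev

def partial_repetition_costs(rts):
    """Per-subject interaction contrast for a 2x2 repetition design.

    Each row holds one subject's mean RTs (ms) in the order:
    rr = orientation repeat, color repeat
    ra = orientation repeat, color alternate
    ar = orientation alternate, color repeat
    aa = orientation alternate, color alternate
    Cost = mean of the two partial-repetition cells minus the mean of
    the complete-repetition and complete-alternation cells.
    """
    return [((ra + ar) - (rr + aa)) / 2 for rr, ra, ar, aa in rts]

def interaction_f(costs):
    """Interaction F and its df, as the squared one-sample t of the costs."""
    n = len(costs)
    t = mean(costs) / (stdev(costs) / math.sqrt(n))
    return t * t, (1, n - 1)

# Placeholder RTs (ms) for four imaginary subjects; NOT the reported data.
example_rts = [
    (500, 520, 515, 505),
    (480, 495, 500, 478),
    (530, 545, 550, 535),
    (510, 522, 530, 512),
]
costs = partial_repetition_costs(example_rts)
f_value, df = interaction_f(costs)
print(f"mean cost = {mean(costs):.1f} ms, F{df} = {f_value:.2f}")
```

A positive mean cost corresponds to slower responses on partial-repetition trials, which is the direction of the effect described above.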
These partial repetition costs replicate previous findings (e.g., van Dam and Hommel 2010), indicating that the objects sharing a location were encoded to a single object file, whose updating required time to complete. Thus, in a different paradigm, a shared location successfully produced a pattern of grouping for the same color-orientation conjunctions that were represented separately in the change detection task of Experiment 1. It is important to note that despite their widespread use as a marker of integration, these partial repetition costs do not allow for a fine-grained examination of the integration pattern. For example, it might be that this integration takes time to develop, as was found for color–color conjunctions using the change detection paradigm (Luria and Vogel 2011), or was only partial (e.g., occurred only in a certain proportion of the trials). Therefore, we do not consider the behavioral effect of the current experiment to be strong evidence in favor of a full integration of distinct objects. However, the results of Experiment 3 suggest that the problem in integrating the items in Experiments 1 and 2 was not entirely due to particularly problematic stimuli, and hence the current results provide initial support for the importance of the paradigm in such integration. Specifically, since Experiments 1 and 3 used the very same stimuli and grouping principles, their contrasting results point to the role of the particular task demands in the integration of different objects following a grouping cue, a point we return to in the Discussion.
Past evidence suggests that different-objects grouping is possible within the change detection task as well (e.g., Woodman et al. 2003; Luria and Vogel 2011; 2014), but since it is clear from our first 2 experiments that such integration does not always occur, it is important to further test the conditions needed for it to take place. In Experiment 4, we attempted to increase the likelihood of grouping separate objects within the change detection task.
While Experiment 3 (together with the long line of Gestalt literature) established that grouping separate items is possible, Experiments 1 and 2 demonstrated that it is not a mandatory process within WM. Our goal in the current experiment was to provide evidence for different-objects integration within the 2-dimension change detection task, under more potent grouping conditions. First, we replaced the orientations of Experiments 1–3 with simple shapes, which we found in pilot studies to be easier to remember. Second, we used the potent Gestalt cue of common fate, which produced evidence for a partial integration in Experiment 2. Third, we replaced the colored squares with amorphic color patches, to remove the sharp contours which might have emphasized the distinctiveness of the items.
Electrophysiological Results: Tracking CDA
The CDA waveforms for the different conditions are presented in Figure 2E. During the presentation of the memory array, we found that common fate movement did not lead to a full integration of color and shape in visual WM. The CDA amplitude in the 4O–2L condition (4 objects moving together in 2 groups, −1.00 µV) was lower than the CDA amplitude in the 4O–4L condition (−1.53 µV), F1, 14 = 9.10, MSE = 0.24, P < 0.01, but still higher than in the 2O–2L condition (−0.68 µV), F1, 14 = 4.89, MSE = 0.15, P < 0.05. Items in the 4L-to-2L condition (4 objects meeting to form 2 color–shape conjunctions; −2.02 µV) were also represented as separate objects in WM. The CDA amplitude in the 4L-to-2L condition was higher than in the 2O–2L condition, F1, 14 = 29.78, MSE = 0.45, P < 0.0001, the 4O–2L condition, F1, 14 = 27.26, MSE = 0.29, P < 0.0005, and even higher than the CDA amplitude in the 4O–4L condition, F1, 14 = 16.06, MSE = 0.11, P < 0.005. The CDA amplitude in the 4O–4L condition was higher than in the 2O–2L condition, F1, 14 = 13.73, MSE = 0.40, P < 0.005.
Electrophysiological Results: Memory CDA
In contrast to Experiment 2, during the retention interval, the common fate color–shape conjunctions were integrated in WM. The CDA amplitude in the 4O–2L condition (−0.96 µV) was lower than the CDA amplitude in the 4O–4L condition (−1.41 µV), F1, 14 = 6.54, MSE = 0.24, P < 0.05, and did not differ from the CDA amplitude in the 2O–2L condition (−0.74 µV), F1, 14 = 1.41, MSE = 0.25, P = 0.25. The 4L-to-2L condition was still represented separately, even though the last visual input was identical in this condition and in the 4O–2L condition. The CDA amplitude in the 4L-to-2L condition (−1.69 µV) was higher than in the 2O–2L condition, F1, 14 = 25.03, MSE = 0.27, P < 0.0005, in the 4O–2L condition, F1, 14 = 17.07, MSE = 0.24, P < 0.005, and even in the 4O–4L condition, F1, 14 = 5.18, MSE = 0.11, P < 0.05. The CDA amplitude in the 4O–4L condition was higher than in the 2O–2L condition, F1, 14 = 10.87, MSE = 0.32, P < 0.01.
The CDA pattern of Experiment 4 suggests that distinct objects carrying different-dimension features can be integrated in the change detection paradigm, under certain conditions. However, when comparing Experiments 2 and 4, we interpret Experiment 4 as evidence for a better integration, rather than a perfect integration. This is mainly because the integration was only gradual, unlike the immediate integration of the different features of multidimensional objects (e.g., Woodman and Vogel 2008) or color–color conjunctions following a common fate cue (Luria and Vogel 2014). We will return to this point in the Discussion.
The accuracy for the different conditions is presented in Figure 2F. Behaviorally, the common fate cue produced object benefits for the color–shape conjunctions. Accuracy in the 4O–2L condition (0.93) was higher than in the 4O–4L condition (0.9), F1, 14 = 16.62, MSE = 0.001, P < 0.005, although lower than in the 2O–2L condition (0.98), F1, 14 = 22.88, MSE = 0.001, P < 0.0005. Accuracy in the 4L-to-2L condition (0.92) was lower than in the 2O–2L condition, F1, 14 = 44.76, MSE = 0.001, P < 0.00005, and the 4O–2L condition, F1, 14 = 5.82, MSE = 0.001, P < 0.05, although higher than in the 4O–4L condition, F1, 14 = 5.82, MSE = 0.001, P < 0.05. Accuracy was higher in the 2O–2L condition than the 4O–4L condition, F1, 14 = 34.21, MSE = 0.001, P < 0.00005.
Thus, we found a behavioral benefit for objects arranged according to Gestalt principles, corroborating the CDA findings. Interestingly, the improvement in accuracy was only partial, since the 4O–2L condition resulted in lower accuracy relative to the same number of simple objects (the 2O–2L condition). We will return to this point in the Discussion.
To provide support for our claim that the greater integration is due, at least in part, to a less challenging task, we compared accuracy in the 2O–2L condition between Experiments 2 and 4. This condition is below the usual capacity estimates of WM (∼3–4 items), and therefore any differences can be more confidently attributed to specific task demands rather than a general overloading of WM. Subjects in Experiment 4 were indeed more accurate than in Experiment 2, t28 = 2.07, P < 0.05.
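As a rough check on a reported test statistic, a two-tailed P value can be recovered from a t value and its degrees of freedom. The sketch below does this with the Python standard library only, integrating the Student t density with Simpson's rule. The function names and the number of integration steps are illustrative choices, not part of the reported analysis, and the software used for the original analysis is not stated in the text.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed P value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 2 * (0.5 - area)

# Reported comparison of 2O-2L accuracy between Experiments 2 and 4.
p = two_tailed_p(2.07, 28)
print(f"t(28) = 2.07, two-tailed P = {p:.3f}")
```

Because an F statistic with 1 numerator degree of freedom is the square of a t statistic, the planned comparisons reported as F1, 14 can be checked the same way, for example with `two_tailed_p(math.sqrt(4.99), 14)`.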
The goal of the present study was to examine the integration of distinct objects in WM. We hypothesized that an integration of such conjunctions would be difficult even when the objects carry different-dimension features. Indeed, in Experiment 1 both the CDA and behavioral results indicated that a color and an orientation belonging to distinct objects were not integrated in WM when placed in the same location in the 4O–2L condition. Thus, unlike the different features of an object which are immediately and perfectly integrated in WM (e.g., Luck and Vogel 1997; Luria and Vogel 2011), similar feature information conveyed by different objects is not necessarily integrated to one unit. We suggest that an integration of distinct objects to one WM-unit is a demanding process that depends not only on the stimuli but on an interaction between several additional factors.
To capture our theoretical approach compared with previous claims, we present a summary of our results and interpretations in Figure 4. As can be seen in the figure, the stimuli used in the research of WM integration can be divided into 3 types. The first type of stimulus is multidimensional objects, such as a colored bar. These items were generally found to be represented as an integrated object (e.g., Luck and Vogel 1997; Vogel et al. 2001), suggesting that the different-dimension features of an object are immediately and mandatorily integrated to a bound representation. The second type of stimulus is same-dimension conjunctions, of which the usual example is color–color conjunctions. These items were shown to lead to a gradual integration (Luria and Vogel 2011) or to no integration at all (e.g., Olson and Jiang 2002). The source of the difficulty WM has with such integration was typically attributed to the fact that the conjunction includes 2 features from a single dimension (i.e., color), meaning that same-dimension conjunctions are not easily integrated in WM. In contrast, we suggest that the cause of difficulty lies in the fact that these stimuli can be interpreted as 2 superimposed distinct objects, in which case their integration will be nonmandatory. Therefore, in the current study we introduced a third type of stimuli, namely different-dimension conjunctions of distinct objects, such as a tilted bar on top of a colored square. Similarly to the mixed results regarding color–color conjunctions, we found that under certain conditions there can be some evidence for an integration of color-orientation or color-shape conjunctions (see Experiments 3 and 4), but this integration is nonmandatory (see Experiments 1 and 2) and gradual.
Since our stimuli carried features from different dimensions, the lack of a perfect integration suggests a reconceptualization of the visual WM framework, with same-dimension conjunctions and different-dimension conjunctions being treated in a similar way. We argue that whenever distinct objects are concerned, WM will not necessarily integrate them.
We suggest that the basic WM-unit that is mandatorily integrated is an object in its simplest form, namely, it includes a shape, a color and it occupies a position in space. Some of these features may be task-irrelevant, such as the black color of our bars, but, importantly, none of them can be eliminated (e.g., all objects have at least one color). In the case of a color–color conjunction or when superimposing a tilted bar on a colored square, it is possible to eliminate one feature (e.g., one of the colors) and still perceive an object. When WM encodes these types of stimuli some integration is needed and this may lead to costs in performance observed by previous studies. Hence, in the case of distinct objects, the WM representation depends on several factors, some of which were identified in the current study and are outlined next. Importantly, we do not claim that we are able to determine all of the potential influences on the integration of distinct objects in WM, but simply that such integration is not a mandatory process.
The first and most obvious influence on the integration of different objects is the strength of the grouping cues. While a shared location did not change the WM representation of our color-orientation conjunctions in the 4O–2L condition, in Experiment 2 a more potent common fate cue produced a partial integration of the objects. However, this Gestalt principle did not suffice to fully integrate the conjunctions in the 4O–2L condition, which stands in contrast to the immediate and complete integration of color–color conjunctions following such a cue (Luria and Vogel 2014), indicating that the stimulus type also plays a role in the likelihood of integration.
The source for the somewhat counterintuitive finding that different-dimension conjunctions are even harder to integrate than same-dimension (e.g., color–color) conjunctions could be the characteristics of our version of the change detection task, which specifically encouraged an independent monitoring of each item. Our subjects were actually given 2 different tasks, a color-monitoring task and an orientation-monitoring task, each task involving distinct items. This highlights the separation of the items to a greater extent than the usual single-dimension change detection task used in previous studies of color–color conjunctions involving only a color task.
The importance of the paradigm in encouraging (or discouraging) integration is further supported by the results from Experiments 3 and 4. In Experiment 3, the same color-orientation conjunctions of the 4O–2L condition of Experiment 1, which were represented separately in the change detection task, produced a pattern of grouping in the object review paradigm. While this pattern is ambiguous in the sense that it does not allow for an examination of the exact degree (i.e., complete versus partial) of the grouping, it was widely used in the past as a marker of integration (e.g., Kahneman et al. 1992; van Dam and Hommel 2010). Since the major difference between Experiments 1 and 3 was the task and not the stimuli or the Gestalt cue, the discrepancy between the results obtained across these experiments points to the role of the specific paradigm used in the integration of distinct objects. The change detection paradigm is harder, and specifically encourages an individuating strategy since each item can change independently, and in a different dimension. In contrast, the object review paradigm is simple and requires the monitoring of only one dimension.
Experiment 4 demonstrated that integration of different objects carrying different-dimension features is possible in the change detection task, once its individuating nature is attenuated. When an easier task was used, amorphic color patches and simple shapes were gradually integrated in WM following a potent common fate cue. However, it is important to note that even under these conditions, the integration of distinct objects was not identical to that of the different features of a single object, since it took time to complete and produced only a partial behavioral benefit. We consider Experiment 4 to be an example of a better integration of distinct objects than Experiment 2, highlighting factors that support such integration, but we cannot regard it as a perfect integration of distinct objects carrying different-dimension features (such as the perfect integration of color–color conjunctions following a common fate cue, Luria and Vogel 2014). This is because the integration took time, and because the evidence for it relies on a null difference (between the 2O–2L and 4O–2L conditions). Taken together, our results suggest that different objects can be integrated to form complex WM-units, but this process is highly sensitive not only to stimulus-driven cues such as Gestalt principles, but also to a task environment that supports the integration signal.
Another factor that influences the integration of different objects becomes apparent when comparing the 4O–2L and 4L-to-2L conditions of Experiments 2 and 4: the history of the items (Hollingworth and Rasmussen 2010; Luria and Vogel 2014). The final visual input of these 2 conditions was identical, but their movement history was different. Items in the 4L-to-2L condition previously moved separately, and were represented separately in WM (producing CDA amplitudes similar to 4 items) despite finally meeting, while items with a constant common fate in the 4O–2L condition were at least partially integrated in WM. These very distinct representations suggest that WM representation is not determined solely by the last available visual input but by the sum of information about the item.
The flexibility of WM representations is manifested also in the fact that the CDA amplitude produced by the 4L-to-2L condition was even higher than that of the 4O–4L condition (Experiment 4). Recently, it has been found (Drew et al. 2011) that tracking the spatial locations of items in a multiple object tracking task induces a higher CDA amplitude compared with the monitoring of the items' identity. Perhaps dynamically updating the representations of the items (due to their expected interaction in the 4L-to-2L condition), even when the updating is not purely spatial, imposes a greater demand on WM. This aspect of WM representations could be the target for future research.
An important point that arises from our findings regarding Gestalt organization involves the partial behavioral benefit found in Experiment 4. The 4O–2L condition, including 4 objects in 2 Gestalt groups (arranged by common fate) resulted in higher accuracy compared with 4 unorganized objects in the 4O–4L condition, corroborating previous research, but still had lower accuracy relative to 2 unorganized objects in the 2O–2L condition. The comparison of a certain number of Gestalt groups to the same number of simple objects is critical if one wishes to demonstrate a full integration caused by the Gestalt principles, but it was lacking from most previous studies (usually involving only a comparison of a certain number of objects organized in a Gestalt group to the same number of unorganized objects). A conclusion as to whether the integration is partial or complete cannot be drawn from a behavioral improvement following Gestalt organization, without the proper control conditions or the simultaneous monitoring of the appropriate electrophysiological marker. In addition, comparing Experiments 1 and 3 reveals that different tasks respond to grouping principles in a different way, and cues that would cause integration in one paradigm might fail to do so in a different paradigm. Of course, there are indications for a complete object integration in WM (see Luria and Vogel 2014). We only suggest that caution should be taken when interpreting behavioral object benefits, since other factors such as task demands and objects’ history appear to be crucial in different-objects integration.
A final question that arises from our findings concerns the meaning of the so-called “partial integration”. In Experiment 2 and in the movement phase of Experiment 4, we found that while common fate did not induce a complete integration (since the CDA amplitude in the 4O–2L condition was higher than that of 2 objects in the 2O–2L condition), it did influence WM representations to some extent (since the amplitude was lower than that of 4 objects in the 4O–4L condition, carrying the same visual information). An analogous pattern was found in a functional magnetic resonance imaging (fMRI) study by Xu and Chun (2007), who examined activation of the intraparietal sulcus (IPS; a brain area related to visual WM, see Xu and Chun 2006). They found that 3 shapes arranged in 2 Gestalt groups resulted in a lower IPS activation relative to 3 ungrouped items, but a higher activation than that of 2 ungrouped items. An intermediate CDA probably reflects a mixture of fully integrated and completely unintegrated representations. It could be that some subjects perfectly integrated the objects while others kept them perfectly separate. Alternatively, for every subject certain trials resulted in a completely integrated representation while other trials resulted in a completely unintegrated representation (either randomly, gradually along the task, or such that only certain conjunctions induced a perfect integration). The results thus far cannot rule out any of these options, but nonetheless this implies that WM representations are complex and flexible.
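To make the mixture reading concrete, the sketch below solves for the share of integrated representations that would yield an intermediate amplitude. It rests on assumptions that go beyond the reported analyses: that the mean CDA averages linearly over the two states, and that the 2O–2L and 4O–4L amplitudes can serve as the fully integrated and fully separate anchors. The function name is hypothetical, and the resulting number is an illustration of the arithmetic rather than an estimate supported by the data.

```python
def integrated_proportion(observed: float, separate: float, integrated: float) -> float:
    """Share of integrated representations under a linear two-state mixture.

    Assumes observed = p * integrated + (1 - p) * separate and solves for p.
    """
    span = separate - integrated
    if span == 0:
        raise ValueError("anchor amplitudes must differ")
    return (separate - observed) / span

# Mean Memory CDA amplitudes (in microvolts) reported for Experiment 2:
# 4O-2L as the observed value, 4O-4L and 2O-2L as the two anchors.
p = integrated_proportion(observed=-1.27, separate=-1.77, integrated=-0.46)
print(f"share of integrated representations under the mixture assumption: {p:.2f}")
```

A single proportion of this kind is compatible with both accounts outlined above (a mixture across subjects or a mixture across trials within each subject), so it cannot by itself distinguish between them.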
This work was supported by an Israel Science Foundation grant 1696/13 to R.L.
Conflicts of Interest: None declared.