Abstract

Increasing evidence over the past decade suggests that vision is not simply a passive, feed-forward process in which cortical areas relay progressively more abstract information to those higher up in the visual hierarchy, but rather an inferential process with top-down processes actively guiding and shaping perception. However, one major question that persists is whether such processes can be influenced by unconsciously perceived stimuli. Recent psychophysics and neuroimaging studies have revealed that while consciously perceived stimuli elicit stronger responses in higher visual and frontoparietal areas than those that fail to reach conscious awareness, the latter can still drive high-level brain and behavioral responses. We investigated whether unconscious processing of a masked natural image could facilitate subsequent conscious recognition of its degraded counterpart (a black-and-white “Mooney” image) presented many seconds later. We found that this is indeed the case, suggesting that conscious vision may be influenced by priors established by unconscious processing of a fleeting image.

Introduction

Visual perception is traditionally considered a passive, feed-forward process in which cortical areas relay information to those higher up in the hierarchy, with each step extracting progressively more abstract information from the stimulus. Thus, early visual areas relay elemental information such as contrast, edges, and shape to higher areas, which associate it with concepts such as faces, objects, places, and scenes. However, overwhelming evidence from psychophysics, neuroimaging, and neurophysiology now favors a framework that views vision as an active, inferential process, with top-down processes carrying a predictive model that is validated or updated by bottom-up processes (Mumford, 1992; Yuille and Kersten, 2006; Bar, 2007; Albright, 2012; Bastos et al., 2012). For instance, even in the primary visual cortex (V1) and middle temporal area (MT), the receptive field properties and tuning curves of neurons are subject to top-down influences established by task context or associative learning (Li et al., 2004; Schlack and Albright, 2007), suggesting that top-down influences can alter the information coded in lower-order sensory areas. These top-down influences may originate from higher-order visual areas or frontoparietal areas (Fahrenfort et al., 2012; Panagiotaropoulos et al., 2012; Gilbert and Li, 2013; Wang et al., 2013).

In Bayesian terms, perception is a probabilistic computation that combines the prior probability distribution and stimulus input information to construct the posterior probability distribution (Pouget et al., 2013). Thus, at different stages of stimulus processing, the predictive model carried by top-down influences may contain varying proportions of prior and stimulus information. The prior could be established by task contexts (Summerfield and de Lange, 2014), learned from experience (Tovee et al., 1996; Dolan et al., 1997), or sculpted by development (Berkes et al., 2011) and genes (Zhu et al., 2010).

An important unresolved question is whether priors in perception could be influenced by unconscious experiences. Thus far, the majority of work in this domain has concerned exclusively conscious vision. However, recent behavioral and neuroimaging studies revealed that unconsciously perceived stimuli can nevertheless trigger high-level processes such as response inhibition, task switching, conflict monitoring and error detection, and activate higher-order brain regions including the prefrontal cortex (van Gaal and Lamme, 2012). Inspired by this work, we investigated whether unconscious processing of a visual stimulus could alter perceptual priors and influence subsequent conscious visual perception.

Mooney images present an ideal paradigm for investigating this question. These are thresholded black-and-white images that are initially difficult to recognize (Fig. 1). However, once the subject is exposed to the original, nondegraded image, the corresponding Mooney image is usually recognized effortlessly, and this “disambiguation” effect is typically long lasting, persisting for days, months, or even a lifetime (Ludmer et al., 2011). This phenomenon demonstrates that a perceptual prior can be established in a remarkably fast and robust manner, exemplifying the power of synaptic plasticity.

Figure 1.

Task paradigm. (A and B) Trial structure for gray-scale (A) and Mooney (B) image presentation. For details see “Materials and Methods” section. (C) The structure of a run, which consisted of one block presented twice, followed by a verbal test section. Each block consisted of 10 trials: 2 gray-scale image trials, followed by 4 Mooney image trials in randomized order, followed by a repeat of the 4 Mooney image trials in randomized order. Two of the four Mooney images presented corresponded to the gray-scale images in the same block and were thus presented “post-disambiguation.” The remaining two Mooney images were presented “pre-disambiguation.”

We used backward-masked presentation of real-world photographs (“gray-scale images”) and their Mooney image counterparts to investigate whether a gray-scale image that fails to be consciously recognized could nevertheless influence subsequent conscious recognition of the corresponding Mooney image. To anticipate, we observed that this is indeed the case, suggesting that unconscious processing of a stimulus could leave a prior in the brain that guides subsequent conscious visual perception. In addition, our results pave the way for future investigation on the neural instantiation of such perceptual priors elicited by unconscious processing, and whether the perceptual priors sculpted by conscious and unconscious processing differ in their influences on behavior and their underlying neural code.

Materials and Methods

Participants

Twenty-five volunteers (age range: 22–35 years, mean age 26.5, 16 females) participated in the main experiment. Six additional volunteers participated in the initial screening of Mooney images for use in the main experiment. Three additional volunteers participated in a pilot experiment to determine the effective stimulus-onset-asynchrony (SOA) for backward-masked presentation of the gray-scale images. All participants were right-handed and neurologically healthy, with normal or corrected-to-normal vision. The experiment was approved by the Institutional Review Board of the National Institute of Neurological Disorders and Stroke. All subjects provided written informed consent.

Visual stimuli

Mooney and gray-scale images were generated from gray-scale photographs of real-world objects and animals selected from the Caltech (http://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html) and Pascal VOC (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html) databases. First, gray-scale images were constructed by cropping gray-scale photographs with a single inanimate object or animal subject in a naturalistic setting to 500 × 500 pixels and applying a box filter. Mooney images were subsequently generated by thresholding the gray-scale image. Threshold level and filter size were initially set at the median intensity of each image and 10 × 10 pixels, respectively. Each parameter was then titrated so that the Mooney image was difficult to recognize without first seeing the corresponding gray-scale image.
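
As an illustration of this degradation procedure, the following is a minimal Python sketch using NumPy and Pillow (the paper does not state which software was used); the helper name make_mooney and its defaults are assumptions, with the 500 × 500 crop, 10 × 10 box filter, and median-intensity threshold taken from the text.

```python
import numpy as np
from PIL import Image, ImageFilter

def make_mooney(photo_path, out_size=500, box_size=10, threshold=None):
    """Create a gray-scale image and its two-tone Mooney counterpart.

    Defaults follow the Methods: 500 x 500 crop, 10 x 10 box filter, and a
    threshold at the median image intensity (both later titrated per image).
    """
    img = Image.open(photo_path).convert("L")               # gray-scale conversion
    img = img.resize((out_size, out_size))                  # stand-in for manual cropping
    gray = img.filter(ImageFilter.BoxBlur(box_size // 2))   # ~10 x 10 box filter

    arr = np.asarray(gray, dtype=np.uint8)
    if threshold is None:
        threshold = np.median(arr)                          # initial threshold = median intensity
    mooney = np.where(arr > threshold, 255, 0).astype(np.uint8)
    return gray, Image.fromarray(mooney)

# Hypothetical usage:
# gray, mooney = make_mooney("dog_photo.jpg")
```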

Of an original set of 252 images, 44 (half depicting inanimate objects and half animals, unbeknownst to the subjects) were chosen for the main experiment via an initial screening procedure. Six participants, recruited separately from the main experiment, were presented with each Mooney image, its matching gray-scale image, and then the Mooney image again, each for 2 s. After each Mooney image presentation, participants rated the difficulty of recognizing the item depicted in the image on a five-point scale. For each participant, images were ranked by the difference in difficulty scores between the post- and pre-disambiguation presentations. Numerical rankings for each image were then averaged across the six participants, and the top-ranked images were selected for the main experiment.
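
A minimal sketch of this ranking step in Python with pandas (the paper does not specify the software used); the data frame, its column names, and the ratings shown are illustrative assumptions, not the actual screening data.

```python
import pandas as pd

# Hypothetical screening ratings: one row per participant x image, with
# five-point difficulty ratings before and after seeing the gray-scale image.
ratings = pd.DataFrame({
    "participant":     [1, 1, 1, 2, 2, 2],
    "image":           ["cat", "dog", "car", "cat", "dog", "car"],
    "pre_difficulty":  [5, 4, 3, 5, 3, 4],
    "post_difficulty": [1, 2, 3, 1, 3, 2],
})

# A larger drop in difficulty after disambiguation marks a better candidate image.
ratings["drop"] = ratings["pre_difficulty"] - ratings["post_difficulty"]
ratings["rank"] = ratings.groupby("participant")["drop"].rank(ascending=False)

# Average the per-participant ranks and keep the top-ranked images (44 in the study).
mean_rank = ratings.groupby("image")["rank"].mean().sort_values()
selected = mean_rank.index[:44]
```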

Images were presented on a ViewSonic V3D245 monitor at a 1920 × 1080 resolution and 120 Hz refresh rate. All images subtended 7.6 × 7.6 degrees of visual angle.

Task paradigm for the main experiment

Each trial began with a 1-s fixation period during which subjects were instructed to fixate their gaze on a red dot presented in the center of the screen. For Mooney image trials, the fixation period was followed by a 2-s image presentation period. For gray-scale image trials, the fixation period was followed by 17-ms image presentation, a 50-ms blank screen, and 1933-ms mask presentation, where the mask consisted of phase-shuffled noise created from the gray-scale image presented in the same trial. The red dot was present throughout this time period and subjects were instructed to keep their gaze fixated on it (Fig. 1A and B). Thereafter, participants were tested for subjective recognition of the image by a text prompt asking, “Can you recognize and name the object in the image?” The answer choices, “Yes” and “No,” were presented on each side of the screen below the question prompt. Positions for the answer choices were randomized across image presentations to avoid innate response bias. Participants responded with a two-button response box using the index and middle finger of their right hand, with each button corresponding to different sides of the screen. Each trial ended with a 1.5 s or 2.5 s jittered blank screen.
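
For concreteness, the two trial timelines described above can be summarized as plain data; this is a sketch only, since the stimulus-presentation software is not specified in the text and the structures below are purely illustrative.

```python
# Trial timelines in milliseconds, as described in the text.
MOONEY_TRIAL = [
    ("fixation", 1000),
    ("mooney_image", 2000),
    ("recognition_prompt", None),   # remains until the button-box response
    ("blank", (1500, 2500)),        # jittered inter-trial blank
]

GRAYSCALE_TRIAL = [
    ("fixation", 1000),
    ("gray_image", 17),
    ("blank", 50),
    ("mask", 1933),                 # phase-shuffled noise from the same gray-scale image
    ("recognition_prompt", None),
    ("blank", (1500, 2500)),
]
```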

Trials were organized into blocks, using a structure similar to a previous study (Gorlin et al., 2012). Each block consisted of 10 trials: 2 different gray-scale images, followed by 4 Mooney images, followed by a repeat of the same 4 Mooney images (Fig. 1C). Two of the Mooney images corresponded to the preceding gray-scale images and the other two did not. The presentation order of the four Mooney images within each repeat was randomized. The block was then repeated with an identical gray-scale image sequence and a reshuffled Mooney image sequence (“block 2” in Fig. 1C), followed by a verbal test section; together, the two blocks and the verbal test constituted one experimental run. Of the four different Mooney images presented in each run, the two corresponding to the gray-scale images were presented “post-disambiguation.” The two that did not correspond to the gray-scale images were presented “pre-disambiguation,” as their corresponding gray-scale images would be presented during the next run. Each participant completed 21 runs. The total duration of the experiment was about 1 h.
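
A minimal Python sketch of how such a run could be assembled (illustrative only; the function build_run and the image labels are assumptions, not the authors' presentation code):

```python
import random

def build_run(gray_pair, mooney_post, mooney_pre):
    """Assemble one run: two blocks of 10 trials each, followed by a verbal test.

    gray_pair   : the 2 gray-scale images opening each block (positions G1, G2)
    mooney_post : the 2 Mooney images matching those gray-scale images
    mooney_pre  : the 2 Mooney images whose gray-scale versions come in the next run
    """
    mooneys = list(mooney_post) + list(mooney_pre)    # 4 distinct Mooney images
    run = []
    for _ in range(2):                                # block 1 and its repeat (block 2)
        block = [("gray", g) for g in gray_pair]
        block += [("mooney", m) for m in random.sample(mooneys, 4)]  # first pass
        block += [("mooney", m) for m in random.sample(mooneys, 4)]  # reshuffled repeat
        run.append(block)
    return run

# Hypothetical usage:
# run = build_run(["gray_A", "gray_B"],
#                 ["mooney_A", "mooney_B"],   # post-disambiguation in this run
#                 ["mooney_C", "mooney_D"])   # pre-disambiguation in this run
```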

The verbal test was included to verify that subjects’ recognition of the Mooney images reflected the correct interpretation. It consisted of presenting each of the four different Mooney images from the preceding run for 2 s on the screen. Following each image presentation, participants were asked to verbally report what they saw in the Mooney image; they were allowed to answer that they did not recognize the image. Verbal responses were scored as correct or incorrect using a predetermined list of acceptable responses for each image (Fig. 2). As a result, each Mooney image was verbally tested once before disambiguation and once after disambiguation.

Figure 2.

Sample image sets and their acceptable responses for the verbal identification test.

Of the 44 image sets used in the experiment, only 40 had Mooney images presented both pre- and post-disambiguation, because the first and last runs of the experiment contained Mooney images that were never disambiguated or that were only presented post-disambiguation, respectively. Out of these 40 image sets, a random subset of 10 was selected for each participant as catch image sets. For catch image sets, the disambiguating gray-scale image was replaced by a gray-scale image that was not associated with any Mooney image used in the experiment.

Pilot study for determining the SOA used in backward masking

The 67-ms SOA between gray-scale image and mask onsets in the main experiment (Fig. 1A) was determined based on a pilot study using 96 gray-scale images (including the 44 used in the main experiment). Trial structure was the same as in Fig. 1A, except that the image and mask durations were fixed at 17 and 2000 ms, respectively, and the blank duration took values of 33, 50, 83, 150, or 283 ms (corresponding to SOAs of 50, 67, 100, 167, and 300 ms). Trials were organized into five blocks of 96 trials each, with the same set of 96 gray-scale images presented in randomized order within each block. Across the five blocks, each image was presented five times, once at each SOA; thus, every block contained trials with varying SOAs.
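
One way to realize this design is sketched below in Python (illustrative assumptions only; the image labels and variable names are hypothetical):

```python
import random

SOAS_MS = [50, 67, 100, 167, 300]                 # SOAs tested in the pilot
images = [f"img_{i:03d}" for i in range(96)]      # hypothetical image labels

# Give each image its own random ordering of the five SOAs across the five blocks,
# so that each image is seen once per block and once at each SOA overall.
soa_schedule = {img: random.sample(SOAS_MS, len(SOAS_MS)) for img in images}

blocks = []
for b in range(5):
    trials = [(img, soa_schedule[img][b]) for img in images]
    random.shuffle(trials)                        # randomized order within a block
    blocks.append(trials)
```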

Eye-tracking

As a control measure, 13 participants’ gaze position and pupil size were recorded at 1000 Hz using the SR Research EyeLink 1000 Plus system. Gaze and pupil size were determined by tracking the pupil and corneal reflection of each participant’s dominant eye. Head position was stabilized using a head-post with chin and forehead rests. Pupil tracking was set to centroid mode.

Behavioral data analyses

For each image set, the gray-scale image was presented twice, and the Mooney image was presented four times before disambiguation, and four times following disambiguation (Fig. 1C). Subjects responded to the subjective recognition prompt with a key press following each image presentation; for each Mooney image, they answered the verbal test once pre-disambiguation and once post-disambiguation. Disambiguation of Mooney images was assessed in two ways – “subjective recognition” and “(verbal) identification.” Subjective recognition rate was determined as the fraction of image presentations in which the subject responded “Yes” to the subjective recognition prompt, separately for the pre- and post-disambiguation period. Mooney images were considered identified if the subject replied with a correct answer during the verbal test. Mooney images already identified correctly during the verbal test in the pre-disambiguation period were excluded from all analyses.
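
As an illustration, the subjective recognition rate could be computed from trial-level data as follows (a Python/pandas sketch with hypothetical data and column names; the actual analysis software is not specified in the text):

```python
import pandas as pd

# Hypothetical per-presentation records for one subject; in the real data each
# Mooney image has 4 pre- and 4 post-disambiguation presentations.
trials = pd.DataFrame({
    "subject":   1,
    "image_set": ["A"] * 8 + ["B"] * 8,
    "phase":     (["pre"] * 4 + ["post"] * 4) * 2,
    "said_yes":  [0, 0, 0, 1, 1, 1, 1, 1,  0, 0, 0, 0, 0, 1, 0, 1],
})

# Image sets already verbally identified pre-disambiguation are excluded.
identified_pre = {"C"}                                  # hypothetical set
trials = trials[~trials["image_set"].isin(identified_pre)]

# Subjective recognition rate: fraction of "Yes" responses, per phase.
recog_rate = (trials
              .groupby(["subject", "image_set", "phase"])["said_yes"]
              .mean()
              .unstack("phase"))
```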

Since we were interested in assessing whether nonrecognized gray-scale images could nevertheless facilitate recognition of the corresponding Mooney images, to be conservative, gray-scale images were considered recognized if participants responded “Yes” to the subjective recognition prompt in at least one of two presentations. A three-way repeated-measures ANOVA was applied to subjective recognition responses to the Mooney images with independent factors: (i) pre- versus post-disambiguation period, (ii) whether or not participants recognized the gray-scale image in the same image set, and (iii) regular versus catch image set (i.e. whether the gray-scale image presented matched the Mooney image). Post-hoc paired t-tests between pre- and post-disambiguation periods were also carried out (Fig. 4A).
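
A hedged sketch of such an analysis using statsmodels’ repeated-measures ANOVA and SciPy’s paired t-test (the paper does not state which statistics package was used; the data-frame layout is an assumption):

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM
from scipy.stats import ttest_rel

def recognition_anova(df):
    """df: one row per subject x condition, with columns 'subject',
    'phase' (pre/post), 'gray_recognized' (yes/no), 'set_type' (regular/catch),
    and the dependent measure 'recog_rate'."""
    res = AnovaRM(df, depvar="recog_rate", subject="subject",
                  within=["phase", "gray_recognized", "set_type"],
                  aggregate_func="mean").fit()
    return res.anova_table

def posthoc_pre_vs_post(df, gray_recognized, set_type):
    """Paired t-test between pre- and post-disambiguation rates in one condition."""
    sub = df[(df["gray_recognized"] == gray_recognized) &
             (df["set_type"] == set_type)]
    wide = sub.pivot(index="subject", columns="phase", values="recog_rate")
    return ttest_rel(wide["post"], wide["pre"])
```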

A two-way ANOVA was applied to verbal test performance in the post-disambiguation period (since all identified Mooney images in the pre-disambiguation period were excluded from the analyses), with independent factors: (i) whether or not the gray-scale image in the same image set was subjectively recognized, and (ii) regular versus catch image sets. In addition, post-hoc one-sample t-tests were carried out for each condition (Fig. 4B).

Additional analyses were carried out to specifically compare regular image sets whose gray-scale images were not recognized with catch image sets, as described in “Results” section.

Eye-tracking data analyses

To ensure that the main behavioral effect we observed (namely that nonrecognized gray-scale images could nevertheless facilitate recognition of corresponding Mooney images) was not due to a difference in eye-movement pattern, we analyzed eye-tracking data. Only image sets where the gray-scale image was not recognized, and the Mooney image was not already identified in the pre-disambiguation period, were included in this analysis. In addition, among these image sets, a subject must have both identified and unidentified Mooney images in the post-disambiguation period to be included in the analysis, so that the statistical test was performed at the within-subject level (using repeated-measures ANOVAs or paired t-tests). Seven out of 13 subjects with eye-tracking were included in the final analyses. For these image sets, we asked whether the eye-movement pattern or pupil size (during both gray-scale and Mooney image presentations) was different between the Mooney images identified in the post-disambiguation period and those that remained unidentified.

For each image presentation (2 s for Mooney image, 17 ms for gray-scale image), two measures were used to assess eye-movement pattern: mean distance of gaze to the fixation dot, and area of eye movement. Distance to fixation was determined by taking the mean of the distance, in pixels, between gaze position and the red fixation dot across time for each image presentation. Area of eye movement was calculated by multiplying the standard deviation of gaze position in the x and y directions for each image presentation. In addition, we used pupil size measurement as a proxy for subjects’ attentional state (Eldar et al., 2013). All measures were averaged across image presentations for each subject and then subjected to random-effects analyses across subjects using paired t-tests or ANOVAs (Fig. 5).
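
A minimal sketch of the two eye-movement measures defined above (Python/NumPy; function and argument names are illustrative):

```python
import numpy as np

def gaze_metrics(gaze_x, gaze_y, fix_x, fix_y):
    """Summaries of one image presentation's gaze samples (in pixels).

    Returns the mean distance of gaze to the fixation dot and the 'area' of
    eye movement (product of the SDs of gaze position in x and y).
    """
    gaze_x = np.asarray(gaze_x, dtype=float)
    gaze_y = np.asarray(gaze_y, dtype=float)

    dist_to_fix = np.mean(np.hypot(gaze_x - fix_x, gaze_y - fix_y))
    movement_area = np.std(gaze_x) * np.std(gaze_y)
    return dist_to_fix, movement_area
```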

Control analyses for classical priming effect

First, we examined the temporal separation, in number of trials, between gray-scale images and their closest (i.e. the first presentation out of two) corresponding Mooney images in every block (Fig. 6). For example, if a gray-scale image was presented in the trial immediately before its corresponding Mooney image, the distance between them would be one trial. In this case, the temporal separation between the gray-scale and Mooney image is 5.92 s on average (50 ms ISI, 1933 ms mask, 939 ms mean response time, 2000 ms blank, 1000 ms fixation, see Fig. 1A), given that the mean response time across subjects was 939 ms (SD  = 493 ms). The temporal separation between a gray-scale and its matching Mooney image has a lower limit of 4.48 s, which assumes instantaneous response time and the lower bound of blank period (1.5 s). We calculated the distribution of the temporal separation between a gray-scale image and subsequent presentation of its matching Mooney image in several key conditions (Fig. 6). Catch image sets were excluded from this analysis.
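
The arithmetic behind these two numbers, written out explicitly (values taken from the text and Fig. 1A):

```python
# Components of the interval between a gray-scale image and an immediately
# following matching Mooney trial (all in milliseconds).
ISI, MASK, FIXATION = 50, 1933, 1000
MEAN_RT, MEAN_BLANK = 939, 2000        # mean response time and mean jittered blank
MIN_RT, MIN_BLANK = 0, 1500            # instantaneous response, shortest blank

mean_separation_s = (ISI + MASK + MEAN_RT + MEAN_BLANK + FIXATION) / 1000   # 5.92 s
min_separation_s = (ISI + MASK + MIN_RT + MIN_BLANK + FIXATION) / 1000      # ~4.48 s
```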

We further examined whether gray-scale images presented in the G1 or G2 position (Fig. 1A) were more effective in enhancing subjective recognition of their corresponding Mooney images. If priming effects were responsible for disambiguation, the position closer to the Mooney images (G2) would be expected to be more effective in disambiguating them. We separated post-disambiguation Mooney images according to whether their corresponding gray-scale images were presented in the G1 or G2 position, and calculated subjective recognition rates for each subject in each category (Fig. 7A). Results were subjected to a paired t-test.

Finally, subjective recognition rates for post-disambiguation Mooney images presented in the second block during each run (see Fig. 1C) are expected to be higher than those of the first block, due to the additional exposure to the priors (i.e. gray-scale images) at the beginning of each block. To test this, we compared subjective recognition rates of post-disambiguation Mooney images, grouped by their presentation block (Fig. 7B). Results were subjected to a paired t-test across subjects.

Results

A pilot experiment determined the threshold SOA for masking the gray-scale image to be 67 ms, at which the recognition rate of the gray-scale image was ∼50% (Fig. 3A). Thus, a 67-ms SOA was used in the main experiment (Fig. 1A). In the main experiment, there were roughly equal numbers of recognized and not recognized gray-scale images, consistent with the pilot study (Fig. 3B). Image sets for which subjects had already identified the Mooney image correctly (as assessed by the verbal test) in the pre-disambiguation period were removed from further analyses. The remaining image sets were split according to regular versus catch status (i.e. whether the gray-scale image matched the Mooney image) and whether their gray-scale images were recognized. The number of remaining image sets in each condition, averaged across subjects, is shown in Fig. 3C, which indicates that when a Mooney image was already identified in the pre-disambiguation phase (and thus removed from further analyses), its corresponding gray-scale image was more easily recognized.

Figure 3.

Masking of gray-scale images. (A) Mean recognition rate of masked gray-scale images under varying SOAs in the pilot experiment. An SOA of 67 ms was chosen for the main experiment as it yielded an ∼50% recognition rate and thus roughly equal numbers of recognized and not recognized gray-scale images. The solid line is the logistic fit, which had a value of 48% at a 67-ms SOA. (B) Number of regular or catch image sets with recognized versus not recognized gray-scale images. Recognition of a gray-scale image was defined as a “yes” response to the recognition prompt in at least 1 of 2 presentations. (C) Same as B, except that image sets with verbally identified Mooney images in the pre-disambiguation phase were excluded. The remaining image sets were used in further analyses. Error bars denote SEM across subjects.
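
The logistic fit mentioned in the caption could be computed as follows (a hedged sketch using SciPy; the recognition rates in the example are placeholders, not the measured pilot data, and the two-parameter logistic form is an assumption):

```python
import numpy as np
from scipy.optimize import curve_fit

soa_ms = np.array([50, 67, 100, 167, 300], dtype=float)
recog_rate = np.array([0.30, 0.48, 0.70, 0.88, 0.97])   # placeholder values

def logistic(x, x0, k):
    """Two-parameter logistic curve with midpoint x0 and slope k."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

(x0, k), _ = curve_fit(logistic, soa_ms, recog_rate, p0=[70.0, 0.05])
print(f"Fitted recognition rate at a 67-ms SOA: {logistic(67.0, x0, k):.2f}")
```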

We first investigated the influence of gray-scale image presentation on participants’ subjective recognition rate of the corresponding Mooney image. For each subject, we sorted all image sets according to the combination of two criteria: (i) regular versus catch; and (ii) whether the gray-scale image was recognized by the subject. We then computed the fraction of Mooney image presentations that were subjectively recognized (i.e. subjects answered “Yes” to the subjective recognition prompt) in the pre- and post-disambiguation phase separately (Fig. 4A). A three-way ANOVA with subjective recognition rate as the dependent measure revealed significant main effects of regular versus catch image sets (F(1,21) = 5.0, P = 0.03), recognition of the gray-scale image (F(1,21) = 38.5, P = 4.1 × 10⁻⁹), and pre- versus post-disambiguation (F(1,21) = 16.2, P = 8.5 × 10⁻⁵). The interactions between gray-scale image recognition and the other two factors were also significant (with regular versus catch: F(1,21) = 37.1, P = 7.5 × 10⁻⁹; with pre- versus post-disambiguation: F(1,21) = 4.7, P = 0.03). There was a trend for the interaction between regular versus catch and pre- versus post-disambiguation (F(1,21) = 3.4, P = 0.07). The interaction among all three factors did not reach significance (P = 0.22).

Mooney images disambiguation. (A) Subjective recognition rate of Mooney images, conditioned by whether the Mooney image was presented pre- or post-disambiguation, whether its corresponding gray-scale image was recognized, and whether it was in a regular or catch image set. (B) Mean verbal identification rate of Mooney images in the post-disambiguation phase, conditioned by whether the corresponding gray-scale image was recognized and whether it was in a regular or catch image set. All graphs show mean and SEM. across subjects.
Figure 4.

Mooney images disambiguation. (A) Subjective recognition rate of Mooney images, conditioned by whether the Mooney image was presented pre- or post-disambiguation, whether its corresponding gray-scale image was recognized, and whether it was in a regular or catch image set. (B) Mean verbal identification rate of Mooney images in the post-disambiguation phase, conditioned by whether the corresponding gray-scale image was recognized and whether it was in a regular or catch image set. All graphs show mean and SEM. across subjects.

To better understand the above results, we performed post-hoc paired t-tests between Mooney image subjective recognition rates in the pre- versus post-disambiguation period for each of the four conditions. Mooney image recognition rate increased dramatically after viewing a matching gray-scale image that was successfully recognized (P = 3.1 × 10⁻⁷). However, even nonrecognized gray-scale images significantly facilitated recognition of their matching Mooney images (P = 0.0001). Recognition of Mooney images improved marginally between pre- and post-disambiguation phase for catch image sets (gray-scale image recognized: P = 0.02; gray-scale image not recognized: P = 0.03; both are n.s. after correction for multiple comparisons), presumably due to repeated presentations of the same Mooney image.

We next investigated the influence of gray-scale image presentation on Mooney image recognition using the verbal identification test. Since all image sets with correct pre-disambiguation verbal identification were removed from analyses, the identification rate for all remaining image sets in the pre-disambiguation period was zero. Thus, we calculated the identification rate in the post-disambiguation period for each of the aforementioned four conditions (Fig. 4B). A two-way ANOVA with identification rate as the dependent measure revealed a significant effect of gray-scale image recognition (F(1,21) = 15.6, P = 1.7 × 10⁻⁴), regular versus catch image sets (F(1,21) = 18.8, P = 4.0 × 10⁻⁵) and, as expected, a significant interaction effect (F(1,21) = 13.8, P = 3.7 × 10⁻⁴). Post-hoc Wilcoxon signed-rank tests suggested that the identification rate in the post-disambiguation period was significantly different from zero for the regular image sets, regardless of whether the gray-scale image was recognized or not (P = 7.1 × 10⁻⁵ and P = 0.00024, respectively). In contrast, the identification rate was not significantly different from zero for catch image sets (P = 0.13, whether the gray-scale image was recognized or not; P > 0.06 when all catch image sets were included in the test).

A crucial test of our hypothesis is to directly compare the effect of nonrecognized gray-scale images on Mooney image identification with that of catch images, which would indicate whether the increase in Mooney image identification rate in the post-disambiguation phase is above and beyond that expected from repeated presentations of the same image. To this end, we compared the fraction of identified Mooney images in the post-disambiguation phase for regular image sets whose gray-scale image was not recognized (second bar in Fig. 4B) with that for catch image sets (i.e. the third and fourth bars in Fig. 4B combined). There was a significant difference between these two conditions (P = 0.0492, t = 2.076, n = 24, after excluding one subject more than 3 SD from the group mean), suggesting that nonrecognized gray-scale images indeed facilitated Mooney image identification.

We further conducted a similar analysis on subjective recognition, by comparing the pre- and post-disambiguation subjective recognition rates for regular image sets with nonrecognized gray-scale images (second pair of bars in Fig. 4A) and catch image sets (third and fourth pairs of bars in Fig. 4A combined) using a two-way ANOVA. If nonrecognized gray-scale images enhanced subjective recognition rates of their matching Mooney images above and beyond that expected from repeated presentations of the Mooney images, we should expect a significant interaction effect. However, the interaction effect was not significant (P > 0.5). Together with the previous analysis, these results suggest an interesting pattern: even though subjects did not report a higher subjective recognition rate for Mooney images after presentation of nonrecognized gray-scale images, they did become better at verbally identifying them.

The above results suggest that brief, masked presentation of a gray-scale image can facilitate subsequent identification of its matching Mooney image, even when the gray-scale image was not consciously recognized. To ensure that this effect was not due to variation in subjects’ gaze behavior, we analyzed eye-tracking data for regular image sets whose gray-scale images were not recognized (corresponding to the “Gray Not Recognized” condition in Fig. 4). Gaze behavior was assessed using three parameters: pupil size, distance to fixation, and eye movement area. The image sets were sorted by whether their Mooney images were correctly identified in the post-disambiguation phase, and each gaze parameter was plotted separately for the pre- and post-disambiguation phase (Fig. 5). Two-way ANOVAs indicated no significant effect of pre- versus post-disambiguation, no effect of identification status in the post-disambiguation phase, and no interaction (all P > 0.4). We further conducted a similar analysis on gaze behavior during gray-scale image presentation, and again there was no significant difference between image sets with later-identified Mooney images and those whose Mooney images remained unidentified (paired t-tests, all P > 0.07).

Figure 5.

Control analysis on eye-tracking data. Only image sets whose Mooney image was not verbally identified in the pre-disambiguation phase, and whose gray-scale image was not recognized, were used in this analysis. Pupil size (A), mean distance to fixation (B), and eye movement area (C) were averaged across image presentations for Mooney images in the pre- versus post-disambiguation phase, conditioned on whether it was correctly identified post-disambiguation. All graphs show mean and SEM across subjects.

Further control analyses demonstrate that our finding is unlikely to be explained by a classical priming effect. Even if a gray-scale image is immediately followed by its matching Mooney image (and the subject takes no time to respond), their temporal separation is longer than 4.5 s. This is much longer than the classical priming effect, which typically lasts no more than hundreds of milliseconds (Kouider and Dehaene, 2007). In addition, a gray-scale image and its matching Mooney image are typically separated by multiple trials (see Fig. 6, which only includes the first Mooney image presentation in each block), with the average trial length being 5.92 s. Moreover, the image sets corresponding to our main behavioral finding (Mooney images not identified in the pre-disambiguation period, but identified post-disambiguation, despite a nonrecognized gray-scale image) did not skew toward shorter temporal separations between gray-scale and Mooney images (Fig. 6C); such a skew would be expected if priming were the mechanism behind our finding. Furthermore, Mooney images whose gray-scale images were presented in the G2 position (Fig. 1C) did not have higher post-disambiguation subjective recognition rates than those whose gray-scale images were presented in the G1 position (P = 0.27, Fig. 7A), indicating that presenting gray-scale images temporally closer to the Mooney images did not enhance their disambiguation effect. Finally, the block position of the Mooney image presentation did affect post-disambiguation subjective recognition rates (P = 0.002), as expected given that block 2 (the “repeat series” in Fig. 1C) provided an extra exposure to the gray-scale image and that the same Mooney image had been presented two more times (Fig. 7B).

Figure 6.

Control analysis on temporal distance between gray-scale and corresponding Mooney images. Only regular image sets whose Mooney image was not verbally identified in the pre-disambiguation phase were used in this analysis. (A) Distance between recognized gray-scale images and subsequent corresponding Mooney presentation. (B) Distance between nonrecognized gray-scale images and subsequent corresponding Mooney presentation. (C) Distance between nonrecognized gray-scale images and subsequent corresponding Mooney presentation that were verbally identified post-disambiguation (i.e. the set of images from the second bar of Fig. 4B, which is a subset of images from panel B). Catch image sets are excluded from this analysis.

Figure 7.

Control analysis on the effects of gray-scale image position and Mooney image block position on post-disambiguation Mooney image subjective recognition rates. Only regular image sets whose Mooney image was not verbally identified in the pre-disambiguation phase were included in this analysis. (A) Post-disambiguation subjective recognition of Mooney images corresponding to gray-scale images presented in the G1 or G2 position (the first or second gray-scale image in a block, respectively). (B) Post-disambiguation subjective recognition of Mooney images in the first block versus second block in a run. Thin gray lines are individual-subject data; black dashed lines show group average. P-values are obtained using paired t-tests across subjects.

Discussion

Our results first replicated the classic Mooney image disambiguation phenomenon: degraded two-tone images that were initially unrecognizable became effortlessly recognizable after subjects saw the corresponding original gray-scale image. Surprisingly, we also observed that even when the gray-scale image was not consciously recognized, it still facilitated recognition of the subsequently presented matching Mooney image. This finding indicates that a fleeting, backward-masked image that fails to reach conscious recognition can nevertheless leave a prior in the brain that guides and shapes subsequent visual perception.

Recognition rate of Mooney images may increase over repeated presentations. This potential confound was controlled for by the use of catch image sets in our experiment, in which the gray-scale image presented did not match the Mooney image. The inclusion of catch image sets confirmed that veridical recognition of Mooney images following nonrecognized matching gray-scale images, as assessed by the verbal identification rate, was above and beyond that expected from repeated presentations of the same Mooney image (P < 0.05, paired t-test of post-disambiguation verbal identification rate for “gray not recognized” against catch). On the other hand, the increase in subjective recognition rate following nonrecognized gray-scale images did not significantly exceed that expected from repeated presentations of the same Mooney image [P > 0.5, interaction effect of (pre- versus post-disambiguation) × (“gray not recognized” versus catch)]. This is an interesting dissociation, suggesting that subjects become better at verbally identifying Mooney images after viewing a nonrecognized gray-scale image, even though their subjective recognition rate does not improve significantly (as compared with catch image sets). This observation underlines the importance of verifying subjects’ self-report.

Another potential confound is that for unconsciously disambiguated Mooney images (whose corresponding gray-scale images were not recognized), subjects may have paid more attention while viewing those images, or broken fixation more often to explore the image. Our control analysis on eye-tracking data suggested that this was not the case. For regular image sets whose gray-scale images failed to be recognized, there was no difference in pupil size or fixation behavior during any stage of the experiment between those with later identified Mooney images and those whose Mooney images remained unidentified.

An additional control analysis on the temporal spacing between gray-scale images and their corresponding Mooney images presented thereafter suggests that the long timescales over which the disambiguation effect occurs preclude classical priming as an explanation for our finding. The minimum time between a gray-scale image and its corresponding Mooney image presented thereafter was 4.5 s, well beyond the timescale of classical priming effects. The distribution of this time interval across trials (even when considering only the first Mooney presentation in each block) ranged between 5 and 29 s (Fig. 6), and the distribution for the crucial condition, unconscious disambiguation, was not skewed toward shorter time intervals (Fig. 6C). In addition, gray-scale images presented closer to the post-disambiguation Mooney images did not elicit a stronger disambiguation effect (Fig. 7A). All of these findings converge to suggest that classical priming is unlikely to be responsible for our finding. Rather, the unconscious disambiguation effect we observed exhibits a very long timescale, similar to conscious disambiguation.

One potential caveat is the use of subjective recognition of gray-scale images as a proxy for classifying whether these images were consciously recognized. It is important to point out that even when subjects answer “not recognized,” there may be residual awareness of low-level stimulus features (Dienes and Seth, 2010). In other words, it is possible that subjects were aware of a stimulus being presented. However, our interest was in whether subjects consciously recognized the “content” of the image, thus, the subjective recognition question was an appropriate measure for this purpose. Our results show that even when subjects were not consciously aware of the “content” of the gray-scale image, it nevertheless facilitated identification of its corresponding Mooney image thereafter.

In summary, we have demonstrated that a fleeting, masked visual image that was not consciously recognized can nevertheless leave a prior in the brain that guides future perception and recognition of a different but related image. This effect is likely carried by top-down influences rather than priming of low-level regions, for several reasons. First, the low-level visual features of the gray-scale image and its corresponding Mooney image are very different, similar to a previous finding using ambiguous images (Owen, 1985). Second, the gray-scale image and its matching Mooney image were presented with a minimum separation of ∼5 s and at least one question prompt between them, whereas priming effects typically last no more than hundreds of milliseconds (Kouider and Dehaene, 2007). Third, since the presentation order of gray-scale and Mooney images was randomized within each block (Fig. 1C), a gray-scale image and its matching Mooney image were typically separated by other, unrelated images (Fig. 6); in contrast, priming effects are vulnerable to interfering stimuli (Kouider and Dehaene, 2007). Future neuroimaging work should elucidate the neural underpinnings of such unconsciously established perceptual priors, and further illuminate their similarities with and differences from consciously established priors. Last but not least, these results push the boundary of our knowledge on the depth and scope of unconscious processing in the brain.

Acknowledgements

This research was supported by the Intramural Research Program of the National Institutes of Health/National Institute of Neurological Disorders and Stroke. B.J.H. acknowledges support from the Leon Levy Foundation. Data are available on request.

Conflict of interest statement. None declared.

References

Albright TD. On the perception of probable things: neural substrates of associative memory, imagery, and perception. Neuron 2012;74:227–45.

Bar M. The proactive brain: using analogies and associations to generate predictions. Trends Cogn Sci 2007;11:280–89.

Bastos AM, Usrey WM, Adams RA, et al. Canonical microcircuits for predictive coding. Neuron 2012;76:695–711.

Berkes P, Orban G, Lengyel M, et al. Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science 2011;331:83–87.

Dienes Z, Seth AK. Measuring any conscious content versus measuring the relevant conscious content: comment on Sandberg. Conscious Cogn 2010;19:1079–80, discussion 1081–83.

Dolan RJ, Fink GR, Rolls E, et al. How the brain learns to see objects and faces in an impoverished context. Nature 1997;389:596–99.

Eldar E, Cohen JD, Niv Y. The effects of neural gain on attention and learning. Nat Neurosci 2013;16:1146–53.

Fahrenfort JJ, Snijders TM, Heinen K, et al. Neuronal integration in visual cortex elevates face category tuning to conscious face perception. Proc Natl Acad Sci USA 2012;109:21504–9.

Gilbert CD, Li W. Top-down influences on visual processing. Nat Rev Neurosci 2013;14:350–63.

Gorlin S, Meng M, Sharma J, et al. Imaging prior information in the brain. Proc Natl Acad Sci USA 2012;109:7935–40.

Kouider S, Dehaene S. Levels of processing during non-conscious perception: a critical review of visual masking. Philos Trans R Soc Lond B Biol Sci 2007;362:857–75.

Li W, Piech V, Gilbert CD. Perceptual learning and top-down influences in primary visual cortex. Nat Neurosci 2004;7:651–57.

Ludmer R, Dudai Y, Rubin N. Uncovering camouflage: amygdala activation predicts long-term memory of induced perceptual insight. Neuron 2011;69:1002–14.

Mumford D. On the computational architecture of the neocortex. II. The role of cortico-cortical loops. Biol Cybern 1992;66:241–51.

Owen LA. The effect of masked pictures on the interpretation of ambiguous pictures. Curr Psychol Res Rev 1985;4:108–18.

Panagiotaropoulos TI, Deco G, Kapoor V, et al. Neuronal discharges and gamma oscillations explicitly reflect visual consciousness in the lateral prefrontal cortex. Neuron 2012;74:924–35.

Pouget A, Beck JM, Ma WJ, et al. Probabilistic brains: knowns and unknowns. Nat Neurosci 2013;16:1170–78.

Schlack A, Albright TD. Remembering visual motion: neural correlates of associative plasticity and motion recall in cortical area MT. Neuron 2007;53:881–90.

Summerfield C, de Lange FP. Expectation in perceptual decision making: neural and computational mechanisms. Nat Rev Neurosci 2014;15:745–56.

Tovee MJ, Rolls ET, Ramachandran VS. Rapid visual learning in neurones of the primate temporal visual cortex. Neuroreport 1996;7:2757–60.

van Gaal S, Lamme VA. Unconscious high-level information processing: implication for neurobiological theories of consciousness. Neuroscientist 2012;18:287–301.

Wang M, Arteaga D, He BJ. Brain mechanisms for simple perception and bistable perception. Proc Natl Acad Sci USA 2013;110:E3340–49.

Yuille A, Kersten D. Vision as Bayesian inference: analysis by synthesis? Trends Cogn Sci 2006;10:301–8.

Zhu Q, Song Y, Hu S, et al. Heritability of the specific cognitive ability of face perception. Curr Biol 2010;20:137–42.