Bin Zhou, Guo Feng, Wei Chen, Wen Zhou, Olfaction Warps Visual Time Perception, Cerebral Cortex, Volume 28, Issue 5, May 2018, Pages 1718–1728, https://doi.org/10.1093/cercor/bhx068
Abstract
Our perception of the world builds upon dynamic inputs from multiple senses with different temporal resolutions, and is threaded with the passing of subjective time. How time is extracted from multisensory inputs is scarcely understood. Utilizing psychophysical testing and electroencephalography, we show in healthy human adults that odors modulate object visibility around critical flicker-fusion frequency (CFF)—the limit at which chromatic flickers become perceived as a stable color—and effectively alter CFF in a congruency-based manner, even though they afford no clear environmental temporal information. The behavioral gain produced by a congruent relative to an incongruent odor is accompanied by elevated neural oscillatory power around the object's flicker frequency in the right temporal region ~150–300 ms after object onset, and is not mediated by visual awareness. In parallel, odors bias the subjective duration of visual objects without affecting one's temporal sensitivity. These findings point to a neuronal network in the right temporal cortex that executes flexible temporal filtering of upstream visual inputs based on olfactory information. Moreover, they collectively indicate that the very process of sensory integration at the stage of object processing twists time perception, hence offering new insights into the neural timing of multisensory events.
As a fundamental physical quantity, time continuously anchors events from past through present to future. This continuity is not fully embodied in our perception, which is constructed from discrete processing epochs (snapshots) of the dynamic environmental inputs (VanRullen and Koch 2003; Fries 2009) that have different temporal precisions (Buhusi and Meck 2005). In the case of vision, word perception takes over 100 ms whereas chromatic flicker can be resolved at ~20 Hz and luminance flicker beyond 50 Hz (Holcombe 2009). Audition, prominent in timing, operates on the scale of milliseconds to tens of milliseconds (Phillips 1999). Olfactory sampling, by contrast, is inherently restricted by the respiratory cycle that typically lasts several seconds. As such, time as we experience it on different timescales has been framed as hierarchically connected subjective phenomena, including temporal order and duration, that are extracted from the dynamics of distributed neural networks implemented by synaptic and cellular properties as well as oscillations (Buonomano and Merzenich 1995; Pöppel 1997; Mauk and Buonomano 2004; Buhusi and Meck 2005). Yet how multimodal signals entering various brain regions are coordinated with each other remains poorly charted in the temporal domain.
Psychophysical studies using both visual and auditory stimuli generally note a superior temporal acuity of audition over vision, to the extent that auditory stimuli dominate the percept of visual temporal rate but not vice versa (Shipley 1964; Recanzone 2009). Intermodal timing is found to be significantly worse than timing within modalities, with elevated interval discrimination thresholds (Rousseau et al. 1983; Grondin and Rousseau 1991; Westheimer 1999). On the surface, these findings suggest that multimodal processing imposes a cost on temporal resolution, which seems at odds with the consensus that sensory integration shortens response latency (Gottfried and Dolan 2003; Rowland et al. 2007; Stein and Stanford 2008). As response latency reflects temporal accumulation of discrete sensory evidence needed to reach a perceptual decision (VanRullen and Koch 2003; Huk and Shadlen 2005; Wong and Wang 2006), reduced temporal resolution presumably should lead to an increased duration needed to accumulate sufficient sensory evidence, and hence a delayed rather than speeded response.
In an effort to address this discrepancy and to probe the more general issue of whether sensory integration influences perceptual sampling—and thereby the building blocks of our perceptual experience—we set out to examine the effect of olfaction on visual temporal perception. Unlike audition, olfaction is typically considered vague and fuzzy (Cain 1979) and affords no clear environmental temporal information (Gire et al. 2013), yet it readily integrates with vision and modulates visual object processing (Zhou et al. 2010, 2012; Chen et al. 2013; Robinson et al. 2015). This provides a means to delineate the specific role of multisensory integration in time perception—any influence of olfaction on visual temporal processing would be attributable to olfactory-visual integration per se rather than to added temporal information from the olfactory channel. In a series of experiments, we combined psychophysical testing and electroencephalography (EEG) to assess how an odor acts upon the visual temporal sampling and apparent duration of an object.
Materials and Methods
Participants
A total of 138 healthy nonsmokers (62 males) participated in the study, 16 in Experiment 1 (mean age ± SD = 22.7 ± 3.3 y), 32 in Experiment 2 (23.6 ± 2.5 y), 18 in Experiment 3 (23.3 ± 2.8 y), 24 in Experiment 4 (23.9 ± 2.3 y), and 48 in Experiment 5 (22.6 ± 2.2 y). All participants reported having normal or corrected-to-normal vision, a normal sense of smell, and no respiratory allergy or upper respiratory infection at the time of testing. They gave informed consent to participate in procedures approved by the Institutional Review Board of the Institute of Psychology, Chinese Academy of Sciences.
Olfactory Stimuli
The olfactory stimuli were presented in identical 280 ml glass bottles. They consisted of a banana-like odor (amyl acetate, 0.02% v/v in propylene glycol) and an apple-like odor (apple flavor, Givaudan) in Experiments 1, 3, and 4, as well as 2 bottles of purified water for half of the participants in Experiments 2 and 5, which served as control experiments for Experiments 1 and 4, respectively. Each bottle contained 10 ml of clear liquid and was connected to 2 Teflon nosepieces via a Y-structure. Participants were instructed to position the nosepieces inside their nostrils and to continuously inhale through the nose and exhale through the mouth throughout the experiments. In Experiments 1, 3, and 4, each participant was first presented with 3 concentrations of the apple-like odor (0.01%, 0.02%, and 0.03% v/v in propylene glycol) and asked to pick the one that best matched the banana-like odor in terms of both intensity and pleasantness. The chosen concentration was subsequently used in the actual testing.
Visual Stimuli
The visual stimuli were presented on a gamma-corrected 21″ CRT monitor (DELL Trinitron P1130). In Experiments 1–3, isoluminant 2-color apple and banana drawings were employed. Specifically, 2 opposite images were produced for each line drawing, namely, red apple/banana (4.7° × 5.2°) on green background (8.3° × 8.3°) and green apple/banana on red background (Fig. 1A). The edges were Gaussian blurred with a radius of 5 pixels (FWHM ≈ 0.5°) to facilitate flicker-fusion. Uniform red and green fields (8.3° × 8.3°, Fig. 1A) served as comparisons (Experiments 1 and 2) or controls (Experiment 3). Red/green isoluminance was individually determined through heterochromatic flicker photometry (Ives 1912). Two static colored images of an apple (4.2° × 4.4°) and bananas (3.8° × 5.0°) (Fig. 4A), respectively, were used in Experiments 4 and 5. For half of the participants in Experiments 2 and 5, a semantic label of either “apple odor” or “banana odor” (in Chinese, 2.3° × 2.3°) was constantly presented at fixation throughout each block.
Behavioral Procedures and Analyses
Odor Judgment
The participants in Experiment 1 and half of those in Experiments 2 and 5 sampled each olfactory stimulus and rated on a 100-unit visual analog scale its intensity and pleasantness, with a 1-min break between the samplings. Those in Experiments 2 and 5 were told that the 2 bottles respectively contained a low concentration of apple odor and banana odor, and rated on a 100-unit visual analog scale each bottle's similarities to the odors of apple and banana, in addition to intensity and pleasantness. The ratings were then compared between the olfactory stimuli in each experiment with paired-sample t-tests.
2AFC Object Detection Around CFF
Experiments 1 and 2 adopted a 2-alternative forced-choice (2AFC) object detection task. As shown in Figure 1A, each trial of the task began with a fixation on a central cross (0.5° × 0.5°) for 800–1200 ms, and comprised 2 stimulus intervals of 400 ms each and a 600 ms blank screen in between. During one of the stimulus intervals, the opposite images of apple (50% of the trials) or banana (50% of the trials) alternated at a frequency of 15, 20, 22.5, or 25 Hz (see below). During the other stimulus interval, uniform red and green fields alternated at the same frequency. Each interval was preceded and followed by a red–green noise pattern (8.3° × 8.3°) that lasted 200 ms. The participants were asked to press one of two buttons to indicate whether the first or the second stimulus interval contained an object. The next trial began immediately after a response was made.
There were 40 trials per block. The images flickered at 15 Hz in 8 trials and at a fixed frequency of 20, 22.5, or 25 Hz in the other 32 trials in random order. We adjusted the refresh rate of the CRT monitor (90, 120, or 150 Hz) in between blocks, as a single refresh rate could not generate all of the above flicker frequencies. Each participant completed a total of 12 blocks and 480 trials, including 6 blocks per olfactory condition (apple-like odor vs. banana-like odor) in Experiment 1, and 6 blocks per semantic condition (water suggested as containing an apple vs. a banana odor for half of the participants; a constant visual semantic label of “apple odor” vs. “banana odor” for the other half) in Experiment 2.
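The need for multiple refresh rates follows from a simple constraint: on a CRT, each color phase (half a flicker cycle) must occupy a whole number of video frames, so a flicker of f Hz is only realizable when refresh rate/(2f) is an integer. A minimal check of this constraint (illustrative Python, not the authors' stimulus code, assuming the half-cycle-per-integer-frames rule):

```python
# Each color phase of an f Hz alternation lasts 1/(2f) s, i.e., refresh/(2f)
# frames; on a CRT this must be a whole number of frames.
for refresh in (90, 120, 150):
    supported = [f for f in (15, 20, 22.5, 25)
                 if (refresh / (2 * f)).is_integer()]
    print(f"{refresh} Hz refresh supports flicker at {supported} Hz")
# 90 Hz -> [15, 22.5]; 120 Hz -> [15, 20]; 150 Hz -> [15, 25]
```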
The theoretical threshold accuracy in a 2AFC detection task is 75% (see Results). For each participant, flicker frequencies with overall perithreshold accuracies (75 ± 15%) across image contents (apple vs. banana) and olfactory/semantic conditions were pooled together. The resulting accuracies around CFF were analyzed with a repeated measures ANOVA using image content and olfactory/semantic condition as the within-subjects factors. The perithreshold accuracy range of 75 ± 15% was chosen to maximize the number of trials included in the analysis while eliminating flicker frequencies that were likely suprathreshold (accuracy > 90%) or subthreshold (accuracy < 60%). Using a more restricted range of 75 ± 10% did not affect the result patterns.
We also performed a complementary analysis in which we estimated for each participant his/her CFF per image content and olfactory/semantic condition by fitting their detection accuracies with a Boltzmann sigmoidal function, $y(x) = 0.5 + (0.5 - \lambda)/(1 + e^{(x - x_0)/\omega})$, x being the actual flicker frequency. λ in this function was a lapse parameter that varied between 0 and 6% (Wichmann and Hill 2001); $x_0$ corresponded to the estimated CFF; ω, associated with the shape of the fitted curve, was held constant across conditions for each participant, as there were only 4 flicker frequencies (x = 15, 20, 22.5, or 25 Hz) and initial analysis indicated that ω was unaffected by image content, olfactory/semantic condition, or their interaction (ps > 0.05, in both Experiments 1 and 2). The resulting CFFs were then grouped by the congruency between olfactory/semantic condition and image content and subjected to a paired-sample t-test.
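For concreteness, the CFF estimation can be sketched as follows (an illustrative Python snippet with hypothetical accuracies; the published analysis was run in Matlab and additionally shared ω across conditions within each participant). Note that at x = x₀ the fitted accuracy equals 0.75 − λ/2, i.e., approximately the 75% threshold that operationally defines the CFF.

```python
# Minimal sketch: fit the Boltzmann function above to 2AFC detection
# accuracies and read off the estimated CFF (x0). Data are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def boltzmann_acc(x, x0, omega, lam):
    # accuracy falls from ~(1 - lam) toward chance (0.5) as frequency rises
    return 0.5 + (0.5 - lam) / (1.0 + np.exp((x - x0) / omega))

freqs = np.array([15.0, 20.0, 22.5, 25.0])  # tested flicker frequencies (Hz)
acc = np.array([0.97, 0.88, 0.74, 0.61])    # hypothetical accuracies

(x0, omega, lam), _ = curve_fit(
    boltzmann_acc, freqs, acc,
    p0=(22.0, 1.0, 0.02),
    bounds=([15.0, 0.1, 0.0], [30.0, 10.0, 0.06]))  # lapse bounded at 0-6%
print(f"estimated CFF = {x0:.2f} Hz")
```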
Object Processing Below CFF
The above object detection task was modified in Experiment 3 for EEG recording. The same visual stimuli were used. Each trial began with a fixation on a central cross (0.5° × 0.5°) for 1800–2200 ms and comprised one 400 ms stimulus interval that was preceded and followed by a red-green noise pattern for 100 ms. During the stimulus interval, the opposite images of apple or banana alternated at a visible frequency of 15 Hz in 20% of the trials and at a predetermined subliminal frequency (22.5 Hz for 9 participants, 25 Hz for 9 participants; see below) in 60% of the trials. The remaining 20% of the trials were catch trials where red and green uniform fields alternated at the subliminal frequency. In trials that contained an object, apple and banana appeared with equal probability. The participants, while being continuously exposed to either the apple-like or the banana-like odor, pressed one of two buttons to indicate whether an object was present or not in each trial. There were 40 trials per block. Each participant completed a total of 10 blocks, with 5 blocks per olfactory condition. Hit and correct-rejection rates were calculated.
The subliminal flicker frequency was individually determined prior to the actual testing, using a procedure similar to that of Experiments 1 and 2 but without olfactory stimulation or semantic manipulation. The frequency just beyond one's CFF that yielded an overall accuracy no greater than 70% was chosen (threshold accuracy = 75%).
2AFC Subsecond Duration Comparison
Experiments 4 and 5 employed a 2AFC duration comparison task. As shown in Figure 4A, each trial of the task began with a fixation on a central cross (0.5° × 0.5°), 400–600 ms after which 2 images of an apple and bananas, respectively, appeared in random order with an inter-stimulus interval of 600–800 ms. The participants pressed one of two buttons to indicate whether the first or the second image was longer in duration. For half of the participants, the apple image served as the standard image and was always presented for 500 ms; the banana image served as the comparison image and was presented for 300, 400, 450, 500, 550, 600, or 700 ms with equal probability. For the other half of the participants, it was the reverse. Each participant completed 6 blocks of 42 trials each, with 3 blocks per olfactory condition (apple-like odor vs. banana-like odor) in Experiment 4, and 3 blocks per semantic condition (water suggested as containing an apple vs. a banana odor for half of the participants; a constant visual semantic label of “apple odor” vs. “banana odor” for the other half) in Experiment 5.
We calculated for each participant and each olfactory/semantic condition the proportion of trials in which the comparison image was judged as longer in duration. These proportions were fitted with a Boltzmann sigmoidal function, $p(x) = \lambda + (1 - 2\lambda)/(1 + e^{-(x - x_0)/\omega})$, where x represented the actual duration of the comparison image; λ was a lapse parameter between 0 and 6%; $x_0$ corresponded to the point of subjective equality (PSE), at which the observer perceived the comparison image to equal the standard image in duration; and half the interquartile range of the fitted function corresponded to the difference limen (DL), an index of discrimination sensitivity. Repeated measures ANOVAs were subsequently performed on PSEs and DLs, respectively, with olfactory/semantic condition as the within-subject factor and comparison image (apple image vs. banana image) as the between-subject factor.
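An illustrative sketch of this fit (Python with hypothetical proportions; for the logistic form above, half the interquartile range reduces to ω·ln 3 when the small lapse term is ignored):

```python
# Minimal sketch: fit the duration psychometric function and derive the PSE
# and DL as defined above. Data are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def boltzmann_p(x, x0, omega, lam):
    # proportion "comparison judged longer", rising from ~lam to ~(1 - lam)
    return lam + (1.0 - 2.0 * lam) / (1.0 + np.exp(-(x - x0) / omega))

durations = np.array([300., 400., 450., 500., 550., 600., 700.])  # ms
p_longer = np.array([0.04, 0.18, 0.36, 0.55, 0.74, 0.91, 0.97])   # hypothetical

(x0, omega, lam), _ = curve_fit(
    boltzmann_p, durations, p_longer,
    p0=(500.0, 60.0, 0.02),
    bounds=([300.0, 1.0, 0.0], [700.0, 300.0, 0.06]))

pse = x0                  # point of subjective equality (ms)
dl = omega * np.log(3.0)  # half the interquartile range, lapse ignored (ms)
print(f"PSE = {pse:.0f} ms, DL = {dl:.0f} ms")
```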
In Experiments 1, 3, and 4, the participants were continuously exposed to one olfactory stimulus per block. There was a break of over 2 min in between the blocks to eliminate olfactory habituation. The olfactory stimulus was replaced by either water that was suggested as containing an odor (for half of the participants) or a visual semantic label of an odor (for the other half of the participants) in Experiments 2 and 5. The order of the olfactory/semantic conditions was pseudo-randomized across blocks and balanced across participants in each experiment.
EEG Data Acquisition and Analyses
In Experiment 3, scalp EEG was recorded using a 64-channel NeuroScan SynAmps2 system (Compumedics NeuroScan) with a band-pass filter from either DC or 0.01 Hz to 100 Hz, and digitized at 500 Hz. The signals were referenced online to an electrode between CZ and CPZ. Vertical and horizontal electro-oculograms were recorded from electrodes placed below and above the left eye, and at the outer canthus of each eye, respectively. Electrode impedance was kept below 5 kΩ.
Offline EEG analyses were performed in Matlab using FieldTrip (Oostenveld et al. 2011) and EEGLAB/ERPLAB (Delorme and Makeig 2004; Lopez-Calderon and Luck 2014). The EEG data were power-line notch-filtered at 50 Hz, high-pass filtered at 1 Hz, and segmented into epochs of 1200 ms (−500 to 700 ms with respect to flicker onset). Epochs containing eye movement or with EEG exceeding ±100 μV were rejected.
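An equivalent preprocessing pipeline could be sketched in MNE-Python as follows (for illustration only; the actual analyses used FieldTrip and EEGLAB/ERPLAB in Matlab, and the file name and event extraction here are hypothetical):

```python
# Minimal preprocessing sketch mirroring the steps described above.
import mne

raw = mne.io.read_raw_cnt("sub01.cnt", preload=True)  # NeuroScan recording
raw.notch_filter(freqs=50.0)                          # 50 Hz power-line notch
raw.filter(l_freq=1.0, h_freq=None)                   # 1 Hz high-pass

events, event_id = mne.events_from_annotations(raw)
epochs = mne.Epochs(raw, events, event_id,
                    tmin=-0.5, tmax=0.7,              # -500 to 700 ms epochs
                    baseline=None, preload=True,
                    reject=dict(eeg=100e-6))          # drop epochs beyond ±100 µV
# (rejection of epochs containing eye movements via the EOG channels would follow)
```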
We first validated our flicker frequency manipulation. To this end, we pooled together the catch trials with the same subliminal flicker frequency (22.5 or 25 Hz), and performed a fast Fourier transform on the combined occipital EEG signals recorded from CB1, CB2, O1, O2, and OZ in a time window of 100–500 ms after flicker onset. We then normalized the obtained power at each frequency by its prestimulus value (−400 to 0 ms).
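In sketch form (illustrative Python on placeholder arrays; `post` and `pre` stand in for the pooled occipital signals in the poststimulus and prestimulus windows):

```python
# Minimal sketch: FFT power of occipital catch-trial epochs, normalized
# frequency-by-frequency by the prestimulus spectrum.
import numpy as np

fs = 500.0                       # sampling rate (Hz)
n = 200                          # 400 ms windows at 500 Hz
post = np.random.randn(100, n)   # placeholder: 100-500 ms, pooled occipital EEG
pre = np.random.randn(100, n)    # placeholder: -400 to 0 ms

def mean_power(x):
    return (np.abs(np.fft.rfft(x, axis=-1)) ** 2).mean(axis=0)

freqs = np.fft.rfftfreq(n, d=1.0 / fs)   # 2.5 Hz frequency resolution
norm_power = mean_power(post) / mean_power(pre)
band = (freqs >= 10) & (freqs <= 40)
peak = freqs[band][np.argmax(norm_power[band])]
print(f"peak at {peak:.1f} Hz")  # with real data: the subliminal flicker frequency
```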
Subsequent analyses were conducted on the subliminal flicker trials that contained an object undetected by the participants. These trials were classified as congruent or incongruent based on the congruency between the subliminal object and the olfactory stimulus the participants were being exposed to. Trials where the participants reported seeing an object were excluded. Based on previous findings on object recognition and intracortical temporal filtering of high-frequency chromatic information in the ventral visual stream (Doniger et al. 2000; Jiang et al. 2007; Cichy et al. 2014), we were primarily interested in electrode sites along the ventral visual stream, particularly in the temporal area, which sits downstream of occipital regions that distinguish between fused chromatic flicker and matched nonflicker and is also responsible for multisensory object representations (Murray and Richmond 2001; Taylor et al. 2006).
Time–Frequency Analysis
For each trial, the EEG data from −200 to 400 ms were analyzed in 10 ms steps in the time–frequency domain by convolution with complex Morlet's wavelets, yielding a time–frequency power map $P(t, f) = |w(t, f) * s(t)|^2$, where $w(t, f) = (\sigma_t\sqrt{\pi})^{-1/2}\, e^{-t^2/(2\sigma_t^2)}\, e^{i2\pi ft}$ was for each time t and frequency f a complex Morlet's wavelet, with $\sigma_f = 1/(2\pi\sigma_t)$ and $f/\sigma_f = 7$ (Tallon-Baudry et al. 1996). The obtained powers for each frequency were then normalized by prestimulus means (−200 to 0 ms). We focused on the 10 Hz frequency range centered at the subliminal flicker frequency, that is, 22.5/25 ± 5 Hz, and mapped the scalp topographies for the band-power difference between congruent and incongruent trials within this frequency range, averaged across participants, in steps of 50 ms from 100 to 400 ms poststimulus onset—a period encompassing the processing stages relevant to object perception (Doniger et al. 2000; Cichy et al. 2014). These preliminary plots revealed a congruency-induced power enhancement in the right temporal area, within our a priori region of interest (Fig. 3A). To further quantify this observation, we then subjected the time–frequency data for congruent versus incongruent trials to a nonparametric permutation test with 2000 Monte Carlo randomizations (Maris and Oostenveld 2007), using a sliding window of 50 ms that was advanced in steps of 50 ms from 100 to 400 ms poststimulus onset. Multiple comparisons over all temporal electrodes were corrected for using the Bonferroni adjustment. To characterize the distribution of congruency-induced power difference over frequency (subliminal flicker frequency ± 5 Hz) and time (−50 to 350 ms), we fitted the time–frequency data with a 2-dimensional Gaussian function (Fig. 3C).
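A bare-bones version of the wavelet decomposition might look like this (illustrative Python on a placeholder signal; the family ratio f/σf = 7 follows the cited Tallon-Baudry et al. 1996 convention, and the actual analysis used FieldTrip):

```python
# Minimal sketch: Morlet wavelet power map around the subliminal flicker
# frequency, normalized by the prestimulus mean.
import numpy as np

fs = 500.0
def morlet_power(signal, freqs, fs, ratio=7.0):
    out = []
    for f in freqs:
        sigma_t = ratio / (2.0 * np.pi * f)       # since sigma_f = f / ratio
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1.0 / fs)
        w = ((sigma_t * np.sqrt(np.pi)) ** -0.5
             * np.exp(-t ** 2 / (2 * sigma_t ** 2))
             * np.exp(2j * np.pi * f * t))
        out.append(np.abs(np.convolve(signal, w, mode="same")) ** 2)
    return np.array(out)                           # (n_freqs, n_times)

signal = np.random.randn(600)                      # placeholder 1200 ms epoch
flicker = 22.5                                     # this participant's subliminal rate
freqs = np.arange(flicker - 5.0, flicker + 5.5, 0.5)
tfr = morlet_power(signal, freqs, fs)
baseline = tfr[:, 150:250].mean(axis=1, keepdims=True)  # -200 to 0 ms window
tfr_norm = tfr / baseline                          # power relative to prestimulus
```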
Source Localization
We performed adaptive beamforming using the dynamic imaging of coherent sources algorithm (Gross et al. 2001) to localize possible sources of the congruency-induced power difference within ±5 Hz of the subliminal flicker frequency. Based on the results of the time–frequency analysis, a time window of 100–300 ms poststimulus onset was chosen to allow a reasonable frequency resolution (1/length of time window in sec) and to adequately encompass the EEG signals pertaining to the congruency effect (Dalal et al. 2008). All electrodes were overlaid onto the scalp of a volume conductor model derived from the MNI brain template Colin27 (Montreal Neurological Institute) (Fuchs et al. 2002), and a lead field matrix (i.e., spatial forward model) was calculated with 1 cm grid resolution. At each grid point, spatial filters were constructed from the cross-spectral density matrix of the Fourier-transformed EEG signals and the respective lead field. The resulting spatial filters were subsequently applied to the power of the Fourier-transformed data for the frequency range of interest to optimally estimate potential sources. Significance of neural sources across participants was determined by a permutation test between congruent and incongruent trials with 2000 Monte Carlo randomizations (Maris and Oostenveld 2007). The resulting t values were interpolated onto the surface of the template brain.
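The core of the beamformer reduces to a spatial filter per grid point, A = (L^T C^-1 L)^-1 L^T C^-1, where L is the lead field and C the sensor cross-spectral density; a schematic rendering follows (illustrative Python on random placeholders; the actual analysis used FieldTrip's DICS implementation):

```python
# Minimal sketch: DICS-style spatial filter and source power at one grid point.
import numpy as np

rng = np.random.default_rng(0)
n_ch, n_trials = 60, 200
F = rng.standard_normal((n_ch, n_trials)) + 1j * rng.standard_normal((n_ch, n_trials))
C = F @ F.conj().T / n_trials                          # cross-spectral density
C += 1e-3 * (np.trace(C).real / n_ch) * np.eye(n_ch)   # diagonal regularization

L = rng.standard_normal((n_ch, 3))                     # lead field, 3 orientations
Cinv = np.linalg.inv(C)
A = np.linalg.solve(L.T @ Cinv @ L, L.T @ Cinv)        # 3 x n_ch spatial filter
power = np.real(np.trace(A @ C @ A.conj().T))          # source power at grid point
```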
Results
Odors Alter Visual Temporal Sampling and Modulate Object Visibility Around Critical Flicker-Fusion Frequency in a Congruency-Based Manner
We took advantage of the flicker-fusion phenomenon (De Lange Dzn 1958) and constructed 2 pairs of opposite images of apple and banana drawings, that is, isoluminant red apple or banana on green and green apple or banana on red (Fig. 1A). By definition, when the 2 opposite images in a pair alternated at a frequency corresponding to the critical flicker-fusion frequency (CFF), the red and green colors tended to “fuse” beyond V4 in the visual system (Jiang et al. 2007) and the object (apple or banana, each presented in 50% of the trials) would be visible only half of the time, presumably reflecting a bottleneck of visual object sampling (Biederman et al. 1974; Holcombe 2009). This translated to a ~75% accuracy (perfect detection on the half of trials where the object was visible, 50% guessing on the other half: 0.5 × 100% + 0.5 × 50% = 75%) in a 2-alternative forced-choice (2AFC) task where a series of red and green uniform fields with the same flicker frequency served as the comparison and participants indicated which image series contained an object (Fig. 1A, see Materials and Methods). We first asked whether olfactory inputs would influence object detectability around CFF.

Figure 1. Olfactory modulations of visual temporal sampling and object visibility around CFF. (A) Visual stimuli and procedure of the 2AFC object detection task used in Experiments 1 and 2. Resp.: response. (B) Left panel: object detection accuracies at perithreshold flicker frequencies exhibited an interaction between odor content and image content. Right panel: sigmoidal curve fits of object detection accuracies under exposure to congruent and incongruent odors, respectively; inset shows the estimated CFFs, higher in the presence of a congruent as opposed to an incongruent odor. (C, D) Left panels: suggested odor content (C) and visual semantic label (D) in the absence of actual olfactory input failed to alter object detection accuracies at perithreshold flicker frequencies. Right panels: the estimated CFFs showed no effect of the congruency between image content and suggested odor content (C) or visual semantic label (D). Error bars represent standard errors of the mean adjusted for individual differences; *P < 0.05.
In Experiment 1, 16 participants performed the aforementioned task while being continuously exposed to either an apple-like or a banana-like odor of comparable intensity (P = 0.22) and pleasantness (P = 0.13) (see Materials and Methods). The images flickered at 15, 20, 22.5, or 25 Hz, which largely covered the normal range of chromatic CFF (Mason et al. 1982). For each participant, flicker frequencies with overall perithreshold accuracies (75 ± 15%) were combined and analyzed. Using performance accuracy as the dependent variable, a repeated measures ANOVA revealed a significant interaction between olfactory condition (apple-like odor vs. banana-like odor) and image content (apple vs. banana) (F1,15 = 9.92, P = 0.007), with no main effect of either factor (F1,15 = 0.48 and 0.48, P = 0.50 and 0.50, for olfactory condition and image content, respectively) (Fig. 1B left panel). In other words, the participants were better at detecting the flickering object around CFF when they smelled a congruent, rather than an incongruent, odor, even though the task required no explicit object discrimination or identification.
Given that CFF was operationally defined as the flicker frequency where accuracy in the 2AFC task was at 75% correct, we were also able to estimate for each participant his/her CFF per image content and olfactory condition by fitting their accuracies with a Boltzmann sigmoidal function (Fig. 1B right panel, see Materials and Methods). This complementary analysis revealed that the CFF for a flickering object was significantly increased by an average of 0.67 Hz in the presence of a congruent relative to an incongruent odor (23.08 vs. 22.41 Hz, t15 = 2.15, P = 0.048), which more directly pointed to an olfactory modulation of visual object sampling, a process that constitutes the basis of visual time perception (Treisman 1963; Buhusi and Meck 2005).
To test whether this was merely a result of demand characteristics (Orne 1969) or semantic priming (i.e., the presence of the apple/banana odor primed the conceptual link between apple/banana odor and apple/banana image), we tested 2 independent groups of 16 participants each in Experiment 2, who performed the same task under different semantic manipulations. In the first group, 2 bottles of purified water were used in place of the actual odors in Experiment 1. The participants were, however, told that one of the bottles contained a low concentration of apple odor and the other a low concentration of banana odor; they were also told which odor they were going to receive before each block. In the second group, verbal labels of “apple odor” and “banana odor” were adopted instead, one of which was constantly presented at fixation throughout each block, serving as a salient semantic cue. Semantic suggestions effectively biased olfactory percepts in the first group of participants: they rated the purified water as more like the odor of apple (P = 0.007), less like the odor of banana (P < 0.001), but similarly intense (P = 0.37) and pleasant (P = 0.13) when it was suggested as containing an apple as opposed to a banana odor. In both groups, however, object detection in the 2AFC task was unaffected by the semantic manipulations, as denoted by an absence of interaction between semantic cue (suggested odor content or verbal label) and image content in performance accuracies around CFF (F1,15 = 0.001 and 0.07, P = 0.98 and 0.80, respectively, Fig. 1C,D, left panels). Analysis of the estimated CFFs also showed no effect of the congruency between semantic cue and image content (t15 = 0.06 and −0.57, P = 0.96 and 0.58, Fig. 1C,D right panels). Moreover, a direct comparison between the results of Experiments 1 and 2 confirmed that the presence of the odors elicited a stronger congruency effect than the semantic manipulations (object detection accuracy around CFF: F1,46 = 7.11, P = 0.011; estimated CFF: F1,46 = 5.75, P = 0.021).
Based on these results, we inferred that sensory congruency, relative to incongruency, between olfactory and visual inputs facilitated visual temporal sampling and boosted the visibility of the corresponding object around CFF. It remained unclear, however, where in the sensory processing stream the effect first took place and whether visual awareness was critically involved. This led us to probe the underlying mechanisms.
Olfactory-Visual Integration Adjusts the Spectral Power of Early Temporal EEG Signals Around the Flicker Frequency of a Subliminal Object
Along the visual processing hierarchy, retinal ganglion cells and lateral geniculate nucleus neurons can exhibit spike timing reproducibility approaching 1 ms (Berry et al. 1997; Reinagel and Reid 2000). Cortical regions from V1 to V4 reliably respond to chromatic flicker beyond the fusion frequency (Jiang et al. 2007). Temporal filtering that shapes our visual awareness of colored objects seems to occur only at higher cortical areas in the visual stream, and as shown above, is liable to olfactory modulation. Utilizing the high temporal resolution of EEG, we next probed the neural correlates of such congruency-based crossmodal temporal modulation in Experiment 3. Specifically, we asked how an odor enables an otherwise undetectable flickering object to enter awareness (as suggested by Experiment 1), and thus focused our examination on subconscious visual processing. We were primarily interested in the temporal region because it sits downstream of the aforementioned early visual areas and has been shown to be responsible for multisensory object representations (Murray and Richmond 2001; Taylor et al. 2006).
The same visual and olfactory stimuli as in Experiment 1 were adopted. Each trial comprised only one image series. Participants pressed one of two buttons to indicate whether an object was present or not while being continuously exposed to either the apple-like or the banana-like odor. Out of all trials, 80% contained an object (40% contained the apple drawing, 40% contained the banana drawing), which flickered at a predetermined subliminal frequency right above one's CFF in 60% of the trials (22.5 Hz for 9 participants, 25 Hz for another 9 participants; see Materials and Methods), and at a visible frequency of 15 Hz in 20% of the trials to sustain participants’ attention. The remaining 20% were catch trials where red and green uniform fields alternated at the subliminal flicker frequency. The use of a subliminal frequency in most trials eliminated possible top-down interference. Across the 18 participants, correct-rejection rate in the catch trials was 97.3%. Hit rate was 97.6% in the visible flicker trials but only 6.6% in the subliminal flicker trials. As the adopted flicker frequencies were either visible or invisible as opposed to perithreshold, no significant effect of olfactory-visual congruency emerged in the hit rates for the visible flicker trials (P = 0.76) or the subliminal flicker trials (P = 0.22).
In validation of our frequency manipulation, Fourier transformation of the occipital EEG signals (O1, O2, OZ, CB1, and CB2 combined) from the catch trials yielded a fundamental peak frequency consistent with the subliminal flicker frequency (Fig. 2). We then focused our examination on the subliminal flickering objects undetected by the participants and traced how their temporal profiles were differently “filtered” in the presence of congruent versus incongruent olfactory inputs. As an exploratory step, we applied wavelet time–frequency analysis (Tallon-Baudry et al. 1996) and mapped the scalp topographies over time for the band-power difference between these 2 conditions in a 10 Hz range centered at the subliminal flicker frequency (i.e., 22.5/25 ± 5 Hz), normalized by prestimulus mean (−200 to 0 ms with respect to flicker onset) and averaged across participants. As shown in Figure 3A, maximum congruency-induced enhancement was observed over right temporal regions in the vicinity of electrodes T8, TP8, and P8, within our a priori region of interest, approximately 150–300 ms poststimulus onset—a period corresponding well with object-level processing (Doniger et al. 2000; Itier and Taylor 2004). Follow-up permutation tests showed this enhancement to be statistically significant, with a peak P value of 0.002 (TP8, 200–250 ms; Bonferroni adjusted P = 0.028, corrected for multiple comparisons across all temporal electrodes).

Figure 2. Fourier transforms of occipital EEG signals (left panel; electrodes shown as bold dots) during the catch trials, plotted separately for participants whose subliminal flicker frequency was 22.5 Hz or 25 Hz (right panel). The obtained powers for each frequency were normalized by prestimulus values.

Figure 3. Time–frequency distribution and source localization of changes in EEG spectral power induced by the integration of congruent, relative to incongruent, olfactory and visual inputs. (A) Topographic plots of congruency-induced band power difference within ±5 Hz of the subliminal flicker frequency in steps of 50 ms from 100 to 400 ms poststimulus onset. White dots: electrodes T8, TP8, and P8. (B) Time–frequency plot of the congruency effect averaged across T8, TP8, and P8. Time 0 marks flicker (object) onset; frequency 0 marks the subliminal flicker frequency to which EEG frequencies were referenced. (C) Gaussian fitted time–frequency distribution of the normalized EEG power difference (averaged across T8, TP8, and P8) induced by olfactory-visual congruency. (D) Likely intracranial sources of the congruency effect were localized to the right temporal cortex.
We subsequently combined the time–frequency data from T8, TP8, and P8 so as to increase statistical power and to further quantify and compare the EEG spectral dynamics time-locked to object onset for congruent and incongruent trials (Fig. 3B). This approach revealed a significant main effect of olfactory-visual congruency in the 10 Hz bandwidth centered at the subliminal flicker frequency in 3 consecutive time windows: 150–200 ms (P = 0.003), 200–250 ms (P < 0.001), and 250–300 ms (P = 0.024) poststimulus onset. The time window of 100–150 ms showed a similar effect that trended towards significance (P = 0.064). Across frequency and time, the distribution of the congruency-induced power difference largely followed a 2-dimensional Gaussian function (adjusted r² = 90.6%), with the most prominent enhancement taking place at 230 ms poststimulus onset (P = 0.004) (Fig. 3B,C).
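The permutation logic underlying these P values can be sketched as follows (illustrative Python with hypothetical per-participant power values; condition labels are exchanged within participants, in the spirit of Maris and Oostenveld 2007, whereas the actual test used FieldTrip with 2000 randomizations):

```python
# Minimal sketch: paired two-sided permutation test via within-participant
# sign flips of the congruent-incongruent power difference.
import numpy as np

rng = np.random.default_rng(0)
cong = rng.normal(0.3, 1.0, size=18)    # placeholder congruent-trial powers
incong = rng.normal(0.0, 1.0, size=18)  # placeholder incongruent-trial powers

diff = cong - incong
observed = diff.mean()
null = np.array([(rng.choice([1, -1], size=diff.size) * diff).mean()
                 for _ in range(2000)])  # 2000 Monte Carlo randomizations
p = (np.abs(null) >= np.abs(observed)).mean()
print(f"two-sided permutation P = {p:.3f}")
```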
To track the possible source of the observed amplitude modulation by odors, we performed beamforming (Gross et al. 2001) on the oscillatory EEG signals within ±5 Hz of the subliminal flicker frequency in a time window from 100 to 300 ms poststimulus onset. As plotted in Figure 3D, the effect was spatially localized to the right temporal cortex extending anteriorly to the temporal pole and posteriorly to the lateral occipital sulcus (ps < 0.05), a broad area heavily implicated in object representations (Chao et al. 1999; Sigala and Logothetis 2002).
In combination with the psychophysical data from Experiments 1 and 2, these EEG results indicated that a congruent as opposed to an incongruent odor strengthened the oscillatory signals of the corresponding visual object in right temporal regions at the stage of object processing, thereby pushing it closer to visual awareness (Moutoussis and Zeki 2002). This was likely mediated through local neural synchronization, a process linked to sensory binding and correlated with sensory awareness (Tononi et al. 1998; Engel and Singer 2001; Robertson 2003). Moreover, these results led to the following inference: odors should also, in a congruency-based manner, modulate the perceived duration of visual objects, which presumably reflects the magnitude of neural responses to those objects (Eagleman and Pariyadath 2009).
Odors Bias the Perceived Duration of Visual Objects in a Congruency-Based Manner
The hypothesis, that odors would alter one's subjective duration of visual objects in a congruency-based manner, was tested in Experiment 4 using a 2AFC duration comparison task (Fig. 4A). Two images, of an apple and bananas, respectively, were presented in random order and compared, one with a fixed duration of 500 ms (standard image), the other with a duration of 300, 400, 450, 500, 550, 600, or 700 ms that varied randomly across trials (comparison image). A total of 24 participants reported which of the 2 images appeared longer in duration while being continuously exposed to either the apple-like or the banana-like odor. For half of them, the apple image served as the standard image and the banana image served as the comparison image. For the other half, it was the reverse. Their responses per olfactory condition were plotted against the duration of the comparison image in Figure 4B (left and middle panels), fitted with a Boltzmann sigmoidal function. In both cases, odors induced a systematic shift of the psychometric curves. To characterize this shift, we calculated for each participant and each olfactory condition the PSE, an index of judgment criterion at which the duration of the comparison image was perceived as equal to that of the standard image, as well as the DL, an index of discrimination sensitivity (half the interquartile range of the fitted psychometric function, inversely related to its slope around the PSE). Repeated measures ANOVA on the PSEs with olfactory condition (apple-like odor vs. banana-like odor) as the within-subject factor and comparison image (apple image vs. banana image) as the between-subject factor showed no main effect of either factor (F1,22 = 1.31 and 1.14, P = 0.27 and 0.30, for olfactory condition and comparison image, respectively) but a significant interaction between the two (F1,22 = 9.18, P = 0.006). With respect to the comparison image, the PSE was shortened by an average of 41 ms in the presence of a congruent rather than an incongruent odor, indicating dilated perception of its duration (Fig. 4B, right panel). Temporal discrimination sensitivity, as indexed by the DL, was nonetheless unaffected (F1,22 = 1.19 and 0.12, P = 0.29 and 0.74 for the main effect of olfactory condition and its interaction with comparison image, respectively).

Figure 4. Olfactory modulation of the perceived duration of visual objects. (A) Procedure of the 2AFC duration comparison task used in Experiments 4 and 5. (B) Duration comparison performance fitted with sigmoidal curves (left and middle panels) and the corresponding PSEs (right panel). With respect to the comparison image, the PSE was shortened under exposure to a congruent rather than an incongruent odor, indicating dilated perception of its duration. (C, D) Suggested odor content (C) and visual semantic label (D) in the absence of actual olfactory input failed to affect duration judgments. Error bars represent standard errors of the mean adjusted for individual differences.
To examine whether these results arose from semantic bias or demand characteristics, we recruited another 48 participants and conducted Experiment 5. The experimental setup was identical to that of Experiment 4 except that the olfactory stimuli were replaced with either 2 bottles of purified water respectively suggested as containing a low concentration of apple odor and banana odor (in 24 participants) or verbal labels of “apple odor” and “banana odor” presented at fixation (in the remaining 24 participants), as in Experiment 2. In the former case, the participants rated the purified water as more like the odor of apple (P < 0.001), less like the odor of banana (P < 0.001), but equally intense (P = 0.29) and pleasant (P = 0.34) when it was suggested as containing an apple odor as compared with a banana odor. Still, neither semantic manipulation affected the participants’ duration judgments (Fig. 4C, D, left and middle panels). There was no interaction between semantic cue (suggested odor content or verbal label) and comparison image in the PSEs (F1,22 = 0.47 and 0.77, P = 0.50 and 0.39, respectively, Fig. 4C,D right panels), nor in the DLs (F1,22 = 0.85 and 0.55, P = 0.37 and 0.47, respectively). Furthermore, pooling the data from Experiments 4 and 5, we were able to verify that the odors, as compared with the semantic manipulations, induced a significantly stronger congruency effect in the subjective duration of the visual objects (F1,68 = 10.17, P = 0.002).
We hence confirmed the inference drawn from both the psychophysical results of Experiments 1 and 2 and the EEG results of Experiment 3, and demonstrated that a congruent odor, as compared with an incongruent one, stretched the perceived duration of the corresponding visual object without affecting temporal discrimination sensitivity.
Discussion
With no direct access to physical time, the brain reconstructs “time” from the dynamic environmental inputs relayed by various sensory modalities. Here we show that the very process of sensory integration twists time perception. On the one hand, odors alter visual object sampling and modulate object visibility around CFF in a manner contingent upon their sensory congruency with visual inputs (Experiments 1 and 2). This is likely mediated through the adjustment of the oscillatory power of right temporal EEG signals surrounding the flicker frequency at the stage of object processing, independent of visual awareness (Experiment 3). On the other hand, such olfactory-visual integration biases the subjective duration of the corresponding object (Experiments 4 and 5), in agreement with the proposal that perceived duration is a signature of the amount of energy expended in representing a stimulus (Eagleman and Pariyadath 2009). Whereas sensory integration and time perception have largely been treated as independent processes in the literature, our findings argue that the two are innately intertwined. Meanwhile, they provide a clear demonstration that the visual sampling rate of complex objects is not fixed but influenced by other sensory information.
We note that olfactory-visual congruency (congruent vs. incongruent) was manipulated in a within-subject manner in Experiments 1, 3, and 4 without a visual-only (unimodal) baseline. This was based on common practice in the field to maximize experimental sensitivity (Meeren et al. 2005; Yuval-Greenberg and Deouell 2007; Alsius and Munhall 2013). Besides, a visual-only condition would not be strictly “neutral” as it could not produce the subjective experience of arousal and pleasantness automatically elicited by an odor (Zald and Pardo 1997; Anderson et al. 2003). Consequently, our experimental design is not optimal to discern the relative contributions of congruent and incongruent odors to the temporal processing of visual objects. Nevertheless, preliminary comparisons of the results from Experiments 1 and 4 with those from the control Experiments 2 and 5 indicate that congruency and incongruency effects are likely both at play in the observed olfactory modulations of visual time perception, consistent with previous neurophysiological and psychophysical findings of crossmodal enhancement and suppression (De Gelder and Bertelson 2003; Stein and Stanford 2008; Klatzky et al. 2011).
Attention is intricately related to conscious perception (Dehaene et al. 2006; Koch and Tsuchiya 2007) and has been suggested to modulate subjective duration (Ivry and Schlerf 2008). We reason, however, that endogenous attention or top-down cognitive control is unlikely to have caused the aforementioned effects. The visual tasks in Experiments 1–3 heavily taxed attention and did not require object recognition. Visual objects were subliminally presented in critical trials, thus preventing participants from directing attention to a particular object. Under these circumstances, an increase of CFF was observed in Experiment 1 but not Experiment 2 (control experiments) where the only difference was the lack of actual odors. One may still argue that an odor could have automatically attracted attention to the congruent visual object and its location (Chen et al. 2013). But in that case, the net effect would be a reduction of attentional resources allocated to temporal information, which has been shown to degrade temporal resolution, shorten subjective duration, and impair temporal sensitivity (Yeshurun and Levy 2003; Coull et al. 2004; Cicchini and Morrone 2009), contrary to the results of the current study.
We are constrained by the limited spatial resolution of EEG in pinpointing the exact neural substrate underlying the interplay between olfaction and visual temporal processing. Nonetheless, the result from our source localization analysis aligns well with previous findings that have highlighted the involvement of temporal structures in the representations of visual (Chao et al. 1999; Sigala and Logothetis 2002), olfactory (Li et al. 2008; Gottfried 2010), and multimodal (Murray and Richmond 2001; Taylor et al. 2006) objects, as well as time (Dalla Barba and La Corte 2013; Kraus et al. 2013). The right lateralization of the temporal activities is also in line with the documented right hemisphere dominance in olfactory processing (Brand et al. 2001). These results lead us to postulate that the right temporal cortex encompasses a neuronal network that executes flexible temporal filtering of upstream visual inputs based on olfactory information. It remains to be tested what role each involved structure plays in this process.
From the perspective of neurons (Scharnowski et al. 2013), information regarding an unknown time-dependent stimulus can only be extracted from segments of spike trains. Signal-to-noise ratio determines decoding success (Bialek et al. 1991) and ultimately affects the readout of neuronal codes. It has been shown that state-dependent neural synchrony and oscillations underlie the dynamic binding (or integration) of different sensory information pertaining to the same object or event, and also serve to enhance the saliency of the relevant neural responses (Engel and Singer 2001). Odors are known to elicit various oscillations in the olfactory system (largely located in the temporal lobe), some of which engage in long-range couplings with other brain regions (Kay et al. 2009). It is plausible that through phase resetting (Achuthan and Canavier 2009), these signals can modulate the oscillatory responses, internally generated (Gray et al. 1989) or stimulus-locked (Tononi et al. 1998), of the neuronal assembly encoding visual objects. In this regard, the observed enhancement of object visibility around CFF and elongation of subjective duration may be viewed as flip sides of the same coin, both being perceptual manifestations of augmented object signals, and thereby improved signal-to-noise ratios in object processing, induced by the integration of congruent, as opposed to incongruent, olfactory and visual inputs.
In physics, space-time is warped by the distribution of mass and energy in it (Hawking 1988). Our findings, along with others (Johnston et al. 2006; Xuan et al. 2007; Wang and Jiang 2012; Mayo and Sommer 2013), suggest a parallel in the perception of time—subjective time is “warped” by the neural energy involved in representing multisensory inputs at subsecond scales.
Funding
National Natural Science Foundation of China (31422023 and 31100735) and the Strategic Priority Research Program (XDB02030006 and XDB02010003) and the Key Research Program of Frontier Sciences (QYZDB-SSW-SMC030) of the Chinese Academy of Sciences.
Notes
Conflict of Interest: None declared.