Interoception, the perception of our body's internal signals, plays a key role in maintaining homeostasis and guiding our behavior. Sometimes, we become aware of our body signals and use them in planning and strategic thinking. Here, we show behavioral and neural dissociations between learning to follow one's own heartbeat and metacognitive awareness of one's performance, in a heartbeat-tapping task performed before and after auditory feedback. The electroencephalography amplitude of the heartbeat-evoked potential in interoceptive learners, that is, participants whose accuracy of tapping to their heartbeat improved after auditory feedback, was higher compared with non-learners. However, an increase in gamma phase synchrony (30–45 Hz) after the heartbeat auditory feedback was present only in those participants showing agreement between objective interoceptive performance and metacognitive awareness. Source localization in a group of participants and direct cortical recordings in a single patient identified a network hub for interoceptive learning in the insular cortex. In summary, interoceptive learning may be mediated by the right insular response to the heartbeat, whereas metacognitive awareness of learning may be mediated by widespread cortical synchronization patterns.
Homeostatic balance is attained by a complex network of body–brain communications that are primarily processed unconsciously (Dworkin 2000); however, conscious access may be key in guiding our behavior and decision-making. Heartbeat awareness (Craig 2003, 2009; Critchley et al. 2004; Critchley and Harrison 2013; Park and Tallon-Baudry 2014), the capacity to become sentient of one's own heartbeat, is a key model for the study of heart–brain interactions. Traditionally, heartbeat perception has been studied employing tasks in which participants have to count their heartbeats [Schandry 1981; but see Ring et al. (2015) on the validity of this method] or to assess whether auditory stimuli appeared “synchronously” or “delayed” to their own heartbeats (Whitehead et al. 1977). The neurophysiological network of cardiovascular activity includes information processing from the myocardium baroreceptors, through the brainstem solitary nucleus, up to the insular and anterior cingulate cortex (Critchley and Harrison 2013). Cardiac interoception is reflected in the modulation of the so-called heartbeat-evoked potential (HEP; Montoya et al. 1993; Pollatos and Schandry 2004; Couto et al. 2014), a signature of cortical cardiac processing (Gray et al. 2007).
Compared with common visceral feelings (e.g., hunger or thirst), there is a high cardiac interoception variability among healthy participants: Some are able to detect their heartbeat (synchronous tapping), whereas others show chance performance (Ludwick-Rosenthal and Neufeld 1985). The neurophysiological mechanisms behind these differences remain elusive, although higher HEP amplitudes have been recorded in individuals with high cardiac interoception (Pollatos and Schandry 2004; see Supplementary Fig. 1 for convergent results in this study). Interestingly, cardiac interoception can be enhanced by auditory (Schandry and Weitkunat 1990) or visual heartbeat feedback (Schaefer et al. 2014); however, little is known about the neural mechanisms of cardiac interoceptive learning (or any other interoceptive learning mechanism for that matter). For this, we hypothesized a correspondence between improvement in heartbeat-tapping and modulation of HEP amplitude in interoceptive learners but not in non-learners.
Metacognition, in its original definition, is described as the knowledge about and regulation of one's cognitive activities in learning processes (Flavell 1979). A wide body of literature refers to metacognition as the knowledge and evaluation of the participant's own performance (Fleming and Dolan 2012). Interoceptive awareness is commonly defined as the ability to focus awareness on internal signals such as heartbeat timing, as opposed to external signals such as note pitch (Critchley et al. 2004). Importantly, interoceptive awareness should be distinguished from interoceptive sensitivity, the latter referring to an invariant constitutional trait that can be measured by objective tests and does not necessarily involve consciousness (Garfinkel and Critchley 2013). In contrast, interoceptive awareness refers to an individual's subjective experience of bodily signals. As a further level of processing, metacognition of interoception refers to the capacity to evaluate one's own interoceptive performance. However, few studies have investigated the neuronal patterns of interaction between metacognition and the awareness of internal signals (Barttfeld et al. 2013).
One of the generic processes underlying complex cortical computations in a broad range of higher cognitive functions is gamma phase synchronization (Fries 2009). In fact, phase synchrony is a known functional connectivity measure reflecting information transfer between distributed brain networks (Uhlhaas, Pipa, et al. 2009). Given that fMRI studies of metacognition show the involvement of widely distributed brain regions, including the anterior prefrontal cortex (Del Cul et al. 2009; Fleming et al. 2012) and medial parietal cortex (McCurdy et al. 2013), we hypothesized that gamma phase synchronization could be one of the electrophysiological markers of metacognition. Furthermore, gamma synchronization seems to be specifically associated with conscious access to information processing in different sensory and cognitive functions, including visual conscious awareness (Melloni et al. 2007), somatosensory processing (Hagiwara et al. 2010), and working memory (Palva et al. 2010). Assuming that metacognitive awareness of interoceptive learning depends on the efficiency of conscious access to interoceptive input, gamma phase synchrony may underlie the availability of interoceptive information for further metacognitive processing.
Here, we show a dissociation between learning to follow one's own heart and the awareness of that learning, by having participants attempt to tap to their heartbeat before and after auditory feedback. Using electroencephalography (EEG) and direct cortical recordings, we found distinct neural markers of objective interoceptive performance and its associated metacognitive awareness. We conclude that interoceptive learning may be mediated by insular cortex response to the heartbeat, whereas metacognitive awareness of learning seems to be associated with widespread cortical synchronization patterns.
Materials and Methods
Participants and Patient
The 39 right-handed healthy participants (23 male; mean ± SD age = 22.60 ± 3.2 years) and the epileptic patient (male; 33 years) gave written informed consent in accordance with the Declaration of Helsinki, and the study was approved by the Institutional Ethics Committee of the Faculty of Psychology of Universidad Diego Portales (Chile) and the Institutional Ethics Committee of the Hospital Italiano de Buenos Aires (Argentina). Six participants did not correctly follow the instructions of the first 2 blocks of tapping in synchrony with the external heartbeat sound: They refused to tap, giving several reasons why it was not possible to feel their heartbeat. Data from these participants were excluded from the analyses. The patient had suffered from drug-resistant epilepsy since the age of 4 years and was offered surgical intervention to alleviate his intractable condition. His current drug treatment included 2250 mg/day oxcarbazepine, 300 mg/day topiramate, and 250 mg/day lacosamide. Computed tomography (CT) and magnetic resonance imaging (MRI) scans were acquired after insertion of depth electrodes, which revealed an acute inflammatory reaction to implantation (Fig. 5B–D). One week later, and 1 day before the epilepsy surgery, the patient took part in the current study. He was attentive and cooperative during testing, and his cognitive performance before and 1 week after the implantation was indistinguishable from that of healthy volunteers. The patient was specifically recruited for this study because he was implanted with a spear with 10 contact points covering the right anterior insular cortex.
Each participant took part in one session consisting of 7 consecutive blocks (Fig. 1). In Block 1, participants were instructed to tap in synchrony with an external regular heartbeat (60 beats per minute, 3 min) aurally delivered through the earphones (Fig. 1A). Block 2 was identical to Block 1, but tapping followed an external irregular heartbeat (Fig. 1A). In the pre-feedback Blocks 3 and 4, which were separated by a short break, participants were asked to tap in synchrony with their own heartbeat (180 taps per block; Fig. 1B). In the feedback Block 5, participants were asked to tap in synchrony with their own heartbeat as heard through a stethoscope (180 taps; Fig. 1C). In the post-feedback Blocks 6 and 7, participants received the same instructions as in Blocks 3 and 4 (Fig. 1D). Importantly, each block ended once the pre-defined number of trials was reached (180 trials). Trials lacking motor taps were excluded from further electrophysiological analyses.
After each block, participants were asked to rate their performance on a scale from 1 (bad) to 10 (good). After the final block, participants were asked to rate how much their performance accuracy improved after the feedback blocks on a scale from 1 (did not improve) to 10 (improved) (see the Procedure section for details). Thus, participants were separated into 2 groups: Learners, whose tapping accuracy significantly improved in the post-feedback blocks compared with the pre-feedback blocks, and non-learners, who did not show a performance improvement after the feedback block.
EEG caps were fitted and electrocardiogram (ECG) electrodes were attached to participants, one electrode below the right collar bone (clavicle) and another electrode on the left hip bone (iliac crest). Participants were comfortably seated in the Faraday cage in a dimly lit room. After they had read the instructions, the lights were turned off and the experiment began. Auditory stimuli were presented using Etymotics ER-3A 135 earphones at a comfortable volume. The experiment was controlled using a custom-built script programmed in Python, running on a Dell laptop. After each block, the lights were turned on and participants were asked about their performance. The auditory feedback was delivered using a Littmann stethoscope, which was held by the participants themselves with the left hand, while they tapped a keyboard button with the right hand. A pilot study of 15 participants (8 males) performed in the MRC Cognition and Brain Sciences Unit in Cambridge was instrumental in defining several parameters of the subsequent full study: the HEP variability between learners and non-learners, the proportion of learning subjects, the power of the single-subject behavioral analysis, and the selection of the scales to measure metacognition of learning.
While objective performance was defined by the behavioral analysis comparing tapping accuracy distributions of the pre- and post-feedback conditions via single-subject analysis, assessment of subjective performance was based on the ratings given by each participant in the pre- and post-feedback blocks. After each pre- and post-feedback block, they answered the question: “How accurate do you think you were in tapping to your heartbeat from 1 (“inaccurate”) to 10 (“extremely accurate”)?” Then, the averages of the 2 ratings of the pre- and post-feedback conditions were calculated. Participants were split according to these averages: They were classified as good performers (subjective measure) when the post-feedback score was higher than the pre-feedback score or as bad performers when the post-feedback score was lower than the pre-feedback score. In addition, after completing the final post-feedback block, participants were asked to compare their performance accuracy between the first pre- and post-feedback blocks as well as between the second pre- and post-feedback blocks (Fig. 1A): “How much do you think your performance improved after the auditory feedback from 1 (“did not improve”) to 10 (“improved a lot”)?” The 2 answers were averaged and participants were split into 2 groups: Subjective improvers (ratings 6–10) and non-improvers (1–5). Finally, participants were split between a metacognitively congruent group (i.e., those claiming that they improved and they did, and those saying that they did not improve and they did not, 17 participants) and a metacognitively incongruent group (i.e., those claiming that they improved but they did not, and those reporting no improvement when in fact they did, 16 participants). That is, subjective and objective performance matched in the metacognitively congruent group, but did not match in the metacognitively incongruent group.
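The grouping logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names, the participant labels, and the handling of an averaged rating that falls between 5 and 6 (treated here as a non-improver) are assumptions.

```python
from statistics import mean

def is_subjective_improver(ratings, threshold=6.0):
    """Average the improvement ratings (1-10) and compare to the cut-off.

    The text assigns ratings 6-10 to improvers and 1-5 to non-improvers;
    treating an average between 5 and 6 as a non-improver is an assumption.
    """
    return mean(ratings) >= threshold

def is_congruent(objective_learner, ratings):
    """Congruent when the subjective report matches objective learning."""
    return objective_learner == is_subjective_improver(ratings)

# Hypothetical participants: (objective learner?, two improvement ratings)
participants = {
    "P01": (True, [8, 7]),   # improved and says so
    "P02": (False, [3, 4]),  # did not improve and says so
    "P03": (False, [9, 8]),  # claims improvement without learning
    "P04": (True, [2, 5]),   # learned but reports no improvement
}
groups = {pid: ("congruent" if is_congruent(obj, r) else "incongruent")
          for pid, (obj, r) in participants.items()}
```

In this toy example P01 and P02 fall into the congruent group, and P03 and P04 into the incongruent group.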
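For behavioral analysis, 2 complementary measures of performance were computed: tapping accuracy and omissions. First, tapping accuracy was defined as the absolute value of the time difference between the R-peak and the motor tap: tapping accuracy = |t(R-peak) − t(tap)|, where t(R-peak) is the time of the R-peak and t(tap) is the time of the motor tap in the same trial.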
Tapping accuracy was computed in each trial excluding the baseline period (i.e., from −50 to 600 ms, see below). Second, omissions were defined as the number of missed or skipped trials, that is, trials lacking motor taps.
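As a rough illustration of these two measures, the sketch below pairs each R-peak with a tap and counts trials without one. It assumes that a trial is the window from −50 to 600 ms around an R-peak and that the first tap inside that window is the response; both choices, like the function name `score_block`, are assumptions rather than details given in the text.

```python
def score_block(r_peaks, taps, window=(-0.050, 0.600)):
    """Return (per-trial accuracies in seconds, number of omissions).

    r_peaks, taps: sorted event times in seconds.
    window: assumed response window relative to each R-peak.
    """
    accuracies, omissions = [], 0
    for r in r_peaks:
        in_window = [t for t in taps if r + window[0] <= t <= r + window[1]]
        if in_window:
            # Assumption: the first tap in the window counts as the response.
            accuracies.append(abs(in_window[0] - r))
        else:
            omissions += 1  # trial lacking a motor tap
    return accuracies, omissions

# Toy data: four heartbeats, no tap after the third one
r_peaks = [0.0, 1.0, 2.0, 3.0]
taps = [0.25, 1.10, 3.40]
acc, missed = score_block(r_peaks, taps)
```

Here `acc` holds three absolute R-peak-to-tap intervals (about 0.25, 0.10, and 0.40 s) and `missed` is 1.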
Electrophysiological Recordings and Analysis
EEG signals were recorded with 129-channel HydroCel Sensors using a GES300 Electrical Geodesics amplifier at a sampling rate of 500 Hz. ECG signals were simultaneously recorded using Ag/Cl electrodes from a Polygraphy Input Box from Electrical Geodesics. The physical filters were set at 0.01–100 Hz for the recording acquisition. Later, EEG data were further filtered using a band-pass digital filter with a range of 0.5–40 Hz for event-related potential (ERP) analysis to remove any unwanted frequency components, but kept at 0.01–100 Hz for phase synchrony analysis. For ERP analysis, data were down-sampled to 250 Hz. During recording, the vertex was used as the reference electrode by default, but signals were re-referenced offline to linked mastoids. Two bipolar derivations were designed to monitor vertical and horizontal ocular movements (EOG). Eye movement contamination and other artifacts were removed from data for further processing using an independent component analysis (Delorme and Makeig 2004). For all datasets, independent components representing cardiac-field artifact were semi-automatically identified and subsequently removed as described elsewhere (Viola et al. 2009). The peaks of the ECG R-waves were detected offline and used as triggers for EEG segmentation to calculate the HEPs with the aid of customized Matlab functions. For HEP analysis, all EEG data were segmented into 800-ms epochs, including a −200- to −50-ms pre-stimulus baseline period, based on the R-peak markers. For phase synchrony analysis, 900-ms epochs were used: −400 to 500 ms relative to the R-peak. Trials that contained voltage fluctuations exceeding ±200 μV, transients exceeding ±100 μV, or electro-oculogram activity exceeding ±70 μV were rejected. The number of trials included did not differ between groups. All conditions yielded at least 87% artifact-free trials. The EEGLAB Matlab toolbox was used for data preprocessing and pruning (Delorme and Makeig 2004).
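The segmentation and amplitude-based rejection steps can be illustrated with a short, single-channel Python sketch. It is not the EEGLAB or Matlab code used in the study: the function name, the use of sample indices for R-peaks, and the restriction to the ±200 μV criterion (the transient and EOG criteria are omitted) are simplifying assumptions.

```python
def extract_epochs(signal, r_peak_samples, fs=250, tmin=-0.2, tmax=0.6,
                   max_abs_uv=200.0):
    """Cut R-peak-locked epochs from one channel (microvolts) and drop
    epochs whose absolute amplitude exceeds max_abs_uv."""
    start_off = round(tmin * fs)   # -50 samples at 250 Hz
    stop_off = round(tmax * fs)    # +150 samples at 250 Hz
    kept, rejected = [], 0
    for r in r_peak_samples:
        start, stop = r + start_off, r + stop_off
        if start < 0 or stop > len(signal):
            continue               # epoch runs off the recording
        epoch = signal[start:stop]
        if max(abs(v) for v in epoch) > max_abs_uv:
            rejected += 1
        else:
            kept.append(epoch)
    return kept, rejected

# Toy recording: flat signal with one 350 uV artifact at sample 600
signal = [0.0] * 1000
signal[600] = 350.0
kept, rejected = extract_epochs(signal, [100, 400, 600, 950])
```

In the toy data, two epochs of 200 samples (800 ms at 250 Hz) are kept, the epoch around sample 600 is rejected, and the last R-peak is skipped because its epoch would extend past the end of the recording.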
HEP Analysis and Source Reconstruction
For HEP analysis, the EEG epochs were baseline-corrected relative to a −200- to −50-ms time window, excluding the rising edge of the R-peak. For ERP analysis in the sensor space, 3 frontal regions of interest (ROIs) of 9–10 electrodes each were defined (Fig. 2A) based on previous studies (Pollatos and Schandry 2004; Pollatos et al. 2005). Cortical sources of subject-wise averaged ERPs for conditions of interest were reconstructed with Brainstorm (Tadel et al. 2011). The forward model was calculated using the OpenMEEG Boundary Element Method (Gramfort et al. 2010) on the cortical surface of a template MNI brain (colin27) with 1 mm resolution. The inverse model was constrained using weighted minimum-norm estimation (Baillet et al. 2001) to calculate source activation in picoampere-meters. To plot cortical maps, grand-averaged activation values were baseline corrected by subtracting the mean of the baseline period (−200 to −50 ms window) from each time point, and spatially smoothed with a 5-mm kernel. Subject-wise activation time courses were extracted at ROIs visually identified in the cortical maps. Time courses in pairs of conditions were compared to identify statistically significant temporal clusters using a FieldTrip-based (Oostenveld et al. 2011) analysis of one ROI at a time (see below).
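Baseline correction as described here amounts to subtracting the mean of the −200 to −50 ms window from every sample of an epoch. The following Python sketch shows the operation for a single epoch; the function name and the half-open treatment of the window edge at −50 ms are assumptions.

```python
def baseline_correct(epoch, fs=250, tmin=-0.2, base=(-0.2, -0.05)):
    """Subtract the mean of the baseline window from every sample.

    epoch: samples of one R-peak-locked epoch starting at tmin seconds.
    base: baseline window in seconds; the upper edge is excluded (assumption).
    """
    times = [tmin + i / fs for i in range(len(epoch))]
    base_vals = [v for t, v in zip(times, epoch) if base[0] <= t < base[1]]
    baseline_mean = sum(base_vals) / len(base_vals)
    return [v - baseline_mean for v in epoch]

# Toy epoch: 2 uV offset before the R-peak, 5 uV afterwards
epoch = [2.0] * 50 + [5.0] * 150
corrected = baseline_correct(epoch)
```

After correction the pre-R-peak samples of the toy epoch sit at 0 μV and the later samples at 3 μV.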
Local Field Potential Recordings
Direct cortical recordings were obtained with 128 stereotactically defined depth electrodes 2.3 mm long with 1-mm diameter cylinders and an interelectrode distance of 10 mm. The electrode strips were implanted in different regions of the frontal, central, and parietal cortices and subcortical structures. For the purposes of the current study, local field potentials (LFPs) were analyzed from the right anterior insula and adjacent regions, guided by previous functional magnetic resonance imaging (fMRI) and source EEG analyses of the HEP (Critchley et al. 2004; Pollatos et al. 2005). MNI coordinates of the depth electrodes were obtained from MRI and CT images using the SPM (Friston et al. 2007) and MRIcron (Rorden and Brett 2000) software packages. The exact MNI coordinates and cortical regions (gyri) of the selected electrodes are reported in Table 1. For HEP-LFP analysis, all intracranial LFP data were segmented into 800-ms epochs, including a −200- to −50-ms pre-stimulus baseline period, based on the R-peak markers. The LFP epochs were baseline-corrected relative to a −200- to −50-ms time window, which excluded the rising edge of the R-peak.
| Electrode | MNI coordinates (x; y; z) | Cortical region (gyrus) |
|---|---|---|
| 1 | 32; 20; 8 | R. insula |
| 2 | 7; 48; −2 | R. front. med. orb |
| 3 | 27; −4; 52 | R. precentral |
Phase Synchronization Analysis
For the analysis of time–frequency distributions and phase synchrony, the digitized signals were analyzed by means of a windowed Fourier transform by applying an FIR (350 order Hanning window) band-pass filter (10–100 Hz; window length: 128 ms, step 10 ms, window overlap 90%) within the 20- to 45-Hz frequency range. Signal windows were zero padded to 512 points to obtain an interpolated frequency resolution of approximately 1 Hz per frequency bin. For every time window and frequency bin, amplitude and phase values were computed as reported previously (Lachaux et al. 1999; Rodriguez et al. 1999; Melloni et al. 2007). Time–frequency charts of phase synchrony were normalized to a baseline before the stimulus onset. The normalization involves subtracting the baseline average and dividing by the baseline standard deviation on a frequency-by-frequency basis using a window from −400 to −50 ms relative to the R-peak.
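For readers unfamiliar with the measure, the sketch below shows the phase-locking value across trials for a single electrode pair and time–frequency bin, followed by the baseline z-normalization described above. It is a simplified illustration that assumes the phases have already been extracted; the function names and the use of the sample standard deviation are assumptions.

```python
import cmath
from statistics import mean, stdev

def plv(phases_a, phases_b):
    """Phase-locking value across trials for one time-frequency bin.

    phases_a, phases_b: per-trial phases (radians) at two electrodes.
    """
    vectors = [cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b)]
    return abs(sum(vectors) / len(vectors))

def z_normalize(values, times, base=(-0.4, -0.05)):
    """Z-score a synchrony time course (one frequency bin) against the
    baseline window, given in seconds relative to the R-peak."""
    ref = [v for t, v in zip(times, values) if base[0] <= t <= base[1]]
    mu, sd = mean(ref), stdev(ref)  # sample SD is an assumption
    return [(v - mu) / sd for v in values]

# A constant phase difference across trials gives a PLV of 1
phases_a = [0.1, 1.3, 2.0, -0.7]
phases_b = [p - 0.5 for p in phases_a]
locked = plv(phases_a, phases_b)

# Toy synchrony time course: the first three points form the baseline
z = z_normalize([1.0, 2.0, 3.0, 6.0, 2.0], [-0.4, -0.2, -0.1, 0.1, 0.2])
```

In the toy time course the baseline has mean 2 and standard deviation 1, so the value of 6 at 100 ms becomes a z-score of 4.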
It is still a subject of debate whether spurious volume conduction can mimic bona fide neural synchronization if the synchrony occurs with zero or π phase lag (Nolte et al. 2004; Vicente et al. 2008). This situation can occur when a single powerful dipole activates consistently at the same time across trials. In such an eventuality, the nearby electrodes show zero-lag phase-locking, and the distant ones show π-lag phase-locking. To counter this possibility, we used a method described elsewhere (Uhlhaas, Roux, et al. 2009). In this procedure, the windows exhibiting zero-lag and π-lag phase-locking were eliminated from the analysis. In particular, vectors representing phase differences of 0 ± 1° were multiplied by zero, thus effectively taking them out of the subsequent computations of phase-locking.
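One way to picture this correction is to give a weight of zero to every phase-difference vector that lies within 1° of zero lag before averaging. The Python sketch below follows that reading literally. It is an assumption-laden illustration, not the cited procedure: the function name, the optional `drop_pi` flag for π-lag vectors, and the choice to keep dividing by the total number of trials are all hypothetical.

```python
import cmath
import math

def plv_excluding_lags(phase_diffs, tol_deg=1.0, drop_pi=False):
    """Phase-locking value with near-zero-lag vectors multiplied by zero.

    phase_diffs: per-trial phase differences in radians.
    tol_deg: tolerance around zero lag (and around pi if drop_pi is True).
    """
    tol = math.radians(tol_deg)
    total = 0j
    for d in phase_diffs:
        wrapped = math.atan2(math.sin(d), math.cos(d))  # wrap to [-pi, pi]
        near_zero = abs(wrapped) <= tol
        near_pi = drop_pi and (math.pi - abs(wrapped)) <= tol
        weight = 0.0 if (near_zero or near_pi) else 1.0
        total += weight * cmath.exp(1j * wrapped)
    # Assumption: normalize by all trials, including the zeroed ones.
    return abs(total) / len(phase_diffs)

# Two of the four trials sit within 1 degree of zero lag and are zeroed
corrected_plv = plv_excluding_lags([0.0, 0.005, 0.5, 0.5])
```

With two of four vectors zeroed and the other two perfectly aligned, the corrected value in this example is 0.5.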
(1) For single-subject behavioral comparisons of tapping accuracy, we computed a paired t-test between pre- and post-feedback blocks.
(2) For behavioral comparisons of tapping accuracy between blocks, we computed separate one-way ANOVA for learners [irregular exteroception (Block 2) vs. pre-feedback interoception vs. post-feedback interoception] and for non-learners [irregular exteroception (Block 2) vs. pre-feedback interoception vs. post-feedback interoception]. Tapping accuracy scores were normalized with respect to the condition of regular exteroception (Block 1) to decrease the motor variance characteristic of each individual.
(3) For HEP comparisons between learners and non-learners, we computed a mixed-model ANOVA including 1 between-participants factor: learning (learners vs. non-learners), and 2 within-participants factors: interoception condition (pre-feedback vs. post-feedback) and ROI (left-frontal vs. centro-frontal vs. right frontal).
(4) For HEP comparisons of the objective performance, the mixed-model ANOVA included 1 between-participants factor: objective performance (learners vs. non-learners) and 1 within-participants factor: ROI (left-frontal vs. centro-frontal vs. right frontal).
(5) For HEP comparisons of the subjective performance, we used subjective performance (learners vs. non-learners) as the independent between-participants factor and 1 within-participants factor: ROI (left-frontal vs. centro-frontal vs. right frontal).
(6) For gamma phase synchrony comparisons between metacognitively congruent and incongruent groups, we computed a mixed-model ANOVA with 1 between-participants factor: metacognitive awareness (congruent vs. incongruent) and 1 within-participants factor: interoception condition (pre- vs. post-feedback).
(7) To analyze if HEP amplitude differed between metacognitively congruent and incongruent participants, the mixed-model ANOVA was tested with 1 between-participants factor: metacognitive awareness (congruent vs. incongruent) and 1 within-participants factor: ROI (left-frontal vs. centro-frontal vs. right frontal).
(8) For comparison of gamma phase synchronization between metacognitive awareness groups and experimental conditions, the mixed-model ANOVA included 1 between-participants factor: metacognitive awareness (congruent vs. incongruent) and 1 within-participants factor: condition (irregular exteroception vs. post-feedback interoception).
(9) For comparison of gamma phase synchronization in learners and non-learners according to their objective and subjective performance, the mixed-model ANOVA included 2 between-participants factors: objective performance (learners vs. non-learners), subjective performance (learners vs. non-learners); and 1 within-participants factor: condition (pre- vs. post-feedback).
Time windows for HEP (200–450 ms; Pollatos and Schandry 2004; Pollatos et al. 2005) and gamma phase synchronization (100–350 ms; Lachaux et al. 1999; Rodriguez et al. 1999; Melloni et al. 2007) were selected based on previous findings.
For EEG results, two-tailed t-tests were used to evaluate conditions and group differences. The statistical framework used throughout the analysis of the HEP is similar to what we previously described (Chennu et al. 2013); briefly, time windows of interest were compared in pairs of experimental conditions using temporal clustering analysis implemented in FieldTrip (Oostenveld et al. 2011). For each such pairwise comparison, epochs in each condition were averaged subject-wise. These averages were passed to the analysis procedure of FieldTrip, the details of which are described elsewhere (Maris and Oostenveld 2007). In short, this procedure compared corresponding temporal points in the subject-wise averages using one-tailed dependent-samples (for within-subject comparisons) or independent-samples (for between-subject comparisons) t-tests. Although this step was parametric, FieldTrip uses a nonparametric clustering method (Bullmore et al. 1999) to address the multiple comparisons problem. t-values of adjacent temporal points whose P-values were below 0.05 were clustered together by summating their t-values, and the largest such cluster was retained. This whole procedure, that is, calculation of t-values at each temporal point followed by clustering of adjacent t-values, was then repeated 1000 times, with recombination and randomized resampling of the subject-wise averages before each repetition. This Monte Carlo method generated a nonparametric estimate of the P-value representing the statistical significance of the originally identified cluster. The cluster-level t-value was calculated as the sum of the individual t-values at the points within the cluster. For phase synchrony analysis, mixed-model ANOVA was followed up with planned t-tests. The temporal window for the phase synchrony analysis was chosen based on previous results relating gamma phase synchronization with conscious awareness and perception (Lachaux et al. 1999; Rodriguez et al. 1999; Melloni et al. 2007).
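The temporal clustering procedure can be made concrete with a compact, pure-Python sketch for the within-subject case. It is a simplified illustration of the general approach of Maris and Oostenveld (2007), not FieldTrip's implementation: it handles only positive clusters, uses within-subject label swapping as the resampling scheme, and takes the critical t-value as a parameter. All function names and the toy data are hypothetical.

```python
import math
import random

def paired_t(diffs):
    """One-sample t statistic of within-subject differences."""
    n = len(diffs)
    m = sum(diffs) / n
    var = sum((d - m) ** 2 for d in diffs) / (n - 1)
    return m / math.sqrt(var / n)

def max_cluster(cond_a, cond_b, t_crit):
    """Largest summed t over runs of adjacent time points with t > t_crit."""
    best = run = 0.0
    for k in range(len(cond_a[0])):
        t = paired_t([a[k] - b[k] for a, b in zip(cond_a, cond_b)])
        run = run + t if t > t_crit else 0.0
        best = max(best, run)
    return best

def cluster_permutation_p(cond_a, cond_b, t_crit, n_perm=1000, seed=0):
    """Monte Carlo P-value of the largest observed cluster."""
    rng = random.Random(seed)
    observed = max_cluster(cond_a, cond_b, t_crit)
    exceed = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(cond_a, cond_b):
            if rng.random() < 0.5:  # swap condition labels within a subject
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        if max_cluster(perm_a, perm_b, t_crit) >= observed:
            exceed += 1
    return observed, exceed / n_perm

# Toy data: 14 subjects, 20 time points, effect at points 8-12 in condition A
rng = random.Random(1)
n_subj, n_times = 14, 20
cond_b = [[rng.gauss(0, 1) for _ in range(n_times)] for _ in range(n_subj)]
cond_a = [[rng.gauss(0, 1) + (3.0 if 8 <= k <= 12 else 0.0)
           for k in range(n_times)] for _ in range(n_subj)]
# 1.771 is the one-tailed 5% critical t-value for 13 degrees of freedom
observed, p_value = cluster_permutation_p(cond_a, cond_b, t_crit=1.771)
```

Summing t-values within a cluster and comparing only the largest cluster against its permutation distribution is what keeps the family-wise error rate under control without a point-by-point correction.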
All effect size values were calculated using Cohen's d (Cohen 1992).
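The text does not state which variant of Cohen's d was used for each test. As a hedged illustration, the sketch below computes the independent-groups version with a pooled standard deviation, together with the standard conversion from an independent-samples t statistic; the function names are hypothetical. With 14 learners and 19 non-learners, the conversion gives about 0.81 for |t| = 2.29 and about 1.53 for t = 4.35, close to the effect sizes reported alongside those between-group t-tests, which suggests, but does not confirm, that the between-group values were obtained in this way.

```python
import math
from statistics import mean, variance

def cohens_d(x, y):
    """Cohen's d for two independent groups using the pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)

def d_from_t(t, n1, n2):
    """Convert an independent-samples t statistic to Cohen's d."""
    return t * math.sqrt(1 / n1 + 1 / n2)

# Toy groups: means 4 and 3, pooled SD 2
d_toy = cohens_d([2, 4, 6], [1, 3, 5])

# Conversion applied to two between-group t-values with n = 14 and n = 19
d_variability = d_from_t(2.29, 14, 19)
d_accuracy = d_from_t(4.35, 14, 19)
```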
The statistical analyses of the interelectrode phase synchronization were performed on the grand average phase synchrony chart per experimental condition per subject. Then, those charts were grouped by condition and analyzed by means of a permutation test in search of time–frequency windows showing significant effects (Bullmore et al. 1999). Thus, difference maps for both the pre- and post-feedback conditions were compared relative to the irregular exteroception condition separately since phase synchrony was not related to the cardiac rhythm of the participants in the irregular exteroception condition. In the figures, synchrony between electrodes is indicated by lines, which are drawn only if the synchrony value is beyond a two-tailed probability of P < 0.01.
Behavioral Differences Between Interoceptive Learners and Non-learners
In our study of the neural mechanism of interoceptive learning, participants (n = 33) were instructed to tap in synchrony with the second sound of the heartbeat (the dub) before and after listening to the enhanced sound—via auditory feedback with a stethoscope—of their own heartbeat (Fig. 1A). Participants performed 2 consecutive blocks of interoceptive heartbeat perception before and after a single auditory feedback condition. About 42% (14/33) of participants showed a significant improvement in objective performance (single participant t-tests: P < 0.05), that is, accuracy of synchronous tapping to their heartbeat, after the auditory feedback condition (Fig. 1B and Supplementary Fig. 2A), based on which they were regarded as “learners.” At the group level, the learners' tapping accuracy did not differ between the 2 consecutive interoceptive blocks, either before (paired t-test; t(13) = −1.11, P = 0.275) or after feedback (paired t-test; t(13) = 0.48, P = 0.637). These findings suggest that learning was not due to task repetition, but that interoceptive perception improved with auditory feedback, probably through the unmasking of relevant sensory information. Heartbeat detection is not easy as the signal is faint; when provided with an enhanced signal of their own heart via the stethoscope, learners may be extracting the relevant sensory and perceptual information through perceptual sharpening or attentional tuning, showing an improvement in tapping performance during post-feedback sessions.
Addressing the possibility of a ceiling effect in a pre-feedback condition among non-learners, which may have led to post-feedback group differences, we computed a correlation between the extent of improvement (i.e., the difference between pre- and post-feedback accuracy) and the pre-feedback tapping accuracy, which showed no association (N = 33, r = 0.28, P = 0.17). Thus, participants who were relatively inaccurate in the pre-feedback condition did not show a higher improvement than those with high accuracy in the pre-feedback blocks, and vice versa. Thus, pre-feedback tapping accuracy does not explain post-feedback difference between learners and non-learners. Moreover, tapping variability (SEM) between groups proved to be lower for learners than non-learners in the pre-feedback condition (unpaired t-test; t(31) = −2.29; P = 0.028; Cohen's d = 0.80), suggesting that more consistent tapping to their own heartbeat contributed to the subsequent learning elicited by the auditory feedback (Fig. 1).
Furthermore, we also computed the number of omissions (i.e., trials where motor response was absent, see Materials and Methods) as a complementary behavioral measure of performance between the group of learners and non-learners. Omissions decreased in the post-feedback condition compared with the pre-feedback condition for learners (pre-feedback omissions = 5002; post-feedback omissions = 2683; paired t-test; t(13) = 3.73; P = 0.002; Cohen's d = 1.46), but not for non-learners (pre-feedback omissions = 5340; post-feedback omissions = 4826; paired t-test; t(18) = 1.61; P = 0.123), indicating improved interoceptive performance among learners.
HEP Modulation in Interoceptive Learners and Non-learners
Neural mechanisms and markers of interoceptive learning were investigated in both sensor and source spaces of the high-density EEG data. In sensor space, the group of interoceptive learners (but not the non-learners) showed an increase (negative in voltage) of the HEP amplitude in the post-feedback condition compared with the pre-feedback condition in the right and centro-frontal ROI (F1,31 = 13.13; P = 0.001; Cohen's d = 1.56; Fig. 2). Given that learners showed lower tapping accuracy than non-learners in the pre-feedback condition (t(31) = 4.35, P < 0.001; Cohen's d = 1.53), a possible confounding influence of pre-feedback tapping variance on the HEP differences between the groups was tested with an ANCOVA (tapping accuracy as a covariate). The difference in HEP amplitude between the pre- and post-feedback conditions remained significant for learners after controlling for tapping accuracy (right frontal ANCOVA: F1,30 = 13.73; P = 0.001; tapping accuracy regressor: P = 0.412). Furthermore, HEP amplitude in the pre-feedback condition did not differ between learners and non-learners in the left, central, or right ROIs (Supplementary Fig. 3).
HEP Modulation in Objective and Subjective Performance
When pre- and post-feedback conditions were taken together, we found higher amplitude of HEP in the group of interoceptive learners versus non-learners in the left, right, and central frontal pre-defined ROIs (F1,31 = 3.08; P = 0.016; Cohen's d = 0.61; Fig. 3A–C). These findings demonstrate that cardiac interoceptive learning is associated with changes in HEP amplitude, which might be driven by neural processing in the insular and the anterior cingulate cortex, convergent with results from previous EEG and fMRI studies (Critchley et al. 2004; Pollatos et al. 2005). The results from source analysis and intracranial recordings confirm and extend these findings.
To test the HEP changes to subjective report of performance (“How accurate were you when tapping to your heartbeat?”), instead of objective tapping performance, we compared the HEP's amplitude between those participants reporting better performance on a subjective scale after the feedback condition relative to the pre-feedback condition (see Materials and Methods), with those claiming no improvement after the feedback. Unlike the objective performance results, there were no significant differences in the HEP when participants were grouped according to their subjective performance (F1,31 = 0.46; P = 0.50; Fig. 3D–F). Thus, the increase of the HEP amplitude is likely to be driven by the objective improvement of performance, and independent from its subjective report.
HEP Source Reconstruction and Direct Cortical Recordings in Interoceptive Learners and Non-learners
In the source space, the minimum-norm estimate analysis showed significantly higher activity in the post-feedback condition compared with the pre-feedback condition in the right frontal operculum area (a signal also covering the insular cortex underneath) for the group of objective interoceptive learners (Fig. 4A–C), but not for the non-learners (Fig. 4D). Furthermore, the right insular source of the HEP was confirmed by the intracranial data obtained from an epileptic patient performing the same task. The direct cortical recordings (LFPs; Fig. 5A–C) showed a HEP-like waveform measured directly from the right anterior insular cortex, with an inversion of potential, possibly from the same source, in the orbitofrontal cortex. The evoked potential locked to the heartbeat showed very little local field activity in the other electrode contacts, especially in the motor and premotor cortices (Fig. 5D), suggesting spatial convergence between the directly recorded local brain activity and the activity inferred by the inverse solution. It also points to a local hub in the insular–orbitofrontal network processing cardiac interoception.
Metacognitive Awareness, Interoception, and Gamma Phase Synchrony
In the absence of established EEG markers of metacognitive learning, we hypothesized that if there is an overlap between brain networks involved in conscious access and subjective experience (Seth et al. 2008), we can use gamma band phase changes to track the modulation in metacognition of learning for this interoceptive study. Thus, we expected the metacognitively congruent group to have higher gamma phase-locking values between pre-feedback and post-feedback conditions, when compared with the metacognitively incongruent group. To test this hypothesis of increased cortical synchrony with metacognitive awareness of interoceptive learning (post experimental subjective report of performance), participants were also grouped according to the congruency between their objective performance accuracy and the post-test verbal report: A metacognitively congruent group (n = 17: 8 learners and 9 non-learners) and a group with low metacognitive congruency (n = 16: 6 learners and 10 non-learners; a metacognitively incongruent group with low introspection of their performance, see Materials and Methods).
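The grouping described above can be sketched in a few lines. This is only an illustration under a stated assumption: a participant is treated as metacognitively congruent when the subjective report of improvement agrees with the objective learner status. The exact criterion is given in Materials and Methods and may differ. The function name and the participant table are hypothetical, and the table is constructed only to reproduce the group sizes reported in the text.

```python
from collections import Counter


def congruency_group(objective_learner: bool, reports_improvement: bool) -> str:
    """Assumed rule: congruent when the subjective report matches the objective change."""
    return "congruent" if objective_learner == reports_improvement else "incongruent"


# Hypothetical (objective_learner, reports_improvement) pairs, built only to
# match the group sizes given in the text; these are not the study's data.
participants = (
    [(True, True)] * 8       # learners who report improvement
    + [(False, False)] * 9   # non-learners who report no improvement
    + [(True, False)] * 6    # learners who report no improvement
    + [(False, True)] * 10   # non-learners who report improvement
)

group_sizes = Counter(congruency_group(obj, subj) for obj, subj in participants)
print(group_sizes)
```

Under this rule, congruency is orthogonal to learning: both groups contain learners and non-learners, which is what allows metacognition to be analyzed separately from objective performance.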
First, a mixed-model ANOVA of the gamma phase synchrony (30–45 Hz) revealed a significant interaction between metacognition and interoception conditions (F1,31 = 9.10; P = 0.006; Cohen's d = 1.05). As predicted, participants showing congruency between interoceptive performance and its metacognition showed increased gamma phase synchrony in the post-feedback condition (Fig. 6A–C, 100–350 ms after the heartbeat R-peak; paired t-test, t(16) = 2.18; P = 0.038; Cohen's d = 0.75). In contrast, the metacognitively incongruent group showed a decrease in gamma phase synchrony (Fig. 6D–F, paired t-test, t(15) = −2.47; P = 0.021; Cohen's d = 0.88). These findings indicate that auditory feedback (unmasking of heartbeat information) differentially modulates gamma phase synchrony depending on the (in)congruency of metacognitive awareness. Interestingly, neither objective nor subjective performances were associated with changes in gamma phase synchrony; a mixed-model ANOVA of gamma phase synchrony (30–45 Hz) revealed no significant interaction between performance and conditions (F1,26 = 0.159; P = 0.694; Supplementary Fig. 4).
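For readers unfamiliar with phase synchrony measures, the following is a minimal sketch of one common definition, the phase-locking value (PLV) between two sensors across trials. It is not the analysis pipeline used in this study: band-pass filtering to 30–45 Hz, phase extraction, and the choice of sensor pairs are described in Materials and Methods and are not reproduced here. The phases below are simulated, and all names are hypothetical.

```python
import cmath
import math
import random


def phase_locking_value(phases_a, phases_b):
    """PLV = |mean over trials of exp(i * (phase_a - phase_b))|, a value between 0 and 1."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("need two equally long, non-empty phase lists")
    unit_vectors = [cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b)]
    return abs(sum(unit_vectors) / len(unit_vectors))


rng = random.Random(0)  # fixed seed so the simulation is reproducible
n_trials = 200

# Simulated phase (radians) of sensor A at one time point, one value per trial.
sensor_a = [rng.uniform(-math.pi, math.pi) for _ in range(n_trials)]
# Sensor B follows A with a constant 0.5 rad lag plus small jitter: strong locking.
locked_b = [a - 0.5 + rng.gauss(0.0, 0.2) for a in sensor_a]
# Sensor B has a phase unrelated to A: weak locking.
unlocked_b = [rng.uniform(-math.pi, math.pi) for _ in range(n_trials)]

plv_locked = phase_locking_value(sensor_a, locked_b)
plv_unlocked = phase_locking_value(sensor_a, unlocked_b)
print(round(plv_locked, 2), round(plv_unlocked, 2))
```

The PLV depends only on the consistency of the phase difference across trials, not on signal amplitude, so a consistent lag yields a value near 1, whereas phase differences that vary from trial to trial average toward 0.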
To characterize the topography of the activation patterns in the metacognitively congruent and incongruent groups, spatial distributions of gamma phase synchrony (30–45 Hz) were computed for the pre- and post-feedback conditions in the interval from 100 to 400 ms after R-peak. In the metacognitively congruent group (Fig. 6C), a lateralized pattern of synchronization between left-frontal sensors was observed in both the pre- and post-feedback conditions, with a higher degree of interhemispheric connections in the post-feedback condition. In contrast, this left-frontal pattern of phase synchrony was not observed in the metacognitively incongruent group (Fig. 6F). These results suggest that the congruency (but not the incongruency) between objective performance and its metacognition is associated with a widespread pattern of predominantly left-frontal phase synchronization.
However, it is not yet fully clear whether the metacognitive differences in gamma phase synchronization represent a specific correlate of interoceptive awareness or a broader, general metacognitive process. To investigate the latter, participants were separated into metacognitively congruent and incongruent groups, but this time based on their awareness of performance in the external irregular heartbeat condition (exteroception; Fig. 1A), where the detection of the sound is not based on interoceptive information but on an external auditory signal. These groups did not differ in the external heartbeat-induced gamma phase synchrony (unpaired t-test, t(31) = 0.09; P = 0.929). To directly test whether synchrony changes are specific to the metacognition of interoception, a mixed-model ANOVA was performed, showing a significant interaction between metacognitive awareness and extero/intero conditions (F1,31 = 7.84; P = 0.010; Cohen's d = 0.98), suggesting that increased gamma synchronization may be specifically related to interoceptive metacognitive awareness.
To further explore the dissociation of metacognitive awareness and objective performance, a mixed-model ANOVA was used to test the putative modulatory effect of congruency of metacognitive awareness on the HEP (shown to be modulated by performance, see Fig. 2). In contrast to the neural synchrony results, there were no significant differences in HEP between metacognitive awareness groups (F2,62 = 0.128; P = 0.88). This pattern of results clearly ties the HEP to heartbeat detection, and gamma band synchrony to metacognitive awareness of this detection.
Here, we have shown behavioral and neural dissociation between learning to follow one's own heartbeat and its metacognitive awareness, in a heartbeat-tapping task before and after auditory feedback. First, EEG amplitude of the HEP in interoceptive learners was higher compared with non-learners signaling a change in the weight of the interoception central network. Second, source localization in a group of participants and direct cortical recordings in a single patient showed a network hub for interoceptive learning in the insular cortex. And third, gamma phase synchrony (30–45 Hz; but not HEP amplitude) increased in participants showing agreement between objective interoceptive performance and metacognitive awareness.
These findings suggest that once the processing of interoceptive information becomes associated with cortical networks supporting large-scale gamma phase synchronization, conscious access to the accuracy of performance becomes possible. Notably, this is the first study, to our knowledge, evaluating a possible dissociation between objective performance and its metacognition in relation to gamma phase synchrony. We found that a decrease in gamma phase synchrony may reflect the lack of awareness of performance accuracy, even in those participants who showed improvement in cardiac interoceptive performance (and in the HEP neural marker). Thus, interoceptive learning may take place either consciously, as gamma phase synchronization increases, or unconsciously, if synchronization decreases, as happened for participants showing a dissociation between their interoceptive performance and their metacognition of learning. The group displaying incongruent metacognition of learning constitutes a puzzling phenomenon: How could auditory feedback be associated with the impairment of metacognition of learning, such that participants performing well feel that they are doing poorly (and vice versa)? A plausible explanation could be that during feedback some of the participants lose focus and start paying attention to interoceptive signals that are irrelevant to the task. This disengagement of attention could explain the decrease of post-feedback phase synchrony (Fig. 6E), as attention to processes poorly time-locked to the heartbeat would result in destructive summation of synchrony values.
The long-distance pattern of phase synchrony between left-frontal and temporo-parietal sensors found in this study arises as a plausible, and much needed, neural marker of the metacognition of interoception. Supporting this interpretation, fMRI studies have shown that retrospective judgments of confidence correlate with the activation of anterior prefrontal cortex in visual tasks (Fleming and Dolan 2012). Specifically, the pattern of phase coherence between distant sensor pairs, with predominance in the left-frontal region, supports the view that the integration and maintenance of interoceptive information facilitates accurate metacognitive report.
How do interoceptive awareness and metacognition of learning arise in the process of following one's own heart? Pasquali et al. (2010) proposed that we gain awareness of what to monitor and what to attend to as we learn to use knowledge from unconscious (lower-order) processes to create higher-order representations that inform us about the state and landscape of our own internal states. This is in line with our proposal of a higher-order layer of predictive coding for interoceptive signals, discussed below.
It is important to note that metacognition can be defined and studied using either the “global ratings” of overall performance, evaluated at the end of a block of trials, or the “local ratings” collected on a trial-by-trial basis. The present study evaluated global metacognition of interoceptive learning. Given that studies of global and local metacognition have found divergent results using the 2 measures, suggesting that these 2 forms of metacognition may be driven by different mechanisms (Gallo et al. 2012), current results pertain to the global but not necessarily to the local metacognitive judgments of interoception. Further studies may investigate the relationship between gamma phase synchronization and local metacognition.
According to current instantiations of the predictive coding framework (Friston 2009; Feldman and Friston 2010; Seth 2013), successive layers of cortical information processing can be best seen as neural instantiations of a hierarchy of increasingly complex predictive models of the external world. Models at each of these levels generate predictions about the expected pattern of sensory activations at the next lower level, and failures in prediction project upward to the next level as a prediction error. These residual error signals are then used to update the predictive models to be more accurate reflections of reality.
Interoception, as recently conceptualized within hierarchical predictive coding (Seth 2013), can be seen as the generation of predictions about internal physiological states of the body. These top-down predictions are based on plausible probabilistic explanations of body states, and are themselves inferred and updated on the basis of past interoceptive signals (the previous heartbeats and taps). Furthermore, these predictions can be improved by the enhanced perceptual acuity elicited by auditory feedback. Consequently, feedforward interoceptive prediction errors reflect failures in accurate prediction of body states, in our case the timing of the heartbeat.
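The idea that predictions of heartbeat timing are updated by prediction errors can be illustrated with a deliberately simple sketch. It uses a single-level delta rule, in which the prediction moves by a fixed fraction of each error. This is a simplification chosen for illustration, not a full hierarchical Bayesian predictive coding model and not a model fitted in this study. The inter-beat intervals, the initial prediction, and the learning rate are all hypothetical.

```python
def track_interbeat_interval(intervals_ms, initial_prediction_ms, learning_rate):
    """Update one predicted inter-beat interval by a fraction of each prediction error."""
    prediction = initial_prediction_ms
    abs_errors = []
    for observed in intervals_ms:
        error = observed - prediction        # prediction error for this beat
        abs_errors.append(abs(error))
        prediction += learning_rate * error  # move the prediction toward the observation
    return prediction, abs_errors


# Hypothetical inter-beat intervals in milliseconds (roughly 75 beats per minute).
observed_intervals = [800, 810, 790, 805, 795, 800, 808, 792, 801, 799]

# The starting prediction is deliberately poor (1000 ms) so the learning is visible.
final_prediction, abs_errors = track_interbeat_interval(
    observed_intervals, initial_prediction_ms=1000.0, learning_rate=0.5
)
print(round(final_prediction, 1), [round(e, 1) for e in abs_errors])
```

In this toy setting, the absolute prediction error shrinks as the prediction approaches the typical interval, which is the sense in which minimizing prediction error corresponds to learning.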
Recently, it has been proposed that minimization of prediction errors will update the posterior probabilities and may induce changes in priors (perceptual learning) in an interoception task involving the Rubber Hand Illusion (Suzuki et al. 2013). Thus, in our experiment, statistical correlations between interoceptive signals per se, auditory and sensory faint detections of the heartbeat through the ribcage, could have led to an update of predictive models of self-signals through minimization of prediction error, resulting in learning.
Going further, we propose that metacognitive processes can integrate information about predictions of internal body states over longer time scales, and generate higher-order expectations about one's ability to generate accurate predictions about body states. In our case, this corresponds to reflecting on our own ability to track the timing of our heartbeats. This additional layer of prediction and prediction error, likely to be generated by broadly distributed frontoparietal networks, would be responsible for the monitoring of interoceptive performance and the conscious report thereof. This is a putative mechanism that allows us to evaluate our own predictive behavior. Specifically, individuals with “good” metacognition generate accurate predictions about their performance at this higher-order layer, whether or not they learn to predict interoceptive body states at the lower layer. Importantly, this multilevel framework allows for predictions generated at the first (interoceptive) and second (metacognitive) layers to be independent in terms of their accuracy: hence, good/bad learners can both be metacognitively aware/unaware.
Source reconstruction findings were consistent with the HEP increase in frontal regions in sensor space for the group of learners compared with non-learners, adding spatial information and confirming the temporal dimension of the effect seen in the ERPs. Notably, the right insular cortex, a proposed candidate source for the HEP (Pollatos et al. 2005), is related structurally (Cerliani et al. 2012) and functionally (Cauda et al. 2011) to the fronto-temporal operculum. Moreover, the source activity found in the operculum might be a reflection of a deeper source (the anterior insula itself). Thus, our results suggest that cardiac interoceptive learning may be mediated by computations reflected in increased cortical activity in the insular cortex.
The insular cortex and adjacent and related networks have been implicated in the awareness of a myriad of visceral perception signals such as thirst, dyspnea, “air hunger,” the Valsalva maneuver, sensual touch, itch, penile stimulation, sexual arousal, coolness, warmth, exercise, heartbeat, wine-tasting (in sommeliers), and distension of the bladder, stomach, rectum, or esophagus (Craig 2009). Importantly, the insula is also thought to be involved in the processing of subjective feelings about some of these interoceptive signals (Craig 2009), such as the sensation of warmth (Craig et al. 2000) or satiety (Stephan et al. 2003). The fMRI activity related to these subjective evaluations represents neural correlates of metacognitive evaluations of percepts, and constitutes a body of evidence pointing to a putative key role of the insula not only in interoception but also in interoceptive awareness. In this study, we extend these findings in 3 ways: First, we introduce a second dimension of metacognition, that is, the comparative evaluation of 2 metacognitive evaluations, or metacognition of interoceptive learning; second, we show that it can be tracked by a global direct brain measure, gamma phase synchrony; third, we show that this metacognition of learning to follow one's own heart can be dissociated from the objective performance.
A cautionary note is needed to interpret the modulation of the HEP with feedback accurately. It is well established that positive and negative deflections in voltage measured by scalp EEG sensors cannot be interpreted as “excitation” or “inhibition” of cortical pyramidal neurons, respectively (Niedermeyer and da Silva 2005). For instance, inhibitory synapses acting in the soma and excitatory synapses acting in the apical dendrites of pyramidal neurons can both generate a negative deflection on scalp sensor voltages. This means that the increment in negative voltage observed in the group of interoceptive learners relative to non-learners after the auditory feedback should not be interpreted as an increase in the excitatory synaptic activity in the insular/opercular cortex, but rather as an increment in the overall synaptic activity in this region (i.e., excitatory and/or inhibitory synaptic activity).
We believe that our short and simple testing protocol provides an elegant manner in which to test interoceptive proficiency (and whether it can be changed by training). Since there is evidence that interoceptive awareness can be modulated and that this interacts with self-regulation in a variety of psychological and neurological disorders (Pollatos et al. 2007; Garfinkel et al. 2013), the prospects of being able to measure changes in interoceptive processing (and the awareness of it) with training could potentially allow clinicians and researchers to implement diagnostic and prognostic tools, test interventions and treatments, profiting from quantitative methods for both objective and subjective performance. We have defined the behavioral and brain signals to evaluate objective and subjective interoceptive learning when trying to follow your heart.
Taken together, these results suggest that interoceptive perception can be trained but also that this learning may be unconscious, since the agreement between performance and metacognition was split between participants. HEP, a cortical representation of cardiac afference, is also a marker of enhanced interoceptive performance, but not its awareness. Metacognition of learning of cardiac interoception requires the emergence of a large-scale gamma phase synchronization that may mediate the availability of interoceptive information to conscious access.
This research was supported by a Wellcome Trust Biomedical Research Fellowship WT093811MA, the Chilean National Fund for Scientific and Technological Development Grant 1130920, the Argentinean National Research Council for Science and Technology, and the Argentinean Agency for National Scientific Promotion, FONCyT-PICT 2012-0412 and FONCyT-PICT 2012-1309. Funding to pay the Open Access publication charges for this article was provided by Wellcome Trust Biomedical Research Fellowship (WT093811MA).
We thank John Deeks, Russell Thompson, and Alejandro Chandía for assisting with technical support and stimuli preparation, Anil K. Seth for contributing valuable discussions and insights, the reviewers for their helpful comments, and The Tivoli's crew in Cambridge for the energy supply while this study was performed. Conflict of Interest: None declared.