This paper presents a model-based investigation of mechanisms underlying the reduction of mismatch negativity (MMN) amplitudes under the NMDA-receptor antagonist ketamine. We applied dynamic causal modeling and Bayesian model selection to data from a recent ketamine study of the roving MMN paradigm, using a cross-over, double-blind, placebo-controlled design. Our modeling was guided by a predictive coding framework that unifies contemporary “adaptation” and “model adjustment” MMN theories. Comparing a series of dynamic causal models that allowed for different expressions of neuronal adaptation and synaptic plasticity, we obtained 3 major results: 1) We replicated previous results that both adaptation and short-term plasticity are necessary to explain MMN generation per se; 2) we found significant ketamine effects on synaptic plasticity, but not adaptation, and a selective ketamine effect on the forward connection from left primary auditory cortex to superior temporal gyrus; 3) this model-based estimate of ketamine effects on synaptic plasticity correlated significantly with ratings of ketamine-induced impairments in cognition and control. Our modeling approach thus suggests a concrete mechanism for ketamine effects on MMN that correlates with drug-induced psychopathology. More generally, this demonstrates the potential of modeling for inferring on synaptic physiology, and its pharmacological modulation, from electroencephalography data.
The mismatch negativity (MMN) is an event-related potential (ERP), measured with electrophysiological techniques such as electroencephalography (EEG) or magnetoencephalography (MEG), observed in response to the violation of a statistical regularity. Operationally, it is defined as the difference waveform obtained by subtracting the ERP to predicted (“standard”) stimuli from the ERP to unpredicted (“deviant”) stimuli. While it has been studied most intensively in the auditory domain, it has also been elicited using visual and somatosensory stimuli (Astikainen et al. 2004; Czigler et al. 2004). Ever since its initial description in 1978 (Naatanen et al. 1978), the MMN has played an increasingly important role in cognitive neuroscience. Traditionally, it was seen to reflect a basic process of memory trace formation (Naatanen et al. 2001), which enables automatic, pre-attentive novelty or change detection (Tiitinen et al. 1994; Naatanen 2000). More recently, the MMN has been interpreted as an electrophysiological index of surprise or prediction error and treated as a paradigmatic example of perceptual inference and learning within a general hierarchical Bayesian framework of brain function, namely, predictive coding (Rao and Ballard 1999; Friston 2005), which can be regarded as an instance of the free-energy principle (Friston 2009).
Beyond cognitive neuroscience and theories of brain function, the MMN has attracted a lot of attention because it is altered in several brain disorders (Naatanen 2003) with, in particular, significant reductions in schizophrenia patients (Baldeweg et al. 2002, 2004; Umbricht and Krljes 2005). The MMN is well-suited as a potential index of pathophysiology because it can be obtained with relatively little effort and is robust against a number of factors that can confound the interpretation of diagnostically relevant measures from cognitive paradigms, such as attentional state or vigilance (Naatanen et al. 2001). The clinical utility of the MMN is further established by remarkably consistent findings from neuropharmacological studies, rendering it potentially informative with regard to pathophysiology and treatment: Over the past 2 decades, numerous pharmacological experiments in animals and humans have indicated that MMN expression can be strongly reduced by antagonizing NMDA receptors (NMDAR; Javitt et al. 1996; Umbricht et al. 2000; Kreitschmann-Andermahr et al. 2001; Ehrlichman et al. 2008; Heekeren et al. 2008).
Understanding the NMDAR-dependence of the MMN is best pursued within a comprehensive theory of the physiological and computational mechanisms that generate MMN responses. One such theory is the so-called “adaptation hypothesis”, which postulates that the MMN arises from adaptation mechanisms in tonotopically organized parts of the auditory system, that is, neurons in the primary auditory cortex that are repeatedly excited by auditory stimuli of the same frequency (May et al. 1999; Jääskeläinen et al. 2004; Ulanovsky et al. 2004). Biophysically, this appeals to mechanisms such as rapid synaptic depression (Zucker and Regehr 2002) or spike-frequency adaptation (Faber and Sah 2003). An alternative perspective on MMN mechanisms is the “model adjustment hypothesis”, which views the MMN as a response reflecting the update of an environmental model that is represented by a network of temporo-frontal areas and reconfigures in the light of unexpected sensory events (Winkler et al. 1996; Winkler 2007). Neurophysiologically, this theory speaks to the importance of short-term plasticity of glutamatergic long-range connections between temporal and frontal areas. More recently, a “free-energy theory” of the MMN was formulated that unifies the adaptation and model adjustment hypotheses and suggests an overarching physiological and computational process that requires both adaptation that is intrinsic to cortical sources and short-term plasticity in extrinsic connections between sources (Garrido et al. 2008; Garrido, Kilner, Stephan, et al. 2009). This theory interprets the MMN as a prediction error signal (generated by pyramidal neurons in supragranular layers) during predictive coding in the auditory processing hierarchy, where each level attempts to minimize the discrepancy between bottom-up inputs from the level below and top-down predictions from the level above.
By recurrent message passing across levels and prediction error-dependent synaptic plasticity, this circuit minimizes free-energy (an approximation to the information theoretic measure of surprise) across the entire hierarchy, enabling inference about the causes of sensory input and optimal learning about statistical regularities (Friston 2009). Importantly, the free-energy theory of the MMN incorporates the 2 key physiological mechanisms implied by the 2 previous theories: local adaptation and short-term plasticity of inter-regional glutamatergic synaptic connections. The former controls the post-synaptic gain of neurons encoding prediction error (such that inputs with low precision or high uncertainty have less impact on predictions), while the latter optimizes inter-regional synaptic weights during learning and thus regulates the transmission of predictions and their errors across the hierarchy.
While the free-energy principle is a very generic theory of brain function (Friston 2010), it has been particularly useful for framing studies of MMN mechanisms. Of particular relevance for the present study is that both of the key processes described above—adaptation and short-term plasticity of glutamatergic connections—are regulated by NMDARs. The free-energy formulation thus offers an opportunity to address the question, from a model-based perspective, which synaptic mechanism may underlie the empirically well-established NMDAR-mediated reduction of the MMN. It is conceivable that this effect could be expressed entirely at the level of neuronal adaptation because spike-frequency adaptation results from potassium channel-dependent hyperpolarization which, in turn, relies on intracellular calcium influx that is modulated by NMDAR status (Faber and Sah 2003). On the other hand, it is equally conceivable that the NMDAR-mediated MMN reduction results from aberrant short-term plasticity of inter-regional glutamatergic connections. This is because activation of NMDARs can lead to rapid changes in the strength of glutamatergic synapses, for example, via phosphorylation of AMPA receptors (AMPARs; Wang et al. 2005). A third option is that both mechanisms contribute to empirically observed NMDAR effects on MMN expression.
Clearly, these competing accounts cannot be disentangled by traditional ERP analyses that rely on simple subtraction of evoked responses. Instead, we need to evaluate the relative plausibility of different physiologically interpretable models that can be fitted to empirically measured MMN responses. This allows us to assess the relative contributions from adaptation and short-term plasticity of glutamatergic connections, and how they change under NMDAR antagonists.
A general framework for model-based assessment of competing theories about neuronal circuits is dynamic causal modeling (DCM; Friston et al. 2003; Stephan et al. 2010). DCM is a generic Bayesian system identification technique that has gained popularity in neuroimaging and electrophysiology over the past few years and allows for inference on “hidden” neurophysiological mechanisms that generated observed measures, such as blood oxygen-level–dependent signal in functional magnetic resonance imaging (fMRI) or evoked responses measured with EEG. The key idea here is to formulate a simplified model of neuronal population responses and combine this with a modality-specific forward model such that one can predict the measurement that would arise from any particular neuronal circuit. Given such a generative model and known experimental perturbations (stimuli), one can invert the model and thereby compute the posterior probability of the model parameters, given the data. Furthermore, alternative models embodying competing hypotheses about the mechanisms generating the data can be evaluated using their model evidence, a principled measure of the balance between model fit and model complexity (Penny et al. 2004).
While DCM has been formulated for different modalities (cf. Stephan et al. 2007; Kiebel et al. 2009), its current implementation for ERPs represents a neural mass formulation of interacting cortical sources (David et al. 2006), with distinct representations of adaptation and synaptic plasticity that have proven very useful in previous MMN studies (Kiebel et al. 2007; Garrido et al. 2008; Garrido, Kilner, Kiebel, et al. 2009). In this paper, we use this implementation of DCM for inferring on the physiological mechanisms that underlie empirically measured reductions of MMN amplitude under the influence of the NMDAR antagonist ketamine. The data analyzed here were from a recent study by Schmidt et al. (2012) who examined 19 healthy volunteers with a roving MMN paradigm (Haenschel et al. 2005; Garrido et al. 2008) on 2 sessions, using a cross-over, double-blind, placebo-controlled design. As previously reported, the ERP analyses of these data indicated a significant reduction of the MMN at fronto-central electrodes following ketamine administration (Schmidt et al. 2012), which is consistent with a number of previously published reports in humans, using conventional (non-roving) MMN paradigms (Umbricht et al. 2000, 2002; Heekeren et al. 2008).
In this paper, we apply DCM and Bayesian model selection (BMS) to the data from Schmidt et al. (2012) to address the following 3 questions. First, can we replicate the results of previous (non-pharmacological) DCM studies of the MMN that evaluated a set of competing models inspired by the free-energy formulation (Garrido et al. 2008; Garrido, Kilner, Stephan, et al. 2009)? Secondly, which of the 2 mechanisms of interest—intrinsic adaptation or extrinsic short-term glutamatergic plasticity—contributes most to explaining ketamine effects on MMN expression, and are these mechanisms regionally specific? Finally, can we validate our model-based approach by finding a correlation between model parameter estimates and cognitive measures altered by ketamine?
In brief, 1) we replicate previous DCM results from non-pharmacological MMN studies that both adaptation and short-term plasticity are necessary to explain MMN expression per se, 2) we find significant ketamine effects on synaptic plasticity, but not adaptation, and observe a selective ketamine effect on forward connections from the left primary auditory cortex, and finally, 3) we observe a significant correlation between ketamine-induced changes in plasticity of auditory forward connections and introspective measures of cognition.
Materials and Methods
The participants, drug administration, and data acquisition have previously been described in Schmidt et al. (2012) and the interested reader is referred to this paper for details. Here, we summarize the most important aspects and provide details of data analysis with DCM and BMS.
This study was approved by the Ethics Committee of the University Hospital of Psychiatry, Zurich. After receiving written and oral descriptions of the aim of the study, all participants gave written informed consent statements before inclusion in the study. The use of psychoactive drugs was approved by the Swiss Federal Health Office, Department of Pharmacology and Narcotics (DPN), Bern, Switzerland.
Healthy subjects were recruited at the local university and technical college through advertisement (N = 19; male: 12, mean age = 26 ± 5.09 years). The study of Schmidt et al. (2012) also included a group of subjects receiving psilocybin versus placebo. As no effect of psilocybin on the MMN was found, this group was not included in the present study.
Prior to inclusion, the subjects' physical health was confirmed by medical history, clinical examination, electrocardiography, and blood analysis. To ascertain the subjects' mental status, all subjects were screened by the diagnostic expert system (Wittchen and Pfister 1997), using a semi-structured psychiatric interview, and the Hopkins Symptom Checklist (SCL-90–R; Derogatis 1994). Furthermore, subjects also underwent the Mini-International Neuropsychiatric Interview (MINI), a short structured psychiatric interview (Sheehan et al. 1998). We verified the absence of a history of drug dependence or present drug abuse by urine drug screening and a questionnaire that assessed previous drug consumption.
Drug Administration and Psychometric Assessment of S-Ketamine State
Subjects underwent 2 sessions (placebo and S-ketamine) in a counterbalanced fashion at an interval of at least 2 weeks. Both subjects and the principal investigator were blind to drug order. Subjects were monitored and under constant supervision until all drug effects had worn off, and were then released into the custody of a partner or immediate relative. For the S-ketamine/placebo infusion, an in-dwelling catheter was placed in the antecubital vein of the non-dominant arm. Once the subject was ready, a bolus injection of 10 mg over 5 min was delivered. Following a 1 min break, a continuous infusion with 0.006 mg/kg per min was administered over 80 min. To keep S-ketamine plasma level fairly constant, the dose was reduced every 10 min by 10% (Feng et al. 1995; Vollenweider et al. 1997). In the placebo session, the same procedure was followed and an infusion of physiological sodium chloride solution and 5% glucose was administered.
The Altered State of Consciousness (ASC) questionnaire, a visual analog and self-rating scale, was used to assess the subjective effects of S-ketamine (Dittrich 1998). A recent evaluation study of the ASC questionnaires has constructed 11 new lower order scales (Studerus et al. 2010), which were analyzed in this study.
Electroencephalographic (EEG) activity was recorded during an auditory “roving” oddball paradigm, originally developed by Cowan et al. (1993) and subsequently modified by Baldeweg et al. (2004), to assess the MMN response. Acoustic stimuli were generated using the E-prime software (Schneider et al. 2002) and were presented binaurally through headphones.
The stimuli consisted of seamlessly connected trains of pure sinusoidal tones with a roving frequency structure. Within each stimulus train, all tones were of one frequency and were followed by a train of tones of a different frequency. The first tone of a train represented the deviant, which became a standard tone after a few repetitions. Therefore, the deviant and standard tones had exactly the same physical properties within one stimulus train, differing only in the number of times they had been presented in the recent past. The number of times the same tone was presented within one stimulus train varied pseudo-randomly between 1 and 11 (t = 1–11). The probability that the same tone was presented in one stimulus train was 1) 2.5% for trains with 1 or 2 identical tones, 2) 3.75% for trains with 3 or 4 identical tones, and 3) 12.5% for trains with 5–11 identical tones. In other words, 5% of all stimulus trains consisted of 1–2 identical tones, 7.5% of all stimulus trains consisted of 3–4 identical stimuli, and 87.5% of all stimulus trains consisted of 5–11 identical stimuli. The frequency of the tones varied from 500 to 800 Hz in random steps with integer multiples of 50 Hz, tone duration was set at 70 ms, and the inter-stimulus interval was 500 ms.
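The train-length distribution described above can be illustrated with a short sketch. The per-length probabilities and the frequency range are taken from the text; the function name, the random seed, and the choice to draw each new frequency uniformly from the remaining values are assumptions made for illustration only and do not describe the original E-prime script.

```python
import random

# Per-length probabilities stated in the text (train lengths 1-11).
TRAIN_LENGTH_PROBS = {1: 0.025, 2: 0.025, 3: 0.0375, 4: 0.0375,
                      **{n: 0.125 for n in range(5, 12)}}
# Tone frequencies from 500 to 800 Hz in steps of 50 Hz.
FREQUENCIES_HZ = list(range(500, 801, 50))


def sample_trains(n_trains, seed=0):
    """Return a list of (frequency_hz, n_tones) tuples for a roving sequence."""
    rng = random.Random(seed)
    lengths = list(TRAIN_LENGTH_PROBS)
    weights = list(TRAIN_LENGTH_PROBS.values())
    trains, previous = [], None
    for _ in range(n_trains):
        # Assumption: the next frequency is drawn uniformly from all values
        # other than the previous one, so every train starts with a deviant.
        freq = rng.choice([f for f in FREQUENCIES_HZ if f != previous])
        n_tones = rng.choices(lengths, weights=weights, k=1)[0]
        trains.append((freq, n_tones))
        previous = freq
    return trains
```

Under these probabilities, the first tone of every train is a deviant, and only trains with at least 6 tones contribute a standard trial as defined in the preprocessing section.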
In parallel, subjects performed a distracting visual task and were instructed to ignore the sounds. This follows the suggestion that MMN assessment is optimal when the subject's attention is directed away from the auditory domain (Naatanen 2000). The task consisted of button-press responses whenever a fixation cross changed its luminance, which occurred pseudo-randomly every 2–5 s (not coinciding with auditory changes). The experimental session lasted approximately 15 min.
Data Acquisition and Preprocessing
The EEG was recorded at a sampling rate of 512 Hz using a Biosemi system with 64 scalp electrodes. The horizontal electro-oculogram (EOG) was recorded from electrodes attached at the outer canthus of each eye. Similarly, the vertical EOG was recorded from electrodes attached infraorbitally and supraorbitally to the left eye.
Pre-processing and data analysis were performed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/). Continuous EEG recordings were referenced to the average, down-sampled to 300 Hz, and bandpass filtered between 0.5 and 30 Hz. The data were then epoched into 500 ms segments, including a pre-stimulus window of 100 ms. For each subject and in each condition, 2 trial types were defined: the deviant trial (first tone within a new train) and the standard trial (operationally defined as the sixth tone, as in Garrido et al. 2008). The artifact rejection procedure used a thresholding approach to detect problematic trials or channels. Trials in which the signal recorded at any of the channels exceeded 80 µV relative to the pre-stimulus baseline were removed from subsequent analysis. Most of the artifacts that were detected reflected vertical and horizontal eye movements, which were monitored using ocular electrodes. The average number of artifact-free trials was 124 and 172 for standard and deviant trials in the placebo condition, and 153 and 207 for standard and deviant trials in the ketamine condition. Grand averages were computed using robust averaging, a weighted least squares procedure that incorporates a weighting matrix into the estimation so that outlier values exhibit less influence on the overall mean (Wager et al. 2005).
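The threshold-based trial rejection can be sketched as follows. This is a minimal illustration and not the SPM8 routine: the function name and array layout are hypothetical, and interpreting "exceeded 80 µV" as the absolute baseline-corrected amplitude is an assumption.

```python
import numpy as np


def reject_artifacts(epochs, times_ms, threshold_uv=80.0):
    """Baseline-correct epochs and drop trials exceeding the threshold.

    epochs:   array (n_trials, n_channels, n_samples), in microvolts
    times_ms: array (n_samples,), peri-stimulus time of each sample
    Returns the retained baseline-corrected epochs and a boolean keep mask.
    """
    # Mean of the pre-stimulus samples, per trial and channel.
    baseline = epochs[:, :, times_ms < 0].mean(axis=2, keepdims=True)
    corrected = epochs - baseline
    # Assumption: a trial is rejected if the absolute amplitude at any
    # channel and any time point exceeds the threshold.
    keep = np.abs(corrected).max(axis=(1, 2)) <= threshold_uv
    return corrected[keep], keep
```

With the epoch settings given above (300 Hz, −100 to 400 ms), `times_ms` could be constructed as `np.linspace(-100, 400, 151)`.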
The data were subjected to standard analyses using statistical parametric mapping (SPM), over channels and peri-stimulus time, to establish condition (standard vs. oddball) effects and any interaction with ketamine at the between-subject level. The same data were then subjected to dynamic causal modeling in an attempt to explain the observed differences in terms of adaptation and changes in extrinsic connectivity. For the SPM analyses, the averaged data from each trial type and each condition were converted to scalp images for all 64 channels and 151 time points using a voxel size of 4.25 × 5.38 × 3.33 mm. The images were constructed using linear interpolation for (removed) bad channels and smoothing to accommodate between-subject spatial and temporal variability in channel space.
DCM was applied to the preprocessed channel data to explain observed responses in source states. For computational expediency, all dynamic causal models (see below) were computed on a reduced channel space that corresponded to 8 channel mixtures or spatial modes. The 8 spatial modes were calculated using singular value decomposition of the channel data over a temporal window of interest. Following Garrido et al. (2008), the temporal window of interest was confined to 0–250 ms post-stimulus to ensure selective modeling of the MMN response (as opposed to later components).
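The reduction to 8 spatial modes can be illustrated with a generic singular value decomposition. The sketch below assumes that the modes are the leading left singular vectors of the channel data within the window of interest; the function name and arguments are hypothetical, and the SPM8 implementation may differ in detail.

```python
import numpy as np


def spatial_modes(data, times_ms, n_modes=8, window_ms=(0.0, 250.0)):
    """Project channel data onto its leading spatial modes.

    data:     array (n_channels, n_samples)
    times_ms: array (n_samples,), peri-stimulus time of each sample
    Returns (modes, reduced): modes is (n_channels, n_modes) with
    orthonormal columns, reduced is (n_modes, n_samples).
    """
    in_window = (times_ms >= window_ms[0]) & (times_ms <= window_ms[1])
    # Left singular vectors describe spatial patterns over channels.
    u, _, _ = np.linalg.svd(data[:, in_window], full_matrices=False)
    modes = u[:, :n_modes]
    reduced = modes.T @ data
    return modes, reduced
```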
Dynamic Causal Modeling
Neurophysiologically plausible forward models are essential for understanding how ERPs are generated. One such approach is DCM that was originally developed for connectivity analysis of fMRI data (Friston et al. 2003) and was subsequently implemented for a range of other data modalities and features, such as ERPs measured by EEG (David et al. 2006). DCM uses a biologically informed causal model to make inferences about the underlying neural mechanisms that generate observed event-related responses. This approach provides an important advance over conventional source reconstruction techniques for ERP data because it places neurobiological constraints on the model inversion, in which the parameters of the reconstruction have a specific neuronal interpretation. These parameters describe, for example, the synaptic coupling strength among sources and post-synaptic gain, and how these properties depend upon stimulus attributes or experimental manipulations (David et al. 2006; Kiebel et al. 2006).
The implementation of DCM for ERPs in SPM8 uses a neural mass model of a cortical source (Jansen and Rit 1995) that contains interacting inhibitory and excitatory subpopulations of neurons. Specifically, each source is described in terms of the average post-synaptic membrane potentials and firing rates of 3 neuronal subpopulations: pyramidal cells, spiny stellate cells, and inhibitory interneurons. The cortical sources are linked by forward, lateral, and backward connections (Linn et al. 1999) and conform to a hierarchical model of intrinsic and extrinsic connections within and between multiple sources as described in previous studies (David et al. 2005; Nee et al. 2007).
To estimate the model parameters that best explain how the observed data were generated, DCM inverts a spatiotemporal model covering all sensors or spatial modes and the temporal window of interest. The neural parameters describe synaptic connectivity strengths, post-synaptic gain, propagation delays among sources, and various synaptic rate constants. The spatial parameters, on the other hand, specify the location and orientation of equivalent current dipoles (ECDs). We used a head model of 4 concentric spheres with homogeneous and isotropic conductivity as an approximation to the brain, cerebrospinal fluid, skull, and scalp surfaces. We specified an ECD model with uninformed priors about the dipole orientations and informed priors about the source locations (Penny 2012), using the same coordinates as Garrido et al. (2008) for defining prior source location means with a prior variance of 16 mm². The moment (orientation) parameters had prior means of zero and a variance of 256 mm² in each direction, which is equivalent to assuming uninformed priors on the orientations of the dipoles.
Bayesian inversion of the combined neuronal and spatial model provides a posterior distribution for each parameter whose variance represents the uncertainty about the parameter after observing the data. Furthermore, the uncertainty about the model itself is addressed using a Bayesian model comparison based on an approximation to the model evidence. These procedures are explained in detail elsewhere (Penny et al. 2004; Stephan et al. 2009) and are briefly summarized below.
DCM is a hypothesis driven method that does not explore all possible models, but tests a specified model space based on prior knowledge about the system of interest. The network architectures that we tested in the present study were motivated by the results of previous MMN examinations (Rinne et al. 2000; Opitz et al. 2002; Doeller et al. 2003; Grau et al. 2007). These studies suggest that the main cortical generators of the MMN include the bilateral primary auditory cortex (A1), the bilateral superior temporal gyrus (STG), and the right inferior frontal gyrus (IFG). The coordinates used for the specification of our ECD model were informed by the above studies and identical to the ones previously used by Garrido et al. (2008) in the original DCM examination of the roving MMN.
In this study, we compared 8 distinct models that might underlie the generation of MMN in response to a deviant tone. These models were created by systematic combinations of the 2 key mechanisms proposed by predictive coding schemes under the free-energy principle. The first mechanism was short-term plasticity of glutamatergic long-range connections. In DCMs of the MMN, this is typically modeled by allowing for a modulation of the synaptic coupling strength of inter-regional forward and backward connections when the deviant tone is presented (cf. Garrido et al. 2007, 2008). The corresponding DCM parameters express (as a multiplicative factor or scaling coefficient) the coupling change relative to the standard tone. We allowed for different expression of this type of plasticity, creating 4 models: No modulation of connections by the deviant tone (model 1), modulation of either forward or backward connections among A1, STG, and IFG (models 2 and 3), and modulation of both forward and backward connections among the 3 brain regions (model 4). With these 4 models, we hypothesized that the differences between the deviant and the standard are caused by short-term plasticity of connections within the temporofrontal network, representing the reconfiguration of this network in response to prediction error or surprise (cf. model adjustment).
The second mechanism represented by our models concerned neuronal adaptation. That is, in models 5 through 8 we repeated the same variations in synaptic plasticity as for the first 4 models, but additionally we allowed for variations in adaptation, expressed in terms of deviant-induced modulation of the post-synaptic gain (responsiveness or excitability) in left and right A1 (Fig. 4A). This addressed the adaptation hypothesis, which postulates that the MMN arises predominantly from post-synaptic mechanisms such as spike-frequency adaptation as a result of an increase in calcium-dependent potassium conductance (May et al. 1999; Ulanovsky et al. 2004). In DCMs of the MMN, a lumped representation of adaptation mechanisms can be achieved by allowing for a modulation of post-synaptic gain parameters (this has previously also been referred to as modulation of “intrinsic connections”; Kiebel et al. 2007). Thus, models 5 through 8 examine the hypothesis of both frontotemporal interactions and local adaptation within A1 as the neural mechanisms underlying the generation of the MMN response.
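The resulting 2 × 4 factorial model space can be enumerated explicitly. In the sketch below, the labels and field names are hypothetical, and the assignment of model 2 to forward and model 3 to backward modulation assumes that the text lists them in that order.

```python
from itertools import product

# Plasticity variants in the order of models 1-4 (assumed ordering).
PLASTICITY = ["none", "forward", "backward", "forward+backward"]
# Deviant-induced modulation of post-synaptic gain in bilateral A1.
ADAPTATION = [False, True]


def build_model_space():
    """Return the 8 models as dictionaries, numbered 1 through 8."""
    models = []
    for gain, plasticity in product(ADAPTATION, PLASTICITY):
        models.append({
            "id": len(models) + 1,
            "plasticity": plasticity,
            "a1_gain_modulation": gain,
        })
    return models
```

Models 1 through 4 then vary only in plasticity, and models 5 through 8 repeat the same variants with gain modulation in A1.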
Overall, the 8 models tested here are similar, but not identical, to the 6 models considered by Garrido et al. (2008). We used the same type of DCM, time window (0–250 ms), number of spatial modes, and dimensions of model space (i.e., adaptation and synaptic plasticity). Unlike Garrido et al. (2008), however, we also tested models without synaptic plasticity anywhere in the network (models 1 and 5; Fig. 4A) and introduced additional inter-hemispheric connections between left and right A1 and between left and right STG (Fig. 4A).
The statistical analyses employed in this paper were based on the standard 2-stage (summary statistics) approach in DCM, that is, model selection followed by interrogation of posterior estimates (Stephan et al. 2010). In the first stage, BMS was used to determine the optimal network architecture underlying electrophysiological responses to auditory stimulation. In the second stage, posterior parameter estimates were examined to detect differences between placebo and ketamine conditions. This second stage used the posterior means averaged over the DCMs of each subject (Bayesian model averaging, BMA) as the dependent variables for a multivariate Hotelling's T² test and subsequent univariate t-tests (see below for details).
Bayesian Model Selection and Bayesian Model Averaging
From a Bayesian perspective, alternative models that represent competing hypotheses about the mechanisms generating observed data are evaluated by comparing their (log) evidence. The log evidence corresponds to the negative surprise about the data, that is, the (log) probability of the data given a model. It represents a principled measure, derived from probability theory, of the balance between model fit and model complexity. Since it cannot be computed analytically except for linear Gaussian models, approximations to the log evidence are usually required. The approximation used here is the (negative) free energy, which provides a lower bound on the log evidence and can be obtained using Variational Bayes (Friston et al. 2007).
The log evidence can be decomposed into 2 components: an accuracy term, which quantifies the data fit, and a complexity term, which penalizes models with many degrees of freedom (e.g., many and/or uninformed parameters). The best model given the data is the one with the largest log model evidence, ln p(y|m) (assuming a uniform prior over all models). Models can be compared by computing evidence ratios (Bayes factors) or log evidence differences. Following conventional classifications (Kass and Raftery 1995), one concludes that there is strong evidence in favor of a model if the difference in log evidence is >3 compared with another model (i.e., a Bayes factor of approximately 20).
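The relation between log evidence differences and Bayes factors can be made concrete with a few lines of code. This is a minimal sketch for a single data set under a uniform prior over models; the function names are hypothetical, and the group-level random effects procedure used in this study is not reproduced here.

```python
import math


def bayes_factor(log_evidence_a, log_evidence_b):
    """Bayes factor of model A over model B from their log evidences."""
    return math.exp(log_evidence_a - log_evidence_b)


def posterior_model_probabilities(log_evidences):
    """Posterior model probabilities under a uniform prior over models."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_evidences)
    weights = [math.exp(le - top) for le in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, a log evidence difference of 3 gives a Bayes factor of exp(3), which is about 20, and a posterior probability of about 0.95 for the better of 2 models.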
Furthermore, inference about general characteristics of model architecture can be obtained by using family-based inference, which compares sets of models grouped by architectural properties (Heresco-Levy et al. 2007). In the present study, we used family level inference to determine whether the modulation of post-synaptic gain in bilateral A1 constituted an important addition to the model architecture. Thus, we specified 2 model families, the first without and the second with post-synaptic gain modulation present at the level of bilateral A1. We performed family-wise inference, using a random effects approach, which is robust to potential outliers in the population (Stephan et al. 2009). Prior to the subsequent analysis of differences in posterior parameter estimates across drug conditions, we used BMA (Penny et al. 2010) that averages over models, weighted by the posterior model probabilities. In this way, BMA provides parameter estimates that account for model uncertainty.
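Both ideas in this paragraph reduce to simple operations on posterior model probabilities. The sketch below is a simplification under stated assumptions: it averages posterior means weighted by model probability, whereas actual BMA implementations may instead sample from the model-specific posterior densities, and it sums model probabilities within families without the random effects treatment used in this study. All names are hypothetical.

```python
def bayesian_model_average(posterior_means, model_probabilities):
    """Average parameter estimates over models, weighted by model probability.

    posterior_means: one list of parameter posterior means per model
    model_probabilities: posterior probability of each model (sums to 1)
    """
    n_params = len(posterior_means[0])
    return [
        sum(p * means[i] for p, means in zip(model_probabilities, posterior_means))
        for i in range(n_params)
    ]


def family_probabilities(model_probabilities, families):
    """Sum model probabilities within each family of model indices."""
    return {
        name: sum(model_probabilities[i] for i in members)
        for name, members in families.items()
    }
```

For the 8 models considered here, the 2 families would correspond to index sets such as `{"no_gain": [0, 1, 2, 3], "gain": [4, 5, 6, 7]}`.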
Analysis of Model Parameter Estimates
Following BMA, we used the resulting posterior means from the averaged DCMs for examining differences in deviant-induced changes in adaptation (post-synaptic gain) and synaptic plasticity (parameterized in terms of condition-dependent coupling changes) between placebo and ketamine conditions. First, we used a multivariate Hotelling's T² test, testing whether the 2 sets of parameter estimates encoding neuronal adaptation and short-term synaptic plasticity, respectively, were significantly different between ketamine and placebo conditions. This drug-induced difference was significant for the parameters representing synaptic plasticity, but not for neuronal adaptation. Based on this result we used, in a second step, univariate post hoc t-tests, asking which of the 6 forward and backward connections in the model contributed individually to explain the ketamine effects. We Bonferroni-corrected the results for the number of parameters tested. Finally, in a third step, we used a simple regression analysis to examine the relationship between the model parameters showing significant changes under ketamine and the independent ratings on the ASC questionnaire after ketamine administration.
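The second, univariate step can be sketched as a paired comparison per connection followed by a Bonferroni threshold. The sketch assumes paired t-tests across subjects, which the cross-over design suggests but the text does not state explicitly; the function names are hypothetical, and p-values are taken as given rather than computed.

```python
import math
import statistics


def paired_t_statistic(ketamine, placebo):
    """Paired t statistic and degrees of freedom for one parameter."""
    diffs = [k - p for k, p in zip(ketamine, placebo)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that survive Bonferroni correction for len(p_values) tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

With 6 connections tested, the corrected threshold at an alpha of 0.05 is 0.05/6, or about 0.0083.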
MMN Responses Due to Repetition Effects
A conventional analysis of MMN amplitudes and latencies from this group of subjects has been presented by Schmidt et al. (2012). Here, we complement these analyses with a spatiotemporal characterization of the MMN response using SPM (in sensor space) and whole-brain correction for multiple tests (family-wise error correction using Gaussian random field theory as implemented in SPM8). The goal of this initial analysis was to verify the presence of an MMN response for each subject and each condition, prior to subsequent modeling. Two of 19 subjects failed to show a significant MMN response in both placebo and ketamine conditions and were therefore excluded from subsequent DCM analyses. In the placebo condition, the MMN response was observed over frontal and central electrodes between 110 and 220 ms. Figure 1A illustrates the grand mean responses averaged across all subjects in the placebo condition, comparing responses to “deviant” trials (first tone within stimulus trains, dashed lines) with responses to “standard” trials (sixth tone, solid black lines).
The MMN response peaked at 177 ms from tone onset, which is consistent with previous studies (cf. Cowan et al. 1993; Garrido et al. 2008). Figure 1B shows a 3D spatiotemporal characterization of the MMN response using SPM to compare the deviant with the standard tones in the placebo condition. The analysis was performed across the entire epoch [−100 400] and over all 64 channels. As noted above, for these analyses the scalp topography at any time bin was interpolated from 64 channels and smoothed. Figure 1B shows the SPM where, over subjects, there is a significant negative amplitude deflection as a result of the deviant tone [t(32) = 5.26, P < 0.05, family-wise error corrected]. This result suggested a significant MMN response over bilateral frontal channels between 160 and 180 ms with maximum at 180 ms.
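The operational definition of the MMN (deviant minus standard) and the search for its peak latency can be sketched as follows. This is an illustration on a synthetic single-channel waveform, not the SPM analysis reported here. The function names and the Gaussian-shaped deflection are assumptions; only the epoch limits and the 110–220 ms search window are taken from the text.

```python
import numpy as np

def mmn_difference_wave(deviant_erp, standard_erp):
    """MMN difference waveform: ERP to deviants minus ERP to standards."""
    return np.asarray(deviant_erp, dtype=float) - np.asarray(standard_erp, dtype=float)

def peak_negativity(times_ms, wave, window=(110.0, 220.0)):
    """Latency and amplitude of the most negative sample inside the window."""
    times_ms = np.asarray(times_ms, dtype=float)
    wave = np.asarray(wave, dtype=float)
    mask = (times_ms >= window[0]) & (times_ms <= window[1])
    idx = np.flatnonzero(mask)[np.argmin(wave[mask])]
    return times_ms[idx], wave[idx]

# Synthetic single-channel example over the epoch [-100, 400] ms, 1 ms steps.
times = np.arange(-100.0, 401.0, 1.0)
standard = np.zeros_like(times)
# Hypothetical deviant response: a negative deflection centred at 177 ms.
deviant = -2.0 * np.exp(-((times - 177.0) / 20.0) ** 2)

mmn = mmn_difference_wave(deviant, standard)
latency, amplitude = peak_negativity(times, mmn)
```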
In the ketamine condition, we also observed an MMN response over frontal and central electrodes between 110 and 220 ms. Figure 2A shows the grand mean responses across all subjects in the ketamine condition in response to the deviant tone (dashed lines) compared with the standard tone (grey).
In the ketamine condition, the MMN response was also seen in right frontal channels between 140 and 160 ms with a maximum at 150 ms [t(32) = 5.25, P < 0.05 FWE]. A late response was also observed under ketamine, peaking at 390 ms in central electrodes (Fig. 2B).
Importantly and in accordance with previous studies, the MMN response was significantly attenuated in the ketamine compared with the placebo condition. Figure 3A illustrates the difference waveform, which contrasts the deviant tone to the standard tone, in placebo and ketamine conditions. In the placebo condition, the MMN response was larger over frontal and central electrodes between 150 and 180 ms and between 220 and 300 ms.
A paired SPM t-test was performed to compare the MMN (differences between standard and oddball tones) between placebo and ketamine conditions. Significant differences between the MMN under placebo and ketamine conditions were observed [t(16) = 3.68, P < 0.05 corrected for a search volume defined by the main effect of MMN] in fronto-central and central electrodes between 220 and 240 ms with a maximum at 230 ms (Fig. 3B). These results confirm that the MMN response was significantly larger in the placebo compared with the ketamine condition.
DCMs of the MMN Response
As described in section Materials and Methods, we defined a space of 8 models systematically combining mechanisms of adaptation in A1 with synaptic plasticity expressed by different extrinsic connections of our temporofrontal network. Random effects BMS indicated that the model with plasticity in forward and backward connections as well as adaptation (expressed via post-synaptic gain modulation in A1) had the largest model evidence and was clearly superior to all other models at the group level (exceedance probabilities >80% for placebo and >70% for ketamine; Fig. 4A–C).
Furthermore, we exploited the factorial nature of our model space to examine the importance of each mechanistic factor (adaptation and synaptic plasticity, respectively) on its own. Random effects family-level BMS showed that “adaptation models” that allowed for post-synaptic gain modulation in A1 (models 5–8) were generally superior to models without adaptation (models 1–4) in both placebo and ketamine conditions (exceedance probabilities >99%), regardless of synaptic plasticity elsewhere in the model (Fig. 4D,E). Conversely, models that allowed for the expression of synaptic plasticity at inter-regional connections (models 2–4, 6–8) were generally superior to models without such plasticity (models 1 and 5) in both placebo and ketamine conditions (exceedance probabilities >99%), regardless of whether the model included adaptation or not (Fig. 4F,G).
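The exceedance probabilities quoted above can be illustrated with a short Monte Carlo sketch. In random effects BMS, the posterior over model frequencies is a Dirichlet distribution, and the exceedance probability of a model is the probability that its frequency is larger than that of every other model. The sketch assumes that the Dirichlet parameters have already been estimated (that step is not shown). The function names and the numerical values of `alpha` are hypothetical, and the family-level computation simply sums sampled model frequencies within each family.

```python
import numpy as np

def exceedance_probabilities(alpha, n_samples=100_000, seed=0):
    """P(model k has the highest frequency) under a Dirichlet(alpha) posterior."""
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(np.asarray(alpha, dtype=float), size=n_samples)
    winners = np.argmax(r, axis=1)
    return np.bincount(winners, minlength=len(alpha)) / n_samples

def family_exceedance(alpha, families, n_samples=100_000, seed=0):
    """Exceedance probabilities for families of models (summed frequencies)."""
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(np.asarray(alpha, dtype=float), size=n_samples)
    fam_r = np.stack([r[:, list(m)].sum(axis=1) for m in families.values()], axis=1)
    winners = np.argmax(fam_r, axis=1)
    xp = np.bincount(winners, minlength=len(families)) / n_samples
    return dict(zip(families.keys(), xp))

# Hypothetical Dirichlet parameters for 8 models (illustrative values only).
alpha = [1, 1, 1, 2, 1, 2, 3, 14]
model_xp = exceedance_probabilities(alpha)

# Families split by the presence of adaptation (models 1-4 vs. models 5-8).
families = {"no_adaptation": range(0, 4), "adaptation": range(4, 8)}
family_xp = family_exceedance(alpha, families)
```

Because exceedance probabilities sum to one across the models (or families) being compared, they should be read as relative statements within the chosen model space.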
Effects of Ketamine
To examine the effects of ketamine on mechanisms underlying the NMDAR-dependence of the MMN response, we compared the parameter estimates from subject-specific DCMs that were averaged (using BMA) separately for placebo and ketamine conditions. First, we used a multivariate Hotelling's T2 test to examine whether the distinct sets of parameter estimates encoding neuronal adaptation and short-term synaptic plasticity, respectively, were significantly different between ketamine and placebo conditions. This drug-induced difference was significant for the parameter set representing synaptic plasticity (T = 14.771 with P < 0.023), but not for the parameter set representing neuronal adaptation (T = 0.217 with P = 0.897). In other words, while adaptation was critical for explaining the MMN (see above), it did not change under ketamine compared with placebo. Based on this result we used, in a second step, univariate post hoc t-tests, asking where in the network synaptic plasticity was affected by ketamine (i.e., at which of the 6 forward and backward connections in the model). We found a significant reduction of synaptic plasticity, following ketamine administration, of the forward connection from the left A1 to the left STG (Fig. 5A,B and Table 1). In other words, under placebo, the left A1 → STG connection almost doubled in strength when a deviant tone was presented; in contrast, under ketamine, this increase was almost absent (Fig. 5B; note that the modulatory parameters in DCM for ERPs are scaling coefficients, not additive contributions as in DCM for fMRI).
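The distinction between scaling and additive modulatory parameters mentioned above can be written out schematically. The notation below is introduced here for illustration only and is not taken from the original model equations: $A_{ij}$ denotes the strength of the connection from region $j$ to region $i$, $g_{ij}$ a multiplicative gain, and $b_{ij}$ an additive change.

```latex
% Scaling (multiplicative) modulation, as described for DCM for ERPs:
A^{\mathrm{deviant}}_{ij} = g_{ij}\, A^{\mathrm{standard}}_{ij},
\qquad g_{ij} = 1 \ \text{(no change)}, \quad g_{ij} = 2 \ \text{(coupling doubles)}

% Additive modulation, as described for DCM for fMRI:
A^{\mathrm{deviant}}_{ij} = A^{\mathrm{standard}}_{ij} + b_{ij},
\qquad b_{ij} = 0 \ \text{(no change)}
```

Read in this way, the placebo result corresponds to a gain close to 2 on the left A1 → STG connection, and the ketamine result to a gain close to 1.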
|Connection||Paired differences mean||Standard deviation||Standard error mean||t||Significance (2-tailed)|
|Left A1 > left STG||0.94||0.84||0.20||4.60||0.0001*|
|Right A1 > right STG||0.79||2.10||0.51||1.55||0.140|
|Right STG > right IFG||0.56||1.65||0.40||1.40||0.180|
|Left STG > left A1||0.28||2.46||0.60||0.47||0.646|
|Right STG > right A1||0.20||0.99||0.24||0.81||0.428|
|Right IFG > right STG||0.68||2.55||0.62||1.11||0.285|
*FWE P < 0.05.
Notably, this effect of ketamine on synaptic plasticity remained significant after Bonferroni correction for multiple comparisons (t = 4.60, P < 0.001). Furthermore, the forward connection from the right A1 to the right STG showed a reduction in the same direction, but this did not reach significance (t = 1.55, P = 0.14).
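The Bonferroni step can be made explicit with a small sketch that applies the corrected threshold to the two-tailed P values reported in Table 1. The variable names are illustrative; the P values are copied from the table as reported.

```python
# Two-tailed P values as reported in Table 1 (paired t-tests, 6 connections).
reported_p = {
    "Left A1 > left STG": 0.0001,
    "Right A1 > right STG": 0.140,
    "Right STG > right IFG": 0.180,
    "Left STG > left A1": 0.646,
    "Right STG > right A1": 0.428,
    "Right IFG > right STG": 0.285,
}

alpha = 0.05
# Bonferroni: divide the nominal alpha by the number of tests performed.
threshold = alpha / len(reported_p)

significant = [name for name, p in reported_p.items() if p < threshold]
```

With 6 tests, the corrected threshold is 0.05 / 6 ≈ 0.0083, and only the left A1 → STG connection falls below it.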
Finally, we related this selective effect of ketamine on plasticity of the left A1 → STG connection to the impact of ketamine on subjects' ASC questionnaire scores. Specifically, Schmidt et al. (2012) had reported an effect of ketamine on the “control and cognition” subscale of the ASC, and we now tested, using a simple linear regression analysis, whether this effect of ketamine might be explained through its effects on plasticity of the left A1 → STG connection. Indeed, there was a significant linear relation between drug effects on “control and cognition” ratings (score under ketamine minus score under placebo) and drug effects on plasticity (ratios of condition-specific changes in coupling, that is, relative plasticity under ketamine vs. placebo) of the left A1 → STG connection (F = 5.53, P < 0.03).
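For a simple linear regression with one predictor, the F statistic is the square of the t statistic for the slope and can be computed from the correlation coefficient. The sketch below illustrates this on synthetic data. The function name, the variable names, and the data are hypothetical, and the code does not reproduce the reported regression.

```python
import numpy as np
from scipy import stats

def simple_regression_f(x, y):
    """Slope, F statistic, and P value for a one-predictor linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    res = stats.linregress(x, y)
    r2 = res.rvalue ** 2
    # For one predictor: F(1, n - 2) = r^2 * (n - 2) / (1 - r^2) = t_slope^2.
    f_stat = r2 * (n - 2) / (1.0 - r2)
    p_value = float(stats.f.sf(f_stat, 1, n - 2))
    return res.slope, f_stat, p_value

# Synthetic data for 17 hypothetical subjects (values are not from the study).
rng = np.random.default_rng(1)
x = rng.normal(1.0, 0.5, size=17)            # e.g., a drug effect on plasticity
y = 2.0 * x + rng.normal(0.0, 1.0, size=17)  # e.g., a drug effect on ratings
slope, f_stat, p_value = simple_regression_f(x, y)
```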
Discussion
In this paper, we present a model-based investigation of the physiological mechanisms that underlie the well-established reduction of the MMN following administration of the NMDAR antagonist ketamine. Specifically, we used DCM and BMS to analyze data from a recent study by Schmidt et al. (2012) of healthy volunteers during a roving MMN paradigm with a cross-over, double-blind, placebo-controlled design. In the following, we summarize the results of our model-based investigations, discuss their implications and consider potential limitations of our study.
In a first step, we verified that we could replicate the results of a previous DCM study of the roving MMN paradigm (Garrido et al. 2008). Specifically, we specified a set of 8 models that differed systematically in 1) whether or not they allowed for a differential expression of adaptation (post-synaptic sensitivity) during standard and deviant stimuli and 2) which connections were allowed to change—show short-term plasticity—between deviant and standard stimuli. These models were very similar to those of Garrido et al. (2008) except that we added interhemispheric connections at the level of both primary and secondary auditory areas (Fig. 4). These connections do not change the relative evidence for different models, but increase the overall model evidence. Reassuringly, our model selection results replicate those by Garrido et al. (2008): Regardless of the drug condition, both adaptation in the primary auditory cortex and short-term plasticity of forward and backward connections across the auditory hierarchy markedly improved the model evidence (Fig. 4). In other words, as postulated by MMN theories resting on the free-energy principle (Garrido, Kilner, Stephan, et al. 2009), both adaptation and synaptic plasticity are required to explain the MMN generation per se.
While both adaptation and plasticity of forward and backward connections were required to explain MMN, which of these mechanisms, if any, would be attenuated by ketamine? Using BMA (Penny et al. 2010), we asked whether ketamine effects on the MMN generation were reliably modelled by DCM such that the pharmacological effects were clearly reflected in the model parameters (changes in coupling). To this end, we used a multivariate Hotelling's T2 test, testing separately whether the parameter estimates encoding neuronal adaptation and short-term synaptic plasticity were significantly different between ketamine and placebo conditions. This drug-induced difference was significant for the parameters representing synaptic plasticity, but not adaptation. We then proceeded to ask, using univariate t-tests, which of the 6 forward and backward connections were most critical for explaining the ketamine effects. Correcting for multiple comparisons, we found that only plasticity in the forward connection from the left A1 to the left STG showed a significant change under ketamine, compared with placebo. A similar trend was observed for the homologous forward connection from right A1 to right STG, but this did not reach significance.
Finally, we examined whether our model-based estimates of ketamine-induced attenuation of short-term plasticity in the left A1 → STG connection correlated with a measure of ketamine-induced change in cognition. The previous study by Schmidt et al. (2012) had found a correlation between ketamine-induced impairments (relative to placebo) in subjective ratings of control and cognition (subsuming items for disordered thought and loss of control over body and thought) and the MMN “slope” under placebo, which indexes the systematic increase in MMN amplitude with the number of preceding standards (a specific index of prediction error processing). In brief, the previous study showed that MMN expression under placebo predicted the effects of ketamine on introspective ratings of symptoms that are reminiscent of schizophrenic symptoms (e.g., thought disorder). Our present analyses suggest a more fine-grained interpretation of this relationship: Our model-based analyses report a significant correlation between ketamine effects on short-term plasticity of the left A1 → STG connection and ketamine-induced changes in “control and cognition” ratings. In other words, we empirically demonstrated that blocking NMDARs by ketamine leads to impaired plasticity by reducing changes in connection strength from the left A1 to the left STG, the extent of which predicted significant S-ketamine-induced psychopathology. It is interesting to note that ketamine selectively exerted effects on the left, but not the right, A1 → STG connection. This hemispheric asymmetry is reminiscent of the left-hemispheric dominance of inner speech that has been linked to disordered thought in schizophrenia (e.g., Strik et al. 2008). However, this is clearly a speculative observation at the present time.
Why should a disruption of NMDAR-dependent synaptic plasticity predict changes in the “control and cognition” subscale of the ASC questionnaire? For example, one may wonder whether the MMN is too “low-level” a phenomenon to be informative with regard to “high-level” cognitive control processes as probed by the ASC questionnaire. A contrary perspective is that impairments of perceptual inference, even when they originate from early stages of information processing, may percolate throughout the processing hierarchy and induce significant disturbances at all levels (cf. Leitman et al. 2010). This is predicted by hierarchical Bayesian accounts of brain function, which consider abnormal prediction error signals as one potential cause of maladaptive cognition. Several authors have offered explanations of how deficient prediction error processing may lead to impaired learning and suboptimal inference on environmental causes of sensory inputs, with consequences for adaptive cognition in general, and how this might explain a wide range of cognitive symptoms, for example, in psychosis (e.g., Friston 2005; Stephan, Friston, et al. 2009; Fletcher & Frith 2009; Corlett et al. 2011). Assuming that the MMN does indeed represent implicit trial-by-trial encoding of prediction errors that depends on NMDAR-dependent plasticity at glutamatergic synapses (cf. Stephan et al. 2006), our finding that the extent of ketamine-induced disruption of synaptic plasticity predicts impairments in cognitive control across subjects is in line with this general perspective.
However, it is important to emphasize that we do not wish to make any claims of specificity here. Our psychological measures relate to recently developed subscales of one specific questionnaire, the ASC, which is particularly useful for investigating changes in conscious state under drug influence. The items that relate most closely to cognitive impairments in schizophrenia (e.g., concerning disordered thought and loss of cognitive or executive control) are contained in the “control and cognition” subscale (Studerus et al. 2010). While this is the subscale that correlates with our model-based estimates of synaptic plasticity, it certainly does not provide full coverage of all aspects of psychopathology that may be linked to schizophrenia or other psychoses. Whether model-based estimates of synaptic plasticity, as obtained in the present study, are also associated with other psychopathologically relevant symptoms therefore remains to be tested in future studies.
Our results are consistent with the free-energy account of the MMN in that they highlight the importance of both adaptation (in the primary auditory cortex) and short-term plasticity (of inter-regional connections) for the generation of the MMN. Put simply, the sensory learning that is assumed to underlie the MMN calls on associative plasticity in the extrinsic connections communicating predictions and prediction errors between levels of the auditory hierarchy. As the oddball is repeated, sensory learning produces more efficient predictions of auditory stimuli and a decrease in effective connectivity of extrinsic connections (as the oddball becomes a standard). Our analyses suggest that this decrease is eliminated by ketamine—consistent with its antagonism of NMDAR-dependent plasticity. Adaptation—or changes in post-synaptic gain—is thought, in the context of predictive coding, to encode the precision of bottom-up information. This precision is itself optimized as a function of stimulus repetition. The present results suggest that this neuromodulatory optimization may be less sensitive to the NMDAR blockade.
This lack of a significant ketamine effect on the parameter set encoding neuronal adaptation during the MMN generation fits well with recent results from invasive recordings (multiunit activity) in the auditory cortex of awake rodents, showing no effect of NMDAR antagonists on neuronal adaptation during the MMN paradigm (Farley et al. 2010). Nevertheless, based on our present study, we cannot preclude an NMDAR-mediated effect on adaptation processes involved in human MMN expression. This is because our model contains a specific (and rather simple) mathematical representation of adaptation processes, expressed in terms of post-synaptic gain modulation (i.e., drug-dependent changes in the amplitude of the synaptic kernel of our neural mass model), and may therefore miss aspects of neuronal adaptation that lie outside this formulation.
One point of contention, however, may be that our present analyses identified a predominant impact of NMDAR antagonism on the plasticity of the “forward” connections of the auditory hierarchy. From the perspective of predictive coding, the free-energy formulation suggests that forward connections convey prediction errors, that is, phasic signals requiring fast synaptic transmission. In contrast, predictive coding regards backward connections as conveying predictions, which may be of a more modulatory nature and elicit more enduring effects. This view draws on the concept of “driving” forward and “modulatory” backward connections in sensory processing streams (Sherman and Guillery 1998) and led to proposals (Friston 2005; Corlett et al. 2011) that forward connections predominantly employ fast AMPARs, whereas the more modulatory backward connections also engage NMDARs, whose time constants are an order of magnitude larger than those of AMPARs. From this perspective, one may thus have expected our DCMs to identify a predominant effect of ketamine on the plasticity of backward, rather than forward, connections. It may be useful to consider, however, that NMDARs, despite their slower time constants compared with AMPARs, are nevertheless ionotropic receptors and are capable of conveying driving effects in the presence of other excitatory inputs (i.e., once the cell membrane is depolarized; Daw et al. 1993). Indeed, such “driving” effects of NMDARs are established, for example, in the visual (Fox et al. 1990) and auditory cortices (Kelly and Zhang 2002). In line with this, the original notion of “driving” and “modulatory” connections, as proposed by Sherman and Guillery (1998), jointly considers AMPARs and NMDARs as ionotropic receptors for driving connections and focuses on metabotropic glutamate receptors (mGluRs) for modulatory (backward) connections.
A second fact that may support the plausibility of our findings is that NMDAR antagonists like ketamine have been found to increase the release of glutamate, and thus lead to secondary activation of glutamatergic non-NMDA receptors such as AMPARs (Moghaddam et al. 1997). Assuming that AMPARs are critical for transmission of prediction errors via forward connections, it is possible that excessive stimulation of AMPARs under ketamine could lead to ceiling-like effects, reducing the discriminability of prediction error signals from baseline transmission (cf. Corlett et al. 2009). In our context of the roving MMN, this may mean that standard and deviant tones are no longer clearly differentiable and transmission via forward connections does not greatly differ between predictable and unpredictable tones, leading to a diminished difference wave (MMN), as observed, and reduced modulation of synaptic coupling. Given that the models used in this study cannot disambiguate effects mediated by AMPA and/or NMDA receptors, this interpretation must remain speculative at the present time, but would nicely explain our finding of reduced modulation of forward connections by prediction errors under ketamine.
A final point that may be of help for a physiological interpretation of our results is the role of NMDARs in controlling rapid changes in AMPAR conductance (e.g., via phosphorylation; Montgomery & Madison 2004). This form of NMDAR-dependent short-term plasticity has previously been highlighted as a putative mechanism for prediction error signaling in hierarchical cortical architectures implementing predictive coding (Stephan, Friston, et al. 2009), with explicit reference to the MMN (Stephan et al. 2006). From this perspective, rapid prediction error signaling via forward connections may be mediated by AMPARs but under control of NMDARs. In other words, even if NMDARs do not themselves transmit prediction error signals via forward connections, they might control the capacity of AMPARs to do so.
In summary, while it is perfectly plausible that NMDARs play an important role in synaptic transmission along backward connections, “it is unlikely that NMDA receptors only signal predictions and AMPA receptors signal prediction errors; rather there may be a division of labor where AMPA receptors are relatively more engaged in bottom-up and NMDA receptors are relatively more involved in top-down processes” (Corlett et al. 2011). Our present results are in line with this view and imply a less strict dichotomy between AMPAR and NMDAR mediated effects at forward and backward connections, respectively, as has sometimes been suggested previously.
Having said this, it is important to keep in mind that we used a fairly simplistic model for inferring the physiological mechanisms of the MMN. While the DCM used in this study does distinguish between different neuronal populations (pyramidal cells, granular cells, and inhibitory interneurons), it does not distinguish between pyramidal cells in supra- and infragranular layers, which take on different roles within predictive coding schemes (Friston 2010). Furthermore, while the model does make a distinction between glutamatergic and GABAergic synapses, there is no explicit distinction between fast (AMPA) and slower (NMDA) ionotropic receptors, not to mention the modulatory contributions from mGluRs. The parameter estimates we obtain from our present model are therefore rather coarse, lumped summaries of numerous physiological processes, and the current estimates obtained for short-term synaptic plasticity cannot be split up into distinct contributions from different glutamatergic receptors.
There is, however, recent progress in developing DCMs that move toward a much more fine-grained representation of synaptic physiology. These “conductance-based” DCMs use a simplified Morris-Lecar formulation to infer the relative contributions of separate ion channel types (with different ligand-/voltage-gated behavior and time constants) to measured potentials (Marreiros et al. 2010). Recently, this model was extended to include an explicit NMDAR representation, rendering it potentially capable of inferring the differential contributions of NMDAR- and non-NMDAR–mediated transmission at glutamatergic synapses (Moran, Stephan, et al. 2011). This extended model is currently being validated, and when these studies are complete we will re-examine our present data using models of this sort.
Another direction for refining the modeling approach of the current study concerns the selection of data features. In this study, we computed the MMN by subtracting the ERP to “standard” tones (operationally defined as the sixth tone), from “deviating” tones (first tone within a new train). This approach was chosen to allow for the direct comparison of our results to previous DCM studies of the MMN (Garrido et al. 2008). In a further step, the present models could incorporate the parametric modulation by standard repetitions, for example, for studying the effects of ketamine on other data features obtained from roving oddball designs such as the “repetition positivity” (Haenschel et al. 2005).
A third and final opportunity to complement the analysis from the present study is to employ recently developed computational models that use the same Bayesian inference framework as DCM but are agnostic about physiological mechanisms. Instead, they enable the investigation of trial-by-trial changes in MMN amplitude from a purely computational (information theoretic) perspective. In other words, they help clarify which computational quantities (e.g., prediction errors or surprise) are reflected by the trial-by-trial dynamics of MMN expression (Lieder et al. in preparation). These models, once they are fully established, should enable us to examine the effects of ketamine on MMN generation from a complementary perspective.
In summary, the present study provides a novel model-based characterization of the effects of NMDAR antagonism during the MMN roving paradigm. This study is part of ongoing efforts to establish model-based assays of brain disease processes, a theme that plays a major role in computational approaches to dissecting the pathophysiology of spectrum diseases, such as schizophrenia, into physiologically well-defined subgroups (Stephan, Friston, et al. 2009). A recent proof-of-concept study demonstrated that one can infer, from scalp MEG recordings, dopaminergic effects on NMDA and AMPA receptors (Moran, Symmonds, et al. 2011). Furthermore, several validation studies have demonstrated that estimates of glutamatergic synaptic physiology, derived using a DCM similar to the one used in this study, can be obtained from intracerebral and extradural local field potential recordings (Moran et al. 2008; Moran, Jung, et al. 2011). The present study has extended this program toward inference on NMDAR-mediated synaptic effects from EEG data. Clearly, the present study is only a modest step toward non-invasive, model-based inference on glutamatergic synaptic physiology, and more sophisticated models and further validation studies are needed. Eventually, however, we hope that carefully validated model-based approaches will enable diagnostically useful applications of MMN recordings, for example, for pathophysiologically grounded diagnostic classification of spectrum diseases such as schizophrenia (Stephan et al. 2006).
This work was supported by the Swiss Neuromatrix Foundation (A.S., M.K., F.X.V.), the EU FP7 program (A.D., K.E.S.), a joint UZH & ETH Chair supported by the René and Susanne Braginsky Foundation (K.E.S.), the NEUROCHOICE project by SystemsX.ch (K.E.S.), and the Wellcome Trust (K.J.F.). We thank Marta Garrido for providing an example script of the roving MMN paradigm and Milena Jeker for her assistance in recruiting and measuring. Conflict of Interest: None declared.