## Abstract

Neural oscillations are linked to perception and behavior and may reflect mechanisms for long-range communication between brain areas. We developed a causal model of oscillatory dynamics in the face perception network using magnetoencephalographic data from 51 normal volunteers. This model predicted induced responses to faces by estimating oscillatory power coupling between source locations corresponding to bilateral occipital and fusiform face areas (OFA and FFA) and the right superior temporal sulcus (STS). These sources showed increased alpha and theta and decreased beta power as well as selective responses to fearful facial expressions. We then used Bayesian model comparison to compare hypothetical models, which were motivated by previous connectivity data and a well-known theory of temporal lobe function. We confirmed this theory in detail by showing that the OFA bifurcated into 2 independent, hierarchical, feedforward pathways, with fearful expressions modulating power coupling only in the more dorsal (STS) pathway. The power coupling parameters showed a common pattern over connections. Low-frequency bands showed same-frequency power coupling, which, in the dorsal pathway, was modulated by fearful faces. Also, theta power showed a cross-frequency suppression of beta power. This combination of linear and nonlinear mechanisms could reflect computational mechanisms in hierarchical feedforward networks.

## Introduction

The profuse local connectivity of the brain gives rise to discrete, functionally distinct anatomical areas, which are commonly localized using brain imaging. Longer range connections between these local functional areas can form distributed networks, which are often organized into hierarchies (Felleman and van Essen 1991). Although cost-efficiency keeps long-range connectivity sparse (Buzsáki et al. 2004; Bassett et al. 2009), long-range communication can be mediated by oscillatory dynamics. Experimentally induced spectral power changes are routinely observed in invasive (Fukushima et al. 2012) and scalp electrophysiology and magnetoencephalography (MEG). Moreover, the anatomical spatial scale of connections may induce oscillations at different frequencies, with lower frequencies better suited for transmission across sizable conduction delays (Kopell et al. 2000; von Stein and Sarnthein 2000). Although oscillatory synchronization coordinates neural assemblies (Buzsáki and Draguhn 2004; Schnitzler and Gross 2005), less empirical data describe how oscillatory power in one local area causes power changes in more distant areas (i.e., oscillatory power coupling).

We measured long-range power coupling between brain areas by employing the face perception network as a model system. Face perception has already been linked to oscillatory activity (Rodriguez et al. 1999; Schyns et al. 2011). Moreover, faces, when compared with nonfaces, evoke face-selective domain-specific responses in several distinct, local, and well-replicated temporal lobe areas (Kanwisher and Yovel 2006), where computation appears functionally specialized (Haxby et al. 2000) or even modular (Kanwisher et al. 1997). Thus, the face perception network is an expedient model system for identifying how information is coded in patterns of frequency-specific power coupling by long-range connections between local, functionally specialized areas.

The accuracy of power coupling estimation is dependent on choosing the most likely model connectivity, necessitating a comparison of various connectivity architectures. Model comparison further allows us to use the induced response data to learn about the face perception network by testing hypotheses developed from extant theory and data. Our hypothetical model space was motivated by a dominant theory (Haxby et al. 2000; Calder and Young 2005), which asserts that ventral connections between the occipital face area (OFA) and the fusiform face area (FFA) are specialized for perceiving identity, while dorsal connections between the OFA and the right superior temporal sulcus (STS) are specialized for perceiving emotional expressions and other changeable facial attributes. Consistent with this theory, previous effective connectivity models have shown that face-selective areas indeed communicate as 2 independent pathways, which are connected in a hierarchical and feedforward manner (Fairhall and Ishai 2007). We therefore hypothesized that stimulus inputs enter OFA, which sends feedforward projections to FFA and STS. Because FFA and STS are components of functionally independent pathways, we hypothesized that they would not communicate with each other. Lastly, we hypothesized that only dorsal connections, which supposedly represent changeable facial attributes (Haxby et al. 2000), would show coupling that is modulated by facial expression.

In a sample of 51 normal participants, we measured induced MEG responses at face-selective source locations to their preferred stimuli, faces, and tested formal generative models against the data to estimate power coupling. Using dynamic causal modeling (DCM), we computed both same-frequency and cross-frequency power coupling for directed connections between predefined anatomical locations in source space (Chen et al. 2008). Furthermore, we used Bayesian model comparison to test hypotheses about connectivity and identified the DCM with the most likely model architecture. We demonstrated strong evidence confirming the prevailing theory (Haxby et al. 2000) and previous research (Fairhall and Ishai 2007). The OFA bifurcated into 2 feedforward pathways, which were not coupled together, and facial expressions modulated only the power coupling in the dorsal pathway. In these areas, faces increased power for alpha and theta bands but suppressed beta power. Theta and beta power showed selectivity for fearful expressions. These induced responses were best explained by positive, same-frequency, power coupling for subgamma frequencies. This coupling was combined with negative, cross-frequency coupling, where an area's theta power suppressed beta power in its connected areas. This power coupling pattern reflected both “driving” (linear) and “modulatory” (nonlinear) mechanisms mediating communication among nonlocally connected functional brain areas. These findings are significant because they not only directly confirm several a priori hypotheses about the face-selective network but also add information about the computational mechanisms used by this network and suggest new hypotheses about function in feedforward hierarchical networks more generally.

## Materials and Methods

### Participants

We sampled participant data from a large MEG database which contained about 300 healthy volunteers as well as schizophrenic patients and their siblings, collected between 2005 and 2011 and maintained in the laboratory of Daniel Weinberger at the National Institutes of Health. Written consent for participation was acquired from all participants, in accordance with approval from the National Institutes of Health Institutional Review Board. We selected from this dataset only healthy volunteers with recent data collection (2010 or 2011). These participants were right-handed and had normal or corrected-to-normal vision. History, physical, and neurological examinations from a board-certified physician and a diagnostic interview for psychiatric disorders (SCID) showed that all participants were free from psychiatric and neurological disease. Participants whose data were subject to error or excessive signal artifacts (e.g., missing triggers, noisy or malfunctioning sensors, etc.) were excluded. Because Bayesian model comparison is computationally intensive, our analysis focused on 51 participants, a number which was both computationally tractable and provided sufficient statistical power for detecting significant time/frequency effects and oscillatory coupling.

### Stimuli and Procedures

Stimuli were grayscale static face images taken from the Ekman face database (Ekman and Friesen 1976). The set of faces was half male and half female, and both genders were presented with neutral, fearful, or angry expressions. Each face was presented for 500 ms, followed by an interstimulus interval which varied randomly in length between 1 and 6 s. During the interstimulus interval, participants reported the gender of the face by button press.

### MEG Data Acquisition and Preprocessing

MEG data were collected in a magnetically shielded room at the National Institutes of Health using a 275-channel CTF system with SQUID-based third-order axial gradiometers (VSM MedTech, Ltd, Coquitlam, British Columbia). Data were collected from each participant in a single session at a sampling rate of 600 Hz. Data were analyzed using MATLAB (The MathWorks, Natick, MA) and SPM8 (Wellcome Trust Centre for Neuroimaging, London; http://fil.ion.ucl.ac.uk/spm/). Data were preprocessed following closely the standard pipeline provided in SPM8 and described in Litvak et al. (2011). Continuous data were subjected to a fifth-order Butterworth band-pass filter at 0.5–50 Hz. We extracted epochs of 500-ms poststimulus, which were then artifact-corrected by removing epochs for which the MEG signal exceeded 3000 fT. The artifact-corrected datasets retained 24–36 artifact-free trials per facial expression. Data were downsampled to 110 Hz.
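The epoching and amplitude-based rejection steps can be illustrated with a minimal NumPy sketch. This is not the SPM8 pipeline used in the study: the function names, the synthetic data, and the onset indices are hypothetical, and the band-pass filtering and downsampling steps are omitted.

```python
import numpy as np

def extract_epochs(data, onsets, sfreq=600.0, duration_s=0.5):
    """Cut fixed-length poststimulus epochs from continuous data.

    data   : array (n_channels, n_samples), assumed to be in fT
    onsets : stimulus-onset sample indices (hypothetical values in the demo)
    """
    n = int(round(duration_s * sfreq))  # 500 ms at 600 Hz -> 300 samples
    return np.stack([data[:, s:s + n] for s in onsets if s + n <= data.shape[1]])

def reject_artifacts(epochs, threshold_ft=3000.0):
    """Keep only epochs whose absolute amplitude never exceeds the threshold."""
    peak = np.abs(epochs).max(axis=(1, 2))  # largest excursion per epoch
    keep = peak <= threshold_ft
    return epochs[keep], keep

# Synthetic demonstration: 4 channels, 10 s of noise, one injected artifact.
rng = np.random.default_rng(0)
continuous = rng.normal(0.0, 200.0, size=(4, 6000))
continuous[1, 1250] = 5000.0            # artifact falls inside the second epoch
onsets = [0, 1200, 2400, 3600]
epochs = extract_epochs(continuous, onsets)
clean, keep = reject_artifacts(epochs)
```

In this toy example the second epoch is discarded because one sample exceeds 3000 fT, leaving three clean epochs.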

### DCM Analysis Strategy

DCM provides a state-space generative model of time series data. One goal of DCM is to estimate coupling parameters that describe how the response magnitude for each brain area influences dynamics in the other brain areas. This power coupling can be interpreted as a measure of effective connectivity. The variety of DCM used here models the dynamics of induced time–frequency power measured from predesignated equivalent current dipole source locations. In the model, dynamics are generated when frequency-specific power in origin brain areas positively or negatively influences power changes in the same or different frequencies in connected target areas. Parameter estimation consists of iteratively deriving the pattern of frequency-specific power coupling that most likely generated the measured time–frequency power representations. A critical step for the proper interpretation of these power coupling parameters is the selection of an optimal connectivity structure, which most accurately and parsimoniously predicts the time–frequency data. This step depends on a Bayesian model comparison among competing DCMs, each with different connections among brain areas. In addition to optimizing parameter estimation, the Bayesian model comparison step also affords the formalization and explicit testing of hypotheses about functional architecture. The mathematical and conceptual development of DCM (Friston et al. 2003), its application to MEG data (Kiebel et al. 2009) and induced responses (Chen et al. 2008, 2009, 2012), random effects Bayesian model comparison (Stephan et al. 2009), and model family comparison (Penny et al. 2010) have already been elaborated in detail elsewhere and are fully implemented in SPM8 (Litvak et al. 2011). Our specific application closely followed the standard preprocessing and modeling procedures used in previous studies (Chen et al. 2008, 2009, 2010).

### Induced Responses at Source Locations

The first step in modeling induced responses to facial expressions using DCM was to acquire single-trial, time–frequency power representations separately for each of our sources of interest. MEG time series for each of our predefined sources were extracted by computing the generalized inverse of the lead-field (gain) matrix. Our lead-field matrix used fiducial-coregistered sensor locations and a local-spheres head model as implemented in SPM8 (Litvak et al. 2011). We chose source locations (Fig. 1*a*) based on Montreal Neurological Institute (MNI) coordinates (*x y z*) previously reported (Henson et al. 2003) for the right OFA (42 −81 −15), left OFA (−39 −81 −15), right FFA (42 −45 −27), left FFA (−39 −51 −24), and right STS (51 −66 12). These coordinates are taken from a previously studied face processing network (Chen et al. 2009), which used the same source locations for bilateral OFA and FFA. The STS was defined only for the right hemisphere, as face selectivity is typically observed only in the right STS (Henson et al. 2003).

These predefined coordinates are based on a large literature defining this set of cortical areas as the relevant nodes for face processing (Calder and Young 2005) and follow the approach taken by Chen et al. (2009). We acknowledge that many functional magnetic resonance imaging (fMRI) studies localize areas for DCM using statistical effects obtained in individual participants. This approach is, unfortunately, not practical for MEG. MEG source reconstruction does not provide the same spatial resolution for detecting peak effects in individual participants, nor does source reconstruction fully remove the effects of signal mixing at every location (Schoffelen and Gross 2009). Thus, using a standard set of source coordinates for all participants is a necessary limitation of this study, which we must accept if we are to exploit the temporal resolution of MEG to measure oscillatory power coupling. Fortunately, DCM provides a convenient method for examining sources in a way that is less sensitive to signal mixing (see also Discussion section). DCM uses the power in an origin area together with the coupling parameters to predict the derivative (change over time) for power in a target area. Because a function (i.e., a power time series) and its derivative are orthogonal, the same signal measured at both origin and target locations would result in zero coupling. Thus, non-zero coupling in DCM must arise from separate sources.
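One way to make the orthogonality argument concrete is the following short derivation, offered as an illustrative sketch. The symbols $g(t)$ (a power time series) and $[0, T]$ (the modeled epoch) are notation introduced here for illustration only.

$$
\langle g, \dot{g} \rangle = \int_{0}^{T} g(t)\,\dot{g}(t)\,dt = \frac{1}{2}\int_{0}^{T} \frac{d}{dt}\,g(t)^{2}\,dt = \frac{1}{2}\left[\,g(T)^{2} - g(0)^{2}\,\right]
$$

The inner product vanishes exactly when the power has the same magnitude at the start and end of the interval, and it is small relative to $\int g^{2}\,dt$ whenever the net change in $g^{2}$ is small compared with the fluctuations within the epoch. Under that assumption, a copy of the origin signal appearing at the target through signal mixing contributes little to the prediction of the target's rate of change.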

We computed time–frequency spectra for the frequencies 4–48 Hz and for the times 1- to 500-ms poststimulus using a Morlet wavelet decomposition with a factor of 7. As in previous studies (Chen et al. 2008, 2009, 2010, 2012), a single-trial log-ratio baseline correction specific to each frequency was applied based on the first 18-ms time bin (Litvak et al. 2011). Time–frequency power was then averaged over trials. These are the same wavelet decomposition procedures as in Chen et al. (2009). Prior to DCM, these time–frequency representations were analyzed using conventional mass-univariate 1-sample *t*-tests, testing for significant non-zero responses at individual source locations and for non-zero differences among the different facial expressions. All results reported below were observed at *P* < 0.001 uncorrected and achieved *P* < 0.05 using cluster-level familywise error (FWE) correction based on Gaussian random field theory (Brett et al. 2003).
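The wavelet decomposition and log-ratio baseline correction can be sketched as follows. This is an illustrative NumPy implementation rather than the SPM8 routine used in the study. It assumes that the "factor of 7" denotes the number of wavelet cycles, and the function names and the synthetic 10-Hz test signal are hypothetical.

```python
import numpy as np

def morlet_power(signal, sfreq, freqs, n_cycles=7.0):
    """Time-frequency power of a 1-D signal using complex Morlet wavelets.

    The signal must be at least as long as the longest wavelet.
    """
    power = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)            # SD of the Gaussian envelope (s)
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1.0 / sfreq)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # normalize to unit energy
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

def log_ratio_baseline(power, baseline):
    """Log-ratio correction, computed separately for each frequency (row)."""
    base = power[:, baseline].mean(axis=1, keepdims=True)
    return np.log(power / base)

# Synthetic demonstration: a 10-Hz sinusoid sampled at 110 Hz for 4 s.
sfreq = 110.0
time = np.arange(0, 4, 1.0 / sfreq)
freqs = np.arange(4, 49)                                  # 4-48 Hz in 1-Hz steps
power = morlet_power(np.sin(2 * np.pi * 10 * time), sfreq, freqs)
peak_freq = freqs[np.argmax(power[:, time.size // 2])]
corrected = log_ratio_baseline(power, slice(0, 2))        # about 18 ms at 110 Hz
```

In the demonstration, power at the center of the signal peaks at the 10-Hz wavelet, as expected for a pure 10-Hz input.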

Because DCM for induced responses is intensive computationally, we followed the conventional practice of reducing the dimensionality of the time–frequency representations to 4 modes using singular-value decomposition. Four modes are standard for previous studies (Chen et al. 2008, 2009, 2010, 2012). In our data, the 4 modes explained 98% of the variance, on average over participants. DCM then operates on these modes and, afterward, the resultant couplings between power modes are projected back into frequency space to recover frequency-specific power coupling matrices. Note that, sometimes, this dimensionality reduction can complicate interpretation of Bayesian model comparisons of linear versus nonlinear connections. As described previously (Friston et al. 2003; Chen et al. 2008), cross-frequency power coupling arises from nonlinear neural mechanisms. However, Bayesian model comparison, as implemented in SPM8, tests for nonlinear power coupling between modes (which are linear mixtures of frequencies), rather than testing individual frequency bands per se. Thus, the model evidence can occasionally favor linearity even when nonlinear, cross-frequency power coupling is clearly apparent after projection back to frequency space. As this was the case in our data and the use of more than 4 modes was not computationally tractable, we omit formal Bayesian model comparisons of nonlinearity. We present results from nonlinear DCMs, which place no prior constraints on the form that power coupling can take.
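The dimensionality reduction described above can be sketched with a plain singular-value decomposition. This is an illustrative example on synthetic data, not the SPM8 implementation. The function names are hypothetical, and the back-projection `modes @ a_modes @ modes.T` is an assumed form for mapping mode-space coupling to frequency space.

```python
import numpy as np

def reduce_to_modes(tf, n_modes=4):
    """Reduce a (n_freqs, n_observations) power matrix to its leading frequency modes."""
    u, s, _ = np.linalg.svd(tf, full_matrices=False)
    explained = np.sum(s[:n_modes] ** 2) / np.sum(s ** 2)  # fraction of variance retained
    modes = u[:, :n_modes]                                 # orthonormal frequency modes
    scores = modes.T @ tf                                  # expression of each mode over observations
    return modes, scores, explained

def project_coupling(modes, a_modes):
    """Map a mode-by-mode coupling matrix back to frequency-by-frequency space."""
    return modes @ a_modes @ modes.T

# Synthetic demonstration: 45 frequencies x 55 time bins, approximately rank 4.
rng = np.random.default_rng(1)
tf = rng.normal(size=(45, 4)) @ rng.normal(size=(4, 55)) + 0.01 * rng.normal(size=(45, 55))
modes, scores, explained = reduce_to_modes(tf)
a_freq = project_coupling(modes, np.eye(4))                # hypothetical mode-space coupling
```

Because the modes are linear mixtures of frequencies, a coupling matrix that is diagonal in mode space generally has off-diagonal (cross-frequency) entries after projection, which is the interpretive complication noted above.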

### Bayesian Model Comparison

We used Bayesian model comparison (Stephan et al. 2009) in 51 participants to select an optimal model from a set of models formulated to test various hypotheses about model structure and modulation by facial expressions (see below). Model selection is used to identify optimal models within a family of related or nested models (Kass and Raftery 1995). In contrast to classical statistics, which requires pairwise comparisons between models, model selection methods allow simultaneous comparison of a large set of models.

The optimal DCM was identified using 2 complementary measures: expected posterior and exceedance probabilities. These measures are described in mathematical detail by Stephan et al. (2009), and their routine use in numerous recent studies (Bányai et al. 2011; Cieslik et al. 2011; Dima et al. 2011; Rahnev et al. 2011; Wang et al. 2011; Chen et al. 2012; den Ouden et al. 2012; Müller et al. 2012; Nagy et al. 2012) has shown them to be an accepted method for selecting the best DCM across a group of participants. Expected posterior and exceedance probabilities assume that different models may be best in different participants. The free energies (lower bounds on the model evidence) over participants are used to estimate the parameters of a Dirichlet probability distribution over model space. With these parameters, one can define a multinomial distribution of model probabilities from which expected posterior and exceedance probabilities are computed. Expected posterior probabilities represent the expected likelihood of sampling a participant with a specific model and therefore reflect the proportion of participants that favor a certain model. Note that if all models were equally likely to have generated the group data, the expected posterior probability of each model would equal the reciprocal of the number of models tested, so the maximal expected posterior probability will often be lower when more models are compared. We also report exceedance probabilities, which express the belief that a model has a higher posterior probability than any other model. Importantly, the free-energy bound on model evidence (on which these probabilities are based) incorporates model complexity (number of parameters). Thus, a model that more probably gives rise to a participant's responses can be identified among a set of models differing in the number of parameters. When we refer herein to an “optimal model,” we therefore mean the model that most accurately predicts the data using the most parsimonious number of parameters.
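The relationship between the Dirichlet parameters and the two reported measures can be sketched as follows. The sketch assumes that the Dirichlet parameters have already been estimated from the participants' free energies. The `alpha` values are hypothetical, the function names are illustrative, and exceedance probabilities are approximated here by Monte Carlo sampling.

```python
import numpy as np

def expected_posterior(alpha):
    """Expected model probabilities under a Dirichlet distribution."""
    alpha = np.asarray(alpha, dtype=float)
    return alpha / alpha.sum()

def exceedance_probability(alpha, n_samples=100_000, seed=0):
    """Probability that each model is more frequent than every other model."""
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(alpha, size=n_samples)   # sampled model-probability vectors
    winners = r.argmax(axis=1)                 # most probable model in each sample
    return np.bincount(winners, minlength=len(alpha)) / n_samples

alpha = [12.0, 3.0, 2.0, 1.0]                  # hypothetical Dirichlet parameters for 4 models
ep = expected_posterior(alpha)
xp = exceedance_probability(alpha)

# Chance level for expected posterior probabilities with 80 equally likely models.
chance_80 = expected_posterior(np.ones(80))[0]  # 1/80 = 0.0125
```

With 80 models, the chance-level expected posterior probability is 1/80 = 0.0125, which is why even a clearly winning model can have an expected posterior probability well below 1.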

All DCMs modeled responses recorded between 0- and 500-ms poststimulus. This epoch was chosen because it captured the entire stimulus duration, was free from stimulus offset responses which might occur after 500 ms, and was most likely to capture visual processing. All models also included exogenous inputs to bilateral OFA, which modeled effects of stimulus presentation. The dynamics of these inputs were adjusted during model fitting, but employed a Gaussian prior distribution over all frequencies with its mean at 60-ms poststimulus and a dispersion of 18 ms.

Although we compared numerous models, we did not use a purely data-driven exploration of all possible connections between areas. Our model comparison space was constrained to test specific hypotheses suggested by prior theory (Haxby et al. 2000; Calder and Young 2005) and previous research (Rossion et al. 2003; Fairhall and Ishai 2007). This body of work (see Introduction section) suggests that lower levels of the visual system provide input to OFA, which then projects to 2 functionally independent dorsal and ventral pathways, with the dorsal pathway (involving STS) specialized for changeable attributes of faces (e.g., expressions). Prior work (Fairhall and Ishai 2007) further suggests that these pathways are feedforward. We also tested the hypothesis that stimulus input can circumvent the OFA and directly affect FFA and/or STS (Rossion et al. 2003).

We put these hypotheses to the test in 2 model comparison stages. For the first stage, we tested “structural models,” where we identified a connectivity structure between bilateral OFA, bilateral FFA, and right STS which optimally explained induced responses to faces but did not model whether any power coupling was also modulated by expression (i.e., there were no bilinear terms). We assumed fixed forward connections from OFA to FFA, lateral interhemispheric connections between the OFAs and FFAs in each hemisphere, and intrinsic self-connections. However, we varied the other connectivity properties of our models according to our a priori hypotheses (Fig. 1*b*, Tables 1–3). We tested whether the face perception network was hierarchical, using models that had stimulus inputs to bilateral FFA and/or right STS, in addition to bilateral OFA. We also tested whether the best model operated in a feedforward manner or whether additional backward connections improved prediction of the induced responses. Lastly, we tested whether the dorsal (STS) and ventral (FFA) pathways were coupled or operated independently. To systematically test these hypotheses, we manipulated connections in 3 ways (see below), which may be likened to structural manipulation “factors” in a fully crossed 4 × 5 × 4 factorial design, yielding 80 individual DCMs (Tables 1 and 2).

Table 1. Expected posterior probabilities (columns: between-pathway connections)

| Ventral pathway | Dorsal pathway | None | To STS | To FFA | Both |
|---|---|---|---|---|---|
| Ventral forward | Dorsal forward | **0.224** | 0.009 | 0.041 | 0.008 |
| | Dorsal bidirectional | 0.012 | 0.008 | 0.009 | 0.010 |
| | Dorsal only STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal forward + STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal bidirectional + STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| Ventral bidirectional | Dorsal forward | 0.010 | 0.008 | 0.078 | 0.059 |
| | Dorsal bidirectional | 0.009 | 0.008 | 0.008 | 0.008 |
| | Dorsal only STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal forward + STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal bidirectional + STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| Ventral forward + FFA input | Dorsal forward | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal bidirectional | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal only STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal forward + STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal bidirectional + STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| Ventral bidirectional + FFA input | Dorsal forward | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal bidirectional | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal only STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal forward + STS input | 0.008 | 0.008 | 0.008 | 0.008 |
| | Dorsal bidirectional + STS input | 0.008 | 0.008 | 0.008 | 0.008 |

Bold text indicates the optimal model.

Table 2. Exceedance probabilities (columns: between-pathway connections)

| Ventral pathway | Dorsal pathway | None | To STS | To FFA | Both |
|---|---|---|---|---|---|
| Ventral forward | Dorsal forward | **0.980** | 0 | 0 | 0 |
| | Dorsal bidirectional | 0 | 0 | 0 | 0 |
| | Dorsal only STS input | 0 | 0 | 0 | 0 |
| | Dorsal forward + STS input | 0 | 0 | 0 | 0 |
| | Dorsal bidirectional + STS input | 0 | 0 | 0 | 0 |
| Ventral bidirectional | Dorsal forward | 0 | 0 | 0.016 | 0.004 |
| | Dorsal bidirectional | 0 | 0 | 0 | 0 |
| | Dorsal only STS input | 0 | 0 | 0 | 0 |
| | Dorsal forward + STS input | 0 | 0 | 0 | 0 |
| | Dorsal bidirectional + STS input | 0 | 0 | 0 | 0 |
| Ventral forward + FFA input | Dorsal forward | 0 | 0 | 0 | 0 |
| | Dorsal bidirectional | 0 | 0 | 0 | 0 |
| | Dorsal only STS input | 0 | 0 | 0 | 0 |
| | Dorsal forward + STS input | 0 | 0 | 0 | 0 |
| | Dorsal bidirectional + STS input | 0 | 0 | 0 | 0 |
| Ventral bidirectional + FFA input | Dorsal forward | 0 | 0 | 0 | 0 |
| | Dorsal bidirectional | 0 | 0 | 0 | 0 |
| | Dorsal only STS input | 0 | 0 | 0 | 0 |
| | Dorsal forward + STS input | 0 | 0 | 0 | 0 |
| | Dorsal bidirectional + STS input | 0 | 0 | 0 | 0 |

Bold text indicates the optimal model.

Table 3. Expected posterior and exceedance probabilities for model families

| Family type | Family | Expected | Exceedance |
|---|---|---|---|
| Ventral pathway families | Ventral forward | **0.650** | **0.983** |
| | Ventral bidirectional | 0.314 | 0.017 |
| | Ventral forward + FFA input | 0.018 | 0 |
| | Ventral bidirectional + FFA input | 0.018 | 0 |
| Dorsal pathway families | Dorsal forward | **0.927** | **1** |
| | Dorsal bidirectional | 0.020 | 0 |
| | Dorsal only STS input | 0.018 | 0 |
| | Dorsal forward + STS input | 0.018 | 0 |
| | Dorsal bidirectional + STS input | 0.018 | 0 |
| Between-pathway families | None | **0.570** | **0.884** |
| | To STS | 0.313 | 0.112 |
| | To FFA | 0.098 | 0.005 |
| | Both | 0.019 | 0 |

Bold text indicates the optimal model.

For the first structural manipulation, we varied the connectivity of the ventral pathway. The OFA and FFA could be connected using only forward connections and no stimulus input to FFA (ventral forward), both forward and backward connections and no FFA stimulus input (ventral bidirectional), only forward connections plus an additional stimulus input to FFA (ventral forward + FFA input) or forward and backward connections plus an additional stimulus input to FFA (ventral bidirectional + FFA input). The 4 levels of the ventral pathway manipulation therefore tested both whether the ventral pathway was hierarchical (i.e., stimulus input was to only the OFA or to higher level areas as well) and whether or not it was feedforward.

For the second structural manipulation, we varied dorsal pathway connectivity by testing whether the OFA and STS were connected using only forward connections and no STS stimulus input (dorsal forward), both forward and backward connections and no STS stimulus input (dorsal bidirectional), no connection between OFA and STS but instead a direct stimulus input to STS (dorsal only STS input), forward connections plus an STS stimulus input (dorsal forward + STS input), or both forward and backward connections plus an STS stimulus input (dorsal bidirectional + STS input). The STS input models considered the possibility that the dorsal and ventral pathways may divide at a visual area earlier than OFA (dorsal only STS input) or that a separate visual pathway may otherwise project to STS (Rossion et al. 2003). The 5 levels of the dorsal pathway manipulation also tested whether the dorsal pathway was hierarchical and whether or not it was feedforward.

For the third structural manipulation, we varied the between-pathway connectivity. These models could have no connection between FFA and STS (none), a unidirectional connection from FFA to STS (to STS), a unidirectional connection from STS to FFA (to FFA) or bidirectional connections (both). The 4 levels of this manipulation explored the extent to which the 2 pathways were either coupled to each other or were “functionally independent” (Haxby et al. 2000; Calder and Young 2005).
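The fully crossed model space defined by these 3 manipulations can be enumerated with a short sketch. The level names are taken from the text. The data structures and variable names are illustrative assumptions and do not correspond to SPM8 model specifications.

```python
from itertools import product

# Levels of the 3 structural manipulations, as named in the text.
VENTRAL = [
    "ventral forward",
    "ventral bidirectional",
    "ventral forward + FFA input",
    "ventral bidirectional + FFA input",
]
DORSAL = [
    "dorsal forward",
    "dorsal bidirectional",
    "dorsal only STS input",
    "dorsal forward + STS input",
    "dorsal bidirectional + STS input",
]
BETWEEN = ["none", "to STS", "to FFA", "both"]
FACTORS = {"ventral": VENTRAL, "dorsal": DORSAL, "between": BETWEEN}

# Fully crossed 4 x 5 x 4 design: one dictionary per structural model.
models = [
    {"ventral": v, "dorsal": d, "between": b}
    for v, d, b in product(VENTRAL, DORSAL, BETWEEN)
]

# Partition the model space into families, one family per level of each factor.
families = {
    factor: {
        level: [i for i, m in enumerate(models) if m[factor] == level]
        for level in levels
    }
    for factor, levels in FACTORS.items()
}

print(len(models))  # 80
```

Each ventral and between-pathway family contains 20 models, and each dorsal family contains 16, so every family comparison partitions the same 80 models.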

Bayesian model comparison then proceeded by first performing an individual model comparison between all 80 structural models and identifying the best model based on expected posterior and exceedance probabilities. We then further confirmed the properties of the best structural model by performing comparisons of model “families.” That is, we partitioned the model space into subspaces (Penny et al. 2010) according to the levels of each of our 3 structural manipulations (Table 3). We performed 3 model family comparisons, each comparing the levels of one of the 3 manipulations. The best model families were identified based on expected posterior and exceedance probabilities.

Based on these criteria, we selected the structural pattern of OFA, FFA, and STS connectivity that maximized model evidence. This structural pattern was then used at the second stage of Bayesian model comparison. This second stage tested the prediction of Haxby et al. (2000) that the dorsal pathway is specialized for representation of changeable facial attributes (e.g., emotional expressions). Because the induced responses measured at the predefined sources all showed robust selectivity for fearful expressions (see Results section), we examined which connections might be modulated by fearful versus neutral and angry expressions. We examined 3 bilinear models where fearful expressions modulated power coupling for 1) connections between OFA and FFA and their associated lateral and self-connections (ventral pathway), 2) connections between OFA and STS (dorsal pathway) and associated lateral and self-connections, and 3) all connections (both pathways). The best model from this second stage was assessed using expected posterior and exceedance probabilities.

### Parameter Estimation

The parameters from the optimal bilinear model consisted of frequency-specific input parameters (*C*) as well as matrices for each connection representing the magnitude of power coupling from each frequency in the origin area to each frequency in the target area (*A* and *B*). The *A* parameters describe endogenous, between-area, power coupling in response to all faces. The *B* or bilinear parameters show the magnitude to which fearful expressions modulated endogenous power coupling. As with previous applications of DCM for induced responses (Chen et al. 2010, 2012), we increased our signal-to-noise ratio for statistical testing and accounted for interparticipant variability by smoothing the *A* and *B* parameter matrices using a 6-Hz full-width half-maximum Gaussian kernel.
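As an illustration of the smoothing step, the following Python sketch converts a full-width half-maximum (FWHM) to a Gaussian standard deviation and applies separable smoothing to a frequency-by-frequency matrix. It is a minimal sketch and not the study's implementation: the 1-Hz bin width, the truncation of the kernel at 3 standard deviations, and the renormalization of the kernel at the matrix edges are all assumptions made here for illustration.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full-width at half-maximum to a standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel(fwhm_hz, bin_width_hz=1.0, truncate=3.0):
    """Normalized 1-D Gaussian kernel sampled on the frequency grid.

    bin_width_hz and truncate are assumed values, not taken from the study.
    """
    sigma_bins = fwhm_to_sigma(fwhm_hz) / bin_width_hz
    radius = int(math.ceil(truncate * sigma_bins))
    weights = [math.exp(-0.5 * (k / sigma_bins) ** 2)
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Convolve a sequence with the kernel, renormalizing at the edges."""
    radius = len(kernel) // 2
    smoothed = []
    for i in range(len(values)):
        acc, norm = 0.0, 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(values):
                acc += w * values[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


def smooth_matrix(matrix, kernel):
    """Separable 2-D smoothing: smooth each row, then each column."""
    rows = [smooth_1d(row, kernel) for row in matrix]
    cols = [smooth_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    kernel = gaussian_kernel(6.0)
    print("sigma in Hz:", round(fwhm_to_sigma(6.0), 3))
    print("kernel length in bins:", len(kernel))
```

A 6-Hz FWHM corresponds to a standard deviation of roughly 2.5 Hz, so each smoothed coupling value pools information from neighboring frequencies within a few hertz.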

We report statistical parametric maps for analyses on power coupling matrices that were averaged for different connection types of interest. The large number of connections included in our best model complicated reporting of each connection individually, so here we report tests using averages over the connection types that conformed to our a priori hypotheses about connectivity modulation. These included forward and backward connections in either the dorsal or ventral pathway as well as OFA lateral, FFA lateral, and self-connections. In practice, we observed little difference in power coupling for different connections within a connection type, including connections in the right and left hemispheres. The statistical parametric maps were based on mass-univariate 1-sample *t*-tests testing the null hypothesis that the parameters did not differ from zero. All reported results from these *t*-tests were observed at *P* < 0.001 uncorrected and achieved *P* < 0.05 using cluster-level FWE correction based on Gaussian random field theory.
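The mass-univariate step can be sketched as follows. This Python example computes a 1-sample *t*-statistic against zero for every element of a set of per-participant coupling matrices. It is an illustrative sketch only: the function names and the nested-list data layout are assumptions, and the cluster-level FWE correction based on Gaussian random field theory is not implemented here.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, null_value=0.0):
    """1-sample t-statistic for the null hypothesis mean == null_value."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)
    return (mean(values) - null_value) / standard_error


def t_map(matrices):
    """Element-wise t-statistics across participants.

    matrices: list (one entry per participant) of equally sized 2-D lists,
    e.g. target-frequency by origin-frequency coupling parameters.
    """
    n_rows = len(matrices[0])
    n_cols = len(matrices[0][0])
    return [
        [one_sample_t([m[r][c] for m in matrices]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


if __name__ == "__main__":
    # Three hypothetical participants, each with a 1 x 2 parameter matrix.
    toy = [[[1.0, -1.0]], [[2.0, -2.0]], [[3.0, -3.0]]]
    print(t_map(toy))
```

Each *t*-statistic has n − 1 degrees of freedom, where n is the number of participants entering the test.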

## Results

### Time–Frequency Representations

The time–frequency representations measured at the 5 predefined anatomical source locations, averaged over conditions, are shown in Figure 2. We observed an increase in 6–12 Hz alpha power between 50 and 200 ms, which we expected, as the major evoked response components M100 and M170 are known to occur during this time period (Liu et al. 2002). Following this transient response, a more sustained 4–8 Hz theta power increase appeared around 200 ms, peaked between 200 and 300 ms, and remained to some degree throughout the remainder of the epoch. During this period, we also observed a substantial power decrease which arose after 200 ms and was sustained throughout the remainder of the epoch. This power decrease spanned the alpha, beta, and low gamma bands, 8–35 Hz, and showed a prominent peak in the mid-beta range around 20 Hz. This pattern was observed at all 5 sources with no statistical differences among sources. The 5 sources together showed significant differences between fearful facial expressions compared with the average of neutral and angry expressions (Fig. 3*a*). This difference between fearful and nonfearful expressions was apparent for the theta (Fig. 3*b*) and beta (Fig. 3*c*) power components described above.

### Bayesian Model Comparisons

The time–frequency spectra for the 5 sources were modeled to assess the connectivity structure (Fig. 1) that most likely generated the observed induced responses (Fig. 2). We first directly compared all 80 structural models. Note that if all models were equally likely, chance would predict expected posterior probabilities of 1/80 = 0.0125 and therefore probabilities larger than this indicate evidence in favor of a particular model. Both expected posterior and exceedance probabilities (Tables 1 and 2) disclosed a clearly optimal model (Fig. 4), where both dorsal (OFA to STS) and ventral (OFA to FFA) pathways had only forward connections, and there were no connections between FFA and STS. This model yielded an expected posterior probability of 0.224 and an exceedance probability of 0.98. By comparison, the next best model had an expected posterior probability of only 0.078 and an exceedance probability of 0.016.

We also performed 3 model family comparisons, which allowed us to verify the structural properties of the best model by collectively evaluating the evidence for numerous models which share structural properties (see Materials and Methods section). Indeed, our model family comparisons further supported a feedforward network with no power coupling between the dorsal and ventral pathways (Fig. 4 and Table 3). The first model family comparison compared the 4 “ventral pathway” families, each family containing 20 models. Here, expected posterior probabilities predicted by chance would be 1/4 = 0.25. We found that the family of models with only ventral forward connections and no FFA stimulus input (expected posterior: 0.650; exceedance: 0.983) outperformed families with ventral bidirectional connections and no FFA stimulus input (expected posterior: 0.314; exceedance: 0.017), ventral forward connections plus an FFA stimulus input (expected posterior: 0.018; exceedance: 0) and ventral bidirectional connections plus an FFA stimulus input (expected posterior: 0.018; exceedance: 0). The second model family comparison compared the 5 “dorsal pathway” families, each family containing 16 models. Here, expected posterior probabilities predicted by chance would be 1/5 = 0.20. We found that the family of models with only dorsal forward connections and no STS stimulus input (expected posterior: 0.927; exceedance: 1) outperformed families with bidirectional connections and no STS stimulus input (expected posterior: 0.020; exceedance: 0), no forward or backward connections but an STS stimulus input (expected posterior: 0.018; exceedance: 0), forward connections plus an STS stimulus input (expected posterior: 0.018; exceedance: 0) and bidirectional connections plus an STS stimulus input (expected posterior: 0.018; exceedance: 0). The third model family comparison compared the 4 “between-pathway” families, each family containing 20 models. Here, expected posterior probabilities predicted by chance would be 1/4 = 0.25. We found that the family of models with no connections between FFA and STS (expected posterior: 0.570; exceedance: 0.884) outperformed families where there were unidirectional connections from FFA to STS, unidirectional connections from STS to FFA and bidirectional connections between FFA and STS.

Last, we tested which connections showed power coupling that was modulated by fearful expressions by adding bilinear parameters (*B*), which modeled changes in power coupling as a function of stimulus type (Friston et al. 2003). Here, expected posterior probabilities predicted by chance would be 1/3 = 0.33. We found (Fig. 4) that a model with bilinear modulation on only the dorsal pathway connections (expected posterior: 0.910; exceedance: 1) outperformed models where only ventral pathway connections were modulated (expected posterior: 0.069; exceedance: 0) or where all connections were modulated (expected posterior: 0.021; exceedance: 0). Overall, our model comparison clearly supports a model (Fig. 4) with 2 pathways that are feedforward, hierarchical and not coupled together, with fearful expressions modulating only dorsal pathway power coupling.

### Parameter Estimation

Once this optimal model was established, we estimated its parameters. Figure 5*a* shows the *C* parameters: the spectrum of OFA responses driven by the exogenous stimulus inputs. These differ from the induced responses (Fig. 2). They are estimated in the context of the model, which estimates the stimulus-driven exogenous response while accounting for the endogenous influence on OFA of its self-connections as well as of its lateral interhemispheric connections. These *C* parameters showed that presentation of facial stimuli induced positive oscillatory power changes in OFA that peaked in the alpha power range and also induced appreciable increases in theta and beta power. While the input parameters appear to drive the increases in alpha and theta power that we observed in the time–frequency representations (Fig. 2), they cannot also explain the late sustained beta power decrease. This decreased beta power must therefore arise from endogenous interactions among sources, such as the lateral connections between the right and left OFA and/or OFA self-connections.

This conclusion can be more directly assessed by examining the *A* parameters, which describe the endogenous influence that the frequency-specific power in each origin area exerts on frequency-specific dynamics in one of its target areas. Forward connections (Fig. 5*b*,*c*) and lateral connections (Fig. 5*d*,*e*) showed a common pattern. Subgamma power (theta, alpha, and beta bands) in origin areas increased power in the corresponding frequencies in their target areas. This positive, same-frequency power coupling was more prominent for forward connections than for lateral connections. These connections also showed negative, cross-frequency power coupling. Here, theta power in origin areas suppressed beta power in their targets. Overall, therefore, origin beta and theta power had opposing effects on target beta power: origin beta power enhanced it, whereas origin theta power suppressed it.

Figure 5*f* shows *A* power coupling parameters for self (intrinsic) connections, averaged over all 5 brain areas. These self-connections showed highly negative, on-diagonal, same-frequency power coupling. This arose because, as with other forms of DCM (Friston et al. 2003), estimation of self-A connections depends on a prior that favors coupling where self-inhibition for each frequency forces responses back to a stable fixed point, following input perturbation.

Figure 5*g*–*i* shows the pattern of bilinear modulation of power coupling induced by fearful facial expressions. These B matrices showed subgamma, positive, same-frequency power coupling similar to that shown by the *A* parameters for forward and lateral connections (Fig. 5*b*–*e*). Although bilinear modulation of the cross-frequency power coupling showed some numerically negative values, they failed to reach statistical significance at *P* < 0.001 uncorrected. Self-connections also showed a similar pattern of bilinear modulation, including positive, same-frequency power coupling. Self-connections further showed a significant bilinear suppression of beta power by theta power.

In summary, all endogenous (A) power coupling to faces as well as dorsal pathway bilinear modulation by fearful expressions (B) were characterized by same-frequency interactions between subgamma frequencies. However, induced beta power further depended on endogenous (A) cross-frequency power coupling, where theta power in origin areas suppressed beta power in their targets.
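To make this pattern concrete, the following toy Python sketch simulates the power change in one target area that receives a forward connection from one origin area, using only 2 frequency bands (theta and beta). All numerical values are invented for illustration and are not estimates from this study. The linear dynamics, the self-inhibition of −1 per band, the constant origin power, and the way a binary fearful-face indicator scales the bilinear term are simplifying assumptions rather than the fitted model.

```python
BANDS = ["theta", "beta"]

# Rows index the target band, columns index the origin band (invented values).
# A: same-frequency coupling is positive; theta -> beta coupling is negative.
A = [[0.5, 0.0],
     [-0.6, 0.4]]
# B: fearful faces add positive same-frequency coupling only.
B = [[0.2, 0.0],
     [0.0, 0.1]]


def drive(origin_power, fearful):
    """Input to each target band: (A + u * B) applied to origin power."""
    u = 1.0 if fearful else 0.0
    return [
        sum((a + u * b) * g for a, b, g in zip(row_a, row_b, origin_power))
        for row_a, row_b in zip(A, B)
    ]


def simulate_target(origin_power, fearful, dt=0.01, n_steps=2000):
    """Euler integration of dg/dt = -g + drive, starting from zero."""
    target = [0.0 for _ in BANDS]
    inputs = drive(origin_power, fearful)
    for _ in range(n_steps):
        target = [g + dt * (-g + x) for g, x in zip(target, inputs)]
    return target


if __name__ == "__main__":
    origin = [1.0, 0.5]  # hypothetical theta and beta power changes in the origin
    for fearful in (False, True):
        result = simulate_target(origin, fearful)
        print("fearful" if fearful else "nonfearful",
              dict(zip(BANDS, (round(v, 3) for v in result))))
```

In this toy setting, target theta power settles at a positive value while target beta power settles at a negative value, because the suppressive influence of origin theta outweighs the same-frequency influence of origin beta.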

## Discussion

We identified the optimal DCM to explain MEG induced responses to faces in face-selective locations in source space (Figs 1–3). The best model from the 51 participants (Fig. 4) was hierarchical and feedforward, replicating Fairhall and Ishai (2007), and also conformed in detail to the prevailing hypothesis about the functional neuroanatomy of face perception (Haxby et al. 2000). Stimulus inputs entered only OFA and not STS or FFA. OFA showed dorsal forward power coupling with STS and ventral forward power coupling with FFA. These connections described 2 functionally distinct pathways, as there was no connectivity between FFA and STS and only power coupling in the dorsal pathway was modulated by fearful facial expressions. Faces induced power changes in OFA, FFA, and STS (Fig. 2), including a transient increase in alpha (8–12 Hz) around 50–150 ms, followed later by an increase in theta (4–8 Hz), juxtaposed with a sustained decrease between 8 and 35 Hz. Fearful expressions enhanced the theta and beta components in all areas (Fig. 3). We then computed the pattern of power coupling most likely to generate these induced responses using the winning model from the Bayesian model comparison. This revealed a combination of same- and cross-frequency coupling for power at low frequencies (4–30 Hz). In the dorsal pathway, the same-frequency power coupling was also modulated by fearful expressions. For cross-frequency power coupling, theta power in origin areas suppressed beta power in their target areas (Fig. 5). This positive same-frequency power coupling, together with cross-frequency suppression of beta power, may play an important role in long-range communication between distributed, hierarchical, and functionally defined visual areas.

### Same- and Cross-Frequency Coupling

Our face-selective areas showed both induced responses and power coupling primarily for subgamma frequencies (<30 Hz). Emphasis of lower frequency ranges was expected, as they can carry information over long-range connections that involve sizable conduction delays (Kopell et al. 2000; von Stein and Sarnthein 2000). Particularly, a late, sustained decrease in alpha and beta power (Klopp et al. 1999) was induced by faces (Fig. 2), which has been observed for many stimuli and has been linked to memory encoding (Hanslmayr et al. 2012). Here, we show that this decrease can arise when the power of slow theta rhythms enhances itself in connected areas while suppressing the power of faster frequencies (beta). A target area's beta power therefore depended both on a negative influence by theta power and a positive influence by beta power in origin areas (Fig. 5). This combination of same- and cross-frequency power coupling resulted in a net decrease in induced beta power (Fig. 2).

This combination of same- and cross-frequency power coupling, respectively, reflected linear and nonlinear mechanisms (Friston 2000; Chen et al. 2008, 2009). We refer to these as linear and nonlinear because linear systems are not capable of cross-frequency power modulation (Papoulis 1991). Any modulation of one frequency by another must be the result of a nonlinear interaction (Nikias and Petropulu 1993). Intuitively, linear coupling might be considered the transmission of driving signals, which pass an activity pattern from the origin to the target without changing its content, while nonlinear coupling might be considered the transmission of modulatory signals, where activity in an origin area is nonlinearly transformed in the target, thereby changing the information content (Sherman and Guillery 1998). We observed this combined same- and cross-frequency power coupling for all connections, whether they were in the dorsal or ventral pathways, or were forward or interhemispheric. This suggests that, despite the putative functional specialization of the dorsal and ventral pathways, they nevertheless employ common computational mechanisms. Indeed, computational models reinforce the necessity of combining linear and nonlinear mechanisms in hierarchical feedforward networks (Riesenhuber and Poggio 1999), an organization that typifies the visual system (Felleman and van Essen 1991).

We show that a feedforward visual network in the human can manifest cross-frequency spectral power coupling. Further study is needed to test whether the same- and cross-frequency coupling we observed is specific to feedforward hierarchical information transfer. It is unclear whether this pattern would be observed in other systems. There are few studies examining power coupling, but most show a variety of different cross-frequency coupling patterns. In a previous study which modeled the causes of heightened responses to faces compared with nonfaces, only backward connections showed suppressive cross-frequency coupling (Chen et al. 2009). DCM studies of the motor system show even more varied patterns of cross-frequency power coupling (Chen et al. 2010, 2012). Also, our results differ from a few studies using amplitude envelope correlations of intracranial recordings in the human (Bruns et al. 2000; Bruns and Eckhorn 2004).

Clearly, directed coupling measurement of the power in different areas provides a fruitful approach. However, the sparse existing data suggest a variety of different results. Other studies of oscillatory interactions are less easy to compare directly, as many employed nondirected, correlational, and descriptive “functional connectivity” measures (e.g., Gross et al. 2001) or did not measure signals from local areas in source space (Breakspear and Terry 2002; Kotini and Anninos 2002). In contrast, DCM provides a mechanistic generative model to explain induced responses at specified anatomical source locations in terms of directed, “effective connectivity” (Chen et al. 2008). Moreover, the vast majority of studies that have examined cross-frequency interactions have focused on phase (Varela et al. 2001; Breakspear 2002; Jensen and Colgin 2007), while we measured effective connectivity for the power spectrum. This measure reflects any perturbation that the power in one area may exert on power dynamics in connected areas (Schiff et al. 1996) and so is sensitive to many other complex dynamical interactions, including “asynchronous coding” (Friston 2000).

### Two Hierarchical, Feedforward, and Specialized Pathways

We computed our power coupling estimates using the optimal DCM, based on Bayesian model comparison. In addition to improving the accuracy of our parameter estimation, this allowed us to verify prior theory and findings pertaining to connectivity among face-selective areas. Indeed, we incorporated several of the major hypotheses in face perception into the development of our model space. And, in fact, our optimal model substantiated in detail the predictions of one of the major theories of face perception (Haxby et al. 2000). This theory, inspired in part by an influential psychological theory (Bruce and Young 1986), proposes anatomically separated ventral and dorsal temporal lobe pathways, which are respectively specialized for perceiving identities versus changeable facial attributes (e.g., expressions). These pathways hypothetically bifurcate at the level of OFA with ventral connections projecting to the FFA, and dorsal connections projecting to the STS. Correspondingly, in our optimal model, the OFA sent separate forward projections to STS and FFA, while STS and FFA were not themselves coupled. The absence of communication between the pathways is consistent with “functional independence” (Bruce and Young 1986) and replicates a prior DCM study using fMRI (Fairhall and Ishai 2007).

It may appear surprising that the face perception network is dominated by hierarchical interactions, with none of the additional inputs to FFA or STS that Rossion et al. (2003) predicted from human lesion data. Alternate input pathways to FFA and STS might exist, but they did not detectably influence induced responses to faces in our task. It may also appear surprising that these pathways are feedforward. It is clear that backward connections exist in the visual system and undoubtedly they are used. For example, Chen et al. (2009) observed cross-frequency backward power coupling from FFA to OFA. We note, however, that Chen et al. (2009) examined the origins of face-selective responses, while we did not have access to data for nonfaces and did not aim to model how face selectivity arises. Moreover, Chen et al. (2009) did not include feedforward models or the STS in their model comparison and so their study never considered our winning model. However, another study (Fairhall and Ishai 2007) tested similar models to ours and obtained a similar finding: 2 independent pathways which are feedforward. We therefore replicate the finding that the 2 face perception pathways can (sometimes) operate in a “feedforward mode.” One likely possibility is that feedback connections modulate responses in a task-specific way, whereas feedforward processing dominates in simple visual processing. Computational modeling suggests that a feedforward mode is useful for rapid visual categorizations (Serre et al. 2007), and this may be the case for the face perception system. We suggest that DCM of the face perception system provides a fruitful approach for studying when backward connections influence brain-imaging measures.

We further substantiated the claim that the dorsal pathway shows greater sensitivity to information about facial expressions (Haxby et al. 2000), as areas associated with the dorsal pathway showed power coupling modulated by fearful facial expressions. Others have suggested that cortical sensitivity to facial expression depends on areas in the “extended” system (Haxby et al. 2000), especially the amygdala (Vuilleumier et al. 2003). In a separate analysis, we used Bayesian model comparison to explore possible connectivity between our optimal model and bilateral amygdala and found that the STS connected to amygdala via forward projections (data not shown), also confirming Haxby et al. (2000). Unfortunately, amygdala measurement is problematic for MEG (Attal et al. 2012) and so we focused our primary analysis on more reliably measured cortical visual areas. Nevertheless, all our analyses provided strong corroboration of the predominant theory of the neuroanatomical face perception network (Haxby et al. 2000).

### Limitations and Directions

This approach allowed us to learn much about information transmission and connectivity structure in the face perception network. Our results pose several new questions that we cannot address in the current study, due to methodological limitations. First, what is the functional significance of theta and beta as the constituents of cross-frequency coupling? These particular frequencies may arise due to neural mechanisms within the cortical microcircuit, such as the balance of excitation and inhibition, which are difficult to test using our method. This issue can be addressed using animal neurophysiology as well as models such as DCM for steady state (Moran et al. 2011), which parameterizes biophysically realistic quantities that are estimable from human MEG data. Second, the induced responses we measured at predefined locations may reflect a mixing of signals from multiple locations. Thus, it is difficult, for example, to claim conclusively that OFA, FFA and STS in fact each show fear-selectivity, as this effect may arise from one source, expressed over a wide area. fMRI studies would be more suitable for such precise localization. Fortunately, because DCM uses power to predict changes in power, coupling cannot arise when the same signal influences 2 source locations (see also Materials and Methods section). Thus, our claim that there is bilinear fearful expression modulation only in the dorsal pathway is on firmer ground. Third, one of our aims was to validate a modeling approach for translational study of pathological long-range communication between functional areas. Particularly, the “disconnection disorder” schizophrenia is typified by abnormal neural oscillations (Uhlhaas and Singer 2010), which may be explained mechanistically using DCM.

## Conclusions

The face perception network comprises distinct, domain-specific (or modular) functional areas, which we show to have a hierarchical feedforward organization. Examination of this model network successfully demonstrated cross-frequency spectral information propagation, where a local functional area's theta power enhanced power for lower frequencies (theta) in its target areas while suppressing power for higher frequencies (beta). This pattern of activating and suppressive power coupling can explain induced responses that show sustained theta power increases and beta power decreases. And these interactions can therefore transmit and modulate information across long-range cortical connections in hierarchical feedforward networks. These findings are significant, as they provide strong evidence favoring an a priori theory of functional organization in the face perception network and reveal computational mechanisms that appear to be common throughout the network. This common pattern of oscillatory information transmission may reflect computations performed in other parts of cortex as well. Consequently, with future study, such mechanisms may prove significant for understanding basic cortical functions such as vision as well as pathological conditions.

## Funding

This work was supported by funding from the NIH Intramural Research Program to N.F., R.C., B.A., and D.W. and funding from the United Kingdom Economic and Social Research Council [RES-062-23-2925] to N.F.

## Notes

Thanks to Vladimir Litvak and Karl Friston for their analysis advice.

*Conflict of Interest*: None declared.