Abstract

Magnetoencephalography studies in humans have shown word-selective activity in the left inferior frontal gyrus (IFG) approximately 130 ms after word presentation (Pammer et al. 2004; Cornelissen et al. 2009; Wheat et al. 2010). The role of this early frontal response is currently not known. We tested the hypothesis that the IFG provides top-down constraints on word recognition using dynamic causal modeling of magnetoencephalography data collected while subjects viewed written words and false font stimuli. Subject-specific dipoles in left and right occipital, ventral occipitotemporal and frontal cortices were identified using Variational Bayesian Equivalent Current Dipole source reconstruction. A connectivity analysis tested how words and false font stimuli differentially modulated activity between these regions within the first 300 ms after stimulus presentation. We found that left inferior frontal activity showed stronger sensitivity to words than false font and a stronger feedback connection onto the left ventral occipitotemporal cortex (vOT) in the first 200 ms. Subsequently, the effect of words relative to false font was observed on feedforward connections from left occipital to ventral occipitotemporal and frontal regions. These findings demonstrate that left inferior frontal activity modulates vOT in the early stages of word processing and provide a mechanistic account of top-down effects during word recognition.

Introduction

Two prominent theories of word recognition, the local combination detector model (LCD; Dehaene et al. 2005; Dehaene and Cohen 2011) and the interactive account (IA; Price and Devlin 2003; Price and Devlin 2011), critically differ on the question of whether word processing in the left ventral occipitotemporal cortex (vOT) is supported by rapid, automatic feedback from higher-order language areas. The LCD model proposes that word recognition is achieved by a hierarchy of processing stages that occur in a strictly feedforward manner. By comparison, the IA model suggests that feedforward and feedback processing occurs interactively, with higher-order representations of familiar words facilitating word recognition in the left vOT cortex. A number of functional magnetic resonance imaging studies have demonstrated that activity in the left vOT cortex is sensitive to manipulations of task context or higher-order stimulus properties that are best explained by top-down feedback (Starrfelt and Gerlach 2007; Hellyer et al. 2011; Kherif et al. 2011; Twomey et al. 2011), but these studies have not been able to delineate the source of this feedback. The aim of the present study was to examine the direction of inter-regional effects throughout the visual hierarchy of the reading network. By using magnetoencephalography (MEG), we were able to see how these interactions unfolded in the hundred-millisecond range in order to test the hypothesis that feedback occurs in the early stages of word processing.

The existing MEG literature on word processing has identified a general trend for posterior-to-anterior activity: Processing starts in bilateral occipital cortex (OCC: ∼100–130 ms), then proceeds along the ventral visual stream with a peak in the left vOT (∼150–170 ms), culminating in sustained, left-lateralized activity in the temporal and inferior frontal cortex from around 200 ms onwards (Tarkiainen et al. 1999; Marinković et al. 2003; Pammer et al. 2004; Pylkkänen and McElree 2007; Cornelissen et al. 2009; Vartiainen et al. 2009; Wheat et al. 2010). However, some studies have identified an early response to written words in the left inferior frontal gyrus (IFG) at approximately 130 ms, preceding activity in the vOT (Pammer et al. 2004; Cornelissen et al. 2009; Wheat et al. 2010). The functional role of the left IFG's early response to words is currently unknown, but parallels can be drawn with the literature on visual object recognition. Bar and colleagues (Bar et al. 2006; Kveraga et al. 2007) have demonstrated that rapid activation of the orbitofrontal cortex facilitates object identification by feeding top-down predictions about the object's appearance to ventral visual areas. We propose that the left IFG may play a similar role in word recognition, perhaps by providing abstract higher-level representations of the word that predict the most likely stimulus features, leading to more efficient word identification.

The present study tested this hypothesis by investigating the dynamic interactions involved in the early stages of word processing. The high temporal resolution of MEG is ideal for determining directionality and timing of connectivity changes within the reading network. We predicted that 1) the left IFG or its connections would be preferentially responsive to words rather than false font stimuli; and 2) the left IFG's response would be present soon after stimulus presentation and would interact with early visual areas in the M170 time-window.

Materials and Methods

The approach was to identify the set of MEG sources that best explained the pattern of sensor space activity during word and false font processing, and then to investigate which connections between these sources were significantly modulated by the stimulus type. False fonts were chosen to provide a low-level, nonlinguistic, baseline condition, matched to letters for basic visual elements only. A 3-step procedure was used. First, Variational Bayesian Equivalent Current Dipole (VB-ECD) source localization (Kiebel et al. 2008) identified the occipital, temporal, and frontal dipoles underlying the sensor data at the M170 peak in response to words or false font stimuli. Then, dynamic causal modeling (DCM; David et al. 2006; Kiebel et al. 2006, 2009) was used to test which combination of these dipoles best explained the data across a time-window 1–300 ms post presentation. Finally, the winning spatial model was used in a second DCM analysis to measure the directed influences (connection strengths) of all regions within the visual hierarchy, and how these differed between the 2 viewing conditions, words and false fonts.

Subjects

Ten right-handed subjects (5 female, mean age 57 years, range 30–82 years) participated in the study. Handedness was assessed by asking participants which hand they currently wrote with, and whether they started out left handed but switched later in life. Participants who were left handed currently or previously were excluded. The participants were recruited as an age-matched control group for a parallel study with stroke patients. All spoke English as their first language and had no history of significant neurological or psychiatric illness. They had normal or corrected-to-normal vision. Participants were asked whether they had experienced reading difficulties during development, and participants with a history of reading difficulties were excluded. Word and text reading abilities were assessed to provide the normative data for a parallel patient study (data not reported here). This did not reveal any indications of reading difficulties in any of the participants: Word reading accuracy was at ceiling levels (range 98.8–100%) and oral text reading rate in an unpressured (“read at your own pace”) task ranged from 151 to 189 wpm, which is within 2 standard deviations of a normal range reported by Lewandowski et al. (2003).

Participants gave written consent. Research procedures were approved by the National Hospital of Neurology and Neurosurgery and Institute of Neurology Joint Medical Ethics Committee.

Scanning Procedures

The MEG data were acquired using a VSM MedTech Omega 275 MEG scanner with an array of 275 axial gradiometers in software third gradient mode. The sample rate was 480 Hz with an antialias filter at 120 Hz. Head location was determined using 3 fiducial markers, on the nasion as well as the left and right preauricular points.

The MEG data were acquired in 4 runs, each containing 110 trials. Fifty trials per run contained real words, 50 contained meaningless false font strings, and 10 were familiar names (e.g. john, tim, sarah etc.) used as “catch trials” in an incidental task designed to maintain attention throughout the run. Participants were instructed to read words and to view false font stimuli silently, and respond by a button press when a name was displayed. MEG data from these catch trials were excluded from all further analysis. The stimuli were presented in a pseudorandom order on a screen in front of the participant at a distance of approximately 50 cm using Cogent software (www.vislab.ucl.ac.uk/cogent.php) in MATLAB (The MathWorks, Inc.). At the start of the run, a gray screen was presented with a central black fixation crosshair. Each stimulus was presented for 500 ms, followed by a crosshair for 2500 ms, giving a total interstimulus interval of 3000 ms (Fig. 1a).

Figure 1.

Experimental procedure and analysis. (a) Schematic diagram of the experimental procedure, comprising word, false font, and name trials. (b) Example root mean squared plot across all sensors for the average of all word and false font trials for a sample participant. The asterisk indicates the M170 peak, which was identified in a semisupervised manner in each participant and used as the time-point for the VB-ECD source localization. (c) An example of the distribution of the sensor space data at the time of the M170 peak for the same participant.

A T1-weighted whole-brain structural magnetic resonance imaging (MRI) was acquired for each subject on a Philips Intera 3.0 Tesla MRI scanner using an 8-array head coil. This was used to aid image coregistration and source localization.

Visual Stimuli

The word and catch trial stimuli were all between 3 and 6 characters long, presented in lower case, size 50, Helvetica font. All stimuli were center aligned to the fixation point and fell within 3.5° either side of the fixation. To avoid repetition effects, no stimulus was presented more than once to each participant.

The words were selected from a corpus of 1000 words with high Kucera-Francis (KF) written frequency drawn from the MRC Psycholinguistic Database (Coltheart 1981). The median KF frequency was 484 (interquartile range 385–560). The median word imageability was 63 (interquartile range 29–68). The selection of words from the corpus did not lead to an even distribution of word lengths: 14% of selected words had 3 letters; 25% had 4 letters; 31% had 5 letters; and 20% had 6 letters.

The false font stimuli were direct translations of the word stimuli using the “Carian” font (Jane Warren, personal communication). The characters of this font were adapted from the alphabet of an obsolete Anatolian language (Melchert 2004). Characters of the Carian script share some physical properties with modern Roman alphabetic characters, in that they are simple combinations of linear and/or curved elements. However, most bear little direct resemblance to modern Roman alphabetic characters and, therefore, should not map easily onto phonological representations. The direct translation procedure meant that the false font stimuli were matched to the reading stimuli for string length and character repetition.

MEG Preprocessing

The MEG data were preprocessed using SPM8 software (Litvak et al. 2011; http://www.fil.ion.ucl.ac.uk/spm/) in Matlab 7.11 (The MathWorks, Inc.). The data from all sensors were first high-pass filtered at 1 Hz. Eye-movement artifacts were removed from the sensor data by placing an equivalent current dipole at the location of each eyeball to estimate the topography of the ocular artifact (Berg and Scherg 1994). The data were then epoched from 100 ms prestimulus onset to 1000 ms poststimulus onset. The prestimulus time-window (−100 to 0 ms) was used for baseline correction. A low-pass filter of 30 Hz was applied to the epoched data, and robust averaging (Litvak et al. 2011) was used to average across all reading and false font trials separately. Robust averaging is a form of robust general linear modeling (Wager et al. 2005) used to downweight outliers in the data. It uses an iterative procedure, whereby a weight is assigned to each sample of each trial, according to how far it is from the mean response. The weighted mean is then recalculated, and new weights assigned. The process continues iteratively until no outliers remain. Finally, the low-pass filtering was repeated to remove any high frequency noise introduced by the robust averaging process.
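The iterative reweighting idea behind robust averaging can be illustrated with a minimal sketch. This is not the SPM8 implementation: the function name `robust_average`, the Tukey bisquare weighting function, the median-absolute-deviation scale estimate, and the tuning constant `k` are all assumptions chosen for illustration.

```python
import numpy as np

def robust_average(trials, k=3.0, max_iter=50, tol=1e-6):
    """Iteratively reweighted mean across trials (illustrative only).

    trials: array of shape (n_trials, n_samples).
    Returns (weighted_mean, weights), where weights has the shape of trials.
    """
    trials = np.asarray(trials, dtype=float)
    weights = np.ones_like(trials)
    mean = trials.mean(axis=0)
    for _ in range(max_iter):
        resid = trials - mean
        # Robust per-sample scale from the median absolute residual (assumed choice).
        mad = np.median(np.abs(resid), axis=0)
        scale = np.where(mad > 0, 1.4826 * mad, 1.0)
        u = resid / (k * scale)
        # Tukey bisquare weights: samples far from the mean get weight 0.
        weights = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
        wsum = weights.sum(axis=0)
        safe = np.where(wsum > 0, wsum, 1.0)
        new_mean = np.where(wsum > 0, (weights * trials).sum(axis=0) / safe, mean)
        converged = np.max(np.abs(new_mean - mean)) < tol
        mean = new_mean
        if converged:
            break
    return mean, weights
```

In this sketch a trial contaminated by a large artifact receives a weight of zero and therefore does not contribute to the average, whereas an ordinary mean would be pulled toward it.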

Source Localization

The next step of the analysis was to identify the sources at 1 fixed point in time (the M170) using VB-ECD modeling (Kiebel et al. 2008). VB-ECD is a point-source inversion method that uses a nonlinear optimization algorithm to fit a number of dipoles simultaneously, with different prior distributions on their locations and moments, to all of the sensor data (i.e., with no a priori selection of channels). For example, prior distributions for a dipole in the left OCC may have a mean location of x = −15, y = −95, z = 2 with a standard deviation of 6 mm in all directions; and a mean moment of 0 nAm, with a standard deviation of 10 nAm. The resulting dipole locations are free to move outside of the prior distribution, although the model evidence of the result is a balance between the likelihood of the model given the data, and the (Kullback-Leibler) divergence between the posteriors and the priors (cf. the model complexity).

Point-source inversion methods are an ideal way of localizing MEG activity for use in a DCM analysis, as they reduce the data down to a number of discrete regions rather than providing a distributed imaging solution that would have to be split artificially into regions of interest. However, one disadvantage is that the VB-ECD dipole moments have no dynamics and are fit to a single point in time, which may or may not be representative of the sources active over the whole time-window of interest. To get around this problem, DCM was used to estimate how well the winning source locations for each dipole configuration explained the observed data over the full 1–300 ms time-window.

To prevent the source localization being biased to one condition over the other, the data used for the VB-ECD analysis were the average response over all trials (words and false font). To identify the M170 peak in each individual, the average response for all word and false font trials was calculated and the root mean squared average across all sensors was plotted against time, as shown in Figure 1b. The M170 peak was chosen as the time-point for the source localization because it was reliably present in all subjects and is known to represent orthographic processing (Tarkiainen et al. 1999; Pylkkänen and Marantz 2003; Rossion et al. 2003; Marinković 2004; Vartiainen et al. 2009). The peak was identified in a semisupervised manner for each subject and had an average latency of 164 ms across the group (range 133–195 ms).
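The root mean squared summary and peak search can be sketched as follows. The function name `rms_peak_latency` and the default search window are hypothetical; in the study the peak was identified in a semisupervised manner, so an automatic search like this would only supply a candidate latency for visual confirmation.

```python
import numpy as np

def rms_peak_latency(evoked, times_ms, window=(130.0, 200.0)):
    """Return the latency of the largest RMS value inside a search window.

    evoked:   array (n_sensors, n_samples), average over word and false font trials.
    times_ms: array (n_samples,), time of each sample relative to stimulus onset.
    window:   assumed search range in ms (not taken from the paper).
    """
    evoked = np.asarray(evoked, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    # Root mean square across sensors at every time point.
    rms = np.sqrt(np.mean(evoked**2, axis=0))
    in_window = (times_ms >= window[0]) & (times_ms <= window[1])
    idx = np.flatnonzero(in_window)[np.argmax(rms[in_window])]
    return times_ms[idx], rms
```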

Although the cortical network of regions involved in reading is widespread, the regions of interest for the VB-ECD analysis were strictly limited to those previously identified by MEG source-space analyses of evoked responses within the first 300 ms after presentation of single words. A literature search using these criteria identified the following sources: Bilateral OCC, vOT, the superior temporal sulcus (STS), and the left IFG (Tarkiainen et al. 1999; Marinković et al. 2003; Pammer et al. 2004; Pylkkänen and McElree 2007; Cornelissen et al. 2009; Vartiainen et al. 2009; Wheat et al. 2010). The initial locations for the OCC, IFG, and STS dipoles (±15 –95 2, ±48 28 0, and ±54 –29 0, respectively) were defined anatomically using the Jülich Histological Atlas (Amunts et al. 2000; Eickhoff et al. 2005). The location for the vOT was ±44 –58 –15, taken from a meta-analysis of word or pseudoword activations by Jobard et al. (2003). These coordinates define the prior expected mean VB-ECD location. The ECD search optimized the model evidence for different source locations (and moments) for each participant. The standard deviation of the location prior was set to be 6 mm in all 3 directions for all locations. Each subject's structural MRI image was used for coregistration, and a single shell was used to define the forward model (Nolte 2003).

Four different dipole configurations (c1–c4) were fit to the data, varying in both number and regional location: Configuration 1 (c1), with dipoles in the left and right OCC only; configuration 2 (c2), with left and right OCC and vOT dipoles; configuration 3 (c3), with left and right OCC, vOT, and IFG dipoles; and configuration 4 (c4), with left and right OCC, vOT, and STS dipoles. A further configuration with 8 sources (left and right OCC, vOT, IFG, and STS) was ruled out because of the prohibitively large model space it would incur (see section below). For each configuration, the VB-ECD algorithm varied the locations and moments of all dipoles iteratively to maximize the model evidence of the solution. To prevent the solution from getting stuck in local maxima, this process was repeated over 100 iterations, selecting new start locations for each dipole based on a Gaussian distribution with a standard deviation of 6 mm from the initial coordinates. The iteration with the highest model evidence was taken as the winning solution for that particular dipole configuration. The locations of the winning source coordinates for each individual and each configuration were visually checked to ensure that they were consistent with their anatomical labels. In addition, Euclidean distances between resultant sources showed that no solution included sources separated by <2 cm.
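The restart strategy described above is a generic multi-start optimization, which might be sketched as below. The `fit` callable stands in for a single VB-ECD run and is purely hypothetical (it is assumed to take start locations and return fitted locations plus a model evidence value); `multi_start` is likewise an illustrative name, not part of SPM.

```python
import numpy as np

def multi_start(fit, prior_mean, sd_mm=6.0, n_starts=100, seed=0):
    """Repeat a local fit from jittered start locations; keep the best evidence.

    fit:        hypothetical callable, start_locations -> (locations, evidence).
    prior_mean: array of initial coordinates in mm, e.g. shape (n_dipoles, 3).
    """
    rng = np.random.default_rng(seed)
    prior_mean = np.asarray(prior_mean, dtype=float)
    best_locations, best_evidence = None, -np.inf
    for _ in range(n_starts):
        # Draw new start locations around the initial coordinates.
        start = rng.normal(prior_mean, sd_mm)
        locations, evidence = fit(start)
        if evidence > best_evidence:
            best_locations, best_evidence = locations, evidence
    return best_locations, best_evidence
```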

In the final step of the source localization, described below, DCM models using the winning source locations for each configuration and each time-window were estimated for each subject and Bayesian Model Selection (BMS; Penny et al. 2004; Stephan et al. 2009) was used to identify the most probable source configuration at the group level in each time-window.

Dynamic Causal Modeling

DCM is a powerful tool for investigating the effective connectivity (i.e., the influence that one neuronal region has on another) between brain regions, and how the strength of those connections is modulated by variations in stimuli or task demands. The methodology of DCM for the analysis of ERP data is described in detail elsewhere (David et al. 2006; Kiebel et al. 2006, 2009), but a brief synopsis is provided here for clarity. DCM uses a biologically informed, fully generative approach, whereby the activity in each brain region is modeled as a mixture of signals arising from 3 subpopulations of neurons (pyramidal cells, spiny stellate cells, and inhibitory interneurons), each with characteristic response rates and patterns of connectivity within and between regions based on the neural mass model of Jansen and Rit (1995) and the principles of laminar organization of Felleman and Van Essen (1991). An impulse at time t delivers input u(t) to layer 4 spiny stellate cells as described by equation (5) in David et al. (2006). This input then causes a change in activity in the region to which the input is connected: Input and forward connections impinge on the stellate cell population, while backward connections impinge upon pyramidal cells and inhibitory interneurons and lateral connections impinge on all 3 subpopulations. In detail, ensemble firing rates are convolved with a postsynaptic kernel to either depolarize or hyperpolarize the membrane potential, depending on the incoming cell type. All extrinsic connections are excitatory, while connections within a region can be excitatory or inhibitory (David et al. 2006). Activity evolves within and between regions through static sigmoid functions that transform membrane potentials to ensemble firing rates. The parameters of these kernel and sigmoid functions are estimated from the data (Kiebel et al. 2007).
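The two building blocks named in this synopsis, a static sigmoid and a postsynaptic kernel, can be illustrated with a small numerical sketch. The functional forms below are ones commonly used in the neural mass literature, and the parameter values (`e0`, `r`, `H`, `tau`) are illustrative assumptions rather than the priors used in this study; the cited papers should be consulted for the exact model.

```python
import numpy as np

def sigmoid_rate(v, e0=2.5, r=0.56):
    """Static sigmoid: membrane potential -> ensemble firing rate (0 at v = 0)."""
    return 2.0 * e0 / (1.0 + np.exp(-r * np.asarray(v, dtype=float))) - e0

def synaptic_kernel(t, H=4.0, tau=0.008):
    """Postsynaptic kernel (H / tau) * t * exp(-t / tau); zero for t < 0."""
    t = np.asarray(t, dtype=float)
    return np.where(t >= 0, (H / tau) * t * np.exp(-t / tau), 0.0)

dt = 0.001                                   # 1 ms time step, in seconds
t = np.arange(0.0, 0.1, dt)
kernel = synaptic_kernel(t)

# A brief depolarization of a presynaptic population (hypothetical input).
presyn_potential = np.where((t >= 0.01) & (t < 0.02), 5.0, 0.0)
rate = sigmoid_rate(presyn_potential)

# Convolving the firing rate with the kernel gives the postsynaptic potential.
postsyn_potential = np.convolve(rate, kernel)[: t.size] * dt
```

With this kernel the postsynaptic response rises after the input begins and peaks with a delay governed by `tau`, which is why synaptic time constants shape the latency of responses in each source.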

In a standard DCM analysis, the user defines a number of parameters: The location of the MEG sources; the connections present between sources; the sources that receive the sensory input; and the connections that are modulated by variations in stimulus type or task. The DCM model is then estimated, that is, the values of these parameters are varied iteratively until the predicted neural activity generated by the model most closely matches the observed data. In practice, the user estimates a wide range of DCM models, each with a different combination of sources or connections, in order to test a particular hypothesis. Bayesian statistics are then employed to investigate which model gives the best overall explanation of the data (BMS; Penny et al. 2004). If no model is clearly superior to all others tested, then a weighted average model can be constructed using Bayesian Model Averaging (BMA; Penny et al. 2010).

In the present study, DCM was used to investigate 2 separate questions: First, BMS was used to identify which configuration of MEG sources identified by the VB-ECD source localization best explained the data in the time-window of interest (1–300 ms). In these models, words and false font stimuli equally drive the connectivity between regions. Secondly, having established the optimal number of sources and their intrinsic (average) connectivity, BMA was used to investigate how stimulus type (words vs. false fonts) modulated effective connectivity between these sources. This second question was framed over successive 100-ms time-windows to allow for bottom-up and top-down effects to play out throughout the reading network (cf. Garrido et al. 2007).

DCM—Bayesian Model Selection

For the first step, 4 DCM models were estimated per participant—one for each of the 4 dipole configurations (c1, c2, c3, and c4). In all cases, the observed data were the average response for all trials (words and false fonts) over the first 300 ms, and the spatiotemporal model used source locations defined by the winning solution for the VB-ECD analysis and an exogenous stimulus that entered the left and right OCC sources at 60 ms poststimulus. All possible forward, backward, and lateral connections between sources were included in the spatiotemporal model, but diagonal connections (e.g. from left OCC to right vOT) were not modeled as there is good evidence that homotopic occipital connections outnumber heterotopic ones in the human callosum (Jarbo et al. 2012). At the group level, BMS with fixed effects (Penny et al. 2004) was used to test which of the 4 configurations gave the optimal fit to the data. The results demonstrated that c3, with sources in the left and right OCC, vOT, and IFG, was decisively the winning model, with a log-evidence value relative to c4, the second best model, of F = 2011 (model posterior probability >0.99). The winning source locations for c3, shown in Figure 2, were used as the sources for the DCM spatial model for the next step of the analysis.
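Fixed-effects BMS at the group level amounts to summing each model's log evidence over subjects and normalizing, assuming a flat prior over models. The sketch below shows that arithmetic only; `ffx_bms` is a hypothetical helper and the input values in the usage example are invented for illustration, not data from this study.

```python
import numpy as np

def ffx_bms(log_evidence):
    """Fixed-effects Bayesian model selection with a flat model prior.

    log_evidence: array (n_subjects, n_models) of per-subject log evidences.
    Returns (group_log_evidence, posterior_model_probabilities).
    """
    group = np.asarray(log_evidence, dtype=float).sum(axis=0)
    # Subtract the maximum before exponentiating for numerical stability.
    rel = group - group.max()
    posterior = np.exp(rel)
    posterior /= posterior.sum()
    return group, posterior

# Invented example: 10 subjects, 4 configurations (c1..c4).
toy = np.tile([-100.0, -90.0, -50.0, -60.0], (10, 1))
group, posterior = ffx_bms(toy)
winner = int(np.argmax(posterior))           # index 2 corresponds to c3 here
```

Because log evidences add across subjects, even modest per-subject differences accumulate into a large group-level difference, which is how a posterior probability above 0.99 can arise.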

Figure 2.

The locations of the winning coordinates across the group. Optimal source locations from the winning 6-source model for each participant plotted on a glass brain in MNI space. The start point-source locations were: OCC = ±15 –95 2; vOT = ±44 –58 –15; IFG = ±48 28 0. The average locations of the group's winning source locations are shown, with standard deviations. OCC: occipital pole; vOT: ventral occipitotemporal cortex; IFG: inferior frontal gyrus.

DCM—Bayesian Model Averaging

The next step was to estimate the modulatory effects of stimulus type on 1) the sensitivity of each source to its inputs, as modeled by the self-connection on each node of the model (Kiebel et al. 2007); and 2) the effective connectivity between neuronal sources modeled by forward, backward, and lateral connections between nodes. The winning model, c3, with sources in the left and right OCC, vOT, and IFG was used as the spatial model for this analysis.

The total number of possible connections in a model with 6 nodes is 30, and the total number of possible combinations of those connections (the total model space) is 2^30 or 1,073,741,824. Inevitably, for computational reasons, some constraints had to be placed to limit the size of the model space. We constrained the model space using 3 rules:

  1. Only allowing horizontal lateral connections within a level of the cortical hierarchy (e.g. from left OCC to right OCC) and not allowing diagonal lateral connections (i.e., lateral connections between levels of the cortical hierarchy, e.g., from left OCC to right vOT).

  2. Ensuring that any forward or backward connection (e.g., from left OCC to left vOT) was mirrored in the opposite hemisphere (right OCC to right vOT).

  3. Ensuring that any lateral connections (e.g. left OCC to right OCC) had a reciprocal opposing connection (e.g. right OCC to left OCC).

These rules reduced the number of independent connections to 9, creating a total model space of 512 DCM models per participant, each modeling a different combination of connections between sources mediating trial-specific effects (using the same set of constraints, a DCM model with 8 nodes would have 15 independent connections, and a total model space of 32 768 DCM models per participant).
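The size of the model space follows from treating each independent connection as either modulated or not. A short sketch of that counting, with variable names chosen here for illustration, is:

```python
from itertools import product

n_nodes = 6
# Directed connections between distinct nodes: each node to every other node.
n_connections = n_nodes * (n_nodes - 1)
full_model_space = 2 ** n_connections

# With the 3 rules applied, 9 independent connections remain (per the text).
n_independent = 9
# Each model is one on/off pattern over the independent connections.
models = list(product((0, 1), repeat=n_independent))
constrained_model_space = len(models)

# The 8-node case mentioned in the text, with 15 independent connections.
eight_node_model_space = 2 ** 15
```

Enumerating the patterns explicitly, as `models` does here, is one straightforward way to generate the list of models to estimate for each participant.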

To investigate the evolution of connection strengths over time, the DCM analysis was performed for 3 different time-windows—0–100, 0–200, and 0–300 ms (see Garrido et al. 2007 for a similar approach). This overlapping time-window design was taken into account when making inferences about the temporal dynamics of effective connectivity, but it is a requirement of the DCM analysis that all time-windows incorporate the time at which the stimulus was presented.

After all 512 models were estimated for each participant, group-level BMA with random effects (Penny et al. 2010) was used to estimate the average strength of the trial-modulated connections (i.e., gain for words vs. false fonts) across the whole model space. Gains are measured in log space, hence, for each connection, an average gain equal to 1 would indicate that stimulus type did not modulate connection strength, whereas an average gain greater or smaller than 1 would indicate that the connection strength was stronger for words or false fonts, respectively.

A nonparametric proportion test was used to evaluate whether the results of the BMA were statistically significant. For each connection, the distribution of the log gain was reconstructed by taking 10 000 samples from a Gaussian distribution based on the posterior mean and standard deviation calculated in the BMA process. If >90% of samples were greater or smaller than unity (a posterior mean of one), the connection was deemed to be significantly stronger for words or false fonts, respectively. Connections that satisfied this criterion are reported in the results as P > 0.9. This approach has been previously reported (Richardson et al. 2011).
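The sampling procedure can be sketched in a few lines. The function name `proportion_test`, the random seed, and the posterior standard deviations used in any call are assumptions for illustration; the comparison value of 1 follows the description of unity as the no-modulation point.

```python
import numpy as np

def proportion_test(post_mean, post_sd, null_value=1.0,
                    n_samples=10_000, criterion=0.9, seed=0):
    """Classify a connection by the proportion of posterior samples
    lying above or below the no-modulation value."""
    rng = np.random.default_rng(seed)
    # Reconstruct the posterior as a Gaussian and sample from it.
    samples = rng.normal(post_mean, post_sd, size=n_samples)
    p_above = float(np.mean(samples > null_value))
    p_below = float(np.mean(samples < null_value))
    if p_above > criterion:
        return "words > false font", p_above
    if p_below > criterion:
        return "words < false font", p_below
    return "not significant", max(p_above, p_below)
```

For example, a call such as `proportion_test(1.17, 0.05)` uses a hypothetical standard deviation of 0.05 and would classify the connection as stronger for words, because nearly all samples exceed 1.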

Results

Magnetic Evoked Fields

Source locations were identified for each subject using the winning 6-dipole configuration from the VB-ECD analysis based on a single time-point of data at approximately 170 ms. Figure 3 shows the group average-fitted responses resulting from the estimation of this winning model. The responses in each subpopulation of neurons (pyramidal, spiny stellate, and inhibitory interneurons) that comprise each source of the DCM model are shown. The spiny stellate cells of the left and right OCC sources (which received the sensory input) peaked first at around 85 ms, followed closely by the pyramidal cells (∼100 ms) and inhibitory interneurons (∼115 ms). The left and right vOT and IFG sources peaked after the OCC sources, both with very similar response latencies of around 135–145 ms. In these analyses, word and false font stimuli were treated equally.

Figure 3.

Fitted DCM source responses over time. Group average of the fitted DCM source responses over time for each neuronal subpopulation (pyramidal cells, spiny stellate cells, and inhibitory interneurons). The data were taken from the model shown by the BMS analysis to best explain the word and false font data over the 1–300-ms time-window, which included sources in the left and right occipital cortex (OCC), ventral occipitotemporal cortex (vOT), and inferior frontal gyri (IFG).

Dynamic Causal Modeling

The results of the BMA testing the effects of each stimulus type on effective connectivity in each time-window are shown in Figure 4. Table 1 shows the posterior means and exceedance probabilities of connections that were above the significance criterion of P > 0.9.

Table 1

Posterior means and exceedance probabilities for connections that were significantly stronger for words (mean >1) or weaker for words (mean <1) than would be expected by chance

Connection                      Posterior mean    Exceedance probability
1–100 ms, Words > false font
  Right OCC self-connection     1.11              0.999
1–200 ms, Words > false font
  Left IFG self-connection      1.06              0.958
  Left IFG to left vOT          1.17              0.908
1–200 ms, Words < false font
  Left OCC self-connection      0.91              >0.999
  Right OCC self-connection     0.93              >0.999
1–300 ms, Words > false font
  Left OCC to left vOT          1.10              0.971
  Left OCC to left IFG          1.08              0.949
  Right vOT self-connection     1.05              0.954
1–300 ms, Words < false font
  Left OCC self-connection      0.89              >0.999
  Right OCC self-connection     0.91              >0.999
  Left vOT to left OCC          0.86              0.907
  Right vOT to right IFG        0.88              0.907
  Right IFG to right OCC        0.78              0.984

Figure 4.

The effects of stimulus type on connection strength. Results of the DCM analyses in 3 time-windows: 1–100, 1–200, and 1–300 ms. Arrows represent the modulatory effect of stimulus type on connection strengths. Values larger than one represent stronger connections for words relative to the false font baseline and were thresholded at P > 0.9 (highlighted with solid black lines). Values significantly smaller than one represent weaker connections for words versus baseline, again with a threshold of P > 0.9 (dotted black lines; N = 10).

Stronger Connections for Words Than False Font Stimuli

The earliest preferential response to words over the false font baseline was observed in the self-connection of the right OCC node in the 1–100-ms time-window. As noted previously, self-connections represent a region's sensitivity to its inputs: Given the same strength of postsynaptic stimulation, a region with a strong self-connection will respond with increased output (neuronal firing) when compared with a region with a weak self-connection (Kiebel et al. 2007). This result indicates that right OCC shows greater sensitivity to words than false font in the first 100 ms after stimulus presentation.

The next significant effect to emerge was a preferential response for words over false font stimuli in the self-connection of the left IFG and the backwards connection from left IFG to left vOT. As these effects were only present in the 1–200-ms time-window, it can be inferred that they were not significant within the first 100 ms, and were not sustained for long enough to be significant in the 1–300-ms time-window.

After this feedback response from the left IFG, stronger feedforward connections for words than false font were observed from left OCC to left vOT and from left OCC to left IFG in the 1–300-ms time-window. The self-connection of right vOT was also stronger for words in this time-window. As these connections did not show significant modulation in the 2 earlier time-windows, it is likely that they became active after 200 ms.
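The timing inferences in the preceding paragraphs follow from the fact that the 3 time-windows are cumulative (all start at 1 ms). The sketch below spells out that reasoning in Python for the words > false font effects reported in the table. The function names and the dictionary layout are illustrative assumptions, not part of the original analysis, and the output is a heuristic reading of the pattern of significance rather than a statistical test of onset time.

```python
# End-points (ms) of the three cumulative DCM time-windows: 1-100, 1-200, 1-300.
WINDOW_ENDS = (100, 200, 300)

# Windows in which each words > false font effect passed the P > 0.9 threshold
# (transcribed from the table of posterior means; layout is illustrative).
SIGNIFICANT_IN = {
    "Right OCC self-connection": {100},
    "Left IFG self-connection": {200},
    "Left IFG to left vOT": {200},
    "Left OCC to left vOT": {300},
    "Left OCC to left IFG": {300},
    "Right vOT self-connection": {300},
}


def likely_onset(significant_ends):
    """Return the (lower, upper) interval in ms in which the effect likely emerged.

    The effect is first significant in the window ending at `upper`; because it
    was not significant in the previous, shorter window, it probably emerged
    after that window's end-point (`lower`).
    """
    first = min(significant_ends)
    idx = WINDOW_ENDS.index(first)
    lower = WINDOW_ENDS[idx - 1] if idx > 0 else 0
    return lower, first


def is_sustained(significant_ends):
    """True if the effect stays significant in every longer window."""
    first = min(significant_ends)
    return all(end in significant_ends for end in WINDOW_ENDS if end >= first)


if __name__ == "__main__":
    for name, ends in SIGNIFICANT_IN.items():
        lower, upper = likely_onset(ends)
        label = "sustained" if is_sustained(ends) else "transient"
        print(f"{name}: likely onset {lower}-{upper} ms, {label}")
```

Under this reading, the left IFG effects fall in the 100–200 ms interval and are transient, whereas the left OCC feedforward effects fall in the 200–300 ms interval.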

Weaker Connections for Words Than False Font Stimuli

The connections that were significantly reduced for words relative to false font stimuli were the self-connections on the left and right OCC sources in both the 1–200- and 1–300-ms time-windows; and left vOT to left OCC, right IFG to right OCC, and right vOT to right IFG in the 1–300-ms time-window.
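To make the P > 0.9 criterion concrete, the following Python sketch shows one way an exceedance probability can be computed for a modulatory gain parameter. It assumes that the posterior over the log of the gain is Gaussian, so that the probability of the gain exceeding 1 equals the probability of the log-gain exceeding 0. The posterior standard deviation used here (0.1) is a hypothetical value chosen for illustration; it is not reported in the text, and the function name is likewise an assumption rather than a routine from any particular toolbox.

```python
import math


def exceedance_probability(mean_log_gain, sd_log_gain):
    """P(gain > 1) = P(log gain > 0) under an assumed Gaussian posterior."""
    z = mean_log_gain / sd_log_gain
    # Standard normal cumulative distribution function evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    hypothetical_sd = 0.1  # assumption: not a value from the study

    # A gain of 1.17 (the size reported for left IFG to left vOT).
    p_stronger = exceedance_probability(math.log(1.17), hypothetical_sd)
    print(f"P(gain > 1) = {p_stronger:.3f}, passes threshold: {p_stronger > 0.9}")

    # A gain of 0.78 (the size reported for right IFG to right OCC);
    # for weaker connections the relevant probability is P(gain < 1).
    p_weaker = 1.0 - exceedance_probability(math.log(0.78), hypothetical_sd)
    print(f"P(gain < 1) = {p_weaker:.3f}, passes threshold: {p_weaker > 0.9}")
```

A gain of exactly 1 (log-gain of 0) gives a probability of 0.5 in either direction, which is why values near 1 need a narrow posterior to pass the threshold.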

Discussion

This study tested whether posterior cortical regions (visual and ventral occipitotemporal cortices) are influenced by feedback from frontal cortex in the early stages of word reading relative to a baseline condition that controlled for visual features, but not higher-level semantic and phonological associations. The results demonstrated that in the 1–200-ms time-window word reading showed a transient effect on the feedback connection from the left IFG to the left vOT that was not observed in the earlier 1–100-ms time-window and did not reach significance in the longer time-window of 1–300 ms. This is possibly the first demonstration of “evoked” IFG activation within the first 200 ms in word reading. The group average left IFG response had a peak of roughly 5 nAm, compared with 12 nAm for the left vOT and 21 nAm for the left OCC. That is (assuming similar proximity to the sensors), we would expect the IFG source to account for between 1/4 and 1/8 of the sensor level variance due to the other sources, and this is one reason why it might not have been observed before in less constrained source models of the data.

This effect preceded the feedforward influence of words on connections from left OCC to vOT and IFG sources. Note that feedforward connections were involved in this time-window (otherwise the stimuli could not affect IFG connectivity), but just not in a differential manner. The early influence of words on left IFG activity has previously been reported (Pammer et al. 2004; Cornelissen et al. 2009; Wheat et al. 2010), but the influence of the left IFG on early left vOT activity is novel, as is the observation that this precedes the modulation of feedforward activity by words relative to false font stimuli.

The pseudorandom order of reading and false font trials used in this experiment ruled out the possibility that the early response of the left IFG could have been due to the participant's expectation of the oncoming stimulus. Rather, the early left IFG response must have been related to stimulus content, implying that visual information must reach the left IFG within the first 200 ms after stimulus presentation. It is important to note that the results described here only test the differences in connectivity between word and false font conditions: It is probable that there is early feedforward activity from occipital to frontal regions that is common to both stimulus types. Anatomically, there are 2 candidate white matter tracts that could support the rapid communication between the occipital lobe and the inferior frontal cortex: The inferior fronto-occipital fasciculus or the pathway through the inferior longitudinal fasciculus and the middle part of the superior longitudinal fasciculus (ILF–SLFII). Both of these fiber pathways have been shown by detailed anatomical work in nonhuman primates to connect visual association cortex with inferior frontal or ventral premotor cortex (Yeterian and Pandya 2010; Yeterian et al. 2011), although the precise locations of the frontal termination sites of these tracts in humans are less certain (Martino et al. 2010; Turken and Dronkers 2011).

The influence of left IFG on the left vOT is consistent with left IFG sending sensory predictions based on prior knowledge of the word. As the early activation of this region has been shown to be sensitive to phonological priming (Wheat et al. 2010), it seems plausible that the left IFG plays a role in providing phonological cues that constrain visual processing to features that are consistent with possible solutions. Interpreting these results in the rubric of hierarchical processing and the predictive coding account of brain function (Mumford 1992; Friston 2005; Schofield et al. 2009) suggests that long-term, abstract representations of words are encoded in the left IFG and are engaged soon after visual presentation. These activated representations then provide feedback predictions to lower-level areas in order to constrain sensory processing. Changes in the left IFG self-connection could reflect a change in the tonic state of the network because participants were in reading mode. This interpretation is in close alignment with the theory from Bar and colleagues, which proposes that magnocellular projections to the frontal cortex play a role in facilitating object recognition via feedback to the ventral visual stream (Bar et al. 2006; Kveraga et al. 2007).

Differences between words and false font stimuli were also observed in the left and right occipital sources. In the first 100 ms, the right OCC was more sensitive to words than false font stimuli, which, assuming a split fovea theory of the visual field (Brysbaert 1994; Lavidor and Walsh 2004; Van der Haegen et al. 2009), may be due to the necessity to transfer the initial and most informative half of the word (Perea and Lupker 2003) to the left hemisphere. The corresponding connection from right to left OCC had a posterior mean of 1.12 in the 1–200-ms time-window, indicating that it was more strongly modulated by words than false font, but the exceedance probability (0.82) did not reach the threshold for significance. After 100 ms, both left and right OCC nodes were less sensitive to words than false font stimuli. This may reflect the fact that familiar written words require less visual processing than the unfamiliar false fonts; that is, more sustained visual processing is required for the unfamiliar stimuli.

Reduced connectivity for words relative to false font stimuli was also observed from the right vOT to the right IFG and from the right IFG to the right OCC, in the 1–300-ms time-window. This result contributes to an existing literature showing that right hemisphere regions including the occipital, occipitotemporal, and inferior frontal cortices are more strongly activated by unfamiliar scripts than real words (Tagamets et al. 2000; Bokde et al. 2001; Appelbaum et al. 2009; Maurer et al. 2010). One possible interpretation for this is that right hemisphere activation is suppressed by left hemisphere activation during word processing (Seghier et al. 2011). In this time-window, we also saw increased feedforward connectivity in the left hemisphere for words relative to false font stimuli, from OCC to both vOT and IFG. Because these effects were seen relatively late in time and after the top-down effects, we think that they probably relate to more abstract (lexical or semantic) properties that words have but false fonts lack. According to the predictive coding account (Friston 2005), these connections could be driven by prediction errors (a mismatch between expectations and stimuli), which are passed forward through the processing hierarchy. These will be stronger for words because any given word will be subject to neighborhood effects that cause higher-level identity conflict (Perea and Pollatsek 1998), whereas nonsense stimuli (false fonts) will not.

In conclusion, we have established that activity in the left IFG is modulated by stimulus type within the first 200 ms after presentation, and that this region plays a role in modulating word processing in the left ventral visual stream, within the time-window of the M170. This result is easier to reconcile with models such as the IA (Price and Devlin 2011), which include automatic feedback during word processing, than purely feedforward accounts such as the LCD model (Dehaene and Cohen 2011).

Funding

This work was supported by the Wellcome Trust and a grant from the Economic and Social Research Council (ES/J003174/1) to Z.W. Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust.

Notes

Conflict of Interest: None declared.

References

Amunts K, Malikovic A, Mohlberg H, Schormann T, Zilles K. 2000. Brodmann's areas 17 and 18 brought into stereotaxic space: where and how variable? Neuroimage. 11(1):66-84.
Appelbaum LG, Liotti M, Perez R, Fox SP, Woldorff MG. 2009. The temporal dynamics of implicit processing of non-letter, letter, and word-forms in the human visual cortex. Front Hum Neurosci. 3:56.
Bar M, Kassam KS, Ghuman AS, Boshyan J, Schmidt AM, Dale AM, Hämäläinen MS, Marinkovic K, Schacter DL, Rosen BR, et al. 2006. Top-down facilitation of visual recognition. Proc Natl Acad Sci U S A. 103(2):449-454.
Berg P, Scherg M. 1994. A multiple source approach to the correction of eye artifacts. Electroencephalogr Clin Neurophysiol. 90(3):229-241.
Bokde AL, Tagamets MA, Friedman RB, Horwitz B. 2001. Functional interactions of the inferior frontal cortex during the processing of words and word-like stimuli. Neuron. 30(2):609-617.
Brysbaert M. 1994. Interhemispheric transfer and the processing of foveally presented stimuli. Behav Brain Res. 64:151-161.
Coltheart M. 1981. The MRC psycholinguistic database. Q J Exp Psychol. 33A:497-505.
Cornelissen P, Kringelbach M, Ellis AW, Whitney C, Holliday I, Hansen P. 2009. Activation of the left inferior frontal gyrus in the first 200 ms of reading: evidence from magnetoencephalography (MEG). PLoS ONE. 4(4):e5359.
David O, Kiebel SJ, Harrison LM, Mattout J, Kilner JM, Friston KJ. 2006. Dynamic causal modelling of evoked responses in EEG and MEG. Neuroimage. 30(4):1255-1272.
Dehaene S, Cohen L. 2011. The unique role of the visual word form area in reading. Trends Cogn Sci. 15(6):254-262.
Dehaene S, Cohen L, Sigman M, Vinckier F. 2005. The neural code for written words: a proposal. Trends Cogn Sci. 9(7):335-341.
Eickhoff S, Stephan KE, Mohlberg H, Grefkes C, Fink GR, Amunts K, Zilles K. 2005. A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. Neuroimage. 25(4):1325-1335.
Felleman DJ, Van Essen DC. 1991. Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex. 1(1):1-47.
Friston K. 2005. A theory of cortical responses. Philos Trans R Soc Lond B Biol Sci. 360(1456):815-836.
Garrido MI, Kilner JM, Kiebel SJ, Friston KJ. 2007. Evoked brain responses are generated by feedback loops. Proc Natl Acad Sci U S A. 104(52):20961-20966.
Hellyer PJ, Woodhead ZV, Leech R, Wise RJ. 2011. An investigation of twenty/20 vision in reading. J Neurosci. 31(4):14631-14638.
Jansen BH, Rit VG. 1995. Electroencephalogram and visual evoked potential generation in a mathematical model of coupled cortical columns. Biol Cybern. 73:357-366.
Jarbo K, Verstynen T, Schneider W. 2012. In vivo quantification of global connectivity in the human corpus callosum. Neuroimage. 59:1988-1996.
Jobard G, Crivello F, Tzourio-Mazoyer N. 2003. Evaluation of the dual-route theory of reading: a metanalysis of 35 neuroimaging studies. Neuroimage. 20(2):693-712.
Kherif F, Josse G, Price CJ. 2011. Automatic top-down processing explains common left occipito-temporal responses to visual words and objects. Cereb Cortex. 21(1):103-114.
Kiebel SJ, Daunizeau J, Phillips C, Friston KJ. 2008. Variational Bayesian inversion of the equivalent current dipole model in EEG/MEG. Neuroimage. 39(2):728-741.
Kiebel SJ, David O, Friston KJ. 2006. Dynamic causal modelling of evoked responses in EEG/MEG with lead field parameterization. Neuroimage. 30(4):1273-1284.
Kiebel SJ, Garrido MI, Friston KJ. 2007. Dynamic causal modelling of evoked responses: the role of intrinsic connections. Neuroimage. 36(2):332-345.
Kiebel SJ, Garrido MI, Moran R, Chen CC, Friston KJ. 2009. Dynamic causal modelling for EEG and MEG. Hum Brain Mapp. 30(6):1866-1876.
Kveraga K, Boshyan J, Bar M. 2007. Magnocellular projections as the trigger of top-down facilitation in recognition. J Neurosci. 27(48):13232-13240.
Lavidor M, Walsh V. 2004. The nature of foveal representation. Nat Rev Neurosci. 5(9):729-735.
Lewandowski LJ, Codding RS, Kleinmann AE, Tucker KL. 2003. Assessment of reading rate in postsecondary students. J Psychoeduc Assess. 21:134-144.
Litvak V, Mattout J, Kiebel S, Phillips C, Henson R, Kilner J, Barnes G, Oostenveld R, Daunizeau J, Flandin G, et al. 2011. EEG and MEG data analysis in SPM8. Comput Intell Neurosci. 2011:852961.
Marinković K. 2004. Spatiotemporal dynamics of word processing in the human cortex. Neuroscientist. 10(2):142-152.
Marinković K, Dhond RP, Dale AM, Glessner M, Carr V, Halgren E. 2003. Spatiotemporal dynamics of modality-specific and supramodal word processing. Neuron. 38(3):487-497.
Martino J, Brogna C, Robles SG, Vergani F, Duffau H. 2010. Anatomic dissection of the inferior fronto-occipital fasciculus revisited in the lights of brain stimulation data. Cortex. 46(5):691-699.
Maurer U, Blau V, Yoncheva YN, McCandliss BD. 2010. Development of visual expertise for reading: emergence of visual familiarity for an artificial script. Dev Neuropsychol. 35(4):404-422.
Melchert CM, Woodward RD. 2004. Carian. The Cambridge Encyclopaedia of the World's Ancient Languages. Cambridge: Cambridge University Press.
Mumford D. 1992. On the computational architecture of the neocortex. II. The role of cortico-cortical loops. Biol Cybern. 66(3):241-251.
Nolte G. 2003. The magnetic lead field theorem in the quasi-static approximation and its use for magnetoencephalography forward calculation in realistic volume conductors. Phys Med Biol. 48(22):3637-3652.
Pammer K, Hansen P, Kringelbach M, Holliday I, Barnes G, Hillebrand A, Singh KD, Cornelissen P. 2004. Visual word recognition: the first half-second. Neuroimage. 22(4):1819-1825.
Penny WD, Stephan KE, Daunizeau J, Rosa MJ, Friston KJ, Schofield TM, Leff AP. 2010. Comparing families of dynamic causal models. PLoS Comput Biol. 6(3):e1000709.
Penny WD, Stephan KE, Mechelli A, Friston KJ. 2004. Comparing dynamic causal models. Neuroimage. 22(3):1157-1172.
Perea M, Lupker SJ. 2003. Does jugde activate COURT? Transposed-letter similarity effects in masked associative priming. Mem Cogn. 31(6):829-841.
Perea M, Pollatsek A. 1998. The effects of neighbourhood frequency in reading and lexical decision. J Exp Psychol Hum Percept. 24:767-779.
Price CJ, Devlin JT. 2011. The interactive account of ventral occipitotemporal contributions to reading. Trends Cogn Sci. 15(6):246-253.
Price CJ, Devlin JT. 2003. The myth of the visual word form area. Neuroimage. 19(3):473-481.
Pylkkänen L, Marantz A. 2003. Tracking the time course of word recognition with MEG. Trends Cogn Sci. 7(5):187-189.
Pylkkänen L, McElree B. 2007. An MEG study of silent meaning. J Cogn Neurosci. 19(11):1905-1921.
Richardson FM, Seghier ML, Leff AP, Thomas MS, Price C. 2011. Multiple routes from occipital to temporal cortices during reading. J Neurosci. 31(22):8239-8247.
Rossion B, Joyce CA, Cottrell GW, Tarr MJ. 2003. Early lateralization and orientation tuning for face, word, and object processing in the visual cortex. Neuroimage. 20(3):1609-1624.
Schofield TM, Iverson P, Kiebel SJ, Stephan KE, Kilner JM, Friston KJ, Crinion JT, Price CJ, Leff AP. 2009. Changing meaning causes coupling changes within higher levels of the cortical hierarchy. Proc Natl Acad Sci U S A. 106(28):11765-11770.
Seghier ML, Josse G, Leff AP, Price CJ. 2011. Lateralization is predicted by reduced coupling from the left to right prefrontal cortex during semantic decisions on written words. Cereb Cortex. 21(7):1519-1531.
Starrfelt R, Gerlach C. 2007. The visual what for area: words and pictures in the left fusiform gyrus. Neuroimage. 35(1):334-342.
Stephan KE, Penny WD, Daunizeau J, Moran RJ, Friston KJ. 2009. Bayesian model selection for group studies. Neuroimage. 46(4):1004-1017.
Tagamets MA, Novick JM, Chalmers ML, Friedman RB. 2000. A parametric approach to orthographic processing in the brain: an fMRI study. J Cogn Neurosci. 12(2):281-297.
Tarkiainen A, Helenius P, Hansen PC, Cornelissen PL, Salmelin R. 1999. Dynamics of letter string perception in the human occipitotemporal cortex. Brain. 122(11):2119-2132.
Turken A, Dronkers M. 2011. The neural architecture of the language comprehension network: converging evidence from lesion and connectivity analyses. Front Syst Neurosci. 5:1-20.
Twomey T, Kawabata Duncan KJ, Price CJ, Devlin JT. 2011. Top-down modulation of ventral occipito-temporal responses during visual word recognition. Neuroimage. 55(3):1242-1251.
Van der Haegen L, Brysbaert M, Davis CJ. 2009. How does interhemispheric communication in visual word recognition work? Deciding between early and late integration accounts of the split fovea theory. Brain Lang. 108(2):112-121.
Vartiainen J, Parviainen T, Salmelin R. 2009. Spatiotemporal convergence of semantic processing in reading and speech perception. J Neurosci. 29(29):9271-9280.
Wager TD, Keller MC, Lacey SC, Jonides J. 2005. Increased sensitivity in neuroimaging analyses using robust regression. Neuroimage. 26(1):99-113.
Wheat KL, Cornelissen P, Frost SJ, Hansen P. 2010. During visual word recognition, phonology is accessed within 100 ms and may be mediated by a speech production code: evidence from magnetoencephalography. J Neurosci. 30(15):5229-5233.
Yeterian EH, Pandya DN. 2010. Fiber pathways and cortical connections of preoccipital areas in rhesus monkeys. J Comp Neurol. 518(18):3725-3751.
Yeterian EH, Pandya DN, Tomaiuolo F, Petrides M. 2011. The cortical connectivity of the prefrontal cortex in the monkey brain. Cortex.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.