We report that preexisting individual differences in the cortical thickness of brain areas involved in a perceptual learning task predict the subsequent perceptual learning rate. Participants trained in a motion-discrimination task involving visual search for a “V”-shaped target motion trajectory among inverted “V”-shaped distractor trajectories. Motion-sensitive area MT+ (V5) was functionally identified as critical to the task: after 3 weeks of training, activity increased in MT+ during task performance, as measured by functional magnetic resonance imaging. We computed the cortical thickness of MT+ from anatomical magnetic resonance imaging volumes collected before training started, and found that it significantly predicted subsequent perceptual learning rates in the visual search task. Participants with thicker neocortex in MT+ before training learned faster than those with thinner neocortex in that area. A similar association between cortical thickness and training success was also found in posterior parietal cortex (PPC).
Perceptual learning refers to enduring changes in the perceptual processing of frequently encountered stimuli (Gibson 1963; Sasaki et al. 2010). It is usually characterized by some degree of specificity to the trained stimulus attributes including stimulus location (e.g., Ball and Sekuler 1982; Karni and Sagi 1991; Poggio et al. 1992; Ahissar et al. 1998; Saffell and Matthews 2003; Jastorff et al. 2006; Bi et al. 2010), the recruitment of stimulus-specific cortical areas (e.g., Vaina et al. 1998; Schwartz et al. 2002; Furmanski et al. 2004; Grossman et al. 2004; Song et al. 2010; Bi et al. 2014; Frank et al. 2014), as well as stability of acquired behavioral improvements over time (e.g., Ball and Sekuler 1987; Karni and Sagi 1993; Watanabe et al. 2002; Bi et al. 2014; Frank et al. 2014). The study of perceptual learning has the potential to provide insights into neural correlates of learning more generally.
Many studies have found substantial individual differences in perceptual learning rate (e.g., Fahle and Edelman 1993; Watanabe et al. 2002; Saffell and Matthews 2003; Grossman et al. 2004; Kourtzi et al. 2005; Sigman et al. 2005; Mukai et al. 2007; Yotsumoto et al. 2008; Bang et al. 2013). One factor accounting for some of the intersubject variance in perceptual learning rate is participants' initial performance levels: in some perceptual learning tasks, better initial performance is associated with smaller subsequent training-related performance changes (Fahle and Henke-Fahle 1996; Astle et al. 2013). However, substantial intersubject variance remains unaccounted for, and particularly little is known about how preexisting individual differences in brain anatomy might influence perceptual learning rate.
In other domains such as memory recall (Walhovd et al. 2006), intelligence (Narr et al. 2007; Choi et al. 2008), or cognitive performance in the elderly (Fjell et al. 2006), neocortical thickness has been shown to predict individual differences in cognition. In general, thicker cortex in task-relevant brain areas has been associated with better performance on such tasks (Kanai and Rees 2011).
We hypothesized that cortical thickness might account for some of the variance in perceptual learning rate between individuals. In particular, we predicted that thicker neocortex in task-relevant brain areas before the onset of training would be associated with faster rates of learning. To test this hypothesis, we measured participants' brain anatomy using magnetic resonance imaging (MRI). Following that, we trained them on a complex motion trajectory detection task over several sessions.
The motion-discrimination task was advantageous because it could be expected to recruit certain well-characterized brain areas involved in perceptual learning of moving stimuli. Previous studies have linked activity in areas MT+ (also called V5; Zohary et al. 1994; Vaina et al. 1998), V3ab (Shibata et al. 2012), and posterior parietal cortex (PPC) (Law and Gold 2008) with perceptual learning involving discrimination of moving stimuli or detection of global motion. In addition, many perceptual learning experiments investigating a wide variety of visual features (e.g., texture, orientation, shape, conjunctions of color, and location) have identified retinotopic visual cortex as functionally related to learning, particularly retinotopic areas corresponding to the location of the stimuli (e.g., Schwartz et al. 2002; Furmanski et al. 2004; Kourtzi et al. 2005; Yotsumoto et al. 2008; Frank et al. 2014).
We therefore expected a priori that these 4 regions—MT+, V3ab, PPC, and the specific retinotopic areas in visual cortex that processed where training stimuli appeared—were most likely to be functionally related to learning in our complex motion search task. However, to confirm this a posteriori, we measured brain activity during performance of the learning task, relative to a control condition, for the first and last training sessions using functional MRI (fMRI). We predicted increased activity in the 4 regions of interest (ROIs) after behavioral training (see Vaina et al. 1998; Schwartz et al. 2002; Furmanski et al. 2004; Grossman et al. 2004; Kourtzi et al. 2005; Yotsumoto et al. 2008; Song et al. 2010; Frank et al. 2014).
Thus, in the end, we analyzed individual differences in the pretraining cortical thickness of ROIs confirmed via fMRI to be involved in learning the motion task, correlating participants' cortical thickness in those areas with their learning rates. Significant correlations support the hypothesis that preexisting individual differences in brain anatomy account for individual differences in perceptual learning rate.
Twenty-seven participants (mean age = 26 ± 5 years, 15 females), including one of the authors (S.M.F.), were recruited for the study and gave informed written consent prior to participation. The study was approved by the local ethics committee at the University of Regensburg in accordance with the Declaration of Helsinki.
Design and Procedure
The study consisted of 18 sessions performed on separate days. During sessions 1–12, participants trained on the learning task. Most participants performed the first and last training sessions during functional MRI scanning. In the same 2 fMRI-sessions, participants also performed a pop-out visual search control condition. Condition order was counterbalanced across participants. Before training started and again after the end of training, we collected high-resolution anatomical MRI scans of each participant's brain.
After a break of 3 months during which there was no further training, a subset of participants returned for 3 follow-up test sessions in the learning task. They then performed 3 sessions of the visual search task with exchanged target and distractor identities.
For all participants, eye-tracking data were collected during all behavioral training sessions outside the scanner using a video-based eye-tracking system (Cambridge Research Systems, Kent, UK) that sampled the horizontal and vertical position of the right eye with a frequency of 250 Hz.
Visual Search Tasks
Three different visual search tasks were employed in the study. All visual search tasks used a constant number of stimuli. In each task 8 radially arranged white dots (0.3° visual angle) moved in the peripheral visual field (10° radially away from the central fixation spot). Participants had to maintain fixation and detect the presence or absence of a target dot exhibiting a particular apparent motion trajectory among distractor dots with a different motion trajectory (2-alternative forced choice task). Target position in the search array was not relevant to the response; participants only had to indicate whether a target was present.
In the learning task (primary visual search task), the target dot moved downward and to the right at a 45° angle, then upward and to the right at a 45° angle, forming a “V” shaped trajectory. Distractor dots moved upward to the right and then downward to the right, forming an inverted “V” trajectory (Fig. 1a).
The second visual search task was designed to elicit efficient (pop-out) visual search. In this control condition participants searched for a target dot moving in a single direction (either upward or downward, and to the right) among distractor dots moving along the other diagonal (Fig. 1b). Target vs. distractor direction was counterbalanced across participants.
The third visual search task used the same motion trajectories as those used in the learning task, but target and distractor identities were exchanged. Thus, participants had to search for a target moving upward and then downward, to the right, among distractors moving downward and then upward, to the right (Fig. 1c). This condition was tested after participants had finished their training on the learning task. We designed this exchange experiment in order to test if behavioral improvements were specific to the learned target and distractor identities.
In all 3 tasks, stimuli were generated and presented using the Psychophysics toolbox (Brainard 1997; Pelli 1997) running in MATLAB (The MathWorks, Natick, MA, USA). The spatial position of the initial dot, and therefore all subsequent dots on a trajectory, was slightly jittered on every trial. During a complete apparent motion trajectory a dot appeared at 9 successive screen positions for 66.6 ms each, and a 217-ms blank interval occurred at the end of a trajectory. After the blank interval the dot appeared again at the start position. Dots cycled continuously through the same trajectory until the end of each trial. Dots always moved to the right. Successive diagonal positions were 0.5° apart (corresponding to a dot speed of 7.5°/s). After the fifth position the vertical direction changed (downward to upward or vice versa), except in the pop-out control condition, where dots continued to move in the same direction. Each of the 8 dots started moving at the same time but at a different starting position within the motion trajectory. This phase jitter was introduced to avoid any perception of coherent global motion patterns that could have resulted if all dots had moved in phase with each other.
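The timing figures above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch and not the stimulus code used in the study (which ran under the Psychophysics toolbox in MATLAB); all variable names are illustrative.

```python
# Timing parameters as stated in the text; names are illustrative only.
N_POSITIONS = 9      # successive screen positions per trajectory
DWELL_MS = 66.6      # time a dot is shown at each position
BLANK_MS = 217.0     # blank interval at the end of a trajectory
STEP_DEG = 0.5       # distance between successive diagonal positions
TRIAL_MS = 4000.0    # stimulus duration per trial

# Apparent speed: one 0.5 deg step every 66.6 ms.
speed_deg_per_s = STEP_DEG / (DWELL_MS / 1000.0)

# One full cycle = nine dwell periods plus the blank interval.
cycle_ms = N_POSITIONS * DWELL_MS + BLANK_MS

# Number of trajectory cycles that fit into one 4 s trial.
cycles_per_trial = TRIAL_MS / cycle_ms

print(f"speed: {speed_deg_per_s:.2f} deg/s")
print(f"cycle: {cycle_ms:.1f} ms")
print(f"cycles per trial: {cycles_per_trial:.2f}")
```

Under these values one cycle lasts about 816 ms, so each dot traverses its trajectory a little under five times during a 4 s trial.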
Stimuli were presented for 4 s on each trial. Participants performed a 2-alternative forced choice task and indicated target presence or absence by pressing one of 2 buttons. They were asked to respond as quickly and accurately as possible. Participants had an additional 2 s after stimulus offset to give a response if they did not respond during the presentation. Visual feedback (green or red fixation color change for correct or incorrect) was provided at the end of each trial. In the behavioral sessions, participants initiated each trial using a button press. During functional MRI sessions, trial onsets were synchronized to scan onsets at jittered regular intervals (varying between 4, 6, and 8 s, counterbalanced within each run). The order of target present and target absent trials was random, and 160 trials were performed per session (50% target-present trials, 50% target-absent). During fMRI, the same number of trials was split into 5 runs with 32 trials each (half of which contained a target). Trial order within each run was random. In each behavioral session, the position of the target within the stimulus array was counterbalanced across trials (in the fMRI-sessions it was counterbalanced within each run). We equated stimulus appearance between the MRI scanner and training computer by using the same viewing distance (63 cm) and similar luminance proportions of stimuli and background (Michelson contrast: Psychophysics = 0.99, MRI = 0.98; luminance of black background: Psychophysics = 0.16 cd/m², MRI = 1.7 cd/m²; luminance of white stimuli: Psychophysics = 185.44 cd/m², MRI = 193 cd/m²). Behavioral sessions of the visual search tasks lasted ∼20 min. The session duration in the scanner was ∼30 min because of longer intertrial intervals.
MR-imaging was performed with a 3-T Allegra head scanner (Siemens, Erlangen, Germany). We collected a high-resolution T1-weighted scan of each participant's brain (160 sagittal slices) using a magnetization prepared rapid gradient echo (MP-RAGE) sequence (time-to-repeat TR = 2.25 s, time-to-echo TE = 2.6 ms, flip-angle FA = 9°, voxel-size = 1 × 1 × 1 mm³, no interslice gap, field of view FOV = 240 × 256 mm²) that was optimized for the differentiation of gray and white matter using sequence parameters from the Alzheimer's Disease Neuroimaging Initiative project (http://adni.loni.ucla.edu). For functional MR-imaging we used a standard T2*-weighted EPI-sequence (34 transverse slices, TR = 2 s, TE = 30 ms, FA = 90°, voxel-size = 3 × 3 × 3 mm³, interslice gap = 0.5 mm, FOV = 192 × 192 mm²).
Anatomical. All participants underwent a high-resolution anatomical MRI scan of the brain at the beginning of the study. For each participant we also collected a high-resolution anatomical scan at the end of training on the learning task (i.e., ∼3 weeks after the first MRI-scan).
Functional localizers. In all functional localizer scans, participants performed a speeded dimming-detection task at central fixation in which they pressed a button whenever the fixation cross flickered.
MT+. During the first anatomical scan session, participants performed a functional localizer task for area MT+. It contained blocks of 200 white dots moving coherently in 12 successive translational directions, alternating with blocks of static dots. Block duration was 12 s and the localizer scan lasted 9.6 min.
Retinotopic visual cortex. For the 19 participants who performed the first and last training session in the scanner we conducted localizers for retinotopic visual cortex and the retinotopic location of the moving dots within retinotopic cortex. Retinotopy was mapped using a standard phase-encoded retinotopic mapping procedure. A bowtie-shaped double-wedge checkerboard pattern flickering in different colors rotated across 18 screen positions, subtending each location for 3 s. There were 12 cycles of rotation in total, resulting in a run length of 10.8 min. Two runs were conducted, one with clockwise and one with counterclockwise rotation.
Since the visual search training primarily stimulated subregions of the retinotopic areas (neurons representing the visual periphery), we conducted a spot-localizer to identify the retinotopic location of the 8 moving dots in the visual periphery. To this end, circular checkerboards, flickering in different colors, were presented in different blocks at dot locations in the visual quadrants or horizontal and vertical meridians. Each block lasted 12 s and was followed by a blank interval of 12 s (overall duration = 9.6 min).
We quantified participants' perceptual learning rate on the visual search task via a composite measure of reaction time and accuracy: the inverse efficiency score (IES) (Townsend and Ashby 1978). Specifically, for each participant and training session, the median reaction time (in seconds) across all trials was divided by accuracy (in proportion of one), then log-transformed to compensate for distortions of IES by low accuracy values (see Bruyer and Brysbaert 2011). We computed the rate of change in each individual's IES by fitting a linear function to the transformed IES score across training sessions (excluding retesting sessions) and calculating its slope. A steeper (i.e., more negative) slope indicated faster learning. This slope value was used in all subsequent analyses as the participant's learning rate index. Slope values were compared with zero by using a one-sample t-test in order to see if participants' performance improved with training.
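The learning-rate index described above can be sketched as follows. This is an illustrative Python reconstruction, not the authors' analysis code: the function names are hypothetical, the natural logarithm is an assumption (the text does not state the base), and the three sessions of trial data are invented purely to exercise the functions.

```python
import math
from statistics import median

def log_ies(reaction_times_s, correct):
    """Log-transformed inverse efficiency score for one session.

    reaction_times_s: per-trial reaction times in seconds
    correct: per-trial booleans (True = correct response)
    """
    accuracy = sum(correct) / len(correct)  # proportion correct, 0..1
    if accuracy == 0:
        raise ValueError("IES is undefined when accuracy is zero")
    # Median RT divided by accuracy, then log-transformed (base assumed).
    return math.log(median(reaction_times_s) / accuracy)

def ols_slope(y):
    """Least-squares slope of y against session number 1..len(y)."""
    n = len(y)
    x = range(1, n + 1)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx

# Invented data for one hypothetical participant: (RTs, correctness) per session.
sessions = [
    ([2.0, 2.4, 2.2, 2.6], [True, False, True, False]),
    ([1.8, 2.0, 1.6, 1.9], [True, True, True, False]),
    ([1.2, 1.4, 1.3, 1.1], [True, True, True, True]),
]
scores = [log_ies(rts, ok) for rts, ok in sessions]
learning_rate = ols_slope(scores)  # more negative = faster learning
print(f"learning-rate index: {learning_rate:.3f}")
```

Because both faster responses and higher accuracy lower the IES, a participant who improves on either measure obtains a negative slope.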
Transformed IES scores were also used to quantify performance in the different sessions of the pop-out and target-distractor exchange conditions. For the target-distractor exchange condition, we computed a slope value for each participant, using the same approach as for the learning task, to quantify the improvement in performance over the 3 sessions of this condition, and compared the slope values with zero using a one-sample t-test.
Participants were instructed to perform the search covertly while maintaining central fixation. However, small deviations in eye position across a trial are inevitable. If changes in the quality of fixation occur over the course of training, they could confound the interpretation of behavioral changes with respect to true learning effects. Accordingly, we analyzed the mean deviation of the eye position from central fixation (in degrees of visual angle) from trial onset until button press in all trials across behavioral training sessions (N = 25, in most cases; see Supplementary Fig. 2 for details). As with the learning rate, we fit a separate linear regression for each participant to the session-wise mean deviation of eye position from fixation (averaged across trials within a session; retesting and exchange sessions excluded), and compared the slopes of these regressions with zero, across participants, using a one-sample t-test.
MRI Data Analysis
Anatomical and functional MRI data were analyzed using Freesurfer 4.1 and the FSFAST-toolbox (Martinos Center for Biomedical Imaging, Charlestown, MA, USA).
Preprocessing. Preprocessing of the functional images included motion-correction, coregistration to the high-resolution individual T1 scan collected on the first MRI-session, smoothing (3-dimensional Gaussian kernel, full-width at half-maximum = 5 mm), and intensity-normalization. Automatic coregistrations of the functional volumes to the T1 image were checked manually and corrected if necessary.
ROI Creation. Regions of interest were defined on individual participants' inflated brains for the functional MRI data at a threshold of P < 0.001 (false discovery rate-corrected).
Area MT+. The contrast visual motion > rest (i.e., static) on the MT+ localizer revealed, in most cases, a distinct cluster of activation in the region of the occipito-temporal junction representing area MT+.
Retinotopic Cortex (including area V3ab). Phase-encoded retinotopic mapping data were analyzed according to standard techniques (see DeYoe et al. 1996; Engel et al. 1997). Area V3ab was defined as a single ROI.
Trained Retinotopic Locations (TRL). This ROI was defined by finding the intersection between functionally localized retinotopic cortex and stimulus training locations identified in the spot localizer. In most participants, these 2 thresholded maps only overlapped in V1–3.
PPC. We localized PPC (posterior parietal gyrus + sulcus) for each participant using automated parcellation (Desikan et al. 2006) of the reconstructed and inflated high-resolution anatomical scan collected at the beginning of the study.
Visual search. Preprocessed fMRI-data from the visual search conditions were analyzed using a general linear model (GLM) approach with an event-related design. The GLM-model of sessions in the learning task contained 4 predictors of interest for activity in correct target-present trials (hits), correct target-absent trials (correct rejections), incorrect target present trials (misses), and incorrect target absent trials (false alarms). In the pop-out control condition, there were only 2 predictors: hits and correct rejections (most participants had very few if any incorrect trials in this condition). Each GLM-model also contained a linear scanner drift predictor and motion-correction parameters as regressors of no-interest. The blood oxygen level-dependent (BOLD) response was modeled using the SPM canonical hemodynamic response function. Event durations were based on trial-wise reaction times, as in previous work (i.e., the BOLD response was modeled from trial onset until the participant's response) (see Frank et al. 2014).
For each predictor of interest and each session, BOLD percent signal change from implicit baseline, based on the fitted amplitude of the response, was calculated separately for left and right hemispheres in each ROI, then the results were averaged across hemispheres. Reported results focus on activations collapsed across hit and correct-rejection trials. Because only correct trials are compared and trial-wise reaction times define the BOLD response model, potential confounds related to changes in participant performance are minimized, so the results can be interpreted as reflecting changes in stimulus processing (see Furmanski et al. 2004).
MRI-activity in the learning task before and after training was compared with MRI-activity in the pop-out control task using a separate 2 × 2 repeated-measures analysis of variance (ANOVA) in each ROI, with factors condition (learning task or pop-out control) and time-point (before or after training).
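In a 2 × 2 design in which both factors vary within participants, the interaction F statistic equals the square of a one-sample t statistic computed on each participant's difference of differences. The Python sketch below uses this equivalence to illustrate what the interaction term tests. It is not the authors' analysis code; the function name and the four-participant data set are hypothetical.

```python
import math
from statistics import mean, stdev

def interaction_f(learn_pre, learn_post, ctrl_pre, ctrl_post):
    """F statistic for the condition x time-point interaction in a 2 x 2
    repeated-measures design, with (1, n - 1) degrees of freedom.

    Each argument holds one value per participant (e.g., percent signal
    change in one ROI), in the same participant order.
    """
    # Training-related change in the learning task minus that in the control.
    contrast = [(lp2 - lp1) - (cp2 - cp1)
                for lp1, lp2, cp1, cp2
                in zip(learn_pre, learn_post, ctrl_pre, ctrl_post)]
    n = len(contrast)
    # One-sample t statistic of the contrast against zero.
    t = mean(contrast) / (stdev(contrast) / math.sqrt(n))
    return t ** 2, (1, n - 1)

# Invented percent-signal-change values for four hypothetical participants.
f_value, df = interaction_f(
    learn_pre=[0.20, 0.25, 0.18, 0.30],
    learn_post=[0.35, 0.45, 0.30, 0.40],
    ctrl_pre=[0.40, 0.42, 0.38, 0.45],
    ctrl_post=[0.41, 0.40, 0.39, 0.46],
)
print(f"F({df[0]},{df[1]}) = {f_value:.2f}")
```

A large F here means that the pre-to-post change in the learning task differs reliably from the pre-to-post change in the control task; converting F to a P value additionally requires the F distribution, which is omitted from this sketch.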
As an additional analysis, we correlated each participant's overall change in activity (correct trials only) in each ROI from the beginning to the end of training with his or her perceptual learning rate. A difference score comparing post- to pretraining activation in the learning task was used to quantify changes in activity. This difference score was computed as BOLD percent signal change in post-training minus BOLD percent signal change in pretraining. BOLD percent signal change was calculated as described above and represents activity collapsed across correct target present and absent trials. As a control, we performed the same correlation using activity in incorrect target present and absent trials.
Each participant's high-resolution anatomical T1 scans (one collected at the beginning and one at the end of training) were separately reconstructed and inflated in Freesurfer (Dale et al. 1999; Fischl et al. 1999). The reconstructed pretraining T1 was used to compute the average pretraining cortical thickness. Cortical thickness was calculated as the distance between the white/gray matter boundary and the pial surface, with submillimeter resolution (see Fischl and Dale 2000; Dickerson et al. 2008). The cortical thickness reported for each bilateral ROI is the average of its left- and right-hemisphere values.
Potential correlations were evaluated between the pretraining cortical thickness of each ROI and subsequent perceptual learning rate. We also performed a post hoc whole-brain correlation between pretraining cortical thickness and perceptual learning rate using the QDEC tool in Freesurfer.
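The ROI-wise test can be sketched as a correlation between two per-participant vectors. The Python example below assumes a Pearson product-moment correlation, which the text does not state explicitly. The function names, thickness values, and learning rates are hypothetical and serve only to illustrate the computation.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Invented values: pretraining thickness (mm) and learning-rate slopes.
thickness_mm = [2.1, 2.3, 2.4, 2.6, 2.8]
learning_rate = [-0.02, -0.03, -0.03, -0.05, -0.06]

r = pearson_r(thickness_mm, learning_rate)
t = r_to_t(r, len(thickness_mm))
print(f"r = {r:.2f}, t({len(thickness_mm) - 2}) = {t:.2f}")
```

Because a more negative slope indicates faster learning, a negative r means that thicker cortex goes with faster learning. As a worked instance of the conversion, a correlation of −0.44 in 25 participants corresponds to t ≈ −2.35 on 23 degrees of freedom.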
In addition, the pre-training cortical surface area of each ROI was computed based on the pre-training T1 scan and examined for a correlation with subsequent learning rate. Cortical surface area refers to the size of the ROI (in mm²) on the cortical flat map.
Finally, we investigated possible changes in cortical thickness over the course of training. For this analysis, ROIs defined on each individual's pretraining anatomical scan were remapped to the same participant's post-training anatomical scan. The average thickness of each ROI before and after training was compared with a paired-samples t-test.
Figure 1d shows average behavioral improvements over sessions. Two participants exhibited learning rates >2 standard deviations above or below the group mean and were excluded as outliers from all subsequent analyses, leaving a total of 25 participants. With training, the IES learning index decreased dramatically. Individual slopes (capturing the change in IES over sessions) were significantly different from zero as indicated by a one-sample t-test (t(24) = −9.00, P < 0.001). The improvements with training were also evident in participants' primary behavioral measures (accuracy, reaction time, and d′; see Supplementary Fig. 1b–d). Although each participant improved with training, there was substantial variability in learning rate between participants (see Supplementary Fig. 1a).
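The exclusion rule for behavioral outliers can be illustrated with a short Python sketch. It assumes that the group mean and the sample standard deviation (n − 1 denominator) were computed over all participants, including the outliers themselves; the text does not specify either detail. The function name and slope values are hypothetical.

```python
from statistics import mean, stdev

def exclude_outliers(values, n_sd=2.0):
    """Return (kept, excluded) index lists using a mean +/- n_sd * SD rule."""
    m, sd = mean(values), stdev(values)
    kept, excluded = [], []
    for i, v in enumerate(values):
        # A value is an outlier if it lies more than n_sd SDs from the mean.
        (excluded if abs(v - m) > n_sd * sd else kept).append(i)
    return kept, excluded

# Invented learning-rate slopes; the last participant learns unusually fast.
slopes = [-0.05, -0.06, -0.04, -0.05, -0.07,
          -0.05, -0.06, -0.04, -0.05, -0.30]
kept, excluded = exclude_outliers(slopes)
print("excluded participants:", excluded)
```

Note that an extreme value inflates the standard deviation it is compared against, so this rule is conservative in small samples.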
In the control condition with pop-out visual search (N = 19), there was a marginally significant difference in the IES between post- and pretraining performance (t(18) = −1.94, P = 0.07; circle-symbols at training days 1 and 12 in Fig. 1d and Supplementary Fig. 1b,c). This trend mainly resulted from slightly faster reaction times in the post-training session (see Supplementary Fig. 1c). Accuracy was near ceiling in both sessions (Supplementary Fig. 1b), as expected for an efficient, pop-out visual search condition.
A subset of participants (N = 10) returned 3 months after their last training session to perform 3 more sessions of the learning task (retest-condition) (see R1–R3 in Fig. 1d and Supplementary Fig. 1). There was no significant difference between learning indices measured on the final day of training and first retesting session (paired t(9) = −0.60, P = 0.57). This suggests that there was little if any forgetting (i.e., loss in performance) when participants stopped practicing the task for 3 months.
After completion of the retest condition, returning participants performed 3 additional sessions in which target and distractor identities were exchanged. This resulted in a dramatic drop in performance by all metrics (see E1–E3 in Fig. 1d, Supplementary Fig. 1). The difference in IES between the third retesting session and the first exchange session was significant (R3 vs. E1, paired t(9) = −10.68, P < 0.001). These results suggest that participants' learning was specific to the trained target and distractor identities, consistent with previous results (Shiffrin and Schneider 1977; Ahissar et al. 1998; Frank et al. 2014). Over the course of the 3 sessions with exchanged target and distractor identities, participants' performance improved (individual slopes were significantly different from zero: one-sample t(9) = −2.29, P = 0.048), indicative of new learning.
Participants' fixation quality measured during behavioral testing remained consistent over the course of training, across trial types (t(24) = 1.61, P = 0.12). Separate analyses for target-present and target-absent trials indicated the same result (target-present: t(24) = 1.56, P = 0.13; target-absent: t(24) = 1.47, P = 0.16). Fixation quality was quite good in general; the mean deviation of eye-position across all trials and training sessions was 1.78° ± 0.75° (see Supplementary Fig. 2). Changes in eye movements, therefore, were not considered to be a likely confound for interpretation of the behavioral or MRI data.
ROI locations in a representative participant are shown in Figure 2. Figure 3 shows activity in each ROI, collapsed across correct target present and absent trials, in the first and last training sessions of the learning task and the pop-out control task. Each condition and session included 19 participants (the other 6 participants in the study only completed behavioral trainings).
The 2 × 2 repeated-measures ANOVAs with factors condition (trajectory-training task or pop-out control) and time-point (before or after behavioral training) conducted separately in each ROI revealed the following effects. There was a significant interaction in the fMRI percent signal change in area MT+ (F(1,18) = 9.71, P = 0.006), suggesting that activity significantly increased in the learning task after training but did not change in the pop-out control condition (Fig. 3a). There were no main effects of search condition (F(1,18) = 3.15, P = 0.09) or time-point (F(1,18) = 0.57, P = 0.46) in MT+. Similar interactions between search condition and time-point were obtained in TRL in visual cortex (F(1,18) = 7.14, P = 0.02) and area V3ab (F(1,18) = 10.42, P = 0.005). Activity in these 2 areas also increased from the pretraining to post-training sessions in the learning task and remained relatively unchanged in the pop-out control condition (Fig. 3b,c). There were significant main effects of condition in TRL (F(1,18) = 16.79, P < 0.001) and V3ab (F(1,18) = 13.28, P = 0.002), indicating that activity in the control task was more pronounced than in the trajectory-learning task across time-points (Fig. 3b,c). There was also a marginally significant effect of time-point on TRL (F(1,18) = 4.18, P = 0.06) and no such effect in V3ab (F(1,18) = 1.30, P = 0.27). In PPC, there were no main effects of condition (F(1,18) = 0.05, P = 0.83) or time-point (F(1,18) = 0.23, P = 0.64), nor an interaction between the 2 factors (F(1,18) = 1.84, P = 0.19).
Correlation Between Behavioral and fMRI Results
For each ROI, we correlated the change in BOLD signal between the first and last sessions of the learning task with behavioral learning rate on this task (i.e., slope of change in IES-score across all training sessions). There was a significant correlation in MT+ (r = −0.47, P = 0.04): participants with faster learning also exhibited more pronounced increases in fMRI activity in that ROI with learning. No such correlation was evident in TRL (r = −0.17, P = 0.47), V3ab (r = −0.33, P = 0.16), or PPC (r = −0.34, P = 0.15). As a control, we performed the same correlation on BOLD activity during incorrect trials (misses and false alarms), and did not find any significant effects (MT+: r = −0.40, P = 0.09; TRL: r = −0.11, P = 0.66; V3ab: r = −0.32, P = 0.18; PPC: r = −0.23, P = 0.34). Although there was a trend for a correlation in MT+ in incorrect trials, the change in activity from pretraining to post-training was significantly more pronounced in correct trials than in incorrect trials in that area (paired t(18) = 2.52, P = 0.02).
Pretraining Anatomical Differences
We examined whether cortical thickness, measured before training, in areas where we predicted there would be and later indeed found changes in activity with training (i.e., MT+, TRL, and V3ab), would be predictive of subsequent perceptual learning rate. We found a significant correlation between pretraining cortical thickness and subsequent perceptual learning rate in area MT+ (r = −0.44, P = 0.03; N = 25) (Fig. 4). In contrast, no significant correlations between pretraining cortical thickness and learning rate were observed in TRL (r = −0.18, P = 0.46; N = 19) or area V3ab (r = −0.34, P = 0.15; N = 19).
Following the example of Bi et al. (2014), we later performed an exploratory whole-brain correlation of pretraining cortical thickness and subsequent perceptual learning rate (Fig. 5, N = 25). There were clusters of significant correlation between pretraining cortical thickness and subsequent learning rate at a location congruent with the average location of MT+ across participants. These clusters were evident in both hemispheres. In addition, there were 2 bilateral clusters of significant correlations in PPC. Based on this observation we investigated the PPC ROI, where activity remained unchanged by training, for a correlation of pretraining thickness and learning rate. Similar to area MT+, thicker cortex in PPC before training was associated with faster subsequent learning (PPC: r = −0.52, P = 0.008; N = 25; Fig. 6). Cortical thicknesses of MT+ and PPC were significantly intercorrelated: participants with thicker MT+ also tended to have a thicker PPC and vice versa (r = 0.52, P = 0.008).
The results obtained were not dependent upon the exclusion of the 2 behavioral outlier participants. The same results hold when these participants were included in a control analysis. There were still significant correlations in MT+ (r = −0.40, P = 0.04; N = 27) and PPC (r = −0.53, P = 0.004; N = 27) but not in TRL (r = −0.05, P = 0.83; N = 20) or V3ab (r = −0.14, P = 0.56; N = 20).
When the analysis was restricted to the 19 participants who performed the first and last training sessions in the scanner (blue circles in Figs 4 and 6), there was a marginally significant correlation in MT+ (r = −0.43, P = 0.07) and a significant correlation in PPC (r = −0.57, P = 0.01). Again, no significant correlation was present in TRL (r = −0.18, P = 0.46) or V3ab (r = −0.34, P = 0.15).
In no ROI did pretraining cortical thickness predict participants' starting performance on the search task (i.e., the intercepts of their learning functions) (MT+: r = 0.09, P = 0.66; PPC: r = 0.25, P = 0.23; TRL: r = 0.32, P = 0.19; V3ab: r = 0.15, P = 0.53). Cortical thickness did not change significantly between the beginning and end of training in any ROI (MT+: t(24) = −0.94, P = 0.36; PPC: t(24) = 1.29, P = 0.21; TRL: t(18) = 0.50, P = 0.62; V3ab: t(18) = −0.96, P = 0.35). Pre-training cortical surface area (i.e., size of the ROI on the cortical flat map) did not significantly predict learning rate in any ROI (MT+: r = 0.11, P = 0.59; PPC: r = −0.22, P = 0.28; TRL: r = 0.08, P = 0.73; V3ab: r = 0.25, P = 0.31).
Using the human visual system as a model, we demonstrate that preexisting individual differences in brain anatomy predict subsequent learning rates. In particular, having thicker cortex in area MT+ before training is associated with faster learning of complex motion trajectories in a visual search task that recruits MT+.
Improvements in Behavior
Over 3 weeks of training, participants' performance on a visual search task for a complex motion trajectory improved dramatically. This is consistent with a large literature describing a wide variety of perceptual learning effects (e.g., Ball and Sekuler 1982; Karni and Sagi 1991; Fahle and Edelman 1993; Watanabe et al. 2002; Jastorff et al. 2006; Bi et al. 2010). The data show that, once acquired, these improvements were stable over the course of months (see also Ball and Sekuler 1987; Karni and Sagi 1993; Watanabe et al. 2002; Bi et al. 2014; Frank et al. 2014). Moreover, the learning was specific to the trained target and distractor identities. When target and distractor identities were exchanged, acquired improvements in visual search were abolished (see also Shiffrin and Schneider 1977; Ahissar et al. 1998; Frank et al. 2014). Thus, the observed improvements are unlikely to have been driven by global difference detection or by learning of a more efficient search strategy, because using familiar distractors as targets and vice versa substantially impaired search performance.
Changes in Functional Activity
After 3 weeks of behavioral training, changes in activity during task performance were observed in sensory cortex: activity in MT+, TRL, and V3ab increased, compared with pretraining activity. Meanwhile, activity in these areas remained essentially unchanged in the pop-out control condition (except in TRL). These results are in agreement with other reports of increased activity in sensory cortex as a result of perceptual training (e.g., Vaina et al. 1998; Schwartz et al. 2002; Furmanski et al. 2004; Kourtzi et al. 2005; Yotsumoto et al. 2008; Frank et al. 2014). In our study, MT+ was the only area where activity increases were correlated with learning rate. In this respect, area MT+ might be central to learning our motion task.
Yotsumoto et al. (2008) found that training on a perceptual task was initially associated with an increase in activity in visual cortex, but that this activity returned to pretraining baseline levels after longer periods of consolidation. Our results are consistent with the first part of this finding, since we found increases in early retinotopic visual areas. However, our experiment was not designed to measure activity after a long period of consolidation, so we could not test for the “return-to-baseline” of activity reported by Yotsumoto et al. (2008).
In contrast to sensory areas, no activity change was evident in PPC. Activity in PPC did not significantly differ between the learning task and the pop-out control condition at any time. It has been suggested that higher-order regions like PPC read out sensory information coming from earlier areas (see Sasaki et al. 2010) and that improvements in this read-out process might be key to perceptual learning (e.g., Law and Gold 2008; Dosher et al. 2013). The lack of change in PPC activity might suggest unchanged read-out over time (at least as measured by fMRI). Instead, activity changed in sensory cortex. Thus, the present data are more in line with explanations of perceptual learning involving changes in representations in sensory areas (e.g., Vaina et al. 1998; Schwartz et al. 2002; Furmanski et al. 2004; Kourtzi et al. 2005).
This raises the question of why PPC emerged as the one area other than MT+ that showed a bilateral association between cortical thickness and learning rate. Considering the significant intercorrelation between the thicknesses of these 2 areas, it is possible that the association is spurious: MT+ and PPC may simply covary in thickness, so that the PPC-learning relationship is driven by a third variable (MT+ thickness). However, it is also possible that the PPC-learning relationship reflects a real influence of PPC structure on performance of the visual-search learning task. PPC is certainly relevant to the task: reading out the detected pattern from sensory cortex is essential in order to form a decision about the stimulus array (i.e., target present or absent, see also Shadlen and Newsome 2001; Sasaki et al. 2010). In this sense, people with thicker, better performing cortex in PPC might experience better read-out and consequently learn the task faster. Alternatively, PPC may perform some other processing that is necessary for learning to take place, such as the shifting or allocation of attention, but that does not itself change with training (Corbetta et al. 1995; Nobre et al. 2003; Frank et al. 2014).
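One way a reader might probe the third-variable account is a first-order partial correlation, which estimates the PPC-learning association after the shared variance with MT+ thickness is removed. This is a sketch of the textbook formula, not an analysis reported in this study. The function name `partial_corr` is a hypothetical helper, and the input correlations are illustrative placeholders rather than values from our data.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y with z partialled out.

    Here x could stand for PPC thickness, y for learning rate,
    and z for MT+ thickness (labels are illustrative assumptions).
    """
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


# Hypothetical inputs, NOT values from this study.
r = partial_corr(r_xy=-0.5, r_xz=0.5, r_yz=-0.5)
print(round(r, 3))  # -0.333
```

A partial correlation near zero would favor the third-variable account, whereas a value that remains substantial would suggest that PPC thickness carries predictive information beyond what it shares with MT+ thickness.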
In summary, we suggest that a network of areas, consisting of regions in retinotopic cortex, area V3ab, and area MT+, is involved in learning a sensory representation of our stimuli. This is indicated by increased activation in these areas during task performance after training and might reflect increased neuronal firing in response to the stimuli (Furmanski et al. 2004). We speculate that the output from this network, consisting of representations associated with target present or target absent search arrays, is provided to or accessed by PPC, which reads out this information and informs a decision about target presence or absence.
These findings add to a growing literature on cortical thickness. Variations in cortical thickness have been associated with individual differences in a variety of domains, including perception (see Kanai and Rees 2011, for a review). In binocular rivalry, the thickness of parietal cortex is positively correlated with individuals' reported alternation rates (Kanai et al. 2010). Another study found that gray matter volume of left posterior superior temporal sulcus (STS) and left ventral premotor cortex was positively correlated with the detectability of biological motion in noise (Gilaie-Dotan et al. 2013). Ditye et al. (2013) report an increase in cortical thickness of STS following training on a perceptual learning task, the size of which was positively correlated with individual learning rates. In addition, preexisting cortical thickness was recently reported to be predictive of perceptual learning rate for face-view discrimination (Bi et al. 2014). Bi et al.'s results are unusual, however, in that thinner cortex was associated with faster learning of the stimuli.
Our data are consistent with the majority of previous cortical thickness effects reported in other domains (see Kanai and Rees 2011). Our results show a beneficial effect of thick cortex (unlike Bi et al. 2014), while no change in cortical thickness with training was evident (unlike Ditye et al. 2013).
Why would it be beneficial to have a thicker cortex in MT+ for learning a motion-discrimination task? We presume that thicker cortex corresponds to increased processing capacity (see Kanai and Rees 2011). If so, a thicker MT+ might more efficiently combine sensory inputs into a coherent representation of target and distractor stimuli. It is important to emphasize, though, that it is impossible to directly relate macroscopic differences in cortical thickness measured with MRI to differences on the cellular level.
This experiment demonstrates that individual differences in learning rate in a motion-discrimination task can be predicted by preexisting anatomical differences in task-relevant brain areas: participants with thicker MT+ before training learned more quickly. No such relationship was found in the other visual areas examined (TRL and V3ab), though post hoc analyses identified a similar relationship in PPC. Taken together, these results demonstrate, using the visual system as a model, that preexisting individual differences in the anatomy of brain areas critical to a task can predict subsequent learning rate on that task. It remains an interesting and open question for future research whether this finding extends into other domains of learning.
We are grateful to Steven Blurton, Martin Gall, Markus Goldhacker, Tina Plank, and Katharina Rosengarth for assistance with data collection and to our participants for their time. We also thank 3 anonymous reviewers for helpful comments on a previous version of the manuscript.
Conflict of Interest: None declared.