Abstract

Perception and actions can be tightly coupled; but does a perceptual event dissociated from action processes still engage the motor system? We conducted 2 functional magnetic resonance imaging studies involving rhythm perception and production to address this question. In experiment 1, on each trial subjects 1st listened in anticipation of tapping, and then tapped along with musical rhythms. Recruitment of the supplementary motor area, mid-premotor cortex (PMC), and cerebellum was observed during listen with anticipation. To test whether this activation was related to motor planning or rehearsal, in experiment 2 subjects naively listened to rhythms without foreknowledge that they would later tap along with them. Yet, the same motor regions were engaged despite no action–perception connection. In contrast, the ventral PMC was only recruited during action and action-coupled perceptual processes, whereas the dorsal part was only sensitive to the selection of actions based on higher-order rules of temporal organization. These functional dissociations shed light on the nature of action–perception processes and suggest an inherent link between auditory and motor systems in the context of rhythm.

Introduction

Audiovisual mirror neurons discharge when an action is heard, seen or performed (Kohler et al. 2002; Keysers et al. 2003). These findings have led to the idea that mirror neurons with auditory properties may also be involved in a hearing–doing network for music performance (Bangert et al. 2006; Lahav et al. 2007). These studies show that the ventral premotor cortex (vPMC) and posterior regions of the inferior frontal gyrus (pars opercularis, Brodmann’s area [BA] 44; pars triangularis, BA 45) are engaged when subjects listen to action-related sounds such as musical melodies that they have been trained to play. Such findings add to a large body of literature on mirror neurons and imagery, demonstrating that the neural substrates mediating action and perception can be tightly coupled (Grezes and Decety 2001; Rizzolatti and Craighero 2004). In fact, the neural principles underlying action-observation and imagery may be similar (Grezes and Decety 2001); perceptual events are often related in an inextricable manner to motor actions so that for example, the sound of paper tearing or hearing a familiar piece of music can invoke imagery of movements being made to rip the sheet of paper or execute the musical piece, respectively. However, one outstanding question is whether a purely perceptual event dissociated from action processes can still engage the motor system. That is, if sounds do not signal the motor system to act, would auditory mirror neurons still resonate? Here we investigate action–perception coupling and decoupling in the context of musical rhythm processing to test this hypothesis.

It has been well established that movement synchronization with auditory rhythms (Rao et al. 1997; Jancke et al. 2000; Lewis et al. 2004; Chen et al. 2006, 2008) and imagery of musical performance (Langheim et al. 2002; Lotze et al. 2003; Meister et al. 2004) engage motor regions of the brain, including the PMC, supplementary and presupplementary motor areas (SMA and pre-SMA, respectively), and cerebellum. More interestingly, recruitment of motor regions has also been demonstrated during music perception (Haueisen and Knosche 2001; Zatorre and Halpern 2005; Bangert et al. 2006; D'Ausilio et al. 2006; Baumann et al. 2007; Lahav et al. 2007). However, the aim of these studies was to provide evidence for a tight auditory–motor coupling, in which case sounds were meaningful to the motor system. For example, subjects were trained to make sound–movement associations by learning how to play simple melodies on a keyboard, or the stimuli were familiar so that subjects might have easily imagined or rehearsed their performance while listening. Similarly, motor recruitment during music (Sakai et al. 1999; Xu et al. 2006; Grahn and Brett 2007) and speech (Geiser et al. forthcoming) rhythm perception may in part be due to motor preparation or rehearsal. Therefore, the present experiment aimed to determine whether motor regions of the brain are still involved when one only listens to musical rhythms without imagining or anticipating the synchronization of movements with them.

On the one hand, a piece of music is merely composed of a sequence of sounds, so listening to it should simply engage the auditory system. On the other hand, music can also be catalytic in stimulating rhythmic movements: people often spontaneously tap their feet or nod along, synchronizing each action with the beat of a tune, regardless of musical aptitude (Snyder and Krumhansl 2001; Large et al. 2002). Furthermore, the manner in which people move can affect how sounds are perceived: babies and adults prefer listening to rhythms whose beat they have been bounced to, and not to rhythms whose beat is motorically unfamiliar (Phillips-Silver and Trainor 2005, forthcoming). This suggests that a natural link between the auditory and motor systems may exist. Therefore, we hypothesize that the PMC, a region known to be involved in sensorimotor transformations (Wise et al. 1996; Rizzolatti and Luppino 2001; Hoshi and Tanji 2007; Zatorre et al. 2007), could be the neural substrate mediating these interactions. In particular, some have argued that the vPMC and regions in the posterior inferior frontal gyrus are important for the processing of action-related sounds (Bangert et al. 2006; Lahav et al. 2007). Recent evidence has also pointed to the role of the dorsal PMC (dPMC) in mediating auditory–motor interactions during rhythmic tapping, and suggests that it may be important in abstracting higher-order information from an auditory stimulus so that timely actions can be implemented (Chen et al. 2006, 2008). Thus, the present paper will also evaluate the suggestion that different subregions within the PMC might mediate different types of auditory–motor interactions.

We report 2 functional magnetic resonance imaging (fMRI) studies in which the results of the 1st investigation motivated the 2nd. In experiment 1, subjects listened with anticipation and tapped in synchrony with 3 musical rhythms that varied in temporal complexity. Manipulation of rhythm complexity allowed us to assess the hypothesized premotor functional dissociation. In experiment 2, the critical manipulation involved inclusion of a naïve passive listening condition; that is, subjects listened to the rhythms but were naïve to the fact that they would be tapping along with them during the latter portion of the experiment. This allowed us to disambiguate involvement of the motor system, including the PMC, during naïve passive perception from its involvement in action-coupled perceptual processes, such as motor preparation and imagery, and in action.

Methods and Materials

Subjects

Experiment 1

Twelve (6 female) subjects participated in the experiment after giving informed written consent for a protocol approved by the Montreal Neurological Research Ethics Review Board. Volunteers ranged from 20 to 32 years of age (mean 23.83 years), were right-handed, healthy with normal hearing and were nonmusicians. The definition of a nonmusician includes a person with less than 3 years of musical training or experience, and who is not currently playing an instrument. This is consistent with the selection criteria for nonmusicians used in a number of previous studies from our laboratory (e.g., Savion-Lemieux and Penhune 2005). These subjects were the same as those studied in a prior paper that focused on the neural basis of rhythm complexity (Chen et al. 2008). Importantly, the data reported in experiment 1 of the present paper were not previously reported.

Experiment 2

Twelve new subjects (6 female) were recruited following the same guidelines as those of experiment 1. Volunteers ranged from 19 to 34 years of age (mean 24 years).

Stimuli

Experiments 1 and 2

Subjects listened to and tapped in synchrony with 3 different auditory rhythms, using the index finger of the right hand on a computer mouse key. Each rhythm comprised 11 musical notes, each note composed of a woodblock sound 200 ms in duration. The interval following each sound was varied such that 5 different musical durations (onset-to-onset) would be created, each rhythm containing (in musical terminology): 5 eighth notes (each 250 ms), 3 quarter notes (each 500 ms), 1 dotted quarter note (750 ms), 1 half note (1000 ms), and 1 dotted half note (1500 ms). Thus, all rhythms were 6000 ms in duration, with the same total number and type of notes, and differed only in their temporal organization. This manipulation allowed us to create 3 rhythms with increasing temporal complexity: simple, complex, and ambiguous (Fig. 1). These rhythms were composed based on the rules of metrical organization: sequences that are temporally regular and thus induce a strong sense of beat are metric in structure (i.e., simple); those at the opposite end of the continuum that are temporally irregular are nonmetric because the sense of beat is weak or ambiguous (Povel and Essens 1985; for detailed discussion see Chen et al. 2008).
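The note inventory above can be checked with a few lines of arithmetic. The following is a minimal sketch that only restates the durations and counts given in the text; the names `NOTE_COUNTS_MS` and `rhythm_totals` are hypothetical and were chosen for illustration.

```python
# Onset-to-onset durations (ms) from the text, mapped to how often each
# occurs in one rhythm. The ordering of notes (which defines the simple,
# complex, and ambiguous rhythms) is not represented here.
NOTE_COUNTS_MS = {
    250: 5,   # eighth notes
    500: 3,   # quarter notes
    750: 1,   # dotted quarter note
    1000: 1,  # half note
    1500: 1,  # dotted half note
}


def rhythm_totals(note_counts):
    """Return (number of notes, total duration in ms) for one rhythm."""
    n_notes = sum(note_counts.values())
    total_ms = sum(duration * count for duration, count in note_counts.items())
    return n_notes, total_ms


n_notes, total_ms = rhythm_totals(NOTE_COUNTS_MS)
print(n_notes, total_ms)  # 11 6000
```

Because all 3 rhythms draw on this same inventory, any ordering of the notes yields 11 notes and a 6000-ms sequence; only the temporal organization changes.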

Figure 1.

Schematic depiction of stimuli. Top row in each case shows the temporal sequence of events; bottom row shows the equivalent musical notation. All rhythms contained the same number and type of musical note durations, but arranged to create 3 levels of increasing metrical complexity: simple, complex, ambiguous.

Procedure

Experiment 1

Prescan.

Subjects came to the laboratory 1 day prior to the fMRI session to be familiarized with the 3 rhythms in order to minimize the potential confound of motor learning during fMRI scanning. Rhythms were presented at a comfortable sound intensity level through Sony headphones using Presentation software (version 0.8, from Neurobehavioral Systems, Albany, CA) on a PC. Responses were made on the left mouse button using the right index finger and were recorded online. As a warm-up and to address any nontask-specific effects, subjects were 1st presented with 6 easy rhythms of 4 trials each. These rhythms were defined as easy because they were each composed of 3-beat motifs that repeated 3 times consecutively (as opposed to the test rhythms, which had no repeating structure). Subjects listened during the 1st trial and then tapped in synchrony for the subsequent 3 trials for each of the 6 rhythms. Next, each of the 3 test rhythms was presented in a block of 20 trials in order to optimize learning, with the order of blocks randomized across subjects. Subjects listened to a rhythm during the odd-numbered trials and learned to tap in synchrony with it during the even-numbered trials. Lastly, a block of 12 trials was given at the end of this session in which the rhythms were pseudorandomized in pairs by type. More specifically, each rhythm type was presented in 2 successive trials so that during the 1st presentation, subjects listened attentively to the rhythm, and during the 2nd presentation, they tapped in synchrony with each note of the rhythm they had heard. Effectively, the listen trial acted as a prime for the ensuing tap trial, ensuring that subjects knew which rhythm type to tap to. This block provided subjects with a preview of how trials would be presented during the fMRI session.

Scan.

While lying in the scanner bed, subjects were 1st given a block of 12 practice trials, similar to the last block of trials they carried out during the prescan session. Subjects then completed 2 runs, each of which contained a silent baseline in addition to 6 test conditions, as each rhythm type was associated with 2 tasks. Each run started with “listen with anticipation,” where subjects only listened to the rhythm without making any movements, followed by “tap in synchrony,” where subjects tapped as accurately as possible to the same rhythm, synchronizing motor responses with each note of the sequence. The 3 rhythms were pseudorandomized in pairs by type for presentation order within each run and across all subjects. Two silent trials of the same duration as the rhythm trials were interspersed every 6 paired trials. Rhythms were presented binaurally through Siemens MR-compatible pneumatic sound transmission headphones at a sound intensity of 75-dB sound pressure level (as measured using a sound pressure meter), using Presentation on a PC. All conditions were performed with eyes closed, and tap responses (key onset and offset times) were collected online.

Experiment 2

The critical difference in this study was the inclusion of a naïve passive listen condition. Thus, there was no prescan session; on the day of the fMRI scanning, subjects were instructed to simply listen to the rhythms during run 1 (Fig. 2A), without being told that they would be tapping to these same rhythms during run 2 of the experiment (Fig. 2B). To control for attention, subjects made a mouse press at the beginning of each scan acquisition, alternating between the left and right mouse button for each trial, including the silent baseline. Thus, any neural response related to this motor event was accounted for, as the subtraction analysis was performed relative to this baseline. Furthermore, because the hemodynamic response is delayed by 5–6 s, having subjects make this button press at the start of scan acquisition ensured that the hemodynamic response to this action would not be associated with the passive listening trial. After run 1, subjects were informed that, for run 2, they would be tapping to each of the rhythms just heard. In order to minimize motor learning, a training session ensued that was identical to the prescan session of experiment 1. Run 2 was then identical to the runs performed in experiment 1.

Figure 2.

Representation of the fMRI sparse-sampling protocol. In experiment 1, subjects performed the protocol for (B) twice (2 runs). Each rhythm type was presented in a pair and subjects 1st listened attentively and then tapped along with the same rhythm in the following trial. The 3 rhythm types were presented in a pseudorandom order, along with silence. In experiment 2, subjects performed the protocol for (A) (run 1) where they passively listened to the rhythms without being told that they would tap to them later on, and then the protocol for (B) (run 2) as described above.

An MR-compatible camera (MRC Systems, Germany) was aimed at the right hand/mouse and was in operation throughout the experiment. Footage from this camera was recorded onto a camcorder (Canon Optura 600 NTSC Mini DV) located outside the scan room. All other procedures were identical to those of experiment 1.

fMRI Acquisition

Experiments 1 and 2

Scanning was performed on a 1.5-T Siemens Sonata imager. High-resolution T1-weighted anatomical scans were collected for each subject (voxel size: 1 × 1 × 1 mm³, matrix size: 256 × 256). A total of 99 frames were obtained for each of 2 runs in the functional T2*-weighted gradient echo planar scans (14 frames per condition per run). Whole-head interleaved scans (n = 25) were taken, oriented in a direction orthogonal to that of the Sylvian fissure (echo time = 50 ms, repetition time [TR] = 10 000 ms, voxel size: 5 × 5 × 5 mm³, matrix size: 64 × 64 × 25, field of view: 320 mm²). A single-trial sparse-sampling design (i.e., long TR) was used whereby scan acquisition occurred after each trial presentation (Fig. 2B). This ensured that the blood oxygenation level–dependent (BOLD) signal to the auditory stimuli would not be contaminated with the BOLD response to the acquisition noise (Belin et al. 1999) and avoided behavioral and thus neural interactions that may occur when auditory stimuli of a rhythmical nature are processed concurrently with the loud rhythmical scanner noise.
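The acquisition parameters above are mutually consistent, which the short sketch below checks. All numbers are taken from the text. The reading that the single leftover frame per run corresponds to the discarded 1st volume is an assumption made here for illustration, not something the text states, and the frame count applies to runs with 7 conditions (6 test conditions plus silence). The variable names are hypothetical.

```python
TR_MS = 10_000             # repetition time
RHYTHM_MS = 6_000          # duration of one rhythm trial
FRAMES_PER_RUN = 99
FRAMES_PER_CONDITION = 14
N_CONDITIONS = 7           # 6 test conditions + silent baseline
VOXEL_MM = 5               # in-plane voxel edge
MATRIX_IN_PLANE = 64

# In-plane extent covered by the matrix: 64 voxels x 5 mm.
in_plane_extent_mm = VOXEL_MM * MATRIX_IN_PLANE

# Frames attributable to conditions, and what is left over per run.
# ASSUMPTION: the leftover frame is the discarded 1st volume.
condition_frames = FRAMES_PER_CONDITION * N_CONDITIONS
leftover_frames = FRAMES_PER_RUN - condition_frames

# Length of one functional run, and time left in each TR once the
# rhythm has finished (available for silence and image acquisition).
run_minutes = FRAMES_PER_RUN * TR_MS / 60_000
post_stimulus_ms = TR_MS - RHYTHM_MS

print(in_plane_extent_mm, condition_frames, leftover_frames)  # 320 98 1
print(run_minutes, post_stimulus_ms)                          # 16.5 4000
```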

Behavioral Analyses

Behavioral analyses performed on data from experiment 1 have already been described in detail and presented in another paper (Chen et al. 2008). Here we briefly describe and perform the same analyses, but on data from experiment 2. For purposes of comparison, results from both experiments are presented in Figure 3.

In experiment 2, to ensure that subjects were attending to the task during run 1, we verified that they made a mouse click after each trial presentation. To ensure that performance was comparable across experiments, behavioral data from the tapping trials in experiment 1 (reported in Chen et al. 2008) were compared with those from the tapping trials in the 2nd fMRI run in experiment 2. This comparison was the most valid because it controlled for the amount of motor practice on the rhythms. Performance related to the specific skill of sensorimotor integration was assessed using 2 sensitive measures of synchronization ability, the intertap interval (ITI) and asynchrony. The ITI measures the ability to reproduce time intervals between each event in a sequence. We calculated the deviation (in absolute value) of a subject's ITI relative to the actual onset-to-onset interval, as a percentage score (% ITI deviation); the greater the deviation, the poorer the performance. Asynchrony assesses the ability to time the onset of a motor response with the onset of a stimulus event. For this measure, the absolute value of asynchrony was calculated because we were only interested in quantifying the amount of phase mismatch, without regard for whether subjects were tapping ahead of or lagging behind the stimulus event. All dependent variables were calculated for each correct tap that subjects made and were averaged across all trials for each rhythm type.
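The 2 measures can be illustrated for a single trial as follows. This is a minimal sketch that assumes each stimulus onset has already been matched to exactly one correct tap; the matching procedure is not described here, and the function names and example onset times are hypothetical.

```python
def pct_iti_deviation(stim_onsets_ms, tap_onsets_ms):
    """Mean absolute deviation of each intertap interval from the
    corresponding stimulus onset-to-onset interval, in percent."""
    deviations = []
    for i in range(1, len(stim_onsets_ms)):
        target = stim_onsets_ms[i] - stim_onsets_ms[i - 1]
        produced = tap_onsets_ms[i] - tap_onsets_ms[i - 1]
        deviations.append(abs(produced - target) / target * 100.0)
    return sum(deviations) / len(deviations)


def mean_abs_asynchrony(stim_onsets_ms, tap_onsets_ms):
    """Mean absolute difference between tap onset and stimulus onset (ms),
    ignoring whether the tap led or lagged the stimulus."""
    diffs = [abs(t - s) for s, t in zip(stim_onsets_ms, tap_onsets_ms)]
    return sum(diffs) / len(diffs)


# Hypothetical onsets (ms) for the first 4 notes of one trial.
stim = [0, 250, 500, 1000]
taps = [20, 240, 520, 1000]

print(round(pct_iti_deviation(stim, taps), 2))  # 9.33
print(mean_abs_asynchrony(stim, taps))          # 12.5
```

In this example the produced intervals (220, 280, and 480 ms) deviate from the targets (250, 250, and 500 ms) by 12%, 12%, and 4%, respectively. Per-trial values of this kind would then be averaged across trials within each rhythm type.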

fMRI Analyses

Experiment 1

The 1st volume of each functional run was discarded. Images from each scan were then realigned with the 3rd frame as reference, motion corrected using the AFNI software (Cox 1996), and smoothed using a 12-mm full-width half-maximum isotropic Gaussian kernel. For each subject, both anatomical and functional volumes were transformed into standard stereotaxic space (Talairach and Tournoux 1988) based on the MNI 305 (Montreal Neurological Institute) template (Collins et al. 1994). Statistical analysis of fMRI data was based on the general linear model (Y = Xβ + ϵ), performed using fMRISTAT (Worsley et al. 2002) (available at www.math.mcgill.ca/keith/fmristat). Error (ϵ) and temporal drift are modeled and removed. A design matrix containing the explanatory variables (X) in each column and volume acquisition in each row is organized and the linear model is then fit with the fMRI time series (Y), solving parameter estimates (β) in the least squares sense, yielding estimates of effects, standard errors, and t-statistics for each contrast, for each run. Runs are combined together within and then across subjects using a mixed-effects model (Worsley et al. 2002), generating group statistical maps for each contrast of interest.
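To make the model-fitting step concrete, the sketch below fits Y = Xβ + ϵ by ordinary least squares for one voxel, with a design matrix reduced to an intercept plus a single condition indicator. It is not the fMRISTAT implementation, which additionally models temporal drift and correlated errors; the function name and the signal values are hypothetical.

```python
import math


def fit_glm_one_regressor(x, y):
    """Ordinary least squares fit of y = beta0 + beta1 * x + error.

    Returns (beta0, beta1, standard error of beta1, t-statistic of beta1).
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
    # Residual variance with n - 2 degrees of freedom (2 fitted parameters).
    sigma2 = sum(r * r for r in residuals) / (n - 2)
    se_beta1 = math.sqrt(sigma2 / sxx)
    return beta0, beta1, se_beta1, beta1 / se_beta1


# Hypothetical single-voxel signal: 3 silence frames (x = 0) followed by
# 3 frames of one rhythm condition (x = 1).
x = [0, 0, 0, 1, 1, 1]
y = [10.0, 11.0, 9.0, 13.0, 12.0, 14.0]

beta0, beta1, se_beta1, t_stat = fit_glm_one_regressor(x, y)
print(round(beta0, 3), round(beta1, 3), round(t_stat, 3))  # 10.0 3.0 3.674
```

Here β1 is the effect of the condition relative to silence, which corresponds to a condition-minus-silence subtraction contrast in this reduced design.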

To determine the degree of motor engagement during perception and action, we performed subtraction analyses contrasting each of the listen with anticipation (L) and tap (T) conditions with silence: Lsimple − silence, Lcomplex − silence, Lambiguous − silence, Tsimple − silence, Tcomplex − silence, Tambiguous − silence. Three conjunction analyses were then performed to determine brain regions commonly recruited for all 1) listen with anticipation, 2) tap, and 3) listen with anticipation and tap conditions, regardless of rhythm complexity; we report these analyses. The conjunction analysis was implemented using the minimum of the t-statistic obtained from the subtraction contrasts (Friston et al. 2005). Thus, only those voxels that are present in each contrast and survive a common threshold are considered significantly activated in the conjunction analysis. For these subtraction and conjunction contrasts, the threshold for a significant peak was t = 4.7, P < 0.05 corrected, as determined by the minimum of the Bonferroni correction, random field theory, and discrete local maximum (Worsley 2005). Peaks below these thresholds but contralateral to significant regions are also reported, as they have a high likelihood of representing real effects as opposed to false positives.
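The minimum-statistic conjunction can be sketched in a few lines. The example below takes the voxelwise minimum t across several subtraction contrasts and keeps voxels whose minimum reaches the t = 4.7 threshold given above. The function name and the t-values are hypothetical, and the multiple-comparison procedures used to derive the threshold itself are not reproduced.

```python
T_THRESHOLD = 4.7  # corrected threshold for a significant peak, from the text


def conjunction(t_maps, threshold=T_THRESHOLD):
    """Return (voxelwise minimum t across contrasts, significance mask).

    t_maps is a list of equally long lists, one per contrast, holding
    the t-statistic of each voxel.
    """
    min_t = [min(values) for values in zip(*t_maps)]
    significant = [t >= threshold for t in min_t]
    return min_t, significant


# Hypothetical t-values for 4 voxels in the 3 listen-minus-silence contrasts.
l_simple = [6.1, 5.0, 2.0, 7.3]
l_complex = [5.2, 4.9, 6.0, 4.6]
l_ambiguous = [4.8, 5.5, 6.5, 8.0]

min_t, significant = conjunction([l_simple, l_complex, l_ambiguous])
print(min_t)        # [4.8, 4.9, 2.0, 4.6]
print(significant)  # [True, True, False, False]
```

The last voxel shows why the minimum is used: it is strongly active in 2 contrasts but falls just below threshold in the third, so it is not counted as commonly recruited.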

Localization of peak neural activity was classified using anatomical atlases (Talairach and Tournoux 1988; Duvernoy 1991; Schmahmann et al. 2000), with further specification based on probability maps for auditory regions (Penhune et al. 1996; Westbury et al. 1999) and reviewed literature on the medial and lateral motor areas (Picard and Strick 1996, 2001). The PMC was further divided into dorsal (dPMC) and ventral (vPMC) portions, with the dPMC located above the inferior junction of the superior frontal sulcus with the superior precentral sulcus, approximately at the z = 50 plane (Rizzolatti and Craighero 2004). This border has also been defined as lying between the levels of the inferior and superior frontal sulci (Tomassini et al. 2007). Peaks located at the border of this subdivision that did not clearly fall into dPMC or vPMC were labeled mid-PMC.

Experiment 2

The subtraction and conjunction analyses implemented were identical to those of experiment 1 except that they were performed separately for the 2 runs, according to the tasks: passive listen, listen with anticipation, and tap. Only the 1st of the paired passive listen trials was used for data analyses. Similar to experiment 1, a conservative threshold of P < 0.05 corrected was implemented. However, 3 brain regions (bilateral cerebellum lobule VI and left vPMC bordering the pars opercularis) did not pass this criterion. Because they were predicted a priori from experiment 1 and the literature review in the introduction, they are reported using a threshold of P < 0.0005 uncorrected, corresponding to t = 3.39.

For each subject, the % BOLD signal change was calculated for voxels of interest (VOIs) from regions identified in the conjunction analysis of passive listen, listen with anticipation, and tap (Table 2, column—conjunction: all). The only exception was the cerebellar voxel in lobule VIIIa where we used the peak identified from the conjunction: listen with anticipation (Table 2). This region was not significantly present in the passive listen condition and thus was not identified in the analysis conjunction: all. To compare these results with those of experiment 1, the % BOLD signal change was also calculated using the VOIs derived from experiment 2, but on data from experiment 1. A 1-way repeated measures (condition: listen with anticipation, tap) ANOVA with between-subjects group factor (experiments 1, 2) was then implemented on these values. To test for differences in neural activity across conditions in experiment 2, we implemented a 1-way repeated measures ANOVA followed by Tukey's post hoc comparisons.

To assess the premotor functional dissociation and for comparison with our previous findings (Chen et al. 2008), a covariation analysis was conducted using task performance during the tap trials to assess changes in neural activity as a function of rhythm complexity. This analysis is identical to that performed for data in experiment 1 that was previously described in detail and reported (Chen et al. 2008). In brief, in each subject, the % ITI deviation scores were used as regressors to determine neural regions that increased in activity as performance decreased. We report findings pertaining to the predicted regions established a priori (Chen et al. 2008) (pre-SMA, SMA, dPMC, dorsolateral prefrontal cortex [DLPFC], cerebellum) and thus use an uncorrected threshold of t = 3.39 where P < 0.0005. Because the focus of this paper is in part on the dPMC, the other findings will not be further discussed.

Results

Experiment 1

A previous study (Chen et al. 2008) investigated the behavioral and neural correlates of movement synchronization to increasingly complex rhythms. In that study, we presented data that only pertained to the tap trials. Here, we investigate a different experimental question and present previously unexamined data pertaining to the listen with anticipation trials (Fig. 2B).

Listen with Anticipation versus Silence

In addition to the expected auditory areas (right planum temporale [PT], and left Heschl's gyrus), listening to rhythms with anticipation recruited several motor regions: left SMA, bilateral mid-PMC, bilateral vPMC, and bilateral cerebellum lobules VI and VIIIa (Table 1, column—conjunction: listen with anticipation, Fig. 4A). Other neural regions engaged in this condition included the left intraparietal sulcus (IPS) and bilateral caudate.

Table 1

Experiment #1—listening with anticipation and tapping to rhythms

| Brain region | Conjunction: listen with anticipation (x, y, z) | t | Conjunction: tap (x, y, z) | t | Conjunction: all (x, y, z) | t |
|---|---|---|---|---|---|---|
| L PT | | | (−52, −30, 18) | 8.94 | | |
| L Heschl's gyrus | (−42, −32, 8) | 8.37 | (−32, −28, 8) | 7.69 | (−40, −34, 12) | 8.16 |
| R PT | (54, −22, 8) | 5.70 | (68, −30, 16) | 4.88 | (66, −26, 12) | 4.63 |
| | (44, −30, 10) | 5.63 | (40, −36, 12) | 3.67 | (40, −36, 12) | 3.67 |
| L M1 | | | (−40, −16, 56) | 9.80 | | |
| L SMA | (−8, −4, 64) | 6.91 | (−4, −8, 60) | 11.58 | (−8, −4, 64) | 6.91 |
| L mid-PMC | (−48, −8, 50) | 6.86 | * | | (−48, −8, 50) | 6.86 |
| R mid-PMC | (52, −2, 44) | 3.76 | (52, −4, 50) | 4.90 | (52, −2, 44) | 3.76 |
| L vPMC | (−50, 2, 24) | 6.31 | (−58, 2, 26) | 8.65 | (−50, 2, 24) | 6.31 |
| R vPMC | (50, 6, 28) | 3.83 | (58, 6, 22) | 4.96 | (48, 4, 24) | 3.53 |
| | (46, 4, 28) | 3.79 | (46, 2, 8) | 5.87 | | |
| L cerebellum lobule VI | (−32, −64, −24) | 4.99 | (−28, −64, −24) | 6.29 | (−32, −64, −24) | 4.99 |
| R cerebellum lobule VI | (12, −72, −24) | 5.08 | (12, −60, −18) | 10.59 | (12, −72, −24) | 5.08 |
| | (30, −64, −28) | 7.36 | (22, −60, −24) | 10.37 | (30, −64, −28) | 7.36 |
| L cerebellum lobule VIIIa | (−26, −66, −50) | 4.31 | (−28, −64, −50) | 4.60 | (−26, −66, −50) | 4.31 |
| | | | (−30, −60, −50) | 4.60 | | |
| R cerebellum lobule VIIIa | (26, −70, −52) | 7.50 | (22, −68, −50) | 9.74 | (26, −70, −52) | 7.50 |
| L putamen | | | (−22, −4, 8) | 11.51 | | |
| R putamen | | | (20, 0, 10) | 9.56 | | |
| L caudate | (−14, −4, 14) | 5.93 | | | | |
| R caudate | (14, 8, 4) | 4.85 | | | | |
| | (12, 0, 20) | 4.67 | | | | |
| L IPS | (−30, −56, 44) | 4.95 | | | | |
| L IPL | | | (−50, −36, 42) | 7.60 | | |

Experiment 1: Table shows peaks of neural activity commonly recruited across different levels of rhythm complexity for trials involving listen with anticipation (column 2), tap (column 3), and their conjunction (column 4). The stereotaxic coordinates of peak activations are given according to Talairach–MNI space, along with peak t-values. *The peak in the left primary motor area (M1) is extensive and overlaps with that of the left mid-PMC; mid-PMC activity is present in each tap condition (simple, complex, ambiguous) relative to silence. L, left; R, right.

Table 2

Experiment 2—passive listening, listening with anticipation and tapping to rhythms

| Brain region | Conjunction: passive listen (x, y, z) | t | Conjunction: listen with anticipation (x, y, z) | t | Conjunction: tap (x, y, z) | t | Conjunction: all (x, y, z) | t |
|---|---|---|---|---|---|---|---|---|
| L PT | (−38, −34, 14) | 10.25 | (−40, −36, 12) | 7.19 | (−42, −34, 14) | 8.54 | (−40, −36, 12) | 7.19 |
| | * | | (−60, −20, 6) | 4.36 | (−56, −42, 18) | 6.99 | (−60, −22, 8) | 4.35 |
| | | | (−54, −20, 8) | 4.30 | | | (−54, −20, 8) | 4.30 |
| R PT | (44, −28, 10) | 8.71 | (48, −28, 10) | 6.14 | | | | |
| | (58, −24, 12) | 9.10 | * | | (64, −30, 14) | 5.31 | (64, −30, 14) | 5.31 |
| L M1 | | | | | # | | | |
| L SMA | (0, −6, 69) | 4.96 | (−2, 0, 62) | 6.54 | (−2, −6, 62) | 8.95 | (0, −6, 58) | 4.96 |
| L mid-PMC | (−50, −8, 50) | 5.96 | (−48, −4, 52) | 5.66 | (−42, −14, 56) | 6.58 | (−50, −6, 52) | 5.46 |
| R mid-PMC | (54, −6, 48) | 5.67 | (54, −2, 48) | 4.01 | (52, −2, 52) | 5.56 | (54, −2, 48) | 4.01 |
| L vPMC/BA 44 | | | (−54, 10, 2) | 3.94 | (−50, 8, 4) | 6.80 | | |
| R vPMC/BA 44 | | | | | (54, 10, −2) | 5.46 | | |
| L BA 8/6/44 | | | (−46, 10, 26) | 4.58 | | | | |
| | | | (−40, −2, 34) | 5.06 | | | | |
| L cerebellum lobule VI | (−34, −66, −20) | 3.59 | (−30, −62, −24) | 7.11 | (−30, −58, −26) | 7.64 | (−34, −66, −20) | 3.59 |
| | (−30, −62, −18) | 3.58 | (−46, −64, −24) | 5.00 | | | (−30, −62, −18) | 3.58 |
| R cerebellum lobule VI | (34, −62, −20) | 3.88 | (34, −66, −24) | 6.32 | (30, −58, −26) | 8.49 | (34, −62, −20) | 3.88 |
| | | | (10, −74, −20) | 5.60 | (6, −64, −14) | 7.93 | | |
| L cerebellum lobule VIIIa | | | (−28, −62, −54) | 4.51 | (−22, −68, −50) | 4.67 | | |
| | | | (−24, −68, −50) | 4.51 | | | | |
| R cerebellum lobule VIIIa | | | (24, −72, −50) | 5.41 | (20, −68, −48) | 6.92 | | |
| L thalamus | | | | | (−12, −18, 6) | 7.89 | | |
| | | | | | (−16, −12, 12) | 7.11 | | |
| R thalamus | | | | | (16, −4, 12) | 8.26 | | |

Experiment 2: Table shows peaks of neural activity commonly recruited across different levels of rhythm complexity for trials involving passive listen (column 2), listen with anticipation (column 3), tap (column 4), and their conjunction (column 5). The stereotaxic coordinates of peak activations are given according to Talairach–MNI space along with peak t-values. *The peak medially located overlaps with the lateral peak at conjunction. #The peak in the left mid-PMC overlaps with that of the primary motor area (M1); M1 activity is present in each tap condition (simple, complex, ambiguous) relative to silence. L, left; R, right.

Tap versus Silence

Tapping to rhythms recruited brain regions similar to those above (bilateral PT, left Heschl's gyrus, left SMA, bilateral mid-PMC, bilateral vPMC, and bilateral cerebellum lobules VI and VIIIa) and, in addition, left primary motor cortex (M1), bilateral putamen, and left inferior parietal lobule (IPL, BA 40) (Table 1, column—conjunction: tap, Fig. 4A).

Listen with Anticipation and Tap

Listening with anticipation and tapping to rhythms commonly recruited the right PT, left Heschl's gyrus, left SMA, bilateral mid-PMC, bilateral vPMC, and bilateral cerebellum lobules VI and VIIIa (Table 1, column—conjunction: all).

The results of experiment 1 show that listening with anticipation and tapping along with musical rhythms engage secondary motor regions of the brain such as the SMA, mid-PMC, and cerebellum. Importantly, recruitment of these regions in the perceptual condition is very likely related to an explicit sound–movement association that developed because of the nature of the task: subjects needed to listen attentively because they knew they had to tap to the same rhythmic pattern on the following trial, and hence likely rehearsed or performed imagery to ensure accurate performance. The data of experiment 2 should allow us to distinguish neural activity related to a purely perceptual condition without sound–movement associations from that related to listening with anticipation. Thus, this manipulation allows us to elucidate the circumstances under which the auditory and motor systems are coupled.

Experiment 2

Behavior

Video recordings confirmed that subjects did not move their hand during either passive listen or listen with anticipation. A 1-way repeated measures (rhythm complexity levels) ANOVA with a between-subjects group factor (experiments 1 and 2) showed that subjects in experiment 1 performed no differently from those in experiment 2 (% ITI deviation: F(1, 22) = 0.07, P = 0.79; asynchrony: F(1, 22) = 0.004, P = 0.95) (Fig. 3). As expected, a significant main effect of condition showed that synchronization ability decreased as rhythm complexity increased (% ITI deviation: F(2, 44) = 9.77, P < 0.0005; asynchrony: F(2, 44) = 53.76, P < 0.0005) (Fig. 3).
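
The degrees of freedom reported above follow from the mixed design. The sketch below reproduces that arithmetic; the helper name `mixed_anova_df` is hypothetical, and the total of 24 subjects is inferred from the reported error term of 22 rather than stated in this section.

```python
def mixed_anova_df(n_subjects, n_groups, n_levels):
    """Return (effect df, error df) pairs for a mixed-design ANOVA with one
    between-subjects factor and one repeated-measures factor."""
    df_group = n_groups - 1
    df_group_error = n_subjects - n_groups
    df_condition = n_levels - 1
    df_condition_error = df_condition * df_group_error
    return {
        "group": (df_group, df_group_error),
        "condition": (df_condition, df_condition_error),
    }


# 2 experiments and 3 rhythm types come from the text; 24 subjects is an
# assumption inferred from the reported error df of 22.
dfs = mixed_anova_df(n_subjects=24, n_groups=2, n_levels=3)
print(dfs["group"])      # (1, 22)
print(dfs["condition"])  # (2, 44)
```

Under these assumptions the group effect is tested on (1, 22) and the rhythm complexity effect on (2, 44) degrees of freedom, matching the values reported.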

Figure 3.

Performance decreases as rhythm complexity increases. Percent ITI deviation and asynchrony measures plotted across rhythm type for the tapping trials performed in experiments 1 and 2. Data are reported as means ± SE.

Passive Listen versus Silence

As predicted, passively listening to rhythms recruited auditory regions such as bilateral PT. Furthermore, motor regions were also engaged in this naïve passive listening condition: midline SMA, bilateral mid-PMC, and bilateral cerebellar lobule VI (Table 2, column—conjunction: passive listen, Figs 4, 5).

Figure 4.

Brain regions involved in action–perception coupling and decoupling. All brain images are taken in the same Talairach coronal plane. Color bar represents t-values. (A) Left panel shows subtraction results for experiment 1: brain regions engaged while subjects listen with anticipation and tap along with rhythms relative to silence. Right panel shows subtraction results for experiment 2: brain regions engaged while subjects passively listen, listen with anticipation, and tap along with rhythms, relative to silence. (B) % BOLD signal change is plotted for VOIs in each condition (passive listen, listen with anticipation, tap), averaged across rhythm type, for experiments 1 and 2. Data are reported as means ± SE.

Figure 5.

Neural activity in 3 distinct premotor regions: dPMC, mid-PMC, and vPMC, and pars opercularis (vPMC/BA 44). (A) Illustration of the premotor functional dissociation with data from experiment 2 projected onto a 3-dimensional anatomical rendering from 1 subject. Brain regions that increase in neural activity as rhythm complexity increases are shown in hot metal (dPMC); brain regions engaged during passive listening are shown in green (mid-PMC); brain regions engaged during tapping are shown in blue (vPMC/BA 44). The mid-PMC is engaged during both passive listening and tapping (and listen with anticipation not depicted in this image); this region is color coded with a mix of blue and green. (B) Illustration of dPMC sensitivity to metric organization; brain image taken in the Talairach horizontal plane of the covariation contrast from experiment 2 with graph showing corresponding % BOLD signal change plotted across rhythm type for each condition (passive listen, listen with anticipation, tap). (C and D) Illustration of mid-PMC sensitivity across all conditions and vPMC/BA 44 sensitivity during action and action-related sounds; brain images taken in the Talairach sagittal plane of the conjunction contrast “tap minus silence” from experiment 2 (graphs in same format as in A). Color bar represents t-values. Data are reported as means ± SE.

Listen with Anticipation versus Silence

Listening to rhythms with anticipation recruited regions similar to those described above (bilateral PT, left SMA, bilateral mid-PMC, and bilateral cerebellar lobule VI), in addition to the left vPMC bordering the pars opercularis (vPMC/BA 44), left BA 8/6/44 (at the junction of the inferior frontal sulcus and inferior precentral sulcus), and bilateral cerebellum lobule VIIIa (Table 2, column—conjunction: listen with anticipation, Figs 4, 5).

Tap versus Silence

Tapping to rhythms commonly recruited bilateral PT, left M1, left SMA, bilateral mid-PMC, bilateral vPMC/BA 44, bilateral cerebellum lobules VI and VIIIa, and bilateral thalamus (Table 2, column—conjunction: tap, Figs 4, 5).

Passive Listen, Listen with Anticipation, and Tap

Brain regions commonly recruited during passive listen, listen with anticipation, and tap include bilateral PT, midline SMA, bilateral mid-PMC, and bilateral cerebellum lobule VI (Table 2, column—conjunction: all).
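
A conjunction of this kind is often computed as a minimum statistic (see Friston et al. 2005 in the reference list). Whether the present analysis used exactly this variant is an assumption. The sketch below illustrates the idea with a hypothetical function `conjunction_min_t`, hypothetical t-values, and an illustrative threshold.

```python
def conjunction_min_t(t_maps, threshold):
    """Voxelwise minimum-statistic conjunction across conditions.

    t_maps maps a condition name to a list of t-values (one per voxel).
    Returns the minimum t-value at voxels where every condition exceeds
    the threshold, and None elsewhere.
    """
    maps = list(t_maps.values())
    if not maps:
        raise ValueError("at least one t-map is required")
    n_voxels = len(maps[0])
    if any(len(m) != n_voxels for m in maps):
        raise ValueError("all t-maps must cover the same voxels")
    result = []
    for voxel_values in zip(*maps):
        min_t = min(voxel_values)
        result.append(min_t if min_t > threshold else None)
    return result


# Hypothetical t-values for 3 voxels; the threshold is also illustrative.
t_maps = {
    "passive_listen": [5.7, 1.2, 4.4],
    "listen_with_anticipation": [4.0, 3.9, 6.8],
    "tap": [5.6, 5.5, 2.1],
}
conjunction = conjunction_min_t(t_maps, threshold=3.5)
print(conjunction)  # [4.0, None, None]
```

Only the first voxel survives, because it is the only one where all 3 conditions exceed the threshold. Its conjunction value is the smallest of the 3 condition values.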

Comparison across Experiments and Conditions

To ensure that the fMRI results described above were comparable across experiments, the % BOLD signal change was compared for motor regions of interest (SMA, mid-PMC, cerebellum lobules VI and VIIIa) across conditions, collapsed across rhythm type (Fig. 4B: graphs). A 1-way repeated measures (condition: listen with anticipation, tap) ANOVA with a between-subjects group factor (experiments 1, 2) showed that the % BOLD signal change in these VOIs was not significantly different between experiments 1 and 2. However, as expected, there was a significant main effect of condition (P < 0.05), with greater neural activity in the tap than in the listen with anticipation condition.
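
As a minimal sketch of the measure being compared, % BOLD signal change can be expressed relative to a baseline and summarized as mean ± SE across subjects. The use of silence as the baseline, the function names, and all signal values below are assumptions for illustration, not the authors' pipeline.

```python
import math
import statistics


def percent_signal_change(condition_mean, baseline_mean):
    """Percent change of a condition's mean signal relative to baseline."""
    if baseline_mean == 0:
        raise ValueError("baseline mean must be nonzero")
    return 100.0 * (condition_mean - baseline_mean) / baseline_mean


def mean_and_se(values):
    """Mean and standard error of the mean across subjects."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical VOI signal (arbitrary units) for 3 subjects.
baseline = [1000.0, 980.0, 1020.0]  # assumed silence baseline
tap = [1010.0, 994.7, 1030.2]
changes = [percent_signal_change(c, b) for c, b in zip(tap, baseline)]
mean, se = mean_and_se(changes)
print(f"{mean:.2f} ± {se:.2f} % BOLD signal change")
```

Per-subject values of this kind, one per condition, would then enter the repeated measures ANOVA described above.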

Neural activity was also compared across conditions in experiment 2 for the same regions described above. A 1-way repeated measures ANOVA showed that the % BOLD signal change, collapsed across rhythm type, was significantly different (P < 0.05) across conditions with the exception of the right mid-PMC peak. Tukey's post hoc tests were then conducted to assess pairwise comparisons. In the SMA, the signal change was the greatest in the tap condition (tap relative to listen with anticipation: SMA ts(3,22) = 7.84, P < 0.005) but did not differ between the listen with anticipation and passive listen conditions. In contrast, the % BOLD signal change in cerebellar lobules VI and VIIIa did not differ between listen with anticipation and tap, and was greater in both these conditions compared with that in passive listening (tap relative to passive listen: left lobule VI, ts(3,22) = 4.28, right lobule VI, ts(3,22) = 4.47, left lobule VIIIa, ts(3,22) = 7.83, right lobule VIIIa, ts(3,22) = 8.02, P < 0.05). Lastly, the signal change in the right mid-PMC did not differ across conditions, whereas that in the left mid-PMC was significantly greater in the tap compared with the passive listen condition (ts(3,22) = 4.62, P < 0.05).

Comparison across Rhythm Types

A covariation analysis was conducted using the tap trials for comparison with our previous findings (Chen et al. 2008). As performance decreased while subjects tapped along to increasingly complex rhythms, neural activity in several regions increased, replicating our earlier results: the right SMA (6, −2, 64; t = 4.35), left pre-SMA (−8, 4, 56; t = 4.22), right dPMC (12, −2, 68; t = 4.43), left cerebellum lobule VI (−28, −66, −34; t = 3.69) and right DLPFC (32, 50, 30; t = 3.75). Of particular interest in this paper are the findings that pertain to the PMC; the dPMC is sensitive to this manipulation (Fig. 5B), whereas no modulation in neural activity at the vPMC/BA 44 or mid-PMC was identified in this analysis (Fig. 5C,D).
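
One simple way to picture a covariation with rhythm complexity is a linear trend contrast across the 3 rhythm types (simple, complex, ambiguous). The sketch below is an illustration under that assumption and is not the authors' model; the function name `linear_trend_t`, the weights, and the data are hypothetical.

```python
import math
import statistics


def linear_trend_t(subject_values, weights=(-1, 0, 1)):
    """One-sample t-statistic for a linear trend across ordered conditions.

    subject_values holds one sequence per subject, ordered simple, complex,
    ambiguous. Each subject's contrast score is the weighted sum of their
    values; the scores are then tested against zero.
    """
    scores = [sum(w * v for w, v in zip(weights, values))
              for values in subject_values]
    se = statistics.stdev(scores) / math.sqrt(len(scores))
    if se == 0:
        raise ValueError("contrast scores have zero variance")
    return statistics.mean(scores) / se


# Hypothetical % BOLD signal change per subject for the 3 rhythm types.
data = [
    [0.1, 0.2, 0.4],
    [0.0, 0.2, 0.3],
    [0.2, 0.2, 0.6],
]
t_value = linear_trend_t(data)
print(round(t_value, 2))
```

A region such as the dPMC, whose activity rises with complexity, would give a large positive statistic, whereas a region with constant activity across rhythm types would give a value near zero.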

Discussion

Auditory–motor rhythm processing engages 3 distinct premotor regions that are each sensitive to different aspects of action–perception coupling and decoupling. The vPMC is only recruited when subjects listen with anticipation and tap along with rhythms. The dPMC is also engaged during movement synchronization, and furthermore is responsive to higher-order features of rhythmic stimuli such as metrical organization. Most interestingly, we show evidence that motor regions such as the mid-PMC, SMA, and cerebellum lobule VI resonate in response to sounds that do not bear any obvious significance for action implementation. This finding goes against the traditional view that motor brain regions are strictly involved in computing movement-related parameters and reveals, perhaps, an inherent coupling between action–perception processes whereby the motor system is sensitive to and thus driven by properties of the auditory stimulus under certain conditions.

Auditory regions in the posterior superior temporal gyrus, encompassing the PT, were engaged during the perception of, and synchronization to, musical rhythms. The PT has been proposed to be a “computational hub” where incoming auditory stimuli are analyzed, and information is then relayed to other cortical regions for further processing (Griffiths and Warren 2002). In nonhuman primates, these auditory areas have been shown to be anatomically connected with the PMC (reviewed in Zatorre et al. 2007). Based on animal models of visuomotor integration that propose a dorsal–ventral premotor dissociation of function (Wise et al. 1996; Rizzolatti and Luppino 2001; Hoshi and Tanji 2007), we have put forward an analogous suggestion for auditory–motor transformations during music perception and production that also involve the PMC (for details, see Zatorre et al. 2007). In the case of the classic reach and grasp example in the visual domain, the vPMC directly transforms sensory properties of an object into motor representations thereby allowing one to make an appropriate motor gesture to grasp an object (Fogassi et al. 2001). In parallel, we have suggested that the vPMC maps a specific sound with a precise movement that produces that sound; thus sounds must always be action-related for this region to be sensitive. On the other hand, the dPMC mediates indirect transformations whereby a sensory cue instructs movements in an abstract manner as demonstrated in its classic role in conditional sensorimotor behaviors (reviewed in Wise et al. 1996). Similarly, we have suggested that the dPMC implements the selection of movements based on higher-order rules such as those embedded in a rhythm's metrical structure.

Our findings corroborate this ventral–dorsal premotor dissociation. Neural activity in vPMC, bordering the pars opercularis (BA 44), was significant in the tap condition and also during listen with anticipation in the left hemisphere, perhaps related to motor preparation of right-finger tapping. Importantly, vPMC/BA 44 was not recruited in the naïve passive listening condition and was also insensitive to the metric organization of a rhythm, as its activity was constant across rhythm types and not detected in the covariation analysis (Fig. 5: graphs). These results are consistent with the idea of auditory mirror neurons (Kohler et al. 2002; Keysers et al. 2003), and parallel previous findings that showed the specificity of the neural response in left vPMC/BA 44 to only action-related sounds that have a learned auditory–motor mapping, and not to those without motor relevance (Lahav et al. 2007). In contrast, results from the covariation analysis in experiment 2 (Fig. 5B) support our previous findings (Chen et al. 2006, 2008) that neural activity in dPMC increases as subject performance decreases while tapping along with progressively complex, less temporally structured rhythms. This demonstrates that neural activity in dPMC is sensitive to the metric structure of an auditory rhythm and that it mediates the higher-order selection of movements in a temporally organized manner (for detailed discussion see Chen et al. 2006, 2008).

It is also worth noting that subregions within the PMC have been suggested to be organized according to stimulus properties that are linked in a somatotopic manner with their “pragmatically” relevant motor effector (Schubotz and von Cramon 2003). For example, rhythmic sequences preferentially recruit the inferior part of the vPMC because this area is specialized for the mouth representation. In contrast, object and spatial stimuli preferentially engage a more superior part of the vPMC and the dPMC, respectively, where corresponding hand and arm representations are found. Our findings suggest that this model may not be applicable to all cognitive processes, because we show that the same rhythmic stimulus engages the entire PMC (vPMC, mid-PMC, and dPMC), depending on the nature of the sensorimotor interaction. Furthermore, several other studies have shown engagement of mid to dorsal premotor regions during rhythm processing (Jantzen et al. 2004; Lewis et al. 2004; Bengtsson et al. 2005; Chen et al. 2006), and it has also been suggested that organization of the motor systems may not be exclusively determined by the classical somatotopic maps (Graziano et al. 2002). However, one concept from the model of Schubotz that may be relevant is the proposal that the PMC is important for predicting events. This idea will be discussed in the context of the mid-PMC below.

In this study, we have labeled a premotor region as mid-PMC because the peaks across subjects were neither strictly contained within the vPMC nor dPMC according to prior definitions (Rizzolatti and Craighero 2004; Tomassini et al. 2007), but spanned these 2 premotor subregions. The peak in the mid-PMC (z∼50) is located superior to that of the vPMC/BA 44 (z∼0) but inferior to that of the dPMC (z∼68). The vPMC/BA 44 and the dPMC were not recruited during passive listening. In contrast, a 3rd premotor site, the mid-PMC, was significantly engaged not only during listening with anticipation and movement synchronization to musical rhythms, but also during its naïve passive perception when there was no sound–movement association. More specifically the % BOLD signal change results demonstrated that the left mid-PMC showed greater neural activity during the tap relative to the passive listen condition, a finding perhaps attributable to the right-finger movements subjects made, but that the right mid-PMC was equally engaged across all conditions.

Why does the mid-PMC respond to the passive perception of musical rhythms? On the one hand, it could be argued that there can never truly be a passive listening condition because most humans are exposed to the intertwining of music and movement early on in life; for example in nursery school we learn to clap our hands or dance along with songs. These experiences continue across the life-span and thus taken together, response of the mid-PMC during the naïve passive listening condition in the present experiment may actually reflect these long-learned sound–movement associations. The implication is that whenever we hear music, our brain is primed for action regardless of whether we consciously plan to move or not. Yet, this would suggest that the vPMC and/or dPMC should also be engaged because music–movement experiences usually involve direct and indirect mappings of an auditory stimulus with a motor act or program.

In contrast, the mid-PMC was recruited irrespective of any direct or indirect action-related plan and thus we suggest that its neural activity may be driven by more basic auditory stimulus properties. Listening to a rhythm might involve tracking the evolution of sequential events over time and this may be of relevance and/or inherent to the motor system. Many of our daily actions such as walking are not only executed in a rhythmical manner but also generate sounds that highlight the progression of events. Interestingly, neurons located posterior to the genu of the arcuate sulcus, which likely corresponds to the border of vPMC and dPMC, have been identified as having polysensory properties including sensitivity to auditory stimuli, even in the anaesthetized monkey (Graziano and Gandhi 2000). In particular, these regions are purported to be involved in polymodal motion processing (Bremmer et al. 2001). Perhaps it is the sequential nature of stimulus presentation that invokes the motor system to resonate. If this is the case, localization of the mid-PMC at the border between the vPMC and dPMC could reflect the fact that this region responds to both higher-order aspects of sensorimotor processes such as event tracking (and/or prediction), and to the potential action-related component of these sequential events. In fact, our previous findings (Chen et al. 2008) showed that a region in the vPMC, which would actually correspond to the mid-PMC according to the anatomical definition of the present manuscript, increased in neural activity as a function of rhythm complexity. Though this finding was not replicated in the present studies, it lends support to the idea that both the mid-PMC and dPMC may share similar higher-order response properties. But critically, the mid-PMC response differs from both the dPMC and vPMC in that this region is also engaged during the naïve passive perception of musical rhythms. In sum, these findings open the avenue for further research to examine the functional properties of the mid-PMC response such as whether neural activity in this region is specific to sequential stimuli in the auditory and music domains, or generalizes to any type of sensory process.

The SMA and cerebellar lobule VI were also significantly engaged during passive listening; however, their % BOLD neural response differed from that of the mid-PMC in that they showed a preference for action-related events. The SMA was most sensitive during the sequencing of actual movements, but interestingly did not distinguish between perceptual events that were passively experienced and those that were of motor significance, which suggests that it may be responding to basic properties of the stimulus. This region has been implicated in the temporal organization of movements: SMA neurons show selective activity for specific sequences of actions and also for the intervals between these actions, thus coding for the temporal order of sequential events (Tanji 2001) and discrete time intervals (Macar et al. 2006). These findings lend support to the idea that the SMA might respond to the sequential nature of the rhythmic temporal stimuli during passive listening.

The same cerebellar region in lobule VI was engaged across all conditions. However, the % BOLD signal change results show that this region was more responsive to auditory events that were of motor relevance, whether they stemmed from perceptual representations of actions or actions themselves. These findings concur with present models of cerebellar function that propose it integrates sensory and motor information to generate internal models for predictive motor control (Wolpert et al. 1998; Ohyama et al. 2003; Bastian 2006). Thus, during listen with anticipation and tap, the cerebellum may have been engaged to optimize the motor outcome, that is, by fine-tuning potential and/or actual movements so that they are precisely timed. This role would differ from that of the PMC, which we propose is involved in more cognitively based sensorimotor processes; however, both regions likely work together to enable temporally controlled movements. On the other hand, others have put more weight on the idea that the main role of the cerebellum may be in its acquisition and evaluation of sensory information to predict sensory events (Bower 1995; Nixon 2003; Petacchi et al. 2005), which may then be used by, for example, the motor system. Support for this notion comes from a meta-analysis of neuroimaging studies that shows consistent recruitment of the cerebellum, including lobule VI, during auditory tasks that have no cognitive, emotional and/or motor component (Petacchi et al. 2005). Thus, our findings of cerebellar lobule VI being recruited in response to a purely auditory stimulus would also support these latter models.

Our findings shed new light on the nature of action–perception coupling and decoupling and demonstrate the dissociation of 3 distinct premotor regions during these sensorimotor processes. The vPMC is involved in direct sound–movement mappings, whereas in contrast the dPMC mediates the higher-order selection of movements based on information derived from a sensory cue. Most interestingly, the mid-PMC, SMA, and cerebellum were also sensitive to auditory stimuli that bore no motor significance, suggesting that these regions may have a more general role in attending to features of the physical stimulus, tracking the sequentially presented auditory events in the anticipation that they might be of relevance to the motor system. Together, these basic and higher-order response properties of the PMC allow it to be an important node for sound–movement interactions during complex behaviors such as music performance, and may partially explain the often irresistible urge to tap to the beat upon hearing a piece of music.

Funding

Canadian Institutes of Health Research and a McGill Majors Fellowship to J.L.C.

We would like to thank Mike Ferreira, the staff at the McConnell Brain Imaging Centre, and Patrick Bermudez for their technical assistance. Conflict of Interest: None declared.

References

Bangert
M
Peschel
T
Schlaug
G
Rotte
M
Drescher
D
Hinrichs
H
Heinze
HJ
Altenmuller
E
Shared networks for auditory and motor processing in professional pianists: evidence from fMRI conjunction
Neuroimage.
 , 
2006
, vol. 
30
 (pg. 
917
-
926
)
Bastian
AJ
Learning to predict the future: the cerebellum adapts feedforward movement control
Curr Opin Neurobiol.
 , 
2006
, vol. 
16
 (pg. 
645
-
649
)
Baumann
S
Koeneke
S
Schmidt
CF
Meyer
M
Lutz
K
Jancke
L
A network for audio-motor coordination in skilled pianists and non-musicians
Brain Res.
 , 
2007
, vol. 
1161
 (pg. 
65
-
78
)
Belin
P
Zatorre
RJ
Hoge
R
Evans
AC
Pike
B
Event-related fMRI of the auditory cortex
Neuroimage.
 , 
1999
, vol. 
10
 (pg. 
417
-
429
)
Bengtsson
SL
Ehrsson
HH
Forssberg
H
Ullen
F
Effector-independent voluntary timing: behavioural and neuroimaging evidence
Eur J Neurosci.
 , 
2005
, vol. 
22
 (pg. 
3255
-
3265
)
Bower
J
The cerebellum as a sensory acquisition controller
Hum Brain Mapp.
 , 
1995
, vol. 
2
 (pg. 
255
-
256
)
Bremmer
F
Schlack
A
Shah
NJ
Zafiris
O
Kubischik
M
Hoffmann
KP
Zilles
K
Fink
GR
Polymodal motion processing in posterior parietal and premotor cortex: a human fMRI study strongly implies equivalencies between humans and monkeys
Neuron.
 , 
2001
, vol. 
29
 (pg. 
287
-
296
)
Chen
JL
Zatorre
RJ
Penhune
VB
Interactions between auditory and dorsal premotor cortex during synchronization to musical rhythms
Neuroimage.
 , 
2006
, vol. 
32
 (pg. 
1771
-
1781
)
Chen
JL
Penhune
VB
Zatorre
RJ
Moving on time: brain network for auditory-motor synchronization is modulated by rhythm complexity and musical training
J Cogn Neurosci.
 , 
2008
, vol. 
20
 (pg. 
226
-
239
)
Collins
DL
Neelin
P
Peters
TM
Evans
AC
Automatic 3D intersubject registration of MR volumetric data in standardized Talairach space
J Comput Assist Tomogr.
 , 
1994
, vol. 
18
 (pg. 
192
-
205
)
Cox
RW
AFNI: software for analysis and visualization of functional magnetic resonance neuroimages
Comput Biomed Res.
 , 
1996
, vol. 
29
 (pg. 
162
-
173
)
D'Ausilio
A
Altenmuller
E
Olivetti Belardinelli
M
Lotze
M
Cross-modal plasticity of the motor cortex while listening to a rehearsed musical piece
Eur J Neurosci.
 , 
2006
, vol. 
24
 (pg. 
955
-
958
)
Duvernoy
HM
The human brain: surface, three-dimensional sectional anatomy and MRI
 , 
1991
New York
Springer-Verlag
Fogassi
L
Gallese
V
Buccino
G
Craighero
L
Fadiga
L
Rizzolatti
G
Cortical mechanism for the visual guidance of hand grasping movements in the monkey—a reversible inactivation study
Brain.
 , 
2001
, vol. 
124
 (pg. 
571
-
586
)
Friston
KJ
Penny
WD
Glaser
DE
Conjunction revisited
Neuroimage.
 , 
2005
, vol. 
25
 (pg. 
661
-
667
)
Geiser
E
Zaehle
T
Jancke
L
Meyer
M
The neural correlates of speech rhythm as evidenced by metrical speech processing: a functional magnetic resonance imaging study
J Cogn Neurosci
 , 
2008
, vol. 
20
 (pg. 
541
-
552
)
Grahn
JA
Brett
M
Rhythm and beat perception in motor areas of the brain
J Cogn Neurosci.
 , 
2007
, vol. 
19
 (pg. 
893
-
906
)
Graziano
MS
Gandhi
S
Location of the polysensory zone in the precentral gyrus of anesthetized monkeys
Exp Brain Res.
 , 
2000
, vol. 
135
 (pg. 
259
-
266
)
Graziano
MS
Taylor
CS
Moore
T
Cooke
DF
The cortical control of movement revisited
Neuron.
 , 
2002
, vol. 
36
 (pg. 
349
-
362
)
Grezes
J
Decety
J
Functional anatomy of execution, mental simulation, observation, and verb generation of actions: a meta-analysis
Hum Brain Mapp.
 , 
2001
, vol. 
12
 (pg. 
1
-
19
)
Griffiths
TD
Warren
JD
The planum temporale as a computational hub
Trends Neurosci.
 , 
2002
, vol. 
25
 (pg. 
348
-
353
)
Haueisen
J
Knosche
TR
Involuntary motor activity in pianists evoked by music perception
J Cogn Neurosci.
 , 
2001
, vol. 
13
 (pg. 
786
-
792
)
Hoshi
E
Tanji
J
Distinctions between dorsal and ventral premotor areas: anatomical connectivity and functional properties
Curr Opin Neurobiol.
 , 
2007
, vol. 
17
 (pg. 
234
-
242
)
Jancke
L
Loose
R
Lutz
K
Specht
K
Shah
NJ
Cortical activations during paced finger-tapping applying visual and auditory pacing stimuli
Brain Res Cogn Brain Res.
 , 
2000
, vol. 
10
 (pg. 
51
-
66
)
Jantzen
KJ
Steinberg
FL
Kelso
JA
Brain networks underlying human timing behavior are influenced by prior context
Proc Natl Acad Sci USA.
 , 
2004
, vol. 
101
 (pg. 
6815
-
6820
)
Keysers
C
Kohler
E
Umilta
MA
Nanetti
L
Fogassi
L
Gallese
V
Audiovisual mirror neurons and action recognition
Exp Brain Res.
 , 
2003
, vol. 
153
 (pg. 
628
-
636
)
Kohler
E
Keysers
C
Umilta
MA
Fogassi
L
Gallese
V
Rizzolatti
G
Hearing sounds, understanding actions: action representation in mirror neurons
Science.
 , 
2002
, vol. 
297
 (pg. 
846
-
848
)
Lahav
A
Saltzman
E
Schlaug
G
Action representation of sound: audiomotor recognition network while listening to newly acquired actions
J Neurosci.
 , 
2007
, vol. 
27
 (pg. 
308
-
314
)
Langheim
FJ
Callicott
JH
Mattay
VS
Duyn
JH
Weinberger
DR
Cortical systems associated with covert music rehearsal
Neuroimage.
 , 
2002
, vol. 
16
 (pg. 
901
-
908
)
Large
EW
Fink
P
Kelso
JA
Tracking simple and complex sequences
Psychol Res.
 , 
2002
, vol. 
66
 (pg. 
3
-
17
)
Lewis
PA
Wing
AM
Pope
PA
Praamstra
P
Miall
RC
Brain activity correlates differentially with increasing temporal complexity of rhythms during initialisation, synchronisation, and continuation phases of paced finger tapping
Neuropsychologia.
 , 
2004
, vol. 
42
 (pg. 
1301
-
1312
)
Lotze
M
Scheler
G
Tan
HR
Braun
C
Birbaumer
N
The musician's brain: functional imaging of amateurs and professionals during performance and imagery
Neuroimage.
 , 
2003
, vol. 
20
 (pg. 
1817
-
1829
)
Macar
F
Coull
J
Vidal
F
The supplementary motor area in motor and perceptual time processing: fMRI studies
Cogn Process.
 , 
2006
, vol. 
7
 (pg. 
89
-
94
)
Meister
IG
Krings
T
Foltys
H
Boroojerdi
B
Muller
M
Topper
R
Thron
A
Playing piano in the mind–an fMRI study on music imagery and performance in pianists
Brain Res Cogn Brain Res.
 , 
2004
, vol. 
19
 (pg. 
219
-
228
)
Nixon
PD
The role of the cerebellum in preparing responses to predictable sensory events
The Cerebellum.
 , 
2003
, vol. 
2
 (pg. 
114
-
122
)
Ohyama
T
Nores
WL
Murphy
M
Mauk
MD
What the cerebellum computes
Trends Neurosci.
 , 
2003
, vol. 
26
 (pg. 
222
-
227
)
Penhune
VB
Zatorre
RJ
MacDonald
JD
Evans
AC
Interhemispheric anatomical differences in human primary auditory cortex: probabilistic mapping and volume measurement from magnetic resonance scans
Cereb Cortex.
 , 
1996
, vol. 
6
 (pg. 
661
-
672
)
Petacchi
A
Laird
AR
Fox
PT
Bower
JM
Cerebellum and auditory function: an ALE meta-analysis of functional neuroimaging studies
Hum Brain Mapp.
 , 
2005
, vol. 
25
 (pg. 
118
-
128
)
Phillips-Silver
J
Trainor
LJ
Feeling the beat: movement influences infant rhythm perception
Science.
 , 
2005
, vol. 
308
 pg. 
1430
 
Phillips-Silver
J
Trainor
LJ
Hearing what the body feels: auditory encoding of rhythmic movement
Cognition
 , 
2007
, vol. 
105
 (pg. 
533
-
546
)
Picard
N
Strick
PL
Motor areas of the medial wall: a review of their location and functional activation
Cereb Cortex.
 , 
1996
, vol. 
6
 (pg. 
342
-
353
)
Picard
N
Strick
PL
Imaging the premotor areas
Curr Opin Neurobiol.
 , 
2001
, vol. 
11
 (pg. 
663
-
672
)
Povel
DJ
Essens
PJ
Perception of temporal patterns
Music Percept.
 , 
1985
, vol. 
2
 (pg. 
411
-
440
)
Rao SM, Harrington DL, Haaland KY, Bobholz JA, Cox RW, Binder JR. Distributed neural systems underlying the timing of movements. J Neurosci. 1997, vol. 17 (pg. 5528-5535).
Rizzolatti G, Craighero L. The mirror-neuron system. Annu Rev Neurosci. 2004, vol. 27 (pg. 169-192).
Rizzolatti G, Luppino G. The cortical motor system. Neuron. 2001, vol. 31 (pg. 889-901).
Sakai K, Hikosaka O, Miyauchi S, Takino R, Tamada T, Iwata NK, Nielsen M. Neural representation of a rhythm depends on its interval ratio. J Neurosci. 1999, vol. 19 (pg. 10074-10081).
Savion-Lemieux T, Penhune VB. The effects of practice and delay on motor skill learning and retention. Exp Brain Res. 2005, vol. 161 (pg. 423-431).
Schmahmann JD, Doyon J, Toga AW, Petrides M, Evans AC. MRI atlas of the human cerebellum. 2000. San Diego (CA): Academic Press.
Schubotz RI, von Cramon DY. Functional-anatomical concepts of human premotor cortex: evidence from fMRI and PET studies. Neuroimage. 2003, vol. 20 Suppl 1 (pg. S120-S131).
Snyder JS, Krumhansl CL. Tapping to ragtime: Cues to pulse finding. Music Percept. 2001, vol. 18 (pg. 455-489).
Talairach J, Tournoux P. Co-planar stereotaxic atlas of the human brain. 1988. New York: Thieme Medical Publishers.
Tanji J. Sequential organization of multiple movements: involvement of cortical motor areas. Annu Rev Neurosci. 2001, vol. 24 (pg. 631-651).
Tomassini V, Jbabdi S, Klein JC, Behrens TE, Pozzilli C, Matthews PM, Rushworth MF, Johansen-Berg H. Diffusion-weighted imaging tractography-based parcellation of the human lateral premotor cortex identifies dorsal and ventral subregions with anatomical and functional specializations. J Neurosci. 2007, vol. 27 (pg. 10259-10269).
Westbury CF, Zatorre RJ, Evans AC. Quantifying variability in the planum temporale: a probability map. Cereb Cortex. 1999, vol. 9 (pg. 392-405).
Wise SP, di Pellegrino G, Boussaoud D. The premotor cortex and nonstandard sensorimotor mapping. Can J Physiol Pharmacol. 1996, vol. 74 (pg. 469-482).
Wolpert DM, Miall RC, Kawato M. Internal models in the cerebellum. Trends Cogn Sci. 1998, vol. 2 (pg. 338-347).
Worsley KJ. An improved theoretical P value for SPMs based on discrete local maxima. Neuroimage. 2005, vol. 28 (pg. 1056-1062).
Worsley KJ, Liao CH, Aston J, Petre V, Duncan GH, Morales F, Evans AC. A general statistical analysis for fMRI data. Neuroimage. 2002, vol. 15 (pg. 1-15).
Xu D, Liu T, Ashe J, Bushara KO. Role of the olivo-cerebellar system in timing. J Neurosci. 2006, vol. 26 (pg. 5990-5995).
Zatorre RJ, Chen JL, Penhune VB. When the brain plays music: auditory-motor interactions in music perception and production. Nat Rev Neurosci. 2007, vol. 8 (pg. 547-558).
Zatorre RJ, Halpern AR. Mental concerts: musical imagery and auditory cortex. Neuron. 2005, vol. 47 (pg. 9-12).