Medial Frontal Circuit Dynamics Represents Probabilistic Choices for Unfamiliar Sensory Experience

Neurons in medial frontal cortex (MFC) receive sensory signals that are crucial for decision-making behavior. While decision-making is easy for familiar sensory signals, it becomes more elaborate when sensory signals are less familiar to animals. It remains unclear how the population of neurons enables the coordinate transformation of such a sensory input into ambiguous choice responses. Furthermore, whether and how cortical oscillations temporally coordinate neuronal firing during this transformation has not been extensively studied. Here, we recorded neuronal population responses to familiar or unfamiliar auditory cues in rat MFC and computed their probabilistic evolution. Population responses to familiar sounds organize into neuronal trajectories containing multiplexed sensory, motor, and choice information. Unfamiliar sounds, in contrast, evoke trajectories that travel under the guidance of familiar paths and eventually diverge to unique decision states. Local field potentials exhibited beta- (15-20 Hz) and gamma-band (50-60 Hz) oscillations to which neuronal firing showed modest phase locking. Interestingly, gamma oscillation, but not beta oscillation, increased its power abruptly at a timepoint by which neural trajectories for different choices were near maximally separated. Our results emphasize the importance of the evolution of neural trajectories in rapid probabilistic decisions that utilize unfamiliar sensory information.


Introduction
The selection of an appropriate action according to sensory information is fundamental to adaptive animal behavior. When animals associate familiar sensory inputs with motor choices, their decision response is immediate and stereotyped (Asaad et al. 1998; Fujisawa et al. 2008; Harvey et al. 2012). However, when the sensory experience is unfamiliar, the choice response is ambiguous and hence the sensory-to-choice transformation is expected to become probabilistic (Gold and Shadlen 2007; Kepecs et al. 2008; Jaramillo and Zador 2011; Znamenskiy and Zador 2013). An interesting situation arises when the unfamiliar sensory experience is similar to previous experiences. In this case, these experiences may guide the animal's choice responses, so that its behavior will not be random but may follow a certain psychometric curve for probabilistic choice. How the brain transforms unfamiliar sensory information into ambiguous choice responses under guidance by past experiences is poorly understood. In this study, we explored the neural dynamics underlying this process and addressed the neurometric mechanism of motor response selection on the basis of unfamiliar sensory experience. We developed a novel auditory discrimination task for head-restrained rats, in which they had to make alternative choices in response to both familiar and unfamiliar auditory stimuli, and recorded neuronal ensemble activity from the medial frontal cortex (MFC) by using multielectrodes.
Single-cell activities underlying decision-making processes are well described in primates (Kim and Shadlen 1999; Romo et al. 2004; Gold and Shadlen 2007) and rodents (Feierstein et al. 2006; Roesch et al. 2007; Kepecs et al. 2008; O'Connor et al. 2010; Erlich et al. 2011; Diamond and Arabzadeh 2013). The current literature implicates the MFC in decision-making and goal-directed behavior (Ridderinkhof et al. 2004; Narayanan et al. 2013). In rodents, lesions in the medial agranular cortex (AGm), which is a part of MFC and also called "secondary motor cortex", are known to severely impair motor responses driven by sensory input (Cowey and Bozek 1974; Crowne and Pathria 1982; Erlich et al. 2011) without disturbing sensory perception and movement generation (Cowey and Bozek 1974; Crowne and Pathria 1982; Kesner et al. 1989). Consistent with these findings, the AGm receives convergent inputs from sensory cortices (Condé et al. 1995; Hoover and Vertes 2007) and projects to motor-related regions (Reep et al. 1987). The AGm was shown to be crucial for processing goal-directed decisions on the basis of sensory information (Erlich et al. 2011). To our knowledge, this is the first electrophysiological and pharmacological evidence supporting the existence in the rat of a frontal cortical area engaged in the preparation and/or planning of orienting motor responses. The rodent AGm was also shown to play a pivotal role in processing goal-directed decisions based on outcome values (Sul et al. 2011), further suggesting that the area processes choice in motor responses on the basis of sensory evidence.
Growing evidence from various species suggests that the recruitment of neural ensemble sequences is essential for organizing decision-making behavior (Balaguer-Ballester et al. 2011; Mante et al. 2013). In rodents, choice-selective sequences in decision tasks for spatial navigation were observed in prefrontal (Fujisawa et al. 2008) and parietal cortices (Harvey et al. 2012). Furthermore, neural ensembles in rodent prefrontal cortex exhibited trajectories during evidence-based learning of decision rules (Durstewitz et al. 2010). Theoretical models also suggest how neural sequences contribute to probabilistic decision processes (Wang 2012). We asked whether the AGm encodes probabilistic decision processes for unfamiliar sensory experiences via neural sequences formed for past sensory experiences.
We found that some neurons in the AGm respond to auditory tones with various frequency tuning properties, which was previously unknown. Furthermore, we detected a rapid change in neural state evolution at a timepoint during decision-making, and extracted the features of neural population activity that guide the formation of probabilistic choice responses to familiar and unfamiliar stimuli. An abrupt increase in the gamma-band power of the local field potential (LFP) also occurred around this timepoint, suggesting a functional relevance of this timepoint to the present task. Thus, the AGm contains neural population trajectories for sensory-guided probabilistic choice behavior, where prior sensory experience facilitates the rapid integration of unfamiliar stimuli into the neural trajectory leading to a unique decision state.

Behavioral Task
All experiments were carried out in accordance with the Animal Experiment Plan approved by the Animal Experiment Committee of RIKEN. Head-restrained adult Long-Evans rats (male, 210-240 g; SLC) were trained to associate licking of spouts with reward delivery (Fig. 1A, Supplementary Movie 1). After the association was established, each rat was trained on the auditory discrimination task for 2.5-4 h each day. The durations of the intertrial interval (0.5-1.5 s) and the pre-cue period (1.0 s) were the same as those in the pretraining sessions. We presented 1 of 2 pure tones (13.0 or 10.0 kHz, for 0.2 or 0.5 s) in a pseudo-random order as a cue instructing the rats to lick the spout located on their left or right side, respectively (Fig. 1B). In this study, we refer to the 2 tones as "familiar" cue tones. The rats were required to lick a spout within 5.0 s from the cue onset. Saccharin water (0.1%) was delivered immediately as a reward when the licking response was correct; otherwise, no reward was delivered. After an incorrect choice, we prolonged the immediately following postresponse period (3.0 s) by 5.0 s as an aversive experience. While the average correct rate remained below a criterion (75%), the rats were forced to repeat the same trial after an incorrect choice (error correction). This treatment also prevents the development of a position preference. Once the mean correct rate reached the criterion, we fixed the duration of cue presentation at 0.2 s and omitted the error correction treatment. We continued the training for at least 1 day until the correct rate again exceeded the same criterion without the error correction (Fig. 1C).
Each rat underwent 1 or 2 days of subsequent recording sessions, in which we presented the 2 familiar cue tones (10.0 and 13.0 kHz) and 5 "unfamiliar" cue tones (10.5, 11.0, 11.5, 12.0, and 12.5 kHz) with occurrence probabilities of 80% and 20% (4% for each unfamiliar tone), respectively (Fig. 1B). The reward probability for familiar cue trials was the same as that during training sessions. For unfamiliar cue trials, the reward probability varied linearly with the cue tone frequency: (left/right) = 10.5 kHz (0.17/0.83), 11.0 kHz (0.33/0.67), 11.5 kHz (0.5/0.5), 12.0 kHz (0.67/0.33), 12.5 kHz (0.83/0.17). We trained 36 rats to perform the auditory discrimination task with familiar tones, and only 21 reached the criteria for successful learning. Task training took more than a few weeks even for the best-performing rats. After surgery, 15 rats were available for multineuron recordings, and 8 of them finally yielded qualitatively and quantitatively satisfactory data for the succeeding analysis. We set the sound pressure such that the rats could clearly distinguish between the different tone frequencies. The times of task events and licking responses were sampled at 20 kHz and stored in a hard-disc recorder (LX-110/120; TEAC Inc., Japan). We obtained behavioral data from 956 ± 143 (mean ± standard deviation [SD]) trials per rat.
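The linear reward schedule for unfamiliar tones can be illustrated with a short sketch. It assumes that the listed probabilities are sixths rounded to 2 decimals (for example, 0.17 for 1/6) and that the left-spout reward probability rises linearly from 0 at 10.0 kHz to 1 at 13.0 kHz; the function name `reward_probabilities` is hypothetical and does not come from the original experiment control code.

```python
def reward_probabilities(freq_khz, low_khz=10.0, high_khz=13.0):
    """Return (p_left, p_right) reward probabilities for a cue tone.

    Assumption: the probability of reward at the left spout rises linearly
    from 0 at the low familiar tone to 1 at the high familiar tone.
    """
    if not low_khz <= freq_khz <= high_khz:
        raise ValueError("tone frequency outside the trained range")
    p_left = (freq_khz - low_khz) / (high_khz - low_khz)
    return p_left, 1.0 - p_left


# Print the schedule for the 5 unfamiliar tones.
for f in (10.5, 11.0, 11.5, 12.0, 12.5):
    p_left, p_right = reward_probabilities(f)
    print(f"{f:.1f} kHz: left {p_left:.2f} / right {p_right:.2f}")
```

Under this assumption the printed values agree with the schedule listed above after rounding to 2 decimals.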

Intracortical Microstimulation
The stereotaxic coordinates of our recording sites were confirmed by intracortical microstimulation (ICMS; −50 μA, 100 pulses at 333 Hz, depth from pia mater: 1.7-2.5 mm) applied to the frontal cortex of 5 rats (weighing 210-230 g) under head-restrained and awake conditions. Consistent with previous results (Neafsey et al. 1986), ICMS evoked whisker, forelimb, and/or facial movements at the locations corresponding to our recording sites (Fig. 2A). In contrast, jaw movements were evoked at locations more lateral to our recording sites. Stimulation in oral primary motor cortex, which receives synaptic connections from the AGm (Reep et al. 1987), is known to evoke jaw movements (Neafsey et al. 1986). Thus, our recording sites were well within the AGm and did not extend into primary motor cortex. On the basis of histological analysis, we excluded neural data recorded from anterior cingulate cortex.

Electrophysiological Recordings
We recorded multiunit activity from the MFC of the left (N = 5) or right (N = 3) hemisphere through a 32-channel silicon probe consisting of 4 shanks (Neuro Nexus Technologies, Inc., USA), each with 2 tetrode sites separated vertically by 0.5 mm. The coordinates were determined using a stereotaxic atlas (+2.7 to +3.6 mm anterior, 0.6-2.0 mm lateral to Bregma), and the silicon probe was inserted vertically or at an angle of 6°, mainly into the deep layers (depth from pia mater: 1.0-2.0 mm). The silicon probe and preamplifiers were mounted on a fine micromanipulator (1760-61; David Kopf Instruments) on a stereotaxic frame (SR-8 N; Narishige, Inc., Japan). Multiunit signals from the silicon probe were amplified by a custom-made headstage connected to the preamplifier before being fed into main amplifiers (NIHON KODEN, Inc., Japan). Multiunit signals were amplified (final gain 2000) with a band-pass filter (0.5 Hz to 10 kHz) and stored with a hard-disc recorder (LX-110/120; TEAC Inc., Japan) at a sampling rate of 20 kHz. We recorded LFPs with a bit depth of 16 bits and a sampling rate of 20 kHz, and down-sampled the raw signals to 1 kHz after low-pass filtering (<500 Hz) with a Remez FIR filter.
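The down-sampling step from 20 kHz to 1 kHz can be sketched as follows. This is only an illustration: a Hamming-windowed sinc FIR filter stands in for the Remez design named above, and the cutoff (450 Hz), the filter length (401 taps), and the function name `downsample_lfp` are assumptions of this sketch rather than values taken from the recordings.

```python
import numpy as np


def downsample_lfp(raw, fs_in=20000, fs_out=1000, cutoff_hz=450.0, numtaps=401):
    """Low-pass filter and decimate a raw trace to obtain the LFP.

    Assumption: a Hamming-windowed sinc FIR replaces the Remez design, and
    cutoff_hz and numtaps are illustrative values only.
    """
    factor = fs_in // fs_out
    n = np.arange(numtaps) - (numtaps - 1) / 2
    # Ideal low-pass impulse response, tapered by a Hamming window.
    taps = 2 * cutoff_hz / fs_in * np.sinc(2 * cutoff_hz / fs_in * n)
    taps *= np.hamming(numtaps)
    taps /= taps.sum()  # unit gain at 0 Hz
    # Symmetric odd-length taps with mode="same" introduce no time shift.
    filtered = np.convolve(raw, taps, mode="same")
    return filtered[::factor]


# Synthetic 1-s trace: a 20-Hz component plus a 3-kHz component to be removed.
t = np.arange(20000) / 20000.0
raw = np.sin(2 * np.pi * 20 * t) + np.sin(2 * np.pi * 3000 * t)
lfp = downsample_lfp(raw)
```

Filtering before decimation matters because any power above 500 Hz would otherwise alias into the 1-kHz signal.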

Histological Analysis
After completion of electrophysiological recordings, the tetrode positions were marked with electrolytic microlesions by passing current (+20 μA, for 30 s) through the tetrode positioned at the tip of each shank. Under deep anesthesia, animals were perfused intracardially with ice-chilled 0.9% saline followed by 4% paraformaldehyde in 0.1 M phosphate buffer. Brains were removed and placed in 30% sucrose in 0.1 M phosphate buffer. Postfixed brains were sliced coronally into 50-μm thick serial sections. We stained the sections with thionin (Nissl staining) and used the stained sections to determine the boundaries between the major cytoarchitectonic regions of the dorsal frontal cortex according to the criteria described previously (Neafsey et al. 1986). If the trace of a microlesion was within anterior cingulate cortex, we excluded the neural data from all the succeeding analyses. Histological verification indicated that our recording sites were at positions matching +2.7 to +4.6 mm anterior, 0.6 to 2.0 mm lateral to Bregma in Paxinos and Watson's atlas (Paxinos and Watson 2009).

Data Analysis
We analyzed only the behavioral and neuronal data obtained on the first day of the recording sessions, when the rats were not yet habituated to unfamiliar cues. Spikes were sorted using a custom-made semi-automatic spike sorting program, EToS (12 feature dimensions for 4 channels; high-pass filter, 300 Hz; time resolution, 20 kHz; spike-detection interval, >0.5 ms) (Takekawa et al. 2012), and the sorted spike clusters were further analyzed manually using Klusters and NeuroScope (Harris et al. 2000; Isomura et al. 2009). The number of isolated units (RS + FS) was n = 37 in #941 (Rat 1), n = 68 in #897 (Rat 2), n = 22 in #902, n = 12 in #807, n = 44 in #880, n = 40 in #879, n = 50 in #940, and n = 51 in #949. Neuronal activity and behavioral performance were analyzed using MATLAB (The MathWorks, Inc.).

Principal Component Analysis
We performed principal component analysis (PCA) (Bishop 2006) by using MATLAB's Statistics Toolbox to visualize the familiar trajectories of population neural activity of n neurons recorded simultaneously on all tetrodes. In each trial, we calculated an n-dimensional vector r(t) of instantaneous firing rates in a sliding time window of width 50 ms, moved in 10-ms steps from −0.5 to +0.5 s relative to the cue onset, where the mth dimension of r(t) represents the firing rate of the mth neuron. We averaged each component of r(t) over repeated trials in each time window and performed PCA of the resultant set of n-dimensional data points. We did not conduct PCA for unfamiliar cues because the substantial difference in the number of trials between familiar and unfamiliar cues may result in the overestimation of the variances of trajectories for unfamiliar cues. Therefore, we visualized population trajectories for unfamiliar cues by projecting the rate vectors onto the first 3 normalized eigenvectors obtained for familiar cues.
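The projection procedure can be sketched in Python with NumPy (the original analysis used MATLAB). The sketch assumes that firing rates have already been computed in the sliding windows and stored as arrays of shape (trials, windows, neurons); the synthetic Poisson data, the single pooled condition, and the function names `fit_familiar_pca` and `project` are illustrative assumptions only.

```python
import numpy as np


def fit_familiar_pca(familiar_rates, n_components=3):
    """PCA of trial-averaged rate vectors for familiar cues.

    familiar_rates: array (n_trials, n_windows, n_neurons) of firing rates.
    Returns the mean rate vector and the top unit-norm eigenvectors (columns).
    """
    mean_traj = familiar_rates.mean(axis=0)         # (n_windows, n_neurons)
    center = mean_traj.mean(axis=0)
    cov = np.cov(mean_traj - center, rowvar=False)  # (n_neurons, n_neurons)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]
    return center, eigvecs[:, order]


def project(rates, center, axes):
    """Project a trial-averaged trajectory onto the familiar PCA axes."""
    return (rates.mean(axis=0) - center) @ axes


# Synthetic example: 96 windows (50-ms width, 10-ms steps within a 1-s span).
rng = np.random.default_rng(0)
familiar = rng.poisson(5.0, size=(200, 96, 20)).astype(float)
unfamiliar = rng.poisson(5.0, size=(40, 96, 20)).astype(float)  # fewer trials

center, axes = fit_familiar_pca(familiar)
traj_familiar = project(familiar, center, axes)      # shape (96, 3)
traj_unfamiliar = project(unfamiliar, center, axes)  # shape (96, 3)
```

Fitting the axes on familiar trials only, and reusing them for unfamiliar trials, keeps the 2 sets of trajectories in one common coordinate system despite the unequal trial counts.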

Analysis of Neural Trajectory Separation
Fisher's linear discriminant (Bishop 2006) was used to quantify the degree of discrimination between 2 neural trajectories obtained in 2 different experimental conditions. For instance, to define a "choice axis" in each time window, we construct the distributions of the corresponding $r(t)$ separately for left and right choice trials. Our task is to find a hyperplane, or equivalently the normal vector $w$ of this hyperplane, that best divides the 2 distributions projected onto the direction of $w$. If the 2 distributions have the means and covariance matrices $m_1, \Sigma_1$ and $m_2, \Sigma_2$, respectively, we can obtain $w$ by maximizing the ratio of the between-class variance to the within-class variance,

$$S = \frac{(w^T m_1 - w^T m_2)^2}{w^T (\Sigma_1 + \Sigma_2)\, w},$$

where $w^T m_1$ and $w^T m_2$ are the means of the 2 projected distributions and $w^T \Sigma_1 w$ and $w^T \Sigma_2 w$ are their variances. We can show that $S$ is maximized if

$$w \propto (\Sigma_1 + \Sigma_2)^{-1} (m_2 - m_1).$$

The proportionality constant is determined by the normalization condition $|w| = 1$.
Thus, we can calculate the discrimination function between left and right choice trials as

$$S = \frac{(r_L - r_R)^2}{\sigma_L^2 + \sigma_R^2}$$

from the means ($r_L$, $r_R$) and SDs ($\sigma_L$, $\sigma_R$) of the trial-by-trial firing rates projected onto $w$ in these trials. The discrimination degree between the 2 clusters is defined as $d' = \sqrt{S}$. We can similarly calculate a "tone axis" in each bin from the population firing rates obtained for different cue tones.
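A minimal NumPy sketch of this discriminant analysis for a single time window is given below. It takes the discrimination degree as the square root of S, which is an assumption of this sketch, and it adds a small ridge term to keep the summed covariance invertible, which the text does not mention. The function name `fisher_axis` and the synthetic data are hypothetical.

```python
import numpy as np


def fisher_axis(rates_a, rates_b):
    """Fisher discriminant axis and d' between two sets of rate vectors.

    rates_a, rates_b: arrays (n_trials, n_neurons) for one time window.
    """
    m_a, m_b = rates_a.mean(axis=0), rates_b.mean(axis=0)
    pooled = np.cov(rates_a, rowvar=False) + np.cov(rates_b, rowvar=False)
    # Assumption: a small ridge keeps the matrix invertible when trials are
    # few relative to neurons; this is not part of the described method.
    pooled = pooled + 1e-6 * np.eye(pooled.shape[0])
    w = np.linalg.solve(pooled, m_b - m_a)
    w /= np.linalg.norm(w)  # normalization |w| = 1
    proj_a, proj_b = rates_a @ w, rates_b @ w
    s = (proj_a.mean() - proj_b.mean()) ** 2 / (
        proj_a.var(ddof=1) + proj_b.var(ddof=1)
    )
    return w, np.sqrt(s)


# Synthetic example: only neuron 0 carries choice information.
rng = np.random.default_rng(1)
left = rng.normal(0.0, 1.0, size=(100, 10))
right = rng.normal(0.0, 1.0, size=(100, 10))
right[:, 0] += 2.0
w, d_prime = fisher_axis(left, right)
```

Repeating the call in every sliding window would give a time course of the discrimination degree along the choice axis or, with trials grouped by cue tone, along the tone axis.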

Mutual Information
To quantify the difference in population trajectories at time $t$ with respect to spatial choices, we calculated the mutual information (Shannon 1948; Kerr et al. 2007) between the choice and the population vectors projected onto the choice axis, $v_t = \langle r(t), w_{\mathrm{choice}} \rangle$:

$$I(C; V_t) = \left\langle \sum_{c \in C} p(c \mid v_t) \log_2 \frac{p(c \mid v_t)}{p(c)} \right\rangle_{v_t},$$

where $C$ and $V_t$ denote the set of choices $c$ and the set of projected population activity $v_t$, respectively, $p(c)$ is the choice probability, $p(c \mid v_t)$ is the conditional probability of choice $c$ when $v_t$ is observed at time $t$, and the angle brackets denote an average over the values of $v_t$. The mutual information was calculated from −0.1 to +0.4 s relative to the cue onset at every 10 ms using a sliding time window of width 50 ms, and was normalized by the total entropy of choice $H(C)$ to provide the percentage of information (Kerr et al. 2007).
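A plug-in estimate of this normalized mutual information for one time window can be sketched as follows. The discretization of the projected activity into equal-count bins, the number of bins, and the function name `choice_information` are assumptions of this sketch; the text does not specify how the conditional probabilities were estimated.

```python
import numpy as np


def choice_information(v, choices, n_bins=8):
    """Fraction of the choice entropy H(C) explained by projected activity v.

    v: projected population activity per trial, shape (n_trials,).
    choices: integer choice label per trial (e.g. 0 = left, 1 = right).
    Assumption: v is discretized into equal-count bins (plug-in estimator).
    """
    v = np.asarray(v, dtype=float)
    choices = np.asarray(choices)
    edges = np.quantile(v, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(v, edges)  # bin index per trial
    labels = np.unique(choices)
    p_c = np.array([(choices == c).mean() for c in labels])
    h_c = -np.sum(p_c * np.log2(p_c))  # total entropy of choice
    if h_c == 0:
        return 0.0  # a single choice carries no information
    info = 0.0
    for b in np.unique(bins):
        in_bin = bins == b
        p_v = in_bin.mean()
        p_c_v = np.array([(choices[in_bin] == c).mean() for c in labels])
        nz = p_c_v > 0
        info += p_v * np.sum(p_c_v[nz] * np.log2(p_c_v[nz] / p_c[nz]))
    return info / h_c


# Synthetic example with informative and uninformative projections.
rng = np.random.default_rng(2)
choices = rng.integers(0, 2, size=400)
informative = choices + rng.normal(0.0, 0.3, size=400)
uninformative = rng.normal(0.0, 1.0, size=400)
info_high = choice_information(informative, choices)
info_low = choice_information(uninformative, choices)
```

A plug-in estimator of this kind is biased upward when trials are few, so its baseline depends on the amount of data in each condition.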

LFP Analysis
We analyzed the frequency components of LFPs by using the wavelet transform, which is defined as

$$W(s, \tau) = \frac{1}{\sqrt{s}} \int \phi(t)\, \Psi^{*}\!\left(\frac{t - \tau}{s}\right) dt$$

for a continuous signal $\phi(t)$, with scale $s$ and time shift $\tau$, and employed the Morlet wavelet function $\Psi$ in the present study (Torrence and Compo 1998). We used the down-sampled version of $\phi(t)$ in the actual calculation, and replaced the integration with a summation over discretized functions. Down-sampled LFP signals were band-pass filtered in the beta (15-20 Hz) or gamma (50-60 Hz) band. The band-pass filtered signals were converted into phase through the Hilbert transform. Phase locking of the spikes of individual neurons during the 0.5 s after cue onset was analyzed, and statistical significance was evaluated on the basis of Rayleigh's Z values ($P = e^{-Z}$, $P < 0.01$) given by the Rayleigh test of circular uniformity (Siapas et al. 2005).
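The Rayleigh test used for phase locking can be sketched as follows. The sketch assumes that the oscillation phase at each spike time has already been extracted from the band-pass filtered LFP, and it uses the common definition Z = nR², with R the mean resultant length of the n spike phases, together with the approximation P = exp(−Z) quoted above. The function name `rayleigh_test` and the synthetic phases are hypothetical.

```python
import numpy as np


def rayleigh_test(phases):
    """Rayleigh test of circular uniformity for spike phases (radians).

    Returns (Z, P) with Z = n * R**2, where R is the mean resultant length,
    and the approximation P = exp(-Z).
    """
    phases = np.asarray(phases, dtype=float)
    n = phases.size
    resultant = np.abs(np.mean(np.exp(1j * phases)))
    z = n * resultant ** 2
    return z, np.exp(-z)


# Synthetic example: spikes concentrated around phase pi versus no preference.
rng = np.random.default_rng(3)
locked = rng.vonmises(mu=np.pi, kappa=1.0, size=300)
uniform = rng.uniform(-np.pi, np.pi, size=300)
z_locked, p_locked = rayleigh_test(locked)
z_uniform, p_uniform = rayleigh_test(uniform)
```

A neuron would count as significantly phase locked in this sketch when its P value falls below 0.01, matching the threshold stated above.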

Auditory Cue-Guided Choice Behavior in Head-Restrained Rats
We investigated whether and how the decision-making behavior of rats differs when they are familiar or unfamiliar with sensory cues. We expected that the rats would exhibit ambiguous choice responses to unfamiliar cues with varying choice probabilities, and that their reaction times (RTs) would be longer for unfamiliar cues than for familiar cues. Eight head-restrained rats were trained to perform an auditory discrimination task in which they licked 1 of 2 spouts located at the left and right sides of their mouth (left or right choice) in response to high (13 kHz) and low (10 kHz) pure-tone stimuli (familiar tones: Materials and Methods), respectively (Fig. 1A,B and Supplementary Movie 1). The correct performance rate finally reached 87 ± 6% (Fig. 1C). After learning was established, we performed multineuron recordings from the MFC during task execution.
During multi-electrode recordings, we randomly presented in each trial either one of the familiar cue tones or one of the cue tones unfamiliar to the rats (10.5, 11.0, 11.5, 12.0, and 12.5 kHz), with the occurrence probabilities of the familiar and unfamiliar tones being 80% and 20%, respectively (Fig. 1B). The rats had not been exposed to the unfamiliar tones during training, and their behavioral implications were ambiguous to the rats. The rats typically continued to lick the same spout several times once they decided on a motor response (Fig. 1D). The probability of left or right choices showed a near-sigmoidal dependence on the tone frequency for this example rat (Fig. 1E) and for the others (Fig. 1F), although the behavior of the 8 rats displayed notable individual differences.
Unexpectedly, there was no simple relationship between the familiarity with tones and reaction time (RT), which was defined as the interval from cue onset to the first licking event. Once the rats learned the decision-making task, they performed it surprisingly fast for all the cue tones: the median RT ranged from 237 to 304 ms for the familiar cues and from 265 to 320 ms for the unfamiliar cues across the rats (Fig. 1G). The median RT was longer for unfamiliar tones than for familiar tones in 7 rats, and the differences were statistically significant in 4 rats (Mann-Whitney U-test, P < 0.05). However, the differences were typically only 10-20 ms, and 1 rat exceptionally showed significantly shorter RTs for the unfamiliar cues. The unexpectedly fast choice responses to unfamiliar tones seem to indicate that the rats did not hesitate in decision-making. Such fast decisions are likely to be possible if past experiences with familiar cues guide behavioral responses to unfamiliar cues. Below, we explore how this guidance is provided by neural ensembles in the AGm.

Electrophysiological Classification of Medial Frontal Neurons
The previous findings of choice-selective sequences suggest that such sequences are formed for familiar cues and guide decision behavior for unfamiliar cues. To examine this hypothesis, we recorded neuronal activity in the AGm using a 4-shank silicon probe with multiple tetrode sites (Fig. 2A) and sorted spike trains of 324 neurons with an in-house spike sorting algorithm (http://etos.sourceforge.net/) (Takekawa et al. 2012). ICMS was performed to identify the sites of recordings (Materials and Methods). We then classified the recorded neurons into regular-spiking (RS) neurons, which are presumably pyramidal cells, and putative fast-spiking (FS) interneurons on the basis of their spike widths (Fig. 2B: Supplementary Material). This criterion was reliable in our previous simultaneous multineuron and juxtacellular recordings, by which we identified the cell type of recorded neurons (Isomura et al. 2009), although pyramidal cells discharging thin spikes are possibly misclassified as interneurons (Suter et al. 2013).
We obtained 283 RS and 41 FS neurons in total; putative FS cells showed significantly higher average firing rates than RS cells (2-sample t-test; P < 0.01, inset in Fig. 2B). Two hundred neurons (RS 165, FS 35) were event-related; a neuron was regarded as event-related if its firing rate significantly increased (paired t-test, P < 0.01) during cue presentation (Fig. 2C1) and/or around the onset of first or second licking following cue presentation (Fig. 2C2) compared with the baseline level during the pre-cue period (Supplementary Material). Thus, medial frontal neurons exhibit multiplexed responses to sensory cues and licking behaviors.
Under the familiar cue condition, the highly active epochs of medial frontal neurons constituted a sequence spanning the entire task period (Fig. 2D). We calculated the peri-event-time histogram (PETH) of spike trains in each neuron over trials to find the significantly active epochs and the times of peak activation of individual neurons (Supplementary Material). The emergence of sequences indicates that large ensembles of neurons are recruited for coding the present decision-making behavior. We note that neural trajectories evolved similarly in RS and FS neuron ensembles. This seems to suggest that decision-making computation in the AGm, like other brain functions (Isomura et al. 2009; Isaacson and Scanziani 2011; Yizhar et al. 2011), requires excitation-inhibition balance. We, however, do not analyze this point further.

Sensory and Choice-Related Single-Cell Activity in Familiar Trials
Sensorimotor responses of single neurons are poorly understood in the rat MFC. Therefore, we first analyzed the spectrum of task-related activities constituting neural sequences for familiar trials. Most task-related activities were selective to familiar auditory cues and/or motor choices (2-way analysis of variance [ANOVA], P < 0.05). In Figure 3A,B, we present 2 examples of choice-selective responses showing no or strong tone selectivity (see figure legend for the statistical tests used). In Figure 3C,D, we quantify the choice selectivity of these responses for different tones by calculating the area under the receiver operating characteristic curve (aROC: Supplementary Material) in sliding bins of 50-ms width with 10-ms steps. If aROC is 0.5 (or 1.0), the 2 responses are statistically indistinguishable (or completely different). Comparison between the results shown in Figure 3A-D suggests that there may be some overlap in the peri-cue and peri-choice periods. Figure 3E shows the proportions of neurons showing tone and/or choice selectivity at a given time. Tone selectivity rapidly increased at cue onset. Moreover, the proportion of choice-selective neurons, but not that of the other categories, increased until reaching a stationary value approximately 130 ms after cue onset, indicating that neural population coding was gradually dominated by choice selectivity. After normalizing the choice selectivity of each neuron (Supplementary Material), we visualized its time evolution in 92 neurons (75 RS, 17 FS) that expressed significant choice selectivity (permutation test, 1000 times, P < 0.05) (Britten et al. 1996) in at least 3 successive bins after cue onset (Fig. 3F, top). The epoch of significant choice selectivity started almost simultaneously in RS and FS neurons, with no significant difference in mean onset times (RS 129 ms, FS 137 ms; 2-sample t-test, P > 0.05).
Similarly, tone-selective epochs started almost simultaneously in tone-selective RS and FS neurons (26 RS and 7 FS; RS 123 ms, FS 139 ms; 2-sample t-test, P > 0.05) (Fig. 3F, bottom). Interestingly, FS neurons displayed a slight but significant bias toward ipsilateral choices in the response period (t-test, P < 0.05) but not in the cue period. In contrast, the choice-selective activities of individual RS cells showed no significant contralateral or ipsilateral choice preference in the cue or response period (Fig. 3G).
In summary, 107/200 event-related neurons (86/165 RS, 21/35 FS) were selective to upcoming choices, 20% of RS and 26% of FS neurons were weakly tone selective, and neurons with tone selectivity alone (8 RS and 1 FS) were rare (Fig. 3H). Thus, the response selectivity of neurons is strongly biased toward future behavioral choices in the AGm.
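The aROC measure and the permutation test used in this section can be illustrated with a short sketch for one neuron and one time bin. The two-sided test statistic |aROC − 0.5|, the (hits + 1)/(n + 1) convention for the P value, and the function names are assumptions of this sketch; the actual procedure is described in the Supplementary Material. Values below 0.5 returned by `auroc` indicate the opposite preference, and folding them onto the 0.5-1.0 range is left out here.

```python
import random


def auroc(rates_a, rates_b):
    """Area under the ROC curve separating two sets of firing rates.

    Equals the probability that a rate drawn from rates_b exceeds one drawn
    from rates_a, with ties counted as one half.
    """
    wins = 0.0
    for b in rates_b:
        for a in rates_a:
            if b > a:
                wins += 1.0
            elif b == a:
                wins += 0.5
    return wins / (len(rates_a) * len(rates_b))


def permutation_p(rates_a, rates_b, n_perm=1000, seed=0):
    """Permutation P value for |aROC - 0.5| under shuffled trial labels."""
    rng = random.Random(seed)
    observed = abs(auroc(rates_a, rates_b) - 0.5)
    pooled = list(rates_a) + list(rates_b)
    n_a = len(rates_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(auroc(pooled[:n_a], pooled[n_a:]) - 0.5) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic firing rates (Hz) in one bin for left- and right-choice trials.
left = [2, 3, 4, 3, 2, 4, 3, 5]
right = [6, 7, 8, 7, 9, 6, 8, 7]
selectivity = auroc(left, right)
p_value = permutation_p(left, right)
```

Applying both functions in every sliding bin, and requiring significance in at least 3 successive bins, would follow the criterion stated above for choice-selective neurons.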

Choice-Selective Familiar Neural Trajectories
We further asked how the population of medial frontal neurons processes multiplexed sensory and motor information to select a choice response. For this purpose, we investigated the evolution of neural activity for all correct and error trials in all 2 × 2 combinations of familiar cues and licking responses. We combined the firing rates of all simultaneously recorded neurons into population rate vectors evolving as a function of the sliding time bin. We averaged the rate vectors over trials and conducted PCA of the trajectories under the same cue-choice conditions (Materials and Methods). Figure 4A (left) shows 4 trajectories obtained for 68 neurons recorded simultaneously from the left hemisphere of Rat 2. For clarity, the trajectories are shown only up to the licking responses (the right panels show full trajectories along each axis). The 4 trajectories, initially in a background state, were rapidly separated by familiar cues until they reached cusps, where separation also reached a maximum, and then evolved into "decision states," in which the rat generated left or right licking (triangles). Importantly, the averaged trajectories for the same choice were similar in correct and error trials (Fig. 4A, right), suggesting that neural states evolve during decision-making along familiar trajectories.
However, the choice-selective trajectories also exhibited large trial-to-trial variation, and accordingly, trial-by-trial rate vectors are widely distributed in each bin. An intriguing question is when the noisy trajectories diverge into different choices during the decision process. Because the MFC is thought to control adaptive cognitive behavior, its neural ensemble dynamics may signal the crucial timepoint. We used Fisher's linear discriminant to find a hyperplane that optimally divides the clusters of rate vectors obtained for different familiar cues (tone axis) or different choices (choice axis) in each bin. Fisher's linear discriminant finds projection to a line such that data points from different clusters projected to the line are optimally separated, and we can solve this optimization problem by maximizing the degree of discrimination between the projected clusters (Materials and Methods). The obtained line gives the direction of the normal vector w of the hyperplane.
We aligned the rate vectors with cue onset to calculate the discrimination degree and mixed correct and error trials together. The discrimination degree between the trajectories initiated by different cues rapidly reached a peak level at 140 ms ($t_{\mathrm{tone}}$) after cue onset (Fig. 4B, top row, left). Similarly, the discrimination degree between the trajectories bound to different choices reached a peak level at 390 ms ($t_{\mathrm{choice}}$) (Fig. 4B, top row, right). We then performed a similar analysis on rate vectors aligned with the onset of the motor response and found a steep increase in the discrimination between choice-selective trajectories (Fig. 4B, bottom row, right). The discrimination degrees averaged over all 8 rats show similar increases, indicating that neural trajectories for different tones on average reached a maximal separation approximately 150 ms after cue onset, which was approximately 100 ms before choice responses (Figures 1G and 4C).
To study the trial-by-trial state evolution of correct and error trials, we projected trial-by-trial rate vectors recorded from a rat at $t_0$ (cue onset), $t_{\mathrm{tone}}$, and $t_{\mathrm{choice}}$ onto the tone and choice discrimination axes (Fig. 4D). In correct trials, initial neural states for left and right choices were separated in neither direction at time $t_0$ (KS-test, P = 0.83 along tone axis; P = 0.14 along choice axis), as expected. Then at time $t_{\mathrm{tone}}$, neural states for left and right choices became distinguishable along both the tone (KS-test, P < 0.01) and choice axes (KS-test, P < 0.01). In error trials, initial neural states were also indistinguishable between left and right choices along the tone axis (KS-test, P = 0.34), but were nonetheless distinguishable along the choice axis (KS-test, P < 0.01). This possibly implies that the failure in decision-making was partly due to an initial bias in neural state evolution. Moreover, neural states remained indistinguishable along the tone axis at time $t_{\mathrm{tone}}$ (KS-test, P = 0.91 along tone axis; P < 0.01 along choice axis), predicting future failure in choice. At time $t_{\mathrm{choice}}$, neural states became separable along both axes in both correct and error trials (KS-test, P < 0.01), although in correct trials the degree of discrimination ($d'$) along the tone axis decreased from $t_{\mathrm{tone}}$ to $t_{\mathrm{choice}}$ ($d'[t_{\mathrm{tone}}] = 1.6022$ versus $d'[t_{\mathrm{choice}}] = 0.7878$). Altogether, our results suggest that the decision commitment of the AGm switches from sensory to motor coding earlier than $t_{\mathrm{choice}}$, most likely around $t_{\mathrm{tone}}$.

Decisions Represented in Unfamiliar Choice-Selective Trajectories
Now we turn to trials with unfamiliar cues. Despite rapid choice responses, overall the rats changed their choice probability according to the frequency of unfamiliar tones (Fig. 1F,G). This seems to imply that decision-making was almost automatic even in unfamiliar trials. Our hypothesis is that the familiar trajectories provide the neuronal mechanism to guide probabilistic choices with unfamiliar cues. We therefore expect a certain similarity in the evolution of neural trajectories between familiar and unfamiliar trials.
Since auditory responsive neurons give an initial bias for neural state evolution, we first examined their tuning properties. In Figure 5A,B, we present the differential responses of an RS neuron to the 7 cue tones (1-way ANOVA, $F_{6,517} = 5.6$, P < 0.001), which preferentially responded in right choice trials (2-way ANOVA, $F_{1,882} = 69.2$, P < 0.001). Among the 200 event-related neurons (165 RS, 35 FS), approximately 20% (32 RS, 7 FS) showed statistically distinct responses to the different tone frequencies (1-way ANOVA, P < 0.05). However, the frequency tuning curves of individual neurons are generally broad and not even unimodal (Fig. 5C). Therefore, the different tone stimuli are likely to be discriminated by population activity patterns. Because the auditory responsive neurons generally responded to both familiar and unfamiliar cues, familiarity with sensory input is likely processed elsewhere.
As expected, the trial averages of left or right choice trajectories for unfamiliar cues resembled the familiar trajectories for the corresponding behavioral choices (Fig. 6A). As in the familiar case, neural state evolution also displayed large trial-by-trial fluctuations for unfamiliar cues (results not shown). Therefore, we examined how the discrimination degree between left and right choice trajectories evolves for each unfamiliar cue. The results are first shown for Rat 1, which varied its choice probability smoothly with tone frequency. For familiar cues, the discrimination degree gradually increased toward the time of a licking response (~0.4 s) and reached a near-plateau state approximately at time t tone (dashed vertical line). The discrimination degree exhibited a similar transition behavior for unfamiliar cues, where the transition ended shortly after time t tone and prior to a licking response (Fig. 6B, top). The observation that neural states evolved similarly in familiar and unfamiliar trials may explain why the RTs were also similar in these trials.
We then calculated mutual information between the instantaneous rate vector and a choice response (Materials and Methods). The mutual information increased with the discrimination degree (Fig. 6B, bottom), indicating that neural trajectories are informative about upcoming choices. The population signals averaged over the 8 rats confirmed the parallel changes in the discrimination degree and mutual information (Fig. 6C), both of which slowly decayed toward baseline levels after a choice response. The time of state transition also approximately coincided with time t tone (mean and SD, t tone = 158 ± 48 ms for the 8 rats). Although the estimation of accurate decision timing is difficult, these results suggest that time t tone and familiar trajectories characterize the time course of decision-making for unfamiliar cues.
Note that mutual information has a higher baseline level for unfamiliar cues than for familiar cues. This difference, however, is likely due to the small data size for unfamiliar cues. Indeed, mutual information calculated for familiar cues with a much reduced data set (20%) exhibited a similar increase in the baseline level (results not shown).
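The dependence of the baseline on data size can be illustrated with a simple plug-in estimator of mutual information. The sketch below is only an assumption-laden illustration, not the estimator described in Materials and Methods: neural states are reduced to a discrete label with 8 hypothetical values, choices are binary, and states and choices are drawn independently, so the true mutual information is zero.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def mean_chance_mi(n_trials, n_repeats=50, n_states=8, seed=0):
    """Average plug-in MI when states and choices are statistically independent."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_repeats):
        states = [rng.randrange(n_states) for _ in range(n_trials)]
        choices = [rng.randrange(2) for _ in range(n_trials)]
        total += mutual_information(states, choices)
    return total / n_repeats


# The true MI is zero in both cases; only the number of trials differs.
print("baseline with 2000 trials:", mean_chance_mi(2000))
print("baseline with   40 trials:", mean_chance_mi(40))
```

With fewer trials the estimated baseline is larger even though the underlying variables are independent, which is the kind of finite-sample effect invoked above.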
To confirm the behavioral relevance of time t tone and familiar trajectories, we examined whether the deviations (D tone) of the rate vectors from the hyperplane (H tone) at t tone predict trial-by-trial choices in all 8 rats (Supplementary Fig. 1). We first performed a cross-validation test for familiar trials (Materials and Methods). Note that as in the previous analyses, H tone was derived from familiar correct trials. Overall, the results were satisfactory in correct trials, while they were poorer in error trials. The poor performance in error trials may be partly due to the small data size. Next, we used all familiar correct and familiar error trials to determine H tone and used it to predict choices for unfamiliar tones (Supplementary Fig. 1, green crosses). The predictions were overall satisfactory in 4 rats (Rat 1, Rat 2, and rats #807 and #940), but were not always accurate in the other rats. The latter 4 rats include those exhibiting strongly biased choices for unfamiliar tones (data not shown).
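The logic of this prediction test can be sketched as follows. The construction of H tone is given in Materials and Methods and is not reproduced here; purely for illustration, the sketch assumes a hyperplane whose normal is the difference between the mean left and mean right rate vectors and which passes through their midpoint. The function names and the synthetic data are hypothetical.

```python
import math
import random


def fit_hyperplane(left, right):
    """Assumed construction: unit normal along the mean difference, plane at the midpoint."""
    dim = len(left[0])
    mu_l = [sum(v[i] for v in left) / len(left) for i in range(dim)]
    mu_r = [sum(v[i] for v in right) / len(right) for i in range(dim)]
    w = [r - l for l, r in zip(mu_l, mu_r)]
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]
    b = sum(wi * (l + r) / 2 for wi, l, r in zip(w, mu_l, mu_r))
    return w, b


def deviation(vec, w, b):
    """Signed distance of a rate vector from the hyperplane (positive = 'right' side)."""
    return sum(wi * vi for wi, vi in zip(w, vec)) - b


def loo_accuracy(left, right):
    """Leave-one-out cross-validation: refit without each trial, then predict it."""
    trials = [(v, "L") for v in left] + [(v, "R") for v in right]
    hits = 0
    for i, (vec, label) in enumerate(trials):
        rest = trials[:i] + trials[i + 1:]
        w, b = fit_hyperplane(
            [v for v, c in rest if c == "L"],
            [v for v, c in rest if c == "R"],
        )
        predicted = "R" if deviation(vec, w, b) > 0 else "L"
        hits += predicted == label
    return hits / len(trials)


# Synthetic 5-neuron rate vectors at a fixed time point, 60 trials per choice.
rng = random.Random(0)
left_trials = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(60)]
right_trials = [[rng.gauss(1.5, 1.0) for _ in range(5)] for _ in range(60)]
accuracy = loo_accuracy(left_trials, right_trials)
print("leave-one-out accuracy:", accuracy)
```

To mimic the prediction for unfamiliar tones, the hyperplane would be fitted once on all familiar trials and the sign of `deviation` evaluated on held-out unfamiliar trials.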
Thus, as long as a rat's behavior is not strongly biased (and if a sufficiently large data set is available), neural trajectories formed for familiar sensory experiences guide choice behaviors for unfamiliar sensory experiences well. Whether and how neural trajectories of individual animals induce behavioral biases is an interesting question. However, in the present study, we will not explore this question further.

Oscillations and Spike Synchrony During Decision Behavior
The MFC is known to engage in adaptive cognitive control (Ridderinkhof et al. 2004) and low-frequency oscillations (<12 Hz) were related to behavioral adaptation in the prelimbic and anterior cingulate regions of the rodent MFC (Narayanan et al. 2013). Furthermore, gamma oscillations were suggested to be crucial for cross-area communications (Colgin et al. 2009; Canolty et al. 2010; Yamamoto et al. 2014). Therefore, we investigated whether the present decision-making is also accompanied by oscillations and/or spike synchrony. During task performance, the power spectrum of the LFP recorded in the AGm showed peaks in the beta (15-30 Hz) and gamma (40-60 Hz) bands (Fig. 7A). Beta oscillation started to increase approximately 300 ms prior to cue onset and gradually increased its amplitude during the task (Fig. 7B). In contrast, in almost all rats LFPs consistently exhibited a rapid and strong increase in gamma power at approximately 150 ms after cue onset for both familiar (7 rats: Fig. 7B) and unfamiliar cues (7 rats: Fig. 7F). Beta and gamma band-filtered LFPs indeed displayed enhanced oscillations toward and around this time, respectively (Fig. 7C). Since this time approximately coincides with t tone, the enhanced gamma oscillations may report the separation of neural trajectories prior to a choice response.
We then examined coherence between neuronal firing and the LFP oscillations within a 500-ms-long interval from cue onset. Though individual neurons rarely displayed strong phase-locked firing, population spikes were mildly phase-locked to specific phases of beta and gamma oscillations in familiar trials (Fig. 7D: Rayleigh test, P < 0.01). FS neurons showed a stronger tendency toward phase locking than RS neurons in both trial types, and this difference between the cell types appeared to be stronger in familiar trials (Fig. 7E,G). We also explored spike coincidences between neurons during the task. Among the 7467 pairs of all event-related and nonrelated neurons, only 42 (0.56%; 18 RS-RS, 24 RS-FS, 0 FS-FS) displayed significant correlations (latency <3.5 ms). The RS-FS pairs tended to fire in the RS-to-FS sequence (Supplementary Fig. 2A,B), and spike coincidences occurred in both neighboring and distant (<1 mm) neuron pairs (Supplementary Fig. 2C,D). At least one neuron was choice-selective in approximately 83% of the 42 pairs (Supplementary Fig. 2E). However, only one RS-FS pair showed choice- and task phase-dependent spike coincidences (Supplementary Fig. 2F-H). Altogether, our results suggest that spike coincidences are unlikely to be crucial for the present decision-making task. However, small fractions of RS and FS neurons exhibited a significant tendency toward phase-locked firing to LFP oscillations, and the role of oscillations in decision-making requires further experimental clarification.
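The Rayleigh test used for phase locking can be sketched in a few lines. The example below assumes that the oscillation phase at each spike time has already been extracted from the band-filtered LFP (that step is not shown), uses synthetic phases, and approximates the P value by exp(-Z), which is a large-sample approximation rather than the exact distribution. The function name `rayleigh` is hypothetical.

```python
import cmath
import math
import random


def rayleigh(phases):
    """Return (R, Z, approximate P, preferred phase) for spike phases in radians."""
    n = len(phases)
    resultant = sum(cmath.exp(1j * p) for p in phases) / n
    r = abs(resultant)        # mean resultant length, 0 (uniform) to 1 (perfect locking)
    z = n * r * r             # Rayleigh Z statistic
    p_value = math.exp(-z)    # large-sample approximation, assumed adequate here
    return r, z, p_value, cmath.phase(resultant)


rng = random.Random(1)

# Mildly phase-locked population spikes: von Mises phases with low concentration.
locked = [rng.vonmisesvariate(math.pi, 0.3) for _ in range(2000)]
# Unlocked spikes: phases drawn uniformly over the cycle.
unlocked = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(2000)]

for name, phases in (("locked", locked), ("unlocked", unlocked)):
    r, z, p, preferred = rayleigh(phases)
    print(f"{name}: R = {r:.3f}, Z = {z:.1f}, P ~ {p:.3g}")
```

Because Z scales with the number of spikes, pooled population spikes can reach significance even when the mean resultant length R is small, which is consistent with mild but significant phase locking.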

Discussion
In the present study, we demonstrated that neural population trajectories probabilistically evolve in the MFC during a sensory-guided motor choice task. The neural trajectories formed for familiar sensory experience are used as templates for choice responses with unfamiliar sensory experience. We identified an inflection point at which the rapid separation between average trajectories for 2 alternative choices becomes almost maximal and slows down during neural state evolution. Our results suggest that neural trajectories represent an elemental mechanism for the formation of rapid probabilistic decision-making behavior. Choice-selective sequences have been reported in various species and decision tasks (Mazor and Laurent 2005; Fujisawa et al. 2008; Pastalkova et al. 2008; Durstewitz et al. 2010; Niessing and Friedrich 2010; Harvey et al. 2012). The present study further addresses crucial questions in neural coding about how sensory experience guides ambiguity processing.
The sensory-guided probabilistic decision-making studied here is a rapid, near-instantaneous process. Accordingly, neural trajectories rapidly evolved and diverged to decision states for both familiar and unfamiliar cue tones as early as approximately 150 ms after the cue onset. Some rats showed significantly longer RTs for unfamiliar cues than for familiar cues; however, the differences were small or insignificant in other rats, suggesting that categorizing unfamiliar auditory information requires little additional processing time. This result is not what would be expected from a trade-off between the accuracy and speed of evidence summation, which has been well studied in visual discrimination (Kim and Shadlen 1999; Gold and Shadlen 2007) and olfactory discrimination (Abraham et al. 2004). In the present task, evidence summation has to be completed during cue presentation. Our results suggest that evidence summation occurs before the inflection point accompanied by a near-maximal trajectory separation. We speculate that a higher demand on task performance, such as a more precise control of choice probability for unfamiliar cues, may reveal a clearer accuracy-speed trade-off in the present auditory discrimination task.
Neuronal population activity exhibits several interesting changes around the time of maximal trajectory separation (t tone), including a switch in the dominant principal components (Fig. 4A), an abrupt increase in the discrimination degree between the trajectories (Fig. 6), and large shifts in the beta and gamma powers of LFP (Fig. 7B,F). Evidence from rodents (Colgin et al. 2009) and primates (Canolty et al. 2010) suggests the change in gamma power may reflect the opening of new communication channels between brain areas. However, neuronal firing was only weakly phase-locked to gamma oscillations in the AGm. Whether the increased gamma power indicates the transfer of signals from MFC to other motor-related regions, such as the motor cortex (Narayanan and Laubach 2006), spinal cord, and basal ganglia, has to be clarified by future experiments.
The function of the AGm may split into 2 major processes, early sensory commitment and late motor commitment, the latter being determined by increased choice selectivity in the neural population (Fig. 3E). In monkeys, single dorsolateral prefrontal neurons switch from sensory to motor coding in a decision task (Asaad et al. 1998). While rodent AGm also contains neurons showing selectivity to both tones and choices (Fig. 3H), we define the choice point in a neuronal population across a dynamic temporal sequence. It is possible that MFC is involved in some of the functions executed by primate dorsolateral prefrontal cortex (Seamans et al. 2008). However, our data were biased toward strong choice selectivity and weak tone selectivity, suggesting that the AGm is unlikely to be homologous to primate dorsolateral prefrontal cortex. Evidence from cytoarchitecture, microstimulation (Neafsey et al. 1986), anatomical connections (Reep et al. 1987), and characteristics of neural activity (Erlich et al. 2011; Sul et al. 2011) suggests that the AGm is rather a homolog of primate higher order motor-related cortices such as premotor or supplementary motor area.
Our findings have significant implications for current models of decision-making. Decision-making is typically modeled by recurrent neural networks with shared inhibition between neuronal ensembles (Fusi et al. 2007; Wang 2012; Deco et al. 2013), equivalent to the decision states we observed. Recurrent networks provide a candidate mechanism for sensory evidence summation by population activity in the MFC preceding the maximal separation of trajectories. Large trial-by-trial variation in population neuronal firing makes the evolution of neural states probabilistic, and the broad tuning of cue-modulated neurons may generate a graded representation of an initial bias in this state evolution. However, our results are unlikely to support a role for shared network inhibition in neuronal state progression because putative inhibitory neurons are strongly choice-selective and RS neurons are not strongly suppressed in their nonpreferred trials. Because spike coincidences are unimportant for choice-selective sequences,