Anna R Chambers, Dominik F Aschauer, Jens-Bastian Eppler, Matthias Kaschube, Simon Rumpel, A stable sensory map emerges from a dynamic equilibrium of neurons with unstable tuning properties, Cerebral Cortex, Volume 33, Issue 9, 1 May 2023, Pages 5597–5612, https://doi.org/10.1093/cercor/bhac445
Abstract
Recent long-term measurements of neuronal activity have revealed that, despite stability in large-scale topographic maps, the tuning properties of individual cortical neurons can undergo substantial reformatting over days. To shed light on this apparent contradiction, we captured the sound response dynamics of auditory cortical neurons using repeated 2-photon calcium imaging in awake mice. We measured sound-evoked responses to a set of pure tone and complex sound stimuli in more than 20,000 auditory cortex neurons over several days. We found that a substantial fraction of neurons dropped in and out of the population response. We modeled these dynamics as a simple discrete-time Markov chain, capturing the continuous changes in responsiveness observed during stable behavioral and environmental conditions. Although only a minority of neurons were driven by the sound stimuli at a given time point, the model predicts that most cells would at least transiently become responsive within 100 days. We observe that, despite single-neuron volatility, the population-level representation of sound frequency was stably maintained, demonstrating the dynamic equilibrium underlying the tonotopic map. Our results show that sensory maps are maintained by shifting subpopulations of neurons “sharing” the job of creating a sensory representation.
Introduction
A hallmark of primary sensory cortices is their functional organization according to topographic principles (Woolsey 1958; Drager 1975; Merzenich et al. 1975; Fleming 2018). In a topographically organized region, tuning to major sensory parameters changes systematically along gradients across the cortical surface. The tonotopic organization of auditory fields revealed by pure tones of changing frequency is a classic example in the auditory modality. Topographic maps in adulthood are characterized by a high degree of stability (Guo et al. 2012) unless affected by major perturbations such as sensory deprivation (Kilgard 2012). This robustness makes them a widely used readout of cortical organization within and across individuals.
At the level of individual cells, plasticity in neuronal tuning in the healthy, mature brain has often been attributed to directed adaptations and learning (Karmarkar and Dan 2006). Recently, however, the advent of chronic recording techniques of neuronal activity with single-cell resolution has revealed that functional properties of neurons can undergo continuous remodeling even under behaviorally and environmentally stable conditions. Such “basal” or intrinsic plasticity appears to be a fundamental feature of the cortex, as it has been reported in various regions, including the mouse hippocampus and auditory, barrel, visual, motor, and posterior parietal cortices (Rokni et al. 2007; Huber et al. 2012; Mankin et al. 2012; Margolis et al. 2012; Ziv et al. 2013; Montijn et al. 2016; Clopath et al. 2017; Driscoll et al. 2017; Hainmueller and Bartos 2018; Rule et al. 2020; Deitch et al. 2021; Aschauer et al. 2022). This indicates that learning-induced forms of plasticity act in concert with intrinsic forms of plasticity (Chambers and Rumpel 2017).
Observations of volatile single-cell tuning properties appear to be in conflict with the apparent stability of sensory maps. A possible explanation could be that tuning to stimuli whose representation is topographically organized may be particularly stable and that long-term remodeling of functional properties selectively affects stimuli with non-topographically organized representations. In line with this, there are recent reports of stable neuronal tuning to simple and basic stimulus features in sensory cortices over extended periods of time (Mank et al. 2008; Margolis et al. 2012; Mayrhofer et al. 2015; Peron et al. 2015; Poort et al. 2015; Rose et al. 2016; Jeon et al. 2018; Marks and Goard 2021). As an alternative explanation, a stable representation at the population level can also arise from cells with volatile tuning properties, as shown for naturalistic stimuli or higher order behavioral features (Rokni et al. 2007; Driscoll et al. 2017; Rule et al. 2020; Deitch et al. 2021; Aschauer et al. 2022). Which of these possible explanations underlies the stability of a topographic map in sensory cortices has not yet been addressed.
Leveraging the mouse auditory cortex as a model, we attempt to reconcile the conflicting observations of unstable single-cell tuning properties and stable population-level representations at the level of a topographic map. Using intrinsic imaging and chronic 2-photon calcium imaging, we assess tuning properties to pure tones and more naturalistic, complex sounds at the population and single-cell levels. A previous analysis of this dataset revealed substantial changes in individual neuron responsiveness to the stimulus set of pure tones and complex sounds (Aschauer et al. 2022). Here, we expand on this finding by modeling individual neuron volatility to explore its consequences at the population-level topographic map. The changes observed in individual sensory neurons under stable experimental conditions can be largely captured by a memory-less stochastic Markov process. According to this model, the stability of the topographic map emerges from a dynamic equilibrium of shifting subpopulations of neurons that are balanced across the spatial and frequency gradients.
Materials and methods
This dataset has been analyzed and presented previously in another publication addressing a different scientific question (Aschauer et al. 2022). The primary data presented in this study are publicly accessible on G-Node (DOI: 10.12751/g-node.trwj8c). Code used for image processing is available at Zenodo (DOI: 10.5281/zenodo.5822486).
Animal use
Experimental subjects were male C57BL/6J mice, 8–12 weeks of age, from the Jackson Laboratory (strain #000664). Before surgical procedures, mice were kept in groups of 5 and housed in 530 cm² cages on a 12 h light/dark cycle with unlimited access to dry food and water. Experiments were carried out during the light period. All animal experiments were performed in accordance with the Austrian laboratory animal law guidelines for animal research and had been approved by the Viennese Magistratsabteilung 58 (Approval M58/00236/2010/6).
Molecular cloning
For the generation of a recombinant AAV (rAAV) genome encoding for GCaMP6m under the human SynapsinI promoter (phSyn), a plasmid containing the inverted terminal repeats (ITRs) of AAV, phSyn (Addgene plasmid 26973), Woodchuck Hepatitis Posttranscriptional Regulatory Element (WPRE), and a human Growth Hormone polyadenylation site (hGH-pA site) was digested using BamHI and AccIII and the gene coding for GCaMP6m was PCR amplified from a commercially available plasmid (Addgene plasmid 40754) and inserted. Finally, the plasmid was digested with AccIII and HindIII to excise the original transgene, 3′ overhangs were blunted, and 5′ overhangs were filled in using Klenow fragment.
For the generation of a rAAV genome encoding for H2B::mCherry fusion protein under the phSyn promoter, a gene coding for mCherry was PCR amplified and inserted into a plasmid containing a gene for H2B directly after its coding sequence, using ClaI and SpeI to produce a fusion gene. The H2B::mCherry fusion gene was PCR amplified and inserted into a plasmid containing ITRs, phSyn, WPRE, and hGH-pA using KpnI and HindIII. Finally, the WPRE was removed using HindIII and XhoI, 3′ overhangs were blunted, and 5′ overhangs were filled in using Klenow fragment.
rAAV production
All rAAV vectors described were produced in HEK293 cells using a helper virus-free, 2-plasmid-based production method (Grimm, Kay, et al. 2003) based on a commercially available system (Agilent Technologies, CA, USA; catalog# 240071). Briefly, HEK293 cells were transfected using the calcium phosphate method. Cells were harvested and collected by centrifugation (2,500g, 20 min at 4 °C) 72 h post transfection. Cell pellets were resuspended in buffer and lysed by 3 consecutive freeze/thaw cycles. For removal of genomic DNA, cell lysates were incubated with benzonase (50 U/mL) for 1 h at 37 °C. Subsequently, rAAV particles were precipitated with CaCl₂ (25 mM) followed by PEG precipitation (8% PEG-8000, 500 mM NaCl). After resuspension of PEG precipitates in 50 mM HEPES, 150 mM NaCl, 25 mM EDTA, pH 7.4 overnight at 4 °C, rAAV particles were further purified by CsCl density gradient centrifugation. Fractions from CsCl density gradients were analyzed by measuring the refractive index. Samples with a refractive index ranging from 1.3774 to 1.3696 were pooled and dialyzed against PBS for removal of CsCl by using dialysis cassettes with a molecular weight cutoff of 20 kDa (Thermo Scientific, MA, USA; catalog# 87738). Finally, rAAV preparations were concentrated by using ultrafiltration units with a molecular weight cutoff of 50 kDa (Millipore, MA, USA; catalog# UFC905024). After addition of glycerol to a final concentration of 10%, rAAV preparations were sterile filtered with Millex-GV filter units (Millipore, MA, USA; catalog# SLGV013SL), frozen in liquid nitrogen, and subsequently stored in aliquots at −80 °C. Genomic titers of purified rAAV stocks were determined by isolation of viral DNA (Viral Xpress DNA/RNA Extraction Reagent, Millipore, MA, USA; catalog# 3095) and subsequent qPCR analysis using primers specific for phSyn.
Stereotaxic injections
All surgical equipment was sterilized with 70% ethanol before use. Animals were deeply anesthetized with a mixture of ketamine and medetomidine (KM; 2.5 mg ketamine-HCl and 0.02 mg medetomidine-HCl/25 g mouse weight) injected intraperitoneally, and positioned in a stereotaxic frame (Kopf Instruments, Tujunga, CA, USA; Stereotaxic System Kopf 1900). The eyes were protected from dehydration and light exposure using sterile eye gel (Alcon Pharma, Novartis, CHE; Thilo-Tears Gel) and a piece of aluminum foil. Lidocaine was applied as local anesthetic subcutaneously before exposure of the skull. The scalp was washed with 70% ethanol, and a cut along the midline exposed the skull. A small hole was drilled into the skull above the auditory cortex using a stereotaxic motorized drill (Kopf Instruments, Tujunga, CA, USA; Model 1911 Stereotaxic Drilling Unit) leaving the dura mater intact. Injections were performed perpendicular to the surface of the skull. Virus solution consisted of a mixture of 2 different rAAVs (rAAV2/8 ITR-phSyn-GCaMP6m-WPRE-hGHpolyA-ITR; titer: 1.75 × 10¹¹ viral genomes (VG)/mL; rAAV2/8 ITR-phSyn-H2BmCherry-hGHpolyA-ITR; titer: 2 × 10¹³ VG/mL) in PBS. The virus mixture was loaded into a thin glass pipette, and 150 nL were injected at a flow rate of 20 nL/min (World Precision Instruments, Sarasota, FL, USA; Nanoliter 2000 Injector) in 5 locations along the anterior–posterior axis, resulting in a total injection volume of 750 nL. Stereotactic coordinates were: 4.4, −2.5/−2.75/−3/−3.25/−3.5, 2.5 (in mm, caudal, lateral, and ventral in reference to Bregma). Glass pipettes (World Precision Instruments, Sarasota, FL, USA; Glass Capillaries for Nanoliter 2000; Order# 4878) had a tip diameter of 20–40 μm. After the injection, the pipette was left in place for 3 min, before being slowly withdrawn and moved to the next coordinate. After completion of the injection protocol, the skin wound was sealed using tissue adhesive (3M Animal Care Products, St. Paul, MN, USA; 3M Vetbond Tissue Adhesive), and anesthesia was reversed with 0.02 mL atipamezole. Mice were monitored daily and received intraperitoneal injections of carprofen (0.2 mL of 0.5 mg/mL stock) on the first days after surgery.
Cranial window implantation
Two weeks after stereotactic injections, animals were anesthetized with isoflurane (Abbott Animal Health, IL, USA; IsoFlo; 1.5–2.4% in air, 200 mL/min flow rate, High Precision Instruments, MT; Univentor 400 Anesthesia Unit). All surgical equipment and glass coverslip were sterilized with 70% ethanol before use. Anesthesia was initialized in a glass desiccator filled with an isoflurane/air mixture. Anesthetized animals were mounted on a stereotaxic frame (Kopf Instruments, Tujunga, CA, USA; Stereotaxic System Kopf 1900), and the head was positioned using ear, teeth, and a custom-made v-shaped head holder. About 0.02 mL dexamethasone (4 mg/mL) was administered intramuscularly to the quadriceps, as well as 0.02 mL carprofen (0.5 mg/mL) intraperitoneally. The eyes were protected using sterile eye gel (Alcon Pharma, Novartis, CHE; Thilo-Tears Gel) and a piece of aluminum foil. A local anesthetic (lidocaine/epinephrine; Gebro Pharma, Austria) was applied subcutaneously before exposure of the skull. The scalp was washed with 70% ethanol and a flap of skin covering temporal, both parietal regions and part of the occipital bone was removed. The musculus temporalis was injected with lidocaine/epinephrine (Gebro Pharma, Austria) as an additional anesthetic and to minimize bleeding. Subsequently, the muscle was partly removed with a surgical scalpel and forceps to expose the right temporal bone. Using a fine motorized drill, the bones of the skull were smoothened, and part of the zygomatic process was removed. The surface was cleaned using cortex buffer and a 2% hydrogen peroxide solution, and covered with a thin layer of one-component instant glue (Carl Roth, Germany; Roti coll). A thin layer of dental cement (Lang Dental, IL, USA; Ortho-Jet) was applied onto the skull, sparing the area of the temporal bone above the auditory cortex.
A rectangular groove of about 2 mm by 3 mm was drilled into the skull above the auditory cortex, and the bone was carefully lifted using scalpel and forceps. The exposed area was cleaned and kept moist using sterile sponges (Pfizer, NY, USA; Gelfoam) and cortex buffer. The craniotomy was covered with a small circular cover glass (Electron Microscopy Sciences, PA, USA; 5 mm diameter, catalog# 72195-05), and sealed with 1.2% low-melting agarose (Sigma Aldrich, MO, USA; Agarose Type IIIA). The cover glass was finally set in place with one-component instant glue and dental cement. In order to position the animal under the microscope with the objective facing the window plane perpendicularly, a custom-made titanium head post was mounted on the implant above the window and embedded with dental cement. After dental cement had cured, animals were placed back in a pre-warmed cage. After the surgical procedure, animals were singly housed and recovered for at least 1 week before further handling.
Optical imaging of intrinsic signals
The mesoscopic optical imaging set-up was comprised of a CCD camera (Vosskuehler, Germany; CCD1200QD; 25 Hz frame rate), attached to a macroscope consisting of 2 objectives placed face-to-face (Nikon 135 and 50 mm), and 2 LEDs (470 and 780 nm). Animals were anesthetized initially with isoflurane and were subsequently positioned under the CCD camera using the custom-made head post. During functional imaging, isoflurane was delivered at a concentration of 0.6–1.0% isoflurane in air mixture with a vaporizer (High Precision Instruments, MT; Univentor 400 Anesthesia Unit) at a flow rate of around 200 mL/min to the snout. Using blue and white light illumination, 2 single images were acquired of the cortical surface to record the superficial blood vessel pattern. For functional imaging, the focal plane was moved to 400 μm under the surface. Using a near-infrared LED (780 nm), intrinsic signals were recorded during the presentation of stimuli constructed from 20 pure tone pips (80 ms-long individual pips at 1, 2, 4, 8, 16, 32, and 64 kHz) separated by 20-ms-long smooth gaps (total stimulus length: 2 s). Each stimulus was presented in 30 randomized trials, and for each trial, a baseline and a response image were acquired 2 s before and 2 s after stimulus onset. For each trial, the change in light reflectance between pre- and post-stimulus image was computed and averaged for all trials of the respective stimulus.
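The trial averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the reflectance change is expressed as a fraction of the baseline image (ΔR/R0, the quantity named later in the alignment section), that each image is given as a flat list of pixel values, and the function name is hypothetical.

```python
def mean_delta_r_over_r0(baseline_images, response_images):
    """Trial-averaged fractional reflectance change for every pixel.

    baseline_images, response_images: one flat list of pixel values per
    trial (assumed layout), acquired before and after stimulus onset.
    """
    n_trials = len(baseline_images)
    n_pixels = len(baseline_images[0])
    average = [0.0] * n_pixels
    for base, resp in zip(baseline_images, response_images):
        for p in range(n_pixels):
            # Per-trial (R - R0) / R0, accumulated into the trial mean.
            average[p] += (resp[p] - base[p]) / base[p] / n_trials
    return average
```

A decrease in reflectance after stimulus onset yields a negative value at the corresponding pixel.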
Habituation to awake chronic 2-photon imaging
Animals were habituated to handling at the 2-photon microscope. To this end, animals were mildly water deprived and head-fixed under the objective in a custom-made acrylic glass tube, using the custom-made head post implant. The mouse head was tilted laterally such that the surface of the auditory cortex aligned approximately with the horizontal plane. During habituation, head fixation lasted for a minimum of 30 min each day, and animals were given access to a 5% sucrose in water solution. This was repeated for at least 5 days until animals had acclimated to the head fixation apparatus and showed reduced signs of stress and less body movement (typically consisting of running bouts lasting a few seconds). The full sound stimulus set later used for recording sound-evoked activity was presented during habituation. Hence, animals had repeatedly experienced all sensory stimuli before any data acquisition.
Sound presentation
Sounds were delivered free field at a 192 kHz sampling rate in a soundproof booth by a custom-made system consisting of a linear amplifier and a ribbon loudspeaker (Audiocomm, Austria) placed 25 cm from the mouse’s head. The transfer function between the loudspeaker and the location of the mouse ear was measured using a probe microphone (Brüel & Kjær, Bremen, Germany; 4939-L-002) and compensated numerically by filtering the sound files with the inverse transfer function to obtain a flat frequency response at the mouse ear (between 0.5 and 64 kHz ±4 dB). Sound control and equalization were performed by a custom MATLAB program running on a standard personal computer equipped with a Lynx 22 sound card (Lynx Studio Technology, CA, USA). The stimulus set consisted of 34 sound stimuli: 19 pure tone pips (50 ms; 2–45 kHz, separated by a quarter octave) and 15 complex sounds (70 ms). Stimuli were separated by one-second intervals and played at 80 dB sound pressure level. The complex sounds in the stimulus set were characterized by broad frequency content and temporal modulations, generated from arbitrary samples of music pieces or animal calls replayed at fourfold speed. All stimulus on- and offsets were smoothened with a 10-ms-long half-period cosine function.
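As an illustration of the pure tone stimuli described above, the following sketch generates a 50-ms tone pip at 192 kHz with 10-ms half-period cosine ramps and lists 19 frequencies spaced a quarter octave apart. It is a hedged example rather than the original MATLAB program: the raised-cosine form of the ramp, the exact 2 kHz starting frequency, and the function names are assumptions, and the equalization with the inverse transfer function is not reproduced.

```python
import math

def tone_pip(freq_hz, dur_s=0.050, ramp_s=0.010, fs=192_000):
    """Pure tone pip with half-period cosine on- and offset ramps."""
    n = round(dur_s * fs)
    n_ramp = round(ramp_s * fs)
    samples = []
    for i in range(n):
        # Distance (in samples) to the nearer end of the pip.
        k = min(i, n - 1 - i)
        # Assumed ramp shape: half a cosine period rising from 0 to 1.
        env = 0.5 * (1.0 - math.cos(math.pi * k / n_ramp)) if k < n_ramp else 1.0
        samples.append(env * math.sin(2.0 * math.pi * freq_hz * i / fs))
    return samples

def quarter_octave_series(f_start_hz=2000.0, n_tones=19):
    """Frequencies spaced a quarter octave apart, starting at f_start_hz."""
    return [f_start_hz * 2.0 ** (i / 4.0) for i in range(n_tones)]
```

Under these assumptions, the 19th frequency is 2 kHz × 2^(18/4), about 45.3 kHz, in line with the stated range of 2–45 kHz.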
In vivo 2-photon imaging
The 2-photon microscope (Prairie Technologies, WI, USA; Ultima IV) was equipped with a 20× objective (Olympus, Tokyo, Japan; XLUMPlan Fl, NA = 0.95) and a pulsed laser (Coherent, CA, USA; Chameleon Ultra). Both fluorophores (GCaMP6m and mCherry) were co-excited at 920 nm wavelength, and their emission was separated using a fluorescence filter cube (filter 1: BP 480–550 nm; filter 2: LP 590 nm; dichromatic mirror: DM 570 nm; Olympus, Tokyo, Japan; U-MSWG2). Full-frame imaging was performed using a field of view (FOV) of 367 × 367 μm (256 × 128 pixels), and images were acquired at a frame rate of approximately 5 Hz (sampling period: 196.86 ms). For further image processing, each line was doubled to create square FOVs.
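The imaging parameters above fix the conversion between pixels and micrometers that is used for the ROI radii quoted in the image processing sections. The short check below is an illustrative consistency calculation; the constant and function names are hypothetical, and it assumes that the 367 μm FOV width is sampled by 256 pixels.

```python
FOV_UM = 367.0               # FOV edge length in micrometers
PIXELS_X = 256               # pixels along the 367 um axis (assumed)
SAMPLING_PERIOD_MS = 196.86  # time per frame in milliseconds

um_per_pixel = FOV_UM / PIXELS_X               # about 1.43 um per pixel
frame_rate_hz = 1000.0 / SAMPLING_PERIOD_MS    # about 5.08 Hz

def pixels_to_um(n_pixels):
    """Convert a distance in pixels to micrometers."""
    return n_pixels * um_per_pixel
```

With these values, radii of 2, 3, and 10 pixels correspond to 2.87, 4.30, and 14.34 μm, matching the figures given for the soma, nearest-neighbor, and surround radii.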
In the last habituation session, several FOVs at different xy-positions in layer 2/3 (about 150–300 μm depth from cortical surface) were screened for the presence of reliable sound responses. FOVs with reliable sound responses were repeatedly imaged at a 2-day interval, using the stimulus set described above. Each stimulus was presented for at least 20 repetitions per FOV in pseudo-randomized order. Next, the focal plane was moved 50 μm in the z-axis and data was acquired for a second FOV with the same xy-coordinates. Between imaging periods, animals were given access to few drops of a 5% sucrose in water solution.
Auditory cued fear conditioning
The behavioral setup was controlled by a personal computer with WINDOWS XP Professional, Version 2002, SP2 (Microsoft, Redmond, WA, USA) operating system running custom MATLAB R2007a software (MathWorks, Natick, MA, USA). All behavioral experiments were performed in an isolation cubicle (H10–24, Coulbourn Instruments, Whitehall, PA, USA) which was equipped with white LEDs as house light, a microphone and a CCD KB-R3138 camera with infrared LEDs (LG Electronics Austria, Vienna, Austria), which was connected to a Cronos frame grabber (Matrox, Dorval, QC, Canada). The conditioning chamber (25 × 25 × 42 cm, model H10-11M-TC, modified, Coulbourn Instruments) was combined either with a stainless-steel shock floor or a grid floor. A custom-made cartridge (round or square) formed the walls of the chamber in order to create different local environmental contexts. Foot shocks were delivered via an external shocker (Precision Animal shocker, Coulbourn Instruments). Sounds were played from an L-22 soundcard with a maximal sampling frequency of 192 kHz (Lynx Studio Technology, Costa Mesa, CA, USA) and delivered via an amplifier (Model SLA-1, Applied Research and Technology, TEAC Europe GmbH, TASCAM Division, Wiesbaden, Germany), a modified equalizer (Model #351, Applied Research and Technology, TEAC Europe GmbH, TASCAM Division, Wiesbaden, Germany) and a custom-made free field speaker. The sound stimuli were from the stimulus set used for in vivo 2-photon calcium imaging. 70 ms stimuli were repeated 15 times with a one-second-interval, resulting in a total duration of 14.07 s. On- and offsets of stimuli were smoothed with a 10-ms long half-period cosine function. Sound levels for all stimuli used were normalized to a mean power of 78 dB sound pressure level (SPL). Peak sound levels ranged from 83 to 89 dB SPL.
Conditioning session
In the conditioning environment, lights were turned on (~20–30 lux), and round cartridges were used as walls of the chamber. A mild residual ethanol odor was present from previous cleaning of the chamber. Mice were placed in the chamber directly before the start of each session. After at least 1 min baseline (60–90 s), 5 sound-shock pairings (0.75 mA, 1 s, immediately following the sound) were given with a randomized inter-stimulus-interval ranging from 50 to 75 s (paired).
Memory test session
Four days after the conditioning session (i.e. 1 day after the 2-photon imaging paradigm was completed), mice were tested for a conditioned freezing response. In order to create a different environmental context, square cartridges were used as chamber walls, lights were turned off, and home cage bedding was placed underneath the metal grid to provide a familiar odor. After at least 1 min of baseline (60–90 s), the conditioned stimulus and 1 non-conditioned sound stimulus were presented in 3 randomized presentation blocks with an inter-stimulus-interval of 20–30 s.
Alignment of intrinsic optical signals across animals
Alignment was performed on trial averaged ΔR/R0 of intrinsic optical imaging signals. Responses to 2, 4, 8, and 16 kHz and white noise bursts were stacked for each mouse (n = 19) and used to find the optimal 2-dimensional affine transformation between each pair of animals via the MATLAB function fminsearch on band-pass filtered stacks. Using this optimal transformation, data from all animals were mapped to a single reference frame.
Alignment of 2-photon FOVs onto intrinsic optical imaging map
After each imaging session, 2-photon microscope images of the cortical surface above the FOVs were acquired, scaled accordingly, and manually overlaid onto the photograph of the cortical surface from the same animal acquired at the intrinsic optical signal imaging set-up. After aligning of the intrinsic optical signal imaging data from individual animals onto the canonical map, the coordinates for each FOV, and hence, the position of each individual ROI, on the global map were estimated.
Image processing of chronic in vivo 2-photon data
In order to track cells across days, the optimal affine transformation was identified to register regions of interest (ROI), encompassing the soma of individual neurons, onto each frame of the time series recorded from the same FOV across several days. ROIs were selected independently by 2 experts and can be described by a set of several hundred points marking the centers of the mostly spherical neuronal somata. This set of points was transformed for each frame by an affine transformation consisting of rotation, scaling, and shifting. The objective function value for the optimization of this transformation is the pixel-wise overlap between a band-pass filtered and binarized image of each frame and a mask generated from the transformed ROIs by drawing a circle with a 3-pixel (4.30 μm) radius around the center of each ROI. This 6-dimensional optimization problem (rotation angle, scale in x, scale in y, off-diagonal of scaling matrix, shift in x, shift in y) was solved numerically using MATLAB’s implementation of the Nelder–Mead-Simplex algorithm (fminsearch). This was done in 2 iterations, first for the entire frame, then for four equally sized horizontal segments to correct for full frame movements during the 2-photon microscope scanning. In a third iteration, individual ROIs were moved to the maximum in a 2-pixel (2.87 μm) surrounding of a low-pass filtered image to allow for slight local distortions.
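The registration objective can be illustrated with a small sketch. It is not the published image processing code (available at the Zenodo DOI given above): the way the six parameters are composed into one matrix (a symmetric scaling matrix followed by a rotation), the pixel-count definition of overlap, and all function names are assumptions. An optimizer such as Nelder–Mead would then search the six parameters for the maximum overlap.

```python
import math

def transform_points(points, theta, sx, sy, shear, tx, ty):
    """Apply scaling [[sx, shear], [shear, sy]], rotation theta, then a shift."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y in points:
        u = sx * x + shear * y
        v = shear * x + sy * y
        out.append((c * u - s * v + tx, s * u + c * v + ty))
    return out

def roi_mask(centers, height, width, radius=3.0):
    """Binary mask with a filled circle of the given radius around each center."""
    mask = [[False] * width for _ in range(height)]
    for cx, cy in centers:
        for y in range(max(0, int(cy - radius)), min(height, int(cy + radius) + 1)):
            for x in range(max(0, int(cx - radius)), min(width, int(cx + radius) + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    mask[y][x] = True
    return mask

def overlap(binary_frame, mask):
    """Number of pixels that are set in both the binarized frame and the mask."""
    return sum(1 for row_f, row_m in zip(binary_frame, mask)
               for f, m in zip(row_f, row_m) if f and m)
```

In this toy setting, a frame containing one soma is matched best when the shift parameters move the ROI center onto that soma; any residual offset lowers the overlap count.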
ROI inclusion criteria
Four quality criteria were defined in order to only include cells in the analysis that had a reliably present signal in the H2B::mCherry channel marking the neuronal somata. This was done on a frame-by-frame basis, so that at each given time point a cell was either reliably present or excluded.
Nearest neighbor distance
Strongly overlapping cells in a given frame, i.e. cells with a center-to-center distance below 3 pixels (4.30 μm), were defined as unreliable in that respective frame. Thus, the chance to wrongly label individual cells was minimized.
Normalized soma signal intensity
For each cell at each time point, the difference between the mean signal intensity in the soma (2-pixel radius; 2.87 μm) and the mode of the intensity of the surrounding (10-pixel radius; 14.34 μm) was computed and normalized by the 95th percentile of this difference. Cells with an intensity close to the background, i.e. a normalized soma signal intensity (NSSI) below 0.2, were excluded.
Objective function value
The optimization described above yielded both the alignment and an objective function value (OFV), which describes the pixel-wise overlap of the frame and the template. In order to rule out movement artifacts, individual frames in which the OFV fell more than 3 standard deviations below the mode of the OFV for a given FOV were rejected.
Soma signal to noise ratio
The difference of the mean intensity of the soma (2-pixel radius; 2.87 μm) and the mode of the intensity of the surrounding (10-pixel radius; 14.34 μm) was defined as signal. The standard deviation of a jittered version of the signal (same radii, but pseudo-random location of the “soma” in the 10-pixel radius) was defined as noise. In order to be included in the analysis, cells had to have a signal to noise ratio (SSN) value above one.
All quality criteria were tested and cells were excluded on a frame-by-frame basis. Excluded time points were treated as missing entries in the data. Cells that were not reliably detected on at least 10 trials for each stimulus on a given day were completely excluded from the analysis.
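Taken together, the four criteria and the trial-count rule amount to a pair of simple predicates, sketched below. The thresholds are those stated in the text; the function and argument names are hypothetical, and the OFV criterion is read here as rejecting frames whose OFV lies more than 3 standard deviations below the mode, which is an interpretation.

```python
def frame_is_reliable(nn_distance_px, nssi, ofv, ofv_mode, ofv_sd, snr):
    """True if a cell passes all four quality criteria in a given frame."""
    return (nn_distance_px >= 3.0                  # not strongly overlapping
            and nssi >= 0.2                        # soma clearly above background
            and ofv >= ofv_mode - 3.0 * ofv_sd     # no movement artifact (assumed reading)
            and snr > 1.0)                         # soma signal exceeds jitter noise

def cell_is_included(valid_trials_per_stimulus, min_trials=10):
    """True if the cell was reliably detected often enough for every stimulus."""
    return all(n >= min_trials for n in valid_trials_per_stimulus)
```

Frames that fail `frame_is_reliable` would be treated as missing entries, and `cell_is_included` would be evaluated per cell and day on the remaining trial counts.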
Calculation of ΔF/F0 and deconvolution
The baseline F0 used to compute the ΔF/F0 was defined as a moving rank order filter, the 30th percentile of the 200 surrounding frames (100 before and 100 after). This ΔF/F0 was then deconvolved using the algorithm published by Vogelstein et al. (2010).
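A minimal sketch of the baseline computation follows. It assumes a nearest-rank percentile definition, a window that is truncated at the start and end of the recording, and a fluorescence trace given as a Python list; none of these details are specified in the text. The deconvolution step (Vogelstein et al. 2010) is not reproduced.

```python
import math

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]

def delta_f_over_f0(trace, half_window=100, pct=30.0):
    """dF/F0 with F0 as a moving percentile of the surrounding frames."""
    out = []
    for i, f in enumerate(trace):
        # Up to 100 frames before and 100 frames after the current frame.
        window = trace[max(0, i - half_window):i] + trace[i + 1:i + 1 + half_window]
        f0 = percentile_nearest_rank(window or [f], pct)
        out.append((f - f0) / f0)
    return out
```

Because the baseline is a low percentile rather than the mean, brief transients barely affect F0, while slow drifts in fluorescence are tracked by the moving window.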
Stimulus-evoked sound responsiveness of single cells
To classify single cells as sound responsive or not, all trials from a given stimulus on a given day were compared in a rank-sum test against twenty randomly picked patterns of spontaneous activity (from periods without sound presentation). A cell was classified as significantly responsive, if the p-value of a non-parametric Wilcoxon rank sum test was below 0.01 after a Benjamini–Hochberg correction for multiple comparisons against number of days (4), number of stimuli (34), and number of cells (21,506) for at least 1 stimulus (Benjamini and Hochberg 1995).
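The Benjamini–Hochberg step-up procedure referred to above can be sketched as follows. This is a generic implementation of the standard procedure, not the authors' code; the function name is hypothetical, and the p-values are assumed to have been pooled over all days, stimuli, and cells before correction.

```python
def benjamini_hochberg(p_values, alpha=0.01):
    """Return one boolean per p-value: True if the test is called significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # All tests up to and including rank k_max are rejected.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

A cell would then count as sound responsive if at least one of its stimulus comparisons is flagged. If every combination of 4 days, 34 stimuli, and 21,506 cells entered the correction, the number of comparisons would be 4 × 34 × 21,506 = 2,924,816.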
FOV inclusion criteria
We included FOVs in our analysis that satisfied the following 3 criteria: (i) FOVs needed to contain at least 100 ROIs (i.e. neurons), which fulfilled the quality criteria described above, (ii) FOVs needed more than 10 significantly sound responsive neurons on each day and (iii) neuronal populations in the FOVs needed to respond to at least four stimuli on at least 1 day.
Analysis of long-term and stable expression of the genetically encoded calcium indicator GCaMP6m
High-resolution images of the FOVs (1,024 × 1,024 pixels, 2.79 pixels per μm) were acquired before each calcium imaging time series and used for this analysis (Supplementary Fig. S3). Coordinates of single cells in high-resolution images were identified and tracked with the same algorithm as for calcium imaging time series data. Only ROIs that were included in the analysis of chronic 2-photon data and were located at least 15 μm from the border of the FOV were included in the analysis.
Discrete-time Markov chain model
2-photon imaging data from mice with either 4 or 5 consecutive imaging timepoints were considered for the Markov chain analysis (Figs 3 and 4). A discrete-time Markov chain is a model that can be used to describe stochastic transitions within a state space, observed at discrete time points, in which the transition probability only depends on the current state. Four states comprised the state space within which auditory cortical neurons could transition: (i) unresponsive to stimuli, (ii) complex sound responsive, (iii) pure tone responsive, or (iv) complex sound and pure tone responsive. In order to be considered in a sound responsive state, the neuron had to respond significantly to at least one of the stimuli in the pure tone or complex sound stimulus set, according to the analysis described in the previous section “Stimulus-evoked Sound Responsiveness of Single Cells.” The time points of observed transitions were experimental imaging sessions separated by 48 h.
Each cell in the 2-photon imaging dataset was assigned a state (1–4) at each time point of observation, according to its stimulus-driven response. A probability matrix of transitions between states was then generated according to the observed transitions between Day 3 and Day 5, by bootstrapping subsets of 2,000 cells over 500 iterations and measuring the proportion of neurons in each state. The final probability matrix was the average over all iterations. The steady-state vector was then computed by calculating the eigenvector associated with eigenvalue of 1 for the probability matrix. Hit times for each state transition were generated with the MATLAB function hittime after generating a Markov chain model using the dtmc function.
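The estimation of the transition matrix and its steady state can be sketched as below. This is a simplified illustration and not the MATLAB analysis: states are indexed 0–3 instead of 1–4, the bootstrap averaging is omitted, and the steady state is obtained by power iteration, which for an ergodic chain converges to the same vector as the eigenvector with eigenvalue 1. Function names are hypothetical.

```python
def transition_matrix(states_t0, states_t1, n_states=4):
    """Row-stochastic matrix of observed transitions between two time points."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states_t0, states_t1):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows without observations fall back to a uniform distribution (assumption).
        matrix.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return matrix

def steady_state(matrix, n_iter=10_000, tol=1e-12):
    """Stationary distribution by repeatedly applying the transition matrix."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi
```

For a two-state toy chain with rows (0.75, 0.25) and (0.5, 0.5), the steady state is (2/3, 1/3): the fraction of cells in each state no longer changes even though individual cells keep switching.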
Life-time analysis
Using the Markov chain model, we calculated the probability that a responsive cell would be observed as responsive over X consecutive time points, i.e. its “life-time” (Fig. 4C). We used the transition probability matrix generated from the dataset of neurons imaged at 5 consecutive time points (11,054 cells). Using the simulate function in MATLAB, we simulated a random walk of 1,000 steps for each cell through the Markov chain. The starting distribution of states across cells was taken from the dataset itself. Then, for each cell, we identified responsive epochs and counted the number of consecutive (unbroken) time points of responsiveness in each epoch. A distribution was thus created for each cell, and averaged across all cells to generate the final plot, where life-times are expressed in time points separated by 48 h.
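The simulation and the extraction of responsive epochs can be sketched as follows. The code is an illustrative Python analogue of the procedure, not the MATLAB implementation: states are indexed from 0 with state 0 taken as "unresponsive", an epoch still running at the end of the walk is counted at its observed length, and the function names are hypothetical.

```python
import random

def simulate_chain(matrix, start_state, n_steps, rng):
    """Random walk through the chain; returns n_steps + 1 visited states."""
    states = [start_state]
    for _ in range(n_steps):
        row = matrix[states[-1]]
        states.append(rng.choices(range(len(matrix)), weights=row)[0])
    return states

def responsive_run_lengths(states, unresponsive_state=0):
    """Lengths of unbroken runs of time points in any responsive state."""
    runs, current = [], 0
    for s in states:
        if s != unresponsive_state:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)  # epoch still ongoing at the end of the walk
    return runs
```

A call such as `simulate_chain(P, start, 1000, random.Random(0))` followed by `responsive_run_lengths` gives the life-times for one simulated cell; pooling them over cells yields the life-time distribution, with each time point corresponding to 48 h.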
Construction of population frequency maps based on in vivo 2-photon data
Population frequency maps were constructed from subsets of neurons deemed sound-responsive on a given imaging day (Fig. 5E). From these subsets, 100 maps were generated from bootstrapped subsamples of 1,000 cells. Only cells whose position fell within the boundaries of auditory cortex generated by intrinsic imaging were considered.
In order to construct population maps with even spatial sampling, a 33 × 36 grid was constructed to span the range of spatial locations covered by the auditory cortex boundaries. Each element of the grid was assigned a frequency tuning value, which was calculated as the averaged best frequency across all neurons that were located within a distance of 9 μm. This procedure introduced a degree of spatial smoothing across the population map. Empty elements of the grid were given a NaN value, and the transparency of each grid element, color-coded to reflect average best frequency, was adjusted to reflect the normalized number of cells that contributed to the average frequency. Therefore, areas of the population map that had been constructed from more cells are more opaque. The maps generated over 100 subsampled iterations were averaged to create the final map.
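The gridding step can be sketched as follows. This is a minimal Python illustration with hypothetical names, assuming each cell is given as (x, y, best frequency) in a common coordinate frame and that best frequency is expressed on a scale where arithmetic averaging is meaningful (for example octaves). The grid centres and radius are passed in, not fixed.

```python
import math

def population_map(cells, x_centres, y_centres, radius):
    """Return (mean best-frequency grid, opacity grid) for the given grid centres.

    Each grid element averages the best frequency of all cells within `radius`;
    empty elements are NaN. Opacity is the contributing cell count divided by
    the largest count found on the grid.
    """
    bf_grid, count_grid = [], []
    for y in y_centres:
        bf_row, count_row = [], []
        for x in x_centres:
            nearby = [bf for cx, cy, bf in cells
                      if math.hypot(cx - x, cy - y) <= radius]
            bf_row.append(sum(nearby) / len(nearby) if nearby else math.nan)
            count_row.append(len(nearby))
        bf_grid.append(bf_row)
        count_grid.append(count_row)
    max_count = max(max(row) for row in count_grid) or 1  # avoid dividing by zero
    opacity = [[c / max_count for c in row] for row in count_grid]
    return bf_grid, opacity
```

Averaging the maps from repeated subsamples, as in the text, would need a NaN-aware mean over the returned grids.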
Stability of responses within a sound category (pure tones, complex sounds)
Only cells with either a significant pure tone or complex sound response on both days across a 2-day transition were included in the analysis (Supplementary Fig. S5). For pure tones, cellular tuning curves were centered to the respective cellular best frequency (BF) of the first day (day i). On the second day (day i + 2), cells were split into 2 groups: (i) cells with the same BF on day i + 2 as on day i, and (ii) cells with a different BF on day i + 2 than on day i. The bandwidth was calculated as the full width at half maximum of a 2D-Gaussian fit of the tuning curve. For complex sounds, cellular response profiles were sorted by descending mean stimulus response magnitude on day i. Grouping was done analogously to that for pure tones, distinguishing cells with a stable best stimulus from cells with a shifting best stimulus. Cellular activity was normalized to the activity of the respective cell’s best pure tone or complex sound stimulus.
Calculation of probability of back and forth state switching
We calculated the probability that a cell in a given state would transition to another state, then transition back to the original state, e.g. an unresponsive cell becoming pure tone responsive, only to return to the unresponsive state again (Supplementary Fig. S7). Using transition probabilities generated by the Markov model, we assumed that the transitions were independent, i.e. memoryless, and thus calculated the joint probabilities of state switching by multiplying the individual transition probabilities. From a starting dataset of 11,054 cells, we randomly chose 5,000 cells and generated a prediction of how many would exhibit this “state switching” behavior based on the calculated joint probabilities, and compared this prediction to a count of how many cells in the group of 5,000 actually exhibited these transitions. We iterated this process 20 times.
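Under the memoryless assumption, the probability of a round trip from state i to state j and back is the product P(i→j) · P(j→i). A minimal Python sketch of the prediction and the matching count is given below. Function names are hypothetical, the matrix is assumed row-stochastic, and each trajectory is assumed to hold exactly 3 consecutive time points.

```python
def round_trip_probability(matrix, origin, via):
    """Joint probability of origin -> via -> origin for independent transitions."""
    return matrix[origin][via] * matrix[via][origin]

def predicted_round_trips(matrix, trajectories, origin, via):
    """Expected number of cells starting in `origin` that make the round trip."""
    n_origin = sum(1 for t in trajectories if t[0] == origin)
    return n_origin * round_trip_probability(matrix, origin, via)

def observed_round_trips(trajectories, origin, via):
    """Number of 3-point trajectories that actually follow origin -> via -> origin."""
    return sum(1 for t in trajectories if tuple(t) == (origin, via, origin))
```

Comparing the two numbers over repeated random subsets of cells gives the model-versus-data comparison described above.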
Signal correlations of single cell tuning vectors
For each cell, a tuning vector with the average stimulus-evoked activity to all 34 stimuli was computed with a random half of all stimulus trials (Fig. 6C–E, Supplementary Fig. S10). This vector was then correlated with a tuning vector of either the other half of trials from the same imaging day or a random half of trials from another imaging day using Pearson’s correlation coefficient. Random picking of half of trials was repeated in 1,000 iterations and results were averaged across iterations. Plotted are the results of this analysis for different groups of cells based on their sound responsiveness to either of the 2 sound categories (pure tones or complex sounds), on imaging day 1, a given imaging day or imaging day 7.
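The split-half procedure can be illustrated as follows. This Python sketch uses hypothetical names and assumes that, for one cell, `responses[s]` is the list of single-trial response amplitudes to stimulus s on a given day, with at least 2 trials per stimulus.

```python
import math
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def split_halves(trials, rng):
    """Randomly split one stimulus's trials into two halves."""
    shuffled = list(trials)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def signal_correlation(responses_a, responses_b=None, n_iter=1000, seed=0):
    """Mean split-half correlation of tuning vectors.

    With responses_b=None, one half of day A is correlated with the other half
    of day A; otherwise with a random half of the trials from day B.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_iter):
        vec_a, vec_b = [], []
        for stim, trials in enumerate(responses_a):
            first, second = split_halves(trials, rng)
            vec_a.append(mean(first))
            if responses_b is None:
                vec_b.append(mean(second))
            else:
                other_half, _ = split_halves(responses_b[stim], rng)
                vec_b.append(mean(other_half))
        values.append(pearson(vec_a, vec_b))
    return mean(values)
```

For the stimulus set described in this paper, each tuning vector would have 34 entries.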
Cumulative probability of responsiveness given subsets of “permanently unresponsive” cells
Using transition probabilities generated with the Markov model, for each population, we calculated the cumulative probability of sound responsiveness (i.e. a cell becoming responsive on at least one time point) over the course of 100 days, but with an additional caveat: only a percentage of cells originally observed as unresponsive over the course of 2 time points could potentially be sound responsive in the future, while all other unresponsive cells could never be responsive (i.e. transition probabilities for these cells were set to zero). We varied this percentage from zero to 100% in steps of 20% (Supplementary Fig. S8).
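One simplified way to read this calculation: a cell that starts unresponsive has never been responsive after t steps with probability p^t, where p is the probability of remaining unresponsive across one 48-h step. If only a fraction f of the initially unresponsive cells can ever change state, the rest stay unresponsive forever. The Python sketch below encodes this reading. It is an assumption-laden simplification, not the authors' exact procedure, and all names are hypothetical.

```python
def cumulative_responsiveness(p_stay_unresponsive, frac_unresponsive_start,
                              n_steps, frac_capable=1.0):
    """Fraction of cells responsive at least once by each step (0..n_steps).

    p_stay_unresponsive: one-step probability of unresponsive -> unresponsive.
    frac_unresponsive_start: fraction of cells unresponsive at the first time point.
    frac_capable: fraction of those cells that can ever become responsive.
    """
    curve = []
    for t in range(n_steps + 1):
        never_responsive = frac_unresponsive_start * (
            (1.0 - frac_capable) + frac_capable * p_stay_unresponsive ** t
        )
        curve.append(1.0 - never_responsive)
    return curve
```

With sessions 48 h apart, 100 days corresponds to `n_steps=50`. Sweeping `frac_capable` from 0 to 1 in steps of 0.2 mirrors the range explored in the text.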
Analysis of lost, stable, and gained pure tone responses
We first identified cells, out of the population of cells whose spatial location was within the auditory cortex border generated by the template map in Fig. 5C, that had a significant pure tone response on at least one imaging day. For each imaging time interval, we noted the BF of cells that (i) lost pure tone responsiveness on a subsequent day, (ii) maintained pure tone responsiveness across both time points, or (iii) had previously been unresponsive to pure tones and subsequently gained a response. In each category (lost, stable, and gained BFs), we averaged the number of cells in each frequency bin to generate distributions of BFs for lost, stable, and gained responses (Fig. 7A). To test whether these distributions were different from each other, we used a 2-sample Kolmogorov–Smirnov test (kstest2, MATLAB), and performed a Bonferroni correction for multiple comparisons. To show a snapshot of the population map-level frequency representations gained and lost over a single imaging interval, we constructed population frequency maps using the BF of cells that lost or gained responses between imaging days 3 and 5 (Fig. 7B; see previous section “Construction of Population Frequency Maps based on in vivo 2-Photon Data”).
Best frequency slope determination in primary auditory cortex (A1)
A line across the A1 low-to-high best frequency axis was first manually drawn on the average population frequency map from the 2nd imaging time point (Day 3) (Fig. 8B). A subsample of 1,000 neurons in auditory cortex was then chosen for the analysis. For 10 evenly spaced points along this axis, a frequency value was calculated by averaging the best frequency for the 25 closest cells in the dataset (pooled across animals and FOVs, after alignment to the template auditory cortex map). The A1 slope was then calculated according to a linear fit along these 10 points. This process was repeated 20 times for each imaging time point to create a distribution of slopes on each day.
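A minimal Python sketch of one slope estimate is shown below, using hypothetical names. It assumes cells are given as (x, y, best frequency) tuples in map coordinates, that the manually drawn axis is specified by its two end points, and that the linear fit is an ordinary least-squares fit of best frequency against distance along the axis.

```python
import math

def nearest_mean_bf(point, cells, k=25):
    """Mean best frequency of the k cells closest to `point`."""
    ranked = sorted(cells, key=lambda c: math.dist(point, (c[0], c[1])))
    nearest = ranked[:k]
    return sum(c[2] for c in nearest) / len(nearest)

def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def bf_slope(start, end, cells, n_points=10, k=25):
    """Slope of best frequency along the axis from `start` to `end`."""
    length = math.dist(start, end)
    positions, values = [], []
    for i in range(n_points):
        frac = i / (n_points - 1)
        point = (start[0] + frac * (end[0] - start[0]),
                 start[1] + frac * (end[1] - start[1]))
        positions.append(frac * length)
        values.append(nearest_mean_bf(point, cells, k))
    return linear_slope(positions, values)
```

Repeating `bf_slope` on fresh random subsamples of cells yields a distribution of slopes per imaging day, as described above.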
Sound decoding based on logistic regression
A linear classifier (MATLAB function lassoglm with L1 regularization) was trained to discriminate between responses to 2 different stimuli (Fig. 8D). For the analyses where training and testing were done on the same day, cross-validation was performed by leaving out one trial. Where training and testing were done on different days, training was done with all trials of a given day and the performance of the classifier was tested on each trial of a different day. The pairwise decoding performance was then defined as the percentage of correctly classified trials, and FOV decoding performance was defined as the mean pairwise decoding performance over all pairs of stimuli.

In vivo imaging of sound-evoked responses in the mouse auditory cortex. A) In vivo 2-photon image of a field of view (FOV) in layer 2/3 of auditory cortex showing expression of the virally delivered calcium indicator GCaMP6m (green) and the nuclear marker H2B::mCherry (red). White circles and digits represent the location of the neurons shown in panel (C). Scale bar = 50 μm. B) Spectrograms of the stimulus set used for awake, passive recordings of sound-evoked activity. Top: 15 complex sounds (CS1–CS15); bottom: 19 pure tone stimuli (2–45 kHz, separated by a quarter octave). Stimulus color schemes are used throughout the figures of this manuscript. C) Auditory tuning of 6 representative example cells from the FOV shown in panel (A). Trial-averaged activity is shown in black; SEM of trials is shown in gray. Bottom color bar: Colored entries of the stimulus ID bar illustrate stimuli eliciting a significant sound response in the respective cell, while stimuli without a significant response are shown in gray. Single cells show a wide array of tuning profiles: classical V-shaped tuning curves to pure tones, specific responses to single complex sounds, or various combinations of both.
Results
Individual neurons in auditory cortex gain and lose responsiveness over time
To explore sound representations in auditory cortex at single-neuron resolution, we performed 2-photon calcium imaging in 21,506 neurons in layer 2/3 of 12 awake, passively listening mice expressing virally delivered GCaMP6m (Fig. 1A, Supplementary Fig. S1, Aschauer et al. 2022). The stimulus set consisted of 19 pure tones and 15 complex sounds (Fig. 1B). Complex sound stimuli, characterized by a broad and temporally modulated frequency spectrum, resemble naturally occurring sounds and typically lead to a broad activation of the auditory cortex that does not reveal an obvious topographic organization (Guo et al. 2012; Moczulska et al. 2013). Individual fields of view (FOVs) were imaged at least four times at 2-day intervals, after mice had been extensively habituated to the imaging setup and sound presentations. To avoid a bias toward active cells and to safeguard high-fidelity re-identification of cells across time points, we co-expressed the nuclear marker H2B::mCherry (see section Materials and Methods).

Auditory responses of single cells can be highly dynamic over time. A) Long-term auditory tuning of 8 representative example cells to either pure tones or complex sounds on 4 imaging days, separated by 2 days each. Bottom color bars: Colored entries in the stimulus ID bar illustrate stimuli with a significant sound response on the respective day (see section Materials and Methods). Stimuli without a significant response are shown in gray. Example cells are grouped into 4 categories of different tuning dynamics, with 2 example cells each: Cells with a stable response on all imaging days; cells without auditory responses on the first day which gain a response during the experiment; cells with a stimulus response on the first imaging day which is lost at later time points; and cells displaying mixed dynamics including stable, gained, and lost sound responses. B) Life-time plots showing sound responsiveness status of all individual cells (rows) across 4 imaging days (columns). From left to right, a thin horizontal black line on a given day indicates if the neuron was unresponsive, complex sound responsive, pure tone responsive, or complex sound and pure tone responsive.
This dataset formed the basis of a previous study (Aschauer et al. 2022), where we investigated learning-induced changes in population codes and how they can predict behavioral stimulus generalization observed following auditory cued fear conditioning. In this paper, we focus on the changes in sound-evoked responses of individual neurons over the course of days and how these dynamics relate to the maintenance of the spatially ordered tonotopic map.
When analyzing the sound-evoked responses in individual neurons, we observed that on any given day, neurons showed diverse tuning to sound stimuli (Fig. 1C), which consisted of both pure tone tuning and preferences for complex sounds. Across days, single cell stimulus responsiveness could either be stable or volatile, with cells maintaining, losing or gaining responsiveness to a given sound stimulus (Fig. 2A). Inspecting the individual calcium transients of each neuron, it was clear that the large degree of volatility in sound responsiveness could not be explained by trial-by-trial variability of near-threshold responses (Supplementary Fig. S2). Instead, once-strong and reliable sound responses could be entirely absent on a given imaging day, or vice versa. Further, cells displayed volatile sound responsiveness despite showing the same degree of calcium indicator expression (Supplementary Fig. S3) and stable underlying distributions of spontaneous activity (Supplementary Fig. S4).
At a given time point, we divided the cells into four categories according to their response properties: cells in which we did not observe sound-evoked responses (unresponsive), cells selectively responsive to non-topographically organized complex sounds (CS responsive), cells selectively responsive to topographically organized pure tones (PT responsive), and cells with responses to both pure tones and complex sounds (CS & PT responsive). We found that most changes in tuning over days were reflected by switches between these 4 categories. Only a minor fraction of neurons that were assigned to the same category on consecutive imaging sessions showed changes in the stimulus that evoked the strongest response (Supplementary Fig. S5). A life-time plot of neurons unresponsive to the stimulus set on the first imaging day revealed that the majority remained unresponsive throughout the whole imaging period, with a smaller population of cells displaying an emergent response on at least one other imaging day (Fig. 2B, far left). For pure tone responsive neurons, a large majority of neurons deemed responsive on one day showed unstable responsiveness at some point over the four imaging time points (Fig. 2B, second from left). Notably, neurons responsive to either complex sounds only (Fig. 2B, third from left), or both pure tones and complex sounds (Fig. 2B, far right) displayed a similar degree of volatility as those responsive to only pure tones. We probed the validity of the categorizations by assessing their sensitivity to various ways of shuffling the data (Supplementary Fig. S6). In summary, our analyses revealed that over several imaging days, the mouse auditory cortex contains an unstable subpopulation of neurons that regularly gain and lose sound responsiveness to a particular stimulus set.
Capturing the dynamics in sound responsiveness with a discrete-time Markov model
We next aimed for a model that would allow us to quantitatively describe ongoing neural response changes. Since our experimental conditions were stable, we reasoned that the ongoing gain and loss of responsiveness could be a largely stochastic process, perhaps related to ongoing synaptic turnover or other non-behaviorally guided factors (Ziv et al. 2013; Chambers and Rumpel 2017). We modeled the transitions between four different states of responsiveness, i.e. unresponsive, pure tone responsive, complex sound responsive, pure tone and complex sound responsive, as a discrete-time Markov chain (Fig. 3; see section Materials and Methods). A discrete-time Markov chain is the simplest model that describes stochastic transitions within a state space. It is “memoryless,” which means that the transition probability only depends on the current state and not the history of previous transitions. Such a Markov model can thus be generated from only 2 time points of observation. We aimed to compare the transition probabilities and predictions generated from the Markov model to our experimental dataset, which consisted of multiple time points of observation, for 2 main reasons. First, we aimed to test whether the changes in responsiveness were in a steady state. Second, we aimed to evaluate how well a memoryless and stochastic model could capture actual dynamics. To further clarify the extent to which this model could capture the volatility of the underlying data, we analyzed a population of neurons that had been repeatedly imaged over 9 total days (5 imaging time points; n = 11,054 cells, 8 mice).

Modeling the dynamics of sound responsiveness with a discrete-time Markov chain. A) Probability matrix of state transitions from day 3 (x-axis) to day 5 (y-axis), averaged over 500 iterations using 1,000 cells per iteration. B) Graph plot illustrating the probabilities for each transition (direction represented with arrows, same probabilities as displayed in (A)). C) Bar plot of steady state probabilities for each state. D) Actual daily distributions of cells in each state, across 5 imaging time points. Grayscale shade indicates day from which data was taken.
Transitions observed between day 3 and day 5 (Fig. 3A and B) indicated that overall, individual neurons were most likely to remain unresponsive to sound across both time points. Interestingly, neurons that responded to both pure tones and complex sounds had a higher probability of maintaining a sound response than losing it, while neurons responding to either pure tones or complex sounds, but not both, were more likely to be unresponsive at the next time point. We then constructed steady-state probabilities of a neuronal population based on the probabilities of transitions between the 2 time points (Fig. 3C), which closely matched the actual distributions of each neuronal response type (Fig. 3D). The high degree of similarity between predicted and actual distributions supported our hypothesis that under stable behavioral conditions, volatility in stimulus responsiveness was in a state of dynamic equilibrium.
As the transition probabilities in the Markov model depend solely on the current state, we assessed how well specific series of transitions between different states of responsiveness in our dataset could be captured. We found a good correlation between the model and the data for the likelihood of individual neurons switching back and forth between the various categories (Supplementary Fig. S7). Taken together, the simple Markov chain was able to capture the general structure of changes in responsiveness over a 9-day period, indicating that the turnover we observed is predominantly stochastic and memoryless.
Predicting long-term life-cycles in neuronal responsiveness
Our experimental results suggested that at any point in time, the population response in auditory cortex is composed of the activity of a subpopulation of sound responsive neurons. How long would it take for the entire auditory cortex neuron population to participate in this responsive subpopulation at least once? Repeated neuronal recordings are limited in terms of the number of days on which measurements are practically feasible. Therefore, we leveraged the Markov chain model to generate predictions of sound responsiveness at arbitrary lengths of time in the future (Fig. 4). We estimated the average cumulative probability of sound responsiveness for a population of auditory cortex neurons (see section Materials and Methods), considering the transition probabilities we observed in response to our set of 34 pure tone and complex sound stimuli (Fig. 4A). Regardless of whether the prediction was generated from 2-day transitions in the dataset from early or late in the experimental period, this analysis consistently predicted that within 100 days, more than 80% of the neuronal population would be responsive to a stimulus we used in our experiment, for at least one time point.

Model-based long-term predictions of neuronal sound responsiveness. A) Cumulative probability of sound responsiveness, using probabilities generated from transitions in the actual data between 2 time points (different colored lines). Black dots indicate the actual cumulative distribution of sound responsiveness in the dataset, measured over 5 imaging time points (93 FOVs). B) Cumulative probabilities of sound responsiveness, generated from the Markov chain model, when the stimulus set was subsampled such that observations of sound responsiveness were made only from presentations of those remaining stimuli. Stimuli were chosen randomly and the process was iterated 100 times per FOV. C) Probability (mean ± SD) of life-time length; i.e., the number of consecutive days that a responsive cell will be continuously responsive. Calculated from a random walk of 1,000 steps through the Markov chain, averaged over 11,054 cells. D) Hit times, representing the average number of days it would take to reach each state from any initial state.
Moreover, we asked to what extent the long-term predictions of responsiveness would be affected if some of the initially unresponsive neurons remain permanently unresponsive and only a subpopulation is capable of eventually becoming sound responsive at later time points. To test this, we added a category of permanently unresponsive neurons to the model. Only versions of this model in which the majority of initially unresponsive cells are capable of eventually displaying a sound response were compatible with our experimental data (Supplementary Fig. S8). This included the version in which all initially unresponsive cells have the potential to gain responsiveness later. We therefore used the simplest model for the further analyses, without making assumptions about further subcategories in the unresponsive state.
To investigate the effect of experience, we analyzed a previously published dataset of repeated imaging before and after auditory cued fear conditioning (ACFC) (Supplementary Fig. S9A), a paradigm that can shift the probabilities of sound responsiveness in auditory cortex (Aschauer et al. 2022). ACFC biased the long-term prediction of the Markov model toward higher probabilities of gaining sound responsiveness (Supplementary Fig. S9B); however, this shift was subtle and rebounded within days, indicating the robustness of the dynamic equilibrium of sound responsiveness in auditory cortex.
We asked how our results would be affected by the number of different stimuli that were used to probe the sound responsiveness of a given cell. We subsampled our dataset to estimate the cumulative probability of responsiveness if fewer stimuli were used (Fig. 4B). We found that the cumulative probabilities were barely affected, even if only half of the stimulus set was presented. This indicated that at a given time point, a substantial fraction of neurons would be categorized as unresponsive regardless of the diversity and size of the stimulus set. More dramatic differences in the probabilities of responsiveness were observed when using fewer than 17 stimuli. In these cases, the stimulus set likely would not capture the receptive fields of many cells, and even after 200 days, the probabilities failed to converge with those predicted when using the full set of 34 stimuli.
If a cell has gained responsiveness at a given time point, what is the likelihood that the cell will continue to be responsive over further consecutive time points? In other words, how transient is the responsive subpopulation of neurons that forms the stable population response at any given time? To answer this, we simulated a random walk of 1,000 steps through the Markov chain, with the initial distribution of responsiveness in a cell population matching the actual dataset (11,054 cells). Then, we calculated the average probability of responsiveness occurring over consecutive time points (Fig. 4C). This analysis showed that a cell would most likely be responsive at only 1 or 2 consecutive time points separated by 48 h, before becoming unresponsive to the stimulus set again. However, the distribution also had a long tail, indicating that a small fraction of the cell population would display much more persistent responsiveness.
Finally, in a related analysis, we calculated predicted “hit times” (Fig. 4D), which represent the number of days it would typically take for a neuron in each responsiveness category to appear in another category. This analysis revealed that within a time period of several days to several weeks, it is highly likely that an originally responsive neuron would be suddenly unresponsive at some point. However, by contrast, if a neuron was observed as entirely unresponsive at one point, it could take several months to observe responsiveness to both complex sounds and pure tones on the same imaging day. In summary, although the vast majority of neurons on a given experimental day do not display any response to the suite of 34 pure tones and complex sounds we used in our experiment, a discrete-time Markov model predicts that, given enough time, the majority of neurons in the starting population will eventually show a reliable calcium response to at least one sound in the stimulus set.
Multi-scale imaging of the tonotopic map in the mouse auditory cortex
A well-known feature of auditory cortex is its stable and robust topographic sound frequency organization. We asked whether our 2-photon imaging data also showed this level of large-scale topography. To visualize the organization of frequency tuning at multiple scales, we performed both intrinsic signal imaging and 2-photon calcium imaging in the same mice. Intrinsic signal imaging revealed a large-scale organization of pure tone frequency representation over the exposed cortical surface (Fig. 5A and B, Bakin et al. 1996). We aligned and averaged the intrinsic signal activity maps of individual animals in order to form a “template” map (Fig. 5C; see section Materials and Methods) whose outline defined the core regions of auditory cortex. These regions displayed typical low-to-high frequency tuning gradients as published previously (Guo et al. 2012; Issa et al. 2014; Romero et al. 2020).

Intrinsic signal and 2-photon calcium imaging reveal the tonotopic organization of the mouse auditory cortex. A) Trial-averaged in vivo intrinsic signal intensity in response to pure tone stimuli of an individual mouse, overlaid onto the image of the exposed auditory cortex. Corresponding frequency of pure tone stimulus indicated on top of the image. Scale bar = 0.5 mm. B) Average intrinsic signal intensity upon pure tone stimulation, across 8 mice. Scale bar = 0.5 mm. C) Average frequency tuning map derived from intrinsic maps from all mice. Color indicates sound frequency eliciting maximal response. Outline delineates auditory cortex. Scale bar = 0.5 mm. D) Best frequency (BF) tuning of all pure tone responsive neurons across all mice (n = 1,647 cells), on the first day of imaging, overlaid onto auditory cortex border to show spatial distribution. Scale bar = 0.5 mm. E) Spatially averaged, smoothened population frequency tuning map of auditory cortex, taken from neurons shown in (D). Dotted line indicates low-to-high BF gradient characteristic of A1. Scale bar = 0.5 mm.
Considering the significantly responsive neurons on the first imaging day (Fig. 5D), and using the intrinsic imaging maps to register the data from several mice, we constructed a large-scale population tuning map (1,647/21,506 neurons, 12 mice). In contrast to the substantial heterogeneity in tuning observed in individual FOVs, a pronounced low-to-high frequency tuning gradient along primary auditory cortex (A1) became apparent in the large-scale map constructed from 2-photon data (Fig. 5E), resembling the gradient observed in the intrinsic signals.
Gains and losses of pure tone responsiveness occur throughout the tonotopic map
Although we were able to observe large-scale topographic organization in our 2-photon dataset at one time point, individual cell responses could be volatile within one FOV (Fig. 6A), and across the entire topographic map (Fig. 6B). The number of pure tone responsive neurons on any given day was largely constant throughout the imaging period (Fig. 6B, middle row). However, considering a starting population of pure tone responsive neurons on the first imaging day, the number of neurons that continued to show a response strongly declined over the course of a week (Fig. 6B, top row). In a symmetric fashion, a subpopulation of neurons deemed responsive on the latest imaging day (day 7; Fig. 6B, bottom row) also progressively declined in responsiveness, the further back in time the analysis was performed. Similarly, we considered the signal correlation of cells that were pure tone responsive on day 1 (Fig. 6C), each day (Fig. 6D) and on day 7 (Fig. 6E). For individual cells, we constructed a pure tone tuning vector from one half of the trials and calculated the average Pearson correlation coefficient to the other half of trials from the same day or other days (see section Materials and Methods). We observed a gradual and monotonic decrease in signal correlation for increasing time intervals that was largely symmetric for both time directions, whereas the signal correlations on a given day were largely constant throughout the experiment. Importantly, this drift in tuning was similar for responses to non-topographically organized complex sounds (Supplementary Fig. S10).

The gain and loss of responsiveness of individual auditory cortex neurons is balanced over time. A) In vivo 2-photon images of a single FOV from one mouse (H2B::mCherry channel only). Cells that show a significant pure tone response on a given imaging day are outlined with colored circles according to their respective BF. Scale bar = 50 μm. B) BF tuning of all neurons deemed responsive on each day, out of an original subset deemed responsive on day 1 (top row), a given day (middle row), or day 7 (bottom row). Scale bar = 0.5 mm. C) Correlation of single trial response vectors from cells with a significant pure tone response on day 1. Plotted is the Pearson correlation coefficient of a random half of all trials from the respective day against the other half of trials from day 1. (one-way ANOVA with Bonferroni correction for multiple comparisons, F = 152.51, p < 0.001, day 3 vs day 7: p < 0.001). D) Same as panel (C), but plotted is always the correlation of a random half of trials from the respective day against the other half of trials from that day. (one-way ANOVA with Bonferroni correction for multiple comparisons, F = 1.76, p = 0.153, n.s.: p > 0.5 for all comparisons between time points). E) Same as panel (D), but a random half of trials is correlated against the other half of trials from day 7. (one-way ANOVA with Bonferroni correction for multiple comparisons, F = 110.29, p < 0.001, day 1 vs day 5: p < 0.001).
We quantified how the gain and loss of responsiveness to pure tones is spread along the main axis of A1. At each of the 3 imaging intervals, we observed similar numbers of cells that lost (Fig. 7A, left), maintained (Fig. 7A, middle) or gained (Fig. 7A, right) responsiveness (p > 0.05, 2-sample Kolmogorov–Smirnov test). Cells with volatile tuning were spread along the tonotopic gradient and did not show a systematic relationship to it.

Changes in responsiveness are widely distributed across the spectrum of represented frequencies. A) Histograms of BFs of cells within A1 that lost their response (left), cells that had a stable sound response across different imaging days (middle), and cells that gained a response (right), averaged over transitions. Distributions of BFs for lost and gained responses did not significantly differ from those for stable responses (2-sample Kolmogorov–Smirnov test, p > 0.05). B, C) Averaged population maps constructed from cells that lost (B) or gained (C) a pure tone response between days 3 and 5. Dotted line indicates low-to-high BF gradient characteristic of A1.
Next, we wondered if the tuning of cells would reveal a period of pronounced instability, signaling the subsequent complete loss of responsiveness. From our 2-photon data, we created a large-scale map of the auditory cortex that is based on the pure tone responses of cells that would be categorized as unresponsive in the next imaging session 2 days later (Fig. 7B). We found that this large-scale map showed a clear topographic organization and highly resembled the map constructed from all responsive cells shown in Fig. 5E. Despite the fact that the local tuning was a bit less pronounced, the main tonotopic gradient along A1 was again evident. A similar picture was seen when constructing a large-scale map selectively from those cells that have gained responsiveness within the last 2 days (Fig. 7C). This map was similar to the map constructed from all responsive cells, including those with stable tuning, and the tonotopic gradient along A1 was clearly detected, indicating that even cells with volatile responsiveness display frequency tuning aligned to the population map.
The tonotopic map is sustained over time by a shifting population of responsive neurons
Our findings so far indicated that tuning to pure tones showed a similar degree of volatility as the tuning to complex sounds. Therefore, the maintenance of the tonotopic map cannot be explained simply by a selective stability of pure tone responses. To better understand how stability of the map emerges from neurons with unstable tuning properties, we constructed averaged population maps from the responsive cells from each experimental day, which showed a highly similar overall structure (Fig. 8A). To quantify this organization, we calculated the slope of frequency tuning along the characteristic low-to-high frequency axis that spans A1 (Fig. 8B, see Experimental Procedures). Apart from a slight decay in steepness following the first imaging day, this gradient was readily detected throughout the imaging period (Fig. 8C), as expected from a stable topographic map. We attributed this large-scale stability to a balanced loss and gain of responsiveness across the frequency gradient, consistent with the stochastic nature of auditory cortex neuron volatility.
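As a rough illustration of how a BF slope along the tonotopic axis might be computed, the sketch below fits an ordinary least-squares line to BF, expressed in octaves, against position along the axis. This is an assumption about the general approach rather than the procedure used in the study; `bf_slope`, the positions, and the BF values are hypothetical.

```python
import math


def bf_slope(positions_mm, bfs_khz):
    """Least-squares slope of log2(BF) against position, in octaves per mm."""
    octaves = [math.log2(f) for f in bfs_khz]
    n = len(octaves)
    mean_x = sum(positions_mm) / n
    mean_y = sum(octaves) / n
    sxx = sum((x - mean_x) ** 2 for x in positions_mm)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(positions_mm, octaves))
    return sxy / sxx


# Hypothetical cells along the A1 axis; BF doubles every 0.25 mm.
positions = [0.0, 0.25, 0.5, 0.75, 1.0]
bfs = [4.0, 8.0, 16.0, 32.0, 64.0]
print(f"slope = {bf_slope(positions, bfs):.2f} octaves/mm")
```

A slope that stays clearly above zero from day to day corresponds to a gradient that remains detectable, whereas a slope near zero would indicate a loss of tonotopic order.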

Stable maintenance of the tonotopic map. A) Spatially averaged population frequency tuning maps, as in Fig. 5E, created from pure tone responsive neurons on each of the 4 imaging days. Black outline indicates auditory cortex, dotted line indicates low-to-high frequency tuning gradient characteristic of primary auditory cortex (A1). Scale bar = 0.5 mm. B) Example BF slope calculation across the A1 low-to-high frequency gradient. C) Average BF slope in A1 calculated from neurons deemed responsive on any of the 4 imaging days, across all mice. (2-sample t-test, Bonferroni correction for multiple comparisons, day 1 vs day 3, day 5, and day 7: p < 0.001, day 3 vs day 5: p = 0.07, day 3 vs day 7: p = 0.04, day 5 vs day 7: p = 0.8). D) Average performance of a logistic regression classifier for the discriminability of individual pure tones, when training dataset is taken from a given day (magenta line, repeated-measures ANOVA, F(2) = 0.73, p = 0.48), day 1 only (blue line, F(2) = 24.04, p < 0.001), or day 7 only (red line, F(2) = 17.35, p < 0.001). n = 97 FOVs from 12 mice.
Finally, given volatile individual neuron responsiveness and stable topographic organization, we assessed whether the readout of auditory cortex neuron activity comprised a stable sound representation. We used a logistic regression-based classifier (see section Materials and Methods) to test whether the encoding of pure tone information was preserved across the 4 imaging time points. When the training and testing data were taken from the same imaging day, we noted a robust and stable decoding performance for all days (Fig. 8D, magenta line), indicating that overall pure tone information was preserved over time. However, volatility in the underlying neuronal population became apparent when training datasets were taken from the neurons deemed responsive on either day 1 (Fig. 8D, blue line) or day 7 (Fig. 8D, red line), and testing was performed on the response patterns of the same neurons on the other imaging days. In this case, classifier performance declined significantly and deteriorated steadily as the time between training and testing days increased. In summary, auditory cortex topographic map stability was safeguarded by balanced gains and losses of sound responsiveness across the spatial and frequency gradients.
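The contrast between within-day and cross-day decoding can be illustrated with a deliberately extreme toy example. The sketch below is not the classifier from the study: it is a binary logistic regression fit by stochastic gradient descent, and the 4-neuron population, the complete turnover of responsive neurons between days, and all names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, lr=0.5, epochs=200):
    """Fit a binary logistic regression by stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def accuracy(model, samples, labels):
    """Fraction of samples whose predicted class matches the label."""
    weights, bias = model
    hits = 0
    for x, y in zip(samples, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        hits += int((p >= 0.5) == bool(y))
    return hits / len(labels)


# Toy population of 4 neurons and 2 tones (label 1 = tone A, 0 = tone B).
# Neurons 0 and 1 carry the tones on day 1; neurons 2 and 3 do so on day 7.
labels = [1, 0]
day1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
day7 = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

model_day1 = train_logistic(day1, labels)
model_day7 = train_logistic(day7, labels)
print("train day 1, test day 1:", accuracy(model_day1, day1, labels))
print("train day 7, test day 7:", accuracy(model_day7, day7, labels))
print("train day 1, test day 7:", accuracy(model_day1, day7, labels))
```

Each day supports accurate decoding on its own, but a decoder fixed on day 1 falls to chance on day 7 because the informative neurons have changed. In real data the turnover is partial, so the decline is gradual rather than complete.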
Discussion
We show here, using pure tone frequency tuning in the mouse auditory cortex as a model, that a stable topographic map is maintained by a dynamic subpopulation of neurons that intermittently gain and lose responsiveness to sensory stimuli. This is in contrast to models in which the stability of a sensory map is parsimoniously explained either by a generally low degree of plasticity, resulting in an intrinsically stable structure, or by a selective sparing of the responses to the topographically organized parameters from the long-term remodeling of functional properties in cortical neurons.
Our observation that a simple memoryless Markov model can capture much of the change in responsiveness indicates that these dynamics are dominated by stochastic processes. Stochastic descriptions of fundamental processes in the brain have been applied successfully for the gating of voltage- or ligand-gated channels in the neuronal membrane (Colquhoun 1995) as well as for the release of neurotransmitters at chemical synapses (Redman 1990). Despite its simplicity, the Markov model reproduces most of the observed dynamics, although minor deviations from the data are apparent, for example, when considering sequences of transitions (Supplementary Fig. S7).
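To make the notion of a dynamic equilibrium concrete, the sketch below assumes the simplest possible version of such a model: a two-state chain in which an unresponsive neuron becomes responsive with probability `p_gain` per imaging interval and a responsive neuron becomes unresponsive with probability `p_loss`. The two-state structure and the numerical values are illustrative assumptions, not the fitted model or parameters of the study.

```python
def next_fraction(frac_responsive, p_gain, p_loss):
    """Responsive fraction after one transition of the two-state chain."""
    stayed = frac_responsive * (1.0 - p_loss)
    gained = (1.0 - frac_responsive) * p_gain
    return stayed + gained


def stationary_fraction(p_gain, p_loss):
    """Responsive fraction at which gains and losses balance exactly."""
    return p_gain / (p_gain + p_loss)


# Hypothetical transition probabilities per imaging interval.
p_gain, p_loss = 0.1, 0.4

frac = 0.0  # start with no responsive neurons
for _ in range(100):
    frac = next_fraction(frac, p_gain, p_loss)

print(f"fraction after 100 transitions: {frac:.3f}")
print(f"stationary fraction:            {stationary_fraction(p_gain, p_loss):.3f}")
```

At the stationary fraction, the number of neurons gaining responsiveness per interval equals the number losing it, so the size of the responsive population stays constant even though its membership keeps changing.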
Interestingly, the observation of steady-state stochastic dynamics was made under basal conditions. When involving the mice in an associative learning paradigm, we observed that the transition probabilities were transiently altered (Supplementary Fig. S9B). We utilized the model to make long-term predictions about the life cycles of neuronal responsiveness to sensory stimuli under basal conditions. Although our experimental data cover only up to 10 days of longitudinal imaging, and the model’s long-term predictions therefore carry a degree of uncertainty, we found that the vast majority of neurons would be expected to become responsive to the set of sounds used in our study, at least transiently, within 100 days (Fig. 4). This provides an interesting twist on the common observation that many neurons are unresponsive on a given day, even when tested with a broad set of stimuli. One interpretation is that for these neurons, the proper stimulus was not presented, as even in primary sensory cortical regions, many cells have complex nonlinear receptive field properties (Wang et al. 2005; Bizley et al. 2009; Chambers et al. 2014). Another interpretation is that these neurons are primarily driven by a different sensory modality (Bizley et al. 2007; Kayser et al. 2008; Iurilli et al. 2012; Ibrahim et al. 2016). However, our findings suggest a complementary interpretation: that the neuron was simply not tested at the proper time.
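The logic behind such a long-term prediction can be sketched under the same memoryless assumption. If an unresponsive neuron gains responsiveness with a fixed probability per transition, the chance that it stays unresponsive through n consecutive transitions shrinks geometrically. The function name, the starting fraction, the gain probability, and the use of 2-day steps (50 transitions for 100 days) are hypothetical choices for illustration, not values from the study.

```python
def fraction_ever_responsive(initial_fraction, p_gain, n_transitions):
    """Fraction of neurons responsive at least once within n transitions.

    A neuron is never responsive only if it starts unresponsive and then
    fails to gain responsiveness at every one of the n transitions.
    """
    never = (1.0 - initial_fraction) * (1.0 - p_gain) ** n_transitions
    return 1.0 - never


# Hypothetical values: 20% responsive at the start, 5% gain per 2-day step.
for days in (10, 50, 100):
    n = days // 2
    print(days, "days:", round(fraction_ever_responsive(0.2, 0.05, n), 3))
```

Even a small gain probability per interval compounds over many intervals, which is why a minority of responsive cells on any single day is compatible with most cells being responsive at some point over a longer period.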
What could cause the volatility of a neuron’s stimulus response properties? Significant changes in sound responsiveness occurred even against a backdrop of relatively stable spontaneous activity distributions (Supplementary Fig. S4). This may indicate that changes in the weights of presynaptic inputs to a cell, rather than changes in the cell’s intrinsic excitability, underlie the changes in responsiveness. Continuous remodeling of synaptic connectivity has been previously observed in the mouse auditory cortex during behaviorally stable conditions (Loewenstein et al. 2011, 2015; Rumpel and Triesch 2016). Spontaneous changes in synaptic connections may occur even independently of neuronal activity (Yasumatsu et al. 2008; Rubinski and Ziv 2015; Dvorkin and Ziv 2016; Nagaoka et al. 2016; Kasai et al. 2021). Standard theoretical network models typically incorporate activity-dependent forms of Hebbian or homeostatic synaptic plasticity. In those cases, however, once an equilibrium is reached, no further synaptic changes would be expected to occur. Only recently has theoretical modeling been applied to investigate how ongoing synaptic remodeling observed during basal conditions can affect the long-term stability of activity patterns in a network (Kappel et al. 2015, 2018; Mongillo et al. 2018; Humble et al. 2019; Susman et al. 2019; Raman and O'Leary 2021).
Our findings support a model in which the sensory representation at any given time point is formed by a population of neurons with volatile tuning properties and the stability of the topographic map emerges from a dynamic equilibrium. Whereas volatility in synaptic connections appears to be a plausible mechanism to drive volatility in neuronal tuning, the putative homeostatic mechanisms that maintain the overall organization of the topographic map are still poorly understood. An important implication of an intrinsically volatile structure is that changes in the equilibrium of the ongoing dynamics could provide a mechanism underlying the large-scale restructuring of sensory maps, as has been reported, for example, in patients following the loss of a limb due to an accident or amputation (Flor et al. 2006). Consistent with this, we observed that even a brief and strong sensory experience can temporarily perturb the rates of these otherwise stationary dynamics (Supplementary Fig. S9B).
The emerging picture from our study is that in primary sensory cortices, access to sensory information at any given time is restricted to a subset of neurons. Research in artificial neural networks offers a clue as to how this could be beneficial for brain function. A feature called “dropout” has shown success in improving the performance of neural networks, particularly in preventing overfitting (Hinton et al. 2012; Srivastava et al. 2014). Dropout entails a random deletion of units, and their connections, during the training process. For the cortex, this random unit deletion, such that only a fraction of the neural population is “accessible” during the learning process, may aid in generalization, as well as support the sparseness and metabolic efficiency of cortical activity.
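For readers unfamiliar with the technique, the random deletion of units can be sketched in a few lines. The example below uses the common "inverted" variant, in which surviving activations are rescaled so that their expected value is unchanged; this is one of several equivalent formulations, and the function name and values are hypothetical.

```python
import random


def dropout(activations, drop_prob, rng):
    """Silence each unit with probability drop_prob and rescale the rest."""
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # seeded so that the example is reproducible
layer = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

# A different random subset of units is silenced on each training step.
for step in range(3):
    print(step, dropout(layer, drop_prob=0.5, rng=rng))
```

Because a different subset of units is active on every step, no single unit can be relied upon, which is the property the analogy with a shifting population of responsive neurons draws on.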
Overall, our work indicates that representations of fundamental, topographically organized stimulus features in sensory cortical areas have a dynamic nature and their maintenance should be considered as an equilibrium state. This highlights that healthy brain function, even in adulthood, is a constantly moving target.
Funding
This work was supported by research grant Deutsche Forschungsgemeinschaft CRC1080-C05 (SR), Deutsche Forschungsgemeinschaft SPP 2041 Project #347573108 (SR and MK), Deutsche Forschungsgemeinschaft/Agence nationale de la recherche Project #431393205 (SR), Deutsche Forschungsgemeinschaft DIP “Neurobiology of Forgetting” (SR and MK), EMBO Long-term fellowship ALTF-35-2015 (ARC), and a Focus Translational Neuroscience postdoctoral fellowship (ARC).
Conflict of interest statement: The authors declare no conflict of interest.
Data availability statement
Primary data accessible on G-Node (DOI: 10.12751/g-node.trwj8c). Code used for image processing available at Zenodo (DOI: 10.5281/zenodo.5822486).
Notes
We thank members of the Kaschube and Rumpel labs for helpful discussions.
Author notes
Present address: Institute of Basic Medical Sciences, University of Oslo, Oslo 0372, Norway.
Anna R. Chambers and Dominik F. Aschauer shared first authorship.
Matthias Kaschube and Simon Rumpel shared last authorship.