State-dependence of climate sensitivity: attractor constraints and palaeoclimate regimes

Equilibrium climate sensitivity (ECS) is a key predictor of climate change. However, it is not very well constrained, either by climate models or by observational data. The reasons for this include strong internal variability and forcing on many time scales. In practise this means that the 'equilibrium' will only be relative to fixing the slow feedback processes before comparing palaeoclimate sensitivity estimates with estimates from model simulations. In addition, information from the late Pleistocene ice age cycles indicates that the climate cycles between cold and warm regimes, and the climate sensitivity varies considerably between regime because of fast feedback processes changing relative strength and time scales over one cycle. In this paper we consider climate sensitivity for quite general climate dynamics. Using a conceptual Earth system model of Gildor and Tziperman (2001) (with Milankovich forcing and dynamical ocean biogeochemistry) we explore various ways of quantifying the state-dependence of climate sensitivity from unperturbed and perturbed model time series. Even without considering any perturbations, we suggest that climate sensitivity can be usefully thought of as a distribution that quantifies variability within the 'climate attractor' and where there is a strong dependence on climate state and more specificially on the 'climate regime' where fast processes are approximately in equilibrium. We also consider perturbations by instantaneous doubling of CO$_2$ and similarly find a strong dependence on the climate state using our approach.


Introduction
In order to estimate the anthropogenic impact on climate in the future, the response of the climate system to the present perturbation by greenhouse gases needs to be quantified. A frequently used measure for the response to changes in atmospheric CO 2 -concentration is the equilibrium climate sensitivity (ECS). This is defined as the increase in global mean surface temperature per radiative forcing change after the fast-acting feedback processes in the Earth System have come into equilibrium [Charney, 1979] 1 . Frequently it is estimated by climate model simulations, where the atmospheric CO 2 -concentration is doubled within a few decades, and equilibrium is assumed after typically 100-200 years; slow climate processes are kept stationary (and non-dynamic) in these model simulations. ECS is the benchmark quantity for climate models 2 but is still characterised by a considerable uncertainty of 1.5 -4.5 K per CO 2 -doubling [IPCC, 2013] and neither recent observations have narrowed down the range of expected climate change [Knutti and Hegerl, 2008]. It is also clear that the feedbacks, and hence the ECS will depend on climate state, e.g., Senior and Mitchell [2000], Gregory et al. [2004], Crucifix [2006], Andrews and Forster [2008], Yoshimori et al. [2011], Caballero and Huber [2013], von der Heydt et al. [2014]. Palaeoclimate studies have tried to use proxy records to independently constrain ECS [Rohling et al., 2012], but even by taking account of fast feedback processes that depend on the climate state von der Heydt et al. [2014], Köhler et al. [2015] it remains difficult to further constrain the range of expected climate warming. In particular, large temperature changes as a consequence of atmospheric CO 2 increase cannot be excluded.
The observed warming of the Earth involves both direct radiative forcing and a variety of (positive and negative) fast feedback mechanisms. These are mostly related to atmospheric water vapour content, sea-ice and cloud albedo and aerosol concentrations. On the slower (decadal) time scales ocean heat uptake also contributes to the radiative (im-)balance. Quantifying the (fast) forcing is therefore not an easy task and limits our ability to reduce the uncertainty on climate sensitivity [Schwartz, 2012]. More generally, the internal variability of the climate system on many time scales leads to a large uncertainty of the ECS. In fact, the most appropriate definition of ECS in the presence of natural variability and forcing is still under debate. Ghil and co-workers propose a non-autonomous stochastic approach in terms of random dynamical systems [Ghil, 2013, Chekroun et al., 2011, and suggesting that climate sensitivity corresponds to a derivative of a metric (Wasserstein distance) evaluated for the invariant measure with respect to some parameter that is changed. Other approaches include considering perturbations that are not necessarily in the linear regime: [Dijkstra and Viebahn , 2015] use a conditional nonlinear optimization approach to define climate sensitivity.
In this paper we discuss climate sensitivity as a property of the climate dynamics projected into the space of forcing R to global mean temperature T . In particular, Section 2 discusses ECS for the unperturbed system with slow variability in terms of pairs of points on or near the "climate attractor". We relate this to more usual concepts of ECS and use this as a way to discuss state dependence. In particular we discuss "climate regimes" such that the sensitivity is well constrained within a regime, but poorly constrained whilst switching between regimes. We also discuss possible perturbations and differentiate those that give return to the same climate attractor from those that do not. In Section 3 we explore these ideas using a specific conceptual model of the Earth system Tziperman, 2001, Gildor et al., 2002], that includes dynamic CO 2 and is able to simulate glacial-interglacial cycles as relaxation oscillations. In this section the model is also perturbed in various ways to obtain (state-dependent) distributions of climate sensitivity. We conclude in Section 4 with a discussion of some issues that arise related to the extraction and interpretation of ECS distributions from palaeoclimate records.
2 Climate sensitivity and dynamics in (T, R) space The usual approach to ECS is to consider the radiative energy balance for the global mean surface temperature T dT dt where R f we understand as the radiation due to (external) forcings, e.g. the radiation received from the sun R ins and the forcing R CO 2 due to the greenhouse effect of atmospheric CO 2 . The fluxes R slow and R f ast are contributions to the radiative balance due to a set of feedback processes in the climate system that are classified as slow and fast, respectively, usually relative to the typical time scale of the forcing. Finally, R OLW = −εσ B T 4 is the outgoing longwave radiation determined by the Stefan-Boltzmann law (with σ B the Stefan-Boltzmann constant and ε the emissivity of the atmosphere). The equation (1) suggests a simple (T, R) relationship but in reality it is hard to unpick this relationship, not least because fast feedbacks may potentially give multiple attractors when the slow feedbacks are fixed.
Here, we simply take the approach that T and R are quantities that one can (in principle) observe from a complex system and we investigate the relationship between them. Given two climate states with forcings R f,1 and R f,2 = R f,1 +∆R and temperatures T 1 and T 2 = T 1 +∆T , we consider climate sensitivity as the ratio of temperature change to radiative forcing change (including slow feedbacks): If there is a functional relationship of the form T = T (R) for R = R f + R slow then in the limit of small differences in R we expect In the (hypothetical) case that all slow processes are known and quantified and that a timescale separation between slow and fast processes exists, S is the usual ECS. However, we take the approach that (2) can still be studied when no functional relation T (R) exists, and this naturally leads to distributions of ECS. If we consider only CO 2 as forcing and the land-ice albedo feedback as slow process (i.e. R f + R slow,1 = R [CO 2 ,LI] ) then the specific climate sensitivity is the ratio LI] .
A priori it is not clear how (4) . (5) By considering a range of possible reference times t ref and delays δ there will be a distribution of sensitivities. An alternative approach is to assume there is a stationary measure (or distribution) µ of points in the (T, R [CO 2 ,LI] )-plane weighted according to how often they are visited over asymptotically long times, i.e. we assume that is independent of t and typical initial condition, where (B) is the length of the set B ⊂ R: see Figure 1a. The measure µ can be thought of as a projection of a natural measure on the attractor onto the two observables (T, R [CO 2 ,LI] ): it gives a distribution of associations between T and R [CO 2 ,LI] . This distribution 3 naturally leads to a distribution of climate sensitivities by picking pairs of points (R [CO 2 ,LI]1,2 , T 1,2 ) that are independently distributed according to µ and evaluating (4). In other words, for any (measurable) A ⊂ R we can use µ to assign a probability to the sensitivity being in A: Note that (5) can be determined from a time series that does not necessarily explore the full attractor, while (7) considers states purely depending on the locations in the (T, R) plane. In section 3, we give an example showing that the approaches (5) and (7) can give similar distributions when considering a wide range of δ and initial points.

Climate sensitivity, regimes and responses to perturbation
A study of palaeoclimate records, for example the ice age cycles of the last 800 kyr, shows the presence of markedly different "regimes" of climate, namely periods of slowly varying climate and rapid transitions between these regimes (the deglaciations). We wish to evaluate the sensitivities associated within one regime, and associated with changing regimes. For definiteness, we only consider climates with two regimes -a cold (C) and a warm (W ) regime -in this paper. If one partitions the attractor into two regimes in state space, this implies a partition of µ into two distributions Evaluating the distribution of climate sensitivities corresponding to choosing typical endpoints relative to these distributions allows one to examine the sensitivities within regimes. In particular one can define conditional distributions of sensitivities of the "warm" (and similarly the "cold") states by where S [CO 2 ,LI] is as in (4). There are conditional sensitivities associated with regime changes, for example from C to W this is and the distribution (7) can be thought of as the sum of the conditional distributions for S W W , S CC , S CW and S W C . Note that from the definition above even if physically and when time progresses the CW transition is different than the W C transition. For an optimal choice of regimes one would aim to ensure that the distribution of sensitivities within each regime is tightly localised, while those associated with regime changes may be poorly localised. A regime could, therefore, be defined as a region in (T, R) space where the T (R) relation is almost linear. We now consider response to two types of instantaneous perturbation. The first type of perturbation does not structurally change the system or leave the basin of the current attractor: the response shows transient decay back to the attractor followed by continued motion on the same attractor. The response to such a perturbation includes the possibility of switching between regimes of the attractor, depending on the initial point on the attractor where the perturbation is applied. Figure 1b illustrates this schematically. In such a case, the distribution of sensitivities will potentially depend on the timescale δ of interest and the initial time t ref . However we expect it to decay to the (regime dependent) sensitivity for large δ.
The second type of perturbation either structurally changes the system attractor, or is large enough to place the state in the basin of a different attractor. In either case the response will approach a new attractor as illustrated in Figure 1d. In this case the distribution of sensitivities obtained by comparing initial and final states may not resemble regimes of either attractor.
Finally, we mention another approach to perturbation that is particularly useful for shortterm prediction: Figure 1c starts with a localised (say Gaussian) distribution µ 0 centred on a perturbed reference state at some time t ref , and propagate this forwards to time t ref + δ for some δ > 0. The initial distribution will spread to give a localised measure µ δ that gives a distribution of possible sensitivities via (4), which again may depend on the timescale δ, which again may depend on the timescale δ. ECS derived from palaeoclimate records typically reflects the climate sensitivity on the attractor (Fig. 1a), while model-determined ECS usually involves some type of perturbation (away from the attractor). In this sense, palaeo ECS and model ECS are conceptually different. In the next section, we explore how and when these two concepts can still give similar disributions, using a conceptual Earth System model.

Climate sensitivity in a conceptual climate model
The conceptual model of the climate system of Gildor and Tziperman [2001], Gildor et al. [2002] has been shown to simulate the glacial-interglacial transitions; the model equations are given in Appendix A. In this model, the atmosphere is represented by 4 meridional boxes while the ocean component consists of two layers of 4 meridional boxes each. The model includes land ice, sea ice and carbon-cycle effects, such that the atmospheric CO 2 concentration is a dynamic variable in the model. The model contains one dynamic fast feedback, namely the sea ice-albedo feedback evolving on (sub-)decadal time scales, and one slow feedback, the land ice-albedo feedback, which evolves on the order of millennial time scales. On the decade-to-century time scale the model includes an additional process in the surface radiative balance due to heat exchange between ocean and atmosphere. All other fast feedbacks (water vapor, clouds, aerosols, lapse rate) are represented by a fixed temperature response to the radiative forcing in the system. In this case, as discussed in Appendix B there is only one active slow feedback process, and the specific climate sensitivity parameter S [CO 2 ,LI] (4) represents the model's ECS. Orbital forcing is included in the model through varying incoming solar radiation averaged over each atmospheric box on seasonal and orbital time scales and modulating the Northern Hemisphere land ice ablation term by the (northern polar box averaged) summer insolation on orbital time scales [Gildor and Tziperman, 2000].
The atmospheric CO 2 -concentration in the full model system deserves further discussion because it can be viewed as both a forcing and a feedback. While the dynamic CO 2 is not essential for generating the glacial-interglacial cycles in the model, it feeds back on their amplitude; during cold periods when land ice is growing, reduced vertical mixing in the Southern Ocean and extended Southern Ocean sea ice cover leads to reduced atmospheric CO 2 [Gildor et al., 2002]. The exchange of CO 2 between ocean and atmosphere is fast (on time scales of a decade), however, the vertical mixing of surface-to-deep water masses in the Southern Ocean is affected by the temperature of the North Atlantic deep water, which evolves on slower time scales. The associated feedback process therefore acts on various time scales. When determining climate sensitivity, CO 2 is generally assumed a forcing. Here, it is important to keep in mind that it also can be viewed as a feedback as has been observed also in other models [Scheffer et al., 2006] and will be the case in the real climate system, particularly on long (geological) timescales.

Glacial-interglacial cycles as relaxation oscillations
We first analyse a simulation with the climate model including prognostic pCO 2 and Milankovitch forcing. The simulation is started 500 kyr ago from initial conditions that are assumed to be close to the attractor and it is run up until the present day. The simulated glacial-interglacial cycles show a peak-to-peak global mean temperature difference of up to 4 K (Fig. 2a). Corresponding CO 2 differences are about 75 ppmv, which here are completely generated by the effect of the solubility pump in the ocean.
In this model the fast sea ice-albedo feedback is responsible for the abrupt glacialinterglacial variations -the so-called sea ice switch mechanism as suggested by Gildor and Tziperman [2001]. The sea ice switch mechanism generates the glacial cycles in the model as self sustained relaxation oscillations because the ice volume thresholds for switching sea ice cover 'on' and 'off' differ [Crucifix, 2012] (see Fig. 3a) and we use this to define two climate regimes, a cold C regime with extensive sea ice cover and a warm W regime without sea ice. When the land ice volume slowly grows (accumulation exceeds ablation), the atmospheric and surface ocean temperature decrease due to increasing albedo of the planet.
Once the polar surface ocean temperature has reached a critical value cold enough to form sea ice, the polar box is rapidly covered with sea ice, which further reduces the atmospheric temperature through the ice-albedo feedback and prevents evaporation from the polar ocean box (Fig. 2b). In addition, atmospheric moisture content is reduced due to lower temperatures, which leads to decreasing land ice volume (accumulation is smaller than ablation, Fig. 2c). Temperature starts rising again both due to smaller albedo and because the ocean warms below the insulating sea-ice cover until it is warm enough to melt the polar sea ice, Fig. 2d). At this point there is a change in regime: the global temperature quickly rises, moisture content in the atmosphere increases and the land ice starts growing again (accumulation becomes larger than ablation).
In this model, Milankovitch forcing is not necessary to generate the glacial-interglacial cycles, but it modifies them and makes them more irregular. Although there is some degree of synchronisation of the glaciation and deglaciations to the orbital forcing, the relation between land ice and global mean solar radiation is not trivial (Fig. 3b). Milankovitch forcing mainly modulates the (otherwise constant) ablation of the northern hemisphere land ice, and therefore, while the land ice is growing and ocean temperatures decreasing in some (slightly warmer) periods, land ice accumulation becomes smaller than ablation and the ice growth and ocean cooling is reversed for a while (Fig. 2c).

Climate sensitivity for unperturbed climates
If one tries to determine climate sensitivity from past climate records, there is only one temporal realisation of a trajectory on the climate attractor that can be measured: perturbations away from the attractor cannot be performed. Defining the climate sensitivity in terms of the measure on the climate attractor (see Fig. 1a), we need to consider the relation between temperature T and radiative forcing due to CO 2 and land ice (the only slow process in the climate model). Fig. 4a,b shows the probability density of (T, R) combinations for the 500 kyr trajectory discussed above, obtained by box-counting the frequency of visits to a uniform discretization of this range of T and R into 120 × 120 cells (we remove a transient of length 10,000 yrs). This empirical distribution can be seen as an approximation of µ in (6).
In the special case of this relation being a linear function T (R), the climate sensitivity is given by the slope of this line and constant for all climate states. However, it has been previously shown that in this climate model the climate sensitivity is strongly state dependent due to the fast sea-ice albedo feedback changing in strength between different climate states [von der Heydt et al., 2014], which allows the definition of a local climate sensitivity (for a reference climate state). Indeed, in Fig. 4b there appear to be regimes where the (T, R) relation is close to a (linear) function, but particularly in the transition region from glacial to interglacial states the sensitivity is less well defined or even negative.
From this data we first estimate the slope of the relation S local between T and R [CO 2 ,LI] = R [CO 2 ] + R [LI] (Fig. 4b) by a linear regression on the warm (W ) and cold (C) parts of the data, giving S W local = 0.45K (W m −2 ) −1 , and S C local = 0.54K (W m −2 ) −1 , respectively. In the regression all points with T ≤ 12 • C are considered for S C local and points with 12.5 • C≤ T ≤ 14.5 • C for S W local , respectively. Note that the temperature classification divides the data into climate states without northern hemisphere sea ice (W ) and those where sea ice is present (C). The physical explanation for the higher sensitivity during the colder part of the data is that the presence of sea-ice in the cold climate states leads to a stronger sea ice-albedo feedback.
As can be seen by the density of points in Fig. 4b, even the almost linear parts of the (T, R) relation are not a (smooth) function: there is a distribution of slopes for each climate state. In  Fig. 6b-d. Comparing only W states (without sea ice) leads to generally lower sensitivity, with its mean close to the value determined by the linear regression of only W states and a rather narrow distribution. Similarly, comparing only C states (with variable sea ice) results in somewhat higher sensitivities and a larger spread around the mean (which is again close to the linear regression of the C states). The larger spread is not surprising given that the (T, R) relation in the C regime is clearly nonlinear.
The plots in Fig. 6e-h correspond to Fig. 6a-d but are calculated using (7) and the approximate measure µ whose density is shown in Fig. 4b. Note the similarity of the distributions found by both methods. The largest (and most negative) values of S [CO 2 ,LI] in (Fig. 6a,e) originate from the cross-comparison of C and W regimes (Fig. 6d,h). On the other hand, the means of S CC LI] agrees well with S W local = 0.45K (W m −2 ) −1 . This suggests (a) the discretization used to approximate µ is sufficient to capture the main features of the regime-dependent sensitivity and (b) that the reference times and delays considered sample the distribution µ well and so distributions of sensitivities given by (7) and (5) are comparable.

Climate sensitivity for perturbed model climates
If we consider climate sensitivity as a local property of a natural measure on the climate attractor as illustrated in Fig. 1a, we need to explore the set of points in the (T, R)-plane that the model visits over the glacial-interglacial cycles. In climate models used for future prediction the usual approach is to perturb the system in some way (e.g. double pCO 2 ) and study the response to this perturbation after some time. An initial distribution µ 0 evolves over a certain time scale δ after a perturbation away from the attractor has been applied as illustrated in Fig. 1b,c. The distribution of sensitivities can be found by perturbing the system instantaneously, assuming that the perturbation is not too large and the system returns to the same attractor after a transient. Note, however, that this approach requires a different type of perturbation than what is usually applied in climate models; the standard procedure in GCMs is to consider a prescribed (non-dynamic) CO 2 -doubling as a perturbation, where in fact a different attractor than that of the full Earth system (including dynamic carbon cycle) is explored. In this section, we derive the model's ECS in response to a perturbation to the initial atmospheric CO 2 including a dynamic carbon cycle, and evaluating the temperature response to this initial perturbation at different times (following the approach illustrated in Fig. 1c).
From the 500kyr model time series shown in Fig. 2 we chose two initial conditions, one in the W regime and one in the C regime with extensive sea ice. The CO 2 is doubled initially in the atmosphere and the extra CO 2 added by the doubling is uniformly subtracted from the ocean boxes, in order to conserve the total amount of carbon in the model system. Note that the atmospheric CO 2 in this model is purely determined by the biological pump in the oceans [?]. Both perturbed initial conditions are run till present day (time 0), time series are shown in Fig. 7 together with the unperturbed time series (as in Fig. 2). The time series starting from the W state quickly returns to the same temperature, CO 2 , sea-and land-ice time series (dotted lines in Fig. 7), where glacial inceptions and deglaciations occur at exactly the same time as in the unperturbed simulation (thin solid lines). In contrast, the perturbed time series starting from the C state (dashed lines) does not return to the same unperturbed time series; the sea ice present initially is melted by the initial warming, and the system undergoes a transition to the W regime. While on the long term, the variation in temperature, CO 2 , sea ice and land ice covers the same range as in the unperturbed simulation, the timing of glacial inceptions and deglaciations is different from the unperturbed simulation. This suggests that the applied perturbation is indeed small enough such that the system returns to the same attractor as can be seen in the left panel of Fig. 8. When the initial condition lies in the C regime, the perturbation induces a regime switch, after which the same attractor is explored, but on a different trajectory.
We have also applied a modified perturbation, where the total amount of carbon is not conserved; atmospheric CO 2 is doubled, while the oceanic CO 2 is unchanged. In this case, the result of the perturbation is to shift the attractor towards higher temperatures, higher atmospheric CO 2 and slightly different amounts of land ice, as shown in the right panel of Fig. 8. This type of perturbation might reflect more realistically the present-day climate change situation assuming that the carbon injected into the system originates from a geological reservoir. However, while the attractor remains very similar in shape in this case but is shifted in phase space, other components of the model system such as the land ice might need adaptations of their parameters. The situation reflects the one depicted in Fig. 1d and we will not further discuss the response to this type of perturbation but instead focus on the situation, where the climate system returns to the same attractor after the perturbation (Fig. 8 left panel).
Starting from 250 different initial conditions chosen along the 500 kyr time series shown in Fig. 2a (one initial condition every 2000 years), the model is integrated for 500 years to give control runs of temperature T cntrl(i) (t) and radiative forcing time series R cntrl(i) (t), respectively, where the index i = 1, ..., 250 denotes the initial condition. The initial CO 2 concentration pCO 0 2 in these simulations varies between 210 -290 ppm, while the global mean temperature varies between 10.8 -14.9 • C. A second set of simulations is performed, where the initial value of the CO 2 is doubled and then the model is integrated for 500 years, giving T pert(i) (t) and R pert(i) (10) Fig. 9 shows time series of CO 2 , temperature, Northern Hemisphere sea ice cover, the ocean meridional overturning circulation strength and S perturb for a few of the ensemble members (both control and perturbed experiments). Clearly, the different time scales in the system become evident; the global mean atmospheric temperature reacts quickly to the elevated CO 2 -level, and for those initial states that have sea ice, the sea ice melts within 10-20 years.
As the CO 2 is dynamic in these simulations, the increased CO 2 -gradient between ocean and atmosphere leads to a rather fast initial reduction in atmospheric CO 2 (timescale of ∼ 10 years [Gildor et al., 2002]), which then keeps decreasing on a longer time scale. After 500 years, temperature and CO 2 are almost back to their original values if the initial condition was within the W regime. However, the initial conditions within the C regime involve a regime shift and do not return to the same temperature and CO 2 -level within 500 years. The strength of the meridional overturning circulation in the ocean weakly responds to the CO 2 perturbation on a slower timescale as can be seen in Fig. 9d. The time-dependent climate sensitivity S perturb is shown in Fig. 9e; here the different behaviour of the C and W states becomes particularly evident: while the response to the perturbation of the W states (red lines) seem to approach an 'equilibrium' value with some spread, increasing in time, the C states (blue lines) produce a wide range of responses depending on where the attractor is met after the perturbation. Snapshots of distributions of S perturb are shown in Fig. 10 for 100, 200 and 500 years after the perturbation. A fast-process equilibrium should be expected after about 100-200 years, however, the spread in S perturb also increases with time, in particular for the C regime. S perturb of the W regime is similar to S W W [CO 2 ,LI] after 100 and 200 years, but further spread out after 500 years. On the other hand, S perturb of the C states resembles S CW [CO2,LI] already after 200 years, and afterwards spreads out even further. In this ensemble, S CC [CO2,LI] is in fact never observed, because the perturbation always induces a C − W transition.

Conclusions
In this paper, we have considered climate sensitivity as a local property of a climate attractor, in particular it is a property of a projection of a measure of this attractor on the (T, R) plane. This naturally leads to distributions of climate sensitivity for every radiative forcing, and if the attractor shows different regimes of special climate dynamics, state dependence of climate sensitivity can be explained in terms of regimes. We have explored this in a conceptual Earth System model with the aim to test how climate sensitivity derived from palaeoclimate records might be compared to model-derived counterparts. Conceptually, climate sensitivity is defined differently in these two situations; while palaeoclimate timeseries reflect trajectories on the climate attractor, in model simulations generally perturbations away from the attractor are applied. Moreover, climate models include only a limited amount of processes (usually the slower processes are fixed, as is the carbon cycle), which means that a different attractor may be explored by the models.
Clearly, we cannot expect to get reliable quantitative conclusions about the distribution of ECS from the low order conceptual model used for this study. Many important processes in the climate system (such as the impact of T on cloud formation) are absent from the model, which was constructed in Tziperman, 2001, Gildor et al., 2002] with the aim of explaining ice age pacing rather than the link between T and CO 2 . Even those processes that are included are open to debate; for example, the sea ice cover changes in the model are about 1.5 times larger than suggested by proxy data [Köhler et al., 2010], while Northern Hemisphere land ice cover changes are smaller (Fig. 2b). Moreover, the climate sensitivity derived from the model is higher during glacial periods, because the fast sea icealbedo feedback is stronger in those regimes. Proxy data suggest, however, higher climate sensitivity during warm periods [von der Heydt et al., 2014, Köhler et al., 2015, most likely because a combination of other fast feedbacks (such as water vapour, cloud feedbacks, etc.) may be stronger during warm climates.
Nonetheless, even for this model, the presence of variability on a number of timescales and regimes within the attractor gives clear and we find non-trivial dependence of sensitivity on regime. This suggests that it could be useful to think of the unperturbed climate sensitivity (which can be determined from palaeoclimate data) as a property of the "climate attractor". For a perturbed system (we have considered instantaneously doubled CO 2 ), which is the normal approach in climate models, this is still useful once an initial transient has decayed. This transient will depend in particular on ocean heat uptake, though also on carbon cycle and biosphere processes that act on time scales roughly equivalent with the forcing time scale. In the case of a regime shift (either natural or induced by perturbation) the spread in climate sensitivity become very large. If the climate system has more than one attractor, the perturbed system may clearly evolve to a completely different set of states than the original attractor -a situation that does not occur in the climate model used here. In less extreme cases, we cannot rule out very long transients (associated with slow feedbacks) for some perturbations.
In most climate sensitivity studies, feedback processes are considered except those related to the carbon cycle. In the history of climate, those processes are active, however, on many different timescales. In our conceptual model, we have included the part of the carbon cycle that is related to the soft-tissue biological pump in the oceans and air-sea CO 2 exchange. The resulting CO 2 variations in the model's glacial-interglacial cycles are in the range of the observed glacial to interglacial CO 2 changes and amplify the glacial-interglacial cycle while they are not necessary to generate those cycles. Accordingly, when exploring climate sensitivity from perturbation experiments with the same model, we have instantaneously doubled CO 2 and kept the model's carbon cycle active. This procedure ensures that the perturbation experiments eventually return to the same attractor as the unperturbed system. Such perturbations (illustrated in Fig. 1b,c) are not normally applied in climate models used for climate predictions [IPCC, 2013], where climate sensitivity is derived from model simulations considering prescribed, non-dynamic atmospheric CO 2 . In our conceptual model, we have also examined climate sensitivities from a classical climate model perturbation (not shown); CO 2 is doubled within the first 30 years of the simulation and kept fixed afterwards for 200 years. In this case we find significantly lower sensitivities and smaller spread than for S perturb obtained from doubling CO 2 with dynamic CO 2 . This emphasises the importance of including dynamic carbon cycle processes into climate projections. Moreover, it supports the idea that the real observed climate response may indeed be larger than the model predicted one, because those models never will include all feedback processes in the climate system. Furthermore, the carbon cycle includes more timescales and processes than considered here in this simple model. For example, processes related to the ocean-seafloor system include carbonate compensation and silicate weathering, which act on much longer timescales and have been suggested to be responsible for a mean atmospheric lifetime of anthropogenic CO 2 of 30-35 kyr [Archer, 2005].
When deriving climate sensitivity from palaeoclimate records it is important to take account of potential state dependence and different climate regimes before drawing conclusions on the ECS distribution that may be relevant for future climate evolution. For the conceptual model we consider, the long tail in the ECS distribution from the unperturbed (palaeoclimate) timeseries mostly results from the cross-comparison of states within different regimes (CW/W C). Similarly, the applied perturbation in the model always induced a regime transition and consequently large ECS values if the initial condition was in the C regime, but not in the W regime. In the context of our model, these high ECS values would not be relevant for the present climate continuing the current regime. On the other hand, if the present climate is in a regime that is susceptible to a regime shift (either natural, or due to anthropogenic 'perturbation') very large ECS values may be possible and indeed relevant. By studying data and models of warmer-than-present climates in the palaeorecord we may be able to achieve information on potentially warmer climate regimes existing for perturbed versions of the climate attractor.

A Model equations A.1 Ocean and sea ice
The ocean consists of 2 layers of 4 meridionally oriented boxes, where the polar boxes extend from 45 • to the pole and the equatorial boxes from the equator to 45 • , with meridional lengths L 1 , L 2 , L 3 , L 4 the same as the atmopsheric boxes. All tracers such as temperature T , salt S and biogeochemical variables are averaged over the two equatorial boxes, such that in fact the dynamics is determined by only three meridional boxes. The two vertical layers have thicknesses D upper and D lower , respectively. The ocean model dynamics includes a simple frictional horizontal momentum balance, is hydrostatic and mass-conserving: Here, (y, z) are the meridional and vertical coordinates and (v, w) the corresponding flow velocities, respectively. p is the pressure, g the gravitational constant, ρ 0 a reference density and r a friction coefficient. In each box, temperature T and salinity S determine the density via the full nonlinear equation of state as recommended by UNESCO [1981]. Temperature and salinity are determined by the following balances: where K h and K v are horizontal and vertical diffusion coefficients, respectively. As in Gildor et al. [2002], the vertical mixing of any tracer tr (e.g., temperature, salinity or ocean CO 2 ) in the southern polar box is dependent on the vertical stratification: where (σ t deep − σ t surf ace ) ∼ dρ/dz. In addition, upper and lower bounds of 280 and 1 Sv are imposed on the vertical mixing rates K 0 v (σ t deep − σ t surf ace ) −1 . Vertical mixing rates between the other surface and deep boxes are set constant, 0.25 Sv for the two equatorial boxes and 5 Sv for the northern polar box. The meridional overturning circulation is treated in the same way as in Gildor et al. [2002], with the upwelling through the southern polar box set to a fixed value of 16 Sv and the downwelling through the northern polar box determined by the meridional density gradient between the northern equatorial and polar ocean boxes.
The Q terms in the above equations are fluxes from other components of the climate model: Q atm T is the atmosphere-ocean heat flux due to sensible, latent and radiative fluxes: where C pw is the heat capacity of water, θ the temperature of the atmospheric box above, γ the insolation effect of a layer sea ice of thickness D sea−ice . f ow and f si are the fractions of the ocean that are open water and sea ice covered, respectively, with f ow = 1 − f si . The time scale τ is chosen such that the ocean heat transport into the northern polar atmopsheric box is about 2.3PW during interglacial periods as in Gildor and Tziperman [2001]. Precipitation P and evaporation E is converted into an equivalent salt flux: with S 0 a reference salinity. Heat and salt fluxes due to sea ice formation or melting are formulated as: where V ocean is the volume of the ocean box, T sea−ice is the temperature threshold where sea ice forms, L f is the latent heat of fusion, ρ sea−ice the density of sea ice and τ sea−ice is a short time scale to ensure that the ocean temperature remains close to the freezing temperature as long as sea ice is present. Sea ice is assumed to grow in area with an initial thickness of 3 and 1.5m in the northern and southern polar boxes, respectively, until the whole box is covered. The volume of sea ice in the polar surface boxes V sea−ice is given by: P on−ice is the amount of sea ice forming due to atmopsheric precipitation falling on the ocean area covered with sea ice.

A.2 Atmosphere
The atmospheric model follows that used in Gildor et al. [2002], with 4 atmospheric boxes above the ocean boxes. The lower surface of each atmospheric box can be either land or ocean, and both can be partly covered with (land or sea) ice. The box-averaged potential temperature is calculated from the energy balance of the box, balancing incoming solar radiation (with a box albedo determined from the relative fraction of each lower surface type in the box), outgoing longwave radiation at the top of the atmosphere, air-sea heat flux and meridional atmospheric heat transport. In each atmopsheric box, the temperature θ is determined by the difference between the heat flux at the top of the atmosphere F top and at the surface F surf ace , following the equation: where are the incoming and outgoing radiation terms at the top of the atmosphere, respectively. (R is the gas constant for dry air, C p is the specific heat of the atmosphere at a constant pressure, P 0 a reference pressure, σ B the Stefan Boltzmann constant and g the gravitational acceleration.) The incoming solar radiation Q Solar for each box is assumed to vary with season and due to orbital variations as in Gildor and Tziperman [2000]. Furthermore, Q Solar is reduced by a constant cloud albedo term α C and a part q seaice in that is directly used to melt sea ice; where sea ice exists, 15% of the incoming shortwave radiation is used to melt sea ice and does not enter the radiation balance of the atmosphere [Gildor et al., 2002]. α surf is the surface albedo of the box and is determined by the fraction of sea ice, land ice, land surface and ocean surface in that box: Here, f L , f LI , f O , f SI correspond to the fraction of land, land ice, ocean and sea ice, respectively, and α L , α LI , α O , α SI to the corresponding albedos of each surface type. The outgoing radiation depends on a mean emissivity of the box ε and a term depending on the atmospheric CO 2 concentration. Here κ is chosen [Gildor et al., 2002] such that a doubling of CO 2 will cause a radiative forcing of 4 Wm −2 . F in merid − F out merid is the net heating due to meridional heat fluxes between the atmospheric boxes. Meridional heat transport between boxes is calculated as: where the coefficient K θ is chosen such that the meridional heat transport between the two northern boxes is about 2.2 PW during interglacial periods [Gildor and Tziperman, 2001].
No net heat flux is assumed over land and land ice, therefore F surf ace includes only the ocean-atmosphere heat exchange. The meridional moisture transport F M q between the atmospheric boxes is parameterised as: where q is the humidity of the box. A constant relative humidity is assumed, with the saturation humidity at temperature θ calculated from an approximate Clausius-Clayperon equation: Over land ice in the polar boxes, another source of precipitation is the local evaporation of that part of the ocean box that is not covered by sea ice, with flux: The total precipitation in each box is then given by Precipitation falling over land or sea ice is assumed to turn into additional ice.

A.3 Land ice
The equations for the land ice sheets follow those of Gildor and Tziperman [2001], with the mass balance The source term L source depends on the amount of precipitation falling over existing ice (or falling on the 0.3 poleward area of the box even if there is no glacier there): where L area is the land area in the box, LI area the ice sheet area and box area the total area of the box. The ice sheet can shrink as a consequence of ablation. The ablation term is assumed a constant C LI [Gildor and Tziperman, 2001] plus a modulation by the summer Milankovitch forcing [Gildor and Tziperman, 2000]: where Solar June − Solar ave,June is the anomaly in summer insolation in this box relative to the average over the past 1 Myr. Southern hemisphere ice sheets are assumed constant.

A.4 Biogeochemistry
In the ocean boxes additional tracers are advected for total CO 2 (ΣCO 2 ), alkalinity (A T ) and phosphate PO 4 . These are used to calculate atmopsheric pCO 2 , see Gildor et al. [2002]. The equations for the three biogeochemistry variables Bio in each ocean box follow: with additional source/sink terms S Bio for these variables in the surface boxes: and in the deep boxes below: EP and RR stand for export production and rain ratio, respectively, and R C , R N for the ratio P : C and P : N in particulate organic matter, respectively. [CO 2,a ] is the saturation concentration with regard to the partial pressure of CO 2 in the atmosphere, and [CO 2,o ] is the CO 2 concentration in the ocean. The flux of CO 2 between ocean and atmosphere A openwater is linearly related to the pCO 2 difference between the atmosphere and the surface ocean via a constant piston velocity P V , giving a time scale of about 10 years for this gas exchange. For more details on the biogeochemistry module, see Gildor et al. [2002].

B Climate sensitivity in the model
Climate sensitivity is determined from the energy balance of the Earth. For the conceptual model [Gildor and Tziperman, 2001], we can explicitly write the energy balance of the atmosphere and extract the different contributions to climate sensitivity. Averaged over all atmospheric boxes of the model the global mean temperature T = 4 i=1 area i area θ i is determined by the difference between the heat flux at the top of the atmosphere F top and at the surface F surf ace (see previous section), where area i , (i = 1, ..., 4) is the surface area of the 4 boxes and area is the total surface area of the Earth.
To access the contributions of the different forcings and feedbacks to the radiation balance, we split the global mean radiation terms into the different components due to solar radiation (R [ins] ), land ice (R [LI] ), sea ice (R [SI] ), outgoing longwave radiation (R [OLW ] ), CO 2 concentration (R [CO 2 ] ) and the radiation at the Earth's surface (R [surf ] ): The different contributions to the radiation balance can be expressed as: When comparing two equilibrium climate states with global mean temperatures T 1 and T 2 (and ∆T = T 2 − T 1 ), the radiation balance Eq.42 reads: As we consider constant solar radiation and no changes in cloud albedo, ∆R [ins] = 0, and, when we put all the forcing or slow feedbacks on the left hand side and all fast feedback processes on the right hand side, we obtain: This finally leads to the expressions for the specific climate sensitivities = −∆T ∆R [OLW ] + ∆R [surf ] .
The last expression should approximate the sensitivity without feedbacks (i.e. only Planck feedback), In the model there is, however, one more radiation term due to the atmosphere-ocean heat exchange (∆R surf ), which acts on fast to intermediate time scales. Therefore, S [CO 2 ,LI,SI] still slightly deviates from the Planck sensitivity.
in the Netherlands. AH thanks CliMathNet (sponsored by EPSRC) for travel support to meetings that facilitated this work. We thank the Lorentz Center in Leiden for organising a "Workshop on Climate Variability: from Data and Models to Decisions" in 2014 where these ideas were first discussed, and the EU ITN "CRITICS" for providing a further opportunity to discuss this research. R T T (a) (b) Figure 1: Schematic diagram showing global mean temperature T versus radiative forcing R due to atmospheric CO 2 . (a) In the presence of natural forcing we assume there is a stationary distribution µ (shown by grey scale) in the (T, R)-plane -this is the projection of a dynamical measure onto this plane and can be divided into two climate regimes for the slow dynamics (shown here as C and W states), linked by fast changes (shown as f ). Picking two points relative to this measure gives a distribution of slopes that quantifies long-term variability of climate sensitivity for this forcing. (b) A small impulsive change of R that does not structurally change the system (dashed line) takes the system state away from the attractor. If the perturbation does not change the attractor, after a transient (small arrow), we expect to continue to explore the plane according to the distribution µ. Depending on where the perturbation is applied and its size, the response may involve a switch between different regimes of the attractor (see the perturbation applied to C state). (c) A small impulsive change R (dashed line) moves an initial distribution µ 0 to a new location µ 0 or µ 0 away from the attractor. After some time δ > 0 we reach a perturbed distribution µ δ or µ δ : these may be in different regimes depending on the initial state and strength of the perturbation. (d) A large, or structural change to the system will give a new attractor and a different set of asymptotic states.  CW/WC h Figure 6: Distributions of the specific climate sensitivity S [CO 2 ,LI] . The left panels (a-d) use (5) and a long glacial-interglacial simulation for a range of reference times and a range of delays greater than 500 yr. The right panels (e-h) use (4) and the approximation of µ in Figure 4b. Values of S outside the range [−1, 3] are truncated to the endpoints of the domain. (a,e) All climate states are considered; (b,f) Only pairs of climate states that are both in the W regime (no sea ice) are considered; (c,g) Only pairs of climate states that are both in the C regime (sea ice present) are considered; (d,h) Only pairs of climate states, where one is in the W regime and the other is in the C regime are considered. Observe that within each regime the distribution appears to be fairly tightly defined, whilst the CW/W C transitions have very long tails. Moreover, the distributions from the two methods give comparable results both within and across regimes.  Fig. 3a. We apply two different types of perturbations: left panel The total amount of carbon in the model system (ocean and atmosphere) is conserved. When doubling the CO 2 concentration in the atmosphere, the same amount of CO 2 is removed from the ocean. After an inital transient, the model returns to the same attractor as schematically illustrated in Fig. 1b; right panel The atmospheric CO 2 is doubled without compensating in the ocean, meaning that extra carbon is added to the model system. In this case, the perturbed simulations return to an attractor that has a similar shape, but is shifted to higher CO 2 levels and global mean temperatures (see also Fig. 1d). Figure 9: Climate sensitivity S perturb from perturbation experiments with dynamical CO 2 . Shown are time series of experiments, where CO 2 is doubled initially and free to evolve until year 500 (coloured lines) along with the control experiments, where CO 2 is not doubled initially (black dashed lines). The ensemble starts from 250 initial conditions taken from the glacial-interglacial time series (500 kyr, shown in Fig. 2a). In this figure we show 10 ensemble members, the blue lines have sea ice initially, while the red lines have no sea ice and darker red indicates warmer initial temperature. Climate sensitivity is determined following eq. 10. (a) Atmospheric CO 2 ; (b) Global mean surface temperature; (c) Northern hemisphere sea ice fraction; (d) Strength of the ocean meridional overturning circulation; (e) S perturb . Figure 10: Climate sensitivity S perturb from perturbation experiments with dynamical CO 2 . Shown distributions of S perturb (cf. eq. 10) at different times after the perturbation. The ensemble starts from 250 initial conditions taken from the glacial-interglacial time series (500 kyr, shown in Fig. 2a). The right panel in each plot shows the full ensemble, the middle panel shows only those initial states that have sea ice (classified as C), and the right panel shows the warm initial states without sea ice (classified as W ). (a) S perturb after 100 years; (b) S perturb after 200 years; (c) S perturb after 500 years.