Mathematical modelling of cortical neurogenesis reveals that the founder population does not necessarily scale with neuronal output

The mammalian cerebral neocortex has a unique structure, composed of layers of different neuron types, interconnected in a stereotyped fashion. While the overall developmental program seems to be conserved, there are divergent developmental factors generating cortical diversity amongst mammalian species. In terms of neuronal output some of the determining factors are the size of the founder population, the duration of cortical neurogenesis, the proportion of different progenitor types, and the fine-tuned balance between self-renewing and differentiative divisions. We develop a mathematical model of neurogenesis that, accounting for these factors, aims at explaining the high diversity found across mammalian species. By framing our hypotheses in rigorous mathematical terms, we are able to identify paths of neurogenesis that match experimentally observed patterns in mouse, and to extrapolate the corresponding patterns in macaque and human. Additionally, we use the model to identify key parameters that would benefit from accurate experimental investigation. We find that the timing of switch to symmetric neurogenic divisions produces the highest variation in neuronal output. Additionally, we find that an increase in neuronal output does not necessarily reflect a larger size of founder population, and we suggest the need for further experimental quantifications to test our predictions.


Introduction
The mammalian cerebral neocortex has a unique structure, composed of layers of different neuron types, interconnected in a stereotyped fashion (Brodmann 1908;Cajal 1909;Rakic 2008;Amunts and Zilles 2015). The cortex is linked to numerous cognitive functions such as language, voluntary movement, and episodic memory. Understanding its development and evolution is key to shedding light on the divergent mechanisms that give rise to inter-species differences or diseases, such as microcephaly. The most striking differences across species consist of final neuronal output and surface area (Krubitzer and Kaas 2005), while the variation in radial thickness of the cortex is less pronounced (Rakic 2009, see Table 1). The 1000-fold variation in cortical surface area between mouse and human results in a remarkable specialisation and diversification of cortical function, allowing faster processing of complex information (Kaas 2013).
However, the early developmental program that gives rise to each of these cortices seems to be conserved (Bystron et al. 2008;Lui et al. 2011;Kwan et al. 2012). Among the many factors determining the development of a normal cortex in mammalian species are: the size of the founder population present at the beginning of cortical neurogenesis, the length of neurogenesis, the dynamics of cell cycle length, the proportion of different cortical progenitor types, and the fine-tuned balance between self-renewing and differentiative divisions (Rakic 1995;Florio and Huttner 2014;Imayoshi and Kageyama 2014). Qualitative changes in the outcome of neurogenesis following variations in any of these factors can be intuitively determined. For example, many decades ago Rakic predicted that a prolongation of the time devoted to amplification of the founder population results in expanded neuronal number hence cortical surface (Rakic 1988). However, understanding how all of these factors work together to ensure the cortex is correctly formed is less amenable to verbal reasoning and more suitable for rigorous mathematical modelling.
Some comparative quantitative data are beginning to be available on neuronal and glial cell numbers in various species (Herculano-Houzel 2012). However, these studies are limited to finding scaling rules that explain the variation in neuronal output by means of other observable characteristics, such as length of neurogenesis and brain volume, and do not investigate the underlying mechanisms (Mota and Herculano-Houzel 2015).
Recently, focus has turned to the identification of the progenitor types involved in cortical neurogenesis, in an attempt to exhaustively classify the cell types, from neural stem cells to post-mitotic neurons, and all stages of differentiation in between, in representative species such as mouse (Gal et al. 2006;Stancik et al. 2010), macaque (Wang et al. 2011;Betizeau et al. 2013), and human (Fietz et al. 2010;Hansen et al. 2010). It is possible to draw similarities in the process of cortical neurogenesis in mouse and human, such as the radial expansion of the cortex orchestrated by stage-specific dividing cells, resulting in the formation of the layers in an "inside-first, outside-last" fashion. However, the diversification of progenitor cell types, and non-apically constrained proliferation, are indicative of divergent processes adopted by different species in the course of evolution (Reillo et al. 2011;Pilz et al. 2013). The more we know about the processes involved in each species' cortex development, the further we seem to be from finding a unifying theory that could explain the key variation observed in various mammals with cerebral cortex.
We propose a theoretical framework, based on a first-principle approach, to build a general, yet accurate, representation of a minimal set of processes and players involved in the cortical neurogenesis of the mammalian brain. Building on the current understanding of the processes involved, and framing hypotheses of unknown processes in a mathematically rigorous way, we can make testable predictions and/or expose where there is a lack of biological data necessary for a systematic testing of these hypotheses.
The final goal of this study is to integrate mathematical modelling and experimental observations to highlight both the communal processes in the early development of, and the divergent factors that can explain cortical diversity amongst, mammalian species.

Materials and Methods
We define a mathematical model of cortical neurogenesis, describing the dynamics of proliferation and differentiation at the cell population level, during the development of a mammalian brain. For this model to be descriptive of any species considered, we classify cells into two compartments: progenitors ( ) and neurons ( ). The former includes all pre-mitotic neural progenitor cell types involved in neocortex development for a given species, e.g. neuroepithelial cells, apical and basal radial glial, intermediate progenitors (Rakic 1995;Huttner and Brand 1997;Noctor et al. 2004;Florio and Huttner 2014). The neuron compartment includes all post-mitotic and permanently differentiated cells. We only consider neurons locally generated by progenitors situated between the ventricular and pial surfaces. Migration of glutamatergic neurons from outside sources (Barber and Pierani 2016) or GABAergic neurons from subpallial ganglionic eminences (De Carlos et al. 1996;Parnavelas 2000;Anderson et al. 2001;Tamamaki et al. 2003) is not explicitly modelled, however it will be accounted for when integrating experimental quantifications into the model.
We consider three possible modes of progenitor cell division: self-amplifying division (SymP), generating two identical progenitors; asymmetric neurogenic division (AsymN), generating one progenitor and one neuron; symmetric neurogenic division (SymN), generating two neurons.
As we will see later, if these three types of division occurred with constant proportions, we would observe one of two possible outcomes: extinction or unlimited growth of the progenitor population. It is known, instead, that these modes of proliferation are preferentially balanced at different stages of neocortical neurogenesis (Noctor et al. 2004;Gӧtz and Huttner 2005), and this program seems to be conserved across species (Kornack 2000).
We, therefore, define time-dependent probabilities describing the preferred mode of division in the progenitor population during neurogenesis: is the probability of where is the rate of division, linked to the cell cycle length 7 by: = log 2 7 , and 2 is the founder population. We define 2 as the time of onset of neurogenesis, i.e. when the first neuron is produced, and ; as the final time of neurogenesis, and consider the time interval ∈ ( 2 , ; ).
Cell death of post-mitotic neurons is not explicitly modelled, but will be accounted for when integrating experimental estimates of final neuronal output into the model. Similarly, the model can be refined to account for cell death of the progenitor cells. Gohlke et al. (2007) investigated the role of cell death in a computational model of neuronal acquisition.
In the special case of constant and , the population either goes extinct, remains constant, or grows without bound, depending on the value of 1 − − 2 , as shown in supplemental Figure S1. Our time-dependent probabilities will model the observed patterns of division (Takahashi et al. 1996), with SymP the preferred mode of division at early neurogenesis, AsymN peaking at mid-neurogenesis and finally decreasing to leave SymN as the preferred mode of division at late neurogenesis. Therefore, we 0 < 2 < 1, 2 < = < 1, 0 < ; < 1, 2 < = < ; .
Within these constraints, any combination of the four parameter values will correspond to a trajectory in the strategy space.
The model outcome, given a choice of parameter values, consists of the functions of time ( ) and , representing the cell counts over the course of neurogenesis, i.e. for ∈ ( 2 , ; ).
We initially consider mouse embryonic cortical neurogenesis, occurring between E11 and E19 (E indicates embryonic day), and fixed cell cycle length, 7 = 14.3 hours (see Table 1). However, in a second version of the model, we introduce an age-dependent cell cycle time, 7 ( ), to account for the experimentally observed modulation of the cell cycle length throughout the course of neurogenesis, mostly due to variations in the S phase (Dehay and Kennedy 2007).

Results
Our study focuses on three species of interest: mouse, macaque monkey, and human.
We intend to build a mathematical framework that can easily be extended to consider a larger number of species, with the aim of analysing and explaining cortical neurogenesis through the lens of evolution. Although data on a large number of other mammalian species exist, we focus on these three species that have been chosen in previous studies to effectively illustrate neocortex development across distinct branches of the phylogenetic tree (Rakic 1995).
Initially focussing on the mouse, a species whose neurogenesis has been best characterised, we search the strategy space to find the strategy, or strategies, matching the observed patterns of neuronal output. Finally, we translate the mouse strategy to macaque and human neurogenesis, to make predictions about the size of the founder population, currently unavailable in the experimental literature. The model we propose can be used to compare different developmental strategies to determine the existence of evolutionary or developmental constraints. Finally, we can determine those parameters, to which the model outcome is most sensitive, suggesting the need for higher resolution data, and identify those parameters whose variation does not substantially affect model predictions.

The strategy to build a mouse neocortex
The laminar position of a newly generated neuron is determined by the time of birth in an "inside-first, outside-last" fashion (McConnell and Kaznowski 1991), therefore we can assume that the radial expansion of the neocortex is proportional to the number of neurons. In mice, specifically, the total number of neurons per column is approximately shared between upper and deeper cortical layers (Markram et al. 2004;Vasistha et al. 2015; Table 1).
Since we are interested in the ratio of population sizes at different times, we can rewrite equations (1) as: where and are progenitor and neuron cell numbers, respectively, per initial progenitor 2 , so that the problem is independent of the size of the founder population.
Therefore, we define the mouse optimal strategy as the 4-tuple ( 2 , = , ; , = ) that best approximates the following: with minimum error . Here, B = 2 + ; − 2 2 , is defined as the time when production of layer V is completed and production of layer IV starts. Hence, we look for the strategy that, to a first approximation, produces half of the total neuronal output (i.e. deeper layers) in the first half of neurogenesis and the remainder (i.e. upper layers) in the second half.
Given the small dimensionality of the problem, it is computationally feasible to perform a systematic and exhaustive search of the parameter space corresponding to all possible 4-tuples that satisfy equations (2). Doing so, we obtain the optimal strategy: 2 = 0.1, = = 0.9, ; = 0.8, = = 13 .
The trajectory corresponding to this mouse optimal strategy is shown in Figure 3(A).
Note that, given the linearity of the system, the strategy satisfying equation (4) for = 0 will still be valid for any value of 2 , or for a different neurogenesis length ( ; − 2 ), provided that the time of switch = is rescaled. Supplemental Figure S2 shows the result of the full search of the 4-dimensional parameter space.
Our parameter sensitivity analysis (see supplemental information for details) reveals that the highest sensitivity of the output can be attributed to variations in the = parameter (Fig. 4). This means that the variation in model output observed when varying the time of switch ( = ) is more substantial when compared to the effect of varying the absolute prevalence of different types of division ( 2 , = , ; ). This sensitivity, as well as lack of experimental quantification, illustrates a demand for future experimental research into quantifying, or at least characterising, this time scale.

Age-dependent cell cycle
A simplifying assumption in the model is that cell cycle length is kept constant during neurogenesis. However, species-specific age-dependencies could significantly affect the total number of divisions occurring during neurogenesis, and consequently the final neuronal output. For example, the linear increase of cell cycle length observed in mice determines a lower rate of division in late neurogenesis. In macaque, however, an initial increase in cell cycle length is followed by a decrease in mid-to-late neurogenesis (Kornack and Rakic 1998 represented in Figure 3(B). Intuitively, the delay in the time of switch to neurogenic divisions increases the pool of progenitors to compensate for a decrease in the frequency of division at late neurogenesis. Supplemental Figure S3 shows the result of the full search of the 4-dimensional parameter space.

Estimate of founder population for mouse, macaque and human cortex
The  Carlos et al., 1996;Lavdas et al., 1999;Anderson et al. 2001;Corbin et al. 2001;Tamamaki et al. 2003 , and the corresponding optimal strategy (Eq. 5 or Eq. 6, respectively). Figure 5(B) shows the estimates of 2 for the three species and two models.
Both the constant and age-dependent cell cycle models predict that the exponential increase of cortical neurons from mouse to macaque is justified by a linear increase in the founder population (note the logarithmic scale on the axis). This prediction agrees with the radial unit linear model (Rakic 2009), attributing an exponential expansion of the number of radial columns to the linear increase in the symmetrically dividing neural stem cell pool that, in our model, is represented by the founder population 2 .
Our estimate for human 2 , however, leads to a somewhat counterintuitive prediction.
When moving from macaque to human, the founder population required to produce larger neuronal output is not increasing. Specifically, the founder population estimated for the human cortex is smaller than that for the macaque, despite producing a considerably larger neuronal output.
Solutions and for the parameterised models are shown in Figure 5 To test the robustness of our result and challenge such a counterintuitive prediction, we introduce variations in the assumptions that could have skewed the results. Specifically, we repeat the estimations for varied amounts of interneuronal migration and postneurogenesis cell death. Even an unrealistically large rate of post-neurogenesis cell death does not alter the prediction of lower human founder population (Suppl. Fig. S4A).
Similarly, the prediction still holds even when increasing the rate of interneuronal migration to 50% (Suppl. Fig. S4B), although a more realistic 13% rate has been suggested (Mountcastle 1997). Finally, due to the lack of quantification of cell cycle length for human progenitors at various developmental stages, we used the experimental data for the macaque cell cycle length. This assumption, adopted in previous comparative studies (Rakic 1995), is valid only as a first approximation, since the two species belong to the same order. However, an underestimation of the cell cycle length in humans could cause an underestimation of the corresponding founder population. Therefore, we repeat the estimations for longer cell cycles, up to twice the baseline value for macaque progenitor 7 (Suppl. Fig. S5A and B for age-dependent and constant cell cycle length models, respectively). For such large (biologically unrealistic) values, the model prediction breaks down: an increasingly large founder population is necessary to obtain increasingly larger neuronal outputs by the end of neurogenesis.

The neurogenesis simulator
We have developed a graphic user interface allowing the user to select and calibrate the neurogenesis model, observing how the temporal population dynamics vary when changing parameter values. The user can select and compare strategies of neocortical neurogenesis across different species. The neurogenesis simulator is available for download at www.dpag.ox.ac.uk/team/noemi-picco .

Discussion
We propose a simple mathematical model of cortical neurogenesis that aims at describing the dynamics of cell proliferation and differentiation in various species.
Specifically, we can think of the cortex in each species arising as the result of an optimisation of cell proliferation and differentiation, to obtain the optimal strategy. Such a strategy consists of the modulation in time of these processes during development ( Fig. 1) so that, within the time allowed for cortical neurogenesis in each species, the required number of neurons is produced and cortical layers are formed.
In order to make meaningful comparisons between species, we defined a twopopulation model, describing the time dynamics of progenitors and neurons during cortical neurogenesis. The progenitor population is defined differently for any species of interest and its description can be refined to include different types of progenitor cells (e.g. neuroepithelial cells, radial glial, intermediate progenitors, etc.).
The main hypothesis of this model is that the three types of division (SymP, AsymN, SymN) are preferentially adopted at different stages of neurogenesis. Therefore, we imposed a two-step linear strategy with the prevalent mode of division shifting from selfreplication at early neurogenesis to neurogenic differentiation at later times (Fig. 2). We propose two variants of the model, with constant, and age-dependent, cell cycle length, respectively.
The model framework allows the systematic exploration of the strategy space by simultaneously varying all parameter values within biologically relevant ranges to find the strategy (strategies) matching the observed outcomes (Suppl. Fig. S2 and S3). We could obtain the optimal strategy for the mouse, experimentally the best characterised species (Fig. 3).
Despite imposing the shape of probability functions, to describe a shift in prevalence of mode of division, we do not make assumptions about the timing of the switch from mostly self-replication to mostly neurogenic differentiation. The model prediction is that, to match the pattern of neuronal output observed in mice, the time of switch must be around early/mid-neurogenesis.
We then used sensitivity analysis to determine which parameters have the highest influence on the model outcome. We found that the timing of switch of strategy from mostly self-replicating to mostly neurogenic divisions is more important, in terms of neuronal output, than the mere proportions of these types of division (Fig. 4). Although a number of cell-intrinsic factors have been found to regulate modes of division at the single-cell level, a full understanding of the mechanisms is lacking. For example, overexpression of p27 Kip1 , a cyclin-dependent kinase inhibitor, was found to substantially reduce neuronal output in supergranular layers (i.e. late-stage neuronal production; Tarui et al. 2005). Although this constitutes only circumstantial evidence, p27 Kip1 is a potential candidate for the experimental validation of the effects of a delay in the time of switch. Similarly, −catenin was found to be a key regulator of cell cycle reentry of neural precursors, promoting self-replicating divisions and generating aberrant cortices in mutant mice (Chenn and Walsh 2002). Primary microcephaly is an example of a developmental disorder associated with a mistimed switch between proliferative and differentiative divisions. Specifically, all microcephaly genes identified are related to centrosomal abnormalities affecting mitosis (Gilmore and Walsh 2013).
Having characterised the optimal mouse strategy, we used the model to estimate another quantity of interest, currently unavailable in the literature, namely, the size of the founder population of progenitors. Our model makes a counterintuitive prediction: the number of founder progenitors needed to produce the neurons present in the human cortex at the end of neurogenesis is lower than the corresponding estimate for macaque, despite the two-fold increase in neuronal output (Fig. 5). This prediction holds even for large variations in key parameters, such as rates of post-neurogenesis cell death and interneuronal migration (Suppl. Fig. S4).
Finally, we investigate the effects of variations in cell cycle length, assumed to be approximately similar in all primates, given the lack of data. We allowed some variation from the baseline value (macaque) and observed that the model prediction breaks down for large, unrealistic, values of human progenitor cell cycle length. This is an example of how, by integrating theoretical and experimental modelling, we can determine how other observed phenomena affect our hypotheses, predictions and understanding of the process that is being investigated. This approach is particularly relevant in resolving questions arising from circumstantial evidence. Indeed, the theoretical model can suggest whether a given mechanism could lead to significantly different predictions, suggesting that further experimental investigation is necessary, or otherwise futile for the problem being investigated.
To the best of our knowledge, a quantification of the cell cycle length of human progenitors is not available from the current literature in vivo. Therefore, at this stage, we can only speculate on the implications of this counterintuitive prediction. However, our study suggests that further experimental studies should focus on the quantification of cell cycle duration, as our model shows that very different predictions can follow.
Conversely, variations in other properties of the system, such as post-neurogenesis neuronal cell death and interneuronal migration, do not affect the qualitative predictions of the model.
There is strong evidence of a link between cell cycle lengthening (specifically, of the G1 phase) and the progression to a higher propensity to differentiative divisions (Dehay and Kennedy 2007). However, the relationship of causality between these two factors has yet to be established. By proposing two versions of the cell cycle model, we offer an alternative interpretation of the developmental program, with no direct relationship between cell cycle length and preferred modes of division. Our constant, and agedependent, cell cycle models lead to qualitatively similar predictions regarding the scaling of the founder population, but very different quantitative predictions for a given species (Fig. 5B). Experimental quantification of 2 would give the added benefit of validating either one of the two models, determining the need (or not) of age-dependent cell cycle to quantitatively recapitulate the population dynamics.

Conclusions
With this study we introduced a minimal theoretical framework which, informed by experimental observations, offers a mechanistic explanation of the development and evolution of the mammalian neocortex. This model allows a comparative study of mammalian species, and future modelling of human pathologies, such as microcephaly.
In building our first minimal model we left out a number of properties of the system and mechanisms that could be integrated as they become relevant and necessary to answer specific biological questions, and as data becomes available. For example, radial migration is only modelled implicitly, by assuming that positioning in the radial direction is related to time of birth (Rakic 1974;McConnell et al. 1991) fundamental to our assumption of half neuronal output being produced by mid-neurogenesis, corresponding to approximately equal neuron numbers between upper and deeper cortical layers.
Additionally, our model captures the dynamics of neuronal acquisition averaged over the entire neocortex, neglecting variations in cortical areas. Early regionalization determined by variations in cell-cycle kinetics is crucial (Polleux et al. 1997). However, such differences are relatively minor compared to the cross-species variation that we intended to address with this work. Furthermore, our model explains inter-species variation in neuronal output regardless of cortical surface and folding. To explain these differences (i.e. gyrencephalic vs lissencephalic) we would have to account for density, cell size, tangential migration, etc., all of which are species-specific quantities and properties. Specifically, it was shown that cortical folding approximately scales with surface area, and thickness, and not with neuronal numbers (Mota and Herculano-Houzel 2015). Additionally, through integration of imaging and mathematical modelling it was demonstrated that variations in cortical thickness within the cortex are caused by the physical forces emerging during the folding process (Holland et al. 2017). However, the problem of neocortex folding is a separate avenue of research, decoupled from the mechanisms that we want to investigate here.
The aim of this work was to go beyond scaling rules, to identify the minimum set of players, properties and interactions sufficient to explain cross-species variation in cortical neurogenesis in terms of mechanisms.
Specifically, we can exploit the predictive and analytical power of mathematical modelling to discriminate between the many factors contributing to the formation of the cortex. Additionally, sensitivity analysis allows us to pinpoint those mechanisms where higher resolution data would be beneficial to improve our understanding of cortical

2013.
(1) 75% of pallial progenitors estimated by Haydar et al, 2000, discounting progenitors of the medial, lateral and ventral pallia.    . Local sensitivity analysis of strategy parameters around the optimal mouse strategy (Eq. 5). Sensitivity is calculated according to the formula reported in the supplemental information. * indicates the reference value for the parameter whose sensitivity is being calculated. By definition sensitivity corresponding to * is zero.