Modelling the crop: from system dynamics to systems biology

There is strong interplant competition in a crop stand for various limiting resources, resulting in complex compensation and regulation mechanisms along the developmental cascade of the whole crop. Despite decades-long use of principles in system dynamics (e.g. feedback control), current crop models often contain many empirical elements, and model parameters may have little biological meaning. Building on the experience in designing the relatively new model GECROS, we believe models can be made less empirical by employing existing physiological understanding and mathematical tools. In view of the potential added value of robust crop modelling to classical quantitative genetics, model input parameters are increasingly considered to represent ‘genetic coefficients’. The advent of functional genomics and systems biology enables the elucidation of the molecular genetic basis of these coefficients. A number of case studies, in which the effects of quantitative trait loci or genes have been incorporated into existing ecophysiological models, have shown the promise of using models in analysing genotype–phenotype relationships of some crop traits. For further progress, crop models must be upgraded based on understanding at lower organizational levels for complicated phenomena such as sink formation in response to environmental cues, sink feedback on source activity, and photosynthetic acclimation to the prevailing environment. Within this context, the recently proposed ‘crop systems biology’, which combines modern genomics, traditional physiology and biochemistry, and advanced modelling, is believed ultimately to realize the expected roles of in silico modelling in narrowing genotype–phenotype gaps. This review summarizes recent findings and our opinions on perspectives for modelling genotype × environment interactions at crop level.


Introduction
Since the days of von Bertalanffy (1933, 1969), Wiener (1948), and Forrester (1958, 1961), systems theory and the systems approach have been the subject of renewed interest in biological sciences. Ludwig von Bertalanffy published on general systems theory and recognized the importance of 'wholeness': the 'systems' of various orders are not understandable by investigating their respective parts in isolation. Norbert Wiener blended systems theory and control theory (embedding in the title of his work the notion that his ideas were equally applicable to living entities and to machines) and demonstrated that communication and control were inseparable. Jay Forrester combined the theory and methods needed to analyse the behaviour of various systems and proposed the field of 'system dynamics'. System dynamics uses concepts drawn from the field of feedback control to organize available information into quantitative simulation models. The model-based simulation reveals the behavioural implications of the system. The basis of the method is the recognition that the structure of a system is often more important in determining its behaviour than the individual components themselves. Originally developed to improve the understanding of industrial processes, system dynamics has also been used for analysis of other types of systems.
For decades, crop scientists have used these systems concepts and approaches in the form of dynamic crop growth simulation models ('crop models' hereafter) to investigate crop growth mainly in response to abiotic environmental factors. These models emerged in the mid-1960s with the pioneering work of de Wit (1965, 1968, 1978) and others (e.g. Duncan et al., 1967; Thornley, 1972; Charles-Edwards and Fisher, 1980; Charles-Edwards, 1982). In these models, constituting elements and processes are put together in mathematical equations. The rules by which the elements or processes interact give rise to systems behaviour and emerging properties, which may well be unexpected and even counterintuitive. This is model heuristics, which in turn enhances the understanding of individual processes and further improves the models. There have been active debates about model usefulness and limitation (e.g. Passioura, 1973). Yet, continuous efforts are being made to improve the usefulness of these models (Priesack and Gayler, 2009). Models are currently being used in support of theoretical research, yield predictions, and decision making in agriculture. In recent years, attempts have been made to use models to quantify crop genotype-phenotype relationships (Yin et al., 2000a, 2004; Hammer et al., 2006). The results of Yin et al. (2000a) indicate that analysing genotype-phenotype relationships requires more robust crop models than do conventional agricultural applications.
To study genotype-phenotype relationships, an alternative, 'bottom-up' approach is emerging in systems biology (e.g. Kitano, 2002). The initiative of systems biology was analogous to that of crop modelling in the late 1960s, because of the need for instruments that could summarize increasing quantities of high-throughput experimental data at the cell and subcellular level. Although the meaning of systems biology as a new scientific discipline is still under debate, systems biology generally aims to synthesize complex data sets from the genome, transcriptome, proteome, and metabolome into useful mathematical models. It seeks to explain biological functioning in terms of 'how things work' in (sub-)cellular units. Similarly, plant systems biology was defined (Minorsky, 2003) as using computational modelling approaches to predict a plant cell(ome) from underlying genomic understanding. The initiative to develop (plant) systems biology will facilitate the development of functional genomics as a scientific discipline. Systems biology also exemplifies the proposition that simulation of biological systems is a broad field.
However, plant systems biology should not be considered as an entirely new field or even as a research paradigm shift (Bothwell, 2006), given also the current status of conventional crop modelling based on whole-plant or whole-crop physiology. Hammer et al. (2004) have argued that the above definition of plant systems biology not only largely overlooks the rich history of crop modelling, but is probably also not the best approach to quantify phenotypes at the crop level for solving real-world problems of crop improvement for increased production, which is the ultimate goal that plant systems biology wants to achieve (Minorsky, 2003). Many others (e.g. Sinclair and Purcell, 2005; Boote and Sinclair, 2006; Struik et al., 2007) also stressed that molecular plant sciences alone do not work at the level where it matters for increasing crop productivity. Most probably, alterations made at the genome level, although substantial, are moderated by a damping effect along the biological hierarchy, and could end up with only a small effect on the crop-level phenotypes (Yin and Struik, 2008). Yin and Struik (2007, 2008) therefore stated that to face challenges in coping with complex genotype-phenotype relationships, initiatives for crop-based systems research should draw on the existing modelling developments, and, at the same time, one should parameterize and redesign some subroutines of crop models by making use of biochemistry and genomics. The concept of 'crop systems biology' was described accordingly, in order (i) to bring the information from functional genomics to the crop level; (ii) to better understand the organization and the intra- and interplant competition of the whole crop and its response to environmental conditions; (iii) to fill the vast middle ground between '-omics' and relatively simple crop models; and (iv) to incorporate more biological mechanisms in current crop models.
Here further progress in modelling crop genotype-phenotype relationships is discussed, built upon historical views and current affairs in crop modelling, and especially upon the experience in designing the relatively new model GECROS (Yin and van Laar, 2005). To advance towards crop systems biology modelling, several areas for model improvement will be identified, and considerations for linking crop modelling with biochemistry and '-omics' will be further elaborated.

Modelling the whole crop using principles of system dynamics
It is well recognized that complex crop phenotypes, such as grain yield or water and nutrient use efficiencies, are regulated by both multiple interacting genes and environmental conditions. The effects and expression of these genes may depend on the developmental stage via certain molecular pathways. More importantly, these phenotypes are also a reflection of the cumulative effects of (fluctuating) environmental conditions and their interactions on multiple intermediate component processes and feedback and compensation mechanisms, and of intra- and interplant competition at higher levels of aggregation (Yin and Struik, 2008). Because a series of competitions, interactions, and feedbacks operate along a crop developmental cascade, a change of one component may result in an unwanted, negative consequence in other components. As a result, yield per hectare of a crop cannot be extrapolated from the yield of a single plant grown in isolation multiplied by planting density. For example, Ainsworth et al. (2008) calculated unrealistic soybean yields of 34.8 t ha⁻¹ and 43.6 t ha⁻¹ for crops under ambient and elevated CO2 levels, respectively, from data collected in chamber experiments where plants were grown with little competition.
To unravel the underlying physiology and feedback mechanisms of crop growth, it is important to recognize that often the behaviour of the whole cannot be explained in terms of the behaviour of the parts in isolation. For example, Mittler (2006) indicated molecular evidence that the response of plants to multiple stresses cannot be predicted from their responses to individual stresses. However, the activities of crop modelling during the last decades did not always honour this principle of system dynamics. Prevailing model concepts that demarcate crop production into potential, water-limited or nitrogen (N)-limited levels (e.g. Penning de Vries and van Laar, 1982) facilitate model development by focusing on one major factor at a time, but they cut the internal link among processes and therefore may not help to model individual processes that interact. For example, leaves with high N levels transpire more than do low N leaves, because photosynthesis, stomatal conductance, and transpiration are coupled (Wong et al., 1985). Thus, models considering water- and N-limited levels separately are not useful for environments where N and water may be co-limiting or limiting in tandem.
It has long been established that crop growth relies on the functional balance of contrasting components (e.g. shoots versus roots, sources versus sinks) and processes [e.g. carbon (C) metabolism versus N metabolism, assimilation versus dissimilation] (Thornley, 1972, 1998; Charles-Edwards, 1976). In addition, crop growth is associated with many feedback features. For example, more N uptake by roots results in a higher rate of photosynthesis by shoots, which in turn results in more growth and more N uptake (positive feedback). However, N assimilation is an active process requiring ATP to support it; so, more N uptake is accompanied by more respiration and less growth, which in turn leads to less N uptake (negative feedback). Similarly, to fill the larger number of grains formed as a result of increased photosynthesis during sink formation requires more N, which will then not be available to maintain the photosynthetic activities of leaves in a later period (Triboi and Triboi-Blondel, 2002). Another indication of negative feedback is the apparent reduced amount or down-regulation of Rubisco under elevated CO2 conditions (Xu et al., 1994; Paul and Pellny, 2003; Ainsworth et al., 2004). The consequences of these feedback and compensation mechanisms are those often reported generalities such as low yielding ability of cultivars having high protein concentration (Munier-Jolain and Salon, 2005), high postflowering N uptake of 'stay-green' plants (Triboi and Triboi-Blondel, 2002), conservative ratio of respiration to photosynthesis (e.g. Gifford, 1995), and low N concentration of plants when grown at an elevated CO2 level (e.g. Sakai et al., 2006). Some models use these emergent generalities as empirical input functions to guarantee model predictability over a certain domain. In contrast, Yin and van Laar (2005), along the aforementioned lines of thinking, presented a crop model, GECROS, to overcome some of the weaknesses of earlier crop models.
GECROS captures traits of genotype-specific responses to environment based on quantitative descriptions of complex traits related to phenology, root system development, photosynthesis and stomatal conductance, and stay-green traits. The model can generate physiological observations, such as photosynthetic acclimation to elevated CO2, as reported by experimental studies (e.g. Xu et al., 1994) (Fig. 1a, b). It also predicts grain yields (Fig. 1c) and shoot biomass (not shown), and the stimulation of crop yield by elevated CO2 (Fig. 1d), as observed in a large-scale FACE (free-air CO2 enrichment) experiment, better than do other existing crop models. Long et al. (2006) and Ainsworth et al. (2008) expressed the concern that there are some quantitative differences in how crops respond to elevated CO2 in FACE and chamber experiments, given that current popular crop models parameterized from chamber experiments typically overestimate the CO2 fertilization effect on crop yields (Fig. 1d). They indicated that controlled chamber environments clearly are not the best experimental facilities for estimating the CO2 response ratio of crop yield. It is demonstrated here that the robust crop model GECROS does allow a translation and extrapolation of input information at the single-organ level in a short time scale [e.g. parameters, estimated by Yin and van Laar (2005), from leaf photosynthetic rates per second from controlled-environment chamber studies] to the crop performance in a continuously changing field environment. This is in line with the modelling work of Chenu et al. (2008), who illustrated the upscaling of short-term responses of leaf growth rate to water deficit to predict the leaf area index (LAI) at the crop level.

Reducing empiricism of the models
Models, if robust, can play roles not only in data synthesis and prediction, but also in heuristics and system design (Yin and Struik, 2008). For the latter two roles, models have to be structured as mechanistically as possible at the level of the trait concerned, in order to embody its biological causes that interact to drive system dynamics and to generate the emergent consequences at a higher level (Passioura, 1979). Cheeseman (1993) showed that using equations to define mechanisms for strictly local events allows complex behaviour to emerge at higher levels of organization without separate rules or integrating functions. However, no model is absolutely mechanistic, because our understanding will eventually become limited with a lowering of the level of analysis; so models often help to identify knowledge gaps. Sometimes mechanistic models are not applied simply because detailed modelling at lower levels may be unnecessary for predicting the trait at a distant higher level (Granier and Tardieu, 2009). For example, the steady-state C3 photosynthesis model of Farquhar et al. (1980) is generally considered as mechanistic; but its electron transport limitation was based on several assumptions and used only a hyperbolic equation to describe the light response of the electron transport rate, whereas a more mechanistic form for this rate (Farquhar and von Caemmerer, 1981) was largely discarded. Also important is the need to keep model complexity commensurate with the quality and amount of available data.

Fig. 1. Time course, simulated by the crop model GECROS, of average leaf nitrogen content in the canopy (a), and total green surface (i.e. leaf+stem+ear) area index (b), of well-watered wheat crops grown under ambient CO2 (370 μmol mol⁻¹, thick curves) and elevated CO2 (550 μmol mol⁻¹, thin curves) for the free-air CO2 enrichment (FACE) experiment conducted in the 1992-1993 season, in Maricopa, Arizona, USA (see Kimball et al., 1995), and (c) a comparison between observed and GECROS-predicted grain yields and (d) a comparison of stimulation of grain yields of well-watered and water-stressed wheat crops (expressed as the average ratio of grain yield under elevated CO2 to the yield under ambient CO2 for the two growing seasons 1992-1993 and 1993-1994) from five different crop models (Demeter, LINTUL, AFRC, mC-wheat, and Sirius) previously evaluated by Tubiello and Ewert (2002) and from our model GECROS. The GECROS model can mimic the trend that, compared with those grown under the ambient CO2 level, plants grown under the elevated CO2 level can eventually lose their advantage in photosynthetic rates with progress of development (photosynthetic acclimation) due to relatively reduced leaf N content (a), can have an even lower green area index towards crop maturity due to faster senescence [b, also see Equation (1)], and can have a lower canopy photosynthesis towards crop maturity (not shown), as a result of (a) and (b). Because of these mechanisms captured by the model, GECROS can better predict grain yield differences among the eight environments (2 CO2 levels × 2 water treatments × 2 seasons) than other models in terms of the percentage of variation explained by the model (r²) and relative root mean square error (rRMSE) (c), and the stimulation effects of the elevated CO2 on grain yields, simulated by GECROS, were closest to the observed values (d). Other models tended to overestimate the effects, largely leading Long et al. (2006) and Ainsworth et al. (2008) to conclude that model parameterization based on chamber experiments is inappropriate to project crop response to elevated CO2 under field conditions.

Models that are absolutely empirical are hardly used, although they may be used within particular contexts [e.g. Mündermann et al. (2005) described the plant growth rate merely according to a logistic equation as a function of time with the purpose of visualizing the dynamics of a plant's three-dimensional shape]. Empirical elements are, however, still common in many models that are mechanistic in their overall structure. A typical example is the transport resistance approach to model the partitioning of substrates (C and N) between roots and shoots (Thornley, 1972, 1998). In line with the classical Münch's hypothesis, the approach describes the transport rate of C and N as being proportional to root-shoot substrate concentration gradients divided by a transport resistance, and partitioning is a result of transport and chemical conversion in the root and shoot. The overall frame of this model is mechanistic, but underlying resistance coefficients are empirically scaled by structural mass with allometric constants. Another example is the crop models using the concept of light use efficiency (LUE) to predict crop dry matter production. LUE itself is a simple concept, yet it is robust if well represented (Charles-Edwards, 1982; Monteith, 1994; Dewar, 1996). However, many models treat its responses to environmental stresses as: $LUE = LUE_0\, f(T)\, g(W_s)\, h(N_s)$, where $LUE_0$ is the baseline value of LUE, and effects of temperature ($T$), water stress ($W_s$), and N stress ($N_s$) are expressed often by arbitrarily defined linear segment functions (e.g. Brisson et al., 2003). Neither individual linear segment stress functions (Sinclair and Horie, 1989) nor the multiplicative form of the combined stress effects are justified from a physiological understanding. For example, water stress affects growth via both its direct stomatal and non-stomatal regulation of photosynthesis (Tezara et al., 2002) and its indirect effect from a changed leaf temperature. Multiplicative forms are commonly used to quantify the interactions of environmental factors on physiological parameters (e.g. Thornley, 1998). Evidence has been presented that a multiplicative model can result in a wrong direction of interaction between two involved factors (temperature and CO2) on the trait under study (quantum efficiency of leaf photosynthesis) (Fig. 2). This type of model error can lead to unreliable predictions of the impact of climate change on crop yields.
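The multiplicative stress form criticized above can be sketched in a few lines. This is only an illustration of the generic form $LUE = LUE_0 f(T) g(W_s) h(N_s)$ with linear segment functions; the function names, thresholds, and the baseline value are hypothetical and are not taken from any of the cited models. The sketch also shows one structural property of such a form: the relative effect of temperature is the same whatever the level of the other stresses, so the form cannot represent a temperature sensitivity that changes with another factor.

```python
def linear_segment(x, x_zero, x_full):
    """Piecewise-linear stress factor: 0 at or below x_zero, 1 at or above x_full.

    Hypothetical helper; the thresholds are arbitrary, which is exactly
    the kind of empiricism discussed in the text.
    """
    if x_full <= x_zero:
        raise ValueError("x_full must be larger than x_zero")
    frac = (x - x_zero) / (x_full - x_zero)
    return max(0.0, min(1.0, frac))


def lue_multiplicative(lue0, f_t, g_w, h_n):
    """Multiplicative combination of stress factors on light use efficiency."""
    return lue0 * f_t * g_w * h_n


# Illustrative (assumed) numbers only.
LUE0 = 3.0                       # baseline LUE, arbitrary units
f_cool = linear_segment(12.5, 5.0, 20.0)   # temperature factor at 12.5 degrees
f_warm = linear_segment(20.0, 5.0, 20.0)   # temperature factor at 20 degrees
g_wet, g_dry = 1.0, 0.5          # water factor without and with water stress
h_n = 1.0                        # no N stress

# Relative temperature effect under wet and under dry conditions.
ratio_wet = lue_multiplicative(LUE0, f_cool, g_wet, h_n) / lue_multiplicative(LUE0, f_warm, g_wet, h_n)
ratio_dry = lue_multiplicative(LUE0, f_cool, g_dry, h_n) / lue_multiplicative(LUE0, f_warm, g_dry, h_n)
# Both ratios are identical: the multiplicative form has no interaction
# between factors on a relative scale.
```

Because the two ratios are identical by construction, any interaction between factors that is observed experimentally has to come from the chosen stress functions themselves, not from the model structure.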
It is difficult to model all plant processes with a consistent mechanistic detail. The term 'phenomenological modelling' has been used as the intermediate approach between absolutely empirical and mechanistic modelling (e.g. van Oijen and Levy, 2004). This opens up the feasibility of avoiding empirical relationships, using physiological observations. An example is how to establish the relationship between LAI ($L$) and total amount of N in leaves of a full canopy ($N$). One can draw an empirical L-N relationship from curve-fitting experimental measurements of these two variables. One problem of this curve-fitting is that coefficients of the obtained relationship may have no biological meaning. It has been established, either experimentally or theoretically, that (i) leaf N at various heights of the canopy normally follows an exponential profile (Field, 1983; Chen et al., 1993; Bertheloot et al., 2008); and (ii) there is a base value of leaf N, $n_b$, at or below which the leaf photosynthetic rate is zero (Sinclair and Horie, 1989). Using these two observations, Yin et al. (2000b) presented an L-N relationship:
$L = \frac{1}{k} \ln\left(1 + \frac{k N}{n_b}\right)$ (1)
where $k$ is the extinction coefficient of leaf N in the canopy. Curve-fitting using Equation (1) will result in more biologically meaningful parameters ($k$ and $n_b$) for a fully grown canopy, that is, a canopy where the N content of bottom leaves is as low as $n_b$. Any leaf area beyond the value given by Equation (1) is supposed to senesce; so Equation (1) engenders a biologically coherent algorithm to predict leaf senescence (Yin et al., 2000b), thereby avoiding empirical stage-dependent leaf turnover coefficients as often used in many models (e.g. Thornley, 1998; Bouman et al., 2001). This algorithm can predict an accelerated leaf senescence of plants grown under elevated CO2 conditions (Fig. 1b) as often observed experimentally (e.g. Kimball et al., 1995; Zhu et al., 2009), agreeing with the concept of C and N interaction on senescence (e.g. Paul and Pellny, 2003).
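The senescence rule built on Equation (1) can be illustrated with a minimal sketch. It assumes the closed form that follows from the two observations in the text (an exponential leaf N profile with extinction coefficient k, and bottom leaves at the base value n_b); the function names, units, and parameter values are hypothetical and are not GECROS code.

```python
import math


def n_determined_lai(total_leaf_n, k, n_b):
    """Leaf area index supported by the canopy leaf N (assumed form of Equation 1).

    Assumption: leaf N per unit leaf area declines exponentially with
    cumulative LAI (coefficient k) and equals n_b at the canopy bottom,
    which gives L = (1/k) * ln(1 + k * N / n_b).
    total_leaf_n: g N per m2 ground; n_b: g N per m2 leaf; k: m2 ground per m2 leaf.
    """
    if k <= 0 or n_b <= 0 or total_leaf_n < 0:
        raise ValueError("k and n_b must be positive, total_leaf_n non-negative")
    return math.log(1.0 + k * total_leaf_n / n_b) / k


def senescing_lai(current_lai, total_leaf_n, k, n_b):
    """Leaf area in excess of the N-determined LAI, taken as due to senesce."""
    return max(0.0, current_lai - n_determined_lai(total_leaf_n, k, n_b))


# Illustrative (assumed) values: if leaf N drops, the supported LAI drops
# and the excess area is assigned to senescence.
lai_high_n = n_determined_lai(6.0, 0.5, 0.4)
lai_low_n = n_determined_lai(4.0, 0.5, 0.4)
loss = senescing_lai(lai_high_n, 4.0, 0.5, 0.4)
```

In this sketch a lower canopy leaf N, as reported for crops under elevated CO2, directly translates into a smaller supported green area, which is the mechanism behind the faster senescence described in the text.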
Another way to reduce the empiricism is to use some basic mathematics, not necessarily involving additional biological understanding. For example, the cubic polynomial model is often used to describe a symmetric sigmoid pattern, for example the course of sink demand as a function of time ($t$). Use of polynomials may be subject to criticisms because their coefficients are lacking any biological interpretation. When cubic polynomial equations are expressed as (Yin et al., 2003):
$w = w_{max} \left(3 - 2\frac{t}{t_e}\right) \left(\frac{t}{t_e}\right)^2$ (2)
$w = w_b + (w_{max} - w_b) \left(3 - 2\frac{t - t_b}{t_e - t_b}\right) \left(\frac{t - t_b}{t_e - t_b}\right)^2$ (3)
the meaning of parameters does show up, whereby $w_{max}$ is the maximum value of growth quantity ($w$) achieved at the end of the time span for the growth ($t_e$), and $w_b$ is the initial weight at the beginning of the time span ($t_b$). Equation (2) is a cubic polynomial suitable for the growth pattern when growth starts at time zero with the initial weight as zero.

Fig. 2. ... and g(Ca) were empirically established from separate experimental data. This figure is reproduced with permission from Elsevier. For C3 photosynthesis, the biochemical model predicts a reduced temperature sensitivity of ΦCO2(LL) at a doubled CO2 level, a response curve close to that of C4 photosynthesis (which has a CO2-concentrating mechanism) and supported by experimental data (e.g. Ku and Edwards, 1978), indicating that the sensitivity of ΦCO2(LL) to temperature is reduced under conditions of a reduced photorespiration (low O2 or high CO2 levels). The empirical model predicted an increased temperature sensitivity of ΦCO2(LL) under the elevated CO2 condition (b). An earlier version of the ORYZA model was used to assess the impact of global warming on Asian rice production (Matthews et al., 1997). Their simulation results should be treated with reservation, partly because their photosynthesis submodel does not predict correctly the interaction between temperature and elevated CO2 on ΦCO2(LL).

Equation (3) is suitable for any initial condition. The generalized (non-symmetric) polynomial model was presented by Yin et al. (2003). Because of their determinate nature, the differential form of these equations is more suitable than that of the asymptotic logistic-type equations for embedding in a crop model to describe the dynamics of organs as sinks for assimilates (Yin et al., 2003). Mathematics has also been explored to enhance computational efficiency. For example, the analytical algorithms for calculating photosynthetic rate at the leaf level avoid an earlier, computationally demanding approach of numerical iteration (e.g. Leuning, 1995) to deal with awkward loop relationships for simultaneous solution of leaf photosynthetic rate, diffusional conductance, and internal CO2 concentrations. Goudriaan (1986) and de Pury and Farquhar (1997) reported efficient mathematical methods for calculating canopy photosynthesis without errors of the simple 'big leaf' approach (e.g. Charles-Edwards, 1982).
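A minimal sketch of the symmetric cubic growth pattern of Equations (2) and (3), together with its differential form, may help to show why the parameters are interpretable and why the function is determinate. The closed form used here is an assumption: it is the cubic that starts at w_b at time t_b, ends at w_max at time t_e, and has zero growth rate at both ends. Function names and numbers are hypothetical.

```python
def cubic_growth(t, w_max, t_e, w_b=0.0, t_b=0.0):
    """Symmetric sigmoid cubic: w_b at t_b, w_max at t_e, flat outside [t_b, t_e].

    With w_b = 0 and t_b = 0 this reduces to the form assumed for Equation (2);
    otherwise it is the form assumed for Equation (3).
    """
    if t_e <= t_b:
        raise ValueError("t_e must be larger than t_b")
    if t <= t_b:
        return w_b
    if t >= t_e:
        return w_max
    u = (t - t_b) / (t_e - t_b)          # relative time, 0..1
    return w_b + (w_max - w_b) * (3.0 - 2.0 * u) * u * u


def cubic_growth_rate(t, w_max, t_e, w_b=0.0, t_b=0.0):
    """Differential form dw/dt; zero before t_b and after t_e (determinate growth)."""
    if t_e <= t_b:
        raise ValueError("t_e must be larger than t_b")
    if t <= t_b or t >= t_e:
        return 0.0
    u = (t - t_b) / (t_e - t_b)
    return 6.0 * (w_max - w_b) * u * (1.0 - u) / (t_e - t_b)


# Illustrative use: sink demand of an organ growing from day 2 to day 12.
mid_weight = cubic_growth(7.0, w_max=12.0, t_e=12.0, w_b=2.0, t_b=2.0)
peak_rate = cubic_growth_rate(7.0, w_max=12.0, t_e=12.0, w_b=2.0, t_b=2.0)
```

Unlike a logistic curve, the rate here returns exactly to zero at t_e, so a crop model that integrates the rate over time reaches w_max at a defined moment rather than approaching it asymptotically.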
In short, with insights into a biological understanding and use of mathematical tools, it is possible to upgrade crop models for excellent core properties: robust model structure, enhanced heuristics, numerical consistency and stability, minimum input requirement, and accurate outputs.

Integration of crop modelling with genetics
Upgraded crop models with the properties discussed in the previous section can better address genotype-phenotype relationships, provided that model input parameters can be easily measured (Yin et al., 2004) and vary little with environmental conditions (Reymond et al., 2003; Tardieu, 2003). Model input parameters (also called 'genetic coefficients') reflect the effects of genetic origin in the way that one set of parameters represents one genotype (Tardieu, 2003). Hence, the models manifest that the crop phenotype is achieved through non-linear interactive and ontogenetic responses of component processes to multiple environmental factors. The models can help to identify traits having the greatest impact on yield and simulate genotype × environment interaction (G×E) and even epistasis (e.g. Chapman et al., 2003). Such an approach has added value to classical genetics, since geneticists often ignore or overlook competition, density, nutrient supply, morphology, physiology, and plasticity, lumping such matters vaguely under the 'G×E' term or introducing simple response functions in their statistical models (e.g. van Eeuwijk et al., 2005). First attempts have been made through so-called 'QTL-based ecophysiological modelling'.
In genetics, complex crop traits can be unravelled into the effects of individual quantitative trait loci (QTLs; Paterson et al., 1988), commonly using the materials of a segregating population derived from a bi-parental cross. However, QTL expression is usually conditional on the environment and this greatly impedes the application of QTL mapping information for manipulating complex traits (Stratton, 1998). Several studies attempted to link crop physiology with genetics, focusing on the G×E problem and genotype-phenotype relationships. A simple genetic model can be assumed for QTL analysis of the component traits, but more sophisticated genetic control (epistasis and G×E) on the complex trait per se can be manifested when QTL-based parameter values are fed back to the ecophysiological model. The QTL-based models can be used to predict performance of any genotype in any environment.
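The idea of feeding QTL-based parameter values back into an ecophysiological model can be sketched with a deliberately simple toy example. Everything here is an assumption made for illustration: an additive QTL model with alleles coded -1/+1, a single model parameter (a thermal-time requirement to flowering), a fixed base temperature, and invented effect sizes. It is not the model used in any of the cited studies.

```python
def predict_parameter(intercept, qtl_effects, alleles):
    """Additive QTL model for one model input parameter.

    alleles are coded -1 or +1 per QTL (parental origin); effects are the
    additive effects estimated by QTL analysis of the parameter.
    """
    if len(qtl_effects) != len(alleles):
        raise ValueError("one allele code is needed per QTL")
    return intercept + sum(e * a for e, a in zip(qtl_effects, alleles))


def days_to_flowering(thermal_requirement, mean_temp, base_temp=5.0):
    """Toy ecophysiological model: days = thermal time / effective temperature."""
    effective = mean_temp - base_temp
    if effective <= 0:
        raise ValueError("mean_temp must exceed base_temp")
    return thermal_requirement / effective


# Hypothetical QTL effects (degree-days) on the thermal-time requirement.
INTERCEPT = 1000.0
EFFECTS = [50.0, -30.0]

req_a = predict_parameter(INTERCEPT, EFFECTS, [+1, +1])   # genotype A
req_b = predict_parameter(INTERCEPT, EFFECTS, [-1, -1])   # genotype B

# The same genotypic difference in the parameter gives different phenotypic
# differences in a cool (15 degrees) and a warm (25 degrees) environment.
diff_cool = days_to_flowering(req_a, 15.0) - days_to_flowering(req_b, 15.0)
diff_warm = days_to_flowering(req_a, 25.0) - days_to_flowering(req_b, 25.0)
```

Although the QTL effects on the parameter are constant, the difference between the two genotypes in days to flowering depends on the environment, which is a small-scale instance of how interaction at the trait level can arise from environment-independent parameters.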
This approach of QTL-based modelling was first used to predict a very complex trait, the grain yield of barley (Hordeum vulgare), by Yin et al. (1999, 2000a), but that first explorative study was affected by the fact that neither the designs nor the input parameters of earlier crop models were amenable to analysis, with sufficient accuracy, of yield differences for a large number of relatively similar genotypes. Later, the same approach for QTL-based modelling analyses was applied to simpler crop traits such as the leaf elongation rate in maize (Zea mays; Reymond et al., 2003), flowering time in barley, rice (Oryza sativa; Nakagawa et al., 2005), and Brassica oleracea (Uptmoor et al., 2008), and fruit quality in peach (Quilot et al., 2005). In the domain of morphological traits, the phenotypic effects of QTLs for culm length, grain number, and grain size in barley have been simulated using morphologically explicit models (Buck-Sorlin, 2002). A common feature of these studies is that predictability of QTL-based models is nearly comparable with that of the model using original parameter values, as the gain from the removal of random noise in original parameters by QTL statistics is roughly cancelled out by the loss due to the fact that the identified QTLs cannot explain 100% of the genetic variance of the parameter values. These studies on relatively simple developmental or morphology-related traits demonstrate that the approach can unravel G×E, and highlight the potential to analyse more complex traits manifested through season-long growth dynamics. Yin and Struik (2007) envisaged that the approach as practised for QTL-based modelling for individual lines of the population of a bi-parental cross could be extended using linkage disequilibrium mapping, in which association between genotypes and phenotypes is scrutinized over a large germplasm collection (e.g. Remington et al., 2001).
This development in association genetics may enhance opportunities for gene-based crop modelling, as empirically practised by White and Hoogenboom (1996), Messina et al. (2006), and White et al. (2008), which predicted flowering and yield traits of crop cultivars via regressing input parameters against binary values of relevant candidate genes.
In short, genetic mapping dissects a quantitative trait into various genetic factors, QTLs (Paterson et al., 1988), but it can only predict the trait phenotype in independent new environmental conditions to a limited extent (Stratton, 1998). Ecophysiological modelling can reveal how G×E comes about (Tardieu, 2003), but it does not consider the genetic basis of model parameters that describe genotypic differences. Combining ecophysiological modelling and genetic mapping can dissect complex yield traits into component traits, integrate effects of QTLs of the component traits over time and space at the whole-crop level, and predict yield performance of various genetic make-ups under different environmental conditions. There is in silico evidence that this combined approach can facilitate translating the QTL mapping into more efficient marker-assisted breeding strategies (Hammer et al., 2005).

Towards crop systems biology
The potential of QTL-based modelling and the advent of systems biology can take crop modelling to a further stage. Struik (2007, 2008) argued the need to develop 'crop systems biology' in order to better study crop genotype-phenotype relationships. This can be achieved by linking those physiological processes captured by current crop models with biochemical and '-omics'-level understanding.

Linking with biochemistry
To upgrade crop modelling further, there is a need to synthesize comprehensively, and make use of, the rich biological understanding of plant functional relationships. In relation to genotype-phenotype relationships, a key element would be to identify the parts of mechanisms that are conservative in energy and water transfer and C and N metabolism, and the parts of mechanisms that show genetic variation and are amenable to selection and engineering. For example, current understanding of leaf physiology has been summarized in an analytical model for leaf photosynthesis of both C3 and C4 species that considers diffusional resistances to CO2 transfer across the boundary layer, stomata, and mesophyll within a leaf. This model has been incorporated into the model GECROS (Yin and van Laar, 2005) by replacing its earlier leaf photosynthesis model. This coupled model will enable assessment of how genotypic differences in mesophyll conductance can affect complex resource (radiation, water, and N) use efficiency at various aggregation levels from leaf to crop, given the recent recognition that mesophyll conductance has strong relevance for the CO2 level at the carboxylation sites of Rubisco in C3 species (see Flexas et al., 2008). Biochemical modelling is so far confined to canopy or lower levels. Detailed models of photosynthetic electron transport, the Benson-Calvin cycle, and the photorespiratory cycle have been published (e.g. Laisk et al., 2006, 2009; Zhu et al., 2007). Numerical simulation conducted by Zhu et al. (2007) revealed that manipulation of N partitioning could greatly increase light-saturated C gain without any increase in the total protein-N investment in the apparatus for photosynthetic C metabolism.
To optimize N use, C assimilation models could be extended to associate with the stoichiometry of N assimilation, in relation to the activity of key enzymes, for example nitrate reductase and glutamine synthetase, due to the close coupling between C and N assimilation in plants (e.g. Noctor and Foyer, 1998; Paul and Pellny, 2003). Another example is the investigation of the benefit of selection for genotypes able to recover rapidly from the photoprotected state of photosystem II (PSII) by Zhu et al. (2004), who simulated the spatial and temporal heterogeneity of light flux in crop canopies and the cost of delayed recovery in PSII photochemical efficiency on transfer from high to low light. They predicted an increase of daily C uptake by at least 13% for a canopy with an LAI of 3. If these approaches are embedded in a crop model so that any temporal and spatial feedback of the advantages can be quantified for the whole growing season, potential yield gain under field conditions could be assessed in silico, although this level of detail is not required for general predictive purposes.
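The relevance of mesophyll conductance for the CO2 level at the carboxylation sites can be made concrete with a small sketch based on the standard series-resistance (Fick's law) description of CO2 diffusion into a leaf. It is not the analytical model incorporated into GECROS: the function name and the numerical values are hypothetical, and the net assimilation rate is held fixed for simplicity, whereas in a coupled model it would itself depend on the chloroplast CO2 level.

```python
def chloroplast_co2(ca: float, a_net: float,
                    g_b: float, g_s: float, g_m: float) -> float:
    """CO2 level at the carboxylation sites from a series of diffusion resistances.

    Cc = Ca - A * (1/g_b + 1/g_s + 1/g_m)

    ca     ambient CO2 (umol mol-1)
    a_net  net assimilation rate (umol m-2 s-1), held fixed here (simplification)
    g_b, g_s, g_m  boundary-layer, stomatal, and mesophyll conductances
                   for CO2 (mol m-2 s-1)
    """
    for name, g in (("g_b", g_b), ("g_s", g_s), ("g_m", g_m)):
        if g <= 0.0:
            raise ValueError(f"{name} must be positive")
    total_resistance = 1.0 / g_b + 1.0 / g_s + 1.0 / g_m
    return ca - a_net * total_resistance


# Hypothetical genotypes differing only in mesophyll conductance.
cc_low_gm = chloroplast_co2(ca=380.0, a_net=20.0, g_b=2.0, g_s=0.25, g_m=0.2)
cc_high_gm = chloroplast_co2(ca=380.0, a_net=20.0, g_b=2.0, g_s=0.25, g_m=0.4)
```

With these assumed numbers, doubling the mesophyll conductance raises the chloroplast CO2 level at the same assimilation rate and stomatal conductance, which is why genotypic variation in this trait can matter for both photosynthesis and water use.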
There are some well-known complicated phenomena that current crop models cannot accommodate but may need biochemistry to solve. First, sink formation (e.g. grain number and grain set) in response to environmental stresses is not yet well understood (Reynolds et al., 2009). Temperature, including changes in plant temperature induced by other stresses, is probably the most important factor that directly determines spikelet fertility and survival (e.g. Jagadish et al., 2007). The detrimental effects of low and high temperature have been well recognized, but their quantification in crop models, if any, is highly empirical. Genetic variation with regard to sink sensitivity to temperature is expected to be high, relative to the sensitivity of the 'source' components. In addition, the feedback effect of a reduced sink capacity on source activity (photosynthesis) (e.g. Ainsworth et al., 2004; McCormick et al., 2006) is another subject meriting attention.
Secondly, there is substantial evidence of both photosynthetic (e.g. Yamasaki et al., 2002; Haldimann and Feller, 2005) and respiratory (Atkin et al., 2005) acclimation to growth environments. The long-term relationship of respiration with temperature may be inconsistent with the continuously accelerating response implied by the Q10 concept. The optimum temperature for photosynthesis appears to be close to the temperature at which plants were grown (e.g. Yamasaki et al., 2002). Yet virtually no crop model has quantified this acclimation phenomenon, despite its important consequences for crop production under fluctuating field conditions (Rascher and Nedbal, 2006). Next, we will mainly discuss photosynthesis, not only because of its direct relevance in agriculture for conversion of solar energy into biomass (Kruger and Volin, 2006; Murchie et al., 2009) but also because it indirectly affects plant morphology (Huner et al., 1998).
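For reference, the Q10 concept mentioned above is conventionally written as follows. The symbols R_ref and T_ref (respiration rate at a reference temperature) are notation introduced here for illustration and are not taken from the cited studies.

```latex
R(T) = R_{\mathrm{ref}} \, Q_{10}^{\,(T - T_{\mathrm{ref}})/10},
\qquad
\frac{R(T + 10)}{R(T)} = Q_{10} \quad \text{for any } T .
```

With a fixed Q10 greater than 1, the rate increases by the same factor for every 10 °C rise, so the absolute increase accelerates with temperature. Respiratory acclimation implies that the response measured across plants grown at different temperatures is flatter than this short-term curve, which is why a single fixed Q10 may misrepresent the long-term relationship.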
Photosynthesis involves light energy absorption through primary photochemistry, energy transformation through electron transport, and energy utilization by stromal metabolism. Several workers (e.g. Durnford and Falkowski, 1997) have suggested a putative role for any imbalance between energy supply and utilization in sensing environmental changes through alterations in PSII excitation pressure, which reflects the relative reduction state of the photosystem (Huner et al., 1998; Mullineaux and Karpinski, 2002). Modulation of this chloroplastic redox signal initiates a signal transduction pathway, which coordinates photosynthesis-related gene expression (Fig. 3). As a result, the partitioning of leaf N among various photosynthetic protein groups varies (Hikosaka and Terashima, 1995; Walters, 2005), resulting in photosynthetic acclimation.
Acclimation does not only operate for biochemical processes of photosynthesis in the short term; in the longer term, its impact can also extend to morphological traits that have direct relevance for crop modelling. Specific leaf area (SLA), an indicator of leaf thickness and plant morphology, is used as an input parameter to describe LAI in many crop models. However, this approach implicitly assumes that SLA is the cause of LAI and takes little account of changes in SLA in response to the environment. Literature reports (e.g. Tardieu et al., 1999; Poorter et al., 2009) have shown a surprising similarity in the morphological acclimation to various stresses: low temperature, excess light, drought, and low N all induce a smaller SLA. This similarity is probably due to common biochemical pathways. Again, photosynthesis, acting not only as the primary energy source for plant growth but also as an environmental sensor (Huner et al., 1998; Murchie et al., 2009), plays a role. The chloroplastic redox signalling induced by an imbalance between energy supply and use acts synergistically with other pathways to elicit responses (Bowler and Chua, 1994), extending its influence beyond leaf mesophyll cells to the meristems to affect plant morphology (Fig. 3). All the abiotic stress events share this common mechanism for plant morphology acclimation as they all lead to an overexcitation state of PSII (Long et al., 1994). So, any abiotic stress that leads to excessive energy supply relative to demand by metabolism will result in a smaller SLA, because an increased leaf expansion to intercept more light is not beneficial in this case; conversely, those environmental changes that lead to a reduced energy supply will result in an elongated morphology.
This effect of photosynthetic acclimation on SLA not only has a feedback effect on leaf photosynthesis, as SLA affects the concentration of photosynthetic enzymes per unit leaf area, but also has an additional impact on canopy photosynthesis because a change in SLA is inevitably associated with a change in the LAI of the canopy. Incorporation of the photosynthetic acclimation mechanism based on energy balance and its after-effects on morphology will enable crop models to better predict season-long crop performance under fluctuating field environments.
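The two consequences of a change in SLA described above follow from simple bookkeeping, sketched below. This is an illustrative calculation and not code from any crop model: the function names and all numerical values are hypothetical, and leaf dry mass and leaf N concentration per unit mass are assumed to stay constant while SLA changes.

```python
def leaf_area_index(leaf_mass_per_ground: float, sla: float) -> float:
    """LAI (m2 leaf m-2 ground) = leaf dry mass (g m-2 ground) * SLA (m2 leaf g-1)."""
    return leaf_mass_per_ground * sla


def leaf_n_per_area(n_per_mass: float, sla: float) -> float:
    """Leaf N per unit leaf area (g N m-2 leaf) = N per unit mass (g N g-1) / SLA."""
    if sla <= 0.0:
        raise ValueError("SLA must be positive")
    return n_per_mass / sla


# Hypothetical canopy: same leaf dry mass and N concentration, two SLA values.
LEAF_MASS = 150.0   # g leaf dry mass per m2 ground (assumed)
N_PER_MASS = 0.04   # g N per g leaf dry mass (assumed)

unstressed = {"sla": 0.020}  # m2 g-1
acclimated = {"sla": 0.015}  # smaller SLA, as induced by the stresses listed above

for state in (unstressed, acclimated):
    state["lai"] = leaf_area_index(LEAF_MASS, state["sla"])
    state["n_area"] = leaf_n_per_area(N_PER_MASS, state["sla"])
```

Under these assumptions the smaller SLA gives a lower LAI (less light interception) together with a higher N content per unit leaf area (a potentially higher photosynthetic capacity per unit leaf area), so the net effect on canopy photosynthesis depends on both terms and cannot be read from SLA alone.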

Linking with '-omics'
To elucidate biochemical mechanisms that underpin the change of phenotypes in response to genetic architecture and environmental variables, there will be increasing benefit from molecular genetics and '-omics' studies. Wilczek et al. (2009) extended a common phenological model of crop development by linking individual model coefficients to the activities of specific genes and their regulators involved in the transitions to flowering in Arabidopsis thaliana. The model explained >92% of variation in days to bolting of mutants impaired in different signalling pathways across field conditions of the species' native European range, and also accurately predicted flowering date variation of the plants not used in model parameterization.
Metabolite and transcript profiling enables an integrated analysis of the metabolic network. For instance, the mechanism of soybean respiratory responses to growth at elevated CO2 has been investigated in studies that combined molecular, biochemical, and physiological analyses of plants. Using cDNA microarrays, Ainsworth et al. (2006) examined molecular drivers for the increased leaf area of soybean plants grown at elevated CO2. They showed that at the transcript and metabolite level, elevated CO2 leads to the increased respiratory breakdown of carbohydrates, which probably provides increased energy required for leaf expansion. This result suggests that in addition to the well documented importance of photosynthetic C fixation, respiratory processes also play a role in determining leaf expansion at elevated CO2. Leakey et al. (2009) examined transcript profiles, leaf carbohydrate status, rates of photosynthesis, and respiration of mature leaves at multiple developmental stages, over two growing seasons, again providing evidence that long-term growth at elevated CO2 leads to transcriptional reprogramming of metabolism that stimulates respiration.
Fig. 3. A schematic illustration of the chloroplast as an environmental sensor that initiates intracellular as well as intercellular signal transduction pathways. Through modulation of photosystem excitation pressure, photoautotrophs sense changes in environmental conditions through imbalances in the energy absorbed versus energy utilized through metabolism. Modulation of chloroplastic redox poise initiates a signal transduction pathway whereby the chloroplast affects nuclear gene expression. In addition, this redox signal transduction pathway may act synergistically with other signal transduction pathways. This appears to extend the influence of chloroplastic energy imbalance beyond leaf mesophyll cells to the meristem regions of the plant to affect leaf morphology [reflected by, for example, SLA (specific leaf area)] as a result of plant acclimation to environmental changes (based on Huner et al., 1998). The affected leaf morphology will ultimately have a feedback effect (the dashed arrow) on the balances between the energy absorption and utilization in chloroplasts, and on rates of leaf photosynthesis.
Modern genomics offers the knowledge needed to ascribe physiological function to gene function. However, to close the gap between genotype and crop-level phenotypes, an integrated modelling approach to link various disciplines from molecular to crop physiology is required. It can be envisaged that the functioning of the whole crop can be described by integrating processes across various biological scales. Combined studies of physiological components with transcript and metabolite profiles are increasingly conducted to elucidate how gene functions, biochemical pathways, and cellular processes are coordinated. Such studies should lay the groundwork for, and will in turn benefit from, modelling regulatory networks and linkages among gene products, biochemistry, and whole-plant physiology. Obviously, different temporal and spatial scales are required for different structural components, developmental pathways, and biological processes of the whole system (Yin and Struik, 2008).
Ultimately, crop systems biology may evolve into a highly computation-intensive discipline. For example, multiscale modelling can help to assess, in silico, complex crop phenotypes and their underlying biological components in response to genetic fine-tuning and environmental scenarios. Dwivedi et al. (2007) expected that this type of modelling will provide a platform to integrate knowledge of processes at various scales, and help to generate methodologies to enhance the efficiency of using the outputs of genomics research in a crop improvement programme. It is expected that crop systems biology models will act not only as predictive tools, but also as highly heuristic engines that can lead to innovations by design for crop research. This echoes the view of Di Ventura et al. (2006) that 'although a global and perfect understanding of a biological system is not expected in the near future, the combination of modelling and experimentation offers the possibility of making inroads towards that goal, as well as developing new exciting, useful applications'. Within this context, it is necessary to revise the early view (de Wit and Penning de Vries, 1985) that 'attempts to construct models containing so many levels of explanation that the behaviour at the population level is explained on the basis of knowledge at the molecular level do not serve any purpose'.

Conclusions
Recent studies have shown that combining physiological modelling and genetic mapping (molecular genetics) into QTL (gene)-based modelling could be a powerful way to resolve complex environment-dependent traits on a genetic basis. To that end, robust physiological models that can manifest both genetic and environmental control of crop phenotypes are required. Although current crop models, partly based on decades-long use of the system dynamics method, have been considered by many to be mature enough for various agricultural applications, they still need to be upgraded to model genotype-phenotype relationships. Use of physiological understanding and mathematical tools can overcome the apparent empiricism of many current crop models. For further progress, the opportunities to develop a 'crop systems biology' model in view of the rich history in biochemistry and the advent of the '-omics' are outlined. In our opinion, the 'crop systems biology' model should consider photosynthesis-driven acclimation to environmental changes for a better understanding of G×E for some basic processes. This may be achieved by quantifying the imbalance between light energy supply and use in photosynthesis, and its resultant signal transduction in the short term that may affect leaf morphology in the longer term. We should then try to understand, characterize, and model genetic variation in plasticity of plant acclimation, using information from QTL and gene network analysis. The genetic variation in the plasticity of plant acclimation, the present knowledge of biochemical pathways and gene expression, and the elegance of various energy-utilizing and dissipating mechanisms should offer excellent opportunities to analyse G×E effects on crop yields. Crop systems biology models can enhance our capability of in silico up- and down-scaling between gene expression and crop production.