Self-learning growth simulator for modelling forest stand dynamics in changing conditions

Finnish forest structures vary from even-aged planted forests to two- and multi-storied mixed stands. Also, the range of silvicultural systems in use has increased because thinning from above and continuous cover management are gaining popularity. The data currently available for modelling stand dynamics are insufﬁcient to allow the development of unbiased and reliable models for the simulation of all possible transitions between various current and future stand conditions. Therefore, the models should allow temporal and regional calibration along the accumulation of new information on forest development. If the calibration process is automated, the simulators that use these models constitute a self-learning system that adapts to the properties of new data on stand dynamics. The current study ﬁrst developed such a model set for stand dynamics that is technically suitable for simulating the stand development in all stand structures, silvicultural systems and their transitions. The model set consists of individual-tree models for diameter increment and survival and a stand-level model for ingrowth. The models were based on the permanent sample plots of the 10 th and 11 th national forest inventories of Finland. Second, a system for calibrating the models based on additional data was presented. This optimization-based system allows different types and degrees of calibration, depending on the intended use of the models and the amount of data available for calibration. The calibration method was demonstrated with two external datasets where a set of sample plots had been measured two times at varying measurement intervals.


Introduction
Growth predictions have a central role in forest planning and the practice of sustainable forest management. Errors in growth predictions lead to suboptimal use of forest resources, and they may also lead to unsustainable forestry. Most growth models (see Pretzsch, 2009) are best suited for even-aged stands, corresponding to the forestry doctrine of the past decades, and available empirical data. However, in Finland for example, all forests are not even-aged, and alternatives for even-aged management have become more tempting for various reasons as described by Nieminen et al. (2018) and Vauhkonen and Packalen (2019). In the latest forest act of Finland, issued in 2014, continuous cover forestry became equally acceptable as even-aged management, and silvicultural instructions have recently been developed also for continuous cover forestry (Äijälä et al., 2014).
The structures of Finnish forests form a continuum from even-aged, single-species stands via two-storied and multispecies stands to multi-storied and uneven-aged structures (Kuuluvainen, 2009). A common structure due to natural succession is to have spruce regeneration under canopies dominated by pine and birch. The spruce layer may be unevenaged, and it may replace the pines and the birches when they are cut or start to die. During this process, an originally even-aged stand first develops into a two-storied stand, from which it may further develop toward a multi-storied structure. This process is sometimes counteracted by human interference, namely thinning from below and removing the advance regeneration before the thinning operation.
Growth model systems based on site index (or age and dominant height) are not well suited for stand structures other than even-aged forest that is treated with thinning from below. Besides, stand age is not always measured in forest inventories. Management-oriented forest inventories are increasingly based on remote sensing (e.g. Maltamo and Packalen, 2014). Airborne laser scanning techniques provide good estimates for tree height (Wang et al., 2019) and other forest biophysical attributes that are correlated with height (e.g. Nilsson, 1996;Naesset, 1997a, b;Maltamo et al., 2004a, b). The stand age can be imputed from plots measured as ground truth data in other stands. Despite Forestry this, it is reasonable to develop age-independent methods for estimating the site index (Solberg et al., 2019).
As a partial reaction to the lack of correspondence between existing growth models and emerging forest structures, Pukkala et al. (2013) developed prediction models that are technically suitable for all stand structures. These models do not use stand age or dominant height as predictors. The model set also includes a sub-model for ingrowth, which is an important element of the stand dynamics in continuous cover forestry. The models of Pukkala et al. (2013) are based on datasets that are geographically biased and partly also quite old. The models of Hynynen et al. (2002) might also be unfit as management alternatives have increased, resulting in changes in stand structures. Also, climate change and other factors (for instance nitrogen deposition) have already significantly increased the growth rate of Finnish forests  compared with the years of growth modelling data. Therefore, there are obvious reasons to develop new growth models for Finnish forestry.
Simultaneously, new demands have been set for forest planning and growth simulators: prediction models and simulators should be transparent, flexible and adjustable. In Finland, forest management is highly important for climate change mitigation, maintenance of biological diversity and the national economy. Because of the importance of forest management for multiple ecosystem services, both the decision-makers and the citizens are highly interested in the growth predictions for Finnish forests, estimates of sustained yield as well as knowledge and reasoning behind them. In this respect, transparency of the prediction models is a necessity. When the prediction models and calculation systems are open and transparent, all interested parties can test the assumptions on which the calculations supporting national (Sievänen et al., 2014;Kärkkäinen et al., 2020) or regional (Haakana et al., 2017;Kärkkäinen et al., 2019;Hyvönen et al., 2020) forest policies are based. The users of forest growth prediction models may also have their own regional or context-specific datasets, which they want to use to test the models and perhaps calibrate them. The possibility for easy calibration and modification extends the life-span of the prediction models and simulation systems in changing conditions and makes them more widely applicable and acceptable.
The objective of this study was to develop a complete set of generic forest growth prediction models for three sub-processes of stand dynamics: diameter increment, tree mortality/survival and regeneration (ingrowth). Model forms that can be used in different stand structures and silvicultural systems were targeted. Therefore, stand age was not used as a predictor (nor site index that is based on stand age and dominant height), and the model set includes sub-models for ingrowth. Also, a method for calibrating the models or part of them is presented and demonstrated. A self-learning simulator for stand dynamics, adjustable for changing conditions, evolved in the course of the study. This simulator can be used to test the generic prediction models developed in this study, calibrate the models with empirical datasets and simulate stand dynamics with either the generic or calibrated model sets.

Data
Sample plot and stand description data from the Finnish National Forest Inventory (NFI; Korhonen, 2016;Korhonen et al., 2017) were used. The NFI data are a statistical sample of all land area in Finland. The whole data of the 11th NFI (NFI11, measured in 2009-2013) consist of 54 396 plots. The sample plots were socalled restricted relascope plots, i.e. on each sampling point (plot) the measured trees were selected with the probability proportional to size sampling design. The plot radius thus increased as a linear function of tree diameter. The maximum radius of the plot was 12.52 m in South Finland and 12.45 m in North Finland.
This study focused specifically on observations from permanent inventory plots measured in the 10 th NFI (NFI10, 2005(NFI10, -2008 and NFI11. The total number of permanent plots was 16 062, of which altogether 11 987 plots were selected based on two criteria: (1) the plots studied had to be located on productive or poorly productive forest land and (2) the centre point of the previously measured plot was found with certainty in the latter inventory. These plots cover mainland Finland except for the sparsely forested three northernmost municipalities where NFI10 was not implemented (in 2020, Finland had 310 municipalities). Altogether 8258 plots (69 per cent) were located on mineral soils and the remainder on different types of peatlands. No silvicultural treatment history during the most recent 10-year period was discovered on altogether 7394 plots (62 per cent), while the rest of the plots had been treated.
The tree-level data consisted of observations for a total of 121 231 and 119 391 trees measured from the permanent plots in NFI10 and NFI11, respectively. The tree variables included diameter at breast height (diameter at breast height (d.b.h.) at 1.3 m from ground), tree species, tree status (dead, alive) and change in tree status between NFI10 and NFI11 (survivor, ingrowth, removed and mortality). Of the tree level observations, 21 per cent were measured on herb-rich and better sites, 47 per cent on mesic sites, 25 per cent on sub-xeric sites and 6 per cent on xeric or poorer sites. Of all measured trees, 50.1 per cent were Scots pine (Pinus sylvestris L.), 27.6 per cent Norway spruce (Picea abies (L.) H. Karst.), 3.6 per cent silver birch (Betula pendula Roth), 14.8 per cent downy birch (Betula pubescens Ehrh.), 1.4 per cent aspen (Populus tremula L.) and 2.4 per cent other species.
For each plot, more than 100 stand-level variables were additionally available, describing the stand compartment of the plot. In NFI, the stand compartment is determined as a region that is homogenous in terms of site type, growing stock, past management and recommended future management. Thus, the standlevel variables describe an area that is markedly larger than the plot. The following stand-level variables were extracted for this study: • Site type: mineral soil or peatland • Site fertility (6 classes, see below) • Altitude above sea level (m) • Effective temperature sum (degree days) • The most recent silvicultural treatment and its estimated timing.
Self-learning growth simulator

Data preparation
The data measured on relascope plots were used to prepare modelling data for the individual-tree diameter increment model, individual tree survival function and stand-level ingrowth model. The potential predictors for the diameter increment and survival models included variables that describe tree size, competition and site productivity. The models were fitted separately for Scots pine, Norway spruce and broadleaf species. The main broadleaf species were silver birch, downy birch, aspen, alder (Alnus incana (L.) Moench) and rowan (Sorbus aucuparia L.). Tree size was described by d.b.h., competition total by the basal area of trees (expressed in m 2 ha −1 ) and the basal area of trees larger than the subject tree (BAL; Wykoff, 1982;Vanclay, 1994), and site productivity by latitude, altitude, temperature sum, fertility class and soil type (mineral soil vs peat) (Hynynen et al., 2002;Trasobares et al., 2004). The fertility classes were mesotrophic herb-rich, herb-rich, mesic, sub-xeric, xeric and barren heath. Diameter at breast height and all the site variables were directly available in the sample plot data. The stand basal area and BAL were calculated by summing the basal area factors of the measured trees. Each tree represented the same basal area. The basal area factor was 2 m 2 ha −1 in South Finland and 1.5 m 2 ha −1 in North Finland. However, there were some exceptions from this rule since there was a maximum distance from the plot centre beyond which trees were no longer measured. BAL and basal area were computed separately for pines, spruces and broadleaved species. This made it possible to include the effect of species composition in the models.
Diameter increment was obtained as the difference in the diameter of the same tree in two consecutive measurements. The measurement interval was ∼5 years. However, since plots were measured also during the growing season and the date was not the same in the two measurements of the same plot, the diameter increment was converted to a 5-year increment by utilizing the starting and ending dates of the growing season at different latitudes (https://www.ilmatieteenlaitos.fi/terminen-ka svukausi). It was assumed that the rate of diameter increment is constant during the growing season.
In a relascope plot, there can be ingrowth to all diameter classes, and the number of trees per hectare represented by one measured tree decreases when the tree grows. In this study, the calculation of the number of ingrowth trees was based on the following formula (N = number of trees per hectare): where N(t) is the number of trees per hectare at measurement occasion t, calculated using the tree diameter (expansion factor) of time point t; N(t + 1) is the number of trees per hectare at measurement occasion t + 1, calculated using the tree diameter (expansion factor) of time point t + 1 and N mortality is the number of trees per hectare that died between the time points t and t + 1, calculated using the expansion factor of time point t.
Of the variables of equation 1, all except N ingrowth were known. Therefore, N ingrowth was computed as follows: Since N and N mortality apply to trees taller than 1.3 m, N ingrowth is the estimate of new trees that pass the 1.3-m height limit between time points t and t + 1. Ingrowth was calculated separately for pine, spruce, silver birch, downy birch and other broadleaved species.
Only non-cut plots were used for fitting the baseline models (Tables 1 and 2). This allowed us to keep the models simple as variables describing the removed competition were not needed as model predictors. The average 5-year diameter increment of all trees was 1.3018 cm, and the 5-year survival rate calculated over all trees was 0.9829.
Small trees have very high expansion factors when the basal area factor is 2 m 2 ha −1 . Therefore, the overall level of ingrowth estimate obtained from equation 2 mainly depends on the smallest trees that were measured at time point t + 1 but not at time point t. These are mostly trees that had passed the 1.3-m limit during the measurement interval. These trees must be very close to the plot centre to become included in a relascope plot. Therefore, if there are more than one tree that had passed the 1.3-m limit, these trees are very close to each other. Often they are a group of trees such as coppice shoots sprouting from the same stump. Since a maximum of one of these shoots has space to grow to larger dimensions, the data were censored following the suggestion of Lappi and Pukkala (2020). All ingrowth estimates larger than 25 000 trees ha −1 were replaced by 25 000, which roughly corresponds to accepting only one of those trees that had passed the 1.3-m limit during the measurement interval. Compared with uncensored data, censoring decreased the mean ingrowth estimate by 8 per cent in Scots pine, 20 per cent in Norway spruce, 43 per cent in birch and 53 per cent in other broadleaf species.

Model fitting method
The candidate predictors and their transformations were selected based on previous research and literature (Vanclay, 1994;de-Miguel et al., 2012;Pukkala et al., 2013) and the availability of the variables both in the dataset of this study and in datasets of potential applications. Different variable combinations were tested using fixed-effects regression modelling, aiming at a low number of predictors, logical signs of the regression coefficients and robust model forms. The fit statistics used in these comparisons were the Akaike information criterion, standard deviation of residuals and statistical significance of the parameters. Of the available site variables (temperature sum, altitude, latitude, longitude, fertility class, soil type), only temperature sum, fertility class and soil type (mineral soil vs. peat) were systematically significant at the 95 per cent confidence level.
Non-linear regression analyses and model forms that prevent illogical predictions were used. The forms of the models are as follows.

Diameter increment (DI)
Survival rate (S) :  where x is a vector of predictors and f (x) is of the form a 0 + a 1 x 1 + a 2 x 2 . . . (a i is regression coefficient and x i is predictor). These model forms guarantee that negative model predictions are not obtained, and the predicted survival rate is between 0 and 1. Since trees measured in the same plot were correlated observations, the final models were fitted as mixedeffects models assuming a normally distributed random plot factor in the model (a 0 was replaced by a 0 + u j where u j is random plot factor). The models fitted in regression analysis are referred to as generic models.

Model calibration procedure
The procedure originally proposed by Pukkala et al. (2011) and later used by de-Miguel et al. (2014), Juma et al. (2014) and Jin et al. (2019) was adapted for calibrating the models with datasets other than the modelling data of this study. The method needs data on the diameter distribution of trees in several sample plots in two or more measurement occasions. The idea of the method is to use the diameter distribution at the first measurement occasion and simulate the stand development until the second measurement occasion. The simulated diameter distribution (or some other stand variable) is compared with the measured one. The simulator is linked with an optimization algorithm that adjusts the parameters of the models aiming at minimizing the difference between the simulated and measured diameter distributions at the second measurement occasion.
In this study, the minimized loss function was as follows: Self-learning growth simulator where is the set of coefficients of the models for diameter increment, survival and ingrowth; J is the number of plots; I j is the number of diameter classes in plot j; g m ij and g s ij (θ) are, respectively, measured and simulated basal area (m 2 ha −1 ) of diameter class i at the end of the measurement interval; f ij m and f s ij (θ ) are, respectively, measured and simulated number of trees in diameter class i; and a and b are the weights of the diameter distributions of basal area and number of trees. Threecentimetre diameter classes were used in the loss function. The weights (a and b) also remove the effects of the different units of basal area and number of trees (a should be larger than b if the unit of basal area is m 2 ha −1 and f ij is the number of trees per hectare). Parameters c1 and c2 determine how quickly the difference between measured and simulated values increases the loss.
As an alternative, the following loss function was also used in model calibration: where G m jk and G s jk (θ ) are, respectively, measured and simulated basal area (m 2 ha −1 ) of species k (k = 1 for pine, k = 2 for spruce, k = 3 for birch and k = 4 for other species) at the end of the measurement interval and F ik m and F s jk (θ ) are, respectively, measured and simulated number of trees per hectare.
The method of Nelder and Mead (1965) was used in calibration (as explained in Supplementary Data). It is a population-based method, which operates with several solution vectors (Pukkala 2009). The number of solution vectors (population size) is one more than the number of optimized model coefficients. For example, if 10 coefficients of the models are adjusted with the calibration method, 11 combinations of 10 coefficients are produced and updated for several iterations using the principles of the Nelder and Mead algorithm. In the calibration examples of this study, the initial population of solution vectors was obtained by drawing a uniformly distributed random number from a range specified for each coefficient: where x i is the value of coefficient i in an initial solution vector. The minimum value of the coefficient was 0.5 times the value of the coefficient obtained from regression analysis (Tables 3, 5 and 6) and the maximum value was 1.5 times the coefficient.

Calibration datasets
Two datasets were used to test and demonstrate the model calibration procedure described above. The two datasets were Spati and Silvi, both described in Pukkala et al. (2013). The Spati data were collected from 135 rectangular plots during the 1980s. Plots were measured in pine, spruce and mixed conifer stands in North Karelia on mineral soil sites. The past 5-year radial growth was measured from every tree using increment cores. Ingrowth was not measured, and tree mortality was negligible in these plots that represented managed commercial forests.
The plots have been used in several studies to fit distancedependent individual-tree diameter increment models (e.g. Pukkala and Kolström, 1987;Miina and Pukkala, 2000). Plots were purposefully measured in different spatial distributions of trees, ranging from very regular plantations to highly aggregated tree arrangements.
The Silvi data were measured from several silvicultural experiments established during the 1980s. The individual plots of these experiments represented even-aged management and different variants of continuous cover forestry. Altogether 139 measurement intervals ranging from 2 to 22 years were available. In each measurement, the number of trees was counted in 1-cm diameter classes by tree species. The trees were not numbered. Therefore, no growth, survival and ingrowth data were available as measurements. Only the combined influences of the three sub-processes of stand dynamics were observed in these plots.
In the Spati dataset, only diameter increment models were calibrated and used in simulation. The calibration was referred to as 'scaling' when only the intercepts of the models were optimized. In the other case (referred to as 'fitting'), all model parameters except the coefficients for temperature sum and peatland were optimized. Equation (7) was minimized in scaling and equation (6) in fitting. Weights a (for basal area) and b (for number of trees per hectare) were 1 and 0.001, respectively, and the exponents of the loss functions (c1 and c2) were both 1.5.
When the Silvi data were used, the intercepts of all 10 models were optimized in scaling, and equation (7) was minimized. In fitting, a total of 64 parameters of the 11 models were optimized (all parameters except the coefficients for peatland since none of the plots was located in peatland).

Diameter increment models
All three diameter increment models (pine, spruce and broadleaves) had 10 predictors (Table 3). Increased temperature sum increased diameter increment most in spruce and least in pine, implying that the growth difference between southern and northern Finland is the greatest in spruce. Indicator variable Peat affected diameter increment only in pine, most probably because the peat layer is often thin in spruce and broadleaf peatland forests. The reduction corresponded to the effect of one site class; for example, the growth of pine on mesic peatland was approximately the same as on sub-xeric mineral soil site (Figure 1). Site fertility affected diameter increment more in spruce than in pine.
The effects of tree size and competition on diameter increment were logical. Among broadleaved species, increasing diameter started to decrease growth later in silver birch and aspen, as compared with other species (Fig. 1). The model predicted similar growth rates for small trees of different broadleaf species, but the growth rate of species other than silver birch and aspen started to decline earlier. The coefficients for the BAL variables (BAL/ (d + 1)) show that the diameter increment of spruce increased with the increasing proportion of pine and broadleaves among larger trees. The growth of broadleaf species increased with the increasing proportion of pine. d is diameter at breast height (cm); G is stand basal area (m 2 ha −1 ); BAL is basal area in larger trees (m 2 ha −1 ); TS is effective temperature sum (degree days); herb-rich, sub-xeric and xeric are indicator variables for the respective site fertility classes; Peat is indicator variable for peatland forest; Pendula and Aspen are indicator variables for Betula pendula and Populus tremula, respectively; Sdev is standard deviation; and RMSE is the square root of the mean of squared errors.

Figure 1
Predictions of diameter increment models for pine, spruce and silver birch (temperature sum is 1100 d.d., stand basal area is 25 m 2 ha −1 and basal area in larger trees is 10 m 2 ha −1 ). 'in Pine' means that the tree (spruce or silver birch) is growing in a pine stand, 'Peat' means that the soil is peat and 'Downy' means that the prediction is for downy birch. 'Old' refers to predictions calculated with the models of Pukkala et al. (2013).

Survival models
Increasing diameter first increased survival rate, after which it decreased slowly (Tables 4 and 5, Figure 2). The survival rate was slightly lower in peatland forests in all species, as compared with mineral soil sites. Temperature sum and fertility class were not significant predictors of survival rate. Species composition had a significant effect on survival rate in such a way that the predicted mortality of spruce increased with increasing competition by spruce, and competition by pines increased the mortality of broadleaves less than competition by other species. With a certain diameter and competition (BAL), birch had a higher predicted survival rate than the other broadleaf species. Figure 2 suggests that the fixed parts of the mixed-effects models may not be plausible since they predict almost 100 per cent survival rate for all except very small trees (Figure 2, red curves). The reason for this outcome might be the combined effect of small plot size (few trees per plot) and the very small average number of dead trees in the sample plots. As a consequence, the plot factors of the mixed-effects models may explain most of the variation in survival rate. The high standard deviation of the random plot factor (Table 3) also suggests that the effect of the plot on survival rate may be much larger than the effect of the fixed model predictors. Since the use of the fixed part of the mixed-effects models seems to result in an over-estimated survival rate, also fixed-effects survival models were fitted (Table 4). They are more recommendable than the mixed-effects model when the latter models cannot be calibrated.

Ingrowth models
The ingrowth models predicted less ingrowth with increasing basal area (Table 6, Figure 3). The ingrowth of spruce and birch increased with the increasing presence of pine. The effect of species composition was substantial in spruce; at typical values of basal area, the model predicted 5-10 times higher spruce ingrowths for pine stands, as compared with pure spruce stands.
The ingrowth of pine was the highest in sub-xeric sites and the lowest in herb-rich (or better) sites, mesic and xeric (or poorer) sites being in between. The situation was very different in spruce, where sub-xeric sites had much less ingrowth than more fertile sites. Birch resembled spruce but the effect of site fertility on birch ingrowth was less pronounced.
In all species, increasing temperature sum increased ingrowth. This can be explained by more frequent seed years in the south and the slow growth rate of seedlings in the north. The effect of temperature sum was the largest in broadleaf species other than birch (Table 5).
Ingrowth is an important element of stand dynamics in all silvicultural systems, which are not based on artificial regeneration. The essential prediction is the number of surviving ingrowth trees. The mortality of small trees may be high under dense canopies of large trees, which means that many of the new ingrowth trees may soon die. The difference between predicted ingrowth and 5year mortality of the ingrowth trees is visualized in Figure 4 for different basal areas, temperature sums and site fertility classes. A typical basal area after thinning from above is often around 10 m 2 ha −1 in continuous cover management (or between 10 and 20 m 2 ha −1 on fertile sites). For a temperature sum of 1100 degree days (d.d.), the models of this study predicted net annual ingrowth rates ranging from 2 (birch on xeric site) to 50 (pine on xeric site) trees per hectare in pure stands when the basal area was 10 m 2 ha −1 (Figure 4). Ingrowth decreased with increasing basal area, most rapidly in pine and birch. In Figure 4, the net ingrowth rates of spruce in spruce stands range from 0.5 to 7.5 trees per hectare in fire years. However, the models predict that an admixture of pine would significantly increase spruce ingrowth.  d is diameter at breast height (cm); BAL is basal area in larger trees (m 2 ha −1 ); TS is effective temperature sum (degree days); Peat is indicator variable for peatland forest; Birch and Aspen are indicator variables for Betula spp. and Populus tremula, respectively.

Figure 2
Prediction of the survival models of pine, spruce and broadleaf species (temperature sum is 1100 d.d. and basal area in larger trees is 15 m 2 ha −1 ). 'Mixed' means that the prediction is calculated with a mixed-effects model (otherwise it is based on fixed-effects model), 'Peat' means that the soil is peat and 'in Pine' means that the tree is growing in a pine stand. 'Old' refers to predictions calculated with the models of Pukkala et al. (2013).

Comparison with previous models
All models exhibited fairly similar effects of tree size, species composition, competition, temperature sum and site fertility as the models of Pukkala et al. (2013). The similarity of the effect of tree size on diameter increment and survival can be seen from Figures 1 and 2 (dotted lines are predictions of the earlier model of Pukkala et al. (2013)). However, the diameter increment models predict higher growth rates than the previous model. As a difference to the previous models, the new models also describe the effect of temperature sum on ingrowth and the effect of soil type (mineral soil vs peatland forest) on diameter increment and survival.
Of the three sub-processes of stand dynamics, namely increment, mortality and regeneration (ingrowth), ingrowth is the Self-learning growth simulator G is stand basal area (m 2 ha −1 ); TS is effective temperature sum (degree days); herb-rich and sub-xeric are indicator variables for the respective site fertility classes.

Figure 3
Prediction if the ingrowth models for pine, spruce and birch (temperature sum is 1100 degree days if not indicated otherwise). '30 per cent Pine' means that 30 per cent of stand basal area is pine.
most erratic and most difficult to predict. In relascope plots with a high basal area factor, the sampling error is very high for ingrowth since one new ingrowth tree to a small diameter class corresponds to a very high number of trees per hectare. In addition, the calculated ingrowth of relascope plots may also be negative as the number of trees per hectare represented by a tree decreases when the tree grows. Despite these features of relascope plots used in this study, the average number of ingrowth was of the same magnitude in the current study and the earlier studies of Pukkala et al. (2013) and Lappi and Pukkala (2020) (Table 7). The number of pine ingrowth was higher in the current study, which may be explained by differences in the proportions of site fertility classes in the two studies. In Table 7, the difference between the measured and predicted ingrowth of this study is explained by the use of censored data.

Calibration examples
All calibration cases decreased the root mean squared error of the periodical basal area increment calculated for the plots ( Figure 5). The generic models were almost non-biased in the Spati data, and therefore, calibration had almost no effect on bias. In the Silvi data, a large positive bias (overestimated growth) turned into a smaller negative bias. Figure 6 shows the biases in basal area increment in the Spati and Silvi data when growth was simulated with the generic Forestry Figure 4 Net ingrowth predictions [(5-year ingrowth -5-year mortality of the ingrowth trees)/5] for different basal areas (G), fertility classes (mesic, sub-xeric, herb-rich) and temperature sums (800, 1100 or 1400 degree days). Table 7 Average number of ingrowth trees (number of trees that pass the 1.3-m height limit during a 5-year period) in the current study and two earlier studies.
and calibrated models. All other coefficients of the diameter increment models except the coefficients for temperature sum and peat were solved using optimization (the loss function of equation (6) was minimized). In the Silvi data, all coefficients of all models except the coefficients for peat were solved. It can be seen that the bias of the generic models increased with increasing stand basal area. Calibration removed a part of this trend, and the large positive bias of the Silvi data was greatly reduced. Figure 7 graphically depicts the generic and calibrated diameter increment models for pine and spruce. The overall shape of the models did not change much. However, the predicted level of the diameter increment of spruce was reduced significantly when the model was calibrated using the Silvi data. It can also be seen that scaling the model for pine in the Sati data had almost no effect on the model (the curves 'Generic' and 'Scaled Spati' coincide in Figure 7). However, calibrating all model parameters except the coefficient for temperature sum and peat changed the model so that the predicted growth of small trees increased and the prediction for large trees decreased.

Discussion
The models fitted in this study resemble those that have been developed for uneven-aged stands (de-Miguel et al., 2012;Pukkala et al., 2013). Model forms and predictors were selected in such a way that the models can be used in all stand structures, and they can be used to simulate transformation from even-aged to multi-layered structures. de-Miguel et al. (2012) compared even-aged (with site index based on age and dominant height used) and uneven-aged types (with site index, age and dominant height not used) of individual-tree growth models in the Pinus brutia forests of Lebanon where Self-learning growth simulator Figure 5 Calibration results. In the Spati data, three diameter increment models were scaled (3 model intercepts optimized) or fitted (26 parameters of the three diameter increment models were optimized). In the Silvi data, 10 models were scaled (10 model intercepts were optimized) or fitted (64 model parameters were optimized). In 'Scaled', the loss function minimized deviations between measured and measured species-specific stand basal areas in the second measurement, whereas deviation between measured and simulated frequencies of trees in 3-cm diameter classes were minimized in 'Fitted'. the situation is comparable to Finland in the sense that stand structures cannot be easily categorized into even-or unevenaged. Based on several criteria, de-Miguel et al. (2012) concluded that the uneven-aged modelling approach is to be preferred in structurally complex stands. These models may not be as good as the best possible models for even-aged stands because age, dominant height and site index derived from age and dominant height were not used. However, the loss in precision may be small in forestry practice, due to inaccurate measurement of age in forest inventories and the fact that many even-aged stands have been developed from released advance regeneration, which weakens the ability of age and dominant height to indicate site productivity.
The generic models developed in this study seem to predict higher growth than the earlier models (Hynynen et al., 2002;Pukkala et al., 2013). Compared with Pukkala et al. (2013), higher growth is predicted for all species but especially for large silver birches and aspens. The effect of species composition on diameter increment, survival and ingrowth is fairly similar as in Pukkala et al. (2013). As a new element, the new models include the effect of latitude (temperature sum) on ingrowth. The models predict that ingrowth decreases toward the north, most probably due to smaller seed yields and slower growth of seedlings. The effect of peatland site is included in the diameter increment model of pine and in all survival models, which is also an improvement compared with previous models (Pukkala et al., 2013).
The number of observations available for fitting the generic models is not larger than in previous studies (Hynynen et al., 2002;Pukkala et al., 2013). However, the advantage of the current modelling dataset is good geographic coverage and a high number of plots. Therefore, the models are likely to predict the Forestry Figure 6 Bias (simulated -measured) in periodical basal area increment in the Spati and Silvi datasets when diameter increment was calculated with the generic models (Tables 3, 5 and 6) or using calibrated models (denoted as 'Fitted' in Figure 5). In the Spati data, the length of the growth period was 5 years for all plots, and in the Silvi data, it ranged from 2 to 22 years. effect of location (temperature sum) on stand dynamics more reliably than the previous models.
The study also developed a model adjustment procedure that allows model calibration with empirical data. Calibration utilizes diameter distributions measured from the same plots or stands at least twice. It is noteworthy that the models for diameter increment, survival or ingrowth can be calibrated without measured diameter increment, survival and ingrowth. Only the outcome of these processes, i.e. the change in diameter distribution, needs to be measured. Counting the number of trees in different diameter classes is fast, which means that calibration data can be collected at low cost.
The degree of calibration depends on the amount and quality of the calibration data. In most cases, calibrating the diameter increment models would most probably suffice as mortality is usually low in managed forests and fitting improved ingrowth models needs large datasets due to the erratic nature of ingrowth. The simplest calibration is to just adjust the intercepts, which is recommended if the calibration dataset is small. The dataset also affects the loss function that is minimized in calibration. Minimizing the deviation between measured and simulated diameter distribution (equation 6) makes sense only if the sample plots used in calibration are large and include tens of trees.

Figure 7
Generic and calibrated models for 5-year diameter increment of Scots pine and Norway spruce. Temperature sum is 1100 d.d., stand basal area is 25 m 2 ha −1 and basal area in larger trees is 10 m 2 ha −1 .
The extent of calibration also depends on the use of the models. If the models are used in the short-term simulations of forest planning (for instance one or two decades), it is most important to calibrate the diameter increment models. Assuming that a new inventory is conducted after 10 or 20 years, the survival and ingrowth models are not critically important, not even in continuous cover forestry. They become important if the models are used in simulations and optimizations that cover the whole rotation or several decades.
As shown in the calibration examples, it is possible to calibrate only a part of the model coefficients and only a part of the models. Although the calibration procedure described here is flexible in terms of data and degree of calibration, the topic deserves further investigation. For example, Bayesian approaches should be studied in the future and well as machine learning methods other than the optimization approach used in the current study.
Model forms proposed for the modelling of count variables, such as Poisson, negative binomial and hurdle models (e.g. Zell et al., 2019;Lappi and Pukkala, 2020), were not used in ingrowth modelling. It was known, which ones of the trees of the relascope plots had passed the 1.3-m height limit during the previous 5year period, i.e. how many such trees there were in different plots. The reason for not using count modelling was the fact that relascope plots have no plot area, which is required in count modelling. In cases where there is one new tree (a tree that passed 1.3 m), it is possible to derive the plot area from the tree diameter and basal area factor. The plot area does not need to be the same in different plots since the plot area can be treated as Self-learning growth simulator an offset variable (Lappi and Pukkala, 2020). Problems arise when there is more than one new tree in the plot, and especially when there are no new trees at all. In the latter case, it is impossible to calculate any plot area within which there are no trees.
The generic models of this study were based on plots where the number of trees was low. A likely impact of this situation is that the growth and survival of the measured trees depend significantly on the surroundings of the plot. The situation can be expressed by saying that the sampling errors of the predictors, especially stand basal area and BAL, are large. As a consequence of high sampling error, the effect of basal area and BAL on diameter increment and survival rate may be underestimated. Figure 6 supports this conclusion as it shows that the bias increases with increasing basal area. In Figure 6, positive bias means overestimated growth. Figure 6 suggests that high basal areas do not decrease growth sufficiently and/or low basal areas (lack of competition) do not increase growth sufficiently. This means that the generic models should be used with caution in exceptionally sparse and dense stands.