In-situ calibration of stand level merchantable and sawlog volumes using cut-to-length harvester measurements and airborne laser scanning data

Forest management inventories assisted by airborne laser scanning (ALS) can be used to predict different forest attributes. These predictions are utilized in practical forestry, but in the case of timber assortment-specific volumes, the ALS-based predictions can be inaccurate. This causes uncertainty in harvest planning. However, ALS-based predictions can be calibrated to achieve greater accuracy with local measurements. In this study, we used ALS data and accurately positioned cut-to-length harvester measurements from Norway spruce (Picea abies (L.) Karst.) dominated clear-cuts. We fitted linear mixed-effects (LME) models with exponential correlation structure for merchantable volume and sawlog volume for 225 m² cells. Our aim was to study the effect of local harvester measurements on the accuracy of stand level merchantable and sawlog volumes. LME-based predictions were calibrated repeatedly up to 40 times as the cutting progressed. ALS data and harvester measurements were used to predict both the random effects and residual errors for each validation unit. At best, relative root mean square error (RMSE%) of initial predictions of 15.4 per cent for merchantable volume and 22.1 per cent for sawlog volume were reduced to 4.1 and 5.3 per cent, respectively, when measurements from 40 harvested cells of size 225 m² were used. These results suggest that spatially accurate harvester data could be utilized during harvesting to increase the accuracy of volume and timber assortment predictions.


Introduction
In many countries, forest management inventories assisted by airborne laser scanning (ALS) are carried out to produce predictions for various stand level forest attributes such as the total volume (Naesset, 2014). These ALS-based predictions are utilized, for example, when timber assortment volumes are predicted for the marked stands. However, for individual stands, the accuracy of such predictions may be low (Holopainen et al., 2010). Therefore, different calibration procedures based on field measurements have been proposed (Korhonen et al., 2019; Karjalainen et al., 2020; Maltamo et al., 2012). These calibrations are usually based on linear mixed-effects (LME) models that are locally adjusted by predicting the stand effects with local observations. Usually, the aim of such calibrations has been to increase the accuracy of predictions for the standing stock. More accurate pre-harvest information about, for example, the tree quality or the sawlog volume would ease the scheduling and overall planning of silvicultural and harvesting operations. With spatially or temporally correlated data, the prediction can be further improved by also predicting the residual errors of the model (Mehtätalo and Lappi, 2020), but that possibility has not been previously used in this context.
In addition to obtaining more accurate pre-harvest information, calibrations can be done also during harvesting by using the data collected by cut-to-length (CTL) harvesters. That is, the predictions for the currently uncut part of a stand are successively re-calibrated with the harvester measurements as the harvesting is progressing through the stand. A similar application was tested by Uusitalo et al. (2006), who aimed to estimate the stand diameter distribution by using the measurements of the first 30, 90 and 150 harvested trees. More accurate information about the diameter distribution could potentially be useful, for example, in the tree bucking control. According to Uusitalo et al. (2006), the reliability of predicted diameter distributions seemed to increase if harvester measurements were combined with prior information. However, without prior information, the process resulted in inaccurate predictions.

Forestry
Along with bucking optimization, accurately positioned CTL harvester data also have the potential to increase the efficiency of harvesting operations. In CTL harvesting operations, logs need to be cut into the correct dimensions during the harvesting operation. Consequently, there may be numerous different timber assortments (e.g. species- and quality-specific sawlogs, species-specific pulpwood, energy wood) that are being cut simultaneously from a stand. At the intermediate storage at the roadside, these different timber assortments are stacked in different piles because they are often transported to specific processing facilities. Sometimes, the amount of suitable space for the storage may be very limited (Metsäteho, 2010), which may require new temporary spots for log piles to be found or the logs at the roadside to be transported to a different location during harvesting. Therefore, predictions of accruals of different timber assortments during harvesting may improve the logistics.
In addition, in clear-cut stands where small diameter trees are rare and some estimate of mean diameter exists, the volume of trees is a key indicator for the productivity of a CTL harvester (Eriksson and Lindroos, 2014). Therefore, improved prediction of the total merchantable volume in a stand would also help to estimate the total time needed for harvesting. This, in turn, helps the harvesting organizations to plan and schedule consecutive operations in different stands more effectively so that unnecessary idle time of expensive machinery can be kept at a minimum. In smaller stands, the advantages of calibrations during the harvesting would be only minor, but in larger stands where the cutting takes multiple days, the work efficiency could be improved by using calibrated predictions.
The proposed in-situ calibrations are made possible by the advanced technology that is mounted on modern CTL harvesters. The computer in a harvester records, among other things, the volume and timber assortment for each log. Therefore, harvesters provide an easy way to measure attributes such as merchantable volume and sawlog volume that are time-consuming to measure manually (see, e.g. Karjalainen et al., 2019 for visual bucking). However, a pre-requisite for such calibrations that combine harvester and ALS data is that the trees can be positioned accurately, preferably with positioning errors less than 1 m. So far, the inaccuracy in the positioning of the harvested trees has limited the usability of harvester measurements together with ALS data. For example, the position for the harvested trees has sometimes been determined as the position of the harvester itself at the time of felling, which means that the movement of the boom and the harvester head has been ignored (Bollandsås et al., 2011). The large potential of spatially accurate harvester data (Lindroos et al., 2015) has sparked the development of such positioning systems that take the movement of the boom into account (Hauglin et al., 2017; Melkas and Riekki, 2017). For example, in the study by Hauglin et al. (2018), sub-metre accuracy was obtained for the harvested trees. However, systems providing sub-metre accuracies for positions of harvested trees are not commonly available yet, so more development is still needed.
The aim of this study was to test the effects of in-situ calibrations of LME models on the accuracy of stand level merchantable and sawlog volume (in cubic metres per hectare) predictions. The calibrations were carried out while the cutting was in progress in the stand in question, and they were based on ALS data and observations collected with a CTL harvester from different numbers of 15 m × 15 m (225 m²) grid cells tessellating the harvested stands. In the calibrations of LME models, both the random effects and residual errors were predicted for each validation unit. For comparison, the use of the mean values of the harvester measurements as stand level volume estimates was also evaluated.

Methods
In the current study, the pre-processed field data and ALS data as used by Karjalainen et al. (2020) were subject to analysis. The method used to acquire the data is summarized below; full details can be found in Karjalainen et al. (2020).

Study area
The study area is located in southeastern Norway (60°25′ N, 11°4′ E) in the Romerike region (Figure 1), and the forests are dominated by Norway spruce (Picea abies (L.) Karst.) (87.0 per cent of total measured merchantable volume). Scots pine (Pinus sylvestris L.) (7.3 per cent) and deciduous species (5.7 per cent) are also present in the area.

Field data and data processing
The field data were collected with a John Deere 1270E CTL harvester in 2017. The harvester was equipped with a system that provided sub-metre accuracy for the position of each harvested tree. In accordance with the StanForD2010 forest data standardization (Skogforsk, 2018), the harvester recorded the volume of all logs ('merchantable volume') and the volume of those logs that were qualified as sawlogs ('sawlog volume') for each harvested tree. The deciduous trees were not classified as sawlogs which is consistent with common practice in Norway. The data consisted of 48 clear-cut stands.
The field data included only the tree positions, not any geometries for stand borders. As our intention was to use the ALS data with the area-based approach, and to produce wall-to-wall predictions, the stands needed to be tessellated into grid cells first. We opted to use a cell size of 15 m × 15 m, which is close to the size that is commonly used in operational ALS-assisted management inventories in Norway (Naesset, 2014) and Finland (Metsäkeskus, 2021). The R software (R Core Team, 2017) was used in the process. First, we generated realistic stand borders by aggregating the point-wise tree data to polygons by creating 2-D alpha shapes using the alphahull R package (Pateiro-Lopez and Rodriguez-Casal, 2015) with an alpha value of 10. The same procedure with the same alpha value was also used in the studies of Hauglin et al. (2018) and Maltamo et al. (2019). Stand borders were necessary to determine which grid cells were 'acceptably within' the clear-cut area. A cell was 'acceptably within' if at least 215 of its 225 1-m² sub-cells intersected the underlying clear-cut area created with the alpha shape. All the accepted cells were considered equal, i.e. no weighting was applied (Karjalainen et al., 2020). We also wanted to maximize the number of cells in the data set, so in each stand, instead of accepting the first selected position of the grid, an optimal position for the grid was found through an iterative process where the number of accepted cells was maximized (Karjalainen et al., 2020). After the optimal position for the grid was found, the harvested trees were assigned to grid cells by using their harvester head-based positions. For each cell, the merchantable volume (m³ ha⁻¹) and the sawlog volume (m³ ha⁻¹) were calculated.
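The assignment of trees to cells and the scaling to per-hectare volumes can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the input layout `(x, y, volume_m3)`, the `origin` argument and the function name are hypothetical, and the 'acceptably within' test and the iterative grid positioning are left out.

```python
import math
from collections import defaultdict

CELL_SIDE_M = 15.0                           # cell side used in the study
CELL_AREA_HA = CELL_SIDE_M ** 2 / 10_000.0   # 225 m^2 = 0.0225 ha


def cell_volumes_per_ha(trees, origin=(0.0, 0.0)):
    """Sum tree volumes (m^3) per grid cell and scale to m^3 per hectare.

    trees  : iterable of (x, y, volume_m3) tuples (hypothetical layout)
    origin : lower-left corner of the grid (assumed to be already chosen)
    """
    totals = defaultdict(float)
    for x, y, volume in trees:
        col = math.floor((x - origin[0]) / CELL_SIDE_M)
        row = math.floor((y - origin[1]) / CELL_SIDE_M)
        totals[(col, row)] += volume
    return {cell: v / CELL_AREA_HA for cell, v in totals.items()}


# Three illustrative trees: two fall in cell (0, 0), one in cell (1, 0).
trees = [(1.0, 2.0, 0.45), (14.9, 14.9, 0.90), (15.0, 3.0, 0.225)]
volumes = cell_volumes_per_ha(trees)
```

The same function would be called once with merchantable volumes and once with sawlog volumes per tree.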
The entire data set consisted of 2292 cells distributed over the 48 stands. The data set was split into training and validation stands so that the 16 largest stands by area, excluding the second largest stand, were chosen as validation data (15 stands) and the remaining stands (including the second largest stand) composed the training data. Such a split of the data was adopted because the calibrations implemented in Karjalainen et al. (2020) were more reasonable for large stands than for stands that consist of only a few 15 m × 15 m grid cells. The same justification applied in this study, so we did not make any changes to the training and validation stands. The second largest stand (285 cells) was included in the training data so that the model fitting also had observations from one of the large stands. The smallest validation stand consisted of 41 cells (0.92 ha), whereas the largest validation stand consisted of 453 cells (10.19 ha). Mean values for the training stands and stand-specific information for the validation stands are provided in Table 1. For detailed information about the cells within the training and validation stands, see Table 1 in Karjalainen et al. (2020).

ALS data
The ALS data were acquired in 2013 using a Leica ALS70 instrument operated from an average flying altitude of 3000 m above ground level. The pulse repetition frequency was 104.6 kHz, and the resulting point density was about 0.7 points m⁻² at ground level. The classification algorithm proposed by Axelsson (1999) was used to separate ground hits and vegetation hits. Moreover, as the ALS instrument could record multiple echoes for each pulse, two echo categories were constructed and used in the analysis, namely first (first of many + only) and last (last of many + only) echoes. Finally, the ALS echoes were extracted for each cell and the ALS metrics were calculated separately for both echo categories using the LASmetrics function in the rLiDAR package (Silva et al., 2017). The derived ALS metrics included all the common metrics such as the maximum, median, mean, standard deviation and the variance of heights of the ALS echoes. In addition, different height percentiles (1, 5, 10, ..., 90, 95, 99) were calculated.

LME models and their calibration
LME models were used as they are able to take the hierarchical structure of the data (i.e. grid cells within stands) into account. Furthermore, LME models also allowed the models to be calibrated for individual stands. The optimal structure (including the predictors, random part, variance function and correlation structure) of the models for both merchantable and sawlog volumes was carefully examined using the nlme package (Pinheiro et al., 2019) in the R software (R Core Team, 2017). See the Appendix for a formal definition of the model. For merchantable and sawlog volumes, the same univariate model forms as were constructed by Karjalainen et al. (2020) were also used in the current study, with the exception that for both models, the general correlation structure was changed to a better justified exponential spatial correlation structure, which also allows more efficient use of the calibration measurements through prediction of the residual errors.
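To make the exponential spatial correlation structure concrete, the following Python sketch computes the correlation between residuals of cells as a function of the distance between cell centres. It assumes the form exp(-d / range) without a nugget effect; whether a nugget was used in the fitted models is not stated here, and the range value and cell centres below are hypothetical.

```python
import math


def exp_corr(distance_m, corr_range_m):
    """Exponential correlation without a nugget: exp(-d / range)."""
    return math.exp(-distance_m / corr_range_m)


def corr_matrix(centres, corr_range_m):
    """Correlation matrix of residuals for cells with the given centres."""
    return [[exp_corr(math.dist(p, q), corr_range_m) for q in centres]
            for p in centres]


# Centres of three cells in one row; the second is adjacent to the first,
# the third lies three cell widths away (coordinates in metres).
centres = [(7.5, 7.5), (22.5, 7.5), (52.5, 7.5)]
R = corr_matrix(centres, corr_range_m=30.0)  # hypothetical range parameter
```

The correlation decays smoothly with distance, which is why observed residuals from harvested cells are most informative for nearby uncut cells.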
Local observations make it possible to predict the random effects for all the responses, and thus, to calibrate the models in each stand. In addition, the residual errors can also be predicted by utilizing the spatial autocorrelation of the data (Mehtätalo and Lappi, 2020). If the residual errors are spatially correlated, then their prediction should increase the accuracy of predictions for the cells that are located close to the calibration cells. Three different components can be summed up to obtain total predictions: (1) predictions based on the fixed part of the model, (2) the predicted random effects and (3) the predicted residual errors. When local observations are not available, only the fixed part of the model can be used. In this study, calibrations were based on the harvester measurements from the grid cells. Predictions were always updated after a new 15 m × 15 m grid cell was clear-cut. By using the observed merchantable and sawlog volumes and the known ALS metrics, random effects for the entire group (i.e. all the cells within the same stand) could be predicted by using the Estimated Best Linear Unbiased Predictor (EBLUP). The observed residuals of the harvested cells were utilized to predict the residual errors for the uncut cells in the stand. The whole calibration procedure is described in detail in the Appendix.

Table 1 notes: n cells = number of the cells, Cell area (ha) = the total area covered by the cells, Stand area (ha) = the total area of the stand created with the alpha shape, Merch. = merchantable, V = volume, T = training stands.
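The three-component prediction can be sketched in Python under simplifying assumptions that are not taken from the paper: a single random stand intercept, homoscedastic residuals (the power-type variance function is ignored) and exponential correlation without a nugget. All function names, parameter names and values are hypothetical; the authors' actual procedure is the one described in their Appendix.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def calibrate(fixed_obs, y_obs, xy_obs, fixed_new, xy_new,
              var_stand, var_resid, corr_range):
    """Return (predicted stand effect, calibrated predictions for uncut cells)."""
    def corr(p, q):
        return math.exp(-math.dist(p, q) / corr_range)

    n = len(y_obs)
    # Covariance of the harvested cells: stand effect + correlated residuals.
    v = [[var_stand + var_resid * corr(xy_obs[i], xy_obs[j])
          for j in range(n)] for i in range(n)]
    # w = V^-1 (y - fixed part), shared by both predicted components.
    w = solve(v, [y - f for y, f in zip(y_obs, fixed_obs)])
    stand_effect = var_stand * sum(w)                  # component (2)
    predictions = []
    for f, p in zip(fixed_new, xy_new):
        resid = var_resid * sum(corr(p, q) * wi        # component (3)
                                for q, wi in zip(xy_obs, w))
        predictions.append(f + stand_effect + resid)   # (1) + (2) + (3)
    return stand_effect, predictions


# One harvested cell and two uncut cells: one adjacent, one far away.
stand_effect, preds = calibrate(
    fixed_obs=[260.0], y_obs=[300.0], xy_obs=[(7.5, 7.5)],
    fixed_new=[250.0, 270.0], xy_new=[(22.5, 7.5), (10000.0, 7.5)],
    var_stand=100.0, var_resid=300.0, corr_range=30.0)
```

In this sketch the far-away cell receives essentially only the stand effect, whereas the adjacent cell also receives part of the observed residual.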

Selection of calibration cells
The sizes of the validation stands ranged from 41 to 453 cells. We wanted every validation stand to be included for each number of calibration cells, so the number of calibration cells was limited by the size of the smallest stand (41 cells). We opted to test the calibration by using 1 to 40 cells in each stand. For the smallest stand, this meant that almost the entire stand was already clear-cut after the 40th cell was harvested. In the largest stand, 40 cells covered only 8.8 per cent of the total area.
Two different strategies for choosing the calibration cells were tested: (1) using the actual cutting order based on the time stamp of each harvested tree and (2) using a strategy by which one main strip road is first harvested, and then subsequently, the rest of the area is harvested by creating smaller loops from the main strip road. In the current study, we refer to this latter strategy as the 'backline method'. The main advantage of the backline method in practical forestry is that it directs most of the forest machine traffic onto a main strip road placed on sturdy ground, whereas more vulnerable and soft soils are less trafficked. This method is quite commonly used in operational harvesting, for example, in Finnish forestry (Uusitalo, 2003). Here, the cells for the backline method were subjectively chosen so that the cells composed a clear main strip road along the longest axis, and possibly some loops until the threshold of 40 cells was reached. The backline method was included in the study to provide a plausible comparison for the actual cutting order. It appeared that in most of the stands, the actual cutting order moved forward rather systematically from one corner of the stand to the remaining areas without any clearly observable main strip road. The two strategies for different cutting orders are illustrated in Figure 2. Note that the results of this study were obtained with only one set of calibration cells for both strategies. Every time the model was re-calibrated, predictions for all the cells in the stand were updated, that is, the predictions for the already harvested cells also changed. Our interest was to improve the predictions of stand level merchantable and sawlog volumes as the clear-cutting progressed. Therefore, it was justified to fix all the predictions of the already harvested cells to the correct values that were observed with the harvester. Consequently, with this approach, the errors of predictions would have eventually become zero if the re-calibration had continued until the stand had been completely harvested. For one of the small stands (stand #15), this nearly happened, as only a single cell remained uncut.
In addition to the calibration that utilizes the ALS data, we also calculated the results by determining the stand level merchantable and sawlog volumes directly as the observed mean of the harvested cells. Similarly, the predictions were always updated after a new cell was completely harvested.
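The two stand level estimators can be contrasted with a short Python sketch. It assumes equal-area cells, so that the stand level volume per hectare is the plain mean over cells; the dictionaries keyed by cell id and all values are hypothetical.

```python
def stand_volume_calibrated(observed, predicted, harvested):
    """Mean over all cells: observed values for harvested cells,
    calibrated model predictions for the uncut cells."""
    values = [observed[c] if c in harvested else predicted[c]
              for c in predicted]
    return sum(values) / len(values)


def stand_volume_harvester_mean(observed, harvested):
    """Mean of the harvester measurements from the harvested cells only."""
    return sum(observed[c] for c in harvested) / len(harvested)


# Four cells (m^3 per ha); cells 1 and 2 have been harvested so far.
predicted = {1: 300.0, 2: 280.0, 3: 320.0, 4: 260.0}
observed = {1: 310.0, 2: 250.0}
harvested = {1, 2}

calibrated = stand_volume_calibrated(observed, predicted, harvested)
harvester_only = stand_volume_harvester_mean(observed, harvested)
```

Once every cell is harvested, both estimators coincide with the observed stand mean, which is why the prediction error would reach zero at the end of the cutting.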

Accuracy assessment
Our interest was in stand level predictions as the clear-cutting progressed. Therefore, the cell level predictions were aggregated to stand level prior to accuracy assessment. The accuracy of stand level predictions was assessed using empirical relative root mean squared error (RMSE%) (Eq. 1) and relative mean difference (MD%) (Eq. 2).

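\[ \mathrm{RMSE\%} = 100 \times \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}}{\bar{y}} \qquad (1) \]

\[ \mathrm{MD\%} = 100 \times \frac{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)}{\bar{y}} \qquad (2) \]

where n is the number of stands in the dataset, y_i is the observed merchantable/sawlog volume for stand i, ŷ_i is the predicted merchantable/sawlog volume for stand i and ȳ is the measured mean of the merchantable/sawlog volume in the dataset. In addition, plots showing the error of predictions on the y-axis and the relative proportion of the harvested stand on the x-axis were visually inspected.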
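A minimal Python sketch of the two accuracy measures follows. It assumes the conventional definitions, with the difference taken as observed minus predicted (the sign convention used in the error plots); the function names and the example values are hypothetical.

```python
import math


def rmse_pct(observed, predicted):
    """Relative root mean squared error, in per cent of the observed mean."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mse = sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n
    return 100.0 * math.sqrt(mse) / mean_obs


def md_pct(observed, predicted):
    """Relative mean difference (observed - predicted), in per cent."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_diff = sum(y - p for y, p in zip(observed, predicted)) / n
    return 100.0 * mean_diff / mean_obs


# Stand level volumes (m^3 per ha) for three illustrative stands.
obs = [300.0, 200.0, 250.0]
pred = [310.0, 190.0, 250.0]
rmse = rmse_pct(obs, pred)
md = md_pct(obs, pred)
```

With this sign convention, a negative MD% indicates overestimation of the stand level volume.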

Results

LME models
The univariate models that were constructed for the multivariate model system in Karjalainen et al. (2020) were used in the current study with the exception that exponential spatial correlation structure was now applied. Fixed effects for the univariate models are provided in Table 2. Variances for the random parts of the models and the estimated parameters for the power-type variance function and the exponential spatial correlation structure are provided in Table 3.

Calibration of merchantable and sawlog volumes
The RMSE% and MD% values of predicted merchantable and sawlog volumes for actual cutting order and the backline method are illustrated in Figure 3. The corresponding RMSE% and MD% values for the cases where 1, 10, 20, 30 and 40 cells were used to calibrate the predictions are provided in Tables 1 and 2 in Supplementary data for merchantable volume and sawlog volume, respectively. The RMSE% and MD% values of predictions clearly improved due to calibration. For the LME model, the calibration that also utilized the prediction of residual errors proved to be more accurate than the calibration based on the prediction of random effects only. At best, the RMSE% values of LME-based merchantable and sawlog volume predictions decreased from the initial 15.4 and 22.1 per cent to 4.1 and 5.3 per cent, respectively, when 40 cells were used in the calibration.
In most of the cases, the calibration based on both ALS and harvester data resulted in smaller RMSE% values than the calibration based on the harvester data alone. The only exception was the prediction of merchantable volume with the backline method. In that case, the calibration based only on the harvester data was eventually slightly more accurate than the LME-based calibration that also utilized the prediction of residual errors. When fewer than 16 cells were used in the calibration, the LME-based calibration was clearly more accurate in this case as well.

The lines illustrating the actual cutting order and backline method were somewhat similar in Figure 3, especially in the case when both the harvester and ALS data were used. If only harvester data were used, then there were some clear differences especially in the MD% values: the backline method overestimated both the merchantable and sawlog volumes, particularly in the beginning of the clear-cut. However, after 40 cells were harvested, the accuracies were mostly greater with the backline method than with the actual cutting order. In particular, when the residuals were also predicted, the backline method resulted in more accurate predictions in all the cases except when n = 1 and n = 17 for merchantable volume, and n = 1 for sawlog volume (not shown). With 40 harvested cells, the differences in RMSE% values were greater than 1 percentage point for both responses.
The stand-specific results for the calibration of merchantable volume are provided in Figure 4 for the actual cutting order and in Figure 5 for the backline method. For actual cutting order (Figure 4), the calibration of the LME model with measurements from 40 grid cells improved the predictions in all stands, except for stands #1 and #4. Especially in stand #1, the calibration of the LME model was clearly disturbed by a single extraordinary cell: the use of cell #32 shifted the LME-based merchantable volume prediction from about 291 to 340 m³ ha⁻¹ (the observed value was 295 m³ ha⁻¹). However, the curve showing the mean of the harvester measurements also started to recede from the zero line after about 10 harvested cells, indicating that thereafter merchantable volumes of harvested cells were clearly smaller than the actual mean of the stand. In stand #4, on the other hand, the accuracy of the LME-based calibrations decreased rather constantly as the clear-cutting progressed. Presumably, the calibration cells used were not representative of the entire stand #4.
For the backline method (Figure 5), extraordinary cells were avoided in stand #1 and the errors of the calibrated LME-based predictions were always close to zero. On the other hand, in stand #12, there were now some clearly visible fluctuations in the predictions when about 15 cells were used in the calibration. Otherwise, the curves of ALS-based predictions were similar between the cutting approaches. Overall, regardless of cutting approach, it appears that on stand level, the calibrations based on the mean of the harvester measurements resulted in greater variation than the calibration of the LME model as the cutting progressed, as expected by the theory (see Mehtätalo and Lappi, 2020, section 5.4.4).
When predicted volumes were compared with observed volumes in individual stands instead of calculating RMSE% and MD% values over all 15 validation stands, it turned out that the differences between cutting approaches were not as clear anymore. In fact, with 40 harvested cells, it appeared that the backline method was more accurate in just 8 of 15 stands for both response variables (in stands #1, 4, 5, 9, 11, 12, 13 and 14 for merchantable volume and stands #1, 2, 3, 4, 5, 11, 13 and 15 for sawlog volume, respectively). There were no clear trends with respect to the stand size, that is, the actual cutting order resulted in greater accuracy in some large stands as well as in small stands. The average cell number of those stands in which actual cutting order was more accurate than the backline method was about 97 and 74 cells for merchantable volume and sawlog volume, respectively.

Figure 4 The change in stand-specific errors (observed − predicted volume at stand level) of merchantable volume predictions v. the proportion of the harvested area of the total stand area. Actual cutting order. Solid black line = calibration of LME model with both the predicted random effects and residual errors; dashed grey line = mean of harvester measurements.

Figure 5 The change in stand-specific errors (observed − predicted volume at stand level) of merchantable volume predictions v. the proportion of the harvested area of the total stand area. Backline method. Solid black line = calibration of LME model with both the predicted random effects and residual errors; dashed grey line = mean of harvester measurements.

Discussion
The aim of this study was to test whether calibrations based on ALS data and harvester data from 1 to 40 grid cells of size 15 m × 15 m could be used to improve the predictions of stand level merchantable and sawlog volumes, while the cutting is in progress in the stand in question. The results showed that for both response variables, notable increases in accuracy can be expected as the clear-cutting progresses. Therefore, the proposed calibration procedure could be useful in practice. However, positioning systems providing sub-metre accuracies for the positioning of harvested trees would be required to allow practical implementation of the application examined in this study.
There were some restrictions associated with the data and methods used in this study that may influence the extent to which the results can be generalized. First, the results were calculated with just a single set of grid cells in the calibration and for a restricted number of stands. Allocating other grid cells and in a different order might alter the results. Likewise, stands with properties different from those we analyzed may yield different results. In addition, the harvester data only provided information regarding categorization into sawlogs or non-sawable logs. These non-sawable logs included both pulpwood and energy wood, so we were not able to distinguish the energy wood and pulpwood logs from each other. Moreover, local aerial images were not available, so we were not able to make any species-specific predictions for each timber assortment. Thus, a more thorough analysis of the need for storage space for log piles would require species-specific and timber assortment-specific predictions. Nevertheless, based on the results of this study, it is likely that at least some increase in the accuracies of species-specific and timber assortment-specific predictions would also be obtained if such data were available.
In Karjalainen et al. (2020), it was noted that the residuals and the random effects of the merchantable and sawlog volume models are correlated in the dataset. Thus, more information would have been obtained if the calibration had also utilized the cross-model correlation. However, using cross-model correlation is expected to provide only minor improvement compared with the individual model-specific calibration when both responses have been observed for all calibration cells. In addition, fitting of a spatially autocorrelated multivariate mixed-effects model is a complicated task and to the best of our knowledge, such a model cannot be fitted straightforwardly with currently available software. Our available resources did not allow us to fit such a model for this study. Therefore, in this study, we opted for keeping the two models separate instead of combining them into a spatially autocorrelated seemingly unrelated model system. Nevertheless, fitting and calibrating a spatially autocorrelated multivariate mixed-effects model is an interesting topic for a future study.
In general, the utilization of ALS data in the calibration resulted in greater accuracies compared with the mean of harvester measurements (Figure 3). However, on stand level (Figures 4 and 5), the more accurate calibration method varied between stands. One explanation for the generally better performance of ALS-based calibrations could be that they are not that sensitive to extremely small or large measured volumes because the predictions are shrunk towards the marginal mean given by the fixed part of the model. If the volume of a cell is small, then the initial LME-based prediction should also be small, and the residual can be either positive or negative, shifting the calibrated predictions in either direction. In the case of harvester measurements, a cell with small volume automatically reduces the predicted stand level volume, so such an occurrence has a greater effect. The difference in accuracies between calibration methods is especially pronounced in the beginning of a clear-cutting. As assumed, the performance of the calibrations based only on the mean of the harvester measurements improved as more cells were harvested. In some stands, it happened to result in greater accuracies than the calibrations of LME models.
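The shrinkage can be illustrated with a simplified Python sketch. It assumes a random-intercept model with independent, homoscedastic residuals, which is a textbook special case and not the spatially correlated model fitted in this study; the variance values are hypothetical.

```python
def shrinkage_weight(n_cells, var_stand, var_resid):
    """Weight given to the mean observed residual when predicting the
    stand effect (random intercept, independent residuals assumed)."""
    return n_cells * var_stand / (n_cells * var_stand + var_resid)


def predicted_stand_effect(residuals, var_stand, var_resid):
    """Predicted stand effect: shrunken mean of (observed - fixed part)."""
    n = len(residuals)
    return shrinkage_weight(n, var_stand, var_resid) * sum(residuals) / n


# Hypothetical variance components: between stands 100, residual 300.
weights = {n: shrinkage_weight(n, 100.0, 300.0) for n in (1, 3, 40)}
effect_one_cell = predicted_stand_effect([40.0], 100.0, 300.0)
```

With a single harvested cell, only a fraction of its deviation from the fixed part is attributed to the stand, so one extreme cell moves the calibrated prediction far less than it moves the plain harvester mean. The weight approaches 1 as more cells are harvested.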
We tested two different strategies to choose the 1 to 40 cells for the calibration. From a practical point of view, the cells derived by the actual cutting order were probably the more interesting ones as they described how the cuttings are carried out in practice in Norway. However, there may be differences in practices between countries and, for example, in Finland, the presented backline method could also be applied in stands where the soil bearing capacity varies within the stand, steering the main forest machine traffic onto a main strip road placed on robust ground. Therefore, the comparison of the two strategies was also interesting. In the calibrations based only on the harvester data, especially the MD% values differed between the cutting orders. The result that the backline method was prone to overestimate both the merchantable and sawlog volume predictions in the beginning of the clear-cut was logical because the cells for the backline method were chosen from the central part of the stand. That is, the cells near the stand borders with presumably different volumes were not represented in the sample. Consequently, stand level volumes were overestimated.
The results with 40 harvested cells were mostly more accurate with the backline method. This was logical as well, because instead of observing 40 units from only one part of the stand, the entire stand was now better represented with the backline method as the strip road travelled through the stand along the longest axis. This point was also important in the case of ALS-based predictions with the predicted residuals: with the backline method, the harvested cells were distributed at greater distances from each other, and therefore, a greater portion of the predicted residuals for the uncut cells were non-zero. As the data were spatially autocorrelated, at least the signs of the predicted residuals were mostly correct, and therefore, the accuracies of total predictions increased. To our knowledge, this is the first study in which the residual errors were also predicted as a part of the calibration of an LME model. In general, the acquisition of tree data that allow the spatial autocorrelation of residuals to be modelled has been challenging, so there are not yet that many studies on the topic (Breidenbach et al., 2016; Mauro et al., 2017).
Another interesting result was that the sawlog volume predictions were clearly improved with the calibrations, resulting in an unforeseen level of accuracy. However, a prerequisite for improved predictions was that numerous local measurements were carried out. With only the fixed part of the model, the RMSE% was about 22 per cent, so local measurements are crucial. The obtained results are in line with Karjalainen (2020) who concluded that for RMSE% values notably smaller than 20 per cent some stand-specific auxiliary information should be collected, preferably underneath the canopy. This is because the correlation between ALS data and the defects causing sawlog reduction is negligible (see, e.g. Karjalainen, 2020, for requirements for sawlogs). On the other hand, for example due to properties of the site, the tree quality seems to correlate strongly within a stand, which allows effective calibration of stand-level sawlog volume.
The proposed calibration approach could potentially also be used to improve the ALS-based diameter distribution prediction of a stand as the cutting progresses. By knowing the diameter distribution of the remaining trees more accurately, bucking of logs could possibly be optimized during the clear-cut to better match the demands of sawmills (Uusitalo et al., 2006). Uusitalo et al. (2006) reported that when the first harvested trees were combined with prior information, the reliability of diameter distribution predictions was increased. In the proposed calibration approach, such prior information about stand composition can be derived from ALS data. In addition, the quality of trees, which has a great effect on the accruals of different timber assortments, can be considered with the harvester measurements. Calibration of diameter distributions is another interesting topic for a future study. Overall, the results of this study indicate that spatially accurate harvester data have great potential for improving the performance of harvest operations in practical forestry.

Conclusion
Harvester measurements allow the calibration of ALS-based sawlog and merchantable volume models while the harvesting is in progress. In this paper, the harvester-based in-situ calibrations increased the accuracies of predictions notably, indicating the potential usefulness of the developed methodology in harvest management of individual clear-cuts. In many cases, the utilization of ALS data in the calibrations resulted in more accurate predictions than the mean of the harvester measurements only. In addition, in the calibration of LME models, the prediction of residual errors (allowed by spatial autocorrelation of the data) increased the accuracies further compared with prediction of random effects only. The chosen strategy for how the clear-cutting is started may affect how effectively the predicted residuals can be utilized.

Data availability statement
The data underlying this article were provided by the Norwegian University of Life Sciences by permission. Data will be shared on reasonable request to the authors from the Norwegian University of Life Sciences.

Supplementary data
Supplementary data are available at Forestry online.
The vector $y_i$ includes the observed merchantable volumes and the matrix $X_i$ the predictors for the $p$ harvested grid cells. $\hat{\beta}$ is the same as earlier in the first component of Eq. A4. Employing the previously described vectors and matrices in the EBLUP equation results in the prediction $\hat{b}_i$ for the stand effect $b_i$. The vectors and matrices in the third component $w_i' R_i^{-1} (y_i - X_i \hat{\beta} - Z_i \hat{b}_i)$ are already described, except for $w_i'$, which includes the covariances between the target cell 0 and the other observations in group $i$. The covariances in $w_i'$ are calculated in the same way as in matrix $R_i$. Predictions for each cell were always updated after a new cell was harvested. For example, when the third cell from a stand was clear-cut, the prediction for the random effect $\hat{b}_i$ was recalculated using measurements from all three harvested cells instead of only the two previous ones. In addition, the predictions for the residual errors were also recalculated separately for each cell in the stand. The closer a cell was located to the last harvested cell, the more its predicted residual error was affected.
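The calibration step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a random-intercept model (so $Z_i$ is a column of ones), an exponential covariance of the form $\sigma^2 \exp(-d/\phi)$ between cell centres, and already estimated parameters. The function names `exp_cov` and `calibrate`, the parameter names, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def exp_cov(coords_a, coords_b, sigma2, phi):
    """Exponential covariance sigma2 * exp(-d / phi) between two sets of cell centres.

    The functional form and the range parameter phi are assumptions of this sketch.
    """
    d = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=2)
    return sigma2 * np.exp(-d / phi)


def calibrate(X_h, y_h, coords_h, X_t, coords_t, beta, sigma2_b, sigma2_e, phi):
    """Predict volumes for target (uncut) cells from p harvested cells of one stand.

    Returns (predictions, b_hat, e_hat): the fixed part plus the predicted
    stand effect plus the predicted residual error of each target cell.
    """
    p = len(y_h)
    Z = np.ones((p, 1))                              # random intercept (assumed)
    D = np.array([[sigma2_b]])                       # variance of the stand effect
    R = exp_cov(coords_h, coords_h, sigma2_e, phi)   # residual covariance R_i
    V = Z @ D @ Z.T + R                              # marginal covariance of y_i
    r = y_h - X_h @ beta                             # y_i - X_i beta
    b_hat = (D @ Z.T @ np.linalg.solve(V, r)).item()
    # Rows of W play the role of w_i' for each target cell.
    W = exp_cov(coords_t, coords_h, sigma2_e, phi)
    e_hat = W @ np.linalg.solve(R, r - Z[:, 0] * b_hat)
    return X_t @ beta + b_hat + e_hat, b_hat, e_hat


# Hypothetical example: one harvested cell, two uncut cells with the same predictors.
beta = np.array([50.0, 2.0])                         # made-up fixed effects
X_h = np.array([[1.0, 10.0]])
y_h = np.array([80.0])
coords_h = np.array([[0.0, 0.0]])
X_t = np.array([[1.0, 10.0], [1.0, 10.0]])
coords_t = np.array([[15.0, 0.0], [3000.0, 0.0]])    # adjacent cell and a distant cell
pred, b_hat, e_hat = calibrate(X_h, y_h, coords_h, X_t, coords_t,
                               beta, sigma2_b=4.0, sigma2_e=6.0, phi=30.0)
print(pred, b_hat, e_hat)
```

In this toy setting the predicted residual of the adjacent cell keeps the sign of the observed residual, whereas that of the distant cell shrinks towards zero, so only the stand effect contributes to its calibration. Re-running `calibrate` with one more row in `X_h`, `y_h` and `coords_h` after each harvested cell mimics the sequential updating described above.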