Improved accuracy of aboveground biomass and carbon estimates for live trees in forests of the eastern United States

Virginia Tech, Forest Resources and Environmental Conservation, 319 Cheatham Hall, Blacksburg, VA 24060, USA
University of Maine, School of Forest Resources, 5755 Nutting Hall, Orono, ME 04469-5755, USA
American Forest Management, Inc., 8702 Red Oak Blvd., Suite C, Charlotte, NC 28217, USA
Michigan State University, Department of Forestry, 480 Wilson Rd, East Lansing, MI 48824, USA
USDA Forest Service, Northern Research Station, 11 Campus Blvd., Suite 200, Newtown Square, PA 19073, USA
US Forest Service, Southern Research Station, 1710 Research Center Drive, Blacksburg, VA 24060-6349, USA


Introduction
Accurate estimates of forest biomass and carbon (C) stocks at regional to national scales are needed in policymaking and research related to the role of forest ecosystems in the global C cycle and related efforts to mitigate greenhouse-gas emissions through increasing forest C sinks. For example, reporting of forest C stocks is an administrative requirement for participants in the United Nations Framework Convention on Climate Change (IPCC, 2006; Tomppo, 2009). In countries where national forest inventories (NFI) are maintained, year-to-year differences in forest biomass and C stocks are used to estimate C flux and sequestration rates on forest lands (Dunger et al., 2012; Fang et al., 2014; U.S. EPA, 2014). To meet this requirement with maximum accuracy, ongoing efforts focus on testing and improving procedures used in estimating forest C stocks and their year-to-year differences for estimating C fluxes (Smith et al., 2013; Westfall et al., 2013; Woodall, 2012).
Forests in the eastern United States (US) currently store an estimated 8.5 Pg of C (1 Pg, or petagram, is equivalent to 10^15 g or 1 billion (10^9) metric tons) in the aboveground components of live growing stock (Miles, 2016), roughly 20 per cent of all C stored in living and dead plants and soil organic matter in forests of the conterminous US. Data collected and maintained by the Forest Inventory and Analysis (FIA) program of the US Forest Service provide the basis for C stock estimates across a range of spatial scales (Domke et al., 2014). Forest vegetation surveys conducted by FIA personnel provide the statistical means for unbiased estimation of many forest resource attributes (McRoberts et al., 2010). In addition, allometric models of individual-tree volume and biomass contents are applied to field measurements, especially in the estimation of the aboveground components of live trees. Consequently, considerable accuracy is required in the individual-tree volume and biomass models used in national-scale forest biomass and C stock estimation procedures (Domke et al., 2012; Duncanson et al., 2015).
Because many NFIs were designed with the estimation of growing stock volumes as a primary goal, estimates of biomass and C stocks are often obtained using secondary conversion and expansion functions known as biomass conversion and expansion factors (BCEF), or the somewhat simpler biomass expansion factors (BEF) (Soares and Tomé, 2012). The term BEF is used differently in some contexts; however, a widely adopted convention defines BEF as a unitless number representing the ratio of the biomass contained in the merchantable stems of trees to the same trees' aboveground biomass (AGB) (IPCC, 2006; Skovsgaard and Nord-Larsen, 2012). BEFs may be defined on either per-tree or per-unit-area bases, and these bases are sometimes used interchangeably or left unspecified. BCEF is equivalent to BEF times wood density defined on a dry weight to green volume basis; therefore, BCEF can be used to determine AGB directly from stem wood volumes, typically merchantable volumes (IPCC, 2006).
BEFs are a type of component ratio (CR), a quantity that expresses the biomass contents in one component of a tree, e.g. stem, crown or foliage, in proportion to the tree's total biomass or AGB (Jenkins et al., 2003). CRs, along with wood density or specific gravity (SG; dry weight per unit of green volume, expressed in relation to density of water at 4 °C) values, are used in the component ratio method (CRM) to calculate AGB from stem volumes of individual trees using measurements, e.g. stem diameter, collected in the US NFI (Heath et al., 2009; Woodall et al., 2011). The combined use of wood density information and stem wood CRs provides a means to ensure consistency between estimates of wood volume and AGB in NFIs (Domke et al., 2012).
Despite the extensive body of knowledge that contributes to the workings of the CRM, comprehensive evaluation of the method's performance to date has been limited by a paucity of data suitable for such a task (Domke et al., 2012; MacLean et al., 2014; Westfall, 2012; Zhou and Hemstrom, 2009). To address this need, two objectives were pursued here. The first involved assessing accuracy of CRM predictions using a large collection of tree biomass data and related measurements compiled from legacy sources. The second objective was aimed at testing alternative models to reduce uncertainties in CRM-based predictions and examining the alternative models' effects on AGB estimates for eastern US forests. By design, the CRM provides consistent estimates of stem wood volume, tree biomass components, and AGB obtained by summing the AGB components; accordingly, alternative models for each of the constituent equations or their fitted coefficients were examined for their ability to improve AGB prediction accuracy.

Materials and methods
The geographic scope of the study was limited to 33 states in the US that lie completely east of 100° west longitude, excepting Texas and Oklahoma, which were included in the study despite ranging west of the 100th meridian. These states are grouped into the following four FIA regions where different gross cubic volume models and coefficients are assigned for AGB determination: Central States; Lake States; Northeastern States; and Southern States (Figure 1).
A large collection of 'legacy' tree measurements was compiled for this work from studies conducted over the past 115 years in which various aboveground attributes were observed on standing or felled trees, including volumetric, weight, and basic wood properties measurements (Radtke et al., 2016). Some harmonization of legacy measurements from different studies was necessary, mainly where stump heights and top-diameter measurement limits differed among studies. Details of the data and methods used to harmonize their measurements are arranged into three general categories: (1) data used to determine stem volumes; (2) weight measurements from tree components and AGB; and (3) wood or bark properties used to convert volumetric measurements to stem weights.

Stem volume data
The largest and most extensive collection of data compiled for this work included stem profile or taper measurements (paired height and diameter measurements from base to the tip of individual tree stems) drawn from 63 published and unpublished studies (Supplementary data, Table A1). Taper measurements from legacy trees allowed for the determination of individual tree stem volumes based on a specified stump height and diameter at the top of the usable portion of the stem. The CRM standard 'merchantable' stem volume specification refers to inside-bark (ib) volume from a 0.305 m (1 ft) stump to a 10.2 cm (4 in) outside-bark (ob) top diameter. This standard is widely used for forest inventories and assessments aimed at wood fibre production in the US and was thus adopted by Jenkins et al. (2003) in their development of national scale stem biomass equations. The same trees were used in assessing an alternative merchantability standard adopted here that includes the whole stem, i.e. the ib volume from groundline to the tip of the main stem. In order to be useful in this work, any trees included in the volume data collection were required to include measurements of both diameter at breast height (dbh; 1.37 m) and total tree height.
The most useful taper measurements for our purposes included both ib and ob diameter measurements at multiple heights up the stem; however, some legacy data (especially those measured on standing trees using optical dendrometers) only included ob diameters. For data from two such studies (Martin, 1980; Westfall and Scott, 2010), ib diameters were predicted from the ob taper measurements given the availability of suitable bark-thickness models for a number of northeastern US tree species (Hilt et al., 1983; Li and Weiskittel, 2011; Wingerd and Wiant, 1982). In total, 76 610 trees from 80 species or species groups were available for volume and taper determination, with an average of 14 taper measurements per tree from stump to merchantable top (Supplementary data, Table A2; see also Table A2 for all species names used in this work).

Figure 1 Areas where FIA regional volume equations are adopted for use in the CRM.

Aboveground biomass data
Data were also compiled from past studies where aboveground component biomass had been observed (Supplementary data, Table A3). Because of the importance of AGB in the CRM framework, the most useful observations came from trees that had been felled and weighed or otherwise measured to obtain estimates of AGB, including stem wood and bark, branches and foliage. Weights of dead branches were excluded where they were separately recorded. Unless otherwise noted, recorded aboveground dry weights were assumed to have included no dead branches. Trees with any recorded hollow or decayed stem sections were omitted. Despite missing one or more aboveground component measurements, trees from a number of studies were usable for evaluating a subset of CRM-related biomass attributes, e.g. stem wood, stem bark, branches or foliage. Depending on how components were defined for any particular study, it was sometimes possible to obtain only a subset of the CRs defined in the CRM. For example, if an author defined the crown as a component, making no distinction between branches (i.e. branch wood and bark) and foliage, the measured AGB and stem wood and bark components were still used in model testing and revision. A total of 6617 trees from 31 species or genera were included in the AGB and component model test data set (Supplementary data, Table A4).

Wood and bark properties
Measurements of wood and bark properties were compiled from subsets of records in the stem volume and biomass data sets described above. Stem bark:wood volume ratios were calculated from ib and ob stem volumes as defined by Miles and Smith (2009). Profile data from 71 907 trees in 76 species groups having ib and ob stem taper measurements were used for this purpose (Supplementary data, Table A5). Bark and wood basic SG were calculated from legacy tree records having both dry weight and green volume measurements of wood and bark. These were generally available only on the subset of legacy records in 33 species groups that had cross-sectional disks cut, measured for volume, and dried before weighing (Supplementary data, Table A6). The numbers of observations available for wood and bark SG determination (10 672 and 7228, respectively) were greater than AGB sample sizes because they included data from a number of studies focused on stem weights but not AGB.

Component ratio method
Baseline predictions of AGB were generated using the CRM adopted in the US forest inventory, which is briefly described here and diagrammed in Figure 2; the steps described below refer to the numbered steps in the diagram. In step 1, merchantable stem wood volumes (V_merch, m^3) were predicted for trees having dbh ≥ 12.7 cm (5 in) using species-specific regional volume equations (Hahn, 1984; Hahn and Hansen, 1991; Oswalt and Conner, 2011; Scott, 1981). Merchantable stem bark volumes were subsequently predicted (step 2) using species-specific tree-level average bark volumes, expressed as ratios of stem bark:wood volumes. Species-level averages for wood and bark SG from Miles and Smith (2009) were applied in steps 3 and 4 to convert predicted merchantable stem wood and bark volumes to dry weights. Stem wood and bark dry weights were summed in step 5. Next, the CR regression model [1] from Jenkins et al. (2003) was used to predict the fraction of each tree's AGB allocated to stem wood (CR_wd) and bark (CR_bk) components. The summed merchantable stem wood and bark dry weights were divided by the sum (CR_wd + CR_bk) in step 6 to arrive at CRM predictions of AGB (Woodall et al., 2011). Other biomass components were predicted by multiplying AGB by the corresponding CR; for example, foliage biomass was calculated as AGB × CR_fol (step 7). Finally, AGB for sapling-sized trees (dbh < 12.7 cm) was predicted directly from a generalized allometric model multiplied by a species-specific sapling adjustment factor (SAP_adj) to ensure a smooth transition at the 12.7 cm dbh merchantability threshold (step 8; Heath et al., 2009; Jenkins et al., 2003).
$$CR = \exp\left(\beta_0 + \frac{\beta_1}{D}\right) \qquad [1]$$

where CR = component:aboveground dry-weight ratio; D = dbh, cm; β_0 and β_1 are the model coefficients.

Two adjustments to calculated AGB were made in accord with the CRM: one that subtracts the foliage component biomass from AGB, and another that adjusts stump biomass to ensure consistency with the predicted volume and dry weight contents of the merchantable stem wood and bark.
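Steps 2-7 of the baseline CRM can be sketched in a few lines of Python. This is a minimal illustration, not the FIA implementation: it assumes that the bark:wood value is supplied as a fraction, that SG is converted to dry density by multiplying by 1000 kg per m^3, and that Eq. (1) takes the exponential form reported by Jenkins et al. (2003). The function names and all numeric inputs are hypothetical placeholders rather than published coefficients.

```python
import math

WATER_DENSITY_KG_M3 = 1000.0  # converts specific gravity to kg of dry matter per m^3


def component_ratio(dbh_cm, beta0, beta1):
    """Eq. (1) form: CR = exp(beta0 + beta1 / D), with D = dbh in cm."""
    return math.exp(beta0 + beta1 / dbh_cm)


def crm_agb_kg(v_merch_m3, bark_wood_ratio, sg_wood, sg_bark, cr_wd, cr_bk):
    """Steps 2-6: merchantable stem wood volume to aboveground biomass (kg)."""
    v_bark_m3 = v_merch_m3 * bark_wood_ratio                 # step 2: bark volume
    wood_kg = v_merch_m3 * sg_wood * WATER_DENSITY_KG_M3     # step 3: wood dry weight
    bark_kg = v_bark_m3 * sg_bark * WATER_DENSITY_KG_M3      # step 4: bark dry weight
    stem_kg = wood_kg + bark_kg                              # step 5: stem wood + bark
    return stem_kg / (cr_wd + cr_bk)                         # step 6: expand to AGB


# Hypothetical tree: 25 cm dbh with 0.40 m^3 of merchantable stem wood.
dbh = 25.0
cr_wd = component_ratio(dbh, -0.3, -5.4)   # illustrative coefficients only
cr_bk = component_ratio(dbh, -2.0, -1.7)   # illustrative coefficients only
cr_fol = component_ratio(dbh, -4.1, 5.9)   # illustrative coefficients only

agb = crm_agb_kg(0.40, 0.12, 0.50, 0.40, cr_wd, cr_bk)
foliage = agb * cr_fol                     # step 7: component = AGB x CR
print(f"AGB = {agb:.1f} kg, foliage = {foliage:.1f} kg")
```

Because AGB is obtained by dividing stem biomass by (CR_wd + CR_bk), any error in the volume, bark ratio, SG, or CR inputs propagates directly into AGB, which is why each constituent model is examined separately in the scenarios that follow.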

Alternative model formulations
Seven alternative scenarios were formulated for evaluating the effects that adopting alternative constituent models in the CRM would have on FIA state and regional estimates of live tree AGB in the eastern US. The existing CRM was denoted as the baseline scenario, or scenario 0. In scenarios 1-7 the definitions of CR_wd and CR_bk were modified to reflect ratios of total stem wood and bark to AGB rather than the merchantable stem wood and bark ratios defined in the baseline scenario. This modification was based on the work of DeYoung (2014), who noted a somewhat extreme nonlinear pattern in stem CRs for trees close in size to the 12.7 cm dbh threshold that FIA uses to distinguish between saplings and trees. The foliage component definition was left unchanged; however, the branch component ratio (CR_br) was redefined in scenarios 1-7 to exclude any part of the main stem, i.e. topwood or stump, thus ensuring that stem, branch, and foliage CRs sum to one. In the approach proposed by Jenkins et al. (2003), CRs were assured to sum to one by fitting Eq. (1) separately to CR_wd, CR_bk, and the foliage component ratio (CR_fol), then calculating CR_br by subtraction from one, i.e. CR_br = 1 - (CR_wd + CR_bk + CR_fol).

Figure 2 Steps of the CRM: (1) regional volume equations; (2) bark:wood volume ratio; (3 and 4) wood and bark specific gravity coefficients; (5) add wood + bark biomass; (6) division by stem:aboveground component ratio (CR_wd + CR_bk); (7) multiplication by foliage:aboveground component ratio (CR_fol); (8) species or species-group specific biomass equations. Note that only the aboveground component biomass attribute is estimated for saplings, i.e. trees with dbh < 12.7 cm.
The comparatively flexible nonlinear Chapman-Richards functional form [2] was proposed as a potential improvement over Eq. (1) for predicting biomass CRs.
where, e = the base of the natural logarithm; a, b and c are the model coefficients.
To ensure that aboveground components summed to one, Eq. (2) was first fitted separately to data from all four aboveground CRs, then adjusted by dividing each fitted equation by the sum of all four components. For example, the adjusted foliage component (CR_fol1) was calculated as

$$CR_{fol1} = \frac{CR_{fol}}{CR_{wd} + CR_{bk} + CR_{br} + CR_{fol}}$$

Newly fitted component-ratio regression models based on the modified CR definitions and Eq. (2) were implemented in scenario 1. To maintain a smooth transition for trees near the sapling-size threshold of dbh = 12.7 cm, SAP_adj was recalculated for each species as the national average ratio of the CRM-calculated AGB divided by AGB predicted from the Jenkins et al. (2003) biomass equation for all 5-in trees, as described by Heath et al. (2009). Other than the new CR model and sapling adjustment factors, scenario 1 was equivalent to the baseline CRM (Table 1).
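The normalization step described above can be illustrated with a short sketch. The exact parameterization of Eq. (2) is not reproduced in this text, so the code assumes a common three-parameter Chapman-Richards form, a(1 - e^(-bD))^c; the function names and the coefficient values are hypothetical and chosen only to make the example run.

```python
import math


def chapman_richards(x, a, b, c):
    """Assumed Chapman-Richards form: a * (1 - exp(-b * x)) ** c."""
    return a * (1.0 - math.exp(-b * x)) ** c


def normalized_ratios(dbh_cm, coeffs):
    """Evaluate each separately fitted CR, then rescale so the four sum to one."""
    raw = {name: chapman_richards(dbh_cm, *abc) for name, abc in coeffs.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Illustrative (a, b, c) triples, not fitted values.
coeffs = {
    "wood": (0.70, 0.15, 1.2),
    "bark": (0.10, 0.30, 0.5),
    "branch": (0.20, 0.10, 0.8),
    "foliage": (0.05, 0.20, -0.6),
}

ratios = normalized_ratios(20.0, coeffs)
for name, value in ratios.items():
    print(f"CR_{name} = {value:.3f}")
```

Dividing by the sum keeps every component positive and additive at any dbh, at the cost of each adjusted ratio no longer matching its separately fitted curve exactly.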
Scenario 2 differed from scenario 1 in that the newly fitted component-ratio regression model [2] used a predictor consisting of the product of dbh-squared (D^2) and total height (H, m) instead of using dbh alone (Table 1). As in scenario 1 and all subsequent scenarios, new sets of sapling adjustment factors were calculated to maintain a smooth transition between saplings and larger sized trees.
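The role of the sapling adjustment factor can be sketched as follows. The example assumes the generalized allometric form AGB = exp(b0 + b1 ln(dbh)) and averages the ratio of CRM-calculated AGB to allometric AGB over trees at the 12.7 cm threshold; the coefficients, the CRM values, and the function names are hypothetical.

```python
import math


def allometric_agb_kg(dbh_cm, b0, b1):
    """Generalized allometric form: AGB = exp(b0 + b1 * ln(dbh))."""
    return math.exp(b0 + b1 * math.log(dbh_cm))


def sapling_adjustment(crm_agb_at_threshold, b0, b1, threshold_cm=12.7):
    """Mean ratio of CRM AGB to allometric AGB for trees at the threshold dbh."""
    allometric = allometric_agb_kg(threshold_cm, b0, b1)
    ratios = [crm / allometric for crm in crm_agb_at_threshold]
    return sum(ratios) / len(ratios)


def sapling_agb_kg(dbh_cm, b0, b1, sap_adj):
    """Sapling AGB: allometric prediction scaled by the adjustment factor."""
    return sap_adj * allometric_agb_kg(dbh_cm, b0, b1)


B0, B1 = -2.0, 2.4                    # illustrative coefficients only
crm_values = [55.0, 60.0, 65.0]       # hypothetical CRM AGB (kg) for 12.7 cm trees
sap_adj = sapling_adjustment(crm_values, B0, B1)
print(f"SAP_adj = {sap_adj:.3f}")
print(f"AGB at 10 cm dbh = {sapling_agb_kg(10.0, B0, B1, sap_adj):.1f} kg")
```

With this construction the adjusted sapling prediction at exactly 12.7 cm equals the average CRM prediction at that size, which is the smooth transition the adjustment is meant to provide. Whenever the CRM calculations change, as they do in each scenario, the factor has to be recomputed.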
Scenario 3 included a further modification: the replacement of the allometric AGB models from Jenkins et al. (2003) with species- and species-group-specific segmented regression models, as in Eq. (3), following Clark et al. (1985) and fitted to legacy tree AGB observations.
where D = dbh, cm; H = total height, m; D_J = segment joint point, 12.7 cm; a, b and c are the model coefficients (coefficient symbols a, b, c, etc. are not intended to signify equivalence across differently numbered equations).

Two possible improvements were sought in this modification. First, including both dbh and height in the AGB model was seen as a way to improve prediction accuracy compared with using dbh as the only predictor. Second, residual plots from regressions of log(AGB) on either log(dbh) or log(dbh^2 × H) (not shown here) showed some systematic over- or underprediction of AGB across the range of tree sizes, with biases being most evident in small trees. Model fits were assessed using the pseudo R^2 fit index described by Parresol (1999) and a relative root mean squared error (RMSE) based on per cent errors calculated from observed (obs) and predicted (pred) AGB as in Eq. (3) (Supplementary file Eq_3_AGB_Coeffs_SI.csv).

where n = number of trees fitted to regression model; and p = number of regression parameters. While models relying on both dbh and height as predictors were fitted for this purpose, a supplementary model for sapling-sized trees was also fitted that relied only on dbh (Supplementary file Eq_3_Sapling_AGB_D_Coeffs_SI.csv). This was needed to make predictions for roughly 50 per cent of saplings in the FIA database that have no recorded total heights.

Scenario 4 included a further modification, replacing the regional V_merch prediction equations with an alternative regression model fitted to volume data from legacy tree measurements. A segmented V_merch regression, Eq. (5), following Clark et al. (1985) was adopted with dbh joint points D_J = 22.9 cm (9 in) for softwoods and D_J = 27.9 cm (11 in) for hardwoods. No distinction was made for trees observed in different regions, meaning only one model was fitted for a given species or species group regardless of the geographic distribution of the species and the data representing it in the legacy database.
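The two fit statistics used to assess the AGB models can be sketched as below. The expressions are assumptions: the fit index is taken as 1 - SSE/SST computed in original units, and the relative RMSE is taken as the square root of the summed squared per cent errors divided by (n - p), with per cent errors expressed relative to observed AGB. The choice of denominator for the per cent errors is not stated in this text and is an assumption here, as are the function names.

```python
import math


def fit_index(obs, pred):
    """Pseudo R^2: 1 - SSE / SST, computed on the original (untransformed) scale."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def relative_rmse_pct(obs, pred, n_params):
    """Relative RMSE (%) from per cent errors, with n - p degrees of freedom."""
    pct_errors = [100.0 * (o - p) / o for o, p in zip(obs, pred)]
    return math.sqrt(sum(e ** 2 for e in pct_errors) / (len(obs) - n_params))


# Hypothetical observed and predicted AGB (kg) for a handful of trees.
observed = [12.0, 48.0, 150.0, 420.0, 900.0]
predicted = [10.5, 51.0, 141.0, 455.0, 870.0]

print(f"fit index = {fit_index(observed, predicted):.4f}")
print(f"relative RMSE = {relative_rmse_pct(observed, predicted, n_params=3):.1f} %")
```

Expressing errors in per cent terms keeps the many small trees in the data from being swamped by the few large ones, which matters when biases are most evident in small trees.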
[Table 1, footnote 1: Baseline scenario used CR regression models from Jenkins et al. (2003). All others used Eq.]
where D = dbh, cm; H = total height, m; D_J = segment joint point, 22.9 cm for softwoods, 27.9 cm for hardwoods; a, b and c are the model coefficients.
In scenario 4, as in all of scenarios 1-6, V_merch predictions from either the regional FIA volume equations or the newly developed models based on Eq. (5) were subsequently adjusted using species-specific regional taper equations to expand V_merch to the corresponding total stem volume (V_tot) before converting volumes to dry weights and AGB (Clark et al., 1991; Li et al., 2012; Westfall and Scott, 2010). This adjustment was needed to match the redefinitions of stem components that affected CR_wd, CR_bk and CR_br.
where V_tot = predicted total stem ib volume; V_merch = predicted merchantable stem ib volume; Vol(h_1, h_2 | dbh, H) = stem ib volume between heights h_1 and h_2 (h_1 < h_2, m), calculated from the integral form of a suitable taper model; H_10.2 is the height (m) of the stem to a 10.2-cm ob diameter.

We assumed that SG and bark:wood volume values presented by Miles and Smith (2009) were suitable for either merchantable or total stem contents. As such, converting ib volumes from either V_merch or V_tot to stem wood and bark biomass could be accomplished using the same SG and bark:wood volume coefficients regardless of how the stem CR was defined.
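The taper-based expansion from merchantable to total stem volume can be sketched as follows. Eq. (6) is not reproduced in this text; based on the symbol definitions above, the sketch assumes it scales V_merch by the ratio Vol(0, H) / Vol(0.305, H_10.2). The paraboloid taper function stands in for the published regional taper models and is purely hypothetical, as are the function names and the example tree.

```python
import math


def ib_diameter_cm(h_m, dbh_cm, total_height_m):
    """Hypothetical paraboloid taper: diameter^2 declines linearly to zero at the tip."""
    return dbh_cm * math.sqrt((total_height_m - h_m) / (total_height_m - 1.37))


def stem_volume_m3(h1_m, h2_m, dbh_cm, total_height_m, steps=200):
    """Vol(h1, h2 | dbh, H): trapezoid-rule integral of cross-sectional area."""
    def area_m2(h_m):
        d_m = ib_diameter_cm(h_m, dbh_cm, total_height_m) / 100.0
        return math.pi / 4.0 * d_m ** 2

    width = (h2_m - h1_m) / steps
    total = 0.5 * (area_m2(h1_m) + area_m2(h2_m))
    total += sum(area_m2(h1_m + i * width) for i in range(1, steps))
    return total * width


def expand_to_total_m3(v_merch_m3, dbh_cm, total_height_m, h_top_m, stump_m=0.305):
    """Assumed Eq. (6) form: V_tot = V_merch * Vol(0, H) / Vol(stump, H_10.2)."""
    whole = stem_volume_m3(0.0, total_height_m, dbh_cm, total_height_m)
    merch = stem_volume_m3(stump_m, h_top_m, dbh_cm, total_height_m)
    return v_merch_m3 * whole / merch


# Hypothetical tree: 30 cm dbh, 25 m tall, 10.2-cm ob top reached at 18 m.
v_tot = expand_to_total_m3(0.50, 30.0, 25.0, 18.0)
print(f"V_tot = {v_tot:.3f} m^3")
```

Because only the ratio of two integrals of the same taper curve is used, the expansion depends on stem shape rather than on the absolute volume implied by the taper model, so V_merch from either the regional equations or Eq. (5) can be expanded in the same way.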
Scenario 5 replaced published values of wood and bark SG and the average bark:wood volume percentages from Miles and Smith (2009) with values observed from legacy data. A further refinement adopted in scenario 6 was the replacement of species averages for stem bark:wood volume percentages with values predicted from species-specific regression equations based on the Weibull cumulative distribution function fitted to legacy data (Table 1).

For emphasis we restate here that the reformulation of CRs to represent whole stem biomass contents in scenarios 1-6 did not eliminate the initial step of predicting tree V_merch, either with FIA regional volume equations (scenarios 1-3) or from Eq. (5) fitted to legacy tree volumes (scenarios 4-6; Table 1). It did, however, require the use of suitable taper functions for calculating stem ib volumes from Vol(h_1, h_2 | dbh, H) in Eq. (6). Because we considered the development of a comprehensive set of taper models for eastern US tree species to be outside the scope of this work, and because a number of regional taper models were already available for use here, we relied on published taper equations to perform taper-based volume adjustments in Eq. (6) (Clark et al., 1991; Li et al., 2012; Westfall and Scott, 2010).
In scenario 7 AGB was predicted using only species-specific allometric Eq. (3) fitted to legacy biomass tree data. No stem wood or volume predictions, conversions from volume to biomass using SG, or expansion of stem biomass to AGB using CRs were employed. Because of its relative simplicity and direct modelling of AGB, scenario 7 was expected to achieve the highest accuracy of any scenarios tested. As such it would serve as a benchmark for evaluating the accuracy of the other scenarios that included steps needed to ensure consistency between volume and AGB estimates.

Model evaluation
Legacy tree and component observations were used to evaluate the accuracy of tree-level predictions of volumes, wood and bark properties, CRs and AGB. With this suite of variables, up to nine intermediate predicted attributes used in the CRM could be compared with observed values for any particular tree; however, most trees in the legacy database had only a subset of the variables of interest observed on them. The primary variable of interest here was AGB; however, CRM predictions for intermediate attributes were compared with observed data to identify possible ways to increase their accuracy while maintaining optimal accuracy of AGB.
Given the large number of volume and biomass models employed in the CRM, it seemed likely that some models were originally developed using data being treated here as legacy observations. No effort was made to avoid reusing any legacy observations that had previously been involved in development of regional volume equations or other models that are part of the baseline CRM (Jenkins et al., 2003; Miles and Smith, 2009). In newly developed alternative models, however, legacy data formed the basis for evaluating prediction accuracy. Both bootstrap and tenfold cross-validation procedures were employed, selecting some legacy observations for model fitting and others for model testing (Efron, 1983; Kuhn and Johnson, 2013, pp. 72-73). This ensured some degree of independence between data used for developing new models and those used for model validation. In addition to graphical diagnostics, e.g. inspection of bootstrap prediction errors, two accuracy statistics (mean bias and RMSE) were calculated from bootstrap prediction errors. Accuracy statistics were tabulated and examined at various levels of aggregation including species, region, tree size and overall to gauge whether improvements at one level may have reduced accuracies in some of the subgroups formed at finer levels of aggregation.
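The cross-validation idea described above can be illustrated with a minimal tenfold example. It is a sketch under stated assumptions rather than the procedure used in the study: the model is a simple log-log regression of AGB on dbh fitted by ordinary least squares, per cent errors are expressed relative to observed AGB, and the data are synthetic. Positive bias corresponds to underprediction, matching the sign convention used when bias is reported later in the text.

```python
import math
import random


def fit_loglog(dbh, agb):
    """Ordinary least squares fit of ln(AGB) = a + b * ln(dbh)."""
    xs = [math.log(d) for d in dbh]
    ys = [math.log(w) for w in agb]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope /= sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope


def cross_validate(dbh, agb, k=10, seed=0):
    """Return (mean bias %, RMSE %) from k-fold held-out prediction errors."""
    idx = list(range(len(dbh)))
    random.Random(seed).shuffle(idx)
    errors = []
    for fold in range(k):
        test = idx[fold::k]                      # every k-th shuffled index
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_loglog([dbh[i] for i in train], [agb[i] for i in train])
        for i in test:
            pred = math.exp(a + b * math.log(dbh[i]))
            errors.append(100.0 * (agb[i] - pred) / agb[i])
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Synthetic trees: a power law with multiplicative noise (illustrative only).
rng = random.Random(42)
dbh_cm = [5.0 + 0.5 * i for i in range(100)]
agb_kg = [0.1 * d ** 2.4 * math.exp(rng.gauss(0.0, 0.2)) for d in dbh_cm]

bias_pct, rmse_pct = cross_validate(dbh_cm, agb_kg)
print(f"bias = {bias_pct:.1f} %, RMSE = {rmse_pct:.1f} %")
```

Each tree is predicted exactly once by a model that never saw it, so the resulting bias and RMSE are less optimistic than statistics computed from the fitting data themselves.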
Since each scenario relied on different sets of calculations to obtain CRM-based and allometric-model-based AGB predictions, a different set of SAP_adj factors was calculated for application in each scenario (Supplementary data, Table A7).

Stem wood and bark volume
The accuracy of V_merch equations used in the CRM varied considerably by FIA region when compared with legacy tree volume measurements across all species (Table 2). Southern States FIA [...]

The profile-based model of Clark et al. (1991), including a variant used by the US Forest Service (2011) for species in its Eastern Region (R9, Table 2), showed relatively high precision as measured by prediction error standard deviation. However, considerable biases in the R9 variant for Central and Lake States regions indicated some need for additional testing or development before implementing it widely across those regions. The approximation of Gevorkiantz and Olsen's (1955) volume and taper tables implemented by Burk and Ek (1999) was the most accurate of any existing merchantable ib volume models for Central and Lake States regions. In contrast, the models of Hahn (1984) and Stone (cf. Hahn, 1973; Hahn, 1984) were among the most heavily biased of any models tested.
The V_merch model of Eq. (5) fitted to legacy tree taper data (Supplementary file Eq_5_Merch_Vol_Coefs_SI.csv) represented improvements over Central, Northeastern, and Lake States regional FIA volume models; however, it did not consistently result in the smallest prediction biases or variances when compared with other models evaluated (Table 2). Detailed analysis of model prediction accuracy among regions for various species and species groups showed a range of results (Supplementary file Error_Vol_Species.csv). Among them, we noted potentially useful relationships between model prediction variance and sample sizes that could help to inform future efforts to balance costs with accuracy requirements in volume prediction applications (Duncanson et al., 2015).

Figure 3 (A, B) Stem bark:wood volume ratios from paper birch and red maple showed nearly constant relationships across a range of tree sizes. (C, D) Decreasing trends were evident for several thick-barked species including white oak and longleaf pine. Dashed lines represent species-specific average bark ratios from published values and the means of the legacy data shown.
Observed bark:wood volume ratios showed varying relationships with stem dbh. Some species with thin, smooth bark, e.g. American beech, paper birch and quaking aspen (cf. Borger, 1973), showed nearly constant bark ratios across a range of tree sizes (Figure 3A). Thin, scale-barked species like balsam fir, red maple, and red, white and black spruces exhibited weak correlations and slightly decreasing bark volume ratios with tree size (Figure 3B). Thick, scale- and furrowed-bark species including many pines and hardwoods showed more pronounced decreases in bark volume ratios with increasing stem dbh, despite there being considerable variability in the data (Figure 3C,D).
Observed bark:wood volume means were nearly all larger than published values, with the largest discrepancies in several southern yellow pine and gum species (Nyssa sylvatica, N. biflora, Liquidambar styraciflua). The same species were generally the ones with the highest observed mean bark volume ratios (Table 3). Reductions in RMSE using species-specific nonlinear regression models showed the potential for predicting bark volume ratios more accurately using dbh as a predictor, as compared with simply using species means (Table 3). Southern yellow pines, oaks and hickory showed the greatest gains when modelling bark volume ratios from dbh, but sweetgum and white spruce also showed reductions in RMSE of up to 10 per cent compared with using species means (Table 3).

Specific gravity
Observed species averages for whole-stem wood SG were strongly correlated (ρ = 0.945) with published values from Miles and Smith (2009). For 33 species having at least n = 50 trees in the legacy data set, observed wood SG means by species averaged just 2 per cent greater than published values. The largest difference was noted in quaking aspen, with an observed mean tree wood SG_obs = 0.44 (n = 88) compared with the published value SG_pub = 0.35. Other notable differences were observed for southern red oak (SG_obs = 0.59; SG_pub = 0.52; n = 81), elm (SG_obs = 0.595; SG_pub = 0.54; n = 55), chestnut oak (SG_obs = 0.62; SG_pub = 0.57; n = 100) and water tupelo (SG_obs = 0.41; SG_pub = 0.46; n = 150). Compared with wood SG, bark SG values observed here were less strongly correlated (ρ = 0.846) with values reported by Miles and Smith (2009). For 27 species having at least n = 50 trees in the data set, observed bark SG was lower than published values by about 14 per cent (Table 4). Only shortleaf pine had an observed mean bark SG larger than the species' published value. Observed variation in bark SG among trees of a given species was generally low, with coefficients of variation (CV) between 5 and 20 per cent in all but two species studied, longleaf and loblolly pines (Table 4). Scatterplots (not shown) generally showed no relationship between bark SG and tree dbh; however, weak correlations (either positive or negative) were noted in some hardwoods including sweetgum, yellow poplar, and several oak species.

Component ratios
In baseline analyses, stem CR_wd and CR_bk both showed steep drops toward zero as tree dbh values decreased toward the CRM-specified merchantable top diameter (Figure 4A,B). This pattern matched the CRM merchantability limits; namely, any tree having a stump diameter equal to the CRM-specified top diameter would, by definition, contain no stem wood or bark biomass (Heath et al., 2009). A contrasting pattern, owing to the additivity of merchantable stem and branch components, was observed in CR_br, which increased sharply toward 1.0 where dbh approached the merchantable top diameter limit (Figure 4C). Similar trends were noted in both hardwoods and conifers, with only data from conifers shown in Figure 4.
Considerable changes in CR trends with dbh were noted after reformulating them so stem wood and bark components were based on total rather than merchantable stem biomass (Figure 4D-F). The decreasing pattern of CR_wd with dbh became less steep, with minimum values of wood biomass generally not falling below 20 per cent of AGB, even in trees as small as 3 cm dbh (Figure 4D). Total stem bark comprised a larger share of AGB in small-diameter softwoods than in larger trees, a trend opposite to that observed in merchantable stem bark (Figure 4B,E). Branch biomass component relationships to dbh also changed markedly following the whole stem component reformulation. In contrast to the existing CRM branch CRs, which approached 100 per cent of AGB in small trees, branch biomass under the new formulation did not exceed 50 per cent of AGB in any of the eastern conifers observed here (Figure 4F). Reformulated CR_br often appeared flat over a wide range of tree sizes in conifers (Figure 4F); however, gradual increases in branch biomass ratios with dbh were noted in relatively large trees of some species, both hardwoods and conifers (some plots not shown).
Examination of biomass CRs vs the combined variable dbh^2 × total height (D^2 H) indicated some potential for improving predictions using both dbh and height as predictors, as compared with dbh alone (Figure 5). Replacing dbh with D^2 H in Eq. (2) reduced RMSEs by 6 per cent on average over 32 species having observations from n ≥ 50 trees (Table 5). The gains in models of stem CR_wd were comparatively modest at under 2 per cent reduction of RMSE. The accuracy of fitted CR_bk and CR_br models was improved only slightly by adding height as a predictor. Their RMSE reductions averaged < 1 per cent across all species; nonetheless, some species' RMSEs were reduced by as much as 8 and 14 per cent for bark and branch CRs, respectively (Table 5).

Aboveground biomass
Examination of legacy tree data indicated some potential for improving the accuracy of AGB models over those developed by Jenkins et al. (2003) and Chojnacky et al. (2014). The most notable discrepancies between observed and predicted AGB were in pines (Figure 6A). Smaller differences were noted between observed data and published model predictions in hardwood species groups (Figure 6B,C). The fir and hemlock model, which agreed well with observed data, showed little need for improvement (Figure 6D). Allometric models of AGB fitted here all predicted higher biomass than the models of Jenkins et al. (2003) currently used in the CRM.

AGB estimation scenarios
All seven alternatives tested as potential modifications to the CRM produced AGB estimates higher than the baseline (scenario 0) estimate of 17.1 Pg. The magnitudes of increases over scenario 0 ranged from 6.6 to 20.1 per cent, corresponding to between 1.1 and 3.4 Pg of additional live-tree biomass (Figure 7). Cross-validation prediction errors from n = 6480 trees indicated that RMSE could be reduced from the baseline value of 39.5 per cent by any of the scenarios tested. Scenario 7 netted the greatest reduction in RMSE by replacing the CRM approach with predictions from species-specific AGB allometric Eq. (3) (Figure 7). Scenarios 1-6, all of which functioned within the existing CRM framework, were able to improve prediction accuracy, with the greatest overall improvement achieved by adopting scenario 5, which resulted in RMSE for AGB predictions of 28.8 per cent. The baseline scenario exhibited an average bias of 12.2 per cent, meaning CRM predictions underestimated legacy tree AGB by this amount (Figure 7). Alternative scenarios also exhibited some underprediction bias, but none as large as scenario 0. Several scenarios resulted in underprediction biases smaller than 5 per cent, including scenario 5, which, at 2.5 per cent, exhibited the smallest bias of any scenario tested (Figure 7).
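As a minimal sketch of how cross-validation statistics of this kind can be computed, the Python below assumes that per cent RMSE and per cent bias are expressed relative to mean observed AGB, and that bias is observed minus predicted, so that positive values indicate underprediction. These definitions, and the names `percent_rmse_and_bias`, `k_fold_indices`, `cross_validate`, `fit` and `predict`, are illustrative assumptions rather than the procedure used in the study.

```python
import math
import random


def percent_rmse_and_bias(observed, predicted):
    """Return (per cent RMSE, per cent bias), both relative to mean observed.

    Bias is observed minus predicted, so a positive value means the
    predictions underestimate the observations (assumed convention).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    bias = sum(residuals) / n
    return 100.0 * rmse / mean_obs, 100.0 * bias / mean_obs


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(x, y, fit, predict, k=10, seed=0):
    """k-fold cross-validation: each tree is predicted by a model fitted
    without that tree's fold. `fit` and `predict` are user-supplied
    (hypothetical) callables."""
    predictions = [None] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            predictions[i] = predict(model, x[i])
    return percent_rmse_and_bias(y, predictions)
```

Normalising by mean observed AGB is only one common choice; statistics computed on a different basis (for example, relative to each tree's own biomass) would not be directly comparable with the values above.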
Despite its having the smallest per cent bias and second-smallest per cent RMSE of any scenario tested, scenario 5 was not particularly accurate in predicting biomass for sapling-sized trees (Table 6). Biomass predictions for saplings were generally less accurate, as measured by RMSE, than those for merchantable-sized trees (Table 6). Detailed breakdowns of cross-validation accuracy statistics by species and size classes (not shown) indicated that a relatively small number of species contributed to the high variance and bias noted for saplings. Included in this group were species: red spruce (Picea rubens [...]
Figure 4 (caption fragment) [...] (1); curves in (A-C)) was not sufficiently flexible to reproduce the nonlinearity. Reformulating the ratio as total stem wood:aboveground biomass reduced the degree of discontinuity in small trees, facilitating accurate modelling of CRs (lower panels, D-F).

Discussion
Several findings of this study demonstrate the potential for improvement of live-tree AGB estimates obtained by linking CRM predictions with field-plot data from the U.S. Forest Service FIA NFI. In the Lake States and Central States regions, merchantable volume equations are among the constituent parts of the CRM that can likely be improved without undue effort (Miles and Hill, 2010). The cubic foot volume equation of Stone is a notable candidate for improvement (Hahn, 1984), as it was originally developed from cordwood volumes and adapted for cubic volumes using multiplication by a single conversion factor, 2.24 m³ (79 ft³) per cord. Regional volume equations from the Central and Northeastern States relied on tree volumes imputed from volume tables, such as those developed by Gevorkiantz and Olsen (1955) and Bickford (1951), rather than on regression models developed directly from observed stem volumes and associated predictor variables (Barnard et al., 1973; Hahn and Hansen, 1991; Scott, 1981). Where existing models can be validated for accuracy in regional applications, it may be possible to improve CRM accuracy using existing volume estimators (e.g. Burk and Ek, 1999). Where no such models currently exist, improving CRM V merch predictions should be an attainable goal, assuming sufficient sample data can be obtained for developing new equations (e.g. Westfall and Scott, 2010).
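The single-factor conversion mentioned above can be checked with a short Python sketch. Only the factor of 79 ft³ of solid wood per cord comes from the text; the function name `cords_to_m3` is hypothetical, and Stone's equation itself is not reproduced here.

```python
# Exact cubic-foot to cubic-metre factor, from 1 ft = 0.3048 m.
FT3_TO_M3 = 0.3048 ** 3

# Solid wood content per cord, as quoted in the text.
SOLID_FT3_PER_CORD = 79.0


def cords_to_m3(cords):
    """Convert a cordwood volume to solid cubic metres using one fixed factor."""
    return cords * SOLID_FT3_PER_CORD * FT3_TO_M3


# One cord works out to roughly 2.24 cubic metres, matching the quoted factor.
one_cord_m3 = cords_to_m3(1.0)
```

Because the same factor is applied to every tree, any dependence of solid wood content per cord on tree size or species is ignored, which is consistent with the equation being singled out as a candidate for improvement.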
Defining BEFs according to merchantable stem biomass is a preferred approach in national C assessments conforming to Intergovernmental Panel on Climate Change good practice guidelines (IPCC, 2006). This definition was found to be problematic when applied to individual trees at or near the 12.7 cm sapling dbh threshold (Figure 4). Even after reformulating CR definitions based on whole stem rather than merchantable stem biomass contents, the equation form used by Jenkins et al. (2003) was found to be insufficiently flexible to adequately represent observed CR patterns across a range of tree sizes. Other work has found nonlinearities like those noted here in BEFs computed in forests composed of small-sized trees (Soares and Tomé, 2012). Sapling adjustment factors likely contribute to biases unresolved by modifications to the CRM examined here (Nelson et al., 2014).
Felled tree studies have often collected stem profile data for both ob and ib diameters, from which accurate models of wood and bark volume can be developed (Li and Weiskittel, 2011). We noted that bark:wood volume ratios from legacy trees (Table 3) agreed with results reported by Gevorkiantz and Olsen (1951) better than those reported by Miles and Smith (2009) for 24 Lake States species. This agreement may be due in part to consistency of definitions, as Gevorkiantz and Olsen (1951) were clear in noting that their bark volumes were calculated using ob diameters, which include airspaces in bark (MacFarlane and Luo, 2009). Laboratory determination of bark volumes is often done using water displacement, which excludes airspaces in the volume measurement (Phillips and Taras, 1987). Any revisions to the bark volume quotients used in the CRM should be developed to ensure consistency of definitions, but also to account for the large number of species for which FIA requires accurate bark volume quotient information.
Figure 5 Prediction models for CRs were less precise when fitted to dbh alone (A, C) than when fitted to a variable combining dbh and total height (B, D).
Our observations showed somewhat distinct patterns in how bark:wood volume ratios varied in trees of differing sizes and species (Figure 2). Smooth-barked species such as paper birch and quaking aspen tended to maintain a relatively constant bark:wood volume ratio over a range of tree dbh, while species with furrowed and scale-type barks, such as oaks and pines, tended to show decreasing bark:wood volume ratios with increasing dbh. These patterns are consistent with shedding characteristics of various bark types; while smooth-barked species shed little material over time under ideal growing conditions, shedding is continual and typical in other bark types (Borger, 1973; Kaufert, 1937). These details notwithstanding, direct accounting of bark:wood volume ratio relationships to dbh in scenario 6 did not lead to appreciable improvement in overall accuracy of AGB predictions in the CRM framework compared with using species averages (Table 1, Figure 7).
Published SG values for North American tree species date at least as far back as Fernow (1897), with numerous studies having been conducted in the past century (Antony et al., 2015; Newlin and Wilson, 1917; Wahlgren et al., 1966). One challenge in working with published SG summaries is that the variability in sample data is not always well characterized, whether it be in terms of variation among specimens collected from a single tree, variation among trees from a particular study site or region, or variation across species' geographic ranges. Another challenge is that many wood properties surveys were limited to clear wood specimens, so the suitability of using their SG measurements as standing-tree BEFs should be tested (Markwardt, 1930). Also, since bark SG relies on laboratory measurements of green volume, the method of bark volume determination, either including airspaces or not, directly affects results. Bark SG determined using water displacement for green volume determination will necessarily overestimate bark biomass on standing trees unless the standing-tree bark volume measurement is adjusted to exclude airspaces (MacFarlane and Luo, 2009).
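The airspace effect described above can be illustrated with a small Python sketch. It assumes basic specific gravity (oven-dry mass per unit green volume, relative to water at 1000 kg m⁻³) and a hypothetical `void_fraction`, the share of standing-tree bark volume (derived from ob and ib diameters) that is airspace. The function name and all numerical inputs are illustrative assumptions, not values from the study.

```python
WATER_DENSITY_KG_M3 = 1000.0


def bark_biomass_kg(bark_volume_m3, bark_sg, void_fraction=0.0):
    """Oven-dry bark mass from green bark volume and specific gravity.

    bark_volume_m3: standing-tree bark volume, which includes airspaces
        when it is derived from ob and ib diameters.
    bark_sg: specific gravity measured on a displacement-volume basis,
        so it describes solid bark only.
    void_fraction: hypothetical share of bark_volume_m3 that is airspace.
    """
    solid_volume_m3 = bark_volume_m3 * (1.0 - void_fraction)
    return solid_volume_m3 * bark_sg * WATER_DENSITY_KG_M3


# Illustrative inputs only: 0.05 m3 of bark, SG of 0.5, 20 per cent airspace.
unadjusted = bark_biomass_kg(0.05, 0.5)       # volume left unadjusted
adjusted = bark_biomass_kg(0.05, 0.5, 0.2)    # airspaces excluded first
overestimate_pct = 100.0 * (unadjusted - adjusted) / adjusted
```

With these illustrative inputs the unadjusted calculation gives a larger mass than the adjusted one, which is the direction of error described in the text. The size of the overestimate depends entirely on the assumed void fraction.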
Notable improvement in CRM accuracy was achieved by revising bark and wood properties to match observed values from legacy felled-tree studies, including a threefold reduction in prediction bias between scenarios 4 and 5 (Table 1, Figure 7). Wood and bark properties are known to vary with tree size and tissue position in stems, with different silvicultural practices, and across space; however, the quantification of these patterns is not always straightforward because the same relationships that hold for one species may not be present in others (Antony et al., 2015;Tasissa and Burkhart, 1998;Wiemann and Williamson, 2014).
Despite the relatively complex formulation of volume, SG, and CR models used to obtain volume and biomass estimates in the CRM under alternative scenarios 1-6, results showed potential for reducing RMSE and bias to levels comparable with the relatively simple formulation tested in scenario 7. Although scenario 7 achieved greater precision than any of the other alternatives tested, it has some notable limitations. One is that it relies entirely on species or species-group-specific AGB allometric equations, with no accounting for stem volumes, component relationships, or wood properties in forest trees. Another is that the use of AGB allometric equations alone does not ensure any consistency between estimates of wood volume and AGB. In multiresource inventories like NFIs, such consistency is often needed. The overall reduction of apparent bias in scenario 5 is also a favourable outcome compared with the other scenarios tested, since bias is an overriding concern in estimators that are applied to large data sets in NFI applications (Roxburgh et al., 2015).
It is important to note that RMSE and bias values reported here pertain to legacy data rather than biomass and AGB estimates reported by the US Forest Service FIA program. Errors in regional carbon estimates are subject to many factors including sampling error and model prediction errors that can be investigated using a number of analytical techniques (McRoberts and Westfall, 2014), all of which employ certain assumptions that must be considered. A key assumption in the errors reported here is that suitable model forms were chosen for volume, CR, and AGB equations. Another concern is whether data compiled from legacy volume and biomass studies adequately characterize the ranges of tree form, wood and bark properties, component relationships, and AGB that arise in forests of the eastern US. Legacy data undoubtedly include fewer trees of poor form than trees in larger populations due to longstanding recommendations that forked, broken, leaning or scarred trees be excluded from mensurational studies (Behre et al., 1926). Destructive sampling for volume and biomass is often limited to areas where road access is good to accommodate equipment needs and specimen collection (Williams et al., 1999). These and other concerns about the data used here apply in most applications of allometric model development.
Figure 7 (caption fragment) [...] (Table 1) to the current CRM estimator (scenario 0). Per cent RMSE and underestimation biases were calculated by 10-fold cross-validation of AGB values predicted for n = 6480 trees under each scenario.
Table 6 Per cent RMSE and bias from 10-fold cross-validation on legacy trees show relatively poor performance of biomass prediction methods for saplings (dbh < 12.7 cm; n = 2306) compared with larger merchantable-sized trees (dbh ≥ 12.7 cm; n = 4174)
Notwithstanding the extraordinary collection of legacy biomass equations and wood properties information compiled for the development of US national-scale biomass and C estimators currently in use (Jenkins et al., 2004; Miles and Smith, 2009), the magnitudes of prediction errors observed here indicated considerable room for improvement. Legacy tree data may serve a vital role in future research, both in testing existing methods and developing new models that may ultimately prove superior in national forest C assessments (MacFarlane, 2015; Ung et al., 2008; Weiskittel et al., 2015). The work carried out here points to a possible reduction in prediction biases of nearly 80 per cent over the baseline CRM, and reduction of overall prediction RMSE of more than 25 per cent (Figure 7). The need for additional research on improving national-scale biomass estimators seems clear given the impact such improvements will have on C stock estimates for NFIs in the US and other countries. Even though AGB can be predicted accurately for many species using conventional allometric modelling, substantial challenges remain in predicting stem volume, AGB, and biomass components simultaneously (Castedo-Dorado et al., 2012; Enes and Fonseca, 2014). Recent work has shown advances in the development of regression equations that maintain additivity in biomass component predictions (Affleck and Diéguez-Aranda, 2016; Dong et al., 2015; Poudel and Temesgen, 2016; Zhao et al., 2015). Past work has also demonstrated positive results in the simultaneous estimation of volume and biomass, although primarily in stem wood and bark (Brooks et al., 2007; Jiang and Brooks, 2008; Parresol and Thomas, 1995). To meet the information needs for both volume and biomass inventories, further work is needed to identify suitable approaches for combining information on both attributes.
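The relative reductions quoted in this paragraph can be reproduced from the per cent bias and RMSE values reported in the Results, assuming the comparison is between the baseline (scenario 0: bias 12.2 per cent, RMSE 39.5 per cent) and scenario 5 (bias 2.5 per cent, RMSE 28.8 per cent). That pairing is an inference from the reported figures, and the function name below is hypothetical.

```python
def relative_reduction_pct(baseline, alternative):
    """Reduction of a statistic relative to its baseline value, in per cent."""
    return 100.0 * (baseline - alternative) / baseline


# Per cent bias and per cent RMSE as reported for scenarios 0 and 5.
bias_reduction = relative_reduction_pct(12.2, 2.5)    # roughly 79.5
rmse_reduction = relative_reduction_pct(39.5, 28.8)   # roughly 27.1
```

Under this assumption the bias reduction of about 79.5 per cent corresponds to "nearly 80 per cent", and the RMSE reduction of about 27 per cent corresponds to "more than 25 per cent".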
While this work focused on models used to estimate AGB and C stocks in forests of the eastern US, additional work is needed to determine how revisions to CRM biomass estimators might affect regional estimates of C stock change (Magnussen et al., 2014). Stock change is a primary determinant of forest C sequestration or emissions over time, e.g., in tier 2 and 3 methods adopted in the United Nations' Good Practice Guidance for Land Use, Land-Use Change, and Forestry (IPCC, 2006). The US NFI includes remeasured data for all the states studied here, which will facilitate the determination of what effects the alternatives proposed here may have on estimates of C stock change. Despite the findings here that C stocks are presently underestimated in eastern US forests, a similar finding for sequestration rates is not assured (Domke et al., 2012; Woodall, 2012). Aside from land use and land-use changes, forest C sequestration rates are affected by many factors including the mixtures of tree species, ages, and size classes. How these factors interact in forest C sequestration is a subject of ongoing investigation (Woodall et al., 2015).

Supplementary data
Supplementary data are available at Forestry Online.