-
PDF
- Split View
-
Views
-
Cite
Cite
Daniel S. Maynard, Mark J. Ducey, Russell G. Congalton, Joel Hartter, Modeling Forest Canopy Structure and Density by Combining Point Quadrat Sampling and Survival Analysis, Forest Science, Volume 59, Issue 6, December 2013, Pages 681–692, https://doi.org/10.5849/forsci.12-086
Close - Share Icon Share
Point quadrat sampling has been used relatively infrequently for modeling canopy structure and density, primarily because of the large number of sample points needed to obtain accurate estimates. We address these limitations by showing how point quadrat data are a form of time-to-event data, analogous to what are commonly observed in biomedical studies. This equivalence allows for point quadrat data to be analyzed using existing survival analysis methods. We illustrate the usefulness of this relationship by analyzing data from a field study conducted in northeast Oregon. Within each of 60 forest plots, we obtained canopy-height measurements using a handheld laser rangefinder, and we used a survival-based regression model to estimate canopy profiles and leaf area indices via the Weibull hazard function. The resulting survival-based estimates of canopy density and structure appeared robust to sample size limitations, whereas the relatively small number of samples per plot led to an apparent underestimation of canopy density by the traditional point quadrat estimator. Overall, the incorporation of survival analysis methods and point quadrat sampling greatly increases the usefulness of this sampling method, resulting in an efficient tool for quickly assessing the structure of forest canopies.
The structure and density of forest canopies are important indicators of forest productivity and function. Canopy characteristics are strong indicators of individual tree growth, vigor, and overall stand productivity (Waring et al. 1981, Mitchell et al. 1983, Waring 1983, Maguire and Kanaskie 2002). Canopies are responsible for energy production, carbon sequestration, and water transfer (Ford and Deans 1978, Hollinger 1989, Baldocchi et al. 2002, Baldocchi and Harley 2006). They regulate the quantity of light and water that reach the forest floor, thereby influencing temperature, humidity, and moisture (Seidel et al. 2011). These effects can alter understory vegetation patterns and corresponding wildlife populations (Hayes et al. 1997, DeMaynadier and Hunter 1999, Jennings et al. 1999, Martens et al. 2000, Wilson et al. 2007). Accurate estimation of canopy structure is integral for making responsible and effective stand-level management decisions regarding the health, productivity, habitat suitability, and ecosystem processes of forests.
Numerous ground-based methods exist for assessing canopy structure and canopy density, although most are subject to one or more limitations: they can be expensive, time-consuming, and difficult to implement, or they can yield data that are imprecise or hard to interpret (Radtke and Bolstad 2001, Jonckheere et al. 2004, Côté et al. 2011, Seidel et al. 2011, 2012, Zhao et al. 2011). Common techniques for estimating canopy structure include stratified clipping and scaffolding (Fujimori 1971, Fukushima et al. 1998), litter traps (Ovington 1963, Aber 1979), inclined point quadrat methods, and optical or photographic methods (Levy and Madden 1933, Wilson 1960, MacArthur and Horn 1969, Aber 1979, Rich 1990). More recently, ground-based optical point quadrat methods such as light detection and ranging (LiDAR) and laser-point sampling have been used with increasing frequency (Vanderbilt et al. 1979, Radtke and Bolstad 2001, Lovell et al. 2003, Parker et al. 2004, Coops et al. 2007).
In this article, we build on the point quadrat and laser point quadrat sampling methods of Levy and Madden (1933), MacArthur and Horn (1969), and Radtke and Bolstad (2001) by illustrating how laser point quadrat sampling is analogous to the field of survival analysis as used in biomedical and technology research. In doing so, we hope to alleviate many of the limitations that are common to ground-based canopy sampling methods. The objectives of this article are to demonstrate the equivalence between survival analysis and point quadrat analysis, to provide an example that illustrates the potential usefulness of this relationship, and to investigate how survival-based regression analysis can be used to overcome limitations present in traditional point quadrat sampling.
First, we give a brief review of existing point quadrat sampling methods, followed by an overview of survival analysis methods. We then outline the previously undescribed relationship between survival analysis and point quadrat sampling. To investigate the feasibility and efficacy of applying regression-based survival analysis techniques to point laser quadrat data, we conducted a field study in northeast Oregon. The results and conclusions from this study are reported, along with limitations and suggestions for further research on the use of survival analysis for point quadrat data.
Point Quadrat Sampling
The primary variables of interest in point quadrat sampling are leaf area index, canopy density, and canopy profile. Leaf area index (LAI) can be defined numerous ways: as the horizontally projected leaf area (Ross 1981, Bolstad and Gower 1990); as the total one-sided leaf area per unit ground area (Watson 1947); or as one-half of the total leaf area per unit ground area (Chen and Black 1992a). Following the tradition of Levy and Madden (1933), MacArthur and Horn (1969), and Radtke and Bolstad (2001), we do not account for the angular distribution of foliage in our analysis. Our use of the term “leaf area index” in this article therefore refers to horizontally projected leaf area per unit ground area. In addition, laser point quadrat sampling measures plant area index (projected plant area per unit ground area), rather than leaf area index, because the laser does not differentiate between foliage and woody material. Despite this limitation, we use the term “leaf area index” (LAI) rather than “plant area index” to coincide with traditional point quadrat sampling terminology. Possible ways of overcoming this limitation using survival analysis methods are addressed in the Discussion.
Chen and Black (1992a) point out that the horizontal projection for nonflat leaves can be problematic and has limited biological significance. Furthermore, the clumping of foliage due to the nonrandomness of shoot positions and the interdependence of leaf positions in the canopy can lead to underestimates of LAI (Chen and Black 1992b, Chen and Cihlar 1995, Radtke and Bolstad 2001). Although these limitations can affect the interpretability of the results, the purpose of the proposed methods is to further the applicability of point quadrat sampling by highlighting the similarities between survival analysis and existing point quadrat methods. Techniques for addressing these limitations have been discussed elsewhere (see Wilson 1960, 1963, 1965, Chen and Black 1992a, Chen and Cihlar 1995, Welles and Cohen 1996, Denison 1997, Groeneveld 1997, Radtke and Bolstad 2001).
MacArthur and Horn Estimator
The original methods for conducting point quadrat sampling involved passing a needle (i.e., a sewing needle) vertically through the canopy and recording the number of contacts between the needle and foliage (Levy and Madden 1933). MacArthur and Horn (1969) expanded on this method by using a camera to measure height to first leaf. According to this scheme, a random point was selected, a tripod was centered over that location, and a camera was fixed to the tripod and sighted vertically up into the forest canopy. Using the focusing scale on the telephoto lens, distance to the first leaf at that point was measured. By repeating this procedure across a random series of sample points they obtained a set of measurements of “height to first leaf.”




Laser Point Sampling—Radtke and Bolstad Estimator


Survival Analysis
In biostatistics, whenever a subject is followed until a predefined event occurs, the resulting data are termed “failure time data” or “time-to-event data” (Kalbfleisch and Prentice 2002). The group of techniques and methods used to analyze failure time data is known as “survival analysis.” Failure time data occur most frequently in biomedical studies in which patients or persons of interest are followed for a period of time until a specific event occurs (e.g., death, adverse reaction, or clinical diagnosis). It is also commonly used in industrial applications, in which failure rates and life expectancies of specific components are of interest.




Equivalence of Point Quadrat Sampling and Survival Analysis
Although the most common estimators for laser point quadrat analysis and its optical analogs have been known since the work of MacArthur and Horn (1969), the connection to survival analysis does not seem to have been recognized in the literature. There are a considerable number of statistical techniques associated with survival analysis that are useful in the analysis of laser or optical point quadrat data, including straightforward variance estimation, hypothesis testing, and the incorporation of covariates in the analysis.
To understand this equivalence between point quadrat sampling and survival analysis, we can imagine that each laser shot is “enrolled” at height zero and followed upward in height until the predefined event (i.e., leaf interception) occurs. Just as in the traditional survival setting, this results in time-to-event data. The difference is simply the reference scale: in traditional survival analysis, the subjects are followed across time until the first event occurs; in point quadrat sampling, the subjects (laser shots) are followed across distance until the first leaf intercept occurs. As evidenced by the similarities between Equations 1 and 8 and between Equations 2 and 10, this change from time to distance does not affect the underlying mathematics.


Furthermore, MacArthur and Horn (1969) suggested estimating ϕ(h) via the proportion of sample points that exceeded h. In survival analysis, this is known as the empirical survival function (Tableman et al. 2004). The overall estimate of leaf area as suggested by MacArthur and Horn (1969) in Equation 4 is equivalent to the Kaplan-Meier product limit estimate of the cumulative hazard (Kaplan and Meier 1958), provided that there are no censored observations (such observations are common in failure-time studies, where the subject is removed or withdraws from the study before the outcome can be observed; methods of incorporating censoring are addressed in the Discussion).


Not only does survival analysis allow for variance estimation of traditional, nonparametric point quadrat estimators but also it importantly enables point quadrat sampling to be embedded within a regression framework. The methods provided by MacArthur and Horn (1969) and Radtke and Bolstad (2001) can be used to estimate LAI and to obtain a graph of the canopy profile, but they are limited in their ability to incorporate stand-level covariates to improve canopy profile and LAI estimation.
Regression Analysis of Point Quadrat Data
Radtke and Bolstad (2001) note that the number of sample points needed to obtain accurate estimates using laser point quadrat sampling can be quite high. They observed minimal improvement in LAI estimation over existing LAI estimation methods with a sample size of approximately 1,000 laser points per 13 × 13-m plot. Fukushima et al. (1998) suggest that adequate foliage profile estimates can be obtained with a minimum of 1,600 sample points distributed uniformly across the stand. These large sample sizes make direct inference of a single stand difficult and make comparison between many stands impractical using traditional point quadrat analysis methods.
Survival-based regression analysis of point quadrat data can address this situation by incorporating stand-specific covariates into the LAI estimation process. This enhanced methodology can lead to improved estimates of LAI and canopy profile, and it allows for quantitative identification of those stand-level variables that affect canopy structure and LAI. Survival-based regression analysis can potentially result in fewer sample points by combining information across plots.
Study Design
As a simple illustration of the usefulness of the survival-based regression framework for analyzing point quadrat data, we conducted a field study in the counties of Union and Baker in northeast Oregon. Annual precipitation in this region is generally less than 50 cm/year, and the study sites were dominated by dry upland forest communities comprised of ponderosa pine (Pinus ponderosa), Douglas-fir (Pseudotsuga menziesii), grand fir (Abies grandis), and western larch (Larix occidentalis), with lesser percentages of lodgepole pine (Pinus contorta), Engelmann spruce (Picea engelmannii), and subalpine fir (Abies lasiocarpa).
We subjectively selected 60 plots across these two counties in order to obtain a wide range of forest types and structures. Within each of the 60 stands, a plot center was randomly selected. At each plot, a series of three sample points was established, each spaced 30-m apart in a north-south orientation. At each point, we tallied trees using a Spiegel Relaskop with a metric basal area factor of 4 m2 ha−1 (17.4 ft2 ac−1) and recorded the species name and diameter at breast height (dbh) for each of these trees. To obtain approximately 10–12 tree height measurements per plot, we used a variable basal area factor of either 4 m2 ha−1 (17.4 ft2 ac−1), 9 m2 ha−1 (39.2 ft2 ac−1), or 16 m2 ha−1 (69.7 ft2 ac−1) to identify a subset of trees, from which we obtained height measurements using a clinometer. Based on the dominant and codominant tree species at each plot, we categorized the sites as (1) ponderosa pine, (2) ponderosa mix, (3) mixed conifer, (4) lodgepole pine, or (5) subalpine fir forest types.
Within each stand, we established a 90-m transect, centered at the middle sample point and extending 15 m beyond the northern and southern sample points. A Leica DISTO D8 Laser Rangefinder with tilt sensor was used to measure height-to-first-leaf along this transect. This laser rangefinder has a range of 0.05–200 m with an accuracy of approximately 1 mm. It includes a “long-range mode,” which we used in bright sunlight or for very high canopies. Using this laser rangefinder, a field crew member walked along the transect and took a height-to-canopy measurement every 1 m, with no measurement taken at the 0-m mark. The rangefinder was held at breast height (approximately 1.3 m above the ground), and the tilt sensor was used to ensure that the unit was held vertically. Using the built-in wireless capability of the DISTO D8, the height measurement at each location along the transect was immediately transmitted to a handheld tablet computer being operated by another field crew member.
This procedure resulted in 90 foliage height measurements for each stand. Sky hits (those points that passed through the entire canopy without intercepting any foliage) returned an error and could not be transmitted to the tablet computer. The number of sky hits for each stand was therefore indicated by the number of missing measurements (out of 90) for that stand.
Statistical Analysis



This hazard function provides a range of monotonic increasing, decreasing, and constant canopy density profiles (Figure 1). The possible forms for D(h) given by the Weibull hazard function are undoubtedly oversimplifications of real-world canopy profiles because of the lack of unimodal or bimodal hazard functions, but the flexibility of these potential canopy density profiles makes the Weibull hazard function ideal for illustrative purposes and for modeling different forest types with disparate canopy structures. In addition, this distribution has previously been used for modeling canopy structure in various forms (Schreuder and Swank 1974, Moril and Hagihara 1991, Yang et al. 1993, Coops et al. 2007), although not via the hazard function. Alternative parametric functions that are commonly used in survival analysis (including the exponential, log-normal, log-logistic, Cauchy, and gamma distributions) were considered but were ultimately not selected owing to various limitations.
Various density functions given by the two-parameter Weibull distribution. Although limited to monotonic functions, the Weibull hazard function provides a wide range of canopy profile shapes.


Incorporating Sky Hits into the Regression Framework
In the nonparametric methods proposed by MacArthur and Horn (1969) and Radtke and Bolstad (2001), sky hits do not present any issues. In the parametric framework, however, sky hits propose a unique challenge, as their survival times are infinite. These points cannot be trivially excluded from the analysis, as they are informative with respect to total leaf area. They also cannot be included as censored observations, as such observations are still assumed to have finite failure times.
. By ignoring the difference between foliage and woody canopy material, this latter area includes all of the locations in the stand that would result in sky hits. Implicit in this partitioning is the assumption that foliage movement is minimal and that the gap fraction is roughly constant over short periods of time. From Equation 2, we see that the canopy density function can be written as the derivative of the LAI function. That is, 


is the density function at height h for the portion of the stand not beneath foliage (and is trivially equal to 0 for all values of h). This area-weighted canopy density function is identical to a hazard mixture model in industrial and technological survival analysis settings (Abdelfattah and Mustafa 2012), in which a device can fail for one of two reasons, but it cannot be at risk of failing for both reasons simultaneously.

An important implication of this result is that estimation of the survival regression parameters can be conducted using only the intercepted laser hits, thus eliminating the difficulty of incorporating infinite survival times into the model. Under this framework, estimation of the overall parametric density equation is a four-step process: (1) estimate DB(h) using only the laser hits that intercepted foliage, (2) estimate g using the proportion of sky hits, (3) estimate D(h) by taking the product of DB(h) and (1 − g), and (4) truncate D(h) at the maximum canopy height.
Canopy Profiles and LAI Estimates
To allow for greater flexibility in the shape of the canopy density function between forest types, the Weibull regression model given in Equation 18 was fit separately for ponderosa pine, ponderosa pine mix, and mixed-conifer forest types. Using the above framework, we estimated the coefficients using only those laser shots that intercepted foliage. In addition, a value of 1.3 m was added to each laser intercept height, as the laser rangefinder was held approximately 1.3 m above the ground.
The covariates included in the regression models were basal area per hectare (BA), trees per hectare (TPH), stand density index (SDI), and indicator variables denoting the ownership type for each plot (private nonindustrial, private industrial, and USDA Forest Service [USFS] land). SDI was calculated using an additive version of Reineke's SDI given by Long and Daniel (1990), and it was first calculated for each plot and then averaged across the three plots at each site. In addition, as specified in Gove (2003), we fit a two-parameter size-biased Weibull model to the dbh measurements for each plot to estimate the complete diameter distribution of trees in that stand. The shape and scale parameters from these size-biased Weibull diameter models were included as additional covariates. Quadratic mean diameter (QMD) was originally included but was removed because of high collinearity with the size-biased Weibull diameter distribution parameters. Site-specific differences were originally modeled through a random-effects term (known as a “frailty” model in survival analysis). These results did not appreciably differ from the fixed-effects model, so this random-effects term was removed for parsimony and to facilitate site-specific estimation of the canopy profiles and LAI values.

is the matrix of the site-specific covariate vectors corresponding to each of the n leaf intercept height measurements. Although the number of measured leaf intercept heights, n, is a random variable, it is treated as a fixed constant to simplify estimation.
After estimation of regression coefficients, the site-specific density functions from Equation 18 were each multiplied by the corresponding empirical gap fraction for that site (the proportion of sky hits at that site), resulting in a smooth estimate of the canopy profile as given in Equation 23. The graphs were constructed by plotting
(h) against h and truncating the curve at the maximum tree height for each site (using the maximum of the recorded tree heights and laser intercept heights for that site). Parametric regression-based LAI values were obtained by integrating
(h) up to the maximum measured tree height, as indicated in Equation 24. We also considered using the average of the recorded tree heights, but the results did not differ appreciably from using the maximum recorded tree height.
Nonparametric canopy profiles were obtained by dividing the canopy height into 1-m intervals and using Equation 6 to calculate the Radtke-Bolstad LAI estimate within each of these intervals for each site (analogous to the Nelson-Aalen estimate). These interval-specific LAI estimates were then plotted using a stepwise linear function. Nonparametric LAI values were estimated using the Kaplan-Meier product-limit estimator (equivalent to the MacArthur-Horn estimator as given in Equation 4). As with the parametric canopy profiles, the maximum canopy height used to calculate the parametric LAI was estimated as the maximum of the recorded tree heights or laser intercept heights for each plot. The muhaz package in R was used for nonparametric estimation of the LAI estimates and canopy profiles.
LAI ratios for each forest type and covariate were calculated by exponentiating the estimated regression coefficients for each of the three forest-type survival models. For each of these LAI ratios, the asymmetric 95% confidence intervals (CIs) were obtained by exponentiating the 95% CI of the corresponding regression coefficient (assuming a Student's t-distribution). CIs for the regression-based LAI estimates and the canopy profiles are not given, because additional uncertainty is present in the estimate of the gap fraction and the maximum canopy height. These point estimates should therefore be interpreted with caution, and they are reported to allow for comparisons between the nonparametric and parametric approaches.
Results
Of the 60 sites, we classified 22 as mixed conifer, 19 as ponderosa pine, 14 as ponderosa mix, 4 as subalpine fir, and 1 as lodgepole pine. The subalpine fir and lodgepole pine sites were omitted due to low numbers, resulting in 55 sites used in the analysis. Of these remaining sites, 18 were on USFS land, 16 were on private nonindustrial land, and 21 were on private industrial land.
The sites were primarily open, with gap fractions ranging from approximately 60–70% (as estimated by the fraction of sky hits). The sites in each of the three remaining forest types were generally comparable in QMD, height, and live crown ratio (Table 1). The mixed-conifer stands showed fewer signs of active management and/or timber harvests. Accordingly, these stands had greater crown closure and denser foliage throughout the canopy strata. These mixed conifer sites had slightly more BA than ponderosa and ponderosa mix stands (29.5 m2 ha−1 compared with 21.8 and 21.0 m2 ha−1, respectively; Table 1). TPH and SDI were also markedly larger in mixed conifer stands.
Summary statistics of the 55* forest plots.

Data are means ± SD.
Lodgepole pine and subalpine fir sites (n = 5) are omitted.
† Estimated using the proportion of laser sky hits.
Summary statistics of the 55* forest plots.

Data are means ± SD.
Lodgepole pine and subalpine fir sites (n = 5) are omitted.
† Estimated using the proportion of laser sky hits.
Regression-based canopy profiles for all sites are given in Figure 2. The ponderosa pine sites were generally the shortest of the three forest types, with minimal foliage density below 10 m and foliage density that increased relatively quickly near the upper portion of the canopy. In contrast, the ponderosa mix sites showed more uniform foliage density throughout all strata. The mixed-conifer stands showed the most even distribution of foliage throughout the canopy, with the majority of stands having very little change in foliage density from 10 m in height to the top of the canopy.
The estimated canopy profiles by forest type for all sites using the Weibull regression model. The maximum canopy profile for each site is equal to the maximum measured tree height (or laser intercept height) at that site. The canopy area contained within each profile is equal to the parametric LAI estimate for that site.
Because of the small number of laser intercepts for each stand (approximately 30 non-sky hits for each site), very few laser shots penetrated into the upper portions of the canopy. Accordingly, the nonparametric profiles drastically underestimate canopy density, with no foliage or woody material estimated above 25 m for any of the stands (Figure 3). Conversely, the parametric regression model interpolated the canopy density profiles to the upper portions of the canopy. For stands with dense foliage close to the ground (e.g., sites 3 and 6 of Figure 3), the parametric and nonparametric profiles show the largest discrepancies. For stands with less dense foliage (e.g., sites 1 and 2), the parametric and nonparametric profiles are much more similar, because the laser shots were able to penetrate farther into the canopy. This improvement by the parametric canopy profiles in dense stands highlights the benefit of a parametric approach: under the assumption that the underlying probability model is approximately correct, height measurements at the upper portions of the canopy are not needed to infer the density at that height.
The estimated canopy profiles for six different sites. The top row gives the nonparametric estimates and the bottom row gives the corresponding parametric estimates.
The parametric LAI estimates reflect these qualitative differences in canopy profile form. The mixed conifer sites had the largest mean LAI estimate, with a value of 4.4 (95% CI = 1.7–7.0). The ponderosa pine sites had a substantially higher mean LAI estimate than the ponderosa mix sites, with a value of 4.1 (95% CI = 2.6–5.6) compared with 2.4 (95% CI = 1.4–3.4).
The MacArthur-Horn estimates of LAI were markedly less than the corresponding parametric values (Figure 4). The empirical 5 and 95% quantiles for the regression-based LAI estimates and MacArthur-Horn LAI estimates were (0.9, 9.0) and (0.1, 1.0), respectively, showing large differences in the LAI estimates between the two methods.
The estimated MacArthur-Horn LAI values and the regression-based LAI values for each site by forest type.
The estimated LAI ratios from the three Weibull regression models are given in Table 2. These ratios correspond to the relative difference in LAI between two stands that differ only by one unit of the reference variable and have all remaining covariates identical. The results show that private industrial corresponds to significantly higher LAI for all three forest types than identical sites on USDA Forest Service land, with LAI ratios of 1.63 (95% CI = 1.23–2.15), 1.63 (95% CI = 1.20–2.22), and 2.38 (95% CI = 1.92–2.96) for ponderosa, ponderosa mix, and mixed-conifer forest types, respectively. Conversely, a site on private property is estimated to have a reduction in LAI by a factor of 0.70 (95% CI = 0.55–0.90) compared with a similar site on USDA Forest Service property.
Results from the Weibull regression models for ponderosa pine, ponderosa pine mix, and mixed-conifer forest types.

* Equals the relative change in LAI and canopy density, holding all other covariates and the gap fraction fixed.
† Corresponds to a 5 m2 ha−1 increase.
‡ Corresponds to a 100 TPH increase.
§ Corresponds to a 10-unit increase.
Results from the Weibull regression models for ponderosa pine, ponderosa pine mix, and mixed-conifer forest types.

* Equals the relative change in LAI and canopy density, holding all other covariates and the gap fraction fixed.
† Corresponds to a 5 m2 ha−1 increase.
‡ Corresponds to a 100 TPH increase.
§ Corresponds to a 10-unit increase.
Across all three sites, an increase in basal area corresponded to a decrease in canopy density (potentially due to increased competition and decreased growing space). Specifically, a 5-m2 ha−1 decrease in BA resulted in a reduction in LAI by factors of 0.81, 0.89, and 0.80 for ponderosa, ponderosa mix, and mixed-conifer sites, respectively. Although this result may seem counterintuitive, it is important to realize that this relationship is the result of holding all other variables constant (including the gap fraction). Neither TPH nor SDI showed any obvious trends. An increase in TPH led to an increase in density for ponderosa and mixed-conifer sites, but it led to a decrease in density for ponderosa mix sites. Conversely, an increase in SDI corresponded to an increase in LAI among ponderosa mix and mixed-conifer sites, but a decrease in LAI for ponderosa sites.
Discussion
The primary objectives of this study were to outline the relationship between survival analysis and laser point quadrat sampling and to provide a simple example that illustrates the utility of this relationship. In this study we used only 90 sample points per stand, with approximately 35% of the laser shots intercepting foliage. For comparison, Radtke and Bolstad (2001) reported mixed results with approximately 1,000 sample points per stand, and Fukushima et al. (1998) recommended approximately 1,600 sample points per stand using the traditional MacArthur-Horn estimators. By using a parametric approach and incorporating covariates into a regression framework, we were able to obtain smooth canopy profiles and LAI estimates with a fraction of the sample points used in previous studies. Direct validation of these parametric estimates was beyond the scope of this study, although the results allow for qualitative comparisons between the nonparametric and survival-based approaches to analyzing laser point quadrat data.
To improve on the applicability of combining survival analysis and point quadrat sampling, additional research is needed in five distinct areas: (1) appropriate selection of the density (hazard) function, (2) simpler methods for incorporating sky hits, (3) techniques for addressing multiple sources of error in the estimation process, (4) tools for distinguishing LAI from plant area index, and (5) techniques for adjusting for clumping and interdependence between sample points.
Although we selected the Weibull hazard function for its simplicity, flexibility, and interpretability, it is probably ill-suited for those situations in which highly detailed canopy profile estimates are needed. The parametric canopy profiles shown in this article allow for general, qualitative comparisons between forest canopies, but the available canopy profile forms were limited by the monotonic nature of the Weibull hazard function. Overall, selection of an appropriate hazard function for laser survival times is difficult. Distributions such as the log-normal, log-logistic, and Cauchy have unimodal hazard functions, but the maximum values of these hazard functions (and accordingly, the height of the maximum canopy density) typically occur much closer to the ground than would be seen in most forest stands. Additional distributions common to survival analysis were also investigated for use in this study (including the exponential distribution, the normal distribution, the t-distribution, the gamma distribution, the extreme-value distribution, and the uniform distribution) but were likewise not used either because the corresponding hazard profiles did not reflect standard canopy profiles or because the hazard function did not provide the desired flexibility in canopy profiles. The suitability of additional distributions not typically used in survival analysis will need to be investigated on a one-by-one basis, as the shape of the hazard function is often not known or is rarely reported.
To achieve more realistic canopy profiles using survival analysis, more complex modeling methods or distributions may be needed. Finite mixture models (Schlattmann 2009) would allow for the canopy density function to be a modeled using a combination of simple hazard functions. This could be used in forests with bimodal or nonstandard canopy structures. Bayesian methods (both parametric and nonparametric) would also probably improve canopy profile estimation by allowing for the specification of a baseline canopy density function that is then updated to reflect the laser intercept heights and covariates for each stand. Appropriate selection of this baseline (prior) distribution would allow for more realistic canopy profiles that approximate known canopy structures and borrow results from past research.
An additional possibility for improving hazard estimation is to take a nonparametric or semiparametric approach. The Weibull distribution used here is an example of a proportional hazards models, because the hazard functions differ only by a constant of proportionality given by their covariates. More general proportional hazards models make no assumptions about the underlying baseline hazard. If the assumption of proportional hazards holds, then it is possible to estimate the relative impact of each covariate without specifying the failure-time distribution (Cox 1972), thus limiting the effect of model misspecification. We considered various nonparametric approaches to modeling canopy profiles, such as using a piecewise constant hazard model, but such models offered no improvement over the traditional nonparametric MacArthur-Horn approach with regards to sample size, accuracy, or resulting canopy profile shapes.
More parsimonious regression approaches than the one used in this study may also be possible. Because of sky hits, we used a two-step procedure to estimate the density function. A drawback of the parametric estimation procedure that we selected is that error is present in three places: the estimate of the gap fraction, the estimate of
(h), and the estimate of the upper portion of the canopy profile. For this reason, statistically valid 95% CIs and SEs were not calculable for the parametric canopy profiles or LAI estimates. Although this is undoubtedly a limitation of this approach, we intend the methods presented here to serve as a useful starting point for future research in this area. Ideally, it would be desirable to develop a methodology that either does not require data partitioning or allows for simultaneous estimation of the density parameters and the gap fraction parameters. In hope of avoiding this two-step procedure, we investigated the use of infinite-valued distributions, mixture distribution, and cure fraction models, where a subset of patients are “cured” and no longer at risk of failure (De Angelis et al. 1999, Lambert et al. 2007). All of these models allow for a partitioning of the data into two general cohorts (i.e., those points beneath foliage and not beneath foliage), but they all invoke additional constraints that are not immediately relevant in the point quadrat setting. Further research into the use of these distributions in laser point quadrat sampling is needed.
As discussed, our use of the term “leaf area index” throughout this article refers to the “plant area index” because both foliage and woody material returned laser distance readings. Many laser rangefinders (including the Leica DISTO used in this study) include a digital display that allows the user to identify the exact object that is reflecting the laser shot. By recording this additional information at each laser hit, two existing survival analysis methods can be used to distinguish LAI from plant area index. The first method involves modeling the woody laser hits as censored values, because these shots were intercepted (censored) by woody material before they could be intercepted by foliage. Including these censored values into a regression model for failure time data is simple, because most statistical analysis packages for survival analysis are specifically designed to incorporate censored values. A second method for incorporating woody material is via a “competing risks model” (Kalbfleisch and Prentice 2002). In these models, multiple sources of failure are accounted for, and hazard function estimation is conducted for each source. In the point quadrat setting, this would allow for the estimation of two separate density functions: one corresponding to LAI and one corresponding to wood area index. Methods for implementing competing risk models are likewise available in many statistical software packages.
Last, clumping of foliage along with the interdependence of foliage remains an issue with point quadrat sampling. The presence of strong clumping (particularly in coniferous stands) can make it impossible for the laser to reach some foliage that is totally obscured by other foliage. This limitation has been acknowledged and well-documented elsewhere (see Radtke and Bolstad 2001), although there are no well-established methods of adjusting for clumping in the laser point quadrat setting.
Conclusions
Survival analysis methods provide a promising set of statistical tools for analyzing point quadrat data. Although additional research is needed to explore the effectiveness and limitations of this method, survival-based point quadrat sampling is a useful technique for quickly obtaining LAI estimates, modeling canopy profiles, and identifying stand-level covariates that affect canopy density. The flexibility and simplicity of survival-based regression models make them ideal for identifying stands with poor canopy structures, for conducting widespread forest health studies or for assessing wildlife habitat. Although other optical sampling methods such as LiDAR offer much higher spatial resolution, the advantage of laser point quadrat sampling is that rangefinders are inexpensive, easy to operate, and portable, and little is needed in terms of operational infrastructure. In addition, the laser point quadrat sampling method we used took less than 5 minutes to complete per plot, and the ancillary regression covariates we included are typically already gathered in most forest inventory procedures. Overall, the relationship between point quadrat sampling and survival analysis increases the efficiency and functionality of point quadrat sampling, resulting in a useful tool for quickly and accurately assessing the structure of forest canopies.
Acknowledgments: Funding for this research was provided in part by the Disaster Resilience for Rural Communities Program through the USDA National Institute of Food and Agriculture (Award 2010-67023-21705) and by the New Hampshire Agricultural Experiment Station.
Literature Cited



