Ryan S Terrill and others, Threshold models improve estimates of molt parameters in datasets with small sample sizes, Ornithology, Volume 138, Issue 3, 1 July 2021, ukab038, https://doi.org/10.1093/ornithology/ukab038
Abstract
The timing of events in birds’ annual cycles is important to understanding life history evolution and response to global climate change. Molt timing is often measured as an index of the sum of grown feather proportion or mass within the primary flight feathers. The distribution of these molt data over time has proven difficult to model with standard linear models. The parameters of interest are at change points in model fit over time, and so least-squares regression models that assume molt is linear violate the assumption of even variance. This has led to the introduction of other nonparametric models to estimate molt parameters. Hinge models directly estimate changes in model fit and have been used in many systems to find change points in data distributions. Here, we apply a hinge model to molt timing, through the introduction of a double-hinge (DH) threshold model. We then examine its performance in comparison to current models using simulated and empirical data. Our results suggest that the Underhill–Zucchini (UZ) and Pimm models perform well under many circumstances and appear to outperform the DH model in datasets with high variance. The DH model outperforms the UZ model at low sample sizes of birds in active molt and shorter molt durations and provides more realistic confidence intervals at smaller sample sizes. The DH model provides a novel addition to the toolkit for estimating molt phenology, expanding the conditions under which molt can accurately be estimated.
Resumen
El momento en que ocurren los eventos en los ciclos anuales de las aves es importante para entender la evolución de la historia de vida y la respuesta al cambio climático global. El momento de la muda a menudo se mide como un índice de la suma de la proporción o la masa de las plumas crecidas dentro de las plumas de vuelo primarias. La distribución de estos datos de muda a lo largo del tiempo ha resultado difícil de modelar con modelos lineales estándares. Los parámetros de interés se encuentran en puntos de cambio en el ajuste del modelo a lo largo del tiempo, por lo que los modelos de regresión de mínimos cuadrados que asumen que la muda es lineal violan el supuesto de homogeneidad de varianza. Esto ha llevado a la introducción de otros modelos no paramétricos para estimar los parámetros de la muda. Los modelos de bisagra estiman directamente los cambios en el ajuste del modelo y se han utilizado en muchos sistemas para encontrar puntos de cambio en las distribuciones de datos. Aquí, aplicamos un modelo de bisagra a la temporalidad de la muda, mediante la introducción de un modelo de umbral de doble bisagra (DB). Luego, examinamos su desempeño en comparación con los modelos actuales utilizando datos empíricos y simulados. Nuestros resultados sugieren que los modelos Underhill-Zucchini (UZ) y Pimm funcionan bien en muchas circunstancias y parecen superar al modelo DB en conjuntos de datos con alta varianza. El modelo DB supera al modelo UZ en tamaños de muestra bajos de aves en muda activa y duraciones de muda más cortas y proporciona intervalos de confianza más realistas en tamaños de muestra más pequeños. El modelo DB representa una herramienta adicional novedosa para estimar la fenología de la muda, expandiendo las condiciones bajo las cuales la muda se puede estimar con precisión.
Lay Summary
Modeling molt timing has been difficult for researchers, especially when sample sizes of birds in active molt are small.
Many species of birds in museum and banding databases are represented by relatively small sample sizes of birds in active molt.
We apply a threshold model approach by developing a novel double-hinge threshold model that estimates onset and termination of molt.
We test this new model and existing models by comparing performance on simulated datasets and find that the threshold model performs well with small sample sizes of birds in active molt.
We also demonstrate the threshold model on empirical datasets and make this model available in an update to the existing R package chngpt.
INTRODUCTION
Avian molt is an important annual cycle event in which birds replace worn feathers to maintain basic functions such as thermoregulation, flight, and signaling (Murphy et al. 1988, Holmgren and Hedenström 1995). The timing of molt is structured by many factors, especially other life history events such as breeding (Hahn et al. 1992, Johnson et al. 2012) and migration (Yuri and Rohwer 1997), as well as the seasonal variation in abundance of food resources (Butler et al. 2002, Pyle et al. 2009, Rohwer and Rohwer 2009). Understanding avian molt phenology is important for a wide range of processes, such as understanding life history and phenotype evolution (Terrill et al. 2020, Wolfe et al. 2021), effects of anthropogenic climate change on birds (Kiat et al. 2019), and carryover effects throughout the annual cycle (Norris et al. 2004). Because molt is the most important annual cycle event for birds in terms of survival, it is essential that we are able to accurately estimate both the timing and pace of this phenological event.
Precisely describing the timing of molt has been a statistical challenge. The timing and pace of feather molt can be studied from museum specimens, especially with spread-wing preparations (Rohwer and Rohwer 2009), as well as field studies with mist-nets (Pyle et al. 2018), and even photographs (Vieira et al. 2017). Although many species undergo multiple molts a year, all species exhibit the prebasic molt: an annual, complete or nearly complete molt of the feathers (Humphrey and Parkes 1959, Howell 2003, Wolfe et al. 2014). One widely used method for modeling molt timing is to take advantage of the sequential molt of primaries in many birds (Stresemann and Stresemann 1966). By scoring each primary feather, a molt score can be calculated for each bird. Traditionally, researchers had difficulty fitting molt data to a linear model because disregarding birds not in active molt results in uneven variance, and so least-squares methods produced inaccurate parameter estimates (Pimm 1976, Erni et al. 2013) because the assumption of homoscedasticity is violated (Underhill and Zucchini 1988).
Pimm (1976) first outlined a solution to this violation of assumptions by reversing the dependent and independent variables. While predicting timing as a function of molt reduces the problem of heteroscedasticity and provides a useful heuristic, this approach may not be sufficient for those interested in predicting molt as a function of time. Underhill and Zucchini (1988) found an alternative solution to the violation of homoscedasticity by introducing a nonparametric model and deriving the likelihood function for that model. This Underhill–Zucchini (UZ) model can be implemented in the R package moult (Erni et al. 2013).
However, the UZ and Pimm models often fail to converge to a solution, and thus to provide the user with a model, when sample sizes of birds in molt are low, and they sometimes produce parameter estimates that appear unrealistic when examined alongside the data (Erni et al. 2013; Figure 3). Furthermore, the UZ and Pimm models estimate molt initiation and duration but do not directly estimate molt termination, and thus the standard error for initiation and termination is assumed to be the same. Another model, introduced by Rothery and Newton (2002), fits a probit model to molt data when birds are scored as molting and non-molting. This model can independently assess molt initiation and completion times but cannot directly describe variation in molt intensity. Thus, there is a need for developing molt models that address these shortcomings.
Here, we introduce a double-hinge threshold model (hereafter DH) that independently assesses initiation and termination of molt, handles small sample sizes of molting birds, and can incorporate measures of molt intensity. In order to incorporate non-molting birds, this model does not assume even sampling throughout the year, because it is explicitly searching for change points (hinges) in the data. Scoring feathers as “already molted” or “not molted yet” gets more difficult when feathers are neither fresh nor worn, and so the most useful feather scores on non-molting birds are just before and after molt. The DH model can best incorporate information from non-molting birds into models by allowing researchers to choose data from non-molting birds that are most obviously pre or post-molt, based on feather wear, and exclude ambiguous birds. We compare the DH to the UZ and Pimm models using both simulated and empirical data and discuss the uses and relative strengths and weaknesses of each model for estimating molt phenology.
METHODS
Model Structure
The DH model is held constant outside the hinges and linear between the hinges (Figure 1). The hinges are defined as the points where the best-fit function of the model to the data changes, given our assumptions about the model shape (Fong et al. 2017; Figure 1). Given a dataset $\{(x_i, y_i)\}_{i=1}^{n}$, where x represents time (generally Julian day) and y represents molt completion (generally % of prebasic molt completed), the model estimates the locations of the hinges as e1 and e2, which represent the dates of initiation and completion of molt in a population. The constants c and d represent the y values before e1 and after e2, respectively. For the purposes of the molt models, c and d are set to 0 and 1, but this is not a requirement of the model. The function of the DH model is represented as:
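Written out from these definitions, the piecewise mean function takes the form sketched below (the additive error term $\varepsilon_i$ is an assumption of this sketch):

$$
f(x_i;\, e_1, e_2) =
\begin{cases}
c, & x_i \le e_1 \\
c + \dfrac{d - c}{e_2 - e_1}\,(x_i - e_1), & e_1 < x_i < e_2 \\
d, & x_i \ge e_2
\end{cases}
\qquad
y_i = f(x_i;\, e_1, e_2) + \varepsilon_i
$$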
The DH threshold model. The model is constant before the first hinge (e1) and after the second hinge (e2) and takes a linear function between the 2 hinges, with a slope that estimates molt intensity.
Estimation and Inference
Given the above dataset, the least-squares estimator of the hinge parameters is defined as:
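A sketch of that estimator in the notation above, with $f(x_i;\, e_1, e_2)$ the piecewise mean function from the Model Structure section:

$$
(\hat{e}_1, \hat{e}_2) = \underset{e_1 < e_2}{\operatorname{arg\,min}} \; \sum_{i=1}^{n} \bigl(y_i - f(x_i;\, e_1, e_2)\bigr)^2
$$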
To find the estimated hinge parameters, we perform a grid search over all pairs of observed x values. To speed up the search, this step is implemented in the C programming language and can be run in parallel over multiple processing cores. This process can be bootstrapped to generate confidence intervals (CIs) around the estimates of the hinges in the model, which can be especially useful for hypothesis testing. We construct bootstrap CIs for the parameter estimates by the $1-\alpha$ symmetric percentile method (Hansen 2017). For example, if $\hat{q}_1$ is the $1-\alpha$ empirical quantile of the absolute difference between the bootstrap parameter estimates and the point estimate, the interval for $e_1$ is $\hat{e}_1 \pm \hat{q}_1$. We also obtain standard error estimates as $\hat{q}_1 / z_{1-\alpha/2}$, where the denominator is the $1-\alpha/2$ quantile of the standard normal distribution. We make this model available in an update to the R package chngpt 2020.1-9 (Fong et al. 2017), which we implemented for this analysis in R 3.6.1 (R Core Development Team 2020).
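As an illustration, the sketch below implements the grid-search least-squares fit and symmetric percentile bootstrap described above in plain R. The function names (fit_dh, boot_dh) and defaults are illustrative only and do not reproduce the chngpt implementation, which performs the search in compiled C.

```r
# Grid-search least-squares fit of the double-hinge model over all ordered
# pairs of observed x values; c0 and d0 are the pre- and post-molt constants.
fit_dh <- function(x, y, c0 = 0, d0 = 1) {
  dh_pred <- function(x, e1, e2) {
    ifelse(x <= e1, c0,
           ifelse(x >= e2, d0, c0 + (d0 - c0) * (x - e1) / (e2 - e1)))
  }
  xs <- sort(unique(x))
  best <- list(sse = Inf, e1 = NA, e2 = NA)
  for (i in seq_along(xs)) {
    for (j in seq_along(xs)) {
      if (xs[j] <= xs[i]) next
      sse <- sum((y - dh_pred(x, xs[i], xs[j]))^2)
      if (sse < best$sse) best <- list(sse = sse, e1 = xs[i], e2 = xs[j])
    }
  }
  best
}

# Symmetric percentile bootstrap CIs and standard errors for the hinge estimates.
boot_dh <- function(x, y, B = 1000, alpha = 0.05) {
  fit <- fit_dh(x, y)
  reps <- replicate(B, {
    idx <- sample(seq_along(x), replace = TRUE)
    b <- fit_dh(x[idx], y[idx])
    c(b$e1, b$e2)
  })
  q1 <- unname(quantile(abs(reps[1, ] - fit$e1), 1 - alpha))
  q2 <- unname(quantile(abs(reps[2, ] - fit$e2), 1 - alpha))
  list(e1 = fit$e1, e1_ci = c(fit$e1 - q1, fit$e1 + q1),
       e2 = fit$e2, e2_ci = c(fit$e2 - q2, fit$e2 + q2),
       se_e1 = q1 / qnorm(1 - alpha / 2),
       se_e2 = q2 / qnorm(1 - alpha / 2))
}
```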
Empirical Data
To demonstrate the utility of the DH model on empirical data, we compiled various data sources to fit and compare the UZ, Pimm, and DH models. To obtain a robust dataset of birds with a relatively well-known molt season, we compiled a dataset of 147 Painted Buntings (Passerina ciris) from specimens we examined at the Moore Lab of Zoology at Occidental College (MLZ) and in the field in Guiracoba, Sonora, Mexico (26.903, –108.695). We scored each primary feather from 0 (old) to 5 (fresh), with scores of 1–4 indicating 1–24%, 25–49%, 50–74%, and 75–99% active feather growth, and calculated the molt score as the sum of scores divided by the number of primaries. We then fit the DH, UZ, and Pimm models to the Painted Bunting data and used 1,000 bootstraps to generate CIs around the initiation and completion parameters for the DH model. We also fit the DH, Pimm, and UZ models to the “Sanderlings” data provided with the package moult, which represent 164 Sanderlings captured in South Africa over the winter of 1978/1979 (Underhill and Zucchini 1988). We then calculated the goodness of fit of each model for both empirical datasets by predicting molt scores at each observed date for each model and then calculating an R2 value from the residuals based on observed and predicted data. Because the Pimm model flips the dependent and independent variables, the residuals output by the model are not appropriate for goodness of fit, as the model will calculate residuals along its dependent axis (date) instead of the appropriate independent axis (molt). Instead, we calculated residuals manually along the x axis (which is the “true” y axis, molt) and then used those residuals to calculate R2.
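For the Pimm model, a minimal sketch of this manual goodness-of-fit calculation might look as follows; the function name, the restriction of the regression to actively molting birds, and the clamping of predicted molt scores to [0, 1] are assumptions for illustration.

```r
# Manual R-squared for the Pimm model: the model regresses date on molt score,
# so residuals must be recomputed along the molt axis before computing R2.
pimm_r2 <- function(day, molt) {
  active <- molt > 0 & molt < 1   # assume the regression uses birds in active molt
  fit <- lm(day ~ molt, data = data.frame(day = day[active], molt = molt[active]))
  b0 <- coef(fit)[[1]]
  b1 <- coef(fit)[[2]]
  # Invert the fitted line to predict molt score from date, clamped to [0, 1]
  pred_molt <- pmin(pmax((day - b0) / b1, 0), 1)
  1 - sum((molt - pred_molt)^2) / sum((molt - mean(molt))^2)
}
```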
Data Simulation
In order to compare the DH model with the UZ and Pimm models, we simulated molt data so that we would have prior knowledge of the parameters the models estimated. To do this, we wrote a custom molt simulation function in R 4.0.3 (R Core Development Team 2020) that is available from Terrill et al. (2021). This function takes as input the initiation date of molt, duration, variance, and sample size (n). Given these inputs, it draws n random samples from throughout the year. Samples from before molt initiation are given a molt score of 0 and samples from after molt termination a score of 1. Between initiation and termination, the y value follows the linear function y = mx + b, where y = molt completion, m = molt intensity, x = time, and b is the y-intercept of the line between initiation and completion. To mimic natural populations, a variance term is included: for every draw of the simulation, initiation is not taken directly from the input but drawn randomly from a normal distribution whose mean and variance are input by the user. Sampling can also be limited to a time period, with user inputs of start and end corresponding to the Julian days on which to begin and end sampling. This will be most useful to users attempting to simulate a limited field season's worth of data.
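A minimal sketch of a simulation function of this kind is shown below; the function name and arguments are illustrative and do not reproduce the code archived in Terrill et al. (2021).

```r
# Simulate molt scores for n birds: sampling dates are drawn uniformly within
# the season, each bird's initiation date is drawn from a normal distribution,
# and molt progresses linearly from 0 to 1 over the given duration.
simulate_molt <- function(n, initiation, duration, sd_init,
                          start = 0, end = 365) {
  day  <- runif(n, start, end)                       # sampling dates
  init <- rnorm(n, mean = initiation, sd = sd_init)  # individual initiation dates
  molt <- (day - init) / duration                    # linear progression
  molt <- pmin(pmax(molt, 0), 1)                     # 0 before initiation, 1 after completion
  data.frame(day = day, molt = molt)
}

# Example: 50 birds with molt starting around day 150 (SD 10) and lasting 50 days
set.seed(1)
sim <- simulate_molt(n = 50, initiation = 150, duration = 50, sd_init = 10)
```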
Simulated Data Model Comparison
We simulated 500 datasets to compare fits of the DH, UZ, and Pimm models, composed of 100 datasets of varying sample sizes, 100 of varying initiation variance, 100 of varying molt durations, 100 with a “hidden population” of 10% of the sample size with molt initiation shifted in time away from mean molt initiation, and 100 with variation in the molt intensity. The sample size datasets were simulated with n of 5, 10, 15, 20, 30, 40, 50, 75, 100, 150, 200, and 250. The variance datasets were simulated with a σ of 1, 2, 3, 4, 8, 13, 21, 34, 55, 89, and 144 days, and the duration datasets were simulated with durations of 1, 2, 5, 10, 20, 30, 40, 50, 100, 150, 200, and 250 days. For the hidden population test, hidden populations were created with 10 individuals offset from the molt timing of the main population by –50, –30, –20, –10, –5, 0, 5, 10, 20, 30, 50, and 60 days. All nonvarying parameters were held constant across the simulations, with start and end of sampling at days 0 and 365; molt initiation at day 150 with a standard deviation of 10 days for the sample size simulations; molt initiation day of 150 and a sample size of 50 for the variance simulations; and molt initiation of 50, sample size of 50, and variance of 10 for the duration simulations. We then fit the UZ model to each set of data using the package moult (Erni et al. 2013) in R (R Core Development Team 2020) and fit Pimm models using a linear model with the date as the response variable, following Pimm (1976). We then fit the same datasets to the DH model in the package chngpt (Fong et al. 2017) in R and obtained CIs using 1,000 bootstrap replicates. We compared the parameters estimated by each model to the input parameters for the simulated data by testing whether the input parameters were within the 95% CI for each model generated.
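The comparison itself reduces to a simple coverage check, sketched here under the assumption that each fitted model returns a lower and upper 95% CI bound for each parameter; the names are illustrative.

```r
# A model counts as "correct" if the input (true) parameter lies inside its
# 95% CI, "incorrect" if the CI excludes it, and "no model" if the fit failed.
covers <- function(true_value, ci) {
  if (is.null(ci) || any(is.na(ci))) return(NA)  # fit failed to converge
  true_value >= ci[1] && true_value <= ci[2]
}

# e.g., proportion of simulated fits whose initiation CI covers the input day 150:
# mean(vapply(fits, function(f) isTRUE(covers(150, f$e1_ci)), logical(1)))
```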
RESULTS
Empirical Data
When we fit all 3 models to the Painted Bunting data, the DH model (R2 = 0.851; Figure 2) estimated a start date for the prebasic molt as August 1 with a 95% CI from July 27 to August 6 and a completion date of September 30 (95% CI: September 26–October 4). The UZ (R2 = 0.847) model estimated the start date of the Painted Bunting molt as August 4 (95% CI: August 2–7), and the termination date as September 29 (95% CI: September 26–October 2). The Pimm model (R2 = 0.820) estimated the start date of Painted Bunting molt as August 5, with a 95% CI of August 2–7, and a termination date of September 26, with a 95% CI from September 22 to September 30.
An example of implementation of the DH model on Painted Buntings molting in the North American Monsoon region of Western Mexico. The top plot shows the data with the DH model plotted in red, and the bottom plot shows histograms of bootstrap estimates of the initiation and termination parameters, with the 95% confidence intervals shown by vertical dashed lines.
For the Sanderling data (Figure 3), the DH model (R2 = 0.877) estimated the date of onset of molt as October 26 (95% CI: October 26–November 5) and termination on February 24 (95% CI: February 24–March 27). The UZ model (R2 = 0.851) estimated a later onset of November 9 (95% CI: November 5–13) and a termination date of February 13 (95% CI: January 31–February 26). The Pimm model (R2 = 0.870) estimated onset similarly to the DH model, though with a narrower CI (Figure 3), on October 29 (95% CI: October 26–November 1), with a termination date of March 4 (95% CI: March 1–17).
The UZ (A), DH (B), and Pimm (C) models fitted to the “Sanderling” dataset from Underhill and Zucchini (1988). Models are plotted as solid lines, and upper and lower bounds of 95% confidence intervals are indicated by dashed lines. The UZ model (R2 = 0.851) appears to estimate an initiation date that is too late and a termination date that is too early for these data, whereas the DH (R2 = 0.877) and Pimm (R2 = 0.870) models produce what appear to be more accurate estimates of these parameters. The Pimm model presents a small confidence interval that excludes many data points, whereas the DH model provides a broader confidence interval. Note, however, that the DH model cannot explore x-axis space beyond the data points, and so the lower bound of the 95% confidence interval for initiation is the same as the parameter estimate, because both occur at the first data point.
Simulated Data
For sample size variation, the DH model performed better than the UZ model when fit to simulated data, especially at smaller sample sizes and durations (Figures 4 and 5; Supplementary Material Figures S1 and S2). Similar to the UZ model, the Pimm models failed to converge and produce parameter estimates at smaller sample sizes (about <15 individuals), because there were too few birds in active molt. However, when Pimm models did produce parameter estimates for datasets with smaller sample sizes, these models tended to estimate parameters better than UZ models, but both were still inferior to DH models in these situations (Figures 4 and 5; Supplementary Material Figures S1 and S2). The UZ models often produced narrow CIs that did not encompass the input parameters, especially in datasets with fewer than 50 data points (Figure 4). This behavior likely stemmed from an increased reliance on birds in active molt in Pimm and UZ models compared to the DH model. Variance around the mean molt onset date did little to affect model performance, with all 3 models performing adequately; however, at high variance, the DH models tended to consistently overestimate the termination parameter (Supplementary Material Figures S1 and S2), even if the input parameter was within the 95% CI (Figure 5; Supplementary Material Figures S1 and S2). The one exception was the DH model, which produced only 75 accurate models out of 100 when variance was 144 days.
Data simulated to approximate a natural population of birds in molt, with molt initiation set to day 100 (dashed line) with a standard deviation of 10, and molt completion at day 150 (dashed line) with a standard deviation of 10. The upper 3 rows show the raw data, with start and end days as dashed lines. The lower 2 rows show model fits to each dataset, with parameter estimates as black dots and 95% confidence intervals as vertical lines. The x axis (Julian day) of the top 3 rows corresponds to the y axis of the bottom 2 rows. Confidence intervals that continue off the plot are marked with arrows. We used the DH, UZ, and Pimm models to fit simulated data with different sample sizes (n = 5 to n = 250) to understand how the 3 models performed in estimating the known parameters. We found that the small confidence intervals of the UZ model sometimes excluded the true parameters, whereas the larger confidence intervals of the DH model at small sample sizes encompassed the input parameters; the DH confidence intervals appropriately narrowed with larger sample sizes, its estimates approached the known parameters, and, importantly, the known parameters remained inside the confidence intervals.
Evaluation of model performance over 100 simulations per sample size. The top panel shows the proportion of simulations in which the input parameter falls within the model's 95% confidence interval (correct models), the proportion whose 95% confidence intervals do not encompass the input parameters (incorrect), and the proportion for which the model failed to converge (no model). The bottom panel depicts mean predicted molt initiation and termination dates by simulation, with input parameters depicted as horizontal lines.
Model performance for the set of simulations with duration variation was similar to that for the simulations with sample size variation. Datasets with short molt durations lacked individuals in active molt, and so Pimm models were unable to produce estimates for durations below ~30 days, and UZ models produced variable results for durations below ~10 days, while the DH models produced fairly consistent results across the spectrum of durations given.
When we presented the models with a “hidden population,” a population 10% the size of the original sample shifted before or after the molt onset of the main population, the Pimm model clearly outperformed the UZ model when the hidden population was shifted more than ~10 days, and outperformed the DH model when the hidden population was shifted more than ~30 days. Rate variance showed surprisingly little effect on model performance, and all models performed similarly, even up to a standard deviation of 5 days for the rate term.
DISCUSSION
Molt represents a critical phase of the avian annual cycle that informs our understanding of life history evolution (Holmgren and Hedenström 1995, Leu and Thompson 2002), social organization (Norris et al. 2004), resource use (Pyle et al. 2009, Rohwer and Rohwer 2009, Pillar et al. 2016), speciation (Rohwer and Manning 1990), physiology (Murphy et al. 1992, Murphy and Taruscio 1995, Murphy 1996, 1988), and endocrinology (Romero et al. 2005, DesRochers et al. 2009). Similar to well-documented shifts in the timing of migration and breeding seasonality (Visser et al. 2012, Clausen and Clausen 2013, Burgess et al. 2018), climate-induced shifts in the timing of molt will provide insight into the consequences of phenological mismatch (Holmgren and Hedenström 1995). Many bird species in collections are represented mostly by non-molting specimens, with few individuals in active molt (R. S. Terrill personal observation). Similarly, most birds captured for bird banding projects or photographed by community scientists are not in active primary molt (Pyle et al. 2018). Therefore, in order to understand molt, especially changes in molt phenology over time, a DH model makes the best use of data from birds both in active molt and not in active molt to estimate molt timing. In this study, we aimed to develop a model that is flexible to different types of datasets and provides accurate estimates of molt parameters in datasets with smaller sample sizes of actively molting birds, as well as to ascertain which molt models work best under which conditions.
The results from our simulation study suggest that the DH model performs better than the UZ model in datasets with few actively molting birds and so should provide a useful resource to researchers studying molt timing from these types of datasets. In datasets with 1 or 0 birds in active molt (Figure 4), the UZ model was often unable to generate a model, whereas the DH model accurately estimated molt parameters, though sometimes with large CIs in small datasets (Figure 4). Molting birds tend to be underrepresented in collections and other datasets due to collector bias (Rohwer et al. 2007), variable mist-net capture rates (Remsen and Good 1996), and seasonal bias of data collection (Saracco et al. 2009). When sample sizes were below ~50 individuals, the hinge model did not show the decline in performance seen in the UZ and Pimm models and provided correct parameter estimates in >80% of models even when sample sizes were small (Figures 4 and 5; Supplementary Material Figures S1 and S2). The UZ model, however, outperformed the hinge model when variance was over ~50 days (Figure 5; Supplementary Material Figures S1 and S2). This suggests that all 3 models should work well for highly seasonal species with large sample sizes of birds in active molt. However, the UZ model should be used if molt initiation variance is suspected to be high, and the DH model should be used when sample sizes are below about 50 individuals or the dataset contains few birds in active molt. For species with little molt data, we suggest plotting the data along with the fit of each candidate model in order to evaluate which model to use (Figure 3). Furthermore, any covariates that can separate a population into smaller demographic groups will help improve model fit if these groups vary in molt timing.
Visual examination of the fit of the UZ model to the Sanderling data reveals apparent issues with parameter estimates (Figure 3). The UZ model produces an initiation estimate that coincides with dates when many birds are already well into their molt, and a termination estimate that falls just a few days after dates on which many birds were still only a little over halfway through molt. This may be due to differences in how the models handle sampling around molt initiation: all samples but one in the Sanderling dataset are from a single day, and the UZ model identifies this single day as the mean initiation date for molt, while the DH and Pimm models are able to place molt initiation before this date. Both the DH and Pimm models provide parameter estimates that make more intuitive sense (Figure 3). By comparing residuals using R2 across the 3 models in the Sanderling and Painted Bunting data, we found that the DH model showed the highest R2 and thus the best fit. However, we urge caution in interpreting goodness-of-fit metrics across these models because the models are fundamentally different: the Pimm model is parametric, and the UZ and DH models are not. Furthermore, the DH model directly models the observed molt index as the outcome, and so residuals may unfairly favor the DH model. Goodness-of-fit metrics like R2 and the residual sum of squares may be informative about the quality of each model, but we encourage researchers to use these metrics only in concert with plotting and visually examining the fit of each model to the data. For this reason, and because we knew the true parameters, we did not compare R2 in the simulated datasets. We also urge careful interpretation of residuals from the Pimm model. Because the Pimm model flips the dependent and independent axes, residuals output by the model are calculated along the axis of date, not molt score. Proper residuals from this model must be calculated separately along the molt axis, which is the Pimm model's independent axis but the true response variable.
All 3 models are imperfect. The DH model cannot explore x-axis space beyond the data, and so while the CI on the upper range of dates encompasses the data points, the lower end of the CI at molt initiation is constrained to fall on the same day as the parameter estimate. This CI still appears to fit the data better than that of the Pimm model, which, while producing parameter estimates similar to those of the DH model, produces a narrow CI that excludes most of the data points (Figure 3). It is hard to know which model performed better on these data, since we do not know the true parameters, but given the performance of the models on the simulated data, the broader standard errors and longer molt duration appear more realistic. When we compared the UZ, DH, and Pimm models on simulated data, we found that the DH model performed better than the UZ and Pimm models in datasets with a similar sample size (Figures 4 and 5; Supplementary Material Figures S1 and S2). However, with increases in both sample size and number of birds in active molt, the UZ and Pimm models eventually approached the DH model (Figure 5; Supplementary Material Figures S1 and S2). Thus, the Sanderling dataset appears to reside within the parameter space where the DH model provides more accurate parameter estimates than the UZ model and CIs that better reflect true variance in the population than the Pimm model.
Although imperfect, the UZ, DH, and Pimm models perform well in most circumstances. In general, all 3 models provided acceptable levels of type I error (Figure 5; Supplementary Material Figures S1 and S2), and the UZ and Pimm models outperformed the hinge model at large variances (Figure 5; Supplementary Material Figures S1 and S2). This is probably because the UZ and Pimm models estimate variance as a random variable within the model, whereas the DH model treats variance as fixed. As with the UZ model, all 3 of these models are likely most appropriate for populations of birds that have seasonal and largely synchronized molt. The protracted, unsynchronized molts of some lowland rainforest understory bird species, such as Pithys albifrons, likely require different statistical models to analyze; mark-recapture of individual birds is probably the best approach for these species (Johnson et al. 2012).
The DH, UZ, and Pimm models all perform quite well in many contexts, but each has strengths and weaknesses. The DH model, specifically, outperforms the other 2 when sample sizes are small but spread out in time. These types of datasets reflect many field studies from banding or photography records and especially natural history museum holdings. It is our hope that by adding the DH model to the list of tools available to researchers, we will expand the number of species and types of data that can provide useful estimates of molt duration. The DH model will be especially useful for rare or extinct species with few specimens in museum collections. One caveat to note is that, as a nonparametric model based on grid-search bootstrapping, the DH model cannot explore axis space outside of the data points and so is limited if molt starts or ends outside or at the edge of the sampling period (Figure 3). We suggest that researchers who plan to use the DH model design their sampling scheme to include birds before and after the molting period.
Molt is an essential annual activity for birds and the most important annual cycle event for survival and renewal of the functions that feathers provide (Jenni and Winkler 2020). In order to fully understand the life history of birds, we must understand the timing and patterns of molt. So far, most of our knowledge of molt comes from a relatively small number of the world's birds (Stresemann and Stresemann 1966, Pyle 1997, Howell 2010, Jenni and Winkler 2020). Most species housed in natural history museums and banding databases are represented by relatively small sample sizes, and so estimating their molt timing has been difficult or not attempted. By introducing a DH threshold model for the timing of molt in birds, we hope to open the door to the wealth of data available to researchers interested in molt. Between community science projects like eBird, natural history collections, and banding records, there are millions of records available to study. Previously, estimation of molt timing was limited by the need to find an adequate sample size of birds in active molt. Through simulating data and then fitting a new model, we find that the DH model provides reasonable estimates of molt timing even in datasets with no birds in active molt at all (Figure 4). We hope that this model can increase not only sample sizes for species already included in molt studies, but also the range of species available for molt studies. We further recommend that researchers use this article and the associated simulation and model-fitting tools to determine the best model to use for their datasets.
ACKNOWLEDGMENTS
The authors would like to thank Jessica Oswald Terrill, John McCormack, and Peter Pyle for discussions about this paper.
Funding statement: R.S.T. was supported by a National Science Foundation (NSF) Postdoctoral Research Fellowship in Biology (PRFB 1711245) during the course of this project.
Author contributions: R.S.T. and A.J.Z. conceived of the application of the model to this system. Y.F. and R.S.T. developed and coded the model. R.S.T. and J.D.W. provided empirical data for analysis. R.S.T., Y.F., J.D.W., and A.J.Z. wrote and edited the manuscript.
Data availability: All code and data associated with this project are available at Terrill et al. (2021).




