Summary

Removal of protected species from sites scheduled for development is often a legal requirement in order to minimize the loss of biodiversity. The assumption of closure in the classic removal model will be violated if individuals become temporarily undetectable, a phenomenon commonly exhibited by reptiles and amphibians. Temporary emigration can be modeled using a multievent framework with a partial hidden process, where the underlying state process describes the movement pattern of animals between the survey area and an area outside of the study. We present a multievent removal model within a robust design framework which allows for individuals becoming temporarily unavailable for detection. We demonstrate how to investigate parameter redundancy in the model. Results suggest the use of the robust design and certain forms of constraints overcome issues of parameter redundancy. We show which combinations of parameters are estimable when the robust design reduces to a single secondary capture occasion within each primary sampling period. Additionally, we explore the benefit of the robust design on the precision of parameters using simulation. We demonstrate that the use of the robust design is highly recommended when sampling removal data. We apply our model to removal data of common lizards, Zootoca vivipara, and for this application precision of parameter estimates is further improved using an integrated model.

1 Introduction

Removal sampling is commonly used to estimate abundance of exploited populations in which captured individuals are permanently removed from a study area (Otis et al., 1978; Pollock, 1991; Hilborn and Walters, 1992). In recent years, it has been adopted as conservation management tools such as mitigation translocations (Germano et al., 2015) and removal of invasive species (Davis et al., 2016). We present our work in the context of mitigation translocations however it may be also useful for modeling invasive species removals.

Conservation mitigation translocations, the movement of protected species to prevent their extinction prior to the development of land, have become popular. Translocations involve capture, relocation and release or introduction of species from one area to another habitat. Amphibians and reptiles are frequently found at development sites in the United Kingdom, resulting in hundreds of such translocation projects annually (Germano et al., 2015). Although millions of pounds have been spent on removing protected animals out of the path of land development annually in the UK, such translocations may not meet the objective of preserving the target population as intended by legislation, and numerous reviews mention poor rates of success from these projects (e.g., Linnell et al., 1997). Failure of such projects can be a result of insufficient survey effort, resulting in too few animals being captured to establish a viable population elsewhere. Equally, many animals may go undetected at the removal site, resulting in the loss of the majority of the population when the site is developed. This raises questions concerning the amount of survey effort required to remove a significant proportion of the population. However, few publications highlight the need for improved statistical removal models to evaluate conservation actions (Griffiths et al., 2015). From a statistical perspective, the current abundance modeling approaches for removal data may give rise to misleading conclusions, as imperfect availability of individuals is ignored.

The work of this article is motivated by removal data sets which consist of counts recording the number of individuals removed at each occasion. The classic removal model was introduced by Moran (1951) and Zippin (1956) and relied on the assumptions of population closure and constant detection probability, meaning that all animals are assumed to be available for capture with the same probability throughout the study and there are no births, deaths or permanent immigrations/emigrations during the study. The basic removal model results in a geometric decline in the expected number of captured individuals over time. This classic removal model is a special case of model formula for closed populations which allows for a behavioral response to initial trapping (Otis et al., 1978).

Recently, removal models have been presented as a class of hierarchical models. Dorazio and Howard (2005) present a hierarchical removal model where the sites are assumed to have several distinct sub-sites located spatially. Chandler et al. (2011) developed a spatially explicit temporary emigration model permitting the estimation of population density for point count data such as removal sampling, double-observer sampling, and distance sampling. However, their model cannot be applied to removal data when spatial information is unavailable. More recently, Matechou et al. (2016) developed a Bayesian approach for removal data observed at a single site which allows for population renewal through birth/immigration as well as for population depletion through death/emigration in addition to the removal process. However, they assume that any emigration from the population is permanent.

Amphibians and reptiles exhibit temporary emigration as they have relatively limited dispersal abilities (e.g., up to a few hundred square metres for common lizards) and their daily activities are secretive. For instance, they can temporarily hide in shelters such as burrows and vegetation for extended periods resulting in zero detection probabilities during some sampling occasions for part of the population (Edgar et al., 2010). Their behavior is also highly weather dependent because they rely on the external environment to raise their body temperature. For example, slow-worms, Anguis fragilis, a legless lizard, primarily live underground or underneath objects lying on the ground and although they may be detected basking on the ground to maintain their temperatures, most activity takes place out of sight of ecologists as they are fossorial (Edgar et al., 2010). Such temporary emigration can be modeled as a partial hidden process between two states that describes the underlying movement pattern of individuals between the study area and an area outside of the study. The multievent framework, formulated by Pradel (2005), accommodates state uncertainty for capture–recapture data. However, no approach currently exists for modeling temporary emigration for removal studies at a single study site. Ignoring such ecological features of species results in a positively biased estimate of the number of individuals left behind after the end of removal projects.

Pollock (1982) introduced the robust design and Schwarz and Stobo (1997) generalized the model in Pollock (1982) by allowing individuals to enter and leave the population between secondary samples. The robust design accommodates multiple secondary sampling occasions within each primary sampling period, and enables the estimation of temporary emigration from the study site (Kendall et al., 1995; Kendall and Bjorkland, 2001). The population is assumed to be open only for temporary emigration between primary occasions and closed within each primary sampling period. Such emigration can be modeled as a first-order Markov process with different transition probabilities for individuals depending on which state they currently reside. Gould and Pollock (1997) first implemented a catch-effort removal model under the robust design where the population is assumed to be open for survival and/or recruitment between primary periods.

Motivated by real data and ecological features of amphibians and reptiles, we develop novel removal models that bring together both the robust design and a multievent structure for removal data using maximum likelihood inference. The objective is to provide an unbiased estimate of the number of animals remaining at the site at the end of the removal project. The article is structured as follows: in Section 2, we describe the robust design multievent removal model (RMER) framework and the integrated RMER (IRMER) for modeling multiple populations simultaneously. Because of the complexity of the proposed models, it is not possible to estimate all of the model parameters in some cases; this is known as parameter redundancy (Cole et al., 2010). Parameter redundancy of the proposed RMER is explored in Section 3. Section 4 presents simulations for the proposed removal models and explores the benefits of the new modeling approach. Within section 5, we present the results obtained from fitting IRMER models to juvenile and adult data of common lizards, Zootoca vivipara. The article concludes with a discussion in Section 6.

2 Robust Design Multi-Event Removal Model

2.1 Notation

Consider a removal experiment conducted at a site with a population of N individuals. N is the total number of animals that become exposed to sampling efforts at least once during the study. Individuals are permanently removed from the study area once captured during the study period. Suppose, there are two states in the model: individuals in state 1 are present and available for removal, while individuals in state 2 are absent from the study site and hence unavailable for capture. We assume the removal study is conducted within a robust design framework which comprises of primary periods formula and secondary sampling occasions formula within the ith primary period. The population is assumed to be open for temporary emigration only (i.e., no recruitment or permanent departures) between primary periods and we assume population closure between secondary samples within a primary session. The total number of sampling occasions is denoted by formula. The removal data that arise is an array with entry formula representing the number of individuals removed at the jth secondary occasion within the ith primary period. The total number of individuals removed is denoted by formula). We define:

  • formula: the number of animals that have not been removed by the end of the study, where formula.

  • formula: the initial state matrix, defined as a row vector, formula, where formula represents the proportion of individuals in state 1 and the complement of formula, formula, the proportion of individuals in state 2, at the start of the study.

  • formula: the state transition matrix, where formula and formula are transition probabilities from state 1 to state 2, and transition probabilities from state 2 to state 1, respectively, between the ith and formulath primary period, where formula.
  • formula: the state-event matrix, where events are “Removed” and “Not Removed” in the first and second column, respectively. States 1 and 2 are in the first and second row, respectively. formula is the probability that an individual is captured at the jth secondary sample within the ith primary period. The detection probability matrix (denoted as formula) is a diagonal matrix with elements equal to the first column vector of formula. Similarly, formula is the diagonal probability matrix of not detecting individuals.
     

Constant parameters are designated by the absence of a subscript from the corresponding time-specific parameters, for example, p denotes a constant capture probability over time.

2.2 The Likelihood Formulation

We adopt a multievent approach taking into account the robust design framework for the computation of the likelihood function. The probability of an individual being removed at the jth sample within the first primary period is, formula, where formula, formula is the formula identity matrix and formula is the column vector of two ones (and thereafter). The probability of an individual being removed at the jth sample within the ith primary period is,

where formula, formula and formula is the formula identity matrix.

The probability of not being removed by the end of the study is given by

The full product multinomial likelihood is given by

(1)

We note that the likelihood function (1) can be easily adapted to accommodate multiple species or different age/sex groups for a single species using an integrated population modeling approach (Chapter 12 Besbeas et al., 2002; McCrea and Morgan, 2014). Consider a removal experiment conducted on W species (or W categories for a single species) with the individual likelihoods defined as formula. Assuming the groups of individuals are removed independently, the full likelihood L can be written as the product of individual likelihoods, that is, formula.

The model belongs to the family of hidden Markov models (Pradel, 2005; Zucchini et al., 2016), so standard errors can be obtained from the Hessian. When some parameters lie on the boundary, non-parametric bootstrap can be used instead to compute standard errors and confidence intervals (Section 3.6 Zucchini et al., 2016).

2.3 Constraints

The model that assumes fully time-dependent parameters (formula, formula, formula, formula, formula) has formula parameters and the number of observations is K. This model is parameter redundant (Cole et al., 2010) because it has more parameters than the number of observed data points. Therefore, we cannot estimate all of the parameters individually without further constraints. The use of constraints has been suggested in the literature (e.g., Kendall et al., 1997; Williams et al., 2002). One natural way of enabling the estimation of parameters involves constraining them to be constant over time. However, capture probability is rarely constant at different sampling occasions and a closed removal model requires covariates to estimate time-varying capture probability (Williams et al., 2002). Kendall et al. (1997) recommend to constrain the last two transitions to be equal in order to provide identifiability of the last survival parameter in a time-varying temporary emigration model. We describe a list of constraints for our proposed models below:

  • Constraint related to detection probability: The time-dependent capture probability formula can be modeled using a logistic regression in terms of covariate formula at the jth secondary occasion within the ith primary period, that is, formula (Madsen and Thyregod, 2011). We label this constraint “Z”.

  • Constraint related to initial state parameter: formula, the initial state parameter formula can be constrained using this expression if we assume the population is initially allocated to two states according to the mean of the stationary distributions of the transition matrices across time which is formula, formula. We label this constraint “S” to represent that the stationary distribution is being assumed for formula.

  • Constraint related to transition probability: The superscript “t in the following constraints denotes fully time-dependent transition probabilities and the absence of the superscript indicates that constant transition parameters are assumed.

    • - formula. This constraint is equivalent to the random emigration/movement model for capture recapture data sampled with robust design as described in Kendall et al. (1997). It suggests that the probability of being in the unobservable state between the ith and formulath primary session is the same for individuals in and individuals outside the study area. In this case, we treat the transition probability formula as a free parameter to be estimated in the model, and the transition probability formula is reparameterized using the constraint, that is, formula. This constraint is labeled “R” and “formula” to denote constant and time-dependant random emigration, respectively. Furthermore, we extend this constraint in Web Appendix C to formula for formula where v is an additional parameter.

    • - formula. This is an “even flow” model (Kendall et al., 1997), where the probability of transitioning from the study area to an unobservable state is the same as the probability of moving back to the study area between the ith and formulath primary period. This is denoted by “formula” and “E” for time-varying and constant transition parameters, respectively.

    • - formula and formula; the penultimate and final transition probabilities are assumed to be equal, an approach commonly used to permit identifiability of the parameters in the first-order Markovian robust design model for capture recapture data (Kendall et al., 1997). We denote this constraint by “2” as the last two transitions are assumed to be equal. If constraint “2” is combined with constraint “formula” or “formula”, “formula” or “formula” are used.

    • - Suppose, we have W populations of interest indexed as formula. Time-dependent transition probabilities for population w, for example formula, can be modeled using formula, where formula is the logit of transition probabilities for a baseline population (numbered 1) and formula represents an additive effect of group w. The use of this constraint reduces the complexity of the model and enables better precision by sharing additional information across different groups. We denote this constraint by “formula” or “formula” to represent the additive effect for the time-varying transition probabilities. If we have more than two populations, we can use “formula” or “formula” to represent the additive effect for formula for population w. In addition, for the IRMER models the absence of the subscript in “formula” and “formula” (i.e., “formula” and “formula”) denotes that time-dependent transition probabilities are equal for both of the populations accounting for the same constraint.

We propose a model name that is composed of both the model structure and the different combinations of constraints for all models that we consider. We employ the structure of “MODEL-XYZ” in Tables 1, 2, and 3, where “X”, “Y” and “Z” respectively represent the constraint used for the initial state parameter (e.g., “S”), the transition probabilities (e.g., “formula”) and whether the capture probability is constant over time (“C”) or time-dependent in term of covariates (“Z”). In addition, “N” suggests that no constraint has been considered. “MODEL” is either “R” or “IR” for the RMER and IRMER models, respectively. For the MER models, we ignore the “MODEL-” and only use “XYZ” to denote the model. For the IRMER models, we use the population number as the subscript in “S” to denote which population is subject to the constraint “S”, for example, “formula” indicates that both initial state distributions for population 1 and 2 are assumed to be stationary. For clarity, formulaC denotes the IRMER model with no constraint for the initial state parameter, constraint “formula” for the time-varying transition probabilities and a constant capture probability.

Table 1

Parameter redundancy results of RMER and MER models, where formula and formula are constant over time. The parameter redundancy results hold for K sampling occasions, where there are two secondary samples for each primary period and formula. h is the number of parameters in the model. d is the deficiency of the model. “t” denotes time-dependence, “c” denotes a constant parameter, and “formula” denotes additive effect in terms of covariates. The PRS column represents the parameter redundancy status, where FR indicates that all parameters in the model are theoretically estimable, PR indicates that the model is parameter redundant, NR indicates that the model is full rank but near redundant for the scenarios we considered. formula indicates that the estimable combination of parameters are formula, formula and formula. formula indicates that the estimable combinations of parameters are formula and formula. formula indicates that the estimable combination of parameters is formula

RMERMER
ModelhModel codedPRSModel codedPRS
formula4R-NNC0NRNNC1PR,formula
formula5R-NNZ0NRNNZ0NR
formula3R-SNC0FRSNC0NR
formula3R-NRC0FRNRC1PR,formula
formula3R-NEC0FRNEC0NR
formula4R-SNZ0FRSNZ0NR
formula4R-NRZ0FRNRZ0NR
formula4R-NEZ0FRNEZ0NR
formula2R-SRC0FRSRC1PR,formula
formula2R-SEC0FRSEC0NR
formula3R-SRZ0FRSRZ0NR
formula3R-SEZ0FRSEZ0NR
RMERMER
ModelhModel codedPRSModel codedPRS
formula4R-NNC0NRNNC1PR,formula
formula5R-NNZ0NRNNZ0NR
formula3R-SNC0FRSNC0NR
formula3R-NRC0FRNRC1PR,formula
formula3R-NEC0FRNEC0NR
formula4R-SNZ0FRSNZ0NR
formula4R-NRZ0FRNRZ0NR
formula4R-NEZ0FRNEZ0NR
formula2R-SRC0FRSRC1PR,formula
formula2R-SEC0FRSEC0NR
formula3R-SRZ0FRSRZ0NR
formula3R-SEZ0FRSEZ0NR
Table 1

Parameter redundancy results of RMER and MER models, where formula and formula are constant over time. The parameter redundancy results hold for K sampling occasions, where there are two secondary samples for each primary period and formula. h is the number of parameters in the model. d is the deficiency of the model. “t” denotes time-dependence, “c” denotes a constant parameter, and “formula” denotes additive effect in terms of covariates. The PRS column represents the parameter redundancy status, where FR indicates that all parameters in the model are theoretically estimable, PR indicates that the model is parameter redundant, NR indicates that the model is full rank but near redundant for the scenarios we considered. formula indicates that the estimable combination of parameters are formula, formula and formula. formula indicates that the estimable combinations of parameters are formula and formula. formula indicates that the estimable combination of parameters is formula

RMERMER
ModelhModel codedPRSModel codedPRS
formula4R-NNC0NRNNC1PR,formula
formula5R-NNZ0NRNNZ0NR
formula3R-SNC0FRSNC0NR
formula3R-NRC0FRNRC1PR,formula
formula3R-NEC0FRNEC0NR
formula4R-SNZ0FRSNZ0NR
formula4R-NRZ0FRNRZ0NR
formula4R-NEZ0FRNEZ0NR
formula2R-SRC0FRSRC1PR,formula
formula2R-SEC0FRSEC0NR
formula3R-SRZ0FRSRZ0NR
formula3R-SEZ0FRSEZ0NR
RMERMER
ModelhModel codedPRSModel codedPRS
formula4R-NNC0NRNNC1PR,formula
formula5R-NNZ0NRNNZ0NR
formula3R-SNC0FRSNC0NR
formula3R-NRC0FRNRC1PR,formula
formula3R-NEC0FRNEC0NR
formula4R-SNZ0FRSNZ0NR
formula4R-NRZ0FRNRZ0NR
formula4R-NEZ0FRNEZ0NR
formula2R-SRC0FRSRC1PR,formula
formula2R-SEC0FRSEC0NR
formula3R-SRZ0FRSRZ0NR
formula3R-SEZ0FRSEZ0NR
Table 2

Parameter redundancy results for RMER models, where formula and formula are time-varying over time. The parameter redundancy results hold for an even number of total sampling occasions K, where there are two secondary samples for each primary period and formula. h is the number of parameters in the model. d is the deficiency of the model. “t” denotes time-dependence, “c” denotes a constant parameter, and “formula” denotes additive effect in terms of covariates. “formula” denotes additive effect for transition probabilities. PRS represents the parameter redundancy status, where FR indicates that all parameters in the model are theoretically estimable, PR indicates that the model is parameter redundant, NR indicates that the model is full rank but near redundant for the scenarios we considered. All results hold for formula occasions, except that * indicates the results hold for formula occasions and ** indicates the results hold for formula occasions

Model codeModelhdPRS
RMER model
formulaCformulaKformulaPR
formulaZformulaformulaformulaPR
formulaCformulaformulaformulaPR
formulaCformulaformula0NR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR *
formulaZformulaKformulaPR
formulaZformulaformula0NR
formulaZformulaformula0NR
formulaZformulaformulaformulaPR *
formulaCformulaformula0FR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR **
formula Cformulaformula0NR
formula Cformulaformula0NR
formulaZformulaformula0FR
formulaZformulaformula0NR
formula ZformulaformulaformulaPR **
formula Zformulaformula0NR
formula Zformulaformula0NR
formula Cformulaformula0FR
formula Zformulaformula0FR
IRMER model
formulaCformulaformula0NR
formulaCformulaformula0FR
formulaformulaformula0FR
formulaformulaformula0FR
Model codeModelhdPRS
RMER model
formulaCformulaKformulaPR
formulaZformulaformulaformulaPR
formulaCformulaformulaformulaPR
formulaCformulaformula0NR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR *
formulaZformulaKformulaPR
formulaZformulaformula0NR
formulaZformulaformula0NR
formulaZformulaformulaformulaPR *
formulaCformulaformula0FR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR **
formula Cformulaformula0NR
formula Cformulaformula0NR
formulaZformulaformula0FR
formulaZformulaformula0NR
formula ZformulaformulaformulaPR **
formula Zformulaformula0NR
formula Zformulaformula0NR
formula Cformulaformula0FR
formula Zformulaformula0FR
IRMER model
formulaCformulaformula0NR
formulaCformulaformula0FR
formulaformulaformula0FR
formulaformulaformula0FR
Table 2

Parameter redundancy results for RMER models, where formula and formula are time-varying over time. The parameter redundancy results hold for an even number of total sampling occasions K, where there are two secondary samples for each primary period and formula. h is the number of parameters in the model. d is the deficiency of the model. “t” denotes time-dependence, “c” denotes a constant parameter, and “formula” denotes additive effect in terms of covariates. “formula” denotes additive effect for transition probabilities. PRS represents the parameter redundancy status, where FR indicates that all parameters in the model are theoretically estimable, PR indicates that the model is parameter redundant, NR indicates that the model is full rank but near redundant for the scenarios we considered. All results hold for formula occasions, except that * indicates the results hold for formula occasions and ** indicates the results hold for formula occasions

Model codeModelhdPRS
RMER model
formulaCformulaKformulaPR
formulaZformulaformulaformulaPR
formulaCformulaformulaformulaPR
formulaCformulaformula0NR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR *
formulaZformulaKformulaPR
formulaZformulaformula0NR
formulaZformulaformula0NR
formulaZformulaformulaformulaPR *
formulaCformulaformula0FR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR **
formula Cformulaformula0NR
formula Cformulaformula0NR
formulaZformulaformula0FR
formulaZformulaformula0NR
formula ZformulaformulaformulaPR **
formula Zformulaformula0NR
formula Zformulaformula0NR
formula Cformulaformula0FR
formula Zformulaformula0FR
IRMER model
formulaCformulaformula0NR
formulaCformulaformula0FR
formulaformulaformula0FR
formulaformulaformula0FR
Model codeModelhdPRS
RMER model
formulaCformulaKformulaPR
formulaZformulaformulaformulaPR
formulaCformulaformulaformulaPR
formulaCformulaformula0NR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR *
formulaZformulaKformulaPR
formulaZformulaformula0NR
formulaZformulaformula0NR
formulaZformulaformulaformulaPR *
formulaCformulaformula0FR
formulaCformulaformula0NR
formulaCformulaformulaformulaPR **
formula Cformulaformula0NR
formula Cformulaformula0NR
formulaZformulaformula0FR
formulaZformulaformula0NR
formula ZformulaformulaformulaPR **
formula Zformulaformula0NR
formula Zformulaformula0NR
formula Cformulaformula0FR
formula Zformulaformula0FR
IRMER model
formulaCformulaformula0NR
formulaCformulaformula0FR
formulaformulaformula0FR
formulaformulaformula0FR
Table 3

List of models fitted to common lizard data. h is the number of parameters in the model, ML is the value of the maximized loglikelihood, formulaAIC are computed as the difference in the AIC value between the current and the best model, where AIC is the Akaike information criterion calculated as formula. Only the 10 models with lowest AIC are shown. “t” denotes time-dependence, “c” denotes a constant parameter, and “formula” denotes additive effect in terms of covariates

Model codeModelCovariatehMLformulaAIC
IRMER model
formulaformulaPrecipitation51−259.530
formulaformulaPrecipitation52−259.852.64
formulaformulaPrecipitation50−261.942.82
formulaformulaAverage humidity52−262.047.02
formulaformula-50−269.8617.46
formulaformulaMean air temperature51−268.9818.90
formulaformulaMax air temperature52−268.0919.12
formulaformula-51−269.7320.39
formulaformulaMax air temperature51−270.7122.36
formulaformulaMin air temperature51−271.2123.36
GRM model
G–ZformulaPrecipitation4−387.83161.40
G–Cformula-3−390.05163.84
Model codeModelCovariatehMLformulaAIC
IRMER model
formulaformulaPrecipitation51−259.530
formulaformulaPrecipitation52−259.852.64
formulaformulaPrecipitation50−261.942.82
formulaformulaAverage humidity52−262.047.02
formulaformula-50−269.8617.46
formulaformulaMean air temperature51−268.9818.90
formulaformulaMax air temperature52−268.0919.12
formulaformula-51−269.7320.39
formulaformulaMax air temperature51−270.7122.36
formulaformulaMin air temperature51−271.2123.36
GRM model
G–ZformulaPrecipitation4−387.83161.40
G–Cformula-3−390.05163.84
Table 3

List of models fitted to common lizard data. h is the number of parameters in the model, ML is the value of the maximized loglikelihood, formulaAIC are computed as the difference in the AIC value between the current and the best model, where AIC is the Akaike information criterion calculated as formula. Only the 10 models with lowest AIC are shown. “t” denotes time-dependence, “c” denotes a constant parameter, and “formula” denotes additive effect in terms of covariates

Model codeModelCovariatehMLformulaAIC
IRMER model
formulaformulaPrecipitation51−259.530
formulaformulaPrecipitation52−259.852.64
formulaformulaPrecipitation50−261.942.82
formulaformulaAverage humidity52−262.047.02
formulaformula-50−269.8617.46
formulaformulaMean air temperature51−268.9818.90
formulaformulaMax air temperature52−268.0919.12
formulaformula-51−269.7320.39
formulaformulaMax air temperature51−270.7122.36
formulaformulaMin air temperature51−271.2123.36
GRM model
G–ZformulaPrecipitation4−387.83161.40
G–Cformula-3−390.05163.84
Model codeModelCovariatehMLformulaAIC
IRMER model
formulaformulaPrecipitation51−259.530
formulaformulaPrecipitation52−259.852.64
formulaformulaPrecipitation50−261.942.82
formulaformulaAverage humidity52−262.047.02
formulaformula-50−269.8617.46
formulaformulaMean air temperature51−268.9818.90
formulaformulaMax air temperature52−268.0919.12
formulaformula-51−269.7320.39
formulaformulaMax air temperature51−270.7122.36
formulaformulaMin air temperature51−271.2123.36
GRM model
G–ZformulaPrecipitation4−387.83161.40
G–Cformula-3−390.05163.84

3 Parameter Redundancy

A model is parameter redundant if it is impossible to estimate all the parameters individually, because the model could be reparameterised in terms of a small number of parameters (Catchpole and Morgan, 1997). The techniques for detecting parameter redundancy have been developed for a wide range of applications, (Catchpole and Morgan, 1997; Cole et al., 2010; Cole and McCrea, 2016).

We employ the methods of Cole et al. (2010) to assess whether our constrained RMER and IRMER models are parameter redundant by forming a derivative matrix formula, where formula is a vector of parameter combinations that represents the structure of the model for a set of parameters formula. Once formula is formed, we can determine whether or not the model is parameter redundant by calculating the rank of D, r. The deficiency of the model, d, is the number of parameters, h, minus r. If formula the model is parameter redundant, otherwise if formula the model is not parameter redundant, and termed full rank. We also show which combinations of parameters are estimable if the model is parameter redundant by solving a system of first-order partial differentiation equations (Cole et al., 2010). This method is explained in full in Web Appendix A, which includes an illustrative example for the model R-NNC. Software Maple has been used for the symbolic algebra computations.

We obtain general parameter redundancy results for our removal models with an even number of total sampling occasions with two secondary samples within each primary period in Tables 1 and 2. In addition to numbering all the models with model codes, we also denote different models by their constituent parameters. We show the estimable combinations of parameters for parameter redundant models in Table 1 only in the article and more results for Table 2 are available in Web Appendix A.

Catchpole et al. (2001) found that a full rank model can perform badly in practice, because the model is close to a nested parameter redundant model; this is known as near redundancy. In a parameter redundant model the expected information matrix will be singular (Rothenberg, 1971). As a result, the expected information matrix will have at least one zero eigenvalue. In a near redundant model, the smallest eigenvalue will be close to zero rather than exactly zero (Catchpole et al., 2001). Hence, even some of the full rank models in Tables 1 and 2 can give biased results regardless of how large the sample sizes are (see Web Appendix A). We list the parameter redundancy status for each model in Tables 1 and 2 to indicate whether the model is parameter redundant, full rank or near redundant.

Considering the results in Table 1, the specified RMER models are full rank for all cases. As a result, the robust design improves the estimation of the models in general, as the secondary samples within each primary period provide an additional source of information about capture probability. We also observe that the MER models are either parameter redundant or near redundant. Table 2 shows that all models with both fully time-dependent transition probabilities formula and formula (i.e., without any constraint for formula and formula) are parameter redundant. formulaC becomes full rank with constraint “formula”. Furthermore, the issue of near redundancy for formulaC is overcome if constraint “S” is used for the initial state parameter.

We also investigate the parameter redundancy of IRMER models using two populations (numbered 1 and 2) with results shown in the last rows of Table 2. All of these IRMER models are determined to be full rank. However, we find formulaC is near redundant. Hence, we conclude that we need to apply at least constraints “S” and “formula” in order to avoid parameter redundant and near redundant models.

4 Simulation Results

The aim of these simulations is to examine the precision of maximum likelihood estimators for RMER, MER, and IRMER within the likely range of ecological applications of the model. Three simulation settings are investigated, where RMER and MER models under Setting 4.1 have constant transition probabilities, RMER models under Setting 4.2 have time-varing transition probabilities and Setting 4.3 presents results obtained from IRMER models. For Setting 4.1 and 4.2, 500 simulations are conducted for a study with formula individuals, formula or formula, with formula or formula primary periods and 2 secondary sampling occasions within each primary period (i.e., formula). For Setting 4.3, we conduct simulations for IRMER, where 500 replicates are simulated for a removal study with two populations (numbered 1 and 2) where the population sizes are formula and formula. We consider eight scenarios because the performance of the models depends on the relationship between formula and formula and on capture probability. We only show the simulation results under Scenarios 1 and 2 in the article. More simulations are available in Web Appendix B. The true values of parameters used in the simulations are presented in the subsequent sections.

  • Scenario 1: low capture probability and individuals tend to stay offsite,

  • Scenario 2: low capture probability and individuals tend to stay onsite.

4.1 Setting 4.1 RMER/MER with Constant Transition Probabilities

We only show results from R-NNC, R-SNC, R-NRC, R-SRC, and SRC in Table 1. We are interested in the precision of the estimators for the constraints used/not used for the initial state parameter and the transition probability for the RMER models. Furthermore, we demonstrate the distribution of the estimates for R-NNC, which is classified as a near redundant model in Table 1. In addition, we show the results for the SRC model where both “S” and “R” are taken into account but without the robust design for comparison.

The true value of the constant capture probability is 0.3 under both Scenarios 1 and 2. In addition, we use formula, formula in Scenario 1, when individuals tend to move to the unobservable state, while in Scenario 2, we define formula, formula so that individuals tend to move to the observable state. The true value of the initial state parameter formula is defined as the first element of the stationary distribution of the corresponding transition matrix.

As shown in Figure 1, it is clear that estimation of population size N is reliable for all models when formula, although long positive tails are recognized under Scenario 1 when individuals tend to emigrate offsite and capturing them becomes impossible. When formula, longer positive tails are observed and the estimates of population size are negatively biased for models R-NNC and R-NRC under Scenario 1. The results for detection probability p show that the use of the robust design considerably improves the performance in terms of bias, compared with the SRC model that exhibits large bias for p in all cases. The bias in estimating formula is modest for R-SRC with both constraints “S” and “R” for all cases even when formula. In contrast, estimation of formula for SRC without the robust design is not reliable for any cases. In addition, R-NNC yields biased estimates for formula due to near redundancy. Overall, we conclude that R-SRC performs the best as shown in Figure 1.

Estimated population size N (A), capture probability p (B) and transition probability  (C) for simulations with  and  sampling occasions under simulation Setting 4.1. The black horizontal lines are the true values used for simulation.
Figure 1

Estimated population size N (A), capture probability p (B) and transition probability formula (C) for simulations with formula and formula sampling occasions under simulation Setting 4.1. The black horizontal lines are the true values used for simulation.

4.2 Setting 4.2 RMER with Time-Varying Transition Probabilities

RMER models with constant transition probabilities may not be realistic for real data as individuals may tend to stay in one state at times. Here we investigate the full-rank models formulaC, formulaC, formulaC in Table 2 by simulation.

The true value of constant capture probability p is 0.3 for both Scenarios 1 and 2. In addition, the true transition probabilities formula for simulating the data for RMER models under Scenario 1 when formula are (0.8, 0.7, 0.8, 0.3, 0.6, 0.7, 0.8, 0.6, 0.6) where individuals tend to stay in the area outside the study for the majority of times which is more realistic for real data. Furthermore, the vector of true formula is defined as (0.4, 0.4, 0.8, 0.4, 0.4, 0.8, 0.4, 0.4, 0.4) under Scenario 2. For a study with formula occasions, we specify the true formula as (0.7, 0.2, 0.7, 0.7) for Scenario 1, and (0.3, 0.8, 0.3, 0.3) for Scenario 2. We only display the estimates of formula for the first two and the last two transitions in the article. The value of the initial state parameter formula is set to be the mean of the first element of the stationary distributions of transition matrices across time.

The bias in the estimation of N is negative for formulaC. The estimates of N are slightly biased low for formulaC under Scenario 1 and unbiased for other Scenarios (See Figure 2 and Web Appendix B). The results of the estimated transition probabilities in Figure 2 suggest unbiased estimates obtained from formulaC and negative bias from formulaC. This is expected as R-SRC with constant formula performs better than R-NRC under Setting 4.1. The performance of formulaC with time-varying formula is reliable for real data in practice. None of the remaining RMER models in Table 2 yields unbiased estimates of time-dependent transition probabilities apart from formulaC. We observe biased estimates for the RMER models with constraint “formula” due to near redundancy (see more simulation results in Web Appendix B). Therefore, we conclude that good performance can be obtained for the RMER models with time-varying formula with at least the combination of constraints “S” and “formula”.

Simulation results of Setting 4.2 with  and  sampling occasions. (A) Estimates of population size N under Scenarios 1 and 2, (B) estimates of transition probabilities for the first two and last two transitions under Scenario 1. The black horizontal lines are the true values used for simulation.
Figure 2

Simulation results of Setting 4.2 with formula and formula sampling occasions. (A) Estimates of population size N under Scenarios 1 and 2, (B) estimates of transition probabilities for the first two and last two transitions under Scenario 1. The black horizontal lines are the true values used for simulation.

4.3 Setting 4.3 IRMER with Time-Varying Transition Probabilities

Under this setting, we investigate the IRMER models in Table 2. We only show the results obtained for population size for formulaC and formulaC in the article. The true values of parameters for population 1 are the same as those under Setting 4.2. We define the true value of the additive effect formula for population 2 to be formula when constraint “formula” is used.

The estimates of population size for two populations and the additive effect formula obtained from IRMER modeling are displayed in Figure 3. Estimation is unbiased for formula sampling occasions, while formulaC slightly underestimates population sizes under Scenario 1 when formula. Moreover, estimation of formula is unbiased under Scenario 1 for both formula and formula. However, when we have a small number of sampling occasions (formula) under Scenario 2, formula is slightly underestimated and estimation becomes unbiased for formula.

Estimated population sizes N (A), M (B) and additive effect  on transition probabilities (C) for simulations under Setting 4.3 with  and  sampling occasions. The black horizontal lines are the true values used for simulating the data.
Figure 3

Estimated population sizes N (A), M (B) and additive effect formula on transition probabilities (C) for simulations under Setting 4.3 with formula and formula sampling occasions. The black horizontal lines are the true values used for simulating the data.

We also show that the classic geometric removal model (denoted as GRM) overestimates the number of animals remaining at the study area at the end of the study and underestimate capture probability for the simulated data exhibiting temporary emigration with the robust design sampling protocol (Web Appendix B).

5 Application to Common Lizards

Removal of common lizards, Zootoca vivipara, was conducted daily in both the morning and afternoon from the September 13, 2010 to October 29, 2010. There were 94 sampling occasions, with 13 missed visits. 334 common lizards were captured and permanently removed from the study site. The removals consisted of 274 juvenile and 60 adult individuals. Eight covariates: mean/maximum/minimum air temperature, precipitation, average/maximum/minimum humidity and season stage, were recorded daily.

Migration and dispersal of reptiles and amphibians are generally limited during daytime (Edgar et al., 2010), therefore we used a robust design approach for our analysis, with days corresponding to primary periods and the repeated samples within days being the secondary sampling occasions. Hence, there are formula primary occasions and two secondary samples within each primary period. Given the nature of the available data, we assume juveniles and adults are sampled independently and hypothesize that their transition probabilities may be related. This is ecologically sensible since the dynamics exhibited by the population are likely to be driven by external influences. We define the global likelihood to be the product of individual likelihoods, that is, formula where formula and formula are the likelihood for juvenile and adult populations respectively, and are both of the form described in equation (1).

We also considered incorporating the climatic covariates using a logistic regression to account for the time variation exhibited within the transition and capture probabilities. The likelihood is maximized using the optimizer optim in R (R Core Team, 2017). The results from performing model selection on the integrated data are displayed in Table 3, where the 10 IRMER models with the lowest Akaike information criterion (AIC) values are shown.

All of the top 10 IRMER models ranked by AIC include fully time-dependent transition probabilities formula for juvenile individuals. As there are many boundary estimates of transition probabilities, as shown in Web Appendix E, we computed standard errors and confidence intervals empirically using non-parametric bootstrap (500 resamples). The procedure for the non-parametric bootstrap is described in Web Appendix D. The estimate of the number of individuals not captured is 57.74 (SE 146.02, 95% bootstrap CI 31.75, 473.33) for juveniles and 3.36 (SE 49.76, 95% bootstrap CI 0.02, 180.11) for adults. The poor precision of population sizes is likely due to the small sample sizes, low capture probability and low availability of individuals. The time-varying capture probability is estimated to be 0.18 on average where the estimated intercept and slope of the logit are 1.98 (SE 0.60, 95% bootstrap CI −3.60, −1.46) and 1.61 (SE 0.37, 95% bootstrap CI 0.98, 2.32), respectively. The estimated additive effect of transition probabilities for adults is −0.97 (SE 1.54, 95% bootstrap CI −1.84, 2.75). The mean of formula are 0.70 and 0.57 for juveniles and adults, respectively, so the common lizards, we analyzed tend to stay in an unobservable state on average. The results for the transition probabilities are shown in Web Appendix E. Standard errors for the formula are large for some primary sessions. The complexity of the model can be reduced by defining the fully time-dependent formula to follow parametric distributions such as a Beta distribution. However, no improvement in relative fit to the common lizard data is observed.

A visual assessment of observed and expected numbers provided no evidence of systematic lack of fit of the selected model (see Web Appendix E). Juveniles exhibit more powers of dispersal than adults as they can rapidly colonize new habitats which often become available adjacent to already occupied sites (Edgar et al., 2010). These characteristics are supported by the results from our top model, suggesting that the formula of juveniles are higher than for adults. None of the available covariates collected during the study adequately accounted for the time-dependent transition probabilities. However, the logistic regression of time-varying capture probabilities in terms of precipitation is supported by our top ranked model.

We also considered the GRM model in Table 3, where G–C and G–Z represent the geometric removal model with constant and time-varying capture probabilities in terms of covariates respectively, where the same capture probability at each sampling occasion for both populations is assumed. The estimate of the population size obtained by the G–Z model is 161.12 (SE 52.00, 95% bootstrap CI 51.91, 230.41) for juveniles and 34.90 (SE 18.41, 95% bootstrap CI 0.01, 71.65) for adults. The GRM models give larger estimates for the sizes of both populations.

6 Discussion

Removal models have considerable potential to inform the design and execution of removals of protected/invasive species, but need to take into account temporary emigration to reduce the risk of biased estimates of the number of animals not captured. We have extended the classic removal model to accommodate a robust design sampling strategy and multi-event framework with one unobservable state. Our work is motivated by real data from translocation projects and it could also be adopted for removals of invasive species. Simulations and theoretical parameter redundancy assessment have demonstrated that the RMER models perform better than MER models. Our approaches yield unbiased estimates of the number of individuals in the populations residing in the sampling area when the sample size is large enough.

The adequate design of sampling protocols is fundamental at the data analysis stage. Mitigating a problem with study design is always highly recommended as the design of a study will govern how data can be analyzed. In this article, we have demonstrated that the use of the robust design for removal data can overcome issues of parameter redundancy and enable the estimation of transition probabilities between observable and unobservable states at a single study site. In addition, RMER models result in estimators of population size and capture probability which have better properties than MER under the standard sampling protocol. Therefore, we would like to raise the awareness of good study design for removal experiments as in our experience only a small number of removal studies have repeated samplings conducted within a day; however if sampling strategies were simply altered to allow for multiple secondary samples, uncertainty in estimates of detection and transition probabilities would reduce considerably.

The general RMER model with fully time-dependent parameters is parameter redundant. Although the assumption of constant parameters across time is the most straightforward way of constraining models in order to enable estimation, using simulation we have demonstrated that the best performing models with least bias incorporate at least two constraints ’ constraint “formula”, which denotes random emigration, and constraint “S,” which denotes that the initial state parameter formula is constrained as the first element of the mean of the stationary distributions of the transition matrices across time.

Our proposed RMER model is general and can be extended to the IRMER modeling approach which permits the analysis of multiple data sources, exploiting the relationship between parameters expected between related populations. Furthermore, we have applied the IRMER model to two age groups (adults and juveniles) of common lizard data and the results align with our understanding of the natural history of this species.

We define the population size N as the total number of individuals that could possibly be captured across the study. However, if some individuals are unlikely to enter the study area and are unavailable for capture over the whole duration of the study, then they cannot be included in the estimate. Practical aspects of study design such as an efficient distribution of traps to make sure the arrangement of traps exposes as many individuals as possible, should be considered, to overcome this issue as much as is feasible.

As permanent departure from the population is possible during the study, it may be useful to model mortality and temporary emigration simultaneously. We have considered a constant survival probability as an extra parameter in our proposed model in Web Appendix F. We find that the formulaC model with a survival probability is full-rank but parameters are not identifiable for most of the simulation scenarios due to near redundancy. Therefore, we assume that all individuals survive when analyzing the real data as translocation studies are usually conducted over a relatively short period of time (up to months). Improved estimates of survival probability can be obtained from removal data by collecting ancillary information during removal sampling, for example, concurrent capture–recapture sampling as suggested in Gould and Pollock (1997) or a few capture–recapture sampling occasions prior to removal sampling which is a design we are currently investigating.

Spatial information has been widely used in the capture recapture literature (Royle et al., 2013), however, there is no spatial information on sampling available for the real data. Translocation projects are generally poorly documented globally. In the UK, less than 10% of submitted reports contain detailed population monitoring data and one-half of the cases on file lack any type of report (Germano et al., 2015). In order to optimize the success of translocation studies, we should not only design the study properly, but also record any informative component which may help evaluate the sampling methodologies.

Acknowledgements

We thank Byron Morgan, the editor, the associate editor, and two reviewers for their insightful comments on a previous version of the manuscript. Ming Zhou is funded by the Graduate Teaching Assistantship, University of Kent. Rachel S. McCrea was funded by Natural Environment Research Council fellowship grant NE/J0.18473/1.

References

Besbeas
,
P.
,
Freeman
,
S. N.
,
Morgan
,
B. J. T.
, and
Catchpole
,
E. A.
(
2002
).
Integrating mark-recapture-recovery and census data to estimate animal abundance and demographic parameters
.
Biometrics
 
58
,
540
547
.

Catchpole
,
E.
and
Morgan
,
B. J. T.
(
1997
).
Detecting parameter redundancy
.
Biometrics
 
84
,
187
196
.

Catchpole
,
E. A.
,
Kgosi
,
P. M.
, and
Morgan
,
B. J. T.
(
2001
).
On the near-singularity of models for animal recovery data
.
Biometrics
 
57
,
720
726
.

Chandler
,
R. B.
,
Royle
,
J. A.
, and
King
,
D. J.
(
2011
).
Inference about density and temporary emigration in unmarked populations
.
Ecology
 
92
,
1429
1435
.

Cole
,
D.
and Mc
Crea
,
R. S.
(
2016
).
Parameter redundancy in discrete state-space and integrated models
.
Biometrical Journal
 
58
,
1071
1090
.

Cole
,
D. J.
,
Morgan
,
B. J. T.
, and
Titterington
,
D. M.
(
2010
).
The parametric structure of models
.
Mathematical Biosciences
 
228
,
16
30
.

Davis
,
A. J.
,
Hooten
,
M. B.
,
Miller
,
R. S.
,
Farnsworth
,
M. L.
,
Lewis
,
J.
,
Moxcey
,
M.
, et al. (
2016
).
Inferring invasive species abundance using removal data from management actions
.
Ecological Applications
 
26
,
2339
2346
.

Dorazio
,
R. M.
and
Howard
,
L. J.
(
2005
).
Improving removal-based estimates of abundance by sampling a population of spatially distinct subpopulations
.
Biometrics
 
61
,
1093
1101
.

Edgar
,
P.
,
Foster
,
J.
, and
Baker
,
J.
(
2010
).
Reptile Habitat Management Handbook
.
Bournemouth
:
Amphibian and Reptile Conservation
.

Germano
,
J. M.
,
Field
,
K. J.
,
Griffiths
,
R. A.
,
Clulow
,
S.
,
Foster
,
J.
,
Harding
,
G.
, et al. (
2015
).
Mitigation-driven translocations: Are we moving wildlife in the right direction?
 
Frontiers in Ecology and the Environment
 
13
,
100
105
.

Gould
,
W. R.
and
Pollock
,
K. H.
(
1997
).
Catch-effort estimation of population parameters under the robust design
.
Biometrics
 
53
,
207
216
.

Griffiths
,
R.
,
Foster
,
J.
,
Wilkinson
,
J.
, and
Sewell
,
D.
(
2015
).
Science, statistics and surveys: A herpetological perspective
.
Journal of Applied Ecology
 
52
,
1413
1417
.

Hilborn
,
R.
and
Walters
,
C. J.
(
1992
).
Quantitative Fisheries Stock Assessment: Choice, Dynamics, and Uncertainty
.
New York
:
Chapman and Hall
.

Kendall
,
L. W.
and
Bjorkland
,
R.
(
2001
).
Using open robust design models to estimate temporary emigration from capture recapture data
.
Biometrics
 
57
,
1113
1122
.

Kendall
,
L. W.
,
Nichols
,
J. D.
, and
Hines
,
J. E.
(
1997
).
Estimating temporary emigration using capture–recapture data with pollock's robust design
.
Ecology
 
78
,
563
578
.

Kendall
,
L. W.
,
Pollock
,
K. H.
, and
Brownie
,
C.
(
1995
).
A likelihood-based approach to capture–recapture estimation of demographic parameters under the robust design
.
Biometrics
 
51
,
293
308
.

Linnell
,
J. D. C.
,
Aanes
,
R.
,
Swenson
,
J. E.
,
Odden
,
J.
, and
Smith
,
M. E.
(
1997
).
Translocation of carnivores as a method for managing problem animals: A review
.
Conservation Biology
 
6
,
1245
1257
.

Madsen
,
H.
and
Thyregod
,
P.
(
2011
).
Introduction to General and Generalized Linear Models
.
London, UK
:
Chapman and Hall
.

Matechou
,
E.
, Mc
Crea
,
R. S.
,
Morgan
,
B. J. T.
,
Nash
,
D. J.
, and
Griffiths
,
R. A.
(
2016
).
Open models for removal data
.
The Annals of Applied Statistics
 
10
,
1572
1589
.

Mc Crea
,
R. S.
and
Morgan
,
B. J. T.
(
2014
).
Analysis of capture–Recapture Data
.
London, UK
:
Chapman and Hall
.

Moran
,
P. A. P.
(
1951
).
A mathematical theory of animal trapping
.
Biometrika
 
38
,
307
311
.

Otis
,
D. L.
,
Burnham
,
K. P.
,
White
,
G. C.
, and
Anderson
,
D. R.
(
1978
).
Statistical inference from capture data on closed animal populations
.
Wildlife Monograph
 
62
.

Pollock
,
K.
(
1991
).
Modelling capture, recapture, and removal statistics for estimation of demographic parameters for fish and wildlife populations: Past, present, and future
.
Journal of the American Statistical Association
 
86
,
225
238
.

Pollock
,
K. H.
(
1982
).
A capture–recapture design robust to unequal probability of capture
.
Journal of Wildlife Management
 
46
,
752
757
.

Pradel
,
R.
(
2005
).
Multievent: An extension of multistate capture–recapture models to uncertain states
.
Biometrics
 
61
,
442
447
.

R Core Team
(
2017
).
R: A Language and Environment for Statistical Computing
.
Vienna, Austria
:
R Foundation for Statistical Computing
.

Rothenberg
,
T. J.
(
1971
).
Identification in parametric models
.
Econometrica
 
39
,
577
591
.

Royle
,
A. J.
,
Chandler
,
R. B.
,
Sollmann
,
R.
, and
Gardner
,
B.
(
2013
).
Spatial Capture–Recapture
.
Waltham, MA
:
Academic Press
.

Schwarz
,
C. J.
and
Stobo
,
W. T.
(
1997
).
Estimating temporary migration using the robust design
.
Biometrics
 
53
,
178
194
.

Williams
,
B. K.
,
Nichols
,
J. D.
, and
Conroy
,
M. J.
(
2002
).
Analysis and Management of Animal Populations
.
New York, USA
:
Academic Press
.

Zippin
,
C.
(
1956
).
An evaluation of the removal method of estimating animal populations
.
Biometrics
 
12
,
163
169
.

Zucchini
,
W.
,
MacDonald
,
I. L.
, and
Langrock
,
R.
(
2016
).
Hidden Markov Models for Time Series: An introduction using R
, second edition.
Boca Raton
:
CRC Press
.

This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Supplementary data