Lisa Chong, Tobias K Mildenberger, Merrill B Rudd, Marc H Taylor, Jason M Cope, Trevor A Branch, Matthias Wolff, Moritz Stäbler, Performance evaluation of data-limited, length-based stock assessment methods, ICES Journal of Marine Science, Volume 77, Issue 1, January-February 2020, Pages 97–108, https://doi.org/10.1093/icesjms/fsz212
Abstract
Performance evaluation of data-limited, length-based methods is instrumental in determining and quantifying their accuracy under various scenarios and in providing guidance about model applicability and limitations. We conducted a simulation–estimation analysis to compare the performance of four length-based stock assessment methods: length-based Thompson and Bell (TB), length-based spawning potential ratio (LBSPR), length-based integrated mixed effects (LIME), and length-based risk analysis (LBRA), under varying life history, exploitation status, and recruitment error scenarios. Across all scenarios, TB and LBSPR were the most consistent and accurate assessment methods. LBRA is highly biased, but precautionary, and LIME is more suitable for assessments with time-series longer than a year. All methods have difficulties when assessing short-lived species. The methods are less accurate in estimating the degree of recruitment overfishing when the stocks are severely overexploited, and inconsistent in determining growth overfishing when the stocks are underexploited. Increased recruitment error reduces precision but can decrease bias in estimations. This study highlights the importance of quantifying the accuracy of stock assessment methods and testing methods under different scenarios to determine their strengths and weaknesses and provides guidance on which methods to employ in various situations.
Introduction
Fisheries are considered data-limited if there are insufficient data to conduct a comprehensive quantitative, model-based stock assessment to estimate time-series of biomass and fishing mortality relative to their reference points (Dowling et al., 2019). Nevertheless, even with limited data, some aspects of stock status can be inferred. Data-limited assessment methods are increasingly used for management purposes to report on the regional status of fisheries across many stocks and to assess the status of individual data-limited stocks as inputs to management decisions (Dowling et al., 2015, 2019; Chrysafi and Kuparinen, 2016). In data-limited fisheries, length-frequency data from commercial catches are often the primary data type collected because they are relatively economical and easy to collect (Pilling et al., 2008; Hordyk et al., 2015a; Mildenberger et al., 2017). As a result, numerous length-based methodologies have been developed. Prominent methods include the length-based Thompson and Bell (TB) model (Thompson and Bell, 1934), length-based spawning potential ratio (LBSPR; Hordyk et al., 2015b), length-based integrated mixed effects (LIME; Rudd and Thorson, 2018), and length-based risk analysis (LBRA; Ault et al., 1998, 2008, 2019).
TB is a yield-per-recruit (YPR) model that evaluates stock status relative to fishing and selectivity reference levels using length composition data (Thompson and Bell, 1934; Sparre and Venema, 1998; Mildenberger et al., 2017). LBSPR is a well-known length-based model that assesses stock status by comparing the spawning potential as measured through the length composition data to that expected in an unfished stock (Hordyk et al., 2015b). LIME relaxes the equilibrium assumptions of other methods, accounting for time-varying recruitment and fishing mortality (though it assumes constant selectivity), and derives population parameters associated with an age-structured model and length compositions (Rudd and Thorson, 2018). LBRA uses the mean length of the catch to calculate reference points that address sustainability risks (Ault et al., 1998, 2008, 2019). These approaches can be used to estimate spawning potential ratio (SPR) and F/FMSY, which are commonly used as indicators for recruitment overfishing and growth overfishing, respectively. SPR is the proportion of the unfished reproductive potential left at any given level of fishing pressure (Hordyk et al., 2015b). SPR is 100% in an unexploited stock, and 0% in a stock with no spawning (e.g. all mature fish have been removed or all female fish have been caught).
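The SPR definition above can be illustrated with a minimal per-recruit sketch in Python (the study itself used R). Everything in it is an assumption made for illustration rather than the authors' implementation: annual age steps, fecundity proportional to length cubed, logistic maturity and selectivity ogives whose width is taken as the distance between the 25% and 75% points, and the function names themselves. The parameter values are the base-scenario inputs listed in Table 2.

```python
import math


def vb_length(age, linf, k, t0):
    """Mean length at age from the von Bertalanffy growth function."""
    return linf * (1.0 - math.exp(-k * (age - t0)))


def logistic(length, l50, width):
    """Logistic ogive; width is ASSUMED to be L75 - L25 (not confirmed by the source)."""
    return 1.0 / (1.0 + math.exp(-2.0 * math.log(3.0) * (length - l50) / width))


def eggs_per_recruit(f, m, linf, k, t0, lmat50, wmat, lsel50, wsel, amax):
    """Expected lifetime egg production of one recruit (relative units)."""
    survivors = 1.0
    total = 0.0
    for age in range(amax + 1):
        length = vb_length(age, linf, k, t0)
        # ASSUMPTION: fecundity is proportional to length cubed.
        total += survivors * logistic(length, lmat50, wmat) * length ** 3
        # Survive one year of natural mortality plus selectivity-scaled fishing.
        survivors *= math.exp(-(m + f * logistic(length, lsel50, wsel)))
    return total


def spr(f, **life_history):
    """SPR = fished / unfished expected lifetime egg production."""
    return eggs_per_recruit(f, **life_history) / eggs_per_recruit(0.0, **life_history)


# Base-scenario inputs from Table 2.
BASE = dict(m=0.32, linf=64.6, k=0.21, t0=-0.01,
            lmat50=34.0, wmat=6.8, lsel50=11.0, wsel=2.2, amax=18)

for f in (0.0, 0.06, 0.13, 0.28):
    print(f"F = {f:.2f}  SPR = {spr(f, **BASE):.3f}")
```

By construction the ratio equals 1 at F = 0 and falls as F increases. The exact values depend on the simplifying assumptions above, so they should not be expected to match the "true" SPR calculated from the IBM.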
The aim of this study is to quantify accuracy and precision for four length-based data-limited methods (TB, LBSPR, LIME, and LBRA) under various life history, exploitation, and recruitment scenarios given only a single year of length-frequency data, which is a limited field period common in very data-limited fisheries (Tesfaye et al., 2016; Herrón et al., 2018; Tuda, 2018; Abobi et al., 2019). This comparison allows for the identification of the method most suitable in different data-limited assessment scenarios by revealing the strengths and weaknesses of each method with reference to how well it captures true stock status, estimates key reference points, and characterizes uncertainty. It further helps to expose discrepancies in the performance of the methods (Cadrin and Dickey-Collas, 2015) and thus provides guidance about model applicability, expected bias, and advantages or disadvantages in uncertainty characterization. Although several studies have tested the performance of these methods through simulation testing (Hordyk et al., 2015b; Rudd and Thorson, 2018), this study differs in using an individual-based modelling (IBM) framework to track individuals in populations rather than using an approximation of lengths distributed in a population by age, thus offering a new and important test of these length-based approaches. The IBM framework also provides an alternative way of conducting a simulation–estimation analysis to ensure that the operating model is distinct from the estimation model.
Methods
We conducted a simulation–estimation analysis using an individual-based population model as the operating model, which simulated population dynamics and generated length composition catch data. The assessment models were given the “true” (i.e. unbiased) input parameter values for the mean somatic growth and mortality rates and were then used to estimate various reference points and derived quantities relating to stock status and exploitation rate. This simulation loop allows us to compare how far the outputs of the assessment models are from the “true” stock status estimates and investigate the sensitivities of the models to violated model assumptions. An overview of the experimental design is depicted in Figure 1.

Figure 1. Simulation study methodology diagram. There are seven operating model setups. Scenarios differ in life history with (I) medium-lived (base model*), (II) short-lived, and (III) longer-lived stocks. All fish longevity simulations were run with constant recruitment and exploitation at target level (SPR ∼ 0.4). From the base model, either the exploitation scenario changes, to (IV) underexploited or (V) overexploited, or the recruitment scenario changes, to (VI) stochastic or (VII) autocorrelated (AR). For each operating model, 1 year of monthly length-frequency data and the “true” stock estimate parameters were simulated and extracted. The “true” life history values (asymptotic length L∞, growth coefficient K, natural mortality M, and length at 50% maturity) from the operating models and the length-frequency data were then used as input for the simulated assessments with the four length-based estimation models: (i) length-based TB, (ii) LBSPR, (iii) LIME, and (iv) LBRA.
Operating model
The stock dynamics were simulated using the “fishdynr” R package (Taylor, 2017), which contains several models for simulating stock or population dynamics and fisheries management. The package’s function “virtualPop” creates an IBM of a fish stock with certain life history traits subjected to a fishing fleet with specific selectivity characteristics and fishing pressure. Information about the modelling approach for growth, mortality, selectivity, and recruitment is outlined by Taylor and Mildenberger (2017). Functions and equations for the population dynamics used in the operating model are listed in Table 1.
Table 1. Functions and population dynamic equations used for generating stocks and length-frequency data in the operating models.

| Number | Description | Function/equation |
|---|---|---|
| 1.1 | von Bertalanffy growth function | |
| 1.2 | Variability in L∞ of von Bertalanffy growth function | |
| 1.3 | Variability in K of von Bertalanffy growth function | |
| 1.4 | Length–weight relationship | |
| 1.5 | Selectivity/maturity probability | |
| 1.6 | Non-autocorrelated recruitment deviation | |
| 1.7 | Autocorrelated recruitment deviation | |
| 1.8 | | |
| 1.9 | Spawning-stock biomass | |
| 1.10 | Beverton–Holt relationship | ; beta = 1 (constant recruitment) |
| 1.11 | Total mortality | |
| 1.12 | Fishing mortality | |
| 1.13 | Probability of death | |
| 1.14 | Rand = random number generated | If , individual dies |
| 1.15 | Probability of death due to either natural or fishing mortality (0 = M, 1 = F) | |
| 1.16 | Unfished expected lifetime egg production | |
| 1.17 | Fished expected lifetime egg production | |
| 1.18 | SPR | |
| 1.19 | Yield | $Y_{F,L_c,t} = F_t B_{L_c,t}$ |
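The individual-level mortality rows of Table 1 (total mortality, probability of death, the random draw that decides whether an individual dies, and the assignment of each death to natural or fishing mortality) can be sketched as follows. This is a hedged Python illustration under textbook assumptions, not the fishdynr code: the probability of dying in a time step is taken as 1 − exp(−Z·Δt) with Z = M + F, a death is attributed to fishing with probability F/Z, and F is taken to be the fishing mortality the individual actually experiences (already scaled by selectivity). All function names are hypothetical.

```python
import math
import random


def death_probability(m, f, dt):
    """Probability that an individual dies during a time step of length dt (years)."""
    return 1.0 - math.exp(-(m + f) * dt)


def step_individual(m, f, dt, rng):
    """Fate of one individual over one time step: 'alive', 'natural', or 'fishing'."""
    # The individual dies if the random number falls below the death probability.
    if rng.random() >= death_probability(m, f, dt):
        return "alive"
    # A second draw assigns the cause of death in proportion to F / Z.
    return "fishing" if rng.random() < f / (m + f) else "natural"


def simulate_cohort(n, m, f, months, seed=1):
    """Follow n individuals through monthly steps; return (survivors, number caught)."""
    rng = random.Random(seed)
    dt = 1.0 / 12.0
    alive, caught = n, 0
    for _ in range(months):
        outcomes = [step_individual(m, f, dt, rng) for _ in range(alive)]
        caught += outcomes.count("fishing")
        alive = outcomes.count("alive")
    return alive, caught


# Base-scenario rates from Table 2, followed for one year.
survivors, caught = simulate_cohort(n=10_000, m=0.32, f=0.13, months=12)
print(survivors, caught)
```

With many individuals, the surviving fraction after one year should approach exp(−(M + F)), which gives a quick sanity check on any IBM mortality step.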
We simulated seven scenarios to test for the effects of variations in life history, fishing exploitation level, and recruitment types on assessment model performance. The base model comprised a medium-lived species (maximum age ≈ 18 years), an exploitation rate at the target level of SPR (≈ 40%), and constant recruitment with no recruitment variability. From this base model, we varied one of the three characteristics (life history, current exploitation status, or recruitment) according to the parameter values in Table 2. For each scenario, we generated 300 IBM replicates, which differed from each other due to random sampling. Each simulation replicate consisted of a 35-year simulation period, of which the first 10 years had no fishing activity and the remaining 25 years were fished at the desired exploitation rate at a monthly time-step. This provided sufficient time for the model to reach equilibrium. We then extracted only the final year of each simulation replicate (year 35) to provide monthly length-frequency data from catches (200 individuals per month). For each simulation replicate, we calculated the “true” SPR and F/FMSY based on life history (Figure 2; a detailed description is provided in Supplementary Figure S1).
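The replicate design described above (10 unfished years, 25 fished years, monthly time-steps, and 200 sampled individuals per month in the final year) can be laid out in a short Python sketch. It only builds the fishing schedule and bins a set of lengths; it does not simulate a population. The function names and the binning helper are hypothetical, and the study itself used the R package fishdynr.

```python
import numpy as np

MONTHS_PER_YEAR = 12


def monthly_f_schedule(f_target, unfished_years=10, fished_years=25):
    """Annual fishing mortality rate in force during each month of one replicate."""
    unfished = np.zeros(unfished_years * MONTHS_PER_YEAR)
    fished = np.full(fished_years * MONTHS_PER_YEAR, f_target)
    return np.concatenate([unfished, fished])


def final_year_months(schedule):
    """Indices of the 12 months of the last simulated year (the only year sampled)."""
    return np.arange(len(schedule) - MONTHS_PER_YEAR, len(schedule))


def bin_lengths(lengths, bin_size, max_length):
    """Length-frequency counts for fixed-width bins starting at 0 cm."""
    edges = np.arange(0.0, max_length + bin_size, bin_size)
    counts, _ = np.histogram(lengths, bins=edges)
    return edges, counts


schedule = monthly_f_schedule(f_target=0.13)   # base scenario F from Table 2
sampled_months = final_year_months(schedule)
fish_per_replicate = 200 * len(sampled_months)  # 200 individuals per month
print(len(schedule), sampled_months[0], fish_per_replicate)
```

For the base scenario, `bin_lengths` would be called with a bin size of 2 cm (Table 2) on the lengths sampled in each of the 12 final months.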

Figure 2. Visual representation of the calculation and YPR of the “true” SPR and FMSY based on three life histories. The “true” SPR and FMSY were calculated by pushing the IBM forward 100 years and producing a YPR curve. The fishing mortality (F) for each scenario was simulated based on this calculation as well.
Table 2. Input values used to generate length-frequency data with the operating models.

| Parameter | Changes | Asymptotic length (L∞; cm) | Growth coefficient (K) | Age at length = 0 (t0) | Natural mortality (M) | Fishing mortality (F) | Theoretical maximum age (AMAX) |
|---|---|---|---|---|---|---|---|
| Life history | Base | 64.6 | 0.21 | –0.01 | 0.32 | 0.13 | 18 |
| | Short | 36.2 | 0.87 | –0.01 | 0.9 | 0.45 | 4 |
| | Longer | 90 | 0.13 | –0.01 | 0.18 | 0.08 | 26 |
| Exploitation level | Underexploited | 64.6 | 0.21 | –0.01 | 0.32 | 0.06 | 18 |
| | Overexploited | 64.6 | 0.21 | –0.01 | 0.32 | 0.28 | 18 |
| Recruitment | Stochastic | 64.6 | 0.21 | –0.01 | 0.32 | 0.13 | 18 |
| | AR | 64.6 | 0.21 | –0.01 | 0.32 | 0.13 | 18 |

| Parameter | Changes | Length at 50% maturity (cm) | Width of maturity ogive (cm) | Recruitment standard deviation (σR) | Length at 50% selectivity (cm) | Width of selectivity ogive (wqs; cm) | Bin size (cm) |
|---|---|---|---|---|---|---|---|
| Life history | Base | 34 | 6.8 | 0.01 | 11 | 2.2 | 2 |
| | Short | 20.2 | 4.04 | 0.01 | 9 | 1.8 | 1 |
| | Longer | 50 | 10 | 0.01 | 20 | 4 | 3 |
| Exploitation level | Underexploited | 34 | 6.8 | 0.01 | 11 | 2.2 | 2 |
| | Overexploited | 34 | 6.8 | 0.01 | 11 | 2.2 | 2 |
| Recruitment | Stochastic | 34 | 6.8 | 0.4537 | 11 | 2.2 | 2 |
| | AR | 34 | 6.8 | 0.737 | 11 | 2.2 | 2 |
The base model comprises the medium-lived species at the target exploitation level with constant, non-stochastic recruitment. From the base model, changes in life history (short- or longer-lived), exploitation level (under- or overexploited), and recruitment type [stochastic (Thorson, in press) or autocorrelated (AR)] were made, respectively, to create the other six scenarios.
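The stochastic and autocorrelated recruitment scenarios can be sketched in Python as lognormal recruitment multipliers. This is an illustration under stated assumptions, not the operating model's equations, which did not survive in Table 1: deviations are assumed normal on the log scale with a −σR²/2 bias correction, the autocorrelated case is assumed to be a stationary AR(1) process, and the autocorrelation coefficient `rho` is a hypothetical parameter whose value is not reported in this section.

```python
import numpy as np


def recruitment_multipliers(n_years, sigma_r, rho=0.0, seed=0):
    """Lognormal recruitment multipliers with optional AR(1) autocorrelation.

    rho is a HYPOTHETICAL autocorrelation coefficient; rho = 0 gives the
    non-autocorrelated (stochastic) case.
    """
    rng = np.random.default_rng(seed)
    eps = np.zeros(n_years)
    eps[0] = rng.normal(0.0, sigma_r)
    # Scale the innovations so the stationary standard deviation stays sigma_r.
    innovation_sd = sigma_r * np.sqrt(1.0 - rho ** 2)
    for t in range(1, n_years):
        eps[t] = rho * eps[t - 1] + rng.normal(0.0, innovation_sd)
    # ASSUMPTION: bias correction so the expected multiplier is 1.
    return np.exp(eps - sigma_r ** 2 / 2.0)


base = recruitment_multipliers(35, sigma_r=0.01)               # near-constant recruitment
stochastic = recruitment_multipliers(35, sigma_r=0.4537)       # stochastic scenario
autocorr = recruitment_multipliers(35, sigma_r=0.737, rho=0.5)  # rho chosen for illustration
print(base.round(3)[:5], stochastic.round(3)[:5], autocorr.round(3)[:5])
```

With σR = 0.01, as in the base, life history, and exploitation scenarios, the multipliers stay very close to 1, which is why those scenarios are described as having constant recruitment.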
All operating models assumed von Bertalanffy growth and logistic-type selectivity and maturity. The three life histories were based on those used by Rudd and Thorson (2018) to test the LIME approach: Siganus sutor (Kenyan rabbitfish) for the short-lived (Hicks and McClanahan, 2012), Lutjanus guttatus (spotted rose snapper) for the medium-lived (Bystrom, 2016), and Epinephelus morio (red grouper) for the longer-lived (Heemstra and Randall, 1993) life history types. Examples of each life history are shown in Figure 3 as length-frequency graphs for one simulation replicate. All simulations and analyses were conducted using the statistical programming language R (R Core Team, 2018).

Figure 3. Length-frequency distribution graphs. For each life history scenario (short, medium, and longer), , , and L∞ are visualized (in blue, red, and black, respectively) over the length-frequency graphs for one iteration.
Estimation models
The estimation models refer to the length-based assessment methods that derive estimates of stock status from sampled length-frequency catch data. All estimation models compared in this study assume the true values for life history parameters are known (e.g. growth, maturity, and natural mortality parameters; see Inputs in Table 3) and, therefore, the same values that were used in the operating models were supplied to each of the methods as inputs. As a result, the estimation capabilities of each method are better than would be expected if these parameter values were estimated with uncertainty or incorrect, as is normally the case. Life history parameters, fishing mortality rate, and recruitment are often confounding factors affecting the shape of the length-frequency distribution; thus, the focus of this article is on sensitivities of methods estimating stock status based on an IBM approach alone rather than confounding it with biased life history parameters, because previous simulation studies have already shown the effects of incorrect life history parameters (Punt, 2003; Deroba and Schueller, 2013; Hordyk et al., 2015b; Rudd and Thorson, 2018).
| Method | Inputs | Assumptions | Outputs |
|---|---|---|---|
| LCCC (not part of the comparison) | | | |
| Length-based TB | | | |
| LBSPR | | | |
| LIME | | | |
| LBRA | | | |
Table 3. The data inputs, assumptions, and expected outputs are listed for each method, including the LCCC. In the outputs, the estimates in bold are those this study uses for the comparison.
The selectivity values, namely the lengths at 50% and 95% selectivity, were calculated using the length-converted catch curve (LCCC) from the R package TropFishR (Mildenberger et al., 2017) and used as the selectivity inputs for all estimation models. The LCCC method plots the natural log of catch vs. age, estimates total mortality (Z) from the negative slope of the regression of the curve, and derives a selection ogive. For TB, Z was calculated from the LCCC and used as an input value, similar to how TB is applied to real data. The four length-based methods assessed, (i) TB, (ii) LBSPR, (iii) LIME, and (iv) LBRA, are contained within the R packages TropFishR, LBSPR (Hordyk et al., 2015b), LIME (Rudd and Thorson, 2018), and fishmethods (Nelson, 2017), respectively. The inputs, assumptions, and outputs of each of the methods, including the LCCC method, are listed in Table 3.
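The regression step of a length-converted catch curve can be sketched in Python as below. This is a simplified illustration of the general technique, not the TropFishR implementation: length classes are converted to relative ages with the inverse von Bertalanffy function, ln(catch/Δt) is regressed on the relative age at the class midpoint, and Z is the negative slope. The caller is assumed to pass only fully selected length classes (the descending limb), and the derivation of the selection ogive is not shown. Function names are hypothetical.

```python
import numpy as np


def relative_age(length, linf, k, t0=0.0):
    """Age at which the von Bertalanffy curve reaches a given length."""
    return t0 - np.log(1.0 - np.asarray(length, dtype=float) / linf) / k


def catch_curve_z(lower, upper, catch, linf, k, t0=0.0):
    """Estimate total mortality Z from catches in fully selected length classes."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    catch = np.asarray(catch, dtype=float)
    # Time needed to grow through each length class.
    dt = relative_age(upper, linf, k, t0) - relative_age(lower, linf, k, t0)
    x = relative_age((lower + upper) / 2.0, linf, k, t0)
    y = np.log(catch / dt)
    slope, _intercept = np.polyfit(x, y, 1)
    return -slope


# Synthetic equilibrium catches generated with a known Z, for checking the fit.
linf, k, z_true = 64.6, 0.21, 0.45
lower = np.arange(20.0, 50.0, 2.0)
upper = lower + 2.0
catch = (np.exp(-z_true * relative_age(lower, linf, k))
         - np.exp(-z_true * relative_age(upper, linf, k)))
print(catch_curve_z(lower, upper, catch, linf, k))
```

On noise-free synthetic data like this, the estimate should land close to the Z used to generate the catches; with real length-frequency samples, the choice of which classes count as fully selected strongly affects the result.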
Performance metrics
Many studies have explored the levels of SPR to be used as target and limit reference points, often applying SPR of 30% as a limit and 40% as a target reference point (Mace and Sissenwine, 1993; Clark, 2002; Hordyk et al., 2015c). F/FMSY is a reference point relating the current fishing mortality (F) to the level that would sustain maximum yield (FMAX). There is a common practice of linking FMAX to FMSY based on the assumption that recruitment is independent of spawning-stock size for fishing mortalities between 0 and FMAX (Reynolds et al., 2001) and that selectivity is asymptotic and around the maturity ogive. These assumptions are often invalid for most species and many fisheries. However, in this study, density-dependent effects on recruitment were negligible, resulting in equivalency between FMAX and FMSY. Although reference points based on MSY are based on a measure of magnitude (i.e. catch), FMSY can be calculated with length only instead of catch using relative YPR (Larkin, 1977; Holt and Talbot, 1978; Caddy and McGarvey, 1996; Punt and Smith, 2001).
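The link between a YPR curve and FMAX described above can be illustrated with a small grid search in Python. This is a sketch under simplifying assumptions, not the procedure used for the "true" reference points (which came from the IBM): monthly time-steps, weight proportional to length cubed, a logistic selectivity ogive whose width is taken as L75 − L25, and yield computed as the fishing share F·s/Z of deaths in each step. Function names are hypothetical; parameter values are the base-scenario inputs from Table 2.

```python
import math


def yield_per_recruit(f, m, linf, k, t0, lsel50, wsel, amax, steps_per_year=12):
    """Relative yield per recruit at fishing mortality f."""
    dt = 1.0 / steps_per_year
    survivors, ypr = 1.0, 0.0
    for i in range(amax * steps_per_year):
        age = i * dt
        length = linf * (1.0 - math.exp(-k * (age - t0)))
        # ASSUMPTION: logistic selectivity with width = L75 - L25.
        sel = 1.0 / (1.0 + math.exp(-2.0 * math.log(3.0) * (length - lsel50) / wsel))
        z = m + f * sel
        deaths = survivors * (1.0 - math.exp(-z * dt))
        # Fishing share of the deaths, weighted by length cubed (ASSUMED weight proxy).
        ypr += deaths * (f * sel / z) * length ** 3
        survivors -= deaths
    return ypr


def f_max(grid, **life_history):
    """Fishing mortality on the grid that maximizes yield per recruit."""
    return max(grid, key=lambda f: yield_per_recruit(f, **life_history))


BASE = dict(m=0.32, linf=64.6, k=0.21, t0=-0.01, lsel50=11.0, wsel=2.2, amax=18)
GRID = [i / 100.0 for i in range(0, 201)]   # F from 0.00 to 2.00

fmax = f_max(GRID, **BASE)
print("FMAX on grid:", fmax, " F/FMAX at F = 0.13:", 0.13 / fmax if fmax > 0 else float("nan"))
```

Because recruitment is treated as independent of spawning-stock size here, the maximum of the per-recruit curve plays the role of FMSY, mirroring the equivalence between FMAX and FMSY noted in the text.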
Bias was measured with MRE and precision with MARE, both computed from the relative error of each estimate, where xest is the estimated value (calculated from the respective estimation model) and xtrue is the true value (calculated from the operating models). Values closer to zero represent the least biased (MRE) and most precise (MARE) results.
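The equations for the two metrics did not survive in this copy of the text, so the following Python sketch states them under an explicit assumption: relative error is (xest − xtrue)/xtrue, MRE is the median of the relative errors across replicates (consistent with the figure captions, which describe the median relative error as the indicator of bias), and MARE is the median of their absolute values. Function names are hypothetical.

```python
import numpy as np


def relative_error(x_est, x_true):
    """Relative error of each estimate against the true value."""
    return (np.asarray(x_est, dtype=float) - x_true) / x_true


def mre(x_est, x_true):
    """Median relative error (bias); ASSUMES the 'M' denotes the median."""
    return float(np.median(relative_error(x_est, x_true)))


def mare(x_est, x_true):
    """Median absolute relative error (imprecision); same assumption as mre."""
    return float(np.median(np.abs(relative_error(x_est, x_true))))


# Three hypothetical SPR estimates against a true SPR of 0.40.
estimates = [0.36, 0.44, 0.50]
print(mre(estimates, 0.40), mare(estimates, 0.40))
```

A method can have an MRE near zero and still a large MARE when over- and underestimates cancel out, which is why the two metrics are reported together.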
Results
Performance of the length-based methods
For TB and LBSPR, bias and imprecision were both <30% across scenarios when estimating the reference points (Figures 4 and 5, Tables 4 and 5). Of the four methods, TB was the least biased and most precise in estimating SPR for the short-lived and stochastic scenarios, with <10% bias and imprecision. When estimating F/FMSY, TB performed the best in the base and overexploited scenarios, with <10% bias and imprecision in each. The LBSPR estimate of SPR was the closest to the truth in the overexploited (<5% biased and imprecise) and autocorrelated scenarios, and LBSPR was the least biased in estimating F/FMSY in the underexploited, stochastic, and autocorrelated scenarios. LBSPR was relatively robust because natural mortality was fixed (M/K input; Table 5), which increases precision in the calculation of F/M and thus SPR and F/FMSY. Rudd and Thorson (2018) found that LBSPR performed better when the stocks were in equilibrium and when the operating model matched LBSPR’s assumptions; this result is supported here, given that the base IBM assumptions included constant recruitment (except in the stochastic and autocorrelated scenarios) and constant fishing. LBSPR has also been found to be accurate when the underlying selectivity is asymptotic (Hordyk et al., 2015b; Rudd and Thorson, 2018; Pons et al., 2019).
Figure 4. Violin plots of relative error for SPR for 300 iterations per scenario with 200 individuals per month sampled in a single year across three life histories (short-, medium-, and longer-lived), three exploitation levels (target, under-, and overexploited), and three recruitments (no error, stochastic error, and autocorrelation pattern and error). Four methods (length-based TB [in orange], LBSPR [in green], LIME [in blue], and LBRA [in violet]) were analysed. The grey horizontal line is the zero relative error line, and the black dot is the median relative error indicating bias. Each plot has a different y-axis with a smoother tail.
Figure 5. Violin plots of relative error for F/FMSY for 300 iterations per scenario with 200 individuals per month sampled in a single year across three life histories (short-, medium-, and longer-lived), three exploitation levels (target, under-, and overexploited), and three recruitments (no error, stochastic error, and autocorrelation pattern and error). Four methods (length-based TB [in orange], LBSPR [in green], LIME [in blue], and LBRA [in violet]) were analysed. The grey horizontal line is the zero relative error line, and the black dot is the median relative error indicating bias. Each plot has a different y-axis with a smoother tail.
Table 4. Bias (MRE) and precision (MARE) of SPR estimates from length-based TB, LBSPR, LIME, and LBRA across life histories, exploitation levels, and recruitment types.
These are components of the base model, and thus are of a single scenario.
Table 5. Bias (MRE) and precision (MARE) of F/FMSY estimates from length-based TB, LBSPR, LIME, and LBRA across life histories, exploitation levels, and recruitment types.
Lighter colours indicate better models: the lightest red colour indicates bias/precision <5%, and the darkest red colour indicates bias/precision >30%.
These are components of the base model, and thus are of a single scenario.
LIME was the least biased and most precise when estimating SPR for the base and longer-lived scenarios (Table 4) but usually performed the worst in estimating SPR for the over- and underexploited scenarios (Table 4) and F/FMSY for short-lived and autocorrelated recruitment scenarios (Table 5). The relatively high bias and low precision for these estimates indicate that the complexity of this method requires longer time-series; a single year of monthly length-frequency data is insufficient for this method to untangle the effects of fishing mortality and recruitment on the length-frequency data. Lastly, LBRA only performed well in the short-lived scenario. The method is constrained to the “truncated model” of Ehrhardt and Ault (1992), in which LMAX is the cut-off for the upper length class, so some of the lengths above LMAX were removed. Truncations lead to overestimation of Z (Then et al., 2015), which results in underestimation of SPR and overestimation of F/FMSY. This was especially evident in the medium- and longer-lived scenarios (Figures 4 and 5). Although LBRA was more biased than the other methods, it gave low values of SPR and high values of F/FMSY, meaning that it is more precautionary, a desirable feature in data-limited approaches if error is unavoidable.
Across life history, exploitation level, and recruitment scenarios
Life histories have a clear impact on the performance of these four length-based assessment methods (Figures 4 and 5). Notably, each method had difficulties in assessing short-lived stocks, especially when estimating F and FMSY, and consequently F/FMSY. The increased bias and decreased precision of F/FMSY stem from the decreased accuracy of the calculation of fishing mortality and FMSY. All methods overestimated SPR; LBSPR and LIME performed the worst and TB and LBRA performed the best across the life histories. For medium- and longer-lived species, TB, LBSPR, and LIME overestimated SPR, whereas LBRA on average underestimated SPR. In the medium-lived scenario, TB, LBSPR, and LIME were negatively biased in calculating F/FMSY, whereas LBRA was positively biased. For the longer-lived scenario, TB and LBSPR underestimated F/FMSY, whereas LIME and LBRA overestimated F/FMSY.
Although the performance of the methods under different exploitation scenarios varied among reference points, it is evident that stocks that are either under- or overexploited are more difficult to assess than those that are fished near the target exploitation level (SPR ≈ 40%). When stocks are severely overexploited, the methods are less accurate in estimating SPR. When the stocks are severely underexploited, the methods present inconsistencies in estimating F/FMSY. The error stems from the calculation of fishing mortality as seen in Supplementary Figure S1 and Table S1. For both exploitation levels, the inaccuracy of F/FMSY is due to the large bias and imprecision in calculating the fishing mortality.
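For readers less familiar with the reference point, SPR is conventionally the spawning output per recruit under fishing divided by the same quantity for an unfished stock. The sketch below is a deliberately simple, age-structured illustration with knife-edge maturity and selectivity; it is not the length-based formulation used by any of the four methods, and every parameter value and function name is a hypothetical choice.

```python
import math


def vb_weight(age, l_inf=60.0, k=0.2):
    """Relative weight at age: cube of the von Bertalanffy length (hypothetical values)."""
    length = l_inf * (1.0 - math.exp(-k * age))
    return length ** 3


def spawning_per_recruit(f, m, max_age, age_mat, age_sel, weight_at_age):
    """Spawning biomass per recruit with knife-edge maturity and selectivity."""
    survivors = 1.0
    total = 0.0
    for age in range(max_age + 1):
        if age >= age_mat:
            total += survivors * weight_at_age(age)
        # Fishing mortality applies only to ages at or above the selection age.
        z = m + (f if age >= age_sel else 0.0)
        survivors *= math.exp(-z)
    return total


def spr(f, **params):
    """Spawning potential ratio: fished relative to unfished spawning per recruit."""
    return spawning_per_recruit(f, **params) / spawning_per_recruit(0.0, **params)


# Hypothetical life history: M = 0.2, maximum age 20, maturity at age 3, selection at age 2.
PARAMS = dict(m=0.2, max_age=20, age_mat=3, age_sel=2, weight_at_age=vb_weight)

for f in (0.0, 0.1, 0.3):
    print(f"F = {f:.1f} -> SPR = {spr(f, **PARAMS):.2f}")
```

In such a calculation, an error in the fishing mortality estimate propagates directly into SPR.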
The impacts of increasing recruitment error were also evident in each of the methods, as precision decreased in the stochastic and autocorrelated scenarios. Although precision decreased, the bias in SPR was also lower in most of the methods because no recruitment error was implemented in the base model. In the stochastic scenario, each of the methods on average underestimated SPR, with three methods (TB, LBSPR, and LIME) also being <5% biased, whereas in the autocorrelated scenario, the three methods on average overestimated SPR, but remained <10% biased. When estimating F/FMSY, the bias generally decreased when recruitment error was added, compared with the scenario without recruitment error. The high bias in estimating F/FMSY stems from FMSY, and the low precision stems from fishing mortality.
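The difference between the stochastic and autocorrelated recruitment scenarios can be sketched with a common parameterization: lognormal recruitment deviations that are either independent between years or follow a first-order autoregressive process. This is an illustrative assumption, not the recruitment model of the operating model used in this study; the function name, `sigma_r`, `rho`, and the seed are all hypothetical.

```python
import math
import random


def recruitment_multipliers(n_years, sigma_r, rho=0.0, seed=1):
    """Lognormal recruitment multipliers with optional AR(1) autocorrelation.

    rho = 0 gives independent (stochastic) deviations; 0 < rho < 1 gives
    autocorrelated deviations with the same marginal standard deviation.
    """
    rng = random.Random(seed)
    deviations = []
    for _ in range(n_years):
        innovation = rng.gauss(0.0, sigma_r)
        if not deviations:
            deviation = innovation  # first year drawn from the stationary distribution
        else:
            deviation = rho * deviations[-1] + math.sqrt(1.0 - rho ** 2) * innovation
        deviations.append(deviation)
    # Bias correction so that the expected multiplier is 1.
    return [math.exp(d - sigma_r ** 2 / 2.0) for d in deviations]


stochastic = recruitment_multipliers(20, sigma_r=0.6, rho=0.0)
autocorrelated = recruitment_multipliers(20, sigma_r=0.6, rho=0.5)
print([round(x, 2) for x in stochastic[:5]])
print([round(x, 2) for x in autocorrelated[:5]])
```

Multiplying a constant mean recruitment by these values gives yearly recruitment; with autocorrelation, good and poor years tend to arrive in runs, which a single year of length data cannot distinguish from a change in fishing mortality.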
Discussion
This study is the first to simulation-test length-based models using an IBM framework, thus adding a level of independence in population dynamics not seen in other studies. In similar studies (Ault et al., 1998; Hordyk et al., 2015b; Rudd and Thorson, 2018), the operating models used are identical to the assessment models and assume that all dynamic processes are fully understood. An alternatively structured operating model can help avoid this problem and identify misleading assumptions that may be implicit in the design of an assessment model (Cao et al., 2016). In addition, population-based methods (age or length) make assumptions regarding error in length-at-age to create a length distribution within each age class (Cao et al., 2016). Our operating model considers the Rosa Lee phenomenon, which is rarely represented in assessment models, where slow-growing fish may have lower mortality rates because they become susceptible to fishing selection (i.e. reach the length at which they are selected) at an older age than faster-growing individuals (Lee, 1912; Kraak et al., 2019).
Performance of methods
Stock assessment methods often perform poorly with short-lived species, as the annual time-step does not provide enough information about their dynamics (Thiaw et al., 2011; Alemany et al., 2015; Maunder and Piner, 2015). In addition, the biomass of short-lived species is more sensitive to environmental variability because of their fast growth rates and short generation times (Winemiller, 2005; Dichmont et al., 2006; Pinsky et al., 2011). Hordyk et al. (2015b) also found that length-based methods tend to be biased for short-lived species, as these methods often rely on detecting the signal of fishing mortality on the upper tail of the length composition. Rudd and Thorson (2018) noted that with increasing length and age, the cohorts tend to be indistinct. In general, longer-lived species may have lower SPR levels, as there is a relationship between longevity and sensitivity of SPR to exploitation pressures (Nadon et al., 2015). Thus, for longer-lived species, spawning biomass is represented by older individuals, and their numbers can be reduced even with low fishing rate. Medium-lived species have new recruits at an early age, which allows better detection of information about the population, and are not as vulnerable to fishing due to a low reproductive rate. Varying life histories have an impact on the quality of length-frequency data as they affect the ideal sample size, sparseness in small or large length classes, and effects of selectivity and fishing pressure. Although this study does not address common issues associated with the quality and bias of length-frequency data, we can, however, conclude that medium-lived species seem the easiest to assess as their cohorts can be tracked, and it is easier to sufficiently sample length classes.
We also show that fishing mortality in overexploited and underexploited stocks is harder to assess with the four tested methods. In the majority of the scenarios, fishing mortality was underestimated, which highlights the challenge of accurately estimating mortality rates and emphasizes the need for conservative interpretation of assessment outcomes, particularly in assessments that can only estimate stock status based on mortality rates (e.g. exploitation). In a future study, the influence of historical fishing patterns (e.g. a history of increasing and decreasing fishing mortality) would be of interest to investigate, as fishing mortality is usually not constant, and a consistently high fishing mortality results in a different length distribution than that seen in a recently developed fishery.
Many stock assessment models, including length-based methods, assume equilibrium conditions. However, this assumption is typically incorrect as recruitment variation changes the age structure of a population with time (Haddon, 2001). Recruitment patterns can vary greatly among stocks in type (e.g. pulsed, autocorrelated), seasonal variation, and number of modes (uni- vs. bimodal; Isaac, 1990), and should not be overlooked. Despite the underlying uncertainties about recruitment error and type and whether a fishery is in non-equilibrium conditions, one could still apply TB or LBSPR as they were less biased with a single year of data. Although this study lightly addressed the effects of including recruitment variability, and many management strategy evaluation studies investigate different levels of recruitment error, further studies should investigate how this may affect data-limited fisheries assessments.
Caveats of length data
Length data representing a single year of sampling are common in tropical, data-limited areas. However, some methods have limited abilities with only a single year of data, as there is no information to tease apart recruitment from fishing mortality, leading to high uncertainty. Time-series data are not always guaranteed to give better estimates (Carruthers et al., 2014; Rudd and Thorson, 2018; Dowling et al., 2019); rather, the validity of the assumptions of the method and the attributes of the stock life history are more important to assessment outcomes (Carruthers et al., 2014). Although stock assessments usually use multiyear data, the quality of the data and the performance of the methods are often more important in determining the reliability of the assessment. In addition, the ideal sample size and sampling period are dependent on life history and should be considered in the assessment.
Length-frequency data obtained from small-scale, data-limited fisheries are often strongly biased due to mixed gear selectivity not being properly accounted for or because not all gear types are sampled. The size composition of the catch usually reflects a mix of sizes due to the use of a variety of fishing gears. This issue was addressed for tropical fisheries by Wolff et al. (2015), who found that different gear selectivities impact YPR and spawning biomass. The data generation of our simulation study assumed asymptotic selectivity and = 0.25 L∞. Ideally, the selection characteristics of the gear(s) should be known prior to any length-frequency analysis, and (if possible) catch length-frequency data should first be reconstructed based on the selectivity features of the gear. A study by Pons et al. (2019) investigated the influence of different gear selectivity from multiple fleets on length composition data of scombrids in the Atlantic Ocean and found that accounting for multiple selectivity curves reduces bias in estimating SPR. Further consideration and investigation of gear selectivity influences on length data are warranted.
Guidance to practitioners and conclusions
TB and LBSPR are methods based on YPR or spawning per recruit, meaning that they do not assume any stock–recruitment relationship. These are important assumptions of the two models, and when violated, their performances will be biased. A common practice in using TB is to perform the length-based Jones’ cohort analysis beforehand (Mildenberger et al., 2017), which relies on catch in numbers/biomass at age and can estimate fishing mortality per length class. However, it should be noted that this approach adds error from the cohort analysis to TB. We, therefore, recommend calculating fishing mortality per length class using the LCCC with the estimation of selectivity pattern.
When there is conflicting information and results among methods, we recommend using the information provided in this study to advise which type of model is less biased for a given combination of life history and other factors. This study guides which type of model, out of the four tested, can be considered most reliable given a stock’s life history, recruitment, and exploitation. If those properties are unknown (and they most often will be), TB and LBSPR should be run in parallel. Minor differences in outcomes would then indicate uncertainties due to model structure, whereas major discrepancies point to model assumptions potentially being overlooked. Although we highlight which methods performed well in a given scenario, we also encourage using a combination of length-based methods to compare their performances and define a range of possible stock estimates. In the absence or uncertainty of such knowledge on life history, recruitment, and exploitation, LBSPR and TB can be expected to perform more consistently than LIME and LBRA in the rapid assessment of limited data of a single year.
These four length-based methods can be used for rapid assessments in data-limited fisheries to provide a cost-effective starting point for management. As the assumptions and sensitivities of each method were analysed in this study, scientists and managers can use this information to quickly assess data-limited stocks for an indication of stock status, thus providing guidance on which methods to employ given a situation. For the development of new data-limited assessment approaches, understanding how different life histories, exploitation levels, and recruitment errors affect existing methods is essential as it highlights the weaknesses of the current methods.
Funding
TKM was funded in part by the EMFF project “ManDaLiS—Improving the management basis for Danish data-limited stocks” (33113-B-16-085), which is funded by the European Maritime and Fisheries Fund and the Danish Fisheries Agency. TAB was funded in part by the Richard C. and Lois M. Worthington Endowed Professor in Fisheries Management. MBR was funded by the Joint Institute for the Study of the Atmosphere and Ocean, NOAA Cooperative Agreement No. NA150AR4320063 (2017–2019), Contribution No. 2018-0161.
Acknowledgements
We thank James Thorson and Adrian Hordyk for their technical support in applying FishLife and LBSPR tools, respectively. We acknowledge Andi Stephens, the editor, and three anonymous reviewers for comments that significantly improved the paper.
References
R Core Team.