Felipe Hurtado-Ferro, Cody S. Szuwalski, Juan L. Valero, Sean C. Anderson, Curry J. Cunningham, Kelli F. Johnson, Roberto Licandeo, Carey R. McGilliard, Cole C. Monnahan, Melissa L. Muradian, Kotaro Ono, Katyana A. Vert-Pre, Athol R. Whitten, André E. Punt, Looking in the rear-view mirror: bias and retrospective patterns in integrated, age-structured stock assessment models, ICES Journal of Marine Science, Volume 72, Issue 1, January 2015, Pages 99–110, https://doi.org/10.1093/icesjms/fsu198
Abstract
Retrospective patterns are systematic changes in estimates of population size, or other assessment model-derived quantities, that occur as additional years of data are added to, or removed from, a stock assessment. These patterns are an insidious problem, and can lead to severe errors when providing management advice. Here, we use a simulation framework to show that temporal changes in selectivity, natural mortality, and growth can induce retrospective patterns in integrated, age-structured models. We explore the potential effects on retrospective patterns of catch history patterns, as well as model misspecification due to not accounting for time-varying biological parameters and selectivity. We show that non-zero values for Mohn’s ρ (a common measure for retrospective patterns) can be generated even where there is no model misspecification, but the magnitude of Mohn’s ρ tends to be lower when the model is not misspecified. The magnitude and sign of Mohn’s ρ differed among life histories, with different life histories reacting differently to each type of temporal change. The value of Mohn’s ρ is not related to either the sign or magnitude of bias in the estimate of terminal year biomass. We propose a rule of thumb for values of Mohn’s ρ which can be used to determine whether a stock assessment shows a retrospective pattern.
Introduction

Two generated retrospective patterns (a and b) and their relative errors (c and d), along with the corresponding values for Mohn’s ρ and κ-statistics.
Retrospective patterns in estimated biomass present large challenges for fisheries stock assessment and management. Total allowable catches in each year are generally based on an estimate of biomass and some agreed level of fishing mortality. A systematically biased estimate of biomass can result in recommended catches that are higher or lower than intended, not only for a single year but for several consecutive years. From a conservation point of view, retrospective patterns are of particular concern when they lead to overestimation of the biomass used to set catch levels, whereas they could lead to underutilization if biomass is systematically underestimated. In the former case, fishery scientists and managers may believe that there are more fish in the sea than subsequent assessments indicate, and catches can be set at levels exceeding targets without this being noticed for several years (e.g. Pacific halibut, Hippoglossus stenolepis; Valero, 2012). On the other hand, when a retrospective pattern is noticed, it can be so severe that an assessment could be considered unreliable for management purposes (Cadigan and Farrell, 2005; Valero, 2012).
The root causes of a specific retrospective pattern are often difficult to determine, given available data. Parma (1993) was the first to ascribe a potential causal mechanism (time-varying catchability) to an observed retrospective pattern. NOAA (2009) demonstrated a large number of ways in which retrospective patterns can be produced in simulated populations to which virtual population analysis (VPA) methods were applied to estimate biomass and other quantities important in management. Retrospective patterns generally arise from two pathologies: time-varying processes unaccounted for in the assessment (i.e. model misspecification), or contradictory (or incomplete) data. To date, simulation studies have used primarily VPA-type stock assessment methods, but integrated, statistical age-structured models are also extensively used around the world (Maunder and Punt, 2013) and have also shown large retrospective patterns in some assessments (e.g. Pacific halibut Hippoglossus stenolepis; Valero, 2012; Norton Sound red king crab Paralithodes camtschaticus; Hamazaki and Zheng, 2012). Nonetheless, it is not known how retrospective patterns emerge in integrated stock assessments, though model misspecification and conflicting data are likely culprits.
Measuring retrospective patterns has also proven challenging. The most commonly used metric is the “ρ” statistic proposed by Mohn (1999), which measures the relative difference between an estimated quantity from an assessment with a reduced time-series and the same quantity estimated from the full time-series. However, despite this being a relative measure, no rules of thumb have been developed with respect to how large the value of Mohn’s ρ must be before an assessment is deemed to have a retrospective pattern (NOAA, 2009). It is also unclear how much information Mohn’s ρ provides about the bias in the final year of an assessment.
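As a rough illustration of the statistic described above, the following sketch computes a Mohn’s ρ-style value as the mean, over peels, of the relative difference between each peel’s terminal-year estimate and the full time-series estimate for that same year. Averaging over peels is an assumption made here (some formulations sum the terms instead), and the function name `mohns_rho` and the toy numbers are hypothetical.

```python
def mohns_rho(full, peels):
    """Mean relative difference between peel terminal estimates and the full run.

    full  : estimates for years 1..Y from the assessment using all data.
    peels : peels[k-1] holds the estimates from the assessment with the
            last k years of data removed (so it has Y - k entries).
    """
    terms = []
    for k, peel in enumerate(peels, start=1):
        if len(peel) != len(full) - k:
            raise ValueError("peel k must have exactly k fewer years than the full run")
        reference = full[len(full) - k - 1]  # full-run estimate for the peel's terminal year
        terms.append((peel[-1] - reference) / reference)
    return sum(terms) / len(terms)


# Toy example: every peel overestimates its terminal year by 10%.
full_run = [100.0] * 10
five_peels = [[100.0] * (10 - k - 1) + [110.0] for k in range(1, 6)]
rho = mohns_rho(full_run, five_peels)
print(round(rho, 3))
```

A positive value in this sketch indicates that the reduced-data assessments tend to sit above the full-data assessment in their terminal years.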
In this study, we aim to understand how model misspecification in three processes modelled in integrated age-structured models (natural mortality, growth, and selectivity) may generate retrospective patterns, and how these patterns vary across life histories and exploitation patterns. We propose ranges of Mohn’s ρ that can be used as guidelines to determine whether an assessment exhibits retrospective patterns that are substantial enough to be of concern to assessment scientists and managers based on life history, and explore methods to determine the cause of retrospective patterns.
Methods
Overview

Model description
Life history, fishery, and modelling parameters used for each life history type (cod-like, flatfish-like, and sardine-like).
Parameter (units) | Symbol | Estimated | Cod | Flatfish | Sardine |
---|---|---|---|---|---|
Base parameters | | | | | |
Natural mortality (year−1) | M | No | 0.2 | 0.2 | 0.4 |
Reference age (year) | a1 | No | 0.5 | 0.5 | 0.5 |
Maximum age (year) | Amax | No | 25 | 25 | 15 |
Biology | | | | | |
Length at a1 (cm) | L1 | Yes | 20 | 12.7 | 10 |
Length at Amax (cm) | L∞ | Yes | 132 | 47.4 | 25 |
Growth rate (year−1) | K | Yes | 0.2 | 0.35 | 0.4 |
CV L1 | CV1 | Yes | 0.1 | 0.2 | 0.14 |
CV L∞ | CV∞ | Yes | 0.1 | 0.2 | 0.05 |
Length-weight scaling (kg cm−3) | α | No | 6.8e−6 | 1.0e−5 | 1.7e−5 |
Allometric factor | β | No | 3.1 | 3.0 | 2.9 |
Maturity slope (cm−1) | Ω1 | No | −0.27 | −0.42 | −0.90 |
Length at 50% maturity (cm) | Ω2 | No | 38.2 | 28.9 | 15.9 |
Recruitment | | | | | |
Log mean virgin recruitment | ln R0 | Yes | 18.7 | 10.5 | 16 |
Steepness | h | No | 0.65 | 0.76 | 0.59 |
Recruitment variability | σr | No | 0.4 | 0.7 | 0.73 |
Selectivity | | | | | |
Mean fishery length-at-50% selectivity (cm) | S1 | Yes | 38.2 | 28.9 | 15.9 |
Fishery length selectivity slope (cm) | S2 | Yes | 10.6 | 7 | 3.3 |
Survey length-at-50% selectivity (cm) | S3 | Yes | 30.5 | 23.1 | 12.7 |
Survey length selectivity slope (cm) | S4 | Yes | 10.6 | 7 | 3.3 |
Log-catchability | ln q | Yes | 0 | 0 | 0 |
Survey observation error s.d. | σS | No | 0.2 | 0.2 | 0.2 |
Time-varying parameters (final values) | | | | | |
K (increase) | – | No | 0.2731 | 0.4871 | 0.548 |
K (decrease) | – | No | 0.1578 | 0.2736 | 0.3149 |
S1 (increase) | – | No | 47.745 | 36.125 | 19.87 |
S1 (decrease) | – | No | 28.655 | 21.675 | 11.93 |
M (increase) | – | No | 0.255 | 0.3 | 0.57 |
M (decrease) | – | No | 0.165 | 0.14 | 0.29 |

Extent of data available to the EM. Both catches and the survey have associated length- and age-composition data. The survey index was generated as lognormal samples with a CV of 0.2. The survey occurs every 2 years for 20 years.
The EM estimates virgin recruitment (R0), deviations in recruitment about the stock–recruitment curve, fishery and survey selectivity parameters, survey catchability (q), and somatic growth parameters (L∞, K). Natural mortality (M), the steepness of the stock–recruitment relationship (h), and the extent of variation about the stock–recruitment relationship (σR) were assumed to be known without error.
Experimental design
Life history: Populations were simulated for three general life history patterns: North Sea cod (“Cod”; Gadus morhua; OM parameter values supplied by R. Methot, NMFS, NOAA, pers. comm.), yellowtail flounder (“Flatfish”; Limanda ferruginea; OM parameter values from Legault et al., 2012), and Pacific sardine (“Sardine”; Sardinops sagax caeruleus; OM parameter values taken from Hill et al., 2012). The Ricker stock–recruitment function used by Hill et al. (2012) for the sardine-like life history was replaced by a Beverton–Holt stock–recruitment function with a steepness specific to sardine (Myers et al., 1999) to facilitate comparisons among the three life history types. Parameters used for each life history type are shown in Table 1.
Time-varying process: Growth, fishery selectivity, and M were allowed to vary over time (individually) in the OM in an attempt to induce retrospective patterns. Growth was changed so that the age at which 95% of individuals reach L∞ increased or decreased by 25% from the initial values (Figure 4a). Selectivity was changed such that the length at which 50% of individuals were selected increased or decreased by 25% from the initial values (Figure 4b). M was changed such that the new maximum sustainable yield (MSY) was ±25% of the original MSY (Figure 4c). Table 1 shows base and modified values for these parameters.
Fishing mortality pattern: Three typical patterns in F were used to generate the catches: constant F, equal to the value that produced 0.95 MSY on the left limb of the yield vs. F curve (Figure 4d); a steadily increasing trend to the F corresponding to 0.95 MSY on the right limb of the yield vs. F curve (Figure 4e); and a “fish down and recovery”, i.e. a 60-year linear increase to the F corresponding to 0.95 MSY (right limb), followed by a 15-year linear decrease to the F corresponding to 0.95 MSY (left limb; Figure 4f). For all scenarios, years 1 through 25 had zero fishing, and acted as a burn-in period.
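The three fishing mortality patterns can be sketched as simple annual F vectors. This is only an illustration: the values passed as `f_left` and `f_right` (the F values giving 0.95 MSY on the left and right limbs of the yield curve) are hypothetical placeholders, the 75 fished years for the constant and increasing patterns are assumed so that all patterns match the 60 + 15 years of the “fish down and recovery” case, and the ramps are assumed to start from zero.

```python
def f_trajectory(pattern, f_left, f_right, burn_in=25, up_years=60, down_years=15):
    """Return annual fishing mortality: an unfished burn-in followed by one pattern."""
    fished_years = up_years + down_years  # assumed common length for all patterns
    if pattern == "constant":
        fished = [f_left] * fished_years
    elif pattern == "increasing":
        # steady linear increase up to the right-limb F
        fished = [f_right * (t + 1) / fished_years for t in range(fished_years)]
    elif pattern == "fish_down_recovery":
        # linear increase to the right-limb F, then linear decrease to the left-limb F
        up = [f_right * (t + 1) / up_years for t in range(up_years)]
        down = [f_right + (f_left - f_right) * (t + 1) / down_years
                for t in range(down_years)]
        fished = up + down
    else:
        raise ValueError(f"unknown pattern: {pattern}")
    return [0.0] * burn_in + fished


# Hypothetical F values, for illustration only.
two_way = f_trajectory("fish_down_recovery", f_left=0.1, f_right=0.6)
print(len(two_way), two_way[24], round(max(two_way), 3), round(two_way[-1], 3))
```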
Patterns in time-varying processes: Three aspects of each change were tested for their ability to induce retrospective patterns (Figure 4g): the onset of the change, which occurred either 10 years (“Recent”) or 25 years (“Old”) before the last year of the simulation; the pattern of the change, which was either “sudden” or “gradual”; and the direction of the change, which was either “positive” or “negative”. Table 2 details the eight time-varying scenarios.
Scenario | Timing | Direction | Pattern |
---|---|---|---|
Base | None | None | None |
1 | Old | Positive | Gradual |
2 | Recent | Positive | Gradual |
3 | Old | Negative | Gradual |
4 | Recent | Negative | Gradual |
5 | Old | Positive | Sudden |
6 | Recent | Positive | Sudden |
7 | Old | Negative | Sudden |
8 | Recent | Negative | Sudden |

Experimental design showing the time-varying processes (a–c), fishing patterns (d–f), and time-varying patterns for the processes (g). For the patterns of time variance, only some cases are shown to reduce clutter (solid and dotted lines show old and recent timing, respectively; black and grey lines show gradual and sudden patterns, respectively). See text for further explanation.
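A minimal sketch of how one time-varying parameter path could be built from the timing and pattern options is given below. It assumes that a “gradual” change is linear and reaches the final value in the last simulated year, and that the simulation spans 100 years; neither detail is stated explicitly here, and the function name `parameter_path` is hypothetical. The example values are the base and increased K for cod from Table 1.

```python
def parameter_path(base, final, n_years, timing, pattern):
    """Annual values of a parameter that changes from `base` to `final`."""
    years_before_end = {"recent": 10, "old": 25}[timing]
    onset = n_years - years_before_end  # index of the first affected year
    path = []
    for t in range(n_years):
        if t < onset:
            path.append(base)
        elif pattern == "sudden":
            path.append(final)
        elif pattern == "gradual":
            # assumed linear ramp that reaches `final` in the last year
            fraction = (t - onset + 1) / years_before_end
            path.append(base + (final - base) * fraction)
        else:
            raise ValueError(f"unknown pattern: {pattern}")
    return path


# Cod growth rate K: base 0.2, increased final value 0.2731 (Table 1).
sudden_recent = parameter_path(0.2, 0.2731, 100, "recent", "sudden")
gradual_old = parameter_path(0.2, 0.2731, 100, "old", "gradual")
print(sudden_recent[89], sudden_recent[90], round(gradual_old[-1], 4))
```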
A full factorial design with the 216 possible combinations of all four factors was performed. Additionally, nine “base” cases (three life histories × three fishing patterns) with no time-varying processes were also evaluated for comparison. Fifty simulations were performed for each scenario. The EM was run six times for each of the 50 simulations: once with all available data and five times with one fewer year of data each time (model runs with fewer data are referred to as “peels”).
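The scenario counts in this design can be checked with a short enumeration. The label strings are placeholders chosen for illustration; the counts follow from three life histories, three time-varying processes, three F patterns, and the eight time-varying scenarios (two timings × two directions × two patterns).

```python
from itertools import product

life_histories = ["cod", "flatfish", "sardine"]
processes = ["growth", "selectivity", "natural_mortality"]
f_patterns = ["constant", "increasing", "fish_down_recovery"]
timings = ["old", "recent"]
directions = ["positive", "negative"]
patterns = ["gradual", "sudden"]

# Full factorial: every combination of the four factors
# (the fourth factor being timing x direction x pattern).
scenarios = list(product(life_histories, processes, f_patterns,
                         timings, directions, patterns))

# Base cases have no time-varying process.
base_cases = list(product(life_histories, f_patterns))

simulations_per_scenario = 50
em_runs_per_simulation = 6  # one full-data run plus five peels

total_em_runs = ((len(scenarios) + len(base_cases))
                 * simulations_per_scenario * em_runs_per_simulation)
print(len(scenarios), len(base_cases), total_em_runs)
```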
Convergence criteria
Convergence was determined using the maximum gradient from the minimization procedure. Only simulations with gradients <0.1 were accepted. To ensure that all scenarios had the same number of simulations, new iterations using new randomly generated numbers were run until the desired number of simulations was obtained. Given time constraints, inverting the Hessian matrix (a common test of convergence) was not an option.
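The acceptance rule described above amounts to a retry loop: keep drawing new random replicates until the required number pass the gradient check. The sketch below uses a stand-in `fake_fit` function that merely draws a random “maximum gradient”; it is a hypothetical placeholder for a real estimation run, not the procedure used in the study.

```python
import random


def collect_converged(fit_once, n_required=50, max_gradient=0.1, max_attempts=10_000):
    """Run replicates with new seeds until n_required have max gradient < threshold."""
    accepted = []
    for seed in range(max_attempts):
        result = fit_once(seed)
        if result["max_gradient"] < max_gradient:
            accepted.append(result)
        if len(accepted) == n_required:
            return accepted
    raise RuntimeError("too few converged replicates within max_attempts")


def fake_fit(seed):
    """Hypothetical stand-in for one estimation run; returns a random gradient."""
    rng = random.Random(seed)
    return {"seed": seed, "max_gradient": rng.expovariate(10.0)}


converged = collect_converged(fake_fit)
print(len(converged), max(r["max_gradient"] for r in converged) < 0.1)
```

Replacing failed replicates in this way keeps the number of simulations equal across scenarios, at the cost of conditioning the results on convergence.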
Performance metrics and analyses
To compare different scenarios quantitatively, a fixed-effects analysis of variance (ANOVA) was used to evaluate the proportion of the variance of the values of Mohn’s ρ for spawning biomass explained by each factor. Separate analyses were conducted by life history type because differences among life histories are large and may dampen the effect of individual factors. Values for Mohn’s ρ were also regressed against the relative error of biomass in the terminal year from the last (most data-rich) assessment to determine whether Mohn’s ρ gives any information about the bias in model estimates. Note that these analyses were not conducted to evaluate “statistical significance”, but rather as a way to characterize the output of a complicated set of simulations.
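To make the “proportion of variance explained” idea concrete, the sketch below computes, for a single factor, the between-group sum of squares divided by the total sum of squares. It is a simplified illustration with toy data: it handles one main effect at a time (which matches a main-effect sum of squares only in a balanced design) and does not reproduce the interaction terms of the full ANOVA.

```python
def percent_variance_explained(values, labels):
    """Between-group sum of squares as a percentage of the total sum of squares."""
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)

    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    ss_factor = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                    for g in groups.values())
    return 100.0 * ss_factor / ss_total


# Toy values of Mohn's rho; the factor "direction" explains all of the variation,
# while the factor "timing" explains none of it.
rho_values = [1.0, 1.0, 3.0, 3.0]
direction = ["negative", "negative", "positive", "positive"]
timing = ["old", "recent", "old", "recent"]
print(percent_variance_explained(rho_values, direction),
      percent_variance_explained(rho_values, timing))
```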
Results

Sample results showing retrospective patterns for cod for fishing mortality “fish down and recovery”. Growth is time varying in the OM, with results for scenario 2 (a; recent, negative, gradual change) and scenario 3 (b; old, positive, gradual change).

Distribution of Mohn’s ρ (upper panels) and κ (lower panels) for spawning biomass, for cod (a and d), flatfish (b and e), and sardine (c and f) when growth is time varying. The bars on the lower x-axis show the percentage of times κ was <0. Colours are repeated three times for the three F patterns (constant, increasing, and up and down, respectively). See Table 2 for scenario IDs.

Distribution of Mohn’s ρ (upper panels) and κ (lower panels) for spawning biomass, for cod (a and d), flatfish (b and e), and sardine (c and f) when selectivity is time varying. The bars on the lower x-axis show the percentage of times κ was <0. Colours are repeated three times for the three F patterns (constant, increasing, and up and down, respectively). See Table 2 for scenario IDs.

Distribution of Mohn’s ρ (upper panels) and κ (lower panels) for spawning biomass, for cod (a and d), flatfish (b and e), and sardine (c and f) when M is time varying. The bars on the lower x-axis show the percentage of times κ was <0. Colours are repeated three times for the three F patterns (constant, increasing, and up and down, respectively). See Table 2 for scenario IDs.
Effect of time-varying factors on Mohn’s ρ statistic
For all life history types, a change to faster growth, selection at larger sizes, and lower M resulted in retrospective patterns with negative Mohn’s ρ for estimates of biomass, while a change to slower growth, selection at smaller sizes, and higher M resulted in positive Mohn’s ρ values (Figures 6–8). A positive value for the ρ statistic means that the quantity being evaluated is consistently being overestimated (when compared with the estimate from the full time-series) and is potentially most problematic in terms of sustainability. Fishing pattern had only a minor influence on how time-varying growth affected Mohn’s ρ (Table 3). Timing and pattern (i.e. abrupt vs. gradual) of the changes did not have a noteworthy influence.
ANOVA results for Mohn’s ρ for biomass, showing the percentage of variance explained by each variable (values are sum of sq./total sum of sq.).
Variable | Cod | Flatfish | Sardine |
---|---|---|---|
Process | 2.799 | 2.648 | 1.879 |
F | 0.527 | 0.114 | 0.413 |
Timing | 0.014 | 0.000 | 0.030 |
Direction | 0.163 | 0.376 | 0.031 |
Pattern | 0.032 | 0.034 | 0.086 |
Process × timing | 0.001 | 0.002 | 0.027 |
Process × direction | 5.065 | 2.464 | 0.066 |
Process × pattern | 0.517 | 0.181 | 0.137 |
Residuals | 90.881 | 94.18 | 97.331 |
Other, more complex, patterns also emerged, and some factors may be more important for some life history types than others (Table 3). Comparisons of which variable (M, selectivity, growth) had the largest impact on Mohn’s ρ should be interpreted with caution because the changes over time in these quantities are not necessarily comparable. Nevertheless, for cod, the magnitude of Mohn’s ρ was larger for changes in growth (Figure 6a) and smaller for changes in selectivity (Figure 7a). Flatfish showed the opposite pattern: larger impacts for changes in selectivity (Figure 7b) and smaller impacts for changes in growth (Figure 6b). All three life histories showed the largest (positive or negative) Mohn’s ρ values when natural mortality changed (Figure 8a–c); sardine showed values as large as those under changing natural mortality for all three factors studied (Figures 6c, 7c, and 8c).
Retrospective patterns for estimates of F were similar to those for estimates of spawning biomass, but with the opposite sign (Supplementary Figures S1–S3; Supplementary Table S1). However, the relationship between Mohn’s ρ for biomass and that for F was not linear (Supplementary Figure S4). The curvature in the relationship (Supplementary Figure S4) is explained by the lognormal distribution of biomass, and the exponential relationship between biomass and F.
Convergence/divergence and bias
A positive value of the convergence/divergence index, κ, means that a retrospective pattern is divergent, i.e. the average absolute bias in the last year of a peel is larger than the absolute bias in the second-to-last year of the peel (Figure 1d). The κ statistic was mostly positive for growth (Figure 6d and e) and natural mortality (Figure 8d and e) for the cod and flatfish life histories, and always positive for sardine (Figures 6f, 7f, and 8f). However, cod and flatfish showed negative values for κ about half of the time for selectivity (Figure 7d and e). The index κ behaved similarly for retrospective patterns in F as it did for spawning biomass.

Relationship between Mohn’s ρ and relative error in spawning biomass in the terminal year from the last (most data-rich) assessment.
Influence of specific time-varying processes in the OM on estimates of parameters

Ordination of simulation results via NMDS for cod and flatfish. Grey dots are individual simulations; arrows indicate gradients of significant (i.e. permutation p-values <0.01) response variables (relative error and measures of bias). The length of the arrow reflects the strength of the gradient. Grey text indicates the location of factors changed in scenarios that had permutation p-values <0.01. “FCons”, “FUp”, and “FTwoWay” represent constant F, increasing F, and fish down and recovery F patterns, respectively.
Correlations between scenarios and the ordination scores for the responses were significant, but effect sizes (i.e. goodness of fit) were very small, which suggests the observed correlations may be an artefact of the large sample size resulting from a repeated simulation study. Still, some generalizations might be drawn from the NMDS. Figure 10 shows the scenarios and responses for which the p-values for the goodness-of-fit statistics were <0.01. Larger Mohn’s ρ and κ for biomass were more associated with a steadily increasing F pattern and gradual changes in population processes. Larger Mohn’s ρ for F and error in biomass were associated with sudden, recent changes in population processes and were related to larger relative errors in selectivity parameters in both the fishery and the survey. Relative errors in growth parameters are not all shown, but have gradients and directions similar to those of K and were associated with a “fish down with recovery” F pattern with recent changes in population processes.
Discussion
We have shown that retrospective patterns can be induced in integrated, age-structured models, when time-varying processes are not accounted for in an assessment. Model misspecification had been identified in the past as a cause of retrospective patterns (Mohn, 1999; Cadigan and Farrell, 2005; NOAA, 2009), but current literature is still unclear about how large a retrospective pattern has to be for it to be of concern. It is generally assumed that retrospective patterns cannot be generated from purely random processes (NOAA, 2009). Although Mohn’s ρ for simulated data without underlying time-varying processes was roughly unbiased, convincing-looking retrospective patterns were generated in some cases from these data (Supplementary Figure S5). Such patterns tended to have smaller Mohn’s ρ statistics than those produced by model misspecification, so Mohn’s ρ should always be used (instead of visual identification) for identifying retrospective patterns that warrant corrective measures such as allowing for time-varying parameters in the stock assessment.
All the processes considered led to retrospective patterns, even if certain combinations of processes did not produce them. Because we explored processes related to biology, fishery, and population dynamics (growth rate, selectivity, and natural mortality, respectively), this suggests that retrospective patterns can arise whenever a stock assessment treats as constant any process that is actually time-varying. Although the absence of a retrospective pattern may suggest that a stock assessment is correctly specified, this must not be taken as definitive confirmation (in a few cases, misspecified models did not show retrospective patterns).
Retrospective patterns and the magnitude of Mohn’s ρ are life history dependent, and could be case-specific (Mohn, 1999; NOAA, 2009). We found that species with higher variability in their dynamics showed higher variability and magnitudes in Mohn’s ρ, but the general behaviour of this statistic was robust to life history type. Retrospective patterns did not change considerably with fishing history or depletion, suggesting that these factors are not important for the magnitude or sign of the Mohn’s ρ statistic. However, we only explored catch series that were perfectly known; imperfectly known catch histories can also generate retrospective patterns (Parma, 1993; Mohn, 1999; NOAA, 2009), and should be considered as an option when attempting to identify responsible processes behind retrospective biases. Larger changes in time-varying factors may also result in larger values of Mohn’s ρ, as has been observed in some New England assessments (Chris Legault, NMFS, NOAA, pers. comm.), but this was not addressed directly in this study.
Given that the variability of Mohn’s ρ depends on life history, and that the statistic appears insensitive to F, we propose the following rule of thumb when determining whether a retrospective pattern should be addressed explicitly: values of Mohn’s ρ higher than 0.20 or lower than −0.15 for longer-lived species (upper and lower bounds of the 90% simulation intervals for the flatfish base case), or higher than 0.30 or lower than −0.22 for shorter-lived species (upper and lower bounds of the 90% simulation intervals for the sardine base case) should be cause for concern and taken as indicators of retrospective patterns. However, Mohn’s ρ values smaller than those proposed should not be taken as confirmation that a given assessment does not present a retrospective pattern, and the choice of 90% means that a “false positive” will arise 10% of the time. In both cases, model misspecification would be correctly detected more than half the time.
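The proposed thresholds can be written as a small lookup. The function and category names below are hypothetical; the numerical bounds are those given above. As noted, a value inside the bounds does not confirm the absence of a retrospective pattern.

```python
# (lower bound, upper bound) of Mohn's rho from the rule of thumb above
RHO_BOUNDS = {
    "longer_lived": (-0.15, 0.20),   # based on the flatfish base case
    "shorter_lived": (-0.22, 0.30),  # based on the sardine base case
}


def flags_retrospective_pattern(rho, life_history):
    """True if rho falls outside the rule-of-thumb bounds for this life history."""
    lower, upper = RHO_BOUNDS[life_history]
    return rho < lower or rho > upper


print(flags_retrospective_pattern(0.25, "longer_lived"))   # True
print(flags_retrospective_pattern(0.25, "shorter_lived"))  # False
```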
This rule of thumb is presented with several caveats. The survey CV used in these simulations is small (0.2), which may affect the magnitude of retrospective patterns. We ran additional simulations for a subset of scenarios (cod with time-varying growth and constant harvest), with increasing survey uncertainty (CV 0.2, 0.4, 0.6, and 0.8) and estimating the hessian. Increasing survey uncertainty increased the variance of Mohn’s ρ, but not its median (Supplementary Figure S6a). This increase in variance means that a “false positive” would arise 30% of the time when survey CV is 0.8. NOAA (2009) proposes a rule of thumb that considers estimation uncertainty to assess whether a retrospective pattern is large enough. Following this suggestion, we divided Mohn’s ρ by the σ of the terminal estimate of biomass. This stabilized the variance of the index, but it also resulted in a decreasing absolute value of its median as survey uncertainty increased (Supplementary Figure S6b). This is undesirable, thus we consider that using the actual value of Mohn’s ρ to be more appropriate for a rule of thumb. Also, we only used simulations from three representative life histories, and these intervals could vary for other life history types, such as large pelagic fish (e.g. tunas) or slow-growing, long-lived demersal species (e.g. rockfish).
The insensitivity of Mohn’s ρ to the process which is varying over time (as seen through the small effect sizes of the different scenarios in the NMDS and the ANOVA) makes it a reliable statistic to assess whether retrospective patterns arise from time-varying processes. Mohn’s ρ is useful in determining the direction of the change of the true time-varying process that may be causing the retrospective pattern. However, the magnitude of Mohn’s ρ is not related to bias in biomass or F, and should not be used to assess how far an assessment is from the truth.
For management purposes, some retrospective patterns are more concerning than others. Patterns that show a positive Mohn’s ρ and positive κ for biomass (negative Mohn’s ρ for F) are the most concerning in terms of stock conservation, as they imply consistent overestimation of biomass and the highest risk for overfishing. Of the cases studied here, managers should be particularly alert to changes to slower growth or higher M, which resulted in the most positive Mohn’s ρ and consequently, overestimation of biomass.
We showed that retrospective patterns arising from time-varying processes can look very similar and that NMDS was unable to associate any population process with gradients in bias statistics or relative error in estimated parameters. However, retrospective patterns in F are associated with higher relative errors in selectivity parameters. Retrospective patterns in F may be cause to examine selectivity more closely, for models with similar formulations to the ones included in our analysis. When retrospective patterns are observed in a stock assessment, they are often corrected by introducing estimation of a time-varying parameter (usually selectivity, M or q; Fu et al., 2001; Legault et al., 2011, 2012; Martell and Stewart, 2013) or applying a retrospective bias adjustment (TRAC, 2012; Deroba, 2014). The underlying causes for such patterns are commonly unknown (Fu et al., 2001; ICES, 2008), but moving window analyses have proven promising in identifying the timing of a change which leads to the retrospective pattern (ICES, 2008). The risk of introducing further misspecification in an assessment can be high if these corrections are done based only on the presence of a retrospective pattern. It is uncertain whether a misspecification of time-varying parameters used to correct a retrospective pattern introduces more biases into the assessment. In addition, adding time-varying parameters could lead to over-parameterized models and overall to poorer performance, so further research in this area is needed.
This study is a first attempt to systematically characterize retrospective patterns, and has caveats that should be considered. The κ-statistic is not available for real-life stock assessments, but we believe that it can be useful for further simulation studies. This statistic can be of particular interest when developing management strategy evaluations that consider retrospective patterns, as it informs whether these patterns are convergent or divergent. Whether a retrospective pattern is convergent or divergent has important implications for the conservation of the resource, as was discussed earlier. We studied a limited range of variables, and the response of retrospective patterns to changes in other parameters (e.g. q, L∞) should be explored systematically. Also, we only changed one process at a time, and did not explore the interaction between multiple changes in different parameters, or more complex patterns of time variance such as the pulse changes in catchability explored by NOAA (2009). We did not explicitly search for methods to identify the source of a retrospective pattern. Last, we only explored the effects of retrospective patterns on stock assessments, but did not evaluate how much risk they would introduce when managing a stock. For example, risk would depend on the status of the stock. If stock biomass is very high, a retrospective pattern might not be as problematic as it would be if biomass were very low. Alternatively, model misspecification could affect the setting of reference points used in management. Studies such as management strategy evaluations (Smith et al., 1999) are thus needed if these risks are to be characterized.
Supplementary data
Supplementary material is available at the ICESJMS online version of the manuscript.
Acknowledgements
The authors thank Chris Legault and one anonymous reviewer for their invaluable comments and suggestions. CSS was supported by a Washington SeaGrant fellowship. RRL was supported by Conicyt. MLM was funded by Exxon Valdez Oil Spill Trustee Council, grant 13120111-Q. AEP, KFJ, KO, CCM, and CRM were partially funded by the Joint Institute for the Study of the Atmosphere and Ocean (JISAO) under NOAA Cooperative Agreements NA10OAR4320148, Contribution No. 2194, respectively. KFJ was partially supported for this work under a World Conference on Stock Assessment Methods travel bursary. SCA was supported by Fulbright Canada and NSERC. Partial support for this research came from a Eunice Kennedy Shriver National Institute of Child Health and Human Development research infrastructure grant, R24 HD042828, to the Center for Studies in Demography and Ecology at the University of Washington. This research addresses the good practices in stock assessment modelling program of the Center for the Advancement of Population Assessment Methodology (CAPAM).
References
Author notes
Present address: National Marine Fisheries Service, Alaska Fisheries Science Center, National Oceanic and Atmospheric Administration, 7600 Sand Point Way NE, Seattle, WA 98115, USA.
Handling editor: Mark Maunder