Calibr ating ecosyst em models t o suppor t ecosyst em-based manag ement of marine syst ems

Ecosystem models, such as Ecopath with Ecosim (EwE), provide a platform to simulate intricate policy scenarios where multiple species, pressures


Introduction
There is a growing demand for ecosystem models that can support decision-makers in their commitments to implement ecosystem-based management of fisheries and other marine resources (Craig and Link 2023, Karp et al. 2023, Rodriguez Perez et al. 2023 ).Ecosystem models facilitate simulations of multiple species, interspecific interactions, top-down and bottom-up processes, and how they are impacted by the prevailing environmental and anthropogenic pressures.Although ecosystem models have been around for more than four decades (Andersen and Ursin 1978 ;Laevasto and Larkin 1981 ), only recently, and on rare occasions, has their management utility been realized in real-world settings (Howell et al. 2021 , Craig andLink 2023 ).
The complexity and associated uncertainty of ecosystem models have historically been seen to reduce their suitability for tactical advice (Hilborn and Walters 1992, Plagányi et al. 2014, Townsend et al. 2008 ), while the structural inertia of traditional management systems and regulatory frameworks has hampered the operational rollout of ecosystembased management (EBM) and ecosystem models (Christie 2005, Marshak et al. 2017, Craig and Link 2023, Karp et al. 2023 ).This has changed, however, as the importance of going beyond tactical management and towards a more strate-gic and trade-off-orientated approach has been made clear through initiatives such as the EU Landing Obligation (e.g.Celi ć et al. 2018 ) and the development of alternative energy with the associated consequences for spatial planning (e.g.Serpetti et al. 2021 ).
The development and application of ecosystem models are constrained by technical challenges such as data gaps, imprecise observations, and uncertainty in input parameters (Steenbeek et al. 2021, Karp et al. 2023 ).Parameter estimation in ecosystem models can be subject to bias and limited transparency due to the lack of representative data, large parameter space, long run-times, and the absence of a standardized calibration approach.Without appropriate guidance, model calibration, through which simulations are brought into alignment with observations, can introduce considerable uncertainty and bias into model predictions and stimulate the development of divergent calibration strategies that vary across model frameworks and even individual modelers (Pethybridge et al. 2019 ).
Decision-makers are increasingly exploring how ecosystem models can inform management and policy, particularly in the context of cumulative pressure assessment in a rapidly changing environment and multispecies trade-offs.Justifiable and reproducible approaches for calibrating ecosystem models are therefore needed to ensure models are capable of reproducing retrospective trends and have predictive power.The calibration step is essential for establishing the form predator impacts on prey take over a range of predator density.The form of these relationships sets the scope for trophic interactiondependent changes that have occurred and could potentially occur as a result of management decisions and environmental changes.Regardless of the modeling platform, careful consideration is needed at this step.
Calibration procedures for ecosystem models vary depending on model complexity and how they represent different components and processes of the ecosystem.Existing calibration procedures involve manual and iterative tuning approaches to fit to historical observations (Pethybridge et al. 2019 ) as well as the implementation of automatic calibration routines (e.g.Scott et al. 2016 ).As models increase in complexity, with the addition of more species or biophysical processes, the number of parameters that could be tuned generally grows, increasing the potential for overfitting with associated poor predictive power.
Ecopath with Ecosim (EwE; Christensen and Walters 2004 ) is the most widely used ecosystem modeling framework, used to explore fishery management options in the context of their ecosystem impacts (e.g.Mackinson et al. 2018 ), and increasingly to simulate the often cumulative impacts of climate change (e.g.Serpetti et al. 2017, Stock et al. 2023 ), Marine Protected Areas (MPAs; e.g.Püts et al. 2023 ), bioaccumulation of contaminants (e.g.Tierney et al. 2018 ), and infrastructure development such as offshore wind (e.g.Serpetti et al. 2021 ).EwE is being increasingly applied to inform intergovernmental advice, particularly within the International Council for the Exploration of the Sea (ICES).EwE has been used by ICES to enhance single species catch advice with ecosystem information (ICES 2020 , Bentley et al. 2021 ), add qualitative food web descriptions to ecosystem overviews (e.g.ICES 2022 ), and explore options to incorporate ecosystem information into Integrated Ecosystem Assessments (IEA; ICES 2018 ).Continuing progress is also being made through ICES using EwE to quantify the impacts of pressures on ecosystem services (ICES 2023 ), explore the flow and sequestration of carbon, and provide quantitative food web indicators for ICES Ecosystem Overviews (ICES 2019a ).
EwE includes a static mass-balance model of the marine system, Ecopath, that serves as the initial conditions for timedynamic simulations in Ecosim (Walters et al. 1997 ).Ecosim predictions are sensitive to the Ecopath input parameters (usually biomass, production and consumption rates, diet, and fishery removals) as well as the predator −prey "vulnerability multipliers," which can be manually set or estimated in Ecosim during model calibration to determine the functional form of predator consumption rates as predator biomass changes.The impacts of vulnerability multipliers are conditional on Ecopath inputs, which set the baseline predation mortalities.The concept underpinning vulnerability multipliers is derived from foraging arena theory (Ahrens et al. 2012 ).In Ecosim, vulnerability multipliers have implications for stock-recruit dynamics, density dependence and compensation, carrying capacity, stock resiliency, interspecific interactions, and ecosystem energy flow (Walters and Martell 2004 ).However, the effect of different calibration strategies on the estimation of vulnerability multipliers in Ecosim and emergent stock productivity estimates has not yet been demonstrated, nor do we under-stand in a comprehensive way how sensitive model outputs are to different approaches and how this may influence derived advice.
The goals of this paper are to shed additional light on the Ecosim vulnerability multipliers and the ecological theory behind them, describe different Ecosim calibration approaches, and demonstrate the sensitivity of model outputs and management advice to those approaches.We provide an updated overview of foraging arena theory and its implementation in Ecosim, followed by a discussion of the life history and ecological basis for initializing vulnerabilities and the existing approaches to estimation in Ecosim.Case studies are provided to demonstrate the effect of model fitting strategies on a conceptual model as well as two ecosystem models with drastically different structures that have each been used in realworld management applications.It is the overarching aim of this paper to increase awareness and transparency in the EwE calibration approach and provide recommendations for the estimation of vulnerability multipliers to support the growing application of EwE as a decision support tool.

Ecosim vulnerability multipliers and their estimation
Behavioral, morphological, and physical mechanisms limit the rates at which prey encounter predators (e.g.Walters andJuanes 1993 , Green andCôté 2014 ).Behaviors such as vertical migration, schooling, and hiding reduce the individual's probability of capture (Gilwicz 1986, Jolles et al. 2022 ), while physical processes such as mixing or sinking in the water column can limit the spatial overlap of predator and prey (Morison et al. 2019 ).Simple mass-action predictors (such as the Lotka-V olterra model; W angersky 1978 ) do not account for the effect of risk-sensitive foraging behavior on prey availability and assume that encounters between predators and prey are entirely random and linear.In such models, biomass and encounter rates are the only factors limiting consumption.
Ecosim predictions of consumption based on a simple massaction model have been modified to account for spatial and temporal constraints as a result of risk-sensitive foraging behavior (Walters et al. 1997, 2000, Ahrens et al. 2012 ).Prey biomass pools in Ecosim are dynamically divided into vulnerable and invulnerable components, which imply behavioral or physical mechanisms that limit the rate at which prey become vulnerable to predation ( Fig. 1 ; Walters et al. 1997 ).The transfer rates between these components determine the amount of prey available to a predator and thus the degree to which a change in predator biomass will impact predation mortality and prey biomass.
These transfer rates, or "vulnerabilities," ( v i j ) as they are more widely known, influence predator and prey biomasses by regulating the consumption rates ( Q i j ) of a predator j on its prey group i , as shown in Equation (1): Equation ( 1) has grown over time in response to user requests for new functionality being added to impact consumption.Here, a i j is the effective search rate of predator j feeding on prey i , which directly influences the time spent feeding and, thus, the risk of predation.The terms T i and T j are the relative feeding times of prey i and predator j, which can be adjusted via a feeding time adjustment rate: larger values represent a Downloaded from https://academic.oup.com/icesjms/article/81/2/260/7587678 by Centre Mediterrani d'Investigacions Marines i Ambientals -CSIC user on 13 May 2024 Figure 1 Simulation of flow between available ( V i) and unavailable ( B i − V i) prey biomass in Ecosim.aij is the search rate of prey i by predator j, and v is the e x change rate between the vulnerable and unvulnerable states.Fast equilibrium between the two prey states implies V i = v B i/(2v + a B j). Based on Walters et al. (1997) .more rapid adjustment of foraging time and reduced predation risk.S i j are the seasonal and long-term forcing effects, which can be defined by the user to modify consumption in direct relation to environmental conditions.M i j are the mediation forcing effects, which users can construct to account for changes in consumption that may have been indirectly caused by the behaviors of other groups or external factors.For example, small pelagic fish may become more vulnerable to predation by seabirds if they are driven to the surface by mediators such as large pelagic predators (Dill et al. 2003 ).Handling time, D j , limits the rate of consumption and represents the notion that predators have limited time available for foraging due to the time spent in pursuit of prey or manipulating/ingesting captured prey (Holling 1959, Schoener 1971 ).The size of the foraging arena is denoted as A .Finally, f ( En v t ) represents an environmental response function that restricts the size of the foraging arena to account for external drivers, which may change over time (Christensen et al. 2014 ).This function has principally been used to make consumption rates dynamic with time and space in response to temperature and habitat (e.g.Christensen et al. 2015, Bentley et al. 2017, Serpetti et al. 2017, Corrales et al. 2018 ).
Users cannot adjust vulnerability exchange rates ( v i j ) directly.Instead, users interact with these rates via vulnerability multipliers ( k i j ), which can be more easily interpreted as the maximum increase in predation mortality rate that a predator can exert on a prey if the predator were to grow to its carrying capacity.The increase is relative to baseline Ecopath predation mortality rates ( M 2 , where M 2 = Q ji / B i ) .The vulnerability exchange rates ( v i j ) are then set to the vulnerability multiplier ( k i j ) multiplied by the baseline predation mortality ( M 2 ), i.e. v i j = k i j M 2 .Multipliers can range from one to infinity with two as the default value.When v i j results in a low rate of exchange the relationship between predator density and mortality on prey saturates quickly.As k i j is increased, resulting in an increase in v i j , this relationship straightens.
The vulnerability multipliers within the Ecopath baseline establish the shape of the predator −prey relationship and this shape does not change when running Ecosim across years: Ecosim dynamically handles the consequences of changes in predator and prey abundance based on the baseline situation, including changes in carrying capacity over time.The default value of two assumes that the predation mortality rate can double at the highest, while a value of one means that the predator is at its carrying capacity, which by definition means that it fully utilizes its prey, so it cannot increase the predation mortality its causing on the prey.
High vulnerability multipliers imply strong top-down control and weak bottom-up control.With top-down control, a doubling of predator abundance may result in close to a doubling in the predation mortality they are causing, which can occur when a predator is far from its carrying capacity.Of course, predator carrying capacity is also dependent on prey biomass, but the key aspect of top-down control is that it is the predator abundance that determines the rate of prey consumption.With low vulnerability multipliers where a predator is at its carrying capacity, any increase in consumption has to be linked to changes in prey productivity-i.e. in bottom-up factors (Christensen and Walters 2023 ).
For exploited species, it is thus extremely important to recognize that vulnerability multipliers do not only reflect the ecological limits caused by prey and predator behavior, but also how depleted the exploited species is in the base Ecopath model relative to the natural level (i.e.carrying capacity) that might be achieved if fishing were stopped.As such, for overexploited species to recover following reduced fishing, vulnerability multipliers need to be set relatively high so that predators can consume far more prey than in the initial Ecopath snapshot.Higher vulnerability multipliers tend to make groups more sensitive and responsive to changes in fishing mortality.It is also important to consider not only the consequences of risk-sensitive foraging but also the ecological consequences of predator depletion.As a predator population increases it may expand its distribution and access additional foraging areas, albeit this may be restricted by species ecology and habitat availability.
Over time, EwE users have adopted and developed multiple approaches to parameterize the vulnerability multipliers ( Fig. 2 ).While some approaches derive estimates from a priori knowledge and ecological observations or hypotheses in data-poor situations (e.g.Rehren et al. 2022 ), users are more frequently turning to formal statistical estimation using calibration time series and a tuning process when time series are available (e.g.Scott et al. 2016 ).Statistical fitting routines estimate vulnerability multipliers that bring simulations closer in line with observations.However, users should be cautious, as statistically optimized multipliers may stray away from values that might be considered ecologically realistic.A thorough sense-check is always recommended.The following sections explore these different approaches in more detail.

Using ecology and history to estimate or initializ e vulner ability multipliers
Vulnerability multipliers are perhaps easier to comprehend when it is assumed that they reflect how far an exploited predator is from their carrying capacity (e.g.interpreted as unfished state), and that vulnerability multipliers should facilitate consumption rates that enable a species to recover from their Ecopath biomass to their unfished biomass if fishing ceases.EwE can use the ratio between a group's unfished biomass and its Ecopath biomass to estimate vulnerability multipliers for exploited groups (see Box 1 in Supplementary material ).This does, however, require information on the unfished biomasses of exploited species, which could either come from quantitative historical estimates (e.g.Rosenberg et al.  2005 ) or a proxy method (e.g. the CPUE approach used by Rehren et al. 2022 ).
An added corollary is that the unfished state may be associated with high abundance of top predators and low abundance of their prey due to high predation mortality.If such top predator populations are fished down, predator release may cause the prey to increase.For those prey, the vulnerability multipliers should thus be set to a high value, even though the baseline model represents the unfished state.Mid-trophiclevel organisms are likely to also be fished; therefore, their vulnerability multipliers could also be set to reflect related depletion.
It can indeed be difficult to specify reasonable vulnerability multipliers for non-exploited species.Here, vulnerability multipliers need to be considered in the context of the foraging arena: the fine-scale spatial structure of the trophic interactions and what proportion of prey may be vulnerable to predation at any moment ( Fig. 3 ).The activity, spatial restrictions, and distributions of species provide insight into the likely vulnerability of prey to predation.This in turn provides a starting point from where it is possible to assign vulnerability multipliers.The distribution of predators could be restricted by limited mobility, habitat requirements, or the predation risk they face themselves, whereas prey vulnerability may be influ-enced by the time they spend in and out of refuge.This can be related to the availability of shelter, such as macroalgae (e.g.Wilson et al. 1990 ), or specific ontogenetic life stages; e.g.juvenile fish may allocate less time to foraging, or be more restricted spatially (and thus unable to access vulnerable pools of prey) than their adult counterparts (Nunn et al. 2012 ).Different behaviors, such as dispersal behaviors (e.g.moving to spawning sites), aggressive behaviors, or evolutionary behaviors (e.g.changes in shoaling dynamics) may also influence vulnerability to predation (e.g.Ladich 2022, Sbragaglia et al. 2021 ).
Trophic levels have also been used to approximate vulnerability multipliers (e.g.Cheung et al. 2002, Kluger et al. 2016 ).This approach has been used in situations where time series data were unavailable and assumes that the vulnerability multipliers are proportional to the trophic level of the predator.If we refer to the relationship between vulnerability multipliers and unfished biomass, this approach assumes that higher trophic levels are further removed from their unfished biomass than lower trophic levels.This may seem like a reasonable assumption considering how global fisheries have historically targeted and depleted higher trophic level fish stocks (Shannon et al. 2014 ), but conflicts with the concept of using a priori knowledge to parameterize vulnerability multipliers  based on region-specific trends in historical exploitation or ecology.
Finally, an approach to setting vulnerability multipliers was recently applied by Chagaris et al. (2020) to constrain how much predation mortality by a given predator could increase relative to a prey's total natural mortality: where M 2 cap defines the percentage of the natural mortality of a prey that a predator can be responsible for (between 0 and 1), M i is the natural mortality of prey i , and M 2 base ,i j is the base predation mortality by predator j on prey i .There may be ecological reasons, or reasons derived from data, to prevent a single predator from being accountable for large proportions of a prey's natural mortality.Using these k i j instead of default or model estimated values (which are often higher) may also be driven by ambitions for model perfor-mance.Chagaris et al. (2020) found that extremely high k i j estimated by Ecosim led to dynamic instability at high fishing mortality rates when evaluating equilibrium yield curves, and using M 2 cap values between 0.75 and 1.0 led to more reasonable estimates for F MSY (the fishing mortality at maximum sustainable yield) while also constraining theoretical maximum predation mortality rates to values that were compatible with prey natural mortality rates.

Evaluating and weighting time series
It is becoming more common for Ecosim vulnerability multipliers to be estimated using statistical routines, with Heymans et al. (2016)   they provide an opportunity to explore important drivers of change and tend to have strong contrast in the data, which improves the model's ability to estimate parameters, leading to more confidence in our assessment of ecosystem dynamics.
The quality of time series data is important, especially if the fitted model is to be used for management purposes, as vulnerability multipliers, and thus predation rates, will be augmented for simulations to fit against the data.In Ecosim, users can weight time-series data to represent how reliable or variable time series are compared to the other reference time series.Low weights imply that the data either has high variance or is unreliable (e.g.underestimated or uncertain catches).Weightings impact the contribution of time series to the assessment of model performance, where a weight of 0 indicates that the time series will not be used in the calculation of "goodness of fit."Weightings can be assigned based on a qualitative assessment of data pedigree (e.g. based on data origin), or by using more quantitative information, such as confidence intervals from survey estimates, the retrospective analyses of stock assessment models, or signal-to-noise ratio assessments (Heymans et al. 2016 ).
The procedure of estimating vulnerability multipliers that improve the fit of model simulation to calibration data is based on the estimation of both the sum of squares ( SS ) and the Akaike Information Index ( AIC) (Akaike 1998 , Cavanaugh and Neath 2019 ).The SS is simulated using log likelihood criteria ( Fig. 4 ): this is a weighted SS of log simulations from log observations, scaled in the case of relative observations ( y ) by the maximum likelihood estimate of the relative simulation scaling factor ( q ) in the equation y = q × n , where n is the absolute observation.When generating a set of models under different fitting hypotheses, AIC is used to iden-tify the model of best fit.AIC is a tool for model selection that penalizes for fitting too many parameters relative to the time series available for estimating the SS and is calculated as: where n is the total number of observations, or time-series values, from the loaded calibration time series and K is the number of parameters estimated.When sample size is small, there is a large probability that AIC will select models with too many estimated parameters (i.e.overfit models).AICc is used to address this potential overfitting by including a correction for small sample sizes: As a rule of thumb, Burnham and Anderson (2004) state that AICc should be used unless n K > about 40.In other words, unless the number of estimated parameters equates to a minimum 2-3% of the amount of data, use AI Cc .AI Cc should therefore be used when assessing EwE model performance.
A word of caution, the AIC calculations assume that the observations are independent, whereas time-series data, such as typically used for ecosystem modeling have high autocorrelation.For this reason, it is advisable to test the impact on assumptions about n on model selection.
Multiple approaches have been developed to statistically estimate vulnerability multipliers (refer to Fig. 2 ).They can be estimated for predators, providing a single multiplier limit to all of a given predator's base predation rates, and they can be estimated for individual predator −prey relationships, which assumes that the multiplier limits are heterogeneous across Downloaded from https://academic.oup.com/icesjms/article/81/2/260/7587678 by Centre Mediterrani d'Investigacions Marines i Ambientals -CSIC user on 13 May 2024 prey.Vulnerability multipliers are estimated by predator or predator −prey, and not just prey, as the terms impact the consumption rates and carrying capacities of predators and therefore have more influence on predator biomass dynamics when estimated by predator.Estimating by prey may also increase the risk of overfitting, which is discussed in more detail in the case studies.This choice between a predator or predator −prey approach tends to be associated with user preference, ecological justification, or determined based on the approach that produces the best fit model.Whether estimating predator or predator −prey vulnerability multipliers, there are a few ways to select which vulnerability multipliers should be estimated: (i) manual selection based on a priori knowledge or species priority; (ii) select vulnerability multipliers for groups with calibration time series; or (iii) select the most sensitive vulnerability multipliers (i.e.those that when changed have the largest impact on SS ).Manually selecting vulnerability multipliers allows for an early integration of ecological information but may lead to a suboptimal model fit if the SS is not sensitive to the selected multipliers.Conversely, the sensitivity search may optimize model fit but it is purely statistical and fits will not necessarily make sense ecologically.Only estimating vulnerability multipliers for groups with time series acknowledges that, to some degree, the parameter should be constrained by the available time series.The level of group connectedness within the food web (e.g. group consumption and predation) may also constrain the parameter search if changes in vulnerability multipliers impact the SS contribution of other groups.Groups that do not have informative time series or, have low connectedness in the food web, have widely variable estimated vulnerability multipliers.
Choosing how many vulnerability multipliers to estimate, without overfitting is another point of confusion and discussion.The number of vulnerability multipliers that can be potentially estimated is often significantly more than the data available to constrain simulations.EwE best practices suggest that a conservative number of degrees of freedom and, therefore, parameters to estimate is one less than the number of calibration time series available ( K − 1 ; Heymans et al. 2016 ).This approach recognizes that values within time series are highly autocorrelated, viewing each time series as an "independent observation," but the K − 1 approach could be overly conservative, especially if long time series are available.
Both manual and automated statistical calibration routines are available in Ecosim to search for vulnerability multipliers.The manual approach can be arduous when testing multiple fitting hypotheses (e.g. with or without fishing effort or primary production anomalies), as the number of plausible fitting combinations can easily reach the hundreds, if not thousands, increasing the likelihood of user error.In the past, users have overcome this issue by only testing the n th fitting scenario (e.g. 5, 10, and 15 vulnerability multipliers, etc.; e.g.Alexander et al. 2015 ), however, this approach risks overlooking the vulnerability multiplier combination, which produces the best statistical fit.The stepwise fitting procedure developed by Scott et al. (2016) automates this process, allowing for a broad exploration of the parameter space, which accelerates the process and removes the problem of user error.Recent improvements to the automated approach have increased the computational speed by enabling multiple fitting scenarios to be tested simultaneously using computers multithreading capabilities (J.Steenbeek personal communication, September 20, 2023).
Novel approaches to estimate vulnerability multipliers using the manual and automated fitting routines have also been developed for two EwE models, which are being used operationally to inform fisheries catch advice (Chagaris et al. 2020, Bentley et al. 2021 ).Bentley et al. (2021) employed an approach that combined searches for predator vulnerability multipliers and predator −prey vulnerability multipliers, whereas the approach developed by Chagaris et al. (2020) uses the manual fitting tool in Ecosim to iteratively estimate the K − 1 most sensitive predator −prey vulnerability multipliers over multiple sequential (repeated) tuning iterations.Full methodologies for these approaches are provided in the Supplementary material .
It is worth reiterating that statistical estimation of vulnerability multipliers does not necessarily have any bearing on ecology.While it is possible to exclude vulnerability multipliers from the search routine, there is currently no mechanism to include prior information or ecologically sensible bounds to constrain the limits for vulnerability multipliers included in the search routine.A judgment evaluation following the formal estimation of vulnerability multipliers should be applied to: (i) reflect on the ecological assumptions attached to estimated vulnerability multipliers; (ii) assess how realistic functional group simulations are (in hindcast and future); and (iii) understand and fix issues with model structure and parameterization ( Fig. 4 ) (e.g.Corrales et al. 2017 ).It is possible to view the fit of each functional group to calibration time series and its contribution to the overall SS in Ecosim via the "SS group plots."This is often used to screen issues with model simulations, such as contradicting trends or misalignment in initial time steps, and direct fixes.
However, what is often not accounted for when estimating vulnerability multipliers is their impacts on the advice products, such as F MSY or food web indicators.The focus is often only on the goodness of fit of the model, but the impacts of estimated vulnerability multipliers on predictions and reference points should be evaluated (e.g.Rehren et al. 2022 ).We provide two case studies to explore how alternate fitting approaches impact the emergence of vulnerability multipliers and how vulnerability multipliers impact model outputs.

Case study 1: How fitting approaches and calibration data influence which vulnerability multipliers emerge
A hypothetical EwE model (Anchovy Bay: often used to test EwE scenarios; see Christensen and Walters 2023 ) was used to investigate how vulnerability multipliers emerge (and whether they re-emerge through fitting) and how this process is influenced by: (i) noise in the calibration data and (ii) the chosen approach for estimating vulnerability multipliers: "predator" vulnerability multipliers or "predator −prey" vulnerability multipliers.We investigated the impact of emerging vulnerability multipliers on biomass and catch simulations and estimates of F MSY .

Building a base Ecosim model
Fictitious Ecosim simulations for Anchovy Bay were created by adding temporal trends to fishing effort and adjusting vulnerability multipliers.Simulated fishing effort trends (Supplementary Fig. 1 ) reflected trends often see in reality: (i) sealers fishing effort followed an exponential decline as may be ex-Downloaded from https://academic.oup.com/icesjms/article/81/2/260/7587678 by Centre Mediterrani d'Investigacions Marines i Ambientals -CSIC user on 13 May 2024 pected in response to conservation efforts/policy; (ii) trawlers fishing effort followed an exponential decline under the assumption that whitefish (cod and whiting) stocks have been overexploited, leading to reductions in effort to encourage stock recovery; (iii) seiners and bait boat effort followed a slight linear increase in response to growing demand; and (iv) shrimpers effort increased assuming fishers shifted their target species to shrimp following reduced opportunities to catch white fish.Vulnerability multipliers ( k i j ) were adjusted following ecological assumptions and assumptions linked to the fishing effort trajectories (Supplementary Table 1 ).To distinguish between scenarios more easily, predator vulnerability multipliers will hereafter be denoted as k j , while predator −prey vulnerability will remain as k i j .For predator k j estimates, a mix of high, low, and default k j estimates were applied.For groups that were assumed to be overexploited, k j values were estimated using the "Estimate Vulnerabilities" interface (as described in the Supplementary material ).For the predator −prey k i j estimates, the Ecosim sensitivity search was used to identify the 10 most sensitive predator/prey k i j parameters, which were then adjusted to ensure a range of high and low k i j estimates were included (Supplementary Table 2 ).
For the purpose of this exercise, these two simulations (one with predator k j and one with predator −prey k i j ) were viewed as perfect representations of their ecosystems, i.e. the biomass and catch simulations were "real observations" driven by the "true" vulnerability multipliers.The aim of the following exercise was to test whether, when using these "real observations" as calibration time series, the "true" vulnerability multipliers would re-emerge, and whether the addition of noise to the "real observations" had any impact on the emerging vulnerability multipliers.Biomass and catch simulations were extracted from Ecosim and four scenarios for observation data quality were prepared: noise [random noise, normally distributed around the mean (true) biomass trend to represent observation error] was added to the calibration time series with coefficients of variation (CV) of 0 (no noise), 0.1, 0.3, and 0.5 (Supplementary Fig. 2 ).

Predator vulnerability multipliers
Vulnerability multipliers were reset to the default value of two; fishing dynamics were not changed from those used to produce the "real observations."The exported biomass and catch time series were used as calibration time series to estimate predator vulnerability multipliers for the functional groups seals, cod, whiting, shrimp, benthos, and zooplankton using the manual stepwise fitting interface.k j values for groups that had values of two in the initial model were not altered.
Fig. 5 shows how k j parameters emerged after model calibration, and how this altered functional group carrying capacities in the absence of fishing and F MSY estimates.k j values, which emerged when estimated using the calibration time series with no noise were similar to the "true" k j parameters ( Fig. 5 a).Adding noise to the calibration time series led to divergence between the estimated k j values and the "true" parameters, highlighting the impact data quality can have on the fitting procedure and thus stressing the importance of evaluating the suitability of time series before using them to drive model calibration.While not tested in the simulations presented, it is important to reiterate that time series can be weighted to represent the reliability of the data relative to established confidence intervals or signal-to-noise assessments (Heymans et al. 2016 ).The variability in k j re-emergence under the four data quality scenarios was also unique to specific functional groups; e.g.k j estimates for cod showed greater re-emergence accuracy (or consistency) when compared to other functional groups.Cod is highly connected within the food web (i.e.cod is an opportunistic predator i.e. also preyed upon by higher trophic levels); therefore, vulnerability multipliers that improve the model fit tend to be more constrained due to their potential to have large cascading impacts on the wider food web.In addition, cod also experienced a period of collapse followed by recovery, which provides much-needed contrast for the model to reliably estimate the vulnerability multipliers.
Functional group carrying capacities and estimates of F MSY were impacted by emerging k j values ( Fig. 5 b and c).Carrying capacities from scenarios with k j parameters calibrated against data with no noise were most similar to those achieved with the "true" k j parameters ( Fig. 5 b), with dissimilarity generally increasing with the addition of noise to the calibration data.The importance of acknowledging the impact of k j estimates beyond model fit is demonstrated with the resulting F MSY estimates: relative changes to F MSY estimates mirrored the deviations of estimated k j values relative to the "true" k j values ( Fig. 5 c).Increases in k j values led to decreases in F MSY , while decreases in k j values led to increases in F MSY .This is because higher k j values enable groups to recover faster with the cessation of fishing and reach a higher carrying capacity, but they also decrease stock resilience to increases in F (functional groups decline faster and more severely if you increase their k j ).It is worth noting that where differences between true and estimated F MSY occurred, they were not proportional to the difference in true and estimated vulnerability multipliers (i.e.large changes in k i j do not result in equally large changes to F MSY ), as F MSY is also constrained by other components, which limit production, including other mortality and the biomass of available prey as well as prey dynamics in the foraging arena.

Predator −prey vulnerability multipliers
Similar to the predator scenario, vulnerability multipliers were reset to the default of two, and the exported biomass and catch time series (generated with "true" predator −prey vulnerability multipliers) were used as calibration time series to estimate predator −prey vulnerability multipliers.Predator −prey k i j values for the 10 most sensitive predator/prey k i j parameters were estimated using the manual stepwise fitting interface.Fig. 6 shows how k i j parameters emerged and how this altered functional group carrying capacities and F MSY estimates.
In comparison to the emergence of predator vulnerabilities, the emergence of predator −prey vulnerabilities was less constrained with poor k i j re-emergence accuracy across all calibration data scenarios ( Fig. 6 a).Functional group carrying capacities showed higher dissimilarity from their baseline when compared to predator simulations and their baseline ( Fig. 6 b).Carrying capacity dissimilarity increased with the addition of noise to the calibration data; however, simulations with no/low noise were notably more dissimilar when estimating predator −prey vulnerabilities as opposed to predator vulnerabilities ( Fig. 6 b), which is due to the greater differences in k i j estimates.
Relative F MSY estimates, influenced by predator −prey k i j values, showed higher dissimilarity from their baseline ( Fig. 6 c) when compared to F MSY estimates influenced by predator k i j values ( Fig. 5 c).The links between predator −prey k i j values and F MSY are less obvious than the links between predator k j values and F MSY due to the more complex interaction-specific consumption limits.This is particularly true for groups with mixed diets where diet composition is relatively homogenous across multiple prey (e.g.cod, whiting, seals, and mackerel), while links between predator −prey k i j values and the F MSY estimates for groups that are heavily dependent on a single prey group were observed for anchovy (F MSY mirrors the anchovy/zooplankton k i j estimates) and shrimp (F MSY mirrors the shrimp/benthos k i j estimates).

Case study 2: How fitting approaches impact the evidence derived from operationally used models
The Irish Sea EwE model and Northwest Atlantic Continental Shelf EwE model (hereafter called NWACS-MICE) have both been used to inform fisheries advice for their respective regions using ecological/ecosystem reference points (Chagaris et al. 2020, Bentley et al. 2021, Howell et al. 2021 ).Both models were designed to focus on commercial fisheries; however, they have very different structures in terms of model complexity (Table 1 ).In this case study, the two models were used to demonstrate the outcomes and management implications of vulnerability multiplier ( k j or k i j ) estimation and compared estimates of F MSY and ecosystem indicators.Ecosystem indicators selected for this analysis included total system biomass, commercial biomass, total catch, system diversity (Kempton's Q; Kempton and Taylor 1976 ), the trophic level of the catch, and the trophic level of the community.Estimates of F MSY and ecosystem indicators were compared across the following nine fitting approaches: (1) Predator k j values, where the number of parameters estimated is one less than the number of available calibration time series ( K − 1 ).( 2) Predator k j values estimated for all functional groups with time series.(3) Predator k j values using the automated stepwise fitting approach, where the applied k j values are taken from the model with the lowest AICc , and the vulnerabilities are reset to the default Equation ( 2) at each fitting iteration.(4) Predator k j values using the automated stepwise fitting approach, where the applied k j values are taken from the model with the lowest AICc , and the vulnerabilities are retained from previous fitting iterations.(5) Predator −prey k i j values, where the number of parameters estimated is one less than the number of available calibration time series ( K − 1 ).(6) Predator −prey k i j values using the automated stepwise fitting approach, where the applied k i j values are taken from the model with the lowest AICc , and the vulnerabilities are reset to the default Equation ( 2) at each fitting iteration.
(7) Predator −prey k i j values using the automated stepwise fitting approach, where the applied k i j values are taken from the model with the lowest AICc , and the vulnerabilities are retained from previous fitting iterations.(8) Predator −prey k i j values using a repeated manual stepwise fitting approach, where the estimated k i j ( K − 1 ) are retained from one iteration to the next (with a total of 5 iterations) and the final configuration is that with the lowest AICc , as was done in Chagaris et al. ( 2020) .( 9) A combination of predator k j and predator −prey k i j values using the methods outlined by Bentley et al. (2020) .Predator k j values were estimated using the automated stepwise fitting approach in #3.Predator −prey k i j values were estimated using a manual stepwise fitting approach and the remaining degrees of freedom.
The number of additional predator −prey k i j 's was determined by their AICc score (note this approach was only carried out for the Irish Sea model in this study).
Alternate fitting approaches led to the emergence of different vulnerability multipliers in the corresponding models of best fit (as determined by sum of squared deviations and AICc ) for the Irish Sea ( Fig. 7 ) and NWA CS-MICE ( Fig. 8 ).Different fitting approaches impacted estimates of F MSY in both models due to changes in species sensitivity to F with alternate vulnerability multipliers.Despite the increased complexity of the Irish Sea model, the patterns in F MSY variability are comparable between models, with certain species having consistent F MSY estimates across the nine approaches for vulnerability multiplier estimation.This includes cod and whiting for the Irish Sea and striped bass for the NWACS.As demonstrated in Case Study 1, these species are opportunistic feeders, which are also predated on by higher trophic levels, giving them a relatively high degree of connectivity within the food web models, which may constrict the emergence of vulnerability multipliers.Additionally, these groups have experienced a period of collapse and, in some cases, recovery, which provides contrast for the model to estimate the vulnerability multipliers.Fitting approaches with similar properties resulted in more closely related F MSY estimates (Supplementary Fig. 3 ).For example, F MSY estimates generated by approaches that searched for vulnerability multipliers by "predator" tended to be more similar to each other when compared to those generated by approaches, which searched by "predator −prey," and vice versa.This emergent trend is perhaps most clearly seen in the F MSY estimates for menhaden and bluefish adults from the NWACS-MICE model ( Fig. 8 ).
We investigated how the approach used to estimate vulnerability multipliers impacted derived ecosystem indicators ( Figs 7 b and 8 b).These impacts were relatively small, with most deviations being within the range of 5-10% when compared to indicators from the published models.Trophic level indicators were particularly robust across estimation approaches, despite often larger differences being observed in diversity (Kempton's Q) catch and commercial biomass indicators.Balanced reconfiguration within the ecosystem models (i.e.increases in some species and decreases in others with similar trophic levels) enabled the trophic indicators to remain similar across approaches.However, the dissimilarity in trophic level of the catch in the NWACS-MICE model was generally higher across scenarios where vulnerability multipliers were searched by "predator."This reflects the higher F MSY reference points for adult weakfish and bluefish and lower F MSY reference points for menhaden and herring produced under the same fitting approaches.Overall, the Irish Sea EwE model showed greater dissimilarity in derived indicators than the NWACS-MICE model.This outcome is likely linked to the increased complexity of the Irish Sea model, and how a repeated search provides the opportunity to adjust more predator −prey vulnerability multipliers.This may be less of a concern for low-complexity models as the parameter space is smaller, increasing the likelihood that the same vulnerability multipliers will be adjusted.

Limitations and future development
By describing the estimation of vulnerability multipliers and visualizing their impact on model simulations, this paper attempts to demystify the use of vulnerability multipliers, advance EwE uncertainty analyses, and contribute to our understanding of the importance of such parameters for model simulations and projections (Payne et al. 2016 ).Understanding and acknowledging the underlying assumptions of vulnerability multipliers, and the risk of overfitting, is a crucial step in the pursuit of operational and defensible EwE models that can be used to inform marine management decisions and objectives.Currently, this is an aspect of EwE modeling which is not always appropriately scrutinized, risking overconfidence in simulations and missed opportunities to convey model uncertainty and direct future research.We expect that one reason for the limited assessment of vulnerability multipliers is the general complexity of the concept, and how this has impacted the ways in which their use, purpose, and meaning have been diversely interpreted by the global community.
Challenges associated with parameter estimation are not unique to EwE or other ecosystem models.Complex singlespecies stock assessment models that integrate multiple data types under a statistical framework also struggle to estimate key parameters that determine stock productivity, such as those describing the stock-recruit curve (Conn et al. 2010, Lee et al. 2012, Mangel et al. 2013 ).Yet these models are routinely used for managing fish stocks around the world.We have shown that, given the calibration approach used (i.e. by estimating predator vulnerability multipliers), EwE can return reasonable parameters.The complexity of ecosystem models should thus not limit their utility to management, considering that they are used appropriately in recognition of their tactical limitations, they can fit the data reasonably well, common sense checks are performed, and uncertainty is appropriately communicated.
It is not enough to estimate vulnerability multipliers and assume those that produce the best statistical hindcast fit are appropriate.Ecological reasoning and hypothesis testing must support statistical inference as it should when balancing Ecopath models (Link 2010 ), estimating primary production anomalies (e.g.Serpetti et al. 2017 ), and incorporating environmental drivers (e.g.Mackinson 2014 ).Part of this process should include a critical evaluation of calibration time series, as these, and their associated uncertainty, drive the statistical estimation of vulnerability multipliers.Using data with inherent inconsistencies will lead to variable and potentially biased estimates.Equally important is the lack of missing reference time series.Time series produce constraints, and when esti- mating vulnerability multipliers for groups without time series, the lack of constraints allows the fitting procedure to explore a broad parameter space to let such groups indirectly impact other groups with time series.All of the above can impact model-derived management advice.
As shown in our case studies, F MSY estimates can change in response to estimated vulnerability multipliers and their impacts on predator consumption rates, albeit with most changes being relatively conservative (Supplementary Fig. 3 ).With high vulnerability multiplier values, species are more sensitive to changes in F as deviations from the baseline Ecopath mortality rate have a greater impact on production rates.Biomass therefore declines at an increased rate as fishing mortality increases (and vice versa) when vulnerability multipliers are high, meaning maximum sustainable yields are achieved at lower fishing mortalities ( Fig. 9 a).As predator consumption rates increase with higher vulnerability multiplier values, prey experience higher predation rates, reducing the yield that can be obtained by fishing at F MSY ( Fig. 9 b).Unreliable vulnerability multipliers are not easily apparent when reviewing model hindcast simulations against calibration time series data.Comparing Ecosim F MSY to other estimates, or proxies (e.g.natural mortality), is one approach to assess vulnerability multipliers and has been demonstrated for the ICES keyrun models of the North Sea (ICES 2016 ) and Irish Sea (ICES 2019b ).Simulating models beyond observations, under alternate fishing or environmental scenarios, can also highlight issues with vulnerability multipliers by exploring group sensitivities and whether simulated responses to change fall outside of what might be considered ecologically reasonable.Future developments should also consider dependencies between vulnerability multipliers, whether correlation exists between vulnerability multipliers, and how this may impact the ability of a search routine to find stable solutions.
For EwE models to be of operational use, it should be possible to explain why estimated vulnerability multipliers are realistic.This could be based on knowledge of species ecology , carrying capacity , or natural mortality .We envisage that the future development of Ecosim will encourage users to think more critically when calibrating models by building options to restrict the statistical vulnerability optimization routine.The objective of this would be to enable users to con-strain vulnerability multiplier estimation using a priori knowledge where, importantly, data is available to justify doing so.Increased control over the search for vulnerability multipliers could be used to set upper and lower parameter limits, or limits determined by carrying capacity, and penalize parameter combinations that operate outside of predefined limits.Such constraints would also have important implications for Ecospace, the spatial-temporal component of EwE.High vulnerability multipliers, and the large increases in predation mortality that they enable, can have disproportionately large impacts in Ecospace when prey are restricted to small areas (as predators are able to deplete them rapidly).Vulnerability multipliers in Ecospace require further consideration given how spatial heterogeneity may impact species physiology, habitat carrying capacity, and predator −prey interaction rates.Spatial considerations are implicit within the vulnerability concept and enable spatial considerations to be integrated indirectly into Ecosim.The necessity for vulnerability multipliers, or at least those in Ecosim, which go some way to indirectly recognizing spatial heterogeneity, may be negated by the explicit consideration of spatial heterogeneity in Ecospace.Alternate vulnerability multiplier combinations may be needed depending on the priority use of Ecosim or Ecospace and the different mechanistic role vulnerability multipliers may play across the two components.

Recommendations
Calibration methods for EwE are not prescriptive.Any one of the methods included in Fig. 2 , or even new methods, may be suitable for use if they can be justified.That said, Case Study 1 showed that predator vulnerability multipliers are more likely to re-emerge than predator −prey vulnerability multipliers, and that re-emergence is impacted by data quality .Below , we have provided best practice recommendations to evaluate the appropriateness of vulnerability multipliers and their impact on model uncertainty: r Recommendation 1: Limit the number of vulnerability multipliers to be estimated.The most efficient way to limit the number of parameters is to estimate by predator, add individual predator −prey combinations, if you have specific arguments for why this is necessary.Avoid Provide justifications for setting initial vulnerability multipliers (or keeping the default).If estimating vulnerability multipliers using a statistical routine, check if they make sense relative to the exploitation and ecology of the predator and the ecology of the predator −prey interaction.
r Recommendation 3: Sense c hec k carrying capacities.Vulnerability multipliers augment the upper limit for predator consumption rates, which dictates how predators respond to changes in mortality rates (e.g. release from fishing pressure or predation) or in prey biomass.It is important to review how predator biomass responds to such changes and critically evaluate whether the changes are plausible and whether the limits of estimates should be constrained (i.e.setting upper and lower limits).

Summary and conclusions
Decision makers are being increasingly challenged to find solutions to complex problems, which require the assessment of risk, trade-offs, and cumulative pressures in the marine environment (Punt 2017, Piet et al. 2020, Pedreschi et al. 2021, Rodriguez Perez et al. 2023 ).EwE is actively being used to support both strategic and tactical management decisions, which address the cumulative impacts of multiple pressures and needs of various stakeholders (Craig andLink 2023 , Karp et al. 2023 ).Given the growing operational use of EwE and other process-based models, it is of mounting importance that best practices are regularly revisited and developed to build our understanding of bias within models and uncertainty in model outputs.This is key to the wider acceptance and pragmatic use of EwE and other ecosystem modeling approaches.Calibration procedures for complex process-based models require particular attention and scrutiny given the sensitivity of model outputs and advice to alternate sets of calibration parameters.The way in which calibration procedures are communicated and understood can act as a barrier to their effective implementation and dissemination.It was our intention here to highlight the existing routes for parameter estimation and frame them in the context of the technical and ecological mechanisms and theories that can guide their justified use.Through the case studies presented in this paper, we have demonstrated that assessing model calibration based on similarity between model simulations and calibration data alone is not adequate, and that statistical optimization routines should be complemented with a judgmental evaluation grounded in ecological theory and lessons, which can be learnt from available data and knowledge of the study region.

Figure 2
Figure 2 Pathw a y s f or estimating vulnerability multipliers (k_ij) in Ecosim.All pathw a y s end with a reference to peer-re vie w ed e xamples.

Figure 3
Figure 3 Using history and ecology to estimate vulnerability multipliers (k_ij) in Ecosim.The illustration at the top of the figure provides examples where k_ij estimates can be inferred from functional group ecology and life history.Model simulations demonstrate how (a) prey vulnerability influences predator −prey biomass trajectories when predator biomass increases and (b) how k_ij estimates impact the rate of functional group recovery following reduction in fishing (inset figure).

Figure 4
Figure 4Ov ervie w of the Ecopath and Ecosim modeling processes.Using log likelihood criteria, the input parameters or anomalies (e.g.climate) may be estimated based on a non-linear search routine and vulnerability multiplier (vulmult) estimation.Prediction (fitting) failures after each estimation trial then inform judgmental changes in model str uct ure and parameters.B is biomass, Z is total mortality, C is catch, and W is a v erage w eight.Subscript 0 refers to the Ecopath model base year, and CC to carrying capacity.Adapted fromChristensen and Walters (2011) .

Figure 5
Figure 5 Estimation and impact of predator vulnerability multipliers (k_j).The Anchovy Bay ecosystem model was calibrated against generated time series with incremental CV to identify the impact of time series quality on (a) k_j re-emergence and how k_j estimates impacted; (b) functional group carrying capacities in the absence of fishing; and (c) estimates of relative fishing mort alit y consistent with achieving maximum sustainable yield (F MSY ).

Figure 6
Figure 6 Estimation and impact of predator −prey vulnerability multipliers (k_ij).The Anchovy Bay ecosystem model was calibrated against generated time series with incremental CV to identify the impact of time-series quality on (a) k_ij re-emergence and how k_ij estimates impacted; (b) functional group carrying capacities in the absence of fishing; and (c) estimates of relative F MSY .

Figure 7
Figure 7 Irish Sea EwE outputs under alternate fitting approaches.Vulnerability multipliers for the Irish Sea Ecosim model were estimated following se v en alternate fitting approaches.The impacts of alternate fitting approaches and vulnerability multiplier estimates are shown for (a) estimates of F MSY and (b) indicators of ecosystem str uct ure and function.The impacts of vulnerability multiplier estimates on indicator simulations are illustrated by comparing new simulations against the simulations from the published model.The published Irish Sea model has vulnerability multiplier values estimated using the predator and predator −prey approaches (Bentley et al. 2020 ).

Figure 8
Figure 8 NWACS-MICE EwE outputs under alternate fitting approaches.Vulnerability multipliers for the NWACS-MICE Ecosim model were estimated f ollo wing se v en alternate fit ting approac hes.The impacts of alternate fit ting approac hes and vulnerability multiplier estimates are sho wn f or (a) estimates of F MSY and (b) indicators of ecosystem str uct ure and function.The impacts of vulnerability multiplier estimates on indicator simulations are illustrated by comparing new simulations against the simulations from the published model.The published NWACS-MICE model has vulnerability multiplier values estimated using the manual repeated predator −prey vulnerability multiplier search approach (Chagaris et al. 2020 ).

Figure 9
Figure 9 Effects of vulnerability multipliers on derived sustainable fishing advice.Estimations of F MSY (Walters et al. 2005 ) are influenced by the value of vulnerability multipliers (k_ij).Changing predator k_ij values influences the F MSY reference points for (a) the predator and (b) its prey.Dashed lines indicated the estimated F MSY reference points for each k_ij scenario.

r 2 :
Downloaded from https://academic.oup.com/icesjms/article/81/2/260/7587678 by Centre Mediterrani d'Investigacions Marines i Ambientals -CSIC user on 13 May 2024 estimating vulnerabilities for groups without time series as the lack of constraints can lead to unrealistic estimates.Recommendation Explain vulnerability multipliers.

r
Recommendation 4: Look beyond goodness of fit when evaluating model performance.Combinations of vulnerability multipliers that achieve the best statistical fit (i.e.SS and AICc) do not necessarily produce the "best" model, if other model outputs, such as indicators, F MSY reference points, and forward projections, are unlikely.Assessment of wider model performance should be undertaken to review vulnerability multipliers.r Recommendation 5: Perform vulnerability multiplier sensiti vity anal yses.It is best practice to acknowledge and communicate model uncertainty.Calibrating Ecosim models, and thereby choosing one of multiple approaches to estimate vulnerability multipliers, introduces additional uncertainty into the process.Exploring model performance under alternate calibration approaches tests the sensitivity of model outputs to changes in vulnerability multipliers and identifies which vulnerability multipliers consistently emerge.

Table 1 .
Comparison of k e y characteristics between static (Ecopath) and time-dynamic (Ecosim) models of the Irish Sea and Northwest Atlantic Continental Shelf (NWACS-MICE).