Original Drivers of the summer-distribution of Northeast Atlantic mackerel ( Scomber scombrus ) in the Nordic Seas from 2011 to 2017; a Bayesian hierarchical modelling approach

and K. Drivers of the summer-distribution of Northeast Atlantic mackerel ( Scomber scombrus ) in the Nordic Seas from 2011 to 2017; a Bayesian hierarchical modelling approach. – ICES Journal of Marine Science, 76: 530–548. Identifying factors that are statistically correlated with the geographical distribution dynamics of a species can facilitate our understanding of causal physi-ological and ecological relationships. Northeast Atlantic (NEA) mackerel is a species of great economic and ecological importance, whose habitat expansion in the last decade has altered the biomass dynamics in the pelagic realm of the Nordic Seas. We highlight drivers that may have regulated the geographical distribution of NEA mackerel during summers, from 2011 to 2017, by ﬁtting Bayesian hierarchical spatiotemporal models on data obtained during the International Ecosystem Summer Survey in the Nordic Seas. Temperature in the upper 50m of the water column, food availability (approxi-mated by mesozooplankton biomass), a proxy of herring abundance and longitude were the main factors inﬂuencing both the catch rates (proxy for ﬁsh density) and the occurrence of NEA mackerel. Stock size was not found to directly inﬂuence the distribution of the species; however, catch rates in higher latitudes during years of increased stock size were lower. Additionally, we highlight the improved performance of models with spatiotemporal covariance structures, thus providing a useful tool towards elucidating the complex ecological interactions of the pelagic ecosystem of the Nordic Seas.


Introduction
(hereafter referred to as "mackerel") utilized the Norwegian Sea as its main summer feeding ground, but since mid-2000s its distribution range expanded towards Iceland and further west to Greenland (Jansen et al., 2016;Nøttestad et al., 2016c), opening new fishing possibilities in these areas (Astthorsson et al., 2012;Jansen et al., 2016). Apart from the economic aspect, the pressure exerted by mackerel on its prey (Bachiller et al., 2018), including opportunistic feeding on early life history stages of Norwegian spring-spawning (NSS) herring (Clupea harengus; Berge et al., 2015;Skaret et al., 2015), raises important questions about species' ecological relationships in the epipelagic realm of Nordic Seas.
Mackerel, herring and blue whiting (Micromesistius poutassou) are the main pelagic fish stocks in the Nordic Seas, with a combined spawning stock biomass (SSB) at $13 million metric tonnes in 2017 (ICES, 2017). During summer in the Nordic Seas, mackerel exhibits a thermal preference range from 7 C to 15 C Jansen et al., 2016, Olafsdottir et al., in press), and is well within the thermal tolerance of the species [2-28.5 C, as derived from laboratory experiments, Studholme et al., 1999]. It is distributed from the surface down to $40 m depth (Nøttestad et al., 2016a), in contrast to herring and blue whiting that are found in deeper waters (herring: 0-400 m, Misund et al., 1997;Huse et al., 2012;blue whiting: 200-800 m, Johnsen and Godo, 2007) and in water masses between 2 and 8 C .
While blue whiting resides in deeper waters, the potential spatial and temporal overlap, between mackerel and herring, combined with overlapping feeding preferences allows ecological interactions between them. All three species are planktonic feeders, with a higher degree of diet overlap between mackerel and herring, that consume mainly calanoid copepods, than with blue whiting that feeds on larger prey (mainly euphausiids and amphipods; Prokopchuk and Sentyabov, 2006;Langøy et al., 2012;Bachiller et al., 2016;Ó skarsson et al., 2016). Olafsdottir et al. (in press) recently suggested that temperature and differences in prey availability might act as a cue for mackerel to expand its distribution outside its traditional feeding area in the Norwegian Sea (Skjoldal, 2004) to accumulate energy; a mechanism that has also been suggested for herring's feeding migrations .
Although prey fields and competition for prey are important in regulating the spatiotemporal distribution of pelagic fish species, their population size and abiotic conditions also play a major role (Secor, 2015). Temperature is a well-known factor influencing the changes in geographical distribution of fish stocks (Drinkwater et al., 2014;Nye et al., 2014) and this is observed globally, with a general agreement that warmer regimes lead to poleward distributions changes (Cheung et al., 2009;Poloczanska et al., 2013). Additionally, habitat expansion and contraction phenomena in pelagic fish populations are known to be impacted by stock size (Lluch-Belda et al., 1989;Barange et al., 2009), among other parameters. Historically, spatial dynamics of mackerel in the North Sea have been influenced by the combined effect of multiple drivers, such as reduced spawning in the area due to decreasing temperatures, supplemented by food availability and wind-induced turbulence (Jansen, 2014). Since 2007, mackerel expanded its geographic distribution range, during the summer feeding season in Nordic Seas, both northward and westward compared with its traditional distribution in the Norwegian Sea (Astthorsson et al., 2012;Nøttestad et al., 2016c). However, whether the abundance of herring and/or blue whiting, in combination with factors such as temperature and prey availability, affects the occurrence and/or the density of mackerel remains largely unknown.
The International Ecosystem Summer Survey in the Nordic Seas (IESSNS) provides spatiotemporal information regarding mackerel, herring and blue whiting, as well as prey abundances and abiotic conditions in the area in July and early August (Nøttestad et al., 2016c;ICES, 2017). Because of the patchy aggregations of pelagic fish species within suitable habitats (Fréon and Misund, 1999), density data derived from either catch or acoustic backscatter data are usually characterized by high proportions of zero values and spatial dependence (Martin et al., 2005). This has been shown for juvenile mackerel (Jansen et al., 2015).
To develop an appropriate statistical model to identify factors affecting the distribution of adult mackerel, zero-inflation and dependency structures in the data need to be considered. To accomplish this, we fitted Bayesian hierarchical spatiotemporal models, using the Integrated Nested Laplace Approximation (INLA) methodology (Rue et al., 2009). This methodology provides accurate approximations to the posterior marginal distributions of latent Gaussian Markov Random Field (GMRF) models (Rue and Held, 2005) and is computationally fast, allowing the implementation and testing of complex spatiotemporal covariance structures. This is achieved by utilizing the stochastic partial differential equations (SPDE) approach (Lindgren et al., 2011). With the SPDE a continuously indexed Gaussian Field (GF) with Matérn covariance function is approximated by a GMRF (Lindgren et al., 2011) that represents an unobserved stochastic process (random effect) that addresses the spatial autocorrelation in the data. The INLA methodology therefore provides an efficient way of testing large complex models, and has been successfully implemented on fisheries-related data (e.g. Muñoz et al., 2013;Grazia Pennino et al., 2014;Cosandey-Godin et al., 2015;Paradinas et al., 2015;Quiroz et al., 2015;Boudreau et al., 2017;Carson et al., 2017).
Elucidating the drivers that influence the distribution of fish is one of the prerequisites for the successful implementation of the ecosystem approach to fisheries (Garcia et al., 2003), as understanding the relationships between species and their biotic and abiotic environment can help us designate areas that are important for the life cycle of a species (Planque et al., 2011). For highly migratory species, such as mackerel, under the warming global oceans (e.g. Levitus et al., 2000;Lyman et al., 2010), the response to the changing environmental conditions can have important implications for the species itself (e.g. reduced condition indices, Olafsdottir et al., 2016), for other components of the ecosystem (e.g. opportunistic predation on herring larvae, Skaret et al., 2015) and for fisheries (e.g. disagreement on quotas between the involved nations in a given area, Spijkers and Boonstra, 2017). In the present study, we aim to identify which combination of predictors gives the best explanation of the observed distribution patterns in mackerel density in its summer feeding area, while accounting for zerocatch registrations. Using standardized catch rates from the IESSNS as proxy for density, we thus test if abiotic variables (e.g. bottom depth, temperature, salinity), proxies of food availability and co-occurrence of herring, are important predictors for the occurrence and/or the density of mackerel, to elucidate possible underlying mechanisms that drive the biomass distribution.

Material and methods
Mackerel survey catches (in kg), herring (acoustic) density measured as herring-assigned nautical area scattering coefficient (NASC) values, and total dry mesozooplankton biomass (mg m À2 , used as an index of mesozooplankton abundance) for the period 2011-2017 in the IEESNS survey were retrieved from the Planning Group on Northeast Atlantic Pelagic Ecosystem Surveys (PGNAPES) database hosted at the Faroes Marine Research Institute, Torshavn, Faroe Islands (accessed November 2017). The IESSNS is an internationally coordinated survey that has taken place in 2007, 2010, 2011-2017, with four to five vessels participating from research institutes in Norway, Iceland, the Faroe Islands, and recently Greenland, covering an area of $1.6 (in 2007) to 4 million square kilometres in the recent years ( Figure 1; ICES, 2017). Briefly, the survey is held from the beginning of July until early August, with pelagic surface trawls (vertical span from surface to $35 m depth) taken at fixed locations on a grid of mostly east-to-west transects or diagonal transects across the shelf edge, to quantify the abundance of mackerel. Intertransect distance varies from 40 nmi to 60 nmi while trawl stations positions are fixed at every 30-80 nmi along transects. Apart from pelagic trawling, echosounding is also used to record herring abundance along the transects from 15 m (depth at which the echosounder lies) to 500 m depth. Mackerel is not detected sufficiently by echosounders due to the weak acoustic backscatter signal the absence of a swim bladder, as well as its distribution very close to the surface, which causes parts of the schools to lie in the "acoustic blind zone" (for details see Nøttestad et al., 2016c). At each station, a CTD (deployment depth 500 m) and a WP-2 plankton net (deployment depth 200 mm, 180-200 lm mesh, 0.5 m 2 opening, tow speed 0.5 m s À1 ) are deployed for registration of hydrological parameters and collection of mesozooplankton. More details regarding the IESSNS sampling protocol can be found in ICES (2015) and Nøttestad et al.(2016c).
NASC values assigned to herring, integrated over 1 nmi, were used to calculate mean NASC for each pelagic trawl station, and considered a proxy of herring abundance (for details see Supplementary Figure S1). The allocation of NASC-values to herring was based on the composition of the trawl catches, the characteristics of the recordings, and frequency between integration on 38 kHz and on other frequencies by a scientist experienced in viewing echograms, onboard each of the participating vessels. Mackerel catches were converted to catch rates (numbers of fish per square nautical mile, N nmi À2 ) using the open-source software StoX (StoX, 2015) to serve as a proxy for density of the species at each pelagic trawl station (hereafter catch rates will be referred to as densities). StoX is the ICES-approved software for abundance index estimations of mackerel and herring (ICES, 2017). Swept area mackerel density estimates y k;i by length category (k) were estimated for each pelagic trawl haul in location (i) using the formula: where y k;i is the number of fish (N) of length k per nmi 2 observed in trawl haul location i; x k, i is the estimated frequency of length k per nmi 2 observed on trawl haul location i and n i is the swept area calculated as: where td i /1852 is the towed distance in nmi and wd i is the fishing width of the trawl, as recorded by sensors placed at the trawl doors, in nmi units (Mehl et al., 2016). Following, densities per length category were summed to obtain total mackerel density (y) in N nmi À2 per pelagic trawl station. Vertical profiles of temperature and salinity for each pelagic trawl station carried out during the IESSNS were obtained from the participating nations for the same period for the calculation of mean and integrated hydrological parameters. The parameters considered were (a) temperature (T), salinity (S), and sigma-theta density (D) for specific depths (10 m, 20 m, 50 m, 100m), (b) mean T, S, and D for specific depth layers (0-25 m, 0-50 m, 0-100 m, 25-50 m, and 50-100 m, the Potential Energy Deficit (PED, the amount of energy required to vertically mix the water column so that the density is even from top to bottom, Planque et al., 2006), (c) the bottom depth ("depth," from survey vessels soundings and EMODnet, EMODnet Bathymetry Consortium, 2016) and the distance from the shore (http://www.naturalearthdata.com), and (d) the mackerel SSB as derived from the latest stock assessment (ICES, 2017). Satellite-derived mean monthly chlorophyll a data were downloaded from NASA (https://oceandata.sci.gsfc.nasa.gov/ MODIS-Aqua/, NASA Goddard Space Flight Center, Ocean Ecology Laboratory, 2014) and values were extracted for the geographical coordinates and month each pelagic trawl station was carried out.
In 2007 and 2010, mesozooplankton was not collected by all participating nations, and thus these years were excluded from the analysis. Stations that were missing mesozooplankton and/or CTD information for the 2011-2017 period were also excluded from the analysis. In total, 1731 trawling stations were included in the analysis, with the percentage of hauls with mackerel catches ranging from $79 to 87% (Table 1). Maps of the study area with locations of trawl stations and positive densities by year are shown in Figure 1.

Data exploration
Initial data exploration was carried out in order to identify (a) errors in the database that were subsequently removed or corrected, (b) collinearity (using variance inflation factors) and pairwise scatterplots (Zuur et al., 2010), (c) any clear relationships between mackerel density and each of the covariates both for the pooled across years dataset and for each year separately (Supplementary Figures S3-S10). The final dataset used for modelling comprised mackerel density and occurrence as the response variables and the following covariates: "depth," mean temperature (T50), and salinity (S50) at the upper 50 m of the water column (considering them representative of the layer where mackerel is mainly found, Nøttestad et al., 2016a), logtransformed mean monthly chlorophyll a ("chl a"), mean (at the station-level) log-transformed NASC for herring (HER_NASC), log-transformed mesozooplankton density ("plankton"), longitude and latitude [converted to UTM Zone 29 coordinates (units in km)], and the SSB of mackerel (ICES, 2017). Additionally, the interactions between each of longitude and latitude with SSB and the interaction between T50 and plankton were included, to explore whether these interactions affected the occurrence and/or density of mackerel. Finally, T50 entered the model as a quadratic term, to allow for a temperature optimum. This was based on Olafsdottir et al. (in press) and on the data exploration carried out, revealing a dome-shaped pattern between density and T50 (Supplementary Figures S3-S10). Remaining covariates (e.g. distance from shore, hydrological parameters at different depths, PED) where not considered in any of the models due to collinearity with the covariates included. All abovementioned covariates were standardized prior to inclusion in the models to facilitate numerical computations and entered as fixed effects. Standardization consisted of centring and scaling of the covariates, i.e. subtraction of the mean and division by the standard deviation.

Models tested
Prior to fitting the Bayesian hierarchical models, a generalized linear model (GLM) was used to find which distribution best fits the positive mackerel density data. The following model was therefore fitted to the data after excluding all zero-catches data where g is an appropriate link function, m is a distribution-specific mean parameter, Z j is a vector comprising an intercept and all the covariates considered for each year j, and b is the corresponding coefficient vector (i.e. regression parameters). Two options were tested, namely the gamma and the log-normal distributions, commonly used in fisheries catch data (Maunder and Punt, 2004). The deviance information criterion (DIC, Spiegelhalter et al., 2002) was used as a metric of goodness of fit, and it suggested that for all years but 2011 the gamma distribution better described the observed data distribution, so this was selected for further modelling of mackerel positive density values (Supplementary Table S1). However, the relatively high percentage of zero-catch mackerel hauls during the IESSNS ($13-20%, Table 1), dictates for an integrated approach for handling both occurrence and positive density data, e.g. a hurdle (or delta) model. A hurdle model for continuous data is a two-part model that specifies one process for zero observations and another process for positive values, where covariates can be added on both parts, i.e. on both occurrence and positive density data (Maunder and Punt, 2013). We used a Bayesian hierarchical modelling framework for mackerel densities, y i,j , at location i in year j. Following Quiroz et al. (2015) we have the formulation: where p(y i, j jl i;j ; /) is a zero-inflated gamma density, with p i,j being the probability of mackerel absence and l i,j is the conditional mean, given that y i;j > 0. We use the same parameterization of the gamma distribution, with precision parameter /, as Quiroz et al. (2015). log it p ð Þ ¼ log p=ð1 À pÞ f is the link function connecting the linear predictor g i,j (1) to the probability of mackerel absence (p i,j ); log is the link function connecting the linear predictor g i,j (2) to the mean m i,j; b (1) and b (2) are coefficient vectors (or regression parameters) of the Z where the coefficient q is the autocorrelation parameter (scalar quantity), and the innovation term u i;j is a spatial GMRF with a Matérn covariance function (Lindgren et al., 2011). The Matérn covariance function has two hyperparameters: the marginal standard deviation r and j, which controls the wiggliness of the field. Table 1. Number (N) of pelagic trawls considered in the present study, respective number (and percentage of hauls) without mackerel caught per year and range of the mackerel positive density and the covariates included in the models (excluding SSB). The latter parameter determines the correlation range r, i.e. the distance at which the correlation drops to 0.1, through the approximate formula r ¼ ffiffi ffi 8 p =j. For q ¼ 0, Equation (7) results in uncorrelated annual GMRFs, i.e. GMRFs are based on the same hyperparameters, while the model takes into account the annual aspect of the data).
The hurdle models tested were the following: M1 where no spatiotemporal GMRF is included (essentially reducing to a gamma hurdle GLM). M2 where two independent purely spatial GMRFs (f i 1 ð Þ and f i 2 ð Þ ) are included (one for each linear predictor). In this case, data are pooled (i.e. no annual structure is considered) to test whether a single GMRF captures the residual variance in the data for all years. M3 where a shared, purely spatial GMRF, instead of two independent, is included, i.e. f i In this case also, no annual structure is considered, and all the data are pooled across years [Equation (5)]. The model also involves an additional hyperparameter called w, which is unknown scale parameter that controls the magnitude of the spatial terms in g i (1) in comparison with the spatial term in g i . The second linear predictor results into g i M5 where a spatial GMRF is included in only the linear predictor for occurrence data f i 2 ð Þ ¼ 0: M6 where two independent spatiotemporal GMRFs are included (one for each linear predictor) in each year of observations, with no temporal correlation (q ¼ 0), i.e. f i;j ¼ u i;j . The resulting annual GMRFs are independent both between the linear predictors (not shared GMRF) as well as between years.
M7 where a shared spatiotemporal GMRF is included in each year of observations, with no temporal correlation (q ¼ 0), i.e. f i;j ¼ u i;j . The resulting annual GMRFs are independent only between years, not between linear predictors (shared GMRF). M8 where two independent AR1 spatiotemporal GMRFs are included (one for each linear predictor). M9 where a shared AR1 spatiotemporal GMRF is included (the AR1 equivalent of M6).
The complete formulation of the models tested can be found in Table 2. Furthermore, in all cases, models were also run separately, i.e. not jointly as a hurdle model, but as separate models for (a) occurrence and (b) for positive mackerel densities, respectively, to assess whether the fit was better, using DIC as the goodness-of-fit metric.

Inference and goodness of fit
INLA calculates marginal posterior distributions of all fixed effects, random effects and hyperparameters included in a model. From the available options in R-INLA for approximation of the posterior marginal distributions, we used the "Laplace," which is considered the most accurate one (Martins et al., 2013). We also used the recommended "PC-priors" (Simpson et al., 2015) to construct a Matérn SPDE model, characterized by spatial correlation range r and standard deviation parameter r, with probability P(r < 100 km) ¼ 0.05 and P(r > 10) ¼ 0.05. Finally, an integrate-tozero constraint was applied on the SPDE model (Lindgren and Rue, 2015). The mesh required to apply the SPDE approach is shown in Figure 2. It defines the spatial domain of interest and is used to build the GMRF. To achieve this, R-INLA functions were utilized to create a constrained refined Delaunay triangulation, using as spatial domain of interest the polygon defined between land masses in the wider area of the Nordic Seas (white area in Figure 2). Table 2. Formulations of the models tested.

Model
Linear predictors Hyperparameters GMRFs between linear predictors Temporal structure between GMRFs Yes [conditional autoregressive of order 1, (AR1)], annual data  To facilitate the mesh construction, the 110 m resolution land polygon obtained from a free online source (http://www.natural earthdata.com/http//www.naturalearthdata.com/download/ 110m/physical/ne_110m_land.zip) was smoothed, avoiding the excessive details of the coastline and excluding the Baltic Sea (by masking it as land). This polygon was used as a boundary in the model. Finally, an outer extension was included to avoid the "boundary effect" (increased variance at borders, Lindgren and Rue, 2015). Different mesh designs (not shown here) were evaluated to investigate their effects during model selection.
The goodness of fit of the tested models was assessed with the DIC, the accuracy rate and the root mean squared estimation error (RMSEE). The accuracy rate was defined as the % sum of observations estimated as present, when they are actually present, and of observations estimated as absent, when they are actually absent. Predicted presence or absence were defined as probabilities > or <0.5. RMSEE was calculated as the square root of the sum of squared differences between positive ðy) and fitted (ŷ ) mackerel density values divided by the number of observations (n obs ).
Generalized additive models (GAMs, Hastie and Tibshirani, 1990) is a common way of analysing ecological data, where a smoothing function on the geographical coordinates can be included. We applied a binomial and a gamma GAM to the occurrence and the positive density observations, respectively, to explore whether they fitted the data better than the Bayesian models. In both GAMs the same set of covariates as in M1 were included, plus a thin plate regression spline smoother on the geographic coordinates, also considering the annual structure of the data, by including the SSB (collinear to YEAR) in the "by" argument of the "gam" function. In this way, the data were treated in a way similar to M6, i.e. as annual observations without any temporal autocorrelation.

Results
Mackerel positive density, as inferred from the IESSNS catches during the study period, ranged from 11 to 654 702 N nmi À2 (Table 1), with median values, however, ranging from 5087 to 23 074 N nmi À2 (Supplementary Figure S2). The catches were taken in areas with bottom depth ranging from $60 m to >3500 m. The lowest T50 was -0.5 C and the highest 12.7 C with relatively few stations with T50 < 5.0 C. S50 was 35.0 6 0.5 with only few stations having S50 < 34.0. Plankton density ranged from 0.01 to 102.40 g m À2 and chlorophyll a mean concentration from 0.16 to 7.74 mg m À3 throughout the study period. The calculated HER_NASC per pelagic trawl station was usually low, although high values were occasionally calculated reaching a maximum of 3489 (in 2015, Table 1, Supplementary Figure S2). Finally, the SSB, as derived from the latest stock assessment for the species, ranged from 3.5 to 4.4 million metric tonnes (Table 1, ICES, 2017).
The best among the tested models was formulation M6, for which occurrence and positive density data where considered as two separate processes (M6S), and the spatial distribution was considered uncorrelated between years (Table 3). The hurdle version of this model (M6H), had poorer performance (dDIC ¼ 12.298) but was better than other hurdle configurations (i.e. M1H-M9H). M6S also had borderline higher accuracy rate than M6H, but somewhat lower RMSEE (Table 4). For the purposes of the present study we focus on the goodness-of-fit of the selected model rather than its predictive power, hence our results are based on the models of M6S formulation. From our results, it was also obvious that models including spatial GMRF(s) (M2-M9) always outperformed the model without one (M1), as well as  that to obtain a better fit the annual character of observations needs to be taken into account (higher dDIC of M2 compared with M6-M9; Table 3). Interestingly, the positive density part of the M7H model produced a lower DIC than the "M6S-Positive density" (dDIC ¼ -29.775), however the overall DIC of this hurdle model formulation was higher than the M6H, largely due to the worse fit of the occurrence part of the model (dDIC ¼ 95.857 compared with "M6S-Occurrence," Table 3). Finally, the tested GAMs always had poorer fit compared with the Bayesian models fitted on non-pooled data, i.e. M6-M9 model formulations (Table 4). Fitted vs. observed positive densities of mackerel are shown in Figure 3. The positive density model (M6S-Positive density) performed well for high mackerel densities (density $>10 000 N nmi À2 ), but overestimated densities below this level. The mean and standard deviation of the annual spatial GMRFs for the occurrence and the positive density models are shown in Figure 4. The best models included a different realization of the GMRF for every year of observations, resulting to different spatial effects from year to year (Figure 4). The spatial residuals for the positive density data model can be found in Supplementary Figure S11.
The occurrence of mackerel was better explained when data where considered as uncorrelated annual observations and a spatial GMRF was included in the model, i.e. M6S formulation (Figure 4). The intensity of the GMRF of the "M6S-Occurrence" was able to adjust the model fit to accurately reflect the observed presence or absence of mackerel in the area (Figure 4), resulting also in a very high accuracy rate (Table 4). The uncertainty around the mean of this GMRF was high, mainly in marginal areas, where the observations throughout the years were scarce. For example, observations in the northern part of North Sea were only recorded in 2013. The model accordingly provides results for the GMRF in this area for all years, but accompanied with high uncertainty for the years without actual observations in that area (Figure 4). Finally, the occurrence model fitted values with their associated uncertainty in comparison with the observed presence/absence of the species are shown in Figure 5. The model accurately captured the observed presence/absence registrations of mackerel, with generally low uncertainty, that was more pronounced in contrasting areas, i.e. areas where zero-catches were consistently observed (e.g. south of Iceland in 2017 or at the periphery of the surveyed area; Figure 5).
The intensity of the GMRF of the "M6S-Positive density" contributed substantially in explaining the observed catch rate patterns, capturing the areas where high and low catch rates of mackerel were recorded (Figure 4). The uncertainty around the mean of the positive density GMRF was also relatively high, highlighting the need for the GMRF to adjust the linear predictor so as to provide a better fit to the data. The fitted values along with their associated uncertainty estimates for the positive densities model are shown in Figure 6. Overall the model captured well the registered catch rates for values $>10 000 N nmi À2 but Table 4. Accuracy rate (occurrence data) of the models tested (see Table 2   overestimated those below this point. The associated uncertainty for the fitted positive densities of mackerel was generally high, and especially for higher catch rates (e.g. >100 000 N nmi À2 ; Figure 6). For the occurrence model, the set of covariates contributing positively and significantly [i.e. their 95% credible intervals (CI) did not include zero], comprised longitude, T50, HER_NASC, S50 and plankton ( Table 5). Given that the covariates enter the model standardized, the magnitude of the estimated coefficients is direct measures of the effect on the prediction. Temperature response for the occurrence model, when the rest of the model covariates were kept fixed at 0 (as they entered the model standardized), followed a logistic curve that initiated at $2.5 C and reached a plateau at 10 C, after which the probability of occurrence slightly decreased (Figure 7).
For the positive density model, the positively contributing covariates, in order of direct effect on the prediction, were T50, longitude, depth, chl a, HER_NASC and plankton. Latitude and its interaction with SSB were the covariates contributing negatively to the linear predictor, suggesting that for increasing SSB the density of fish was lower with increasing latitude, i.e. fish densities in northern marginal areas remain low even when SSB increases. Temperature response for the positive density model, when the rest of the model covariates were kept fixed at 0 (as they entered the model standardized), also followed an increasing pattern, initiating at $2.5 C, reaching a maximum almost at 10.0 C and afterwards decreasing rapidly up to the point where positive densities were recorded (12.3 C, Figure 7).
The range at which the correlation of the spatial GMRF dropped to $0.1 for the occurrence data was estimated at r occurrence ¼ 1265 6 314 km, whereas for the positive density data at r positive density ¼ 530 6 112 km (Table 5). The range for the GMRF of the positive mackerel densities is $500 km, which implies that mackerel density (when mackerel is present) depends on its neighbour observations up to this distance.
The inclusion of spatial GMRFs in the occurrence and positive density models (with or without temporal correlation) led to increased CI of the fixed effects, as can be seen by the comparison of the fixed effects' means and 95% CI between models M1S, M2S, M6S, and M8S ( Supplementary Figures S12 and S13). It can also be seen that absence of the spatial covariance structures from a model, can lead to erroneous inclusion or omission of covariates in the models.

Discussion
Fish distribution is influenced by a number of biotic and abiotic factors, the latter including intra-and interspecific mechanisms (e.g. Planque et al., 2011). In our study, mackerel occurrence in its summer feeding areas in the Nordic Seas was found to be correlated with temperature (T50). Additionally, salinity (S50), plankton, the proxy of herring abundance (HER_NASC), and the longitude were also significantly correlated with the occurrence of the species. These covariates, excluding salinity, were also correlated with the density of mackerel, but were complemented by depth, chl a, the latitude and the interaction between the SSB and latitude.
For both the occurrence and positive density models, the significant covariates were largely consistent irrespectively of the  addition of a temporal autocorrelation structure or not, i.e. M6 vs. M8 model configurations. In particular, for the occurrence model, S50 was not found to be significant for the model M8 compared with M6, whereas the interaction of T50 with plankton was significant, but with a very low overall contribution. Similarly, for the positive densities model S50 entered the M8 configuration significantly, whereas HER_NASC and the geographic coordinates did not. Again, the changes observed resulted in very low contribution of S50 that entered the model (low estimated coefficient). Regarding the covariates in M8 that did not enter the models, HER_NASC had low contribution in M6, so it's inclusion in M8 did not result into big changes in the overall performance of this model and the non-significance of the geographic coordinates was ameliorated by the GMRF that had its intensity adjusted accordingly. Naturally, in models that the data were pooled across years, larger differences were observed regarding the contributing covariates. Temperature (T50) had a strong effect in both the occurrence and the density of mackerel. On the basis of the occurrence model results given a quadratic response of T50, a 50% probability of presence for mackerel is predicted at $5.0 C, when all other main effects are fixed to their mean. In the same way, our results from the density model predict a peak for optimal temperature for the species at almost 10.0 C. The highest T50 where mackerel was captured (Bachiller et al., 2018) was at 12.3 C, with T50 > 12.3 C recorded only at four stations (zero mackerel catch hauls). This lack of sufficient number of observations for T50 > 12.3 C, does not allow the models to predict values beyond that point. Hence, the estimated upper thermal limit where mackerel density is non-zero in the study area and period is at 12.3 C, with 97% of positive observations resting between 5.0 C and 12.0 C, whereas 73% resting between 8 and 12.0 C (Figure 7). However, the shaping of the curve if forward projection was to be done, would result to an upper thermal limit of approximately at 15-16 C. For the occurrence model, although the same inability of prediction beyond the highest recorded temperature (12.7 C) holds, there seems to be an initiation of dropping for the probability of occurrence at very high temperatures (Figure 7).
The known thermal preference, i.e. increased density, of adult mackerel in this area, ranges from 9 to 13 C Jansen et al., 2016;Olafsdottir et al., in press). It is, however, clear that records (trawl hauls) at higher temperatures are needed to obtain a full picture of the thermal preference range of mackerel in the Nordic Seas, as it is obvious from our results and from other studies that the tolerable thermal range for the species in the area is wider (7-15 C, Olafsdottir et al., in press, 50% probability of occurrence at >5 C, this study). The small differences in thermal preference range observed in the present study compared with that of Olafsdottir et al. (in press) can be attributed to the different temperature used as input to the models. Olafsdottir et al. (in press) used temperature at 10 m depth to represent the ambient temperature in the surface mixed layer, considering the latter as the layer occupied by mackerel. In the present study, a wider layer was considered, based on the species' vertical distribution, as suggested by Nøttestad et al. (2016a) (surface down to $40 m depth).
Temperature is also known to influence mackerel large scale distribution in areas outside the Nordic Seas as well (Radlinski et al., 2013;Giannoulaki et al., 2017). Unfortunately, the results from these studies are not directly comparable to ours, as the authors use satellite-derived Sea Surface Temperature measurements, even though the vertical distribution of mackerel in these studies is not near the surface [Radlinski et al., 2013: bottom trawl data, Giannoulaki et al., 2017: acoustic data and pelagic hauls for mackerel juveniles, with schools distributed in depts. >50 m (pers. com.)].
Salinity (S50) was only found to be important in the occurrence model, having a positive linear effect to the presence of the species. This is an expected result to some extent as the bulk of mackerel is mainly found in the wider area of the Norwegian Sea, Table 5. Parameter estimates [mean, standard deviation (r) and 95% credible interval (CI)] in log-domain for the fixed effects, the precision parameter (/) of the gamma distribution, the range (r) and r of each GMRF, included in the final models (M6S); Covariates entering the model, i.e. zero is not included in the 95% credible intervals, are shown in bold.

NEA mackerel distribution
where the warm Atlantic water that flows into it, is of high salinity compared with colder polar waters (Skjoldal, 2004) with low occurrence of mackerel. As stated earlier, S50 range was relatively narrow (35.0 6 0.5) with only few stations having S50 < 34.0, thus revealing mackerel's avoidance of areas of low salinity. Prey fields are also an important factor determining the distribution of pelagic fish (Secor, 2015). Despite the high diet overlap with herring, and the lower overall with blue whiting (Bachiller et al., 2016), prey availability is not a factor hindering the coexistence of these three stocks. Recently, a bioenergetics modelling study (Bachiller et al., 2018) showed that at least in the Norwegian Sea, which is considered the traditional core distribution area of mackerel in the summer Nøttestad et al., 2016c), the planktonic prey biomass is able to sustain the observed biomass of all three main pelagic species.
Our modelling results showed that, within the range of examined plankton biomass values, in locations where plankton values were higher, both occurrence probability and density of the species were increased. There are some fundamental issues to consider, however, when using sampled plankton biomass as an index of food availability. One of these issues is that the sampled quantity does not necessarily reflect the biomass in the area prior to the fish arrival (when considering pelagic migratory fish) and thus, it cannot be stated that increased plankton biomass acts as a cue that attracts fish. Because of feedback mechanisms between predator and prey, the sampled plankton might well be what is available in the water after fish have already preyed upon it. For this reason, in our models we complemented the plankton dry biomass with mean chl a concentrations, over the month that sampling took place in order to obtain a prolonged view of the food availability conditions, considering chl a as a good proxy of mesozooplankton biomass. This revealed that chl a was important for the density part of the model, suggesting that it is only density that is positively influenced by elevated chl a concentrations and not the occurrence of the species.
Another problem related to the information provided by the mesozooplankton samples, is that of catchability. Given that mackerel preys mainly upon large calanoids and euphausiids (Bachiller et al., 2016), the use of the WP-2 sampler can underestimate the abundance of these larger-sized mesozooplankton, as they are avoiding the net, especially during vertical tows (Gjosater et al., 2000). Other samplers (e.g. Bongo-net, MOCKNESS) with oblique tows could alleviate that problem (Gjosater et al., 2000), but again such samplers are not always easy to deploy from rented commercial vessels that lack the appropriate necessary equipment (and are routinely used in research surveys, also the current one). A solution to this problem could be the correlation of model-derived planktonic fields that are based on realistic estimates of production rates, with observed fish distributions, as in this way the predation effect would not be masking the observed patterns. Alternatively, given the advancements in the scientific echosounders, information on planktonic biomass in high spatial resolution can be obtained and subsequently used in spatial models. In this way, prey fields will be more accurately described and correlated with fish distribution. Unfortunately, the snapshot view obtained by sampling plankton cannot reveal whether planktonic biomass is actually a driver of the distribution of its pelagic fish predator, but only act as a proxy.
Another interesting result was the lack of significant effect of the interaction of temperature and planktonic biomass on either the occurrence of mackerel or its density. In fish, the relationship between food intake and temperature increases until it reaches a peak, after which decreases more or less rapidly, when supraoptimal temperatures are met (Jobling, 1998). The rapid decrease in food intake in higher temperatures could be related to limitations in oxygen delivery to tissues, due to lower oxygen concentration (and thus availability) in the water, under conditions of very high oxygen demand (Jobling, 1997); conditions that are usually met during searching, capturing and handling of prey. Mackerel's feeding activity is more pronounced in colder waters, increasing both its feeding incidence and stomach fullness (Bachiller et al., 2016), but in our case we didn't find any results supporting that temperature in the upper water column and plankton interact in such a way that influences the species distribution pattern. Again, the uncertainty involved in plankton biomass sampling could also mask this interaction between temperature and prey biomass. It could well be that planktonic biomass is enough to cover mackerel's dietary needs in the study area, so such an interaction would not be directly evident. Information on the diet of the species (prey composition and/or stomach content weight) can facilitate addressing such questions, as it can disentangle the effect of each covariate (temperature and plankton). Unfortunately, such data were not available for the whole area and time period of our study, thus could not be included in our models.
Given the known high degree of spatial overlap between herring and mackerel in the Norwegian Sea during the summer feeding season , which can be expected to increase the potential for resource competition, we chose to include a proxy of herring abundance in our models in order to explore its impact on mackerel's distribution. However, differences in their vertical positioning, as herring is being generally found in greater depths than mackerel , allows the exploitation of additional prey resources by herring (Bachiller et al., 2016). Additionally, earlier modelling work suggests that not only mackerel and herring, but also blue whiting, can all coexist regardless of their high abundance, zooplankton consumption rates and overlapping diet in the Norwegian Sea (Bachiller et al., 2018). Our statistical modelling results indicate that there is a positive effect of the proxy of herring abundance on both the occurrence (more pronounced) and the density (less pronounced) of mackerel, i.e. the two species co-occur spatially. The signal of this effect might actually be even stronger than our model states, as there are occasions where herring is distributed in the "acoustic blind zone" of the echosounders of the vessels (i.e. in the upper 10 m or so, above the depth of the hull-mounted transduces), resulting into zero acoustic registrations in the area, although they can be present in large quantities in the pelagic trawl catches (Nøttestad et al., 2016b).
Using information on mackerel's stock size from the latest assessment (ICES, 2017) we found no direct effect of SSB, suggesting that the fluctuations of the stock size in the last seven years were not directly correlated with the species' occurrence or density in the area. This result is contradictory to the findings of Olafsdottir et al. (in press) that identified stock size as one of the factors that explained the species expansion in the area, outside its core feeding area (the Norwegian Sea) in recent years, however not unexpected given the different time series used in the two studies. In the present study, years from 2011 to 2017 were used, a period when mackerel had already expanded its distribution in the Nordic Seas compared with the past (for details see Utne et al., 2012), whereas Olafsdottir et al. (in press) utilized a different and slightly more extended period [eight years (2007, 2010-2016)]. Especially in 2007, mackerel was still mainly located in the Norwegian Sea, whereas from 2010 onwards increased catch rates during IESSNS were obtained outside of this core area (Nøttestad et al., 2016c). Indeed, the different time series reflect a strong contrast in mackerel's SSB variability between these two different data series, with the prolonged one having 85% variability, whereas the series used in the present study only 19% (ICES, 2017), thus a possible explanation for the lack of effect of SSB on mackerel's occurrence and/or density.
Additionally, our results showed that the inclusion of a spatial random effect (GMRF) increased the credible intervals of the covariates included compared with the models where a GMRF was not included (Supplementary Figures S12 and S13). This difference in the modelling approach might have caused SSB to be identified as a significant factor in the study of Olafsdottir et al. (in press), although it might actually not be one, as the absence of spatial covariance structures from a modelling approach, can lead to erroneous inclusion (or omission) of covariates. On the other hand, the significant negative interaction of SSB and latitude found in our study with the density of the species suggests that SSB might have a more complex role concerning the distribution of the species than simply a linear effect. During years of increased SSB, the expectation would be that mackerel would increase its density in marginal areas (including those in the north of the study area) and the opposite during years of low SSB, if the conditions (e.g. food availability) were not adequate for optimal growth. We clearly did not observe anything like this in our results. Latitude as a main effect had a negative response with density, whereas longitude had a positive response for both occurrence and density. During 2011, the survey coverage in the northern part of the Norwegian Sea as only one vessel participated from the Norwegian side (Nøttestad et al., 2011). Moreover, from 2015 to 2017 both the northward and westward area occupied by the stock during IESSNS was found reduced compared with its peak in 2014 (Nøttestad et al., 2016c;ICES, 2017;Olafsdottir et al., in press).
The above suggest that the core of the occurrence and density of mackerel in the area remained in the eastern part of the study area, i.e. mainly in the Norwegian Sea. Also, the densities observed in the northern part of the study area were smaller than in the southern part; something that could however also be related to the stock size (SSB) of mackerel (see further up for discussion regarding the interaction between latitude and SSB). These locations, i.e. roughly between 62 -72 N and 15 W-10 E, designate mainly deep areas with higher densities of mackerel, something also captured by our density model (significant credible intervals for the depth) and also observed in fishery catch data from which it was found that catches have moved further offshore, to deeperwater areas (Hughes et al., 2015), although the latter study extends only up to 2013. However, the obvious (from the registered observations during sampling) westward and northward expansion of the species, can only be captured efficiently by the models when the GMRFs are included. There is a large number of variables that can influence the distribution and density of a species (Planque et al., 2011), especially a highly migratory one such as mackerel. In most of the cases, the routinely recorded variables during surveys (e.g. hydrography, planktonic prey concentrations, etc.) have proved to be poor predictors of the observed variability in density distributions. Hence, we need to re-evaluate what the driving factors on a species distribution could be and adjust our sampling schemes accordingly.
Our analysis showed that the occurrence and the density of mackerel in the study area and period were better described if two separate models were fitted to the data instead of a joint one (i.e. hurdle model). This is also supported by the different covariates that were identified as important during model fitting process. To our knowledge, comparing the results of a joint (hurdle) model and separate models for occurrence and density (continuous response) in fisheries utilizing the R-INLA framework, has only been done for Peruvian anchovy acoustics data where a joint model was found to better fit the data (Quiroz et al., 2015). In other fisheries-related studies, occurrence only (i.e. binary response, Muñoz et al., 2013;Pennino et al., 2013;Roos et al., 2015), abundance/biomass data only (Grazia Pennino et al., 2014;Cosandey-Godin et al., 2015;Paradinas et al., 2016;Boudreau et al., 2017;Carson et al., 2017;Fonseca et al., 2017;Rufener et al., 2017) or both occurrence and abundance but treated as separate processes (Paradinas et al., 2015) have been used, but a hurdle model was not tested.
Our results also indicated that a shared GMRF between each set of data, i.e. occurrence and positive density, was not enough to describe the latent process and two separate GMRFs performed better in fitting the data. Additionally, the magnitude of the posterior means of the GMRFs, suggested that the latent processes operating on the occurrence and the density of mackerel in the area, are important in describing the observed patterns as their inclusion improved significantly the models tested. Given the presence of some extreme values in the dataset (e.g. densities > 400 000 N nmi À2 ), combined with their sporadic spatial distribution, it would be unrealistic to expect the models to be able to capture these values with high accuracy, even with the inclusion of a GMRF. The adjustments on the response variable by the inclusion of the GMRF in the model, are relative to the neighbouring observations and if such extreme values were to be captured by the model, it would inevitably result to overfitting. The same sporadic distribution of the lower end density values, is probably accountable for the similar inability of the GMRF to adjust the model's predictions without ending up in overfitting.
The observed annual differences in both occurrence and density were better explained if the included GMRF was considered independent between years, i.e. when the temporal autocorrelation process (AR1) was zero. This is an indication that both the occurrence and the biomass distribution of the species varies from year to year without any "population memory" effects from previous years (ICES, 2007;Planque et al., 2011). Given, however, the good performance of also the models that included a temporal autocorrelation process in the GMRF (e.g. M8S), the possibility for population memory effects in mackerel cannot be excluded.
Persistency in the spatial distribution of fish populations in feeding or spawning areas is not uncommon (e.g. Paradinas et al., 2015;Boudreau et al., 2017;Carson et al., 2017) and our results suggest that mackerel could also be exhibiting such behaviour, especially when it comes to the occurrence of the species, when visiting its summer feeding areas in the Nordic Seas. When considering the AR1 models (e.g. M8S), the autocorrelation parameter q was found to be 0.76 and 0.82 for the occurrence and the density data, respectively. This could suggest persistency in the locations where mackerel is present or absent, as well as persistency of the locations where low and high densities are identified. The small interannual variability in the distribution of the species density observed during the study period, suggests that the underlying mechanisms that influence this distribution, remained relatively constant, something also obvious from the individual relationships between the covariates used (mainly those entering the models in a significant way) and the density of mackerel ( Supplementary Figures S4-S10). However, the short timeseries used in the present study (7 years), might be one reason why the AR1 models performed worse than the models that consider no temporal autocorrelation in the underlying stochastic process, so this is something that would need further investigation when more data are accumulated in the future.
The range and the variance of the spatial field are two useful characteristics that allow for interpretation of how the latent process operates in space. The range r (also referred to as connectivity) for the occurrence data was estimated to be at 1265 km. This is interpreted as the covariance decaying more slowly in space for the occurrence than the covariance of the density GMRF whose estimated mean was at 530 km, i.e. the pattern is similar over a larger distance for occurrence than it is for density. Moreover, with a spatial variance (r of the GMRF) of 5.7 compared with 1.02 of the density GMRF, the amplitude of the spatial pattern changes more drastically for the occurrence than for the density GMRF. The estimated ranges are large for both the occurrence and the density of the species but it is crucial to recall the extensive area the species occupies when present at its summer feeding areas.
Conclusively, our models revealed that the geographical distribution of mackerel in its summer feeding grounds in the Nordic Seas was better predicted when additional variables to the traditionally measured ones (e.g. temperature and prey biomass) were used. Inclusion of proxies of competition with other co-existing species, i.e. herring, and long-term food availability indices (chl a), improved the models substantially, signifying that interspecific ecological interactions and non-snapshot views of predator-prey dynamics are necessary to better understand the spatiotemporal distribution of species. Moreover, the improvement of the models when a the latent process was included, suggests that the ecological interactions regulating mackerel's distribution are such, that our current sampling methods and recorded variables are lacking the explanatory power needed to provide robust predictions. In this view, statistical models, as the ones presented in this study, can help highlight such inefficiencies, thus stimulating research towards mechanistic understanding of processes in the marine environment and improved sampling efforts. Albeit limitations, the advantage of the sampling methodology (i.e. pelagic trawling), is that it allows more accurate recordings of the environmental conditions met by mackerel compared with other approaches that utilize bottom trawl data (Radlinski et al., 2013). In this sense, continuation (to obtain long time-series), improvement (by measuring additional covariates of interest, e.g. light and oxygen levels) and further analysis of already collected data (e.g. taxonomic analysis of mesozooplankton samples) of largescale surveys such as the IESSNS, could significantly facilitate our understanding of the mechanisms driving important pelagic fish species distribution patterns.

Supplementary data
Supplementary material is available at the ICESJMS online version of the article.