Evidence for a decline in the population density of Antarctic krill Euphausia superba Dana, 1850 still stands. A comment on Cox et al.

Antarctic krill (Euphausia superba Dana, 1850) exemplifies the key role of marine crustaceans in fisheries, foodwebs, and biogeochemical cycles. Ecological understanding and policy decisions require information on population trends. We have therefore worked with international colleagues to publish KRILLBASE, a database of fishery-independent krill population information for every decade since the 1970s. These data were used by Cox et al. (2018) who dispute the evidence for a late twentieth-century decline in krill density (number per unit area) in the Southwest Atlantic sector of the Southern Ocean and claim to overturn “much of recent thinking about climate-driven change in krill populations.” They support this claim with an analysis which reaffirms one non-significant result from an earlier paper but does not challenge the five significant results from that paper or those of other studies which support a decline. In this comment we examine the methods which led Cox and coauthors to conclude that krill density has been stable over the last 40 years. Although these authors provide a potentially useful approach, we show that their analysis was biased by the exclusion of usable net types, the inclusion of negatively biased data and down-weighting of high densities in the early part of the analysis period, the absence of recent data from the north of the sector, and a lack of statistical hypothesis testing. These factors maximise the chances of failure to detect a real decline. To aid future analyses we provide recommendations to supplement those which accompany KRILLBASE. We also suggest the need for consensus scientific advice on krill population dynamics based on agreed standards of evidence, evaluation of uncertainty, and a thorough understanding of the data. This will be more useful to policy makers and other stakeholders than polarised opinions. Meanwhile, the evidence for a decline in krill density still stands.


INTRODUCTION
Marine crustaceans provide a variety of important ecosystem services, several of which are exemplified by Antarctic krill (Euphausia superba Dana, 1850) (Grant et al., 2013). In its main population center in the Southwest Atlantic sector of the Southern Ocean, Antarctic krill is the main prey of whales, penguins and seals, and of commercially fished species such as mackerel icefish (Champsocephalus gunnari Lönnberg, 1905). Antarctic krill is also a fishery target species, accounting for 85% of the total fishery catch by weight in the Southern Ocean, and it plays important roles in carbon and iron cycling (Gleiber et al., 2012; Schmidt et al., 2016). The ability of Antarctic krill and other marine crustaceans to provide such ecosystem services may be affected by policy decisions concerning, for example, fishery catch limits. Scientists can influence policy decisions by supplying advice and information, including about the status of relevant crustacean populations and how they change over time.
Approximately 3,500,000 km² of the Southwest Atlantic sector is open to krill fishing. Annual mesoscale (≤ 125,000 km²) acoustic surveys conducted since the 1990s monitor krill biomass in about 5% of this area, and two large-scale (471,000 km² and 2,065,000 km²) surveys were conducted in 1981 and 2000. There is also an ecosystem monitoring program, which was established in 1987 and aims to detect changes in "critical components of the ecosystem," namely penguins, seals, and albatrosses that feed on Antarctic krill (Agnew, 1997). This monitoring provides partial information on krill population status over the last two to three decades, but the importance of Antarctic krill suggests that additional information over longer timescales and larger spatial scales is also relevant. We have worked with international colleagues to compile and publish KRILLBASE, a repository of data on numerical density (the number of krill per unit area of sea surface, hereafter density) resulting from scientific net surveys conducted in the 1920s and 1930s, and from the 1970s onwards (Atkinson et al., 2017). By making these data publicly available, and providing detailed information about their origin, use, and limitations, we aim to facilitate the provision of information to scientists, policy makers, and other stakeholders.
A recent paper published in the Journal of Crustacean Biology by Cox et al. (2018) dismisses previous evidence of a late twentieth-century decline in krill density in the Southwest Atlantic sector (e.g. Atkinson et al., 2004, 2014; Forcada & Hoffman, 2014; Loeb et al., 1997; Watters et al., 2013), and "paradigms that underlie much of the recent thinking about climate-driven change in krill populations," arguing instead that krill density was stable between 1976 and 2016. Cox et al. (2018) use KRILLBASE data to support their arguments but we show here that their approach contains multiple errors. It also relies on subjective interpretation rather than statistical hypothesis testing. This combination of factors led Cox et al. (2018) to a conclusion which is likely to be erroneous. We also show that Cox et al. (2018) made significant errors in their representation of a key analysis of the KRILLBASE dataset (Atkinson et al., 2004). Propagation of such errors in the literature will reduce clarity about the status of the krill population and the value of the data. By responding to Cox et al. (2018) we aim to identify these errors and provide recommendations which will enable readers to avoid repeating them.
Although Cox et al. (2018) directly dispute the findings of Atkinson et al. (2004), they did not analyse the same dataset or the same timescale. Atkinson et al. (2004) analysed the period 1976-2003 and Cox et al. (2018) analysed 1976-2016. Atkinson et al. (2004) analysed two independent sets of post-1976 krill density data (large nets with nominal mouth area ≥ 3 m² and all smaller nets) and applied three separate analyses to each (Table 1). Of these six regression analyses, five supported a widespread decline in density with P values < 0.05. The remaining analysis, a linear mixed model analysis of large nets, also had a negative slope but a non-significant P value (0.12). Cox et al. (2018) used a two-stage mixed model and analysed only data from large nets. The analysis of Cox et al. (2018) is therefore similar, but not equivalent, to the one analysis in Atkinson et al. (2004) which gave a non-significant result. We contend that reaffirming this non-significant result does not challenge any existing paradigm.
Since the Southwest Atlantic sector has warmed rapidly over the last century (Whitehouse et al., 2008) and Antarctic krill is an important species, clear information on its population status and trends is a major requirement for policy makers and scientists alike. Our overall aim in this comment is to suggest a collaborative approach which will allow the scientific community to provide information based on agreed standards of evidence and a thorough understanding of the data.

DATA
The current version of KRILLBASE (Atkinson et al., 2017) is considerably expanded to include previously unavailable records for the period analysed by Atkinson et al. (2004) and new data covering the period 2004-2016. The diverse range of sampling methods used to collect these data, their patchy distribution in space and time, and the high level of temporal and spatial variability in krill density pose challenges for their analysis. Subsequent to the Atkinson et al. (2004) study, a statistical standardisation process has been developed to account for methodological differences between records (including net size, sampling depth, time of day, and time of year). The functions used in this standardisation, accompanying sensitivity analyses, and guidelines identifying key bias issues are published in the literature (Atkinson et al., 2017), while the database itself includes warnings about problematic data (Atkinson et al., 2017). Cox et al. (2018) analysed data from the current version of KRILLBASE but rejected 19% of the 7,075 unique krill records available for the Southwest Atlantic sector during their analysis period (1976 to 2016). We followed the criteria in Table S1 of Cox et al. (2018) to reproduce the records which they retained and which we refer to as the Cox dataset (Table 2). We validated this dataset by comparison with the number of records stated in Table S1 of Cox et al. (2018) and the percentage of data and sign of slope by grid cell in their Figure 1.

Bias
The Cox dataset includes records from just three of the 28 net types included in KRILLBASE. Cox et al. (2018) argue that their approach allowed them to model the effect of net type. An alternative approach of modelling the effect of net mouth area, as in the KRILLBASE standardisation, would have allowed Cox et al. (2018) to include more net types. Instead, their approach removes potentially useful data based on consistent basic net designs (e.g. bongo or ring nets) for which the detailed specification and therefore net type change over time. Cox et al. (2018) removed net types using a variety of criteria including the exclusion of nets "with fewer than 30 presence records" (where presence means non-zero krill density). This criterion clearly selects against zero-density records. It excludes only data collected during the later part of the analysis period (1996 onwards) and will therefore reduce the slope of any decline (issue A in our Figure 1). These data exclusions also exaggerate the spatial heterogeneity in the dataset, which we return to below.

Table 1. Analyses in Atkinson et al. (2004) which support their conclusion that krill density in the Southwest Atlantic sector of the Southern Ocean declined in the late twentieth century. P values indicate the statistical support for a decline (i.e. the probability of an erroneous result given the assumptions of the analysis). Each analysis was applied to two sets of net types: large nets (i.e. those with a nominal mouth area ≥ 3 m²) and all smaller nets. Cox et al. (2018) excluded this latter category of nets and presented a mixed model analysis which was similar but not equivalent to analysis 2a.

Table 1 columns: Analysis, Net type, P.

Unlike the dataset analysed by Atkinson et al. (2004), the current version of KRILLBASE includes data collected in the austral winter. The density of Antarctic krill in nets varies with time of year, and recorded densities are lowest in winter (Cleary et al., 2016). All of the 81 records for 1986 in the Cox dataset were winter records. The retention of this single year of winter data contrasts with the exclusion of net types that were used for less than five years, highlighting inconsistencies in the data selection approach used by Cox et al. (2018). They included days since 1 October as an explanatory variable in their models, but this approach would not appropriately compensate for winter sampling inefficiencies as there were no comparable summer records for 1986. The mean density for 1986 was the lowest of any year in the Cox dataset (1.44 krill.m⁻²). Because this year with erroneously low density is in the early part of the analysis period (pre-1996), its inclusion will reduce the slope of any decline (issue B in our Figure 1).

Krill density varies with net sampling depth, with the lowest densities occurring at depths greater than 200 m. There is also variation within the upper water column and the highest densities generally result from sampling that includes the topmost 50 m (Atkinson et al., 2017). Thus, controlling for the effects of sampling depth variation is a key consideration in KRILLBASE analyses and is a part of the standardisation process. Cox et al. (2018) applied very limited filtering according to net sampling depth, excluding only those nets with a sampling depth range of less than 10 m, and they did not include sampling depth as an explanatory variable in their models. Records with a deeper top sampling depth will generally underestimate density compared to those with a more appropriate sampling depth range, for example 0 to 200 m. The Cox dataset included 40 records based on sampling only at depths below 200 m. Thirty-two of these occurred in 1982 and the rest in 1976, 1978, and 1985. A further 19 records were based on sampling only at depths below 50 m, 18 of which occurred between 1976 and 1990. Because these erroneously low densities mainly occur in the early part of the analysis period, their inclusion will reduce the slope of any decline (issue C in our Figure 1).

Table 2. Summary of the Cox dataset by 9° longitude × 3° latitude grid cell (Cox et al. 2018: fig. 1). The Cox dataset excludes most of the KRILLBASE net types but includes negatively biased data from winter and deep strata. Columns show net hauls in each grid cell as a percentage of the total; the temporal coverage within each grid cell (start year, end year and total years); the signs of time trends in krill density resulting from simple linear regression of (a) untransformed individual net haul data, (b) log-10 transformed annual averages, and (c) log-10 transformed individual net haul data; the estimated mean krill density resulting from (d) averaging all untransformed individual net haul data, (e) back-transforming the average of log-10 transformed annual means, and (f) back-transforming the average of log-10 transformed individual net haul data; and the percentage of net hauls in which krill were present. Our column c matches the signs of the regression slopes fitted to "transect means" in Figure 1 of Cox et al. (2018). The existence of these slopes does not imply a statistically significant trend.
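The direction of the bias described for issues B and C can be illustrated with a minimal sketch. All values below are hypothetical and are not drawn from KRILLBASE or the Cox dataset, and the helper ols_slope is an illustrative function rather than part of any published analysis. The sketch fits a simple linear regression to a steadily declining series of log-10 densities, then adds two erroneously low values early in the period and refits.

```python
# Illustrative sketch only: every number here is hypothetical.

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical log10 density declining steadily between 1976 and 2016.
years = list(range(1976, 2017, 5))  # 1976, 1981, ..., 2016
log_density = [2.0 - 0.03 * (year - 1976) for year in years]
slope_clean = ols_slope(years, log_density)

# Add hypothetical negatively biased early records
# (standing in for winter or deep-stratum hauls).
biased_years = years + [1982, 1986]
biased_log_density = log_density + [0.2, 0.2]
slope_biased = ols_slope(biased_years, biased_log_density)

print(f"slope without biased records:    {slope_clean:.4f}")
print(f"slope with biased early records: {slope_biased:.4f}")
```

Under these assumptions the refitted slope is closer to zero than the original slope, which is the flattening effect described in the text. The size of the effect in real data would depend on the number and timing of the biased records.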
There have been spatial shifts in sampling effort over time, including a contraction of effort into three main study areas where krill are most abundant (Atkinson et al., 2004, 2017). The data exclusions applied by Cox et al. (2018) exaggerate these shifts. In particular, the Cox dataset shifts into shallower water and southwards over time (issues D and E in our Figure 1). It contains no post-2003 data for the most northerly grid cells (405, 406, 505, and 506 in Table 2). The data for the last five years of the analysis period are exclusively from cells 101 and 102 in the extreme Southwest of the study region. In these cells, the percentage of nets containing krill was relatively high in most years (e.g. 89% in 1976 to 1995 compared to 81% across all cells in the same years). A recent paper supporting an overall decline in krill density in the Southwest Atlantic sector reports sharp declines in the north of the sector but stable or increased krill density in the extreme Southwest (Atkinson et al., 2019). Cox et al. (2018) note that "fewer krill are found in areas with deeper seabed" and acknowledge that the increase in the probability of a net containing krill, shown in their Figure 2, may be due to the contraction of sampling effort to shelf areas. The shift in the Cox dataset toward areas where krill are abundant and density has been relatively stable (Steinberg et al., 2015) will reduce the slope of any decline.
A simple way to test the robustness of results to spatial shifts in sampling is to consider whether the result (or lack of result) could be an artefact of the shift. For example, Atkinson et al. (2004; supplementary information) reasoned that the observed decline could not be an artefact of the sampling shift towards areas where krill are most abundant as this would tend to counteract the observed trend. Conversely, Cox et al. (2018) do not provide any evidence to suggest that their conclusion is robust to the effect of spatial shifts in sampling.

Units of analysis
The spatial distribution of krill is highly heterogeneous, with up to 99% of individuals occurring in high-density swarms (Tarling et al., 2009). This creates challenges in the analysis of krill density data, whether they are derived from nets or acoustics. Log transformation can help to achieve a more normal data distribution and aid plotting of density data that can span several orders of magnitude, but it also reduces the influence of very high values on any derived statistic. This is illustrated in our Table 2 and in Figure 3 of Fielding et al. (2014), where the means from untransformed net or acoustic data are typically one to three orders of magnitude greater than the back-transformed means of logged data. Most studies of inter-annual patterns in krill density, based on either net or acoustic data, use annual averages (which may be spatially resolved) of density estimates as their basic unit of analysis (e.g. Atkinson et al., 2004; Brierley et al., 1999; Fielding et al., 2014; Loeb et al., 1997; Murphy et al., 2007; Quetin et al., 2007; Steinberg et al., 2015). In contrast, Cox et al. (2018) used individual net hauls as their unit of analysis. Log transformation of individual net hauls down-weights very high values to a much greater extent than log transformation of annual averages (columns d to f in our Table 2). Of the pre-1996 density values in the Cox dataset, 4.6% were higher than 100 krill.m⁻² compared to 3% of post-1995 densities. Because high densities were more common in the early part of the analysis period, the approach of Cox et al. (2018) will reduce the slope of any decline (issue F in our Figure 1).
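The down-weighting effect of log transformation can be shown with a small worked example. The haul densities below are hypothetical (they are not KRILLBASE records) and contain only positive values, since the logarithm of zero is undefined and zero-density hauls would need separate treatment. The sketch compares the arithmetic mean with the back-transformed mean of log-10 values for one year, and then compares the between-year contrast obtained by log transforming annual means with that obtained by log transforming individual hauls.

```python
import math

# Hypothetical haul densities (krill per square metre), illustration only.
early_year = [1.0, 2.0, 1.0, 4.0, 1000.0]  # includes one swarm-driven haul
late_year = [1.0, 2.0, 1.0, 4.0, 2.0]      # no very high haul

def arithmetic_mean(values):
    return sum(values) / len(values)

def mean_log10(values):
    """Mean of log10-transformed values (all values must be positive)."""
    return sum(math.log10(v) for v in values) / len(values)

# Within one year: untransformed mean versus back-transformed mean of logs.
early_arithmetic = arithmetic_mean(early_year)
early_back_transformed = 10 ** mean_log10(early_year)

# Between years, on the log10 scale:
# log transform of annual means versus mean of log-transformed hauls.
contrast_annual_means = (
    math.log10(arithmetic_mean(early_year))
    - math.log10(arithmetic_mean(late_year))
)
contrast_individual_hauls = mean_log10(early_year) - mean_log10(late_year)

print(f"arithmetic mean (early year):       {early_arithmetic:.1f}")
print(f"back-transformed mean of logs:      {early_back_transformed:.1f}")
print(f"log10 contrast, annual means:       {contrast_annual_means:.2f}")
print(f"log10 contrast, individual hauls:   {contrast_individual_hauls:.2f}")
```

In this hypothetical case the single high-density haul dominates the arithmetic mean but has little influence on the back-transformed mean of logs, and the difference between years is much smaller when individual hauls are log transformed than when annual means are.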
This decision to log transform individual net hauls impacts the results shown in Figure 1 of Cox et al. (2018), where log transformation used in conjunction with simple linear regression identifies negative trends in only four of 13 cells. The alternative approach of log transforming annual means more than doubles the number of negative trends (contrast columns b and c in our Table 2).

Erroneous variables
Cox et al. (2018) used a variable, which they call "survey," as a random effect in their models. KRILLBASE does not identify the survey in which data were collected, partly because some data were supplied to us without voyage information and partly because one voyage can include multiple surveys. The method used by Cox et al. (2018) to identify surveys is not reliable. It is, however, a proxy for the year of data collection, which was used as a random effect variable by Atkinson et al. (2004).
The term "transect" used in Cox et al. (2018) is likewise unreliable. The caption to their Figure 1 suggests that linear regressions were fitted to transect means. KRILLBASE does not identify transects and Cox et al. (2018) do not explain how they did so. Not all surveys use transects and, when they are used, the spatial extent of transects can vary by an order of magnitude (e.g. from < 100 km to > 1000 km in the CCAMLR synoptic survey; Hewitt et al., 2004). It is therefore unlikely that transects represent a consistent sampling unit, or one that readily maps on to the grid cells in Figure 1 of Cox et al. (2018).

Figure 3 in Cox et al. (2018) shows their results from a model with six separate explanatory variables. Cox et al. (2018) provide little information on the functional form or reliability of the modelled effects, other than a description of the effect of seabed depth. This hinders both reproducibility and validation of their results. Their Figure 3 shows predicted density rising slightly from 1976 before falling to about 84% of its 1976 level in the early 2000s and then recovering slightly to about 88% of its 1976 level by 2016. The minimum and maximum of parametric bootstrap confidence intervals were about 1.7 and 3.3 krill.m⁻², respectively. Cox et al. (2018) state that these confidence intervals are "large" and that their analysis reveals "considerable inter-annual variability." These model-based estimates of variability, which include the effects of down-weighting high values, are much lower than the orders-of-magnitude variability reported in all previous studies cited in reviews by  and Hill et al. (2016). The mean predicted krill densities in Figure 3 of Cox et al. (2018) are also at least an order of magnitude lower than those observed in these previous studies and in the Cox dataset (Table 2). This figure is not, therefore, a reliable representation of krill density dynamics.

Cox et al. (2018) argue that the decreasing trend in their Figure 3 is not consistent with a "massive" decline. The evidence presented is a comparison of predicted densities for 1976 and 2016 from 1,000 bootstrap samples, 431 of which were higher in 2016 and therefore indicated an increase in krill density. This comparison of two years is an insensitive method for detecting a trend over four decades for a number of reasons. Firstly, the precision of GAMs declines towards the extreme values of the independent variable (i.e. the first and last years), as indicated by the widening confidence intervals in Figure 3 of Cox et al. (2018). Secondly, 2016 is not representative of the region as a whole, since the last 5 years of data in the Cox dataset come exclusively from the two cells in the extreme Southwest of the study region, and there are no post-2003 data from anywhere north of 58°S. Thirdly, this approach is not suitable for datasets with high inter-annual variability, which may influence between-year comparisons more than any underlying trend.

Results and hypothesis testing
These issues aside, the approach of Cox et al. (2018) is notable for the absence of any statistical hypothesis testing or any estimate of the risk that their conclusion is erroneous (i.e. Type II error). The examination of a binary outcome (increase versus no increase) cannot provide any information on the magnitude of a decline, massive or otherwise. The issue is therefore simply whether there is any statistical evidence of a decline, as the title of Cox et al. (2018) suggests. The examination of a binary outcome suggests a testable hypothesis: that the number of samples indicating a decline is higher than chance (i.e. 50% of samples). The null hypothesis is that the number of samples indicating a decline is no higher than chance and therefore that the model is not consistent with a decline. Cox et al. (2018) do not report the frequency of samples indicating a decline or the third potential outcome, which is no change. We therefore evaluate the null hypothesis using the assumption that 569 samples indicated a decline, and then we test whether our conclusion is robust to fewer samples indicating a decline. The null hypothesis is rejected by the binomial test with 569 samples indicating a decline (null probability = 0.5, P < 0.0001) and with as few as 526 samples indicating a decline (P < 0.05). In simple terms, if a coin was tossed 1,000 times and it came up heads at least 526 times, then one could reasonably conclude that the coin favours heads (a decline). The statement in Cox et al. (2018) that the "results suggest no detectable trend" therefore appears to be false. We make this point not to endorse the approach of comparing two years, but to demonstrate that the interpretation of their own results in Cox et al. (2018) is not robust to statistical hypothesis testing.

Cox et al. (2018) suggest that the conclusions of Atkinson et al. (2004) are "a consequence of their not considering interactions between krill density and unbalanced sampling in the data, and not accounting for different net types used." In fact, Atkinson et al. (2004) considered each of these issues. They, like Cox et al. (2018), used mixed models to deal with unbalanced (i.e. spatio-temporally heterogeneous) sampling. Atkinson et al. (2004) accounted for different net types by performing separate corroborative analyses using data from different types of net (Table 1). Atkinson et al. (2004) also performed a range of supplementary analyses to ensure that their conclusions were robust to spatial, temporal, and methodological shifts in the data. The assertion of Cox et al. (2018) quoted above is therefore incorrect.

Cox et al. (2018) twice extrapolate the 1976-2003 rate of decline found by Atkinson et al. (2004) to the present day. However, Atkinson et al. (2004) did not provide any projections and recommended that "future predictions must be cautious." The linear extrapolations of Cox et al. (2018) contravene recommended best practice (e.g. Hill et al., 2007) and do not represent the results of Atkinson et al. (2004).

Cox et al. (2018) restate the argument of Nicol et al. (2012) that changes in krill density at the regional scale since the 1970s should be reflected in the results of more recent monitoring at smaller spatial scales (standardised acoustic surveys from the early 1990s) and putative indirect indicators of krill availability (predator indices, post 1987). Some of these datasets now have around three decades' worth of data, a timescale over which it may be possible to distinguish climate-driven change from variability (Henson et al., 2010). We therefore recommend integrated analyses of these datasets alongside KRILLBASE to provide a thorough synthesis of variability and change at the regional scale. Nonetheless, we caution that faith in the ability of fishery catch rate data to indicate population declines in aggregating species, as promoted by Cox et al. (2018), has been implicated in the catastrophic collapse of fished stocks around the world (e.g. Erisman et al., 2011; Rose & Kulka, 1999) and of baleen whale populations (Heazle, 2012).
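The binomial test described in this section can be sketched as follows. This is a minimal illustration, not the authors' own code: it assumes a one-sided exact test against a null probability of 0.5, whereas the text does not state whether the test was one-sided or two-sided, exact or approximate. The function name fair_coin_upper_tail is hypothetical. The count of 569 declines is the assumption stated in the text (1,000 bootstrap samples minus the 431 that indicated an increase).

```python
from math import comb

def fair_coin_upper_tail(successes, trials):
    """Exact P(X >= successes) for X ~ Binomial(trials, 0.5).

    Uses integer arithmetic for the counts so that no precision is lost
    before the final division.
    """
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2**trials

# Assumption from the text: 569 of 1,000 bootstrap samples indicate a decline.
n_samples = 1000
n_decline = 569
p_value = fair_coin_upper_tail(n_decline, n_samples)

print(f"one-sided exact P for {n_decline} of {n_samples}: {p_value:.2e}")
```

The same function can be called with smaller counts to explore how many samples indicating a decline are needed to reject the null hypothesis. Results for counts close to the rejection threshold can differ slightly between exact and approximate versions of the test, so borderline values should be reported together with the test variant used.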

RECOMMENDATIONS
We intend KRILLBASE to be a useful resource for investigating Southern Ocean ecology. The issues raised in this comment suggest the following recommendations to supplement those in Atkinson et al. (2017) and support future use:
• Composite datasets such as KRILLBASE may need some correction for differences in sampling methods. The KRILLBASE standardisation has the advantage that it is described in detail, with appropriate sensitivity analyses. Users might choose other standardisations or to correct for sampling issues within a model. Whichever method is used, it should be based on a thorough understanding of the data as described in Atkinson et al. (2017).
• Net sampling depth is an important influence on sampling efficiency in addition to those considered by Cox et al. (2018) (i.e. net type, time of year, and time of day). All of these variables are included in the KRILLBASE standardisation, and should be taken into account in analyses.
• Winter data or data from deeper strata (> 200 m) are not reliable indicators of summer density in the upper strata (0 to 200 m).
• Do not assume that the KRILLBASE data fields contain any information that is not stated in Table 2 of Atkinson et al. (2017).
• There is a trade-off between consistency of sampling method and data coverage. Ensure that data coverage is appropriate for the intended analysis.
• Log transformation down-weights the influence of very high densities and could bias analyses.
• Check that conclusions are robust to the effects of data transformation and unavoidable biases such as shifts in sampling method and location.
• Report the probability of Type I error (P value) for positive results and of Type II error for negative results.
• Report model results in meaningful detail to facilitate reproducibility and validation.
• Avoid linear extrapolation of population trajectories.
• Be aware of inter-annual variability, which can be greater than any underlying trend.
• Avoid diagnosing or rejecting a multi-year trend based on a comparison of two years.

CONCLUSION
We have identified several sources of bias in the approach of Cox et al. (2018) resulting from (i) exclusion of low-density data from the later part of the analysis period, (ii) inclusion of negatively biased winter and deep stratum data in the early part of the analysis period, (iii) sampling shifts over time to areas of high krill density, and (iv) down-weighting high densities which were more common in the early part of the analysis period (Figure 1). Each of these sources will reduce the slope of any decline and therefore increase the risk of failure to detect a real decline. This risk is increased by the use of subjective interpretation rather than statistical hypothesis testing. The opinion of Cox et al. (2018), that there has been no decline in krill density, is clear. The evidence to support this opinion is unclear. On the one hand, their study reaffirms the one non-significant result in Atkinson et al. (2004) without challenging the five significant results in that paper which support a late twentieth-century decline in krill density. On the other hand, it is unlikely that the approach of Cox et al. (2018) would detect a real decline. Consequently, existing evidence for a late twentieth-century decline in krill density still stands (e.g. Atkinson et al., 2004, 2019; Forcada & Hoffman, 2014; Loeb et al., 1997; Watters et al., 2013).
A polarised debate about whether or not a decline in krill density has occurred impedes understanding of the past and present status of the Antarctic krill stock in the Southwest Atlantic. Such a debate provides no guidance to policy makers and other stakeholders and increases the risk of inappropriate policy decisions. The existence of KRILLBASE, which compiles data from ten nations, shows the potential for scientific collaboration. The onus is now on the scientific community to provide useful advice on how the krill stock has changed over time. There will be uncertainties associated with any assessment, especially because there is no large-scale, long-term direct monitoring of the krill stock. Advice should be clear about the level of confidence and agreement behind any statement, and the implications of any uncertainties. We suggest that a collaborative effort is needed to identify appropriate standards of evidence and to ensure that such advice is based on informed use of the available data.