Quantifying acoustic survey uncertainty using Bayesian hierarchical modeling with an application to assessing Mysis relicta population densities in Lake Ontario

A Bayesian hierarchical model was applied to acoustic backscattering data collected on Mysis relicta (opossum shrimp) populations in Lake Ontario in 2005 to estimate the combined uncertainty in mean density estimates, as well as the individual contributions to that uncertainty from the various information sources involved in the calculation, including calibration, target strength determination, threshold specification and survey sampling design. Traditional estimation approaches often account only for the variability associated with the survey design, while assuming that all other intermediate parameter estimates used in the calculations are fixed and known. Unfortunately, unaccounted-for variation in the steps leading up to the global density estimate may contribute significantly to the uncertainty of density estimates. While other studies have used sensitivity analyses to demonstrate the degree to which uncertainty in the various input parameters can influence estimates, including the uncertainty directly, as demonstrated here using a Bayesian hierarchical approach, allows for a more transparent representation of the true uncertainty and the mechanisms needed for its reduction. A Bayesian analysis of the mysid data examined here indicates that increasing the sample size of the biological collections used in the target strength regression proves to be a more direct and practical way of reducing the overall variation in mean density estimates than similar steps to increase the number of transects surveyed. Doubling the number of target strength net tow samples resulted in a 23% reduction in variance, relative to an 11% reduction from doubling the number of survey transects. This is an important difference, as doubling the number of survey transects would add 5 days to the survey, whereas doubling the number of net tows would add only one day. Although these results are specific to this particular data set, the method described is general.


Introduction
Assessment scientists engaged in conducting acoustic surveys of marine and freshwater populations have long recognized that there are several sources of variation contributing to the overall uncertainty in population estimates (Simmonds et al., 1992; Simmonds and MacLennan, 2005). Most of the major contributors to variation and bias are well understood by acoustic scientists, but can be difficult to include in variance calculations. Monte Carlo simulation techniques (including some methods that are spatially explicit), bootstrap and parameter sensitivity analyses have all been successfully used to examine how error can propagate through the estimation process into an assessment (Rose et al., 2000; Demer, 2004; O'Driscoll, 2004; Simmonds and MacLennan, 2005; Coetzee et al., 2008; Demer and Renfree, 2008; Simmonds et al., 2009; Woillez et al., 2009; Zwolinski et al., 2009), while Bayesian methods have been applied to individual components of the process (Heywood et al., 2006; Fässler et al., 2009; Juntunen et al., 2012; Boyd et al., 2015). In the acoustic setting, several preliminary steps are needed to convert what the survey observes in terms of backscatter into what one wishes to predict, namely density or biomass per unit area or volume. In addition to backscattering strength, data on target strength, threshold values, equipment calibration, species composition, size composition and animal behavior characteristics have all been explored for their influence on how sound is translated into population density (Demer, 2004; Simmonds and MacLennan, 2005). Coincidentally, there has been a concurrent development in statistics for incorporating into estimates the information that comes from supplementary sources (Carlin and Louis, 2000; Link and Barker, 2010; Gelman et al., 2014). One such approach is Bayesian hierarchical modelling (Gelman et al., 2014). This statistical method often uses Markov chain Monte Carlo (MCMC) simulation to create a posterior distribution of the probability that the population density is at a particular level, and does so by combining all the different contributing sources of information (e.g. data, expert information, or sampling design) into a single hierarchical framework. The architecture of the MCMC algorithm typically allows the complex contributions of all of these pieces of information to come together into a single cohesive probability distribution even when no theoretical distribution can be analytically derived. Although such estimates could also be derived using other statistical frameworks, such as penalized likelihood methods, the Bayesian approach results in estimates that can be more simply interpreted as probabilities of a population being at a certain level, which assists in determining risk in a straightforward manner.
Mysid shrimps in the Mysis relicta species complex are an important component of the open water community in many large inland lakes and have been called the krill of freshwater lakes (Rudstam, 2009). Mysids are omnivores that can significantly affect their zooplankton prey at the same time as they are important prey for planktivorous fish and young piscivores (Nesler and Bergersen, 1991; Rudstam and Johannsson, 2009; Walsh et al., 2012). In the Laurentian Great Lakes, mysids can make up a third of the total crustacean biomass (Watkins et al., 2015). Given their importance both as predators and prey, and the concerns about ecological changes in these lakes, there is a need to develop better methods for estimating mysid abundance. Rudstam et al. (2008) presented a method for estimating mysid abundance with the standard frequency used for acoustic fisheries surveys around the Great Lakes (120 kHz).
The types of information used in deriving estimates of mysid density include: (i) calibration factors associated with the acoustic equipment; (ii) thresholding coefficients for eliminating acoustic backscattering data associated with organisms that are larger than expected from mysids in these systems; (iii) target strength regression parameters for converting acoustic backscatter to counts of individuals; and (iv) sample size and design specifications of the acoustic survey used to assess abundance. The steps through which this information is applied can be visualized schematically as a linear progression of gathered and then implemented information (Figure 1). Depending on the species and the system, other important information sources can be identified as potentially contributing to this process. With acoustic estimation of fish densities, for example, additional types of information might include species composition, size composition, fish orientation, information gathered from multiple frequencies, and estimation of target strength from trawl catches (Simmonds and MacLennan, 2005; Rudstam et al., 2009). Here we explore the usefulness of applying such a Bayesian hierarchical framework to the somewhat simpler problem of acoustic estimation of mysid abundance in the Great Lakes.

Data
Acoustic data were collected as part of a survey for forage fish in Lake Ontario, 25-31 July 2005 (Rudstam et al., 2008). We used data from five cross-lake transects in the main lake that were surveyed at night. Data were collected with a BioSonics DT-X 120-kHz split-beam echosounder (7.2° beam width, 0.4-ms pulse length, 1 ping s⁻¹) deployed on a tow body with the transducer at 1.8-m depth. Survey speed varied from 5.5 to 7 knots, depending on conditions. This unit was calibrated with a standard 33-mm diameter, −40.7 dB tungsten carbide sphere during the survey. All data were collected with a lower threshold of −100 dB.
Data on mysid target strength were collected during a dedicated sampling cruise using a similar acoustic unit (BioSonics DT-X, 120-kHz split beam, 7.8° beam width, 0.6-ms pulse length). The unit was calibrated by the manufacturer in May 2005, and confirmed to be within 0.1 dB of the manufacturer's specification in October 2005 using a −40.4 dB, 23-mm diameter standard copper sphere. Data were collected with a lower threshold of −130 dB. Mysids were sampled with a 1-m diameter opening-closing net (mesh size 1 mm) using vertical net tows. Details on sample collection and processing are in Rudstam et al. (2008).

Statistical methods
We develop the statistical model first as a classical design-based cluster sampling approach and then subsequently as a Bayesian hierarchical modeling approach. The acoustic backscattering data were collected along continuous north-south transects using a systematic sampling design (Figure 2). As a consequence, the global mean mysid density estimate $\bar{D}$ (here in number/m²) can be calculated using a classic cluster sampling formulation (Scheaffer et al., 2012):

$$\bar{D} = \frac{\sum_{k=1}^{K} M_k \bar{D}_k}{M},$$

where $\bar{D}_k$ is the average density of mysids across transect $k$, $M_k$ is the total number of sample units observed within transect $k$, and $M = \sum_{k=1}^{K} M_k$ is the total number of observations across all $K$ transects. One sample unit is a 200-m section of the transect. This formulation is effectively a weighted average in which each transect mean is weighted according to the length of that transect.
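The transect-length-weighted average above can be sketched in a few lines; the numbers here are hypothetical illustration values, not the survey data:

```python
def cluster_mean(transect_means, transect_sizes):
    """Global mean as a weighted average: sum(M_k * Dbar_k) / sum(M_k)."""
    M = sum(transect_sizes)
    return sum(m * d for d, m in zip(transect_means, transect_sizes)) / M

# five transects: mean density (number/m^2) and number of 200-m sample units
d_bar = [120.0, 95.0, 140.0, 110.0, 130.0]
m_k = [210, 180, 250, 200, 220]
print(round(cluster_mean(d_bar, m_k), 2))  # → 120.66
```

Note that a longer transect (larger $M_k$) pulls the global mean toward its own transect mean, exactly as the cluster-sampling formulation intends.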
For reference, the associated classical frequentist estimate of the SE of $\bar{D}$ is given by Scheaffer et al. (2012) as

$$\widehat{\mathrm{SE}}(\bar{D}) = \sqrt{\frac{\sum_{k=1}^{K} M_k^2 \left(\bar{D}_k - \bar{D}\right)^2}{K(K-1)\,\bar{M}^2}}, \qquad \bar{M} = M/K.$$

To convert acoustic backscattering into mysid density ($D$, in number/m²) we used the standard equation relating density to the area backscattering coefficient $ABC_i$ (or $s_a$ using MacLennan et al., 2002 symbols) recorded at each sampling location $i$ (Simmonds and MacLennan, 2005):

$$D_i = \alpha \, ABC_i,$$

where $\alpha$ is the reciprocal of the average backscattering cross-section ($\sigma_{bs}$, or target strength when expressed in dB) of a single mysid. Note that we use the reciprocal of $\sigma_{bs}$ ($\alpha$) in the analysis to avoid confusion with the use of $\sigma$ for standard deviation. In order to develop the model in a Bayesian and hierarchical fashion, we recognize that $\alpha$ is in fact estimated and that there may be additional variation coming from other factors, such as the processes of calibration and threshold setting. To begin, we set up the regression model for the biological net tow samples as follows:

$$D_i^* \sim \mathrm{N}\!\left(\alpha\,\varepsilon_1\,ABC_i^*,\ \sigma^2_{\mathrm{Regression}}\right),$$

including now both the regression slope term $\alpha$, to be estimated, and a calibration coefficient $\varepsilon_1$, centered on 1, depicting the variation associated with a typical calibration procedure. Note that the data here, accented by an asterisk, represent those collected in the biological surveys that are run in parallel to the acoustic transect survey.
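The design-based standard error can be sketched the same way; the ratio-estimator form without a finite-population correction is an assumption following the usual Scheaffer et al. cluster-sampling presentation, and the numbers are the same hypothetical illustration values as above:

```python
import math

def cluster_se(transect_means, transect_sizes):
    """Design-based SE of the cluster-sample mean (ratio-estimator form, no fpc)."""
    K = len(transect_means)
    M = sum(transect_sizes)
    m_bar = M / K
    d_bar = sum(m * d for d, m in zip(transect_means, transect_sizes)) / M
    # between-transect sum of squares: (transect total - expected total)^2
    s2 = sum((m * (d - d_bar)) ** 2
             for d, m in zip(transect_means, transect_sizes)) / (K - 1)
    return math.sqrt(s2 / (K * m_bar ** 2))

se = cluster_se([120.0, 95.0, 140.0, 110.0, 130.0], [210, 180, 250, 200, 220])
print(round(se, 2))  # → 7.71
```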
Here and elsewhere the distributions used either follow from theory or, as was the case with this regression, can be examined post hoc to evaluate the adequacy of the distribution model employed.
Relatively uninformative priors are used for the slope term $\alpha$ and the regression variance $\sigma^2_{\mathrm{Regression}}$, following the recommendations of Gelman (2006), under a Gaussian distribution for the mean and a half-Cauchy distribution for the standard deviation (see Table 1 for prior parameter values), while using an informative prior on the calibration variation $\varepsilon_1$:

$$\varepsilon_1 \sim \mathrm{N}\!\left(1,\ \sigma^2_{\varepsilon_1}\right), \qquad \sigma^2_{\varepsilon_1} \sim \mathrm{Inverse\text{-}Gamma}\!\left(c_{11},\ c_{12}\right).$$

The inverse gamma prior for the calibration variance was chosen over the half-Cauchy used for the standard deviations in this instance because we wish to provide an informative prior on the variance that is well away from 0 and represents the variation actually seen in practice. The parameters $c_{11} = 151.5$ and $c_{12} = 1$ used in the calculation were derived from actual calibration experiments and reflect an observed intercalibration variance of $\sigma^2_{11} = 0.0066$, so that $c_{11} = 1/\sigma^2_{11}$. Because the data are collected via a cluster transect design, and not through simple random sampling, the design must be structurally accounted for in the Bayesian development (see, e.g. Link and Barker, 2010; Gelman et al., 2014). We accomplish this by first modeling the threshold-specific mean backscattering by transect as:

$$\overline{ABC}_{\cdot jk} \sim \mathrm{N}\!\left(b_j,\ \sigma^2_{h_j}/M_k\right),$$

where $\overline{ABC}_{\cdot jk}$ is the observed area backscattering coefficient averaged across all observations $i$ for each specified threshold level $j$ and transect $k$, and the coefficient $b_j$, the estimated mean acoustic backscatter per observation at each threshold level $j$, is weighted by the number of observations $M_k$ in each transect. Priors for $b_j$, $\sigma^2_b$, and $\sigma^2_{h_j}$ are specified as above, using an uninformative Gaussian distribution for the mean and a half-Cauchy distribution for the standard deviation.
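The shape parameter of the informative inverse-gamma prior follows directly from the observed intercalibration variance quoted above, as a quick check confirms:

```python
# Shape parameter c11 of the inverse-gamma prior on the calibration variance,
# derived (per the text) as the reciprocal of the observed intercalibration
# variance of 0.0066; c12 is the scale parameter given in the text.
obs_var = 0.0066
c11 = 1.0 / obs_var
c12 = 1.0
print(round(c11, 1))  # → 151.5
```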
The relationship between backscatter and density, through the estimated coefficient $\alpha$, and the Bayesian estimates of average ABC can now be used to predict the average mysid density by threshold and transect:

$$\bar{D}_{jk} = \alpha\,\varepsilon_2\,\overline{ABC}_{\cdot jk},$$

as well as the average density at each specified threshold:

$$\bar{D}_j = \frac{\sum_{k=1}^{K} M_k \bar{D}_{jk}}{M},$$

where $\varepsilon_2$ is a potential second calibration variable, centered on 1, with an inverse gamma distribution on its variance with parameters $c_{21}$ and $c_{22}$.
(In this work we use the same distribution for both calibration components, but allow them to vary independently.) We again define $M = \sum_k M_k$, the total number of observations across all transects. Now the $\bar{D}_j$ represent the average mysid density corresponding to a particular backscatter threshold level $j$. Each average density, then, corresponds to a chosen backscatter threshold, above which acoustic data are excluded because they are not believed to represent mysids. There is a range of possible exclusion thresholds $u_j: \{j = 1, \ldots, J\}$ that corresponds to each of the average densities estimated above. An experienced acoustician might rank the different thresholds as more or less probable. Here we specify that the acoustic threshold $u$ applied at each step in the MCMC simulation follows an informative categorical probability distribution, with probabilities characterized by a Binomial distribution with $n = 10$ and $p = 0.5$, creating probabilities over 11 possible thresholds ranging from −55 to −65 dB (in the TS domain; see the data discussion above and Rudstam et al., 2008):

$$p(u = u_j) = \binom{10}{j-1} 0.5^{10}, \qquad j = 1, \ldots, 11.$$

The threshold distribution is also, therefore, informative, and is used to randomly select, with probability $p(u = u_j)$, a particular average density at each iteration of the calculation, thus adding an additional probabilistic component to the estimates.
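The informative threshold prior above can be tabulated directly; this sketch maps the Binomial(10, 0.5) probabilities onto the candidate thresholds in 1-dB steps (the ordering from −55 dB downward is an assumption):

```python
from math import comb

# Binomial(10, 0.5) probabilities over the 11 candidate exclusion thresholds
# (-55 to -65 dB in the TS domain, in 1-dB steps).
thresholds = list(range(-55, -66, -1))       # -55, -56, ..., -65 dB
probs = [comb(10, j) * 0.5 ** 10 for j in range(11)]

assert abs(sum(probs) - 1.0) < 1e-12         # a proper probability distribution
print(thresholds[probs.index(max(probs))])   # most probable (central) → -60
```

The symmetric, centrally peaked shape encodes the acoustician's judgment that the middle threshold is most plausible while still admitting the outer values with low probability.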
To summarize, the posterior distribution of the global mean density $\bar{D}$ for this process is a function of the elementary backscatter conversion parameters $\alpha$ and $\sigma^2_{\mathrm{Regression}}$, which are determined from the biological net tow data; a vector of threshold-specific mean backscattering coefficients $\mathbf{b}$, determined from the acoustic survey data; two elementary calibration parameters $\varepsilon_1$ and $\varepsilon_2$, whose probability distributions are garnered from the literature or in situ calibration experiments; and a vector of elementary threshold parameters $\mathbf{u}$, whose probabilities were specified based on the analysis in Rudstam et al. (2008) of the ambient backscattering strength at different thresholds when mysids were not present. The resulting posterior distribution of the predicted mean density of mysids is:

$$p\!\left(\bar{D} \mid \mathbf{ABC},\ \mathbf{ABC}^*,\ \mathbf{D}^*\right) = f\!\left(\alpha,\ \sigma^2_{\mathrm{Regression}},\ \mathbf{b},\ \varepsilon_1,\ \varepsilon_2,\ \mathbf{u}\right),$$

where $\mathbf{ABC}$ is the matrix of average area backscattering coefficients for all thresholds and transects, $\mathbf{ABC}^*$ and $\mathbf{D}^*$ are the observed vectors of area backscattering coefficients and mysid densities taken during the complementary survey to estimate mysid target strength, and the parameter set $\{\alpha, \sigma^2_{\mathrm{Regression}}, \mathbf{b}, \varepsilon_1, \varepsilon_2, \mathbf{u}\}$ represents the key estimation parameters, with the remaining metaparameters conditioning the various priors. The posterior distribution was estimated using MCMC simulation with JAGS (Plummer, 2003) within the R statistical computing environment (R Core Team, 2015). Finally, a post hoc correction was applied to the resulting MCMC realizations of density to account for the bias in the variance known to occur with small sample sizes, in this case a sample size of K transects, by rescaling the realizations about their mean by the factor $\sqrt{K/(K-1)}$.
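The post hoc small-sample correction can be sketched as a rescaling of the MCMC density realizations about their mean; the particular factor $\sqrt{K/(K-1)}$ used here is an assumption consistent with the stated purpose of inflating a downward-biased variance, not the authors' exact formula:

```python
import math

def correct_draws(draws, K):
    """Rescale MCMC draws about their mean by sqrt(K/(K-1)) so the sample
    variance is inflated by K/(K-1). K = number of transects.
    Sketch only; the correction factor is an assumption."""
    m = sum(draws) / len(draws)
    f = math.sqrt(K / (K - 1))
    return [m + f * (d - m) for d in draws]
```

Applying the correction leaves the posterior mean untouched and multiplies the variance of the draws by exactly K/(K−1).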

Results
Four chains of the JAGS model were run, each with 42 000 iterations. The first 2000 iterations of each chain were discarded as burn-in and the remaining iterations were thinned to one out of every four, resulting in 10 000 iterations per chain and a total of 40 000 iterations used for the analysis that follows. Table 2 and Figure 3 provide summaries of the estimates, which demonstrate the variation inherent in each of the components, as indicated by the standard error and overall distribution of the predictions, each contributing to the global variation in the final mysid density estimates. The fit of the regression component can be seen by plotting the fitted regression line against the net sample data (Figure 4). The posterior density estimates of mysid abundance from the fully parameterized Bayesian model can be compared with a t distribution parameterized using the mean and variance derived from the classical cluster sampling analysis, in which the only variation explicitly included is that due to the sampling design (Figure 5). Including all the other sources of variation roughly doubles the spread (standard deviation) of the distribution, in other words, increasing the variance by roughly a factor of four.
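The chain bookkeeping above reduces to simple arithmetic; a quick check of the retained-draw counts:

```python
# 4 chains x 42,000 iterations each; 2,000 burn-in discarded per chain,
# then every 4th remaining draw kept.
chains, iters, burn, thin = 4, 42_000, 2_000, 4
kept_per_chain = (iters - burn) // thin   # draws retained per chain
total_kept = chains * kept_per_chain      # draws used in the analysis
print(kept_per_chain, total_kept)         # → 10000 40000
```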
It is worth exploring how each component contributes to the overall variation.We do this by sequentially removing one component of variation at a time from the full Bayesian model starting with the calibration component, then the threshold component, and finally the density to ABC regression estimation component.This is done by setting the parameters for each component to their best estimate, and thus minimizing the full variation allowed for in association with each estimate.Figure 5 also indicates the decreases that occur in the total variation of the density estimates as a result of removing the variation due to each of the subcomponents.We further note that the Bayesian model that only includes variation due to the acoustic survey sampling component is quite similar to that derived from the classical approach.
Clearly, the variation in the target strength regression estimate has the greatest effect on the global estimate and its variation. Although the posterior distribution of mysid density does narrow when the effects of calibration and threshold are removed, these effects are minor, with calibration appearing to contribute somewhat more to the overall variation than threshold selection.
To further explore the influence of estimation variation from the sampling design relative to that from the target strength regression sampling, and to examine the utility of the Bayesian approach for making decisions on the allocation of sampling effort, we conducted two simple experiments with the model. We compare a base case, i.e. the full Bayesian model with all components estimated and the data as originally given, with two alternatives: (i) an application of the full model to the situation where twice as many transects are sampled; and (ii) an application of the full model to the situation where twice as many regression point pairs are available for the target strength calculation. In these two applications we simply included the reference data sets twice, without randomization. A truer measure of sample size requirements could be obtained in a more sophisticated fashion, using Monte Carlo simulation or bootstrap resampling to create the additional pseudo-data for power and sample size determination, but here we simply double each dataset, thereby adding twice as much pseudo-data into the procedures for each application. The results indicate that a greater gain is realized by doubling the sample size associated with the target strength estimation (a 23% reduction in variance from the base case) than by doubling the survey information (an 11% reduction from the base case).
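The "doubling" used in these experiments is just stacking each dataset with itself, without randomization; a sketch with hypothetical (ABC*, D*) net-tow pairs (not the study's data):

```python
# Hypothetical net-tow regression pairs (ABC*, D*); illustration only.
tows = [(0.0012, 110.0), (0.0009, 95.0), (0.0015, 140.0)]

# "Doubling" as described in the text: include the reference dataset twice.
doubled = tows * 2
assert len(doubled) == 2 * len(tows)

# Reported gains: doubling tows cut variance by 23%, doubling transects by 11%,
# so the tow-doubling case ends up with the smaller residual variance.
rel_var_tows, rel_var_transects = 1 - 0.23, 1 - 0.11
assert rel_var_tows < rel_var_transects
```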

Discussion
A Bayesian hierarchical modeling framework was created to fully characterize the uncertainty in the acoustic estimation of mysid densities in Lake Ontario. While incorporating the estimation uncertainty of some components of the process into the estimation algorithm (i.e. calibration, thresholding) resulted in only minor changes to the global estimates and their variation, other components, specifically the regression analysis relating density to acoustic backscattering and the survey sampling design, had a greater influence. Other techniques have been used to examine which components in the process are the most likely to influence estimates (e.g. sensitivity analyses or Monte Carlo simulation on the parameter inputs; Demer, 2004; O'Driscoll, 2004; Simmonds and MacLennan, 2005), but the Bayesian hierarchical framework provides a relatively straightforward mechanism for directly including auxiliary information and appropriately characterizing the uncertainty in the estimation process. Further, the Bayesian framework proved useful for exploring sample size determination and addressing design questions in the context of the wider sources of uncertainty.
To our knowledge, previous applications of the Bayesian framework in hydroacoustic estimation have typically been limited to a single component of the estimation process (e.g. TS estimation, Fässler et al., 2009; allocation of backscattering to different species, Juntunen et al., 2012; prediction of spatial distribution, Boyd et al., 2015). While the use of Bayesian methods is certainly appropriate in these applications, one of the greatest advantages of a Bayesian approach is the facility with which many, in fact all, of the different intermediate estimates and estimation methods can be included hierarchically, and statistically appropriately, in the overall estimation. To the extent that the various intermediate data and estimation methods are available, this hierarchical statistical approach can be used to extend the work of Rose et al. (2000), O'Driscoll (2004), and Simmonds et al. (2009) in a unified and systematic fashion (as opposed to a combination of independently generated distributions using ad hoc approaches such as Monte Carlo and bootstrap), not only to explore the sensitivity of estimates to variation in inputs but also to produce the estimates themselves and their associated variation; in fact, to produce the full multivariate statistical distribution of each estimate.
More sophisticated approaches could certainly have been implemented to determine the gains associated with different sample sizes or alternative designs. For example, simulating the collection of increasingly larger sample sizes, by either bootstrapping the original regression data or running Monte Carlo simulations based on the means and variances of samples across a range of sample sizes, could be quite productive for determining the cost effectiveness of reductions in estimate variation as the sample sizes of the various components increase. Such an exercise would be relatively straightforward to run. Computing time may be an issue, but the number of chains for each step in the MCMC might be reduced from what we used here to shorten run times.
An additional advantage to using the Bayesian methodology is that it lends itself well to generating probabilistic outcomes, such as the probability that the population has dropped below a certain threshold level, or the probability that the density of forage organisms is above some critical level needed to sustain the growth of higher trophic levels.A classical frequentist might argue that this could be done without using Bayesian methods, but the Bayesian approach provides a natural framework for including information in a probabilistic way and subsequently obtaining outputs that straightforwardly reflect both this information and the associated assumptions.
The methods discussed above can be straightforwardly extended to include more sophisticated approaches such as geostatistics (Petitgas, 1993; Rivoirard et al., 2000; Woillez et al., 2009). The hierarchical statistical framework can appropriately include the inputs discussed here in addition to other auxiliary information, as might occur with the incorporation of depth, location or time as covariates, while simultaneously estimating other intermediate parameters such as those used in characterizing covariance structure or variograms. These more sophisticated approaches will obviously be more computationally intensive, and we chose not to explore them in this work in order to highlight the inclusion of the more basic inputs of calibration, thresholding, target strength, and survey design. Furthermore, once calculated, the resulting MCMC chains of jointly simulated parameter estimates could easily be input into additional calculations, such as an area-based estimate of abundance, a geostatistical interpolation, or a time series projection of density. This would yield 40 000 realizations, say, of each such estimate, and consequently a full distribution characterizing any subsequent calculation based on these estimates, thereby extending the degree to which probabilistic or risk-based assessments might be derived. All of this suggests that a single unified approach to estimation that fully encapsulates the uncertainty is needed, and that the Bayesian hierarchical approach is useful in that regard.
Finally, the chains of jointly estimated parameters provide a natural way of capturing the correlation between parameter estimates, thus avoiding the common mistake of inadvertently but erroneously implying independence in the estimates by just abstracting means and variances from summary tables.This is especially true for parameter estimates based on nonlinear equations, where a high degree of collinearity may be present.
In this work, the Bayesian hierarchical framework was applied to survey data collected on M. relicta. The breadth and number of intermediate estimation components contributing to the acoustic assessment of this species may be narrower than one would expect for a multispecies fish stock, but the application can easily be generalized to include other ecological sources of information. Although sensitivity analyses can point to parameters that should receive greater research or sampling focus, inclusion in a Bayesian method can quantify the exact trade-offs. Bayesian methods such as this one can take several minutes to run on a PC, and both during development and in practice under standard assessment conditions it appears that some input parameters, e.g. thresholding, could remain fixed with little or no serious loss in characterizing the uncertainty. Research and management scientists will have to decide which components are the most important to include and what to leave out for the sake of practicality.
Figure 5. Posterior distribution of the probability density of mysid density (number per square meter) under the full hierarchical Bayesian model; the Bayesian model with variation in the calibration estimate removed; the Bayesian model with variation in calibration and thresholding removed; and the Bayesian model with variation in calibration, thresholding and target strength regression removed, in comparison with results from the classical cluster design method, which does not explicitly account for variation in calibration, thresholding or target strength. Note that the distribution of mysid densities under the Bayesian model with all but the survey variation removed is, perhaps not surprisingly, very similar to the results from the classical cluster design.

Figure 1. Flow diagram of the acoustic algorithm for computing mysid abundance.

Figure 3. Posterior probability densities of key model outputs.

Figure 4. Fit of the regression of mysid density to ABC. Ninety-five percent Bayesian credible intervals are also provided.

Table 1. Prior parameter values.

Table 2. Summary output from the JAGS run of the full Bayesian model. Shown are the mean, standard deviation and quantiles from the MCMC simulation for selected parameters. Rhat is the Brooks-Gelman-Rubin scale reduction factor, which should be close to 1 on convergence, and n.eff is an estimate of the effective sample size after adjusting for autocorrelation in the MCMC (Gelman and Rubin, 1992).