Investigating hydrological contributions to volcano monitoring signals: a time-lapse gravity example

for changes in water-mass storage in the unsaturated we combine 2-D axis-symmetrical numerical ﬂuid-ﬂow models with an axis-symmetric, distributed-mass, gravity calculation to model gravity changes in response to ﬂuctuating hydrological recharge. simulations are on tropical volcanic settings where high surface permeabilities promote thick unsaturated zones. Our study highlights that mass storage (saturation) changes within the unsaturated zone beneath a survey point can generate recordable gravity changes. We show that for a tropical climate, recharge variations can generate gravity variations of over 150 µ Gal; although, we demonstrate that for the scenarios investigated here, the probability of recording such large signals is low. Our modelling results indicate that microgravity survey corrections based on water table elevation may result in errors of up to 100 µ Gal. The effect of inter-annual recharge ﬂuctuations dominate over seasonal cycles which makes prediction and correction of the hydrological contribution more difﬁcult. Spatial hydrogeological heterogeneity can also impact on the accuracy of relative gravity surveys, and can even result in the introduction of additional survey errors. The loading ﬂuctuations associated with saturation variations in the unsaturated zone may also have implications for other geophysical monitoring techniques, such as geodetic monitoring of ground deformation.


I N T RO D U C T I O N
Volcanic hazard monitoring utilizes a number of geophysical techniques. These monitoring tools are sensitive to physical changes within the subsurface such as density, electrical properties and elastic deformation. In volcanic settings geophysical signals are regularly interpreted to reflect changes within the magmatic system and to track the movement of magma (e.g. Chouet 2003;Dzurisin 2003). As technology, data collection and processing strategies have improved, volcanic monitoring techniques are able to resolve increasingly small signals (e.g. McNutt 2005; Battaglia et al. 2008;Parker et al. 2014;Muller et al. 2015). This ability to resolve finer signals increases the challenge to decipher underlying physical sources that may or may not be related to the magmatic system and volcanic unrest. One non-magmatic source of geophysical signals are temporal and spatial hydrological variations. These hydrologically induced signals may reflect a perturbation of the volcanic hydrothermal system and therefore can indicate changes in volcanic behaviour (e.g. Bianco et al. 2004;Todesco 2009). However, non-volcanic hydrological fluctuations, driven by weather and climate, also generate recordable geophysical signals (e.g. GPS, Argus et al. 2014, Fu et al. 2015seismicity, Jiménez & Garca-Fernández 2000;gravity, Jacob et al. 2010). This can present a risk of misinterpretation where the purpose of the geophysical campaigns is to monitor pre-eruptive or eruptive activity.
This study focusses on geophysical signals resulting from climatically driven subsurface water mass variations. Mass variations are typically recorded by gravimetric monitoring campaigns. Therefore, we target this investigation on exploring gravimetric changes associated with variations in sub-surface water storage due to 260 B. Hemmings et al.  Saibi et al. (2010) variations in hydrological recharge and hydrogeological properties. We assess the potential for contamination of volcanigenic gravimetric signals with signals resulting from seasonal and inter-annual recharge variations that are typical of tropical climates.

Gravimetry used in volcanic monitoring
Gravity measurements are often an important component of the geophysical monitoring suite at active volcanoes. Static gravity data can be used to image and identify structural and compositional heterogeneities in the subsurface (Hautmann et al. 2013). Re-occupying a survey network over periods of months to decades can provide a record of mass or density changes over space and time. Such dynamic, or time-lapse, gravity monitoring has been used to track the complex evolution of volcanic unrest (Gottsmann et al. 2006a,b) and observe potential eruption precursors (Rymer 1994;Carbone et al. 2003a). Table 1 provides examples of the amplitude and timescales of observed gravity changes at volcanoes around the world. The nature of postulated magma-related gravity sources varies. Microgravity signals associated with eruptive phases at Mt Etna (Italy) have been attributed to basic magma injection both within and below the edifice (as summarized by Budetta et al. 2004). In more silicic systems, a more complex relationship between magma injection and recorded gravity signals has been suggested. Eggers (1983) proposed that magma injection at Pacaya Volcano, Guatemala generates a gravity decrease as high density, degassed magma is displaced by vesiculated, low density magma. Subsequent degassing and crystallization of intruded magma would result in a density increase and a positive residual gravity anomaly, as is suggested by Jousset et al. (2000) to explain the residual gravity signal observed at Merapi in 1993Merapi in -1994 Gravity surveys are typically combined with geodetic data to account for gravity changes related to ground deformation. A number of studies have used these data sets to distinguish magmatic signals from hydrothermal sources of deformation, often in large caldera forming systems (Berrino et al. 1984;Battaglia 1999;Battaglia et al. 2008). Studies that reveal a residual gravity anomaly after correction for elevation changes associated with ground deformation often implicate a magmatic source for the signal. For example, Battaglia (1999) suggest that intrusion of silicate magma at Long Valley caldera can explain both uplift and a residual gravity change of up to 64 µGal between 1982 and 1998. A magmatic source for deformation was initially favoured to explain deformation and gravity relationships observed at Campi Flegrei in the early 1980s (Berrino et al. 1984). Later studies, however, have suggested that the deformation between 1983 and 1984 was due to migration of hydrothermal fluids (Battaglia et al. 2006).
The simple assumption that residual gravity anomalies-those that remain after correction for deformation-are due to mag-matic processes can be questioned, as potential non-magmatic mass change sources have been identified. A number of authors have proposed that induced gravity changes may be due to dynamic behaviour in the hydrothermal system itself (Gottsmann et al. 2005;Todesco & Berrino 2005;Todesco et al. 2010). To explain shortperiod (less than 1 hr) gravity variations of up to 20 µGal at Nisyros caldera, Gottsmann et al. (2005) suggest movement and accumulation of steam pockets. With addition of other geophysical data records, Gottsmann et al. (2007) propose that these gravity cycles are related to deformation associated with pressure changes within the hydrothermal system on Nisyros. Using TOUGH2 fluid-flow models, Todesco & Berrino (2005) demonstrate that fluid density changes associated with longer period perturbation of the hydrothermal system beneath Solfatara, Campi Flegrei may be responsible for the observed gravity residual fluctuations in the years after the early 1980s bradyseism.
The importance of understanding the distribution of fluids in the subsurface and the effect that it has on gravity measurements is now well acknowledged. Variations in water table elevation are an important correction during the reduction of gravity data (Jachens & Roberts 1985;Battaglia et al. 2003). However, direct measurements of water table and aquifer storage potential (porosity), concurrent with gravity surveys, are rarely available. Assumptions and workarounds used to mitigate this data deficiency include spatially and temporally interpolating the water table elevation, where some measurements do exist (Battaglia 1999), and assuming that the changes in water level are seasonal, secular and consistent (Carbone et al. 2003a;Budetta et al. 2004). Some studies simply demonstrate that the observed gravity signals are too large to be solely due to water table elevation variations, estimated from rainfall records and recharge assumptions (Jousset et al. 2000;Hautmann et al. 2009). Budetta et al. (1999) recorded seasonal gravity variations of 20 µGal (peak to peak) on Mt Etna. Although this hydrological gravity signal obscured volcanically induced gravity signals at distal stations, where the volcanic signals are smaller, they were still able to distinguish magma-induced gravity changes in more proximal localities. Their study demonstrates the importance of understanding hydrological component of recorded gravity changes, particularly where the expected signals of volcanic unrest are small.

Known hydrological influence
Currently, compensation for hydrological influences on gravity measurements at active volcanoes consists of accounting for variations in the elevation of the water table. This is normally achieved by representing the water table aquifer as an infinite slab. Under the infinite slab assumption, the gravity correction due to a change in water table elevation of δz, is given by where g wt is the water table gravity correction, G is the universal gravitational constant (6.67 × 10 −11 m 3 kg −1 s −2 ), ρ w is the density of water and φ is the effective porosity of the aquifer. However, even where accurate g wt estimates are possible, through concurrent water table and gravity measurements, this hydrological correction fails to account for water-mass storage above the water table, in perched aquifers and within the unsaturated (vadose) zone. An increasing number of studies are using gravity measurements to assess and track water storage changes in the subsurface (e.g. Jacob et al. 2010;Naujoks et al. 2010;Creutzfeldt et al. 2012;Pfeffer et al. 2013;Hector et al. 2015). In a survey designed to assess water storage changes in karst environments in the south of France, Jacob et al. (2010) highlight the importance of mass storage in the vadose zone on gravity measurements. They suggest that the potential for mass storage in the epikarst, above the water table, is significant and in some sites is responsible for the entire seasonal gravity signal (Jacob et al. 2009). High precision, superconducting gravimeter experiments in Moxa, Germany have revealed seasonal, hydrologically induced, gravity signals ∼3 to 4 µGal (Naujoks et al. 2010). However, these signals are often masked in such sensitive gravimetric data by local hydrological effects, such as water content variations in the soil and disaggregated bedrock immediately surrounding the gravimeter (Krause et al. 2009). In many volcanic settings high permeabilities, combined with high relief, provide a potential for a deep water table and a relatively thick vadose zone, on the order of several tens to hundreds of meters (e.g. Yucca Mountain, Nevada, Klavetter & Peters 1986;Hawaii, Ingebritsen & Scholl 1993;Oregon Cascades, Hurwitz et al. 2003;Masaya Volcano, Nicaragua, Pearson et al. 2012). Capillary pressures in unsaturated material can promote water retention, especially in more fine grained material. Therefore, a thick vadose zone can have significant mass storage potential. Such hydrogeology, coupled with temporally and spatially variable recharge, has the potential to generate recordable and significant gravity signals independent of any change in water table elevation. For example, a change in liquid saturation fraction of 0.2 in the vadose zone with porosity of 0.25 results in a density change ( ρ) of ∼50 kg m −3 . Using a Bouguer slab approximation (eq. 2) and an estimated vadose zone thickness of h = 50 m, this change in water saturation would produce a gravity signal of over 100 µGal.
Even where saturation changes throughout the subsurface are integrated to provide effective water table changes, the Bouguer approximation has been shown to be inaccurate, especially where surface topography is significant (e.g. Creutzfeldt et al. 2008;Leirião et al. 2009;Creutzfeldt et al. 2012). By deriving time constants relating rainfall to gravity change, Crossley et al. (1998) developed an empirical method to correct for saturation and water table changes. Estimation of the time constants that relate rainfall to gravity changes requires regularly sampled (at least daily) and concurrent, gravity and rainfall time-series for a number of years (e.g. Mouyen et al. 2013). These time-constants are not necessarily uniform across a gravity survey network; the collection of many years of continuous gravity data across a 4D time-lapse microgravity network, with many tens of benchmark survey locations is, unfortunately, not feasible. Here, we couple hydrological simulations with gravity calculations to model potential gravity signals associated with temporal and spatial variations in groundwater recharge that are typical of a number of volcanic environments. We detail a method for converting a finite element distribution of saturation changes into modelled gravity signals. This method can provide a means to identify and correct hydrological contributions to recorded gravity signals.

Hydrological model
Hydrological simulations are performed using the TOUGH2 code (Pruess 2004)    . Panel (a) demonstrates the increasing cell volume with distance from the axial centre (at x = 0). Cell volume is less below the survey locations (black inverted triangles) as the cell width is reduced to 1.25 m. Cell thickness is reduced from 10 to 2 m in the regions where 250 m < z < 400 m and 0 m < z < 100 m. Permeability in panel (b) reduces exponentially with depth (d), according to eq. (3). The plot here includes a low permeability region (low-k core) at 0 m < x < 1400 m and −50 m < z < 250 m. atmospheric pressure). Recharge is modelled by generating water in the surface cells of the domain.
The domain is discretized into 51 457 cells for flat models and 41 609 for topo models. In the axis-symmetric models cell volumes generally increase with radial distance from the axis at x = 0 (Figs 2a and 3a). We refine the column widths below the modelled survey benchmark locations from 10 m down to 1.25 m (beneath black inverted triangles in Figs 1-3). Cell thicknesses range from 2 to 10 m. Absolute permeability is described as an exponential function with depth, d, after Saar & Manga (2004): where k is permeability in m 2 , at depth d in metres, relative to the ground surface. k 0 is the surface permeability (5 × 10 −13 m 2 ) and λ = 0.004. Key hydrological properties are described in Table 2. In some simulations, this basic permeability distribution is modified Table 2. Permeability, porosity and van Genuchten (capillary pressure and relative permeability) parameters (α, m, S lr and S gr ) used in simulations. α is related to air-entry pressure, and m to pore size distribution. S lr and S gr are residual liquid and gas saturations, respectively. The models implement the modified van Genuchten curves of Luckner et al. (1989) and incorporate the method of Webb (2000) which uses a logarithmic extension of the capillary pressure curve for saturations below S lr + ε. to include a low permeability core in the region 0m ≤ x ≤ 1400 m and −50 m ≤ z ≤ 250 m (Figs 2b and 3b), reflecting the conceptual model Type 1, described in Hemmings et al. (2015). This conceptual model cites the presence of a low permeability core of intrusive and altered volcanic material to explain high discharge springs at high elevations around the extinct volcanic complex of Centre Hills, Montserrat. The hydrological properties of this region are defined as 'low-k core' in Table 2.

Recharge
Recharge, reflecting the proportion of precipitation that enters the groundwater system, is specified as water generation in the uppermost cells (below the atmosphere). Volcanic settings are commonly associated with high relief topography. There are often strong relationships between elevation, precipitation, and associated recharge. Spatial recharge variation can also be strongly controlled by evapotranspiration which reflect natural and anthropogenic changes in land-use and vegetation distribution. Recharge models can be used to estimate spatially distributed recharge rates, accounting for variable precipitation, as well as temperature and land-use controlled evapotranspiration (e.g. Hughes et al. 2008;Hemmings et al. 2015).
Here, we use recharge model results for Montserrat, presented in Hemmings et al. (2015) , to define an elevation dependant recharge rate: where r z is recharge in mm d −1 at elevation z in metres above sea level. Each model scenario is subject to an initial simulation phase which is run to steady state. In this first simulation phase, the initially saturated models drain under constant recharge to establish a steady-state water table elevation and saturation condition. The models are then subject to 100 yr of six-monthly and annually varied recharge. The varied recharge input is pseudo-synthetic, derived from observations from Montserrat, where annual rainfall varies spatially and temporally from 1000 to 3000 mm yr −1 , and recharge from <50 to >2000 mm yr −1 . A series of 100 annual factors are generated from random gaussian distribution, based on the modelled annual variation in the recharge for Montserrat (Hemmings et al. 2015), with standard deviation of 0.5 about a mean of 1. The recharge rate in each cell, as defined by eq. (4), is multiplied by the same series of annual factors, producing a 100 yr varied recharge input for each surface cell. We also explore the effect of seasonality of recharge on subsurface saturation by superimposing a regular seasonal variation, whereby 15 per cent of the annual recharge is assigned evenly across the first six months of each year and 85 per cent evenly across the second six months. This is achieved by adjusting the annual recharge rate to a six-monthly step function; the total volume of recharge for each year is preserved.
Eight contrasting model scenarios are simulated, exploring the effects of topography and spatially varied recharge on the gravity signal at the surface. The differences between these model scenarios Table 3. Summary of differences between model simulations. flat-type models have uniform ground surface elevation at 400 m (see Figs 1a and 2) and are denoted by horizontal line in topography column. The top surface of topo-type models is based on a topographic profile through a typical volcanic arc island (see Figs 1b and 3) as denoted by the profile line in the topography column. Horizontal lines and profile lines in the Recharge column reflect spatially homogeneous and elevation dependant recharge, respectively. A checkmark in 'low-k core' column indicates that the simulation includes a low permeability unit in the region 0 m ≤ x ≤ 1400 m and −50 m ≤ z ≤ 250 m. are summarized in Table 3. For model flat1 and flat2, the recharge rate is consistent across all surface cells. The initial steady-state recharge rate is 1.44 mm d −1 (525 mm yr −1 , 1.66× 10 −5 kg s −1 m −2 ). The 100 yr variable recharge time-series is displayed in Fig. 4. Models topo1, topo2 use the same spatially uniform recharge rate, neglecting the elevation control on recharge. In models topo3 and topo4 , the recharge rate in each surface cell varies with elevation, according to eq. (4). The same spatial recharge variation is forced on the 'flat' models flat3, flat4. In the models with spatially varied recharge, the phase of the 100 yr recharge time-series is consistent across the surface and only the amplitudes of the recharge rate differs between recharge cells.

Simulation time-steps
The time steps during the flow simulations are automatically adjusted, depending on the convergence rate of the Newton-Raphson iterations used to solve the nonlinear, coupled flow equations. The during our simulations the time-step ranges from 1 s to an enforced maximum of 25 d. Time step lengths are modified to honour the time of generation rate changes. The mean and modal time-step length for each simulation range from 4.7 to 18.9 d, and 1.5 to 23.5 d, respectively. The mean time-step length of all simulations is 8.9 d. To prevent aliasing of saturation changes when formulating the gravity signal, the modelled saturation data is outputted at least once every 3 months.   . Geometrical illustration of the gravity due to an elemental ring, centred at α = 0 and elevation z i , with radius α i , elevation thickness dz and radial thickness dα, recorded at point P. S(θ ) is the distance from P to any point on the ring and is defined in eq. (6). Leirião et al. (2009) present a detailed method for converting 3-D finite-difference hydrological model results into simulated gravity changes. Their method, also adopted by Piccolroaz et al. (2015), calculates the contribution of each element to the simulated gravity change at any given point by using either in the a prismatic volume-mass approach (after, Nagy 1966), the MacMillan equation (Creutzfeldt et al. 2008) or a point-mass solution. Both the prism and MacMillan equations incorporate information about the dimension or shape of the elements. Which method is used for each element is related to the shape of the element and the distance to the simulated gravity survey point. The relatively computationally intensive prism equation is used for the most proximal elements, the point-mass equation for the most distal, and the MacMillan equation is used for intermediate elements.

Gravity formulation
Similar to the approaches outlined by Leirião et al. (2009), we develop a method for calculating the gravity signal at a given surface location (P) that results from the distribution of mass throughout the entire model domain. Our approach is an extension of the point-mass method. Apart from computational effi- ciency, the advantage of the point-mass method for our purposes is geometrical flexibility; it allows the calculation of simulated gravity changes from linear and axisymmetric 2-D hydrological models.
The contribution of each element (i), with density ρ i , centred at cylindrical coordinates (α i , z i ), to the gravity measurement recorded at P(α p , z p ) is calculated by considering the element as a ring centred on α = 0 (Fig. 5). The vertical component of gravitational acceleration (in m s −2 ), recorded at point P, associated with a ring of radius α i , radial thickness dα and elevation thickness dz, is defined by integrating around the ring for angles 0 ≤ θ < 2π radians: where s(θ) is a function defining the distance between benchmark survey location and each elemental component on the ring:   The total gravity change recorded at point P due to a change in mass or density within the model is calculated by summing the contribution of all the elemental rings: For the purpose of modelling gravity changes due to water-mass storage, the change in density of an element is defined by: where S li is the change in liquid saturation, ρ w is the density of water and φ i is the porosity of the cell i. Eqs (7) and (8) are used to compute the gravity signal created by temporal variations in saturation at six survey locations on the surface of the model domain (black inverted triangles in Fig. 1). For comparison, we also compute the gravity correction ( g wt , 266 B. Hemmings et al. Step in water table at x ∼ 1400 m in panel (a) is due to low permeability unit. Solid line in panel (b) is the gravity signal calculated based on the distributed changes in saturation (eqs 7 and 8). The dashed line for P1 to P3 represents the commonly applied correction, based on water table elevation (eq. 1). Poor resolution in the model below P4 to P6 at the elevation of the water table causes large step artefacts in the water table correction; therefore, it is not displayed. eq. 1) associated with the modelled change in the water table elevation below each survey location. Modelled gravity signals are presented in µGal (1 µGal = 1 × 10 −8 m s −2 ).

R E S U LT S
Each model scenario simulation (Table 3) results in a 100 yr g time-series at every survey location (P1-P6). All model scenarios produce a variation in g of over 150 µGal during the 100 yr of varied recharge for at least one survey location. The simulations produce a maximum g between 55-60 yr and 65-70 yr and a minimum g between 85 and 95 yr, with respect to a value of g = 0 at time zero. The minimum and maximum of each simulated time-series is displayed in Fig. 6 and in tabular form in Table A1.
The gravity signals in simulations flat1 and flat2, without spatially variable recharge and with flat topography, are similar for each simulated survey location. (e.g. flat2, Fig. 7). Consequently, the minimum and maximum gravity changes in the 100 yr of simulation are relatively consistent for survey locations P1-P6 (Fig. 6). The model scenarios that incorporate topography and/or the spatially varied recharge input demonstrate more gravity signal variability between different survey locations (e.g. topo4, Fig. 8). Signal amplitudes are greater for the stations closer to the axis.
Compared to the spatially uniform recharge scenarios (flat1, flat2, topo1 and topo2), recharge is higher towards the axial boundary of the spatially varied scenarios (flat3, flat4, topo3 and topo4). As a result, the magnitude of the temporal variation is also higher. Similarly, recharge, and therefore the magnitude of storage variations, is lower in surface cells towards the distal boundary in these models (where elevation is lower). This contributes to higher g amplitudes for the more axial stations in models flat3, flat4, topo3, and topo4. The presence or absence of a core with low permeability and porosity appears to have a very limited effect on the simulated gravity signals in these models, although it does raise the water table by up to 135 m in the interior of the model. Modelled g time-series for all scenarios are presented in Appendix A (Figs A1-A6, Supporting Information).

D I S C U S S I O N
The flow simulations with 100 yr of variable recharge presented above display appreciable and recordable changes in gravity, on the order of 10s to 100s of µGal. The results indicate that topography, even in the absence of any associated spatial recharge variation, can produce significant spatial variations in gravity changes (compare P1 and P6 in models topo1 and topo2, Figs A2 and A3). This is related to differences in vadose zone thickness, induced by high topographic relief, and relatively high and homogeneous permeabilities. The vadose zone thickness at P6 in the topographic models is just 20 m, compared to between 300 and 400 m at P1 and P2 (with and without the lower permeability core, respectively). With the relatively thin vadose zone beneath P6, recharge reaches the water table relatively quickly; from here it is transported laterally rather than stored, and the recharged mass has less effect on the gravity recorded at a fixed location on the surface. The mass storage potential beneath the survey location is relatively low and therefore the amplitude of the gravity signal is low, compared to P1. The tendency for vadose zone thickness to correlate with elevation means that gravity changes associated with saturation variations are likely to be stronger towards volcano summits, potentially mimicking the spatial patterns commonly associated with magmatic signals. Spatial variations in recharge and storage also have a strong control on spatial differences in modelled gravity. The amplitude of the gravity signal is largely dependent on the magnitude of the temporal recharge variation. Compare, for example, the results from the flat model with spatially varied recharge (flat3 and flat4, Figs A4 and A5) to those with spatially uniform recharge (flat1 and flat2, Figs A1 and 7).
The modelled gravity changes are dominated by inter-annual recharge variations. The seasonal gravity fluctuations are generally low amplitude (<10 µGal). Small seasonal gravity variations are comparable with early estimates from Mount Etna (Sanderson 1982), however, more recents studies suggest that the seasonal hydrological component of gravity measurements from Etna may be on the order of 20 µGal (Budetta et al. 1999;Carbone et al. 2003b). None of the model scenarios presented here produce seasonal changes of that magnitude. However, it may be possible to generate such signals under alternative hydrological regimes that are not investigated in this model suite.

Comparison to common water table correction
Correction of gravity signals for hydrology, using eq. (1), most commonly involves some estimate of water table elevation changes (δz). More rarely, direct measurements of δz are used. In our simulation we calculate this correction, under the idealized scenario of knowing δz beneath each survey location, through time. We can compare this water table correction with the modelled distributed gravity signal that also incorporates vadose zone mass changes.
Unfortunately, evaluation of δz in the models, and therefore the estimation of the associated gravity correction, is limited by the vertical resolution of the models cells at the level of the water table. Where cell thickness at the water table is coarse (10 m), the apparent water table elevation in the model can jump as cells cross a threshold between saturated and unsaturated conditions. The associated gravity correction (eq. 1) is sensitive to this artefact and direct comparison with the distributed saturation calculation is not appropriate. The result of this calculation is only displayed where the modelled water table coincides with finer cell thickness (2 m) . In these cases, it is possible to compare the axis-symmetric distributed gravity calculation with the water table correction (eq. 1). For the flat topography models (flat-type), the distributed gravity calculation matches the water table correction well (e.g P1 and P2 in Fig. 7b). However, for models with topography (topotype) the water table exhibits a phase-lag of ∼5 yr compared to the distributed gravity calculation (e.g. P1-P3 in Fig. 8b). With this phase-lag the water table correction miss-corrects by up to 100 µGal.

Relative gravity changes
In order to correct for instrument drift and large spatial scale gravity variations, it is common to record gravity changes relative to a reference site or base station. In order to mimic this technique, we calculate relative gravity signals using the most distal station (P6) as the reference site. The reference time-series is subtracted from the signal from the other stations to provide time-series of relative gravity changes. The minimum and maximum of each relative 268 B. Hemmings et al. Figure 10. flat2 relative gravity change time-series (thick black lines). Relative gravity change at Pn (n = 1, 2, . . . , 6) are calculated with P6 as the reference station by g Pn rel = g Pn − g P6 . g Pn is shown as the thin solid line on each plot; g P6 , the reference signal, is shown by the thin dashed line. As P6 is the reference station, g Pn rel = 0 for that station.
gravity time-series is displayed in Fig. 9 and in tabular form in Table B1. The relative gravity time-series for flat2 and topo4 are presented in Figs 10 and 11; relative gravity time-series plots for the remaining simulations are provided in Appendix B (Figs B1-B6, Supporting Information). Calculating relative gravity, with P6 as the reference station reduces the signal amplitude produced by the recharge variation for all stations in all the simulations. The effect is most significant for the flat models, with spatially uniform recharge (flat1 and flat2, e.g. Fig. 10). For topo3 and topo4 the g signal is low amplitude at P6 and the relative gravity reduction has limited effect on the gravity change signal for each station (e.g. Fig. 11). For these scenarios, with topography and spatially varied recharge, relative gravity changes still reach 100-150 µGal during the 100 yr simulation.
The suite of scenarios investigated here demonstrates that when there are spatial variations in recharge and storage potential, especially between survey sites and the reference location, calculating relative gravity does not effectively remove the signal associated with temporally varied recharge. Importantly, in the models presented here, the phase of the input recharge signal is consistent across the surface. As a result, the modelled gravity changes over time at each station are also in-phase. Where spatial variations in recharge and storage potential are more complex, and the temporal variations across the survey sites are not necessarily in-phase, the calculation of relative gravity changes could introduce an erroneous signal. Spatially complex recharge and storage potential is not uncommon in high relief volcanic regions. Additionally, reference stations are often necessarily distal, or, as can be the case on small volcanic islands, they may be located on a different island with a very different hydrogeological regimes.

Implications for gravity surveys
The simulations presented here demonstrate that temporal and spatial hydrological recharge variations can generate recordable gravity changes. However, in order to assess the implications of rechargeinduced signals on gravity monitoring surveys, it is important to consider the timescales of these signals relative to the timescales of the surveys.
Continuous gravity records of over long time periods are rare, but some do exist with lengths on the order of years (e.g. Jousset Figure 11. topo4 relative gravity change time-series (thick black lines). Relative gravity change at Pn (n = 1, 2, . . . , 6) are calculated with P6 as the reference station by g Pn rel = g Pn − g P6 . g Pn is shown as the thin solid line on each plot; g P6 , the reference signal, is shown by the thin dashed line. As P6 is the reference station g Pn rel = 0 for that station. Budetta et al. 2004). More common continuous gravity surveys in volcanic environments are at short timescales of hours to days (e.g. Gottsmann et al. 2007). The timescales of the flow investigations here do not provide insights on such short timescales. The most widely used gravity survey technique is discrete reoccupation of a survey network at intervals on the order of weeks, months and years (e.g. Carbone et al. 2003a;Hautmann et al. 2010). Such studies attempt to interpret the changes in the gravity signal that occur on the timescales of the reoccupations. Table 1 provides examples of the gravity change amplitudes recorded at volcanoes experiencing unrest (60-150 µGal). Also provided are the timescales that the changes are recorded over by volcano monitoring campaigns (1-16 yr). Gottsmann et al. (2005) highlight the issue of the Nyquist frequency for dynamic gravity surveys; only signals with a period greater than two times the occupation interval can be unambiguously resolved. The propagated uncertainty of measurements from individual stations within a typical dynamic gravity volcano monitoring network can be up to 10 µGal (Battaglia et al. 2008;Hautmann et al. 2010).
To illuminate the amplitude of our modelled gravity changes, over typical field campaign intervals, we calculate changes over 1, 2, 5 and 10 yr windows , rolling over every data point in the modelled gravity time-series. Performing this operation, over these four campaign intervals ( t) on the gravity change times series (f(t) = g(t)) results in four discrete-time-type derivative time-series ( f (t) t ) which can be described by: Using the discrete-time derivatives we can assess the statistical probabilities associated with the magnitude of gravity changes recorded over campaign intervals of 1, 2, 5 and 10 yr. We compute distribution and exceedance probabilities for | g| over these fixed time intervals. Fig. 12 shows the derivative time-series and exceedance probability plots for model topo4. If two gravity readings, 1 yr apart ( t = 1 yr), were made at site P1 during our 100 yr simulation, the probability that | g| would exceed 25 µGal is 17 per cent. If the interval between campaigns is 10 yr, the 25 µGal exceedance probability is 60 per cent. This analysis for model topo4 suggests that | g| recorded at typical campaign intervals can exceed 50 µGal for 1 yr, 70 µGal for 2 yr, 100 µGal for 5 yr and 160 µGal for 10 yr. However, the majority of the changes recorded at station P1 over 270 B. Hemmings et al. these windows would be 6-20, 10-30, 10-45 and 15-60 µGal, respectively (Fig. 13). Table 4 illustrates that hydrologically induced gravity changes modelled at P1 of model topo4 are unlikely to produce the amplitude of gravity changes on the timescales provided as examples in Table 1. However, where the time intervals between surveys are large and the gravity changes are low there is increased potential of recording hydrologically induced gravity changes and misinterpreting them as volcanic signals. The equivalent figures for models flat1 to flat4 and models topo1 to topo3 are provided in Appendix C (Figs C1-C14, Supporting Information).
The geometry, hydrology and recharge variations in the simulations presented here are based on data and observations from the sub-tropical volcanic island of Montserrat. The results clearly have implications for gravity surveys of a number of volcanic systems with similar recharge regimes to Montserrat. Our results suggest that time-lapse gravity surveys are still a relevant and valuable volcano monitoring tool. However, we demonstrate the importance of understanding the hydrological dynamics when interpreting gravity signals, especially where large temporal and spatial recharge variations exist. The magnitude of inter-annual recharge variations have a critical control on the amplitude of hydrologically induced g signals. An additional suite of models has shown that hydraulic properties also exert a control on the amplitude of water-mass storage variations resulting from fluctuating recharge (Hemmings 2014).
For example, lower permeability slows the transfer of recharging groundwater, producing low frequency but large amplitude g signals. Conversely, porosity has the opposite effect; lower porosity reduces storage capacity but also promotes rapid saturation variations, and therefore more rapid water migration through the vadose zone, producing lower amplitude but higher frequency g signals.

Dynamic gravity survey recommendations
The insights gained from the numerical simulations presented here allow us to provide some recommendations for performing and interpreting dynamic gravity surveys in areas where spatial and temporal hydrological variations are likely to be present.
(i) Collect data on water table elevation and rainfall variability. (ii) Incorporate continuous gravimeter data, where possible, to test reference station stability and response to rainfall fluctuations.
(iii) Perform frequent reoccupations and attempt to illuminate relationships between g and seasonal/inter-annual rainfall variation.
(iv) If possible, perform numerical fluid-flow to g simulations, such as those presented here, to assess the potential magnitude of hydrologically induced g signals, and ideally correct for distributed water-mass storage changes above and below the water table. Figure 13. Boxplot of gravity changes over 1, 2, 5 and 10 yr campaign intervals for model topo4. The boxes cover the second and third quartile, with the median gravity change is marked by a red line. Whiskers covers the range between the 5th and 95th percentile. Outliers, marked by crosses are outside this range. A priori information on the distribution of hydrogeological heterogeneity can and should be incorporated into the flow models.

Broader relevance for geophysical monitoring
The combined fluid-flow and gravity simulation results presented here also have implications for other geophysical volcano monitoring techniques that are sensitive to hydrological flow and fluctuations. Saturation has a strong control on electrical properties of the subsurface. A common assumption is that electrical resistivity is proportional to S −n l , following Archie's law (Archie 1942). Although a number of studies have demonstrated that the relationship between electrical resistivity and saturation is more complicated, order of magnitude scale resistivity changes are regularly observed over the saturation ranges produced by our models (e.g. Frohlich & Parke 1989;Knight 1991;Khalil & Monterio Santos 2009).
Mass storage variations, such as those presented in these models, also produce geomechanical loading effects that can generate geodetic deformation signals. Hydraulic (un)loading effects related to seasonal and longer term variations in rainfall, snowfall, snowpack thickness, and groundwater abstraction have been implicated in fluctuating GPS signals recorded in California (Argus et al. 2014;Amos et al. 2014) and the Cascades (Fu et al. 2015). Equivalent water table changes on the order of 0.5 m, similar to our simulations, have been linked to vertical ground surface oscillations on the order of 10 mm in the Californian mountains (Argus et al. 2014). Whether related to stress field changes associated with seasonal loading or lubrication of faults by groundwater, seasonal seismicity has also been reported in the literature (e.g. Jiménez & Garca-Fernández 2000). Regular seasonal geophysical signals are relatively easy to identify and can be filtered out of monitoring data. However, the results we present here suggest that inter-annual recharge variations can generate higher amplitude saturation fluctuations. The geophysical signals associated with these inter-annual fluctuations are not necessarily predictably cyclical. Such signals are therefore less easily identified in monitoring data and can be difficult to remove using standard signal processing techniques alone.
When interpreting geophysical monitoring signals, care should be taken to ensure that non-cyclical hydrological fluctuations are not misinterpreted as geological or volcanic signals. This study indicates that the magnitude of a hydrological contribution to a geophysical monitoring signal will be site-specific; dependent on local climate and weather variations, hydrogeology, and the amplitude of the geophysical signal of interest. Incorporating numerical flow modelling into geophysical monitoring campaigns may be useful for assessing the likelihood of hydrological contamination of a particular geophysical monitoring record.

C O N C L U S I O N S
The flow simulation scenarios investigated here demonstrate that variations in groundwater recharge can generate changes in subsurface mass distributions that are large enough to produce recordable gravity signals. Inter-annual recharge variations dominate modelled gravity changes, with signals on the order of 10s-100s of µGal. Gravity changes result from mass (saturation) changes within the vadose zone as well as water table elevation changes. Spatial variations in recharge can generate significant spatial differences in recorded gravity changes, which are sufficient to produce artefacts in relative gravity measurements.
For the 100 yr varied recharge simulations explored here, the probability of recording exceptionally large gravity changes between typical monitoring campaigns is relatively low. However, we have demonstrated that recordable gravity signals, on the order of 10s of µGal can be generated by simple and not atypical variations in recharge. This demonstrates the importance of understanding spatial and temporal groundwater behaviour when interpreting gravity signals in monitoring locations where the hydrology is dynamic.
Numerical fluid-flow models present a valuable tool for investigating and estimating the effect of hydrological processes on gravity measurements. As such, the simulations and work-flow presented here can provide a method for assessing the potential for hydrologically generated gravity signals and correcting for them, where necessary. The approach presented here can also be adapted to explore potential hydrological contributions to other geophysical monitoring signals.

A C K N O W L E D G E M E N T S
This work is funded by the NERC BUFI programme (studentship no. S186), The Royal Society and EC-FP7 (#282759) 'VUELCO'. Thanks are due to Adrian Croucher (University of Auckland) for developing pyTOUGH and promptly responding to questions and edit requests; to Stefan Finsterle (Lawrence Berkeley National Laboratory) for his patient support of TOUGH2; Stefanie Hautmann