## Summary

Here, we evaluate the improvement in noise correlation functions (NCFs) gained by dividing ambient seismic records into shorter, overlapping time windows before correlation and stacking (Welch’s method). We compare the waveform convergence of short-duration NCF stacks (e.g. 2-, 5-, 15- and 50-d stacks) towards the long-term (365-d) NCF stack. We observe short-duration NCF improvement when applying Welch’s method to non-pre-processed and running-normalized time-series, and short-duration NCF degradation when it is applied to ‘one-bit’ normalized time-series. Surprisingly, the non-pre-processed time-series provide the quickest convergence to a robust (year-long) NCF. Because of the simplicity of Welch’s method, the improved NCF convergence and the minimal increase in computation, we recommend applying Welch’s method in future ambient seismic field analyses. This approach will likely improve future NCF analyses, particularly for studies with limited-duration recordings, high levels of intermittent local or site noise, and studies attempting to evaluate temporal variations in subsurface structure.

## Introduction

The concept of using the Earth’s ambient seismic field to study its structure has advanced significantly over the past half-century. Aki (1957) first suggested that the phase velocity of the propagating wavefield beneath a seismic array could be estimated by spatially correlating ground motion, which was shown to yield a Bessel function. This idea, which became known as the spatial autocorrelation method, was the first step. Claerbout (1968) then proposed that the temporal average of this spatial correlation could retrieve the impulse response itself.

By cross-correlating the ambient noise recorded at two stations over a period of time, the Green’s function or impulse response between the two stations can be extracted, assuming that the noise source distribution is spatially homogeneous around the stations (e.g. Lobkis & Weaver 2001; Derode *et al.* 2003; Shapiro & Campillo 2004; Snieder 2004; Wapenaar 2004). The result of days, months or years of stacked correlation is often called a noise correlation function (NCF; e.g. Roux *et al.* 2005; Sabra *et al.* 2005). From these NCFs, estimates of surface wave dispersion can be made and then inverted to obtain 1-D velocity estimates between the stations. When such analyses are applied to an entire array, using every combination of stations, 2-D and 3-D velocity structures can be imaged (e.g. Shapiro *et al.* 2005; Lin *et al.* 2007; Yang *et al.* 2007; Bensen *et al.* 2009). More recently, NCFs have been used for monitoring very small changes in a medium’s velocity structure (Wegler *et al.* 2006; Brenguier *et al.* 2008a,b).

These innovations led to the use of the ambient seismic field for surface wave tomographic studies on a variety of scales (e.g. Shapiro *et al.* 2005; Brenguier *et al.* 2007; Lin *et al.* 2007; Yang *et al.* 2007; de Ridder & Dellinger 2011). The technique works for nearly any seismic array, as it does not require earthquake sources to occur near the array, allowing the seismic structure beneath aseismic regions to be estimated (e.g. Liang & Langston 2008). Ambient seismic field analysis can also provide surface wave dispersion measurements at periods shorter than 20 s (e.g. Shapiro *et al.* 2005), which are often lacking in earthquake studies because of attenuation over long earthquake-to-receiver paths.

Multisensor ambient noise analyses have been applied to a variety of subjects including ultrasonics (Weaver & Lobkis 2001; Larose 2006), helioseismology (Duvall *et al.* 1993; Rickett & Claerbout 1999, 2000), ocean acoustics (Roux & Kuperman 2005), engineering (Kohler *et al.* 2005; Snieder & Safak 2006; Sabra *et al.* 2007; Prieto *et al.* 2010), crustal seismology (Shapiro *et al.* 2005; Yao *et al.* 2006), exploration seismology (Schuster *et al.* 2004; Draganov *et al.* 2007), seismic monitoring (Sens-Schönfelder & Wegler 2006; Brenguier *et al.* 2008a,b) and structural monitoring (Sabra *et al.* 2007; Larose & Hall 2009). Clearly, the extraction of robust NCFs plays an extremely important role in converting ambient noise into models of physical properties across a wide range of disciplines.

In this study, we examine several variant methods of NCF generation in an attempt to quantify which method consistently generates the most stable NCF from ambient seismic records. Towards this aim, we seek to answer certain fundamental questions, such as ‘How few data are needed to produce a “reliable” NCF?’ and ‘How does the stability of the NCF change with increased duration of data?’ We also investigate how common pre-processing steps, such as performing a running normalization (Bensen *et al.* 2007) or a ‘one-bit’ normalization (e.g. Shapiro & Campillo 2004; Shapiro *et al.* 2005) of the recorded time-series, affect the NCF. In the following sections, we present the methods used, describe the results obtained and discuss why the optimal method (using short, non-pre-processed, overlapping windows) is more successful than the others.

## Methods

Here, the Green’s function or NCF is approximated using the coherency as presented in Prieto *et al.* (2009). The coherency, γ_{AB}(ω), is calculated in the frequency (ω) domain as

$$\gamma_{AB}(\omega) = \left\langle \frac{u_A(\omega)\, u_B^{*}(\omega)}{\{|u_A(\omega)|\}\, \{|u_B(\omega)|\}} \right\rangle , \tag{1}$$

where *u*_A(ω) and *u*_B(ω) are the spectra of the time-series recorded at stations A and B, ⟨⟩ denotes the ensemble average over time windows, * represents the complex conjugate, | | indicates the absolute value of the spectra and {} indicates the 20-point frequency running average used for whitening the signals. Note that the division by the smoothed spectra whitens the coherency and thereby normalizes the amplitudes.
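For concreteness, below is a minimal NumPy sketch of eq. (1) for a single time window; the ensemble average ⟨⟩ over many windows happens outside this function, and the function and parameter names are ours, not from the original study.

```python
import numpy as np

def running_average(spectrum, npts=20):
    """Running average of a real, non-negative amplitude spectrum;
    plays the role of the smoothing operator {} in eq. (1)."""
    kernel = np.ones(npts) / npts
    return np.convolve(spectrum, kernel, mode="same")

def coherency(u_a, u_b, smooth_pts=20):
    """Whitened cross-spectrum (coherency) of one time window.

    Dividing by the smoothed amplitude spectra of each record whitens
    the result, normalizing amplitudes without any time-domain
    normalization of the input series.
    """
    spec_a = np.fft.rfft(u_a)
    spec_b = np.fft.rfft(u_b)
    denom = (running_average(np.abs(spec_a), smooth_pts) *
             running_average(np.abs(spec_b), smooth_pts))
    denom[denom == 0.0] = np.finfo(float).tiny  # guard against division by zero
    return spec_a * np.conj(spec_b) / denom
```

Stacking the coherencies of many windows and inverse-transforming the stack then yields the time-domain NCF.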

Initially, this method was applied by the authors to preserve amplitudes for basin amplification (Prieto & Beroza 2008) and attenuation estimation (Prieto *et al.* 2009; Lawrence & Prieto 2011). Prieto *et al.* (2011) provide a review of this technique and its merits. No ‘one-bit’ or running normalization is applied when computing the coherency as given by eq. (1). We posit that changing the time-domain amplitudes with a non-uniform amplitude gain correction alters the amplitudes and phases in the frequency domain, thereby reducing the stability of any measured coherency (e.g. Lynn 1989). The coherency studies of Prieto *et al.* (2009) and Lawrence & Prieto (2011) noted improved results from ensembles of many short time windows (1-2 hr) compared with fewer, longer time windows (24 hr).

Here, we more robustly determine how and why NCFs are optimized by using shorter time windows, following the technique of Prieto *et al.* (2009). We note that, for a non-stationary seismic noise signal, there is no scientific justification for stacking correlations of 24-hr (86 400 s) time-series other than the ease of obtaining data at that length. In this study, we vary the duration of the time windows in which the NCFs are calculated, from 15 min to 24 hr, to determine whether any of the resulting NCFs are more reliable than others.

In addition, we examine the effect of computing the coherency with overlapping time windows, based on the method presented by Welch (1967), as used in single-station site noise studies (e.g. Sleeman *et al.* 2006; Evans *et al.* 2010). An example of overlapping is shown in Fig. 1. Using overlapping windows has several consequences. First, Welch’s method removes any dependence the NCF may have on an arbitrarily chosen beginning and ending of the time-series. As a result, any single high-amplitude transient signal has less effect on the ensemble average (e.g. Prieto *et al.* 2011; Lawrence & Prieto 2011). In this study, we vary the amount of overlap of the time windows from 0 to 75 per cent, with no tapering applied at the beginning and ending points of the time window. We also compare this coherency method against two other common methods of pre-processing: ‘one-bit’ normalization (e.g. Campillo & Paul 2003; Sabra *et al.* 2005; Shapiro *et al.* 2005) or running normalization (e.g. Bensen *et al.* 2007), each followed by pre-whitening. We note for clarity that pre-whitening a time-series that is not normalized in the time domain is mathematically equivalent to the coherency; the only difference between the coherency method of Prieto *et al.* (2009) and the methods of others is the time-domain normalization step.
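The sketch below illustrates, under our own naming, the three ingredients compared in this study: overlapping window segmentation (Welch’s method) and the two time-domain normalizations. The running-normalization window half-width is an arbitrary illustrative value, not one specified in the text.

```python
import numpy as np

def window_starts(n_total, n_win, overlap=0.75):
    """Start indices of (possibly overlapping) windows; no taper is applied."""
    step = max(1, int(n_win * (1.0 - overlap)))
    return list(range(0, n_total - n_win + 1, step))

def one_bit(x):
    """'One-bit' normalization: retain only the sign of each sample."""
    return np.sign(x)

def running_normalize(x, half_width=100):
    """Running-absolute-mean normalization in the spirit of Bensen et al.
    (2007): divide each sample by the mean absolute amplitude in a moving
    window centred on it."""
    n = 2 * half_width + 1
    weights = np.convolve(np.abs(x), np.ones(n) / n, mode="same")
    weights[weights == 0.0] = np.finfo(float).tiny  # avoid division by zero
    return x / weights
```

For one day of 5 Hz data (432 000 samples) cut into 1800 s windows (9000 samples), `window_starts` returns 48 start indices with no overlap and 189 with 75 per cent overlap, quadrupling the number of windows in the ensemble average for the same data.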

We evaluate whether the process of rejecting time windows containing transient signals above a specified threshold (e.g. Sabra *et al.* 2005) positively or negatively impacts the reliability of the NCF. In theory, large-amplitude transient signals (such as earthquakes) ‘contaminate’ NCFs. Here, we systematically examine how various data-rejection thresholds, between 5 and 11 times the standard deviation of the specific time window, affect the resulting NCF for a network of data.
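A hedged sketch of this rejection test follows; the text specifies thresholds of 5-11 times the standard deviation of the window, but whether the test compares the peak absolute amplitude of the demeaned window, as assumed here, is our interpretation.

```python
import numpy as np

def reject_window(window, threshold=7.0):
    """Return True if the window contains a transient whose peak absolute
    amplitude exceeds `threshold` times the window's own standard
    deviation (thresholds of 5-11 are tested in the text)."""
    demeaned = window - window.mean()
    return np.max(np.abs(demeaned)) > threshold * demeaned.std()
```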

In this study, we assume that better temporal averaging leads to better spatial averaging, which is one of the primary challenges in making NCFs approximate the interstation Green’s function. By stacking (or averaging) many uniformly distributed normalized cross-correlations into the same time-delay space, only stationary phases should emerge (e.g. Sabra *et al.* 2005; Gerstoft *et al.* 2006). Stacking more subsets of data from unique sources should lead to a closer approximation of the ideal uniform spatial averaging. Therefore, stacking 365 d of correlations should yield a better estimate of the interstation Green’s function than a 5- or 10-d stack (e.g. Bensen *et al.* 2007). Such improvement in signal-to-noise ratio with increased data is clearly visible in Fig. 2.

We assert that a better technique for calculating NCFs should exhibit stacked NCFs that converge towards the long-term solution faster. For a given quantity of data, the better technique should yield short-term NCFs that more closely resemble the long-term NCFs. Alternatively, a better technique should produce equivalent NCFs given less data. We therefore stack NCFs for a range of data durations (1, 2, 3, 4, 5, 6, 10, 15, 20, 25, 30, 40, 50, 100 and 365 d). We then calculate the zero-time-lag correlation between each short-term stack and the 365-d NCF stack calculated using the same technique, which we call the convergence correlation (*R*_{C}).
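A minimal sketch of this metric is given below; we assume *R*_{C} is the standard normalized zero-lag correlation coefficient between the two stacked waveforms (removing the means is our choice).

```python
import numpy as np

def convergence_correlation(ncf_short, ncf_long):
    """Zero-time-lag correlation coefficient R_C between a short-duration
    NCF stack and the 365-d NCF stack (real-valued, equal length)."""
    a = ncf_short - ncf_short.mean()
    b = ncf_long - ncf_long.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```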

We compare the convergence correlations obtained by stacking correlations of time-series with various window lengths from 15 min to 24 hr (900, 1800, 3600, 7200, 14 400, 43 200 and 86 400 s). For each time window, we evaluate how specifying a range of rejection thresholds changes the convergence correlation. We test four threshold values: 5, 7, 9 and 11 times the standard deviation of the data about the mean. We then compare each time window length and rejection threshold with four values of overlap (0, 50, 67 and 75 per cent), with the expectation that more windowing of the data will better constrain the results. For a limited range of time windows, rejection thresholds and window overlaps, we computed the convergence correlation for all durations of stacked NCFs.

## Data

The data used for this study come from a subset of USArray’s broad-band vertical-component (BHZ) seismometers near Yellowstone National Park, as shown in Fig. 3. The cross-shaped array geometry was chosen to obtain a wide range of interstation distances and azimuths. With *N*= 20 stations, there are *N*(*N*- 1)/2 = 190 station pairs. All 20 station vaults were occupied by USArray in this area for 18 months, including the entire year of 2009. To allow for faster computation of NCFs, the data were bandpass filtered between 5 and 50 s (0.02 and 0.2 Hz) and downsampled from 40 to 5 Hz.
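One plausible realization of this preparation step is sketched below; the paper specifies only the band and the sampling rates, so the filter type, order and zero-phase application are our choices.

```python
from scipy import signal

def prepare_trace(trace_40hz):
    """Bandpass 0.02-0.2 Hz (5-50 s period) and downsample 40 Hz -> 5 Hz."""
    sos = signal.butter(4, [0.02, 0.2], btype="bandpass",
                        fs=40.0, output="sos")
    filtered = signal.sosfiltfilt(sos, trace_40hz)
    # Decimate by 8 with a zero-phase FIR anti-alias filter.
    return signal.decimate(filtered, 8, ftype="fir", zero_phase=True)
```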

## Results

Before comparing the entire network, we first compare convergence correlations for a single station pair (TA-E18A and TA-D24A; Fig. 4) given all tested combinations. As expected, the convergence correlation is strongly dependent upon the quantity of data stacked. At first, the NCFs converge rapidly (~0.1 change in *R*_{C} per day) over the first few days of stacked data, but they converge more slowly for longer stacks (<0.02 change in *R*_{C} per 50 d). Fig. 5 illustrates the dependence of the convergence correlation on the time window length of the stacked correlations (after 15 d), as compared with the rejection threshold and overlap. Of these three parameters, the NCFs improve most with decreased time window length, then as a function of increased overlap and only marginally as a function of rejection threshold.

For the station pair (TA-E18A and TA-D24A), the NCFs stacked from correlations of short (1800-3600 s) time windowed data improve more rapidly with each additional day of stacked data. This trend is observed for nearly all of the station pairs, for both non-pre-processed and running-normalized time-series. Typically, the NCFs stacked from ‘one-bit’ normalized data provide convergence correlations that are lower by 10-30 per cent, whereas non-pre-processed NCFs are typically only marginally higher (<10 per cent) than running-normalized NCFs. In many instances, the ‘one-bit’ normalized NCFs produce spurious results, yielding lower convergence correlations than the running-normalized or non-pre-processed data. As shown in Fig. 4(c), shorter time windows actually reduce the convergence correlation of the ‘one-bit’ normalized data. In most tested cases, the NCFs stacked from correlations of short, overlapping time windows of non-pre-processed data provide better convergence correlations than NCFs of normalized, long time windowed, non-overlapping data.

For comparison of many station pairs, it is easier to visualize the results for a single stack duration. We choose 15-d stacks (Fig. 5) because the 15-d NCF stacks have values transitional between the rapidly improving convergence correlations of short stacks (e.g. 5 d) and the marginal improvement of longer stacks (e.g. 50 d). With *R*_{C} ~0.85, the 15-d non-pre-processed NCF stacks also represent a threshold at which we interpret the correlation with the 365-d NCF to be ‘strong’.

Fig. 6 depicts histograms of the most frequently optimal parameters for (1) all 190 station pairs after 15 d, (2) pairs with short station separation (<300 km; blue) and (3) pairs with long station separation (>300 km; green). The optimal time window required to produce the most robust NCF for each station pair in the array after 15 d consistently (~70 per cent) falls within the range of shorter-duration windows, anywhere from 900 to 3600 s. This percentage decreases slightly for close stations (~65 per cent) and increases (~80 per cent) for stations farther apart. The optimal data overlap is 75 per cent (the highest tested here), and this result is very robust (~80 per cent) regardless of station separation. The optimal threshold at which to reject data is low (five to seven times the standard deviation), but only marginally more common (~55-60 per cent) than higher thresholds (9-11).

The convergence correlations decrease with station separation for a given time window length, rejection threshold, overlap and stack duration, as seen in Fig. 7.

To estimate the benefit of using short time windows in NCF generation, we plot the ratio of the 1800 s convergence correlations to the 86 400 s convergence correlations, as shown in Fig. 7(b). We fit a logarithmic function to the distance-dependent convergence correlations to illustrate the improvement gained by applying short time windows. We do not focus on the exact functional form, because it bears little physical meaning, but merely point to the general distribution. After 15 d at short station separations (<100 km), the 1800 s NCFs have converged 3-20 per cent more than the 86 400 s NCFs. At greater distances (>100 km), the average 1800 s NCFs have converged 40-50 per cent more than the equivalent 86 400 s NCFs.

Fig. 8 depicts the mean number of days each station pair requires to reach a correlation coefficient of *R*= 0.9. The trend lines in the figure show that the short time windowed time-series with overlap (blue) converge to *R*= 0.9 ~50 per cent faster than the long time windowed time-series without overlap (red). The percentage improvement appears nearly constant for all distances, but the difference in absolute data quantities grows with station separation. For example, stations separated by 500 km (such as TA-E17A and TA-E24A) require only ~17 d of data to reach a convergence correlation of *R*= 0.9 when cross-correlating 1800 s time-series using Welch’s method; the same two stations require ~35 d of data when correlating 86 400 s time windows.

## Discussion

Any well-behaved analysis should yield a near-perfect Green’s function for the autocorrelation (where distance is zero), implying a convergence correlation of 1. With increasing station separation, NCF convergence correlations should decay, as seen in Fig. 7. A linear or exponential decay trend can fit the short time window NCF convergence correlations with a zero-distance intercept statistically indistinguishable from 1. Although a logarithmic fit provides a better empirical representation of the mean decay, no unconstrained logarithmic solution was found that yields a zero-distance intercept of 1. With a more optimal technique, NCFs will converge faster at non-zero distances.

There is a significant (2-50 per cent) improvement in convergence correlation for NCFs calculated with overlapping, short (1800 s) time windows compared with non-overlapping, longer (86 400 s) windows, for the same data. These differences are observed both in the overall convergence correlation over a given amount of time and in a slower decrease of the convergence correlation with station separation. The existence of an ‘optimum’ time window of ~1800-3600 s implies that separate factors degrade both longer (>3600 s) and shorter (<1800 s) time windows. We argue that each has a different limiting factor.

Before comparing data at two stations, it is important to consider the data recorded at each station independently. Here, we refer to the large body of literature on single-station power-spectral densities used to evaluate site noise levels at all frequencies (e.g. Peterson 1993; McNamara & Buland 2004; Evans *et al.* 2010; Ringler *et al.* 2010, 2011). These studies indicate that averaging many short time windows provides a more reliable estimate of the single-station noise than averaging fewer long time windows (for the same total duration). Currently, the predominant site-noise technique employs Welch’s method (Welch 1967) to improve estimates of power-spectral density (e.g. Sleeman *et al.* 2006; Evans *et al.* 2010). Welch’s method calls for averaging many partially overlapping short time-series for a given duration of data. Evans *et al.* (2010) find that with Welch’s method, many short (1638.4-2084 s) Hanning-filtered time-series provide the best estimates of power-spectral density for data sampled between 1 and 40 Hz.
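For reference, a single-station PSD estimate via Welch’s method takes only a few lines with SciPy. The segment length and overlap below mirror the values discussed in this paper (not those of Evans *et al.* 2010), and the random trace is a stand-in for a recorded one.

```python
import numpy as np
from scipy import signal

fs = 5.0                                   # Hz, after downsampling
trace = np.random.randn(int(86400 * fs))   # stand-in for one day of noise
nperseg = int(1800 * fs)                   # 1800 s Hann-tapered segments
freqs, psd = signal.welch(trace, fs=fs, window="hann",
                          nperseg=nperseg, noverlap=int(0.75 * nperseg))
```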

It is logical that, if the ambient field is best characterized and observed with Welch’s method, then two-station ambient field NCFs should benefit from similar treatment. Given that two-station NCF studies whiten each time-series before or during the correlation/coherency step, it is important that we fully understand the effect of this whitening process. The whitening divides each spectrum by its smoothed amplitude, which is the square root of the single-station power spectral density. As discussed above, the power spectral density is best characterized with short time windows using Welch’s method. Therefore, Welch’s method should improve the whitening process that is applied in most two-station NCF analyses.

It is our assertion that the controlling factor in generating high-quality NCFs is the number of consistent spectra used in the stacking process. NCF improvement with increased stacking has been observed before (e.g. Bensen *et al.* 2007) and is clearly illustrated in plots of convergence correlation versus the number of windows stacked (e.g. Figs 4 and 5). Given this assertion, decreasing the time window length, and thereby increasing the number of consistent spectra in a stack, should improve the NCFs. This appears to hold until the time window length decreases below ~1800 s. For very short duration time windows (~900 s), we assert that the source spectra contain too little power relative to transient signals, resulting in marginally degraded NCF stacks.

For time windows shorter than 1800 s, the spectra can become less consistent. Fig. 9 illustrates a set of spectra corresponding to subsets of the same time signal. The spectra of the shorter window length series (900 s) sometimes have lower spectral amplitudes and deviate more from the spectra of the longer time-series (e.g. >3600 s). Most notably in Fig. 9, the spectral amplitudes at the ~0.07 Hz (14 s) microseism are reduced for the 900 s time windowed data. This figure illustrates that, in some cases, 900 s windows do not characterize the microseisms as well as longer time windowed data. We speculate that with insufficient signal length, the weak but consistent microseismic noise signal does not contain sufficient power to overcome transient source signals. This points to a general requirement for the application of Welch’s method to two-station ambient noise: at least one common source should exist in each time window. With increased window length, the power of random white noise decreases relative to the consistent, repeated microseismic noise; for two-station noise correlation, the 900 s windows therefore yield marginally lower convergence correlations than the 1800-3600 s windows. Note, however, that the convergence correlations for the 900 s time windowed NCFs are still typically 2-35 per cent higher than those for the 86 400 s time windowed NCFs.

Another advantage of short time-series is that signal rejection discards less data. If the data are divided into single day-long time windows and one large event triggers rejection, then the whole day is rejected. If instead the data are divided into 47 or more overlapping hour-long time windows, only one or two of those hour-long windows are rejected, preserving more data.

As a validation of the convergence rates shown in Fig. 4, data from 2008 were also processed and compared to the year-long NCF from 2009. As shown in Fig. 10, the off-year data for the short (1800 s) and long (86 400 s) time windows exhibit the same behaviour, converging at a rate similar to the 2009 data (shown as dashed lines in Fig. 10). From this, it can be seen that the convergence rates are not artefacts of bias introduced by correlating portions of the data against themselves (i.e. comparing a 15-d stack from 2009 to the year-long 2009 NCF that contains those same data).

It seems that the prevalent practice of applying a ‘one-bit’ or running normalization is not a necessary step in NCF generation. NCFs from the non-pre-processed coherency method using short time window Welch’s method converge towards the long-term NCF faster than NCFs obtained using either ‘one-bit’ or running normalization (with or without short-duration Welch’s method). This suggests that prior coherency work using shorter time windows without such normalization (e.g. Prieto *et al.* 2009; Lawrence & Prieto 2011) preserved more signal than other methods. One benefit of not normalizing in the time domain is that the relative amplitudes are preserved within the NCF, which can help constrain lateral variations in amplification (Prieto & Beroza 2008) and attenuation (Lawrence & Prieto 2011).

With regard to time-domain normalization, it should be noted that amplitude and phase are intrinsically linked. Applying a ‘one-bit’ or running normalization changes the phase, not just the amplitude (Fig. 11). We maintain that it is important to preserve the phase before correlation, regardless of whether the correlation is performed in the time or frequency domain. Phase is essential to obtaining correct traveltimes, and any alteration of phase may result in biased NCFs (e.g. Prieto *et al.* 2011; Lawrence & Prieto 2011). With sufficient temporal stacking, such bias is likely minimized, but the normalization is not a necessary step (as illustrated above). Using a single (uniform) normalization per window (as accomplished by whitening) does not distort the phase, so the method proposed here is likely more appropriate for NCF generation.
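A small numerical demonstration of this point, with synthetic white noise standing in for a recorded trace: comparing the phase spectra before and after ‘one-bit’ normalization shows deviations at essentially every frequency, i.e. the operation does not merely rescale amplitudes.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(9000)               # 1800 s of 5 Hz noise (stand-in)
phase_raw = np.angle(np.fft.rfft(x))
phase_1bit = np.angle(np.fft.rfft(np.sign(x)))
# Wrapped phase difference between the raw and one-bit spectra;
# non-zero at essentially every frequency bin.
dphi = np.angle(np.exp(1j * (phase_1bit - phase_raw)))
print(np.degrees(np.abs(dphi)).mean())
```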

The chosen 20-point smoothing filter applied to the power spectral density for whitening is somewhat arbitrary, but effective. Choosing a different smoothing width has little effect (Fig. 12) on the convergence correlation of the NCFs. The frequency width of a frequency-domain smoothing filter corresponding to a 20-point average is ~4.4 mHz for a 3600 s time window and ~0.36 mHz for the 86 400 s time window. Choosing a 480-point smoothing filter for the day-long time window (~4.4 mHz) does not significantly affect the convergence correlation for the Welch’s method and running-normalization NCFs. Similarly, a 10-point smoothing filter (~17 mHz) hardly changes the convergence correlation for 1800 s time window NCFs. The ‘one-bit’ normalization exhibits more dependence on filter width for the 86 400 s time window than do the other two methods, likely because the other two methods apply less drastic (and less arbitrary) amplitude modifications.

One unexpected and interesting observation is that optimally time windowed data may yield more azimuthally distributed signals than longer (e.g. 24-hr) time windowed data. In theory, an approximately uniform (or uniformly random) source distribution is required to produce stable NCFs. However, if one whitens (a form of normalization) a longer time-series, any dominant signal(s) may reduce the amplitudes of lower-amplitude signals corresponding to other sources within that time-series. This means that one dominant source may suppress the contributions of signals from other sources.

Fig. 13 illustrates the potential improvement in the azimuthally distributed signal by plotting the envelopes of all stacked NCFs as a function of interstation azimuth. Ideally, with evenly distributed noise sources, we would observe equally high amplitudes at the Rayleigh wave speed (~3 km s^{-1}) for all azimuths. The observed azimuthal amplitude distributions for the 365-d NCF stacks are very similar for short (3600 s) and long (86 400 s) time windowing, with some low-amplitude gaps for energy propagating towards the southwest.

The energy contained within the 15-d NCF stacks of the short (1800 s) time windowed data is distributed more evenly with respect to azimuth than that of the day-long (86 400 s) time windowed NCF data. For the day-long time windowed azimuthal distribution, there may be an additional factor contributing to the azimuthal gaps. Only high signal-to-noise ratio NCFs were used in generating Fig. 13 (where SNR = [max signal − min signal]/[max noise − min noise] > 3). Because more data were rejected for day-long time windows, fewer high signal-to-noise NCFs were generated for the 15-d stacks (this is true even for a data-rejection threshold of 11 times the standard deviation). Note that the NCFs improve with shorter windows regardless of ‘one-bit’ or running normalization, so a similar improvement in the observed azimuthal energy distribution is expected regardless of how a signal is normalized. Whether the more evenly azimuthally distributed energy results from higher signal-to-noise ratios or better normalization of the sources, it is clear that better azimuthal contributions can be constrained in a shorter duration with more overlapping short time windows.
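For clarity, the stated SNR criterion in a few lines; the choice of lag ranges defining the signal and noise windows is not specified in the text, so it is left to the caller here.

```python
import numpy as np

def ncf_snr(ncf, signal_idx, noise_idx):
    """SNR = (max - min of signal window) / (max - min of noise window);
    NCFs with SNR > 3 were retained for Fig. 13."""
    sig, noi = ncf[signal_idx], ncf[noise_idx]
    return (sig.max() - sig.min()) / (noi.max() - noi.min())
```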

We posit that the convergence correlations decay with increasing distance as a result of the ambient source distribution. Low-magnitude ambient sources (or scatterers) near both stations of a pair may be recorded coherently if the station separation is small. However, as the station separation increases, a low-amplitude signal recorded at one site might be so attenuated, scattered and/or geometrically spread that it falls below the noise floor of the more distant sensor. Thus, stations separated by greater distances should converge more slowly towards a solution than stations that are closer together. With no station separation (autocorrelation), the NCFs should converge within the duration it takes to measure the desired period (~1800 s; as shown by Evans *et al.* 2010). Given the patterns of convergence correlation versus distance and duration (Figs 3 and 7), any level of correlation should be achievable at any finite distance given sufficient time. The greater the distance, the greater the quantity of data required to achieve a similar level of signal reliability. At infinite station separation in the presence of attenuation, no coherent signal could be measured at both stations, and the NCF would never converge irrespective of the stacking duration.

The process of correlating shorter time windows helps increase convergence at greater distances by removing the dependence on the largest sources within the time window. On average, shorter time windows contain records from fewer sources. Whitening a short time window is therefore less likely to normalize any given source by amplitudes corresponding to other sources, so the convergence correlation increases with shorter time windows. At greater station separations, the amplitudes are biased towards sources closer to each station, which may result in greater bias for long time windows than is observed at short station separations.

The higher NCF convergence correlations observed using short overlapping windows suggest that, at any stack duration or distance, this method should provide NCFs superior to long, non-overlapping window NCF stacks. Dividing the data into many short windows incurs only a minor reduction in computational efficiency (~20 per cent), which comes with the benefit of requiring less data. To achieve a convergence correlation of *R*~ 0.9, Welch’s method as applied here requires only approximately half the data processing, leading to a ~40 per cent improvement in total computational efficiency for an equivalent NCF. For higher desired convergence correlations, this efficiency gain is even greater.

Perhaps the most exciting outcome of generating better NCFs from less data is that we may be able to examine time-varying processes with greater accuracy (e.g. Brenguier *et al.* 2008a,b; Baptie 2010; Hadziioannou *et al.* 2011). If shorter durations of stacked data suffice, we can potentially improve the temporal resolution of time-varying processes. It has been shown that time-varying structures can be observed using NCFs calculated from distinct durations (Sens-Schönfelder & Wegler 2006; Brenguier *et al.* 2008a,b; Baptie 2010). With better NCFs from shorter durations, applications of this method are bound to improve.

## Future work

Building on this research, several studies could improve and further test the quality of NCFs for ambient noise studies. For example, applying a taper, such as a Hanning window, to the ambient field time-series before calculating the coherency would better isolate different signals in each overlapping window. Tapering is routinely applied in single-station site noise studies (e.g. Evans *et al.* 2010). We chose not to apply it here because of concerns that the taper could bias the correlation towards zero lag. This bias is likely to be small, but the improvement in convergence correlation could be significant. Note that single-station site noise studies use different signal durations for different sampling rates and different types of instruments (e.g. Evans *et al.* 2010), which may indicate that two-station NCF studies might need to do so as well.

It would be illuminating to determine whether a similar level of correlation improvement is observed for the whole period range used here (5-50 s), for only some subset of periods (e.g. 7-14 s or >14 s) or for other noise bands (e.g. >1 Hz). It is possible that the use of short time windows could reduce robustness at long periods because fewer cycles are observed. Along this line of thinking, the coherent frequency content may change as a function of distance because of attenuation (Lawrence & Prieto 2011), so much smaller (e.g. de Ridder & Dellinger 2011) or much larger arrays may require another evaluation of appropriate time windows. It may also be that the optimal time window varies with the noise distribution, noise frequency content, array density, array dimension, sensor gain or sensor bandwidth. Another extension of this research is to determine how stable group or phase velocity estimates are for various stack durations and window lengths. It could also prove useful to investigate how this methodology compares to the phase cross-correlation (e.g. Schimmel *et al.* 2010) of the ambient noise field, or whether that approach also improves when Welch’s method is applied.

Different noise sources may also require different window lengths or data-removal methods, depending on array geometry, frequency content and duration of signal recording.

## Conclusions

We systematically search through a variety of parameters, including the time window length used for cross-correlation to obtain the NCFs, the percentage of window overlap and the data-rejection threshold, in an attempt to quantitatively assess the quality of the resulting NCFs compared with other techniques, such as ‘one-bit’ or running normalization of the data. From this, we suggest that correlating 30-60 min windows of data, with overlap between windows, provides a more robust, faster-converging NCF than using day-long time windows or other pre-processing techniques.

We have demonstrated that, at least for the study region, NCFs converge faster using short-duration overlapping time windows (Welch 1967) than with long, non-overlapping time windows. Comparing data from 190 broad-band station pairs over 2009 (and part of 2008), we show that the observed optimal time window length (1800-3600 s) is consistent with optimal lengths from single-station seismic site noise characterization studies, which use similar length windows to better constrain time-varying power spectral densities. Because this method converges faster not only towards the one-year NCF but also (in theory) towards the infinite-duration stacked NCF, we recommend employing it. Although the exact time window length may vary for different arrays recording different noise sources, applying Welch’s method will most likely help stabilize NCFs in most studies.

## Acknowledgments

We thank the reviewers of this manuscript for their invaluable input. This research was supported under NSF grant EAR-1050669.