-
PDF
- Split View
-
Views
-
Cite
Cite
T S Hudson, S Klaasen, O Fontaine, C A Bacon, K Jónsdóttir, A Fichtner, Towards a widely applicable earthquake detection algorithm for fibreoptic and hybrid fibreoptic-seismometer networks, Geophysical Journal International, Volume 240, Issue 3, March 2025, Pages 1965–1985, https://doi.org/10.1093/gji/ggae459
- Share Icon Share
SUMMARY
Distributed acoustic sensing (DAS) is a promising technology for providing dense (metre-scale) sampling of the seismic wavefield. However, harnessing this potential for earthquake detection with accurate phase picking and associated localization remains challenging. Single-channel algorithms are limited by individual channel noise, while machine learning and semblance methods are typically imited to specific geological settings, have no physically constrained phase association and/or require specific fibre geometries. Here, we present a method that seeks to detect seismicity for any geological setting, applicable for any fibre geometry, and combining both fibreoptic and conventional seismometer data to maximize the information used for detection and source localization. This method adapts a proven back-migration detection method to also include DAS observations, migrating energy from many receivers back in time to search for localized peaks in energy, corresponding to seismic sources. The strengths of this method are capitalizing on coherency over many channels to enhance detection sensitivity even in high-noise environments compared to single-channel algorithms, applicability to arbitrary fibre geometries, as well as built-in, physics-informed phase association and source localization. We explore the performance of the method using three geologically and geometrically diverse settings: a glacier, a volcanic eruption and a geothermal borehole. Our results evidence the effect of spatial-sampling extent and non-optimal fibreoptic geometries, accounting for P- and S-wave sensitivity, coupling effects and how the sensitivity of native fibreoptic strain measurements to shallow subsurface heterogeneities can affect detection. Finally, we attempt to also present a method-ambivalent overview of key challenges facing fibreoptic earthquake detection and possible avenues of future work to address them.
1 INTRODUCTION
Earthquakes are essential for monitoring natural hazards, imaging subsurface structure and interrogating Earth system processes. In order to harness the potential of earthquakes for either monitoring or insight into fundamental processes, one first has to detect and locate any seismicity. Typically, earthquake detection has been performed using conventional seismometers sensitive to the seismic velocity wavefield, but recent new optical instrumentation now allows one to use fibreoptic cables to measure the seismic strain wavefield with far denser spatial sampling over a wide bandwidth (Hartog et al. 2018; Lindsey et al. 2020a; Lindsey & Martin 2021; Paitz et al. 2021). This technology is often referred to as distributed acoustic sensing (DAS). However, currently these fibreoptic technologies only provide single-component measurements, with deployed fibre geometries often not favourable for earthquake detection and/or location. Therefore, there is a need for earthquake detection algorithms that can, first, be applied for arbitrary fibre geometries; secondly, maximize the spatial sampling extent of the seismic wavefield by combining fibreoptic and conventional seismometer data; and thirdly, capitalize on the spatio-temporal coherency of the earthquake wavefield.
Earthquake detection methods can be broadly separated into two categories: (1) receiver-by-receiver detection, with earthquake arrivals triggered on each seismogram in isolation and then combined afterwards using some form of phase association; or (2) multireceiver detection, where earthquake arrivals at multiple receivers are combined together in a physics-constrained framework. Common receiver-by-receiver algorithms are applied in the time domain using short-term-average to long-term-average (STA/LTA) methods, for example, (Allen 1978; Withers et al. 1998) or in the frequency domain by looking for energy peaks within a certain frequency band (O’Neel et al. 2007; Helmstetter et al. 2015). Machine-learning techniques, such as convolutional neural networks, have also been applied to receiver-by-receiver seismic phase detection (Zhu & Beroza 2019; Mousavi et al. 2020; Hernandez et al. 2022; Zhu et al. 2023). Recently, novel hybrid algorithms combining multiple STA/LTA functions with machine learning have also been developed (Latto et al. 2024). Perpetual limitations of any receiver-by-receiver method are not using spatio-temporal coherency information and the challenge of associating phase arrivals from each receiver with one another, especially for multiple wave types (Ross et al. 2019). Multireceiver detection methods overcome these limitations by identifying coherent signals arriving at multiple receivers. Methods that do not explicitly require knowledge of the medium’s velocity structure, designed specifically for DAS measurements include: semblance-based coherence (Porras et al. 2024) and machine-learning based image recognition methods (Stork et al. 2020; Huot et al. 2022). More general approaches include array-processing techniques, such as beamforming (Hudson et al. 2023) or covariance matrix analysis (Seydoux et al. 2016). However, although these methods do not explicitly require subsurface velocity structure information, they either implicitly assume, learn or are sensitive to the local velocity structure. A final multireceiver method of note is back-migration, which explicitly requires an estimate of velocity structure in the region of interest, using this information to effectively perform physics-informed stacking of energy arriving at multiple receivers (Drew et al. 2013; Wagner et al. 2017; Hudson et al. 2019, 2021; Guidarelli et al. 2020; Smith et al. 2020; Winder et al. 2021.
When discussing earthquake detection, we deem it helpful to consider the following key ingredients for optimal earthquake detection algorithm performance:
Maximize spatial coverage and sampling density.
Exploit signal coherency.
Maximize sensitivity to multiple seismic phases.
Quantify event origin-time and phase arrival-time uncertainty.
Optimize computational efficiency.
(Bonus: towards universal applicability, while considering the trade-off with computational efficiency).
In this work, we quantify how important these ingredients are for earthquake detection and evidence the reasons why, relating them to the modifications required to apply an existing back-migration earthquake detection method to fibreoptic and hybrid fibreoptic-seismometer data sets. This builds on previous work, where back-migration was applied to a DAS data set without any DAS-specific adaptations (Hudson et al. 2021). While we focus here on local to regional microseismicity applications, these ingredients should generally be transferable to global earthquake detection (Selby 2011) and noise localization applications (Igel et al. 2023). We favour a back-migration method because it includes all the key ingredients specified above in the recipe. In particular, the adapted back-migration method presented here allows one to use arbitrary fibreoptic deployment geometries, maximize spatial coverage and seismic phase sensitivity by also including conventional seismometer data, and requires no further modification or retraining when applied to new data sets. It also provides earthquake location estimates without additional cost. Using fibreoptic (DAS) data sets for earthquake detection requires various specific considerations, including: in-axis fibre directional sensitivity, the use of native strain/strain-rate measurements, coupling of the fibre to the medium; and preserving the contribution of 1000s of DAS channels in a computationally efficient way with far fewer conventional seismometer measurements. As we detail the method and its performance, we also identify and discuss remaining challenges of using fibreoptic data sets for earthquake detection.
1.1 Data
Three real-world data sets are used in this work. These data sets approximately represent end members of current fibreoptic deployments: (1) a dense 2-D fibreoptic grid deployed coincident with nodes on a glacier; (2) a dark-fibre located within the vicinity of a volcanic eruption; and (3) a downhole fibre at a geothermal field. The specific detection algorithm settings used in each case are given in Table 1.
Data set: . | Glacier . | Volcanic eruption . | Geothermal borehole . |
---|---|---|---|
Phases used | P, surface | P, S | P, S |
Sampling rate | 1000 Hz | 100 Hz | 1000 Hz |
Frequency filter, P | 10–250 Hz | 1.2–20 Hz | 2–300 Hz |
Frequency filter, S | n/a | 1.2–20 Hz | 2–300 Hz |
Frequency filter, surface | 5–150 Hz | n/a | n/a |
Grid resolution, x | 8 m | 150 m | 400 m |
Grid resolution, y | 8 m | 150 m | 400 m |
Grid resolution, z | 10 m | 300 m | 100 m |
STA/LTA P | 0.01/0.2 | 0.2/1 | 0.01/0.5 |
STA/LTA S | 0.02/0.2 | 0.2/1 | 0.01/0.5 |
Coalescence detection threshold | 1.15 | 1.125 | 1.7 |
Marginal window | 0.25 s | 2 s | 0.25 s |
DAS specific settings | – | – | – |
Spatial downsamp. factor | 1 | 5 | 10 |
Channel spacing | 1.6 m | 16 m | 1 m |
Gauge length | 6.38 m | 10 m | 10 m |
Semblance stacking | no | yes | yes |
Data set: . | Glacier . | Volcanic eruption . | Geothermal borehole . |
---|---|---|---|
Phases used | P, surface | P, S | P, S |
Sampling rate | 1000 Hz | 100 Hz | 1000 Hz |
Frequency filter, P | 10–250 Hz | 1.2–20 Hz | 2–300 Hz |
Frequency filter, S | n/a | 1.2–20 Hz | 2–300 Hz |
Frequency filter, surface | 5–150 Hz | n/a | n/a |
Grid resolution, x | 8 m | 150 m | 400 m |
Grid resolution, y | 8 m | 150 m | 400 m |
Grid resolution, z | 10 m | 300 m | 100 m |
STA/LTA P | 0.01/0.2 | 0.2/1 | 0.01/0.5 |
STA/LTA S | 0.02/0.2 | 0.2/1 | 0.01/0.5 |
Coalescence detection threshold | 1.15 | 1.125 | 1.7 |
Marginal window | 0.25 s | 2 s | 0.25 s |
DAS specific settings | – | – | – |
Spatial downsamp. factor | 1 | 5 | 10 |
Channel spacing | 1.6 m | 16 m | 1 m |
Gauge length | 6.38 m | 10 m | 10 m |
Semblance stacking | no | yes | yes |
Data set: . | Glacier . | Volcanic eruption . | Geothermal borehole . |
---|---|---|---|
Phases used | P, surface | P, S | P, S |
Sampling rate | 1000 Hz | 100 Hz | 1000 Hz |
Frequency filter, P | 10–250 Hz | 1.2–20 Hz | 2–300 Hz |
Frequency filter, S | n/a | 1.2–20 Hz | 2–300 Hz |
Frequency filter, surface | 5–150 Hz | n/a | n/a |
Grid resolution, x | 8 m | 150 m | 400 m |
Grid resolution, y | 8 m | 150 m | 400 m |
Grid resolution, z | 10 m | 300 m | 100 m |
STA/LTA P | 0.01/0.2 | 0.2/1 | 0.01/0.5 |
STA/LTA S | 0.02/0.2 | 0.2/1 | 0.01/0.5 |
Coalescence detection threshold | 1.15 | 1.125 | 1.7 |
Marginal window | 0.25 s | 2 s | 0.25 s |
DAS specific settings | – | – | – |
Spatial downsamp. factor | 1 | 5 | 10 |
Channel spacing | 1.6 m | 16 m | 1 m |
Gauge length | 6.38 m | 10 m | 10 m |
Semblance stacking | no | yes | yes |
Data set: . | Glacier . | Volcanic eruption . | Geothermal borehole . |
---|---|---|---|
Phases used | P, surface | P, S | P, S |
Sampling rate | 1000 Hz | 100 Hz | 1000 Hz |
Frequency filter, P | 10–250 Hz | 1.2–20 Hz | 2–300 Hz |
Frequency filter, S | n/a | 1.2–20 Hz | 2–300 Hz |
Frequency filter, surface | 5–150 Hz | n/a | n/a |
Grid resolution, x | 8 m | 150 m | 400 m |
Grid resolution, y | 8 m | 150 m | 400 m |
Grid resolution, z | 10 m | 300 m | 100 m |
STA/LTA P | 0.01/0.2 | 0.2/1 | 0.01/0.5 |
STA/LTA S | 0.02/0.2 | 0.2/1 | 0.01/0.5 |
Coalescence detection threshold | 1.15 | 1.125 | 1.7 |
Marginal window | 0.25 s | 2 s | 0.25 s |
DAS specific settings | – | – | – |
Spatial downsamp. factor | 1 | 5 | 10 |
Channel spacing | 1.6 m | 16 m | 1 m |
Gauge length | 6.38 m | 10 m | 10 m |
Semblance stacking | no | yes | yes |
The glacier data set comprises of 1.2 km fibre deployed at Gornergletscher in the Swiss Alps, in 2023 October. The network has an |$\sim 100$| m aperture, with 29 vertical single-component Sercel WiNG nodes deployed in the same area. The interrogator used is a Sintela Onyx, measuring strain, with a gauge length of 6 m and a channel spacing of 1.6 m. All data were acquired with a sampling rate of 1000 Hz. The majority of microseismicity at the study site is thought to be caused by near-surface crevassing (Walter et al. 2009; Hudson et al. 2020).
The volcanic eruption data set comprises of an 8 km dark fibre, interrogated during the first Svartsengi volcanic eruption on the Reykjanes Peninsula, Iceland, in 2023 December. We also include data from a broad-band seismometer operated by the Icelandic Meteorological Office. The fibreoptic interrogator used is a Silixa iDAS, measuring strain rate, with a gauge length of 10 m and a channel spacing of 16 m. All data are sampled at 100 Hz. Seismicity detected here is attributed to one intrusion episode on the 2023 December 18.
The downhole geothermal data set is from the Utah Frontier Observatory for Research in Geothermal Energy (FORGE) 2019 experiment, consisting of 1.2 km of fibre cemented into a vertical monitoring borehole (Lellouch et al. 2020). A network of seismometers was deployed at the surface. The fibre is interrogated using a Silixa iDAS interrogator with a gauge length of 10 m, a channel spacing of 1 m and a sampling rate of 100 Hz. Here, we focus on one particularly active hour of seismicity during a well stimulation at 17:00 to 18:00 on 2019 April 27.
While these example data sets are not comprehensive, the majority of fibreoptic deployments for studying seismicity are likely similar to at least one of these examples, with the exception of subsea and urban deployments.
As well as real-world data sets, we also use a synthetic data set that replicates the glacier data set to investigate detection algorithm performance. This synthetic data set uses the same source–receiver geometry as the glacier data set and the same QuakeMigrate settings, with the exception that channel spacing and simulated gauge length are both 1 m. For the source, we use a Ricker wavelet with a centre frequency of 100 Hz and an explosive moment tensor (|$M_{xx,yy,zz}=1,$|, |$M_{xy,xz,yz}=0$|) at the same location as an icequake from the real-world glacier data set. Synthetic waveforms are modelled using a spectral-element method, Salvus (Afanasiev et al. 2019).
2 THE EARTHQUAKE DETECTION RECIPE
2.1 Back-migration at a glance
A back-migration earthquake detection method converts continuous seismograms at each receiver into onset functions that represent the energy from a particular seismic phase arriving at each receiver through time (Drew et al. 2013). These characteristic onset functions from all receivers are then back-migrated in space through time, effectively stacking the data with physically meaningful time-shifts. Potential events are detected by identifying peaks in the back-migrated energy through time (see Fig. 1). The key strength of this method lies in events only being triggered by coherent source singularities rather than incoherent noise.

Schematic of back-migration for a single seismic source in space and time. Top shows the 3-D volume at different points in time, with darker shading corresponding to the wavefield amplitude at particular grid cells. Triangles denote receivers and black star denotes location of peak back-migrated energy corresponding to a hypothetical seismic source. Note: No noise is present in this illustration. Lower plot shows back-migrated energy corresponding to the amplitude of the grid cell with the maximum amplitude at each point in time. As energy is back-migrated through the grid in time, the energy coalesces towards a singularity in space (a seismic source).
2.2 The QuakeMigrate algorithm
The specific back-migration algorithm used in this work is a modified version of the open source software QuakeMigrate (Hudson et al. 2019; Smith et al. 2020; Winder et al. 2021). The specific steps of the modified QuakeMigrate algorithm are summarized schematically in Fig. 2 and for an example seismic source in Fig. 3. The specific steps are as follows:
First, 3-D traveltime and fibre-sensitivity lookup tables are generated for each receiver and each seismic phase (e.g. P, S), corresponding to the time-shifts required to back-migrate the characteristic onset function to a particular point in space (see Section 2.3.4 for details on how sensitivity lookup tables are obtained). This computationally expensive step is only performed once for a given network and velocity model. One should note that this requires a velocity model, which theoretically limits the universal applicability of the method. However, typically one can make an approximate yet sufficient guess at an initial model.
The second step is performing the back-migration. Continuous seismograms are read in for every receiver. Characteristic onset functions representing the energy arriving at each receiver are calculated. These onset functions are calculated for every receiver and every phase separately, for example, by using an STA/LTA ratio. These onset functions are what is back-migrated, rather than seismograms, to minimize destructive interference effects and reduce the effect of noise. For conventional three-component seismic data, vertical components are used for P wave arrivals and horizontal components for S wave arrivals, although this differs for DAS data (see Section 2.3 for details). These characteristic onset functions are shifted and stacked in time for each point in the 3-D search space, for all seismic phases (e.g. P and S) combined via a linear sum. For each point in time, the value of the grid cell with the maximum coalescence of energy from all receivers and all phases combined is recorded, from which a maximum energy coalescence time-series is found. From hereon in, we refer to coalescence of energy, a measure of the spatio-temporal concentration of the stacked onset functions (a proxy for seismic energy arriving at receivers) simply as coalescence. Wherever possible, we maximize the information used for detection and location, so if we have three-component seismometer data available then unless otherwise stated, we will use these receivers for both P-wave and S-wave detection.
The third step is to trigger possible event detections based on finding peaks in the coalescence time-series. Typically, we find the best trigger threshold is dynamic, using a multiple of the median absolute deviation value from a moving window of several hours in duration.
Finally, one refines the event location by repeating the back-migration for each triggered candidate event individually. Event location can be refined by using data at a higher sampling rate, a more spatially dense lookup table, or different frequency filters, for example. The outputs are an event catalogue, including arrival time picks for each seismic phase at each receiver, the earthquake origin time, and an estimate of earthquake location. Uncertainties are quantified for all parameters (see Fig. 3, part 4). Arrival times for each seismic phase are derived from peaks in the onset function for the corresponding receiver and seismic phase. This is important, as individual arrival times are therefore independent of the model space, with back-migration information used only to effectively constrain phase association of the arrival time picks. Arrival time uncertainties are defined as the standard deviation of a Gaussian fit to the characteristic onset function for a given receiver and phase. The earthquake origin time and uncertainty correspond to the peak in the coalescence function and the standard deviation in time of the peak in this coalescence function, respectively. The earthquake location and uncertainty are estimated from the peak in the marginalized coalescence space at the origin time and the standard deviation of a Gaussian fit to this peak in space, with the marginalized coalescence space assumed be a proxy for the probability density function in space (see Drew et al. 2013 for more details). If desired, one can also output additional information such as plots of the coalescence in space and time or plots of arrival time pick labelled waveform data. These are useful for initial refinement of the detection parameters for a particular data set, especially regarding STA/LTA values, bandpass filters and lookup table grid resolution.
![Schematic overview of the modified back-migration detection method. (1) Traveltime and DAS sensitivity lookup tables calculated for each channel. (2a) Data is pre-processed, including conversion and stacking of DAS data. (2b) Onset functions are calculated and back-migrated through time. (3) Potential events are triggered from peaks in the time-shifted, stacked onset functions [corresponding to coalescence (or a measure of energy)]. (4) Back-migration is performed again, just for candidate events. Uncertainties are estimated in this step and a final earthquake catalogue is generated.](https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/gji/240/3/10.1093_gji_ggae459/2/m_ggae459fig2.jpeg?Expires=1747890986&Signature=pQltZE4GIgJ7hh6BCnSltJWXzXP6ySkR8fyShsID7-RN8CBNJbJjAUx8WLcHFM~jaqG~EeniTyTFjacFo4OJ~GYx0HE1gxbTF1JhtDig4S04PMWGWS8ymI9IsLtW5ti8QIijGuoLwBxVTrbxGqP8yBsbOzqrpD-FsiNmycj7GD7ChFprocbu7USvCTLpm9vQputFprPoZAYLY2jDWnhtzmZrwqprckn0VZg3zV9AF6dXG~erxZBIpFBlCW79d3cLO-Yyf7bIQUingS8PelcPPTDfXmhqUiex7i4Tj~8g70UcQEHaYeFC5tsfAsGw3Dfkiw3~Wdwn7CE5-1cYYID0tw__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)
Schematic overview of the modified back-migration detection method. (1) Traveltime and DAS sensitivity lookup tables calculated for each channel. (2a) Data is pre-processed, including conversion and stacking of DAS data. (2b) Onset functions are calculated and back-migrated through time. (3) Potential events are triggered from peaks in the time-shifted, stacked onset functions [corresponding to coalescence (or a measure of energy)]. (4) Back-migration is performed again, just for candidate events. Uncertainties are estimated in this step and a final earthquake catalogue is generated.

The modified back-migration detection method as in Fig. 2 but with data from a real seismic source (a Gornergletscher icequake). (1) Slices of lookup tables for one receiver. (2a) Example of pre-processing from strain (rate) to decimated velocity. (2b) Examples of onset functions calculated from seismograms for two receivers. (2c) Example of slices of the 4-D coalescence space at three times, as well as the coalescence time-series for the event. Star shows seismic source location. (3) Example of arbitrary trigger threshold applied to the overall coalescence time-series. (4) Examples of how arrival times, origin time and event location (with their corresponding uncertainties) are derived from the result.
2.3 Fibreoptic-specific modifications
Although back-migration earthquake detection methods already exist, a number of modifications are required to include fibreoptic data sets and optimize the algorithm. Modifications include: combining data sets with different units (velocity, strain/strain rate), including incoherent and coherent noise reduction during conversion; lossless spatial decimation of DAS channels; using 1-D DAS measurements for multiple seismic phases; and accounting for in-axis fibre sensitivity. These modifications are all implemented into the QuakeMigrate algorithm. Below, we describe why these modifications are needed and how they are implemented.
2.3.1 Combining fibreoptic and conventional data types
Combining data from different instrument types is non-trivial. Fibreoptic interrogators measure strain rate (or strain) whereas seismometers typically measure velocity. Strain-rate, |$\dot{\varepsilon }$|, is the spatial derivative of velocity, v. One might assume that we are ambivalent to receiver units, since we back-migrate an approximation of the normalized energy arriving at each receiver through time. However, spatially or temporally differentiating or integrating a periodic time-series leads to a systematic change in frequency–amplitude content. For example, let us assume that an earthquake has a simple sinusoidal signal,
where |${\bf \mathit{ k}}$| is the wave number, |${\bf \mathit{ x}}$| is the direction vector, |$\omega$| is angular frequency and t is time. Then conversion to |$\dot{\varepsilon }$| gives
Since |$k = 1 / \lambda$|, where |$\lambda$| is the wavelength, and the wavelength is proportional to the frequency (since |$c = \lambda f$|), the amplitude of |$\dot{\varepsilon }$| is dependent on frequency. Such a relative change in frequency content in seismograms from different receivers could cause issues when pre-processing data inputs (e.g. bandpass filtering), and affect onset function amplitudes and hence overall coalescence amplitudes.
We therefore opt for converting all data into the same units before calculating and back-migrating onset functions if possible. Note that this is only possible over approximately linear sections of fibre. An example of conversion from strain rate to velocity is shown in Fig. 4. This is only possible over approximately linear sections of fibre. If the fibre does not have significant sections that fulfil this criteria, then we leave the data in native strain rate. We choose to convert DAS strain rate to velocity rather than converting seismometer velocity to strain rate for several reasons. First, converting seismometer velocities into strain rate is highly challenging as one would have to combine all seismometer data, reconstruct the wavefield, and then take the spatial derivative of that wavefield at seismometer locations (Muir & Zhan 2022). This is impossible unless one has many conventional receivers. Secondly, integrating DAS strain rate to velocity is not only more practical, but the resultant integration noise has infinite apparent velocity that can be removed using an |$fk$|-filter, increasing signal-to-noise (SNR) ratio. Thirdly, the integration also acts as a spatial low-pass filter, removing some incoherent noise. Fourthly, strain rate is inherently more sensitive to local velocity structure than velocity (Capdeville & Sladen 2024), with conversion to velocity removing this sensitivity. If an earthquake is far from the receiver and one does not know the local velocity structure adequately, then again one gains an improvement in back-migration performance. As we have already hinted, we convert from strain rate to velocity via direct integration followed by an |$fk$|-filter to remove near-infinite apparent velocities. Other options for converting between strain rate and velocity exist (Lindsey et al. 2020a) but involve independent coincident seismometers to use as a reference. We opt not to implement these methods here, since it is more important from our perspective that we are independent of the requirement for coincident receivers than velocity conversion precision. Given the open-source nature of the algorithm, users have the freedom to implement more involved velocity conversion methods if desired.

Example of strain rate to velocity conversion. (a) An earthquake distance along fibre versus time plot for an earthquake from the Reykjanes Peninsula, Iceland in units of strain rate. (b) The same earthquake as in panel (a), but converted to velocity without removing infinite apparent velocity integration noise, both in fk-space and distance–time space. (c) Same as (b) but with integration noise removed.
2.3.2 Preserving fibreoptic data contributions while increasing computational efficiency, via semilossless decimation
A primary challenge associated with including fibreoptic data is the large number of channels compared to conventional receivers. The challenge lies in processing the vast amount of information contained in DAS records as computationally efficiently as possible. We opt for maximizing the contribution of fibreoptic receivers while reducing the number of back-migrated channels via semilossless decimation. This semi-lossless decimation refers to performing semblance-based stacking on every n fibreoptic receivers, similar to Porras et al. (2024). Specifically, semblance stacking comprises time-shifting every channel relative to every other channel in order to maximize the stacked amplitude. Time-shifts are limited by a maximum permissible apparent velocity. Theoretically, this preserves both amplitude and directional information. However, we refer to the method as semi-lossless rather than lossless as we discard the directional information but preserve coherency post decimation. This semblance-based decimation improves decimated fibreoptic receiver signal quality, approximately retaining the information that would otherwise be obtained via the formal space–time back-migration stacking. In practice, one is limited by the fibre geometry, since semblance-based stacking does not work on arbitrary orientations of fibre (e.g. it would fail if applied to two channels orthogonal to one another). Therefore, we currently employ the philosophy of decimating as much as possible while preserving both semblance-stack performance (e.g. only over linear segment scales or the gauge length) and spatial coverage.
2.3.3 Combining multiple seismic phases
A strength of back-migration detection methods is that one can use multiple seismic phases to constrain earthquake location better, enhancing coalescence and therefore improving detection performance. QuakeMigrate traditionally does this for P and S waves by back-migrating vertical and horizontal receiver component onset functions through P-wave and S-wave velocity models, respectively. Fibreoptic channels only measure signals in the fibre axis, making it non-trivial to isolate different seismic phases. Instead, we modify the algorithm to allow one to use fibreoptic channels for P and/or S phases (as well as surface waves for approximately non-dispersive settings). This can be specified wholesale or individually for every channel, depending on whether a channel is horizontally deployed on the surface or vertically in a borehole, for example. Currently, the surface wave implementation involves specifying a single group velocity for the medium and assuming the energy migrates within the near surface. This is valid for the Gornergletscher field site since we deployed directly on the ice surface with no firn layer present, unlike previous surface DAS glacier studies (Walter et al. 2020; Hudson et al. 2021). However, more rigorous inclusion of surface waves could be achieved by modifying the method to back-migrate energy through different phase-velocity models for different frequency bands, approximately simulating surface wave dispersion and Rayleigh versus Love waves.
Above, we hint that fibre channel orientation and surface versus subsurface deployment play a role in the sensitivity of a particular channel to a particular seismic phase. While this topic could be the subject of numerous studies, one can typically assume that subsurface channels deployed vertically are sensitive to both P and S waves, whereas horizontal surface channels are dominantly sensitive to S-waves because of steep near-surface velocity gradients resulting in near-vertical ray incident angles (Hudson et al. 2021). An exception to this is if the medium has an approximately homogeneous velocity structure, for example if deployed on ice (Walter et al. 2020). However, for the surface DAS examples in this work, one has a homogeneous velocity structure and so has similar sensitivity for both P and S waves, and although the other is dominantly sensitive to S-waves, we still observe some P-wave energy in that case too.
2.3.4 Fibreoptic wavefield sensitivity
In reality, the sensitivity of fibreoptic cables to different seismic waves is not binary. A more sophisticated approach that we implement here is to calculate fibreoptic channel sensitivity based on ray takeoff angle derived from the same velocity model used to calculate traveltime lookup tables. We calculate the takeoff angle for a ray propagating to each receiver from every grid cell in the search volume, for both P and S waves. Fibreoptic sensitivity to strain rate and velocity differ, so for flexibility we implement both. For velocity as measured by fibreoptic channels, the sensitivity of a fibreoptic channel to P, SV and SH phases is given by (Martin 2018)
where |$\theta$| is the angle of the fibre on a plane relative to a reference direction (e.g. north), |$\phi _1$| is the in-plane angle of a plane wave propagation direction relative to the same reference direction, and |$\phi _2$| is the angle of the plane wave relative to the plane-perpendicular angle (e.g. angle from vertical down for a horizontal fibre channel). Alternatively, for strain rate the sensitivity is,
Note that these are for point strain and we drop any wave amplitude factors and frequency or phase dependence since we do not know the amplitude, frequency content or phase of any prospective arrivals prior to detecting them. SV and SH sensitivities then have to be somehow combined. Since the proportion of SV to SH energy incident at a receiver prior to detection is unknown, maximum sensitivity to any given S-wave polarization is assumed. We therefore define the S-wave sensitivity as
These equations are now implemented in QuakeMigrate. We then define a sensitivity threshold below which we deem that a particular fibre channel is insensitive to that location within the search volume. Such a threshold could be selected, for example, by determining at what sensitivity value the amplitude of a fibre channel would fall below the noise level of another channel with perfect sensitivity. We then mask these regions for the associated seismic phase at that particular channel. We do this for all fibreoptic channels. This provides greater constraint over where potential events coalesce. This approach is dependent on knowledge of the approximate velocity structure, especially the shallow velocity gradient. It should therefore be used with caution, with the user able to choose whether to implement it or not. While it is computationally intensive, it is only performed once when the traveltime lookup tables are generated, so subsequent runtime on continuous seismic data is unaffected.
While we only address body-wave sensitivity here, theoretically, one can also surface wave sensitivity in a similar way.
3 RESULTS
3.1 Algorithm performance
We first investigate various specific aspects of the detection algorithm performance in more detail. We first investigate the effect of signal quality and velocity model errors using a synthetic data set based on the same configuration as the Gornergletscher data set. We then compare the effect of network geometry for the synthetic data set compared to the real Gornergletscher data set, before focusing on the importance of stacking fibreoptic channels, fibreoptic coupling and fibreoptic sensitivity in a real-world setting.
3.1.1 Signal-to-noise ratio
We test the performance of the detection algorithm for different SNR signals using a synthetic data set replicating the Gornergletscher experiment source–receiver geometry (see Section 1.1 for details). We generate a synthetic source with a centre frequency of 100 Hz, at a depth of 10 m below surface with the epicentre shown by the green star in Fig. 5(d). We add Gaussian white noise to the signals with varying SNRs from 1.1 to 10 at all receivers equally, representing typical worst-case and best-case noise levels from experience of real data sets (see Fig. 5g for example signals for one receiver). Here, and for all the synthetic example cases, we only use P waves for detection and location. All other processing steps are identical to the real data examples.
![Synthetic model results for varying SNR. (a–c) Instantaneous normalized coalescence results corresponding to the time of maximum coalescence for increasing SNR signals. (d–f) Same as (a)–(c) except the overall summed normalized coalescence space, providing an estimate of overall spatial uncertainty. (d) Example waveforms used from one channel [see channel location in panel (a)]. (e) Coalescence time-series for varying SNR.](https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/gji/240/3/10.1093_gji_ggae459/2/m_ggae459fig5.jpeg?Expires=1747890986&Signature=TLKVP9K1aYYPcdEz6vv0qlMzmjdEr7m6f72msThGn9vwQuE2-fgkZhAlLxrwrvqzpO58sob5RcqZajxYNYQ5AvoL-BJsJmj82jRPeQZ4AG5hVnqQczoU5P~cHgbAwCg-Kg2pwUbvqLcyHpO7oQuMwO6peZQy3Q2RF4Vtah-6DtLhGpDpr2yKSWVGZ6-bTeNTwT1YAALSdEKnEqJMOT88h1pVxFEbf1q6SbZEjP0S5EK8WFdwu4huUgsh0L7NK70ibc-MFqzvym6G6Ok3wM6a6WuJvP17kIYLTc3f2VVUSa93alArjW8M508L6N-5AJX2T5DsThvdvxdkK7P~jDeOZg__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)
Synthetic model results for varying SNR. (a–c) Instantaneous normalized coalescence results corresponding to the time of maximum coalescence for increasing SNR signals. (d–f) Same as (a)–(c) except the overall summed normalized coalescence space, providing an estimate of overall spatial uncertainty. (d) Example waveforms used from one channel [see channel location in panel (a)]. (e) Coalescence time-series for varying SNR.
Figs 5(a)–(f) show the instantaneous and total overall coalescence maps at 10 m depth. The instantaneous maps show how as SNR increases, the coherency of the instantaneous back-migrated wavefield at maximum coalescence becomes more condensed. In this particular case, all three events with differing SNRs would be detected by the algorithm. The coalescence time-series (Fig. 5h) show that SNR has minimal effect on origin time uncertainty (the width of the peak), but decreases the peak coalescence by orders of magnitude. However, the hypocentre location for the lowest SNR event is resolved poorly by the algorithm (darker-shaded (black) star, Fig. 5d) relative to the true location (lighter-shaded (green) star, Fig. 5d). For higher SNRs, event hypocentres are well resolved, agreeing within estimated uncertainty. Overall, Fig. 5 shows that the back-migration algorithm is sensitive to even extreme low-SNR levels (1.1) but location performance suffers as SNR decreases. While this performance is encouraging, we expect results to be poorer for real-world data sets since noise may be spatially coherent, something we have not simulated here. Furthermore, the noise field may be non-uniform due to other factors, such as velocity structure complexity.
3.1.2 Velocity model error
Although Fig. 5 shows how powerful back-migration methods can be for seismic detection, such methods are dependent on knowledge of the velocity structure. Here, we briefly use the synthetic data set to assess the effects of velocity model uncertainty on detection algorithm performance. To do this, we generate synthetics for three velocity models: (1) a reference model comprising a best guess at a homogeneous velocity model for ice; (2) a homogeneous model, perturbed by increasing velocities uniformly by |$+10$| per cent throughout; and (3) a velocity model perturbed by |$\pm 20$| per cent from the reference model cyclically (see Fig. 6e). The latter models represent typical extremes of variations in velocity structure in the real world (Gauntlett et al. 2023). In each case, the velocity model used for back-migration is the homogeneous ice velocity model (model 1).

Synthetic model results for perturbing velocity model errors. (a–c) Results for varying velocity model error, for zero error homogeneous velocity model, uniform |$+10$| per cent error homogeneous velocity model and cyclic |$\pm 20$| per cent heterogeneous velocity model, respectively. (d) Coalescence time-series for panels (a)–(c). (e) Slice through the cyclic |$\pm 20$| per cent heterogeneous velocity model of panel (c).
Velocity model uncertainty perturbation results are shown in Fig. 6. Increasing velocity model uncertainty decreases the maximum coalescence (see Fig. 6d) and increases the spatial extent of the back-migrated peak coalescence in space (see Figs 6a–c). As to be expected, increasing the velocity brings the QuakeMigrate-derived hypocentre (black star, Fig. 6b) artificially closer to the network centroid. Similarly, the estimated spatial uncertainty also reduces as increasing the velocity is equivalent to reducing the spatial extent of the back-migration space. However, even with a |$+10$| per cent homogeneous velocity increase, the true source location is in agreement with the QuakeMigrate-derived location, within uncertainty. This performance breaks down for the |$\pm 20$| per cent heterogeneous velocity model, with the true hypocentre falling outside the QuakeMigrate-derived error estimate (see Fig. 6c). In this case, these velocity error perturbations are extreme and we anticipate that results would be highly dependent on the exact distribution and size of heterogeneities. Overall, the synthetic results of Fig. 6 suggest that increasing velocity model uncertainty results in artificial perturbations to the estimated hypocentre and reduces coalescence and hence detectability, but unless velocity errors are extreme, QuakeMigrate-derived spatial uncertainties appear to sufficiently account for the observed deviations from the true location.
3.1.3 Spatial coverage and fibreoptics-only versus combined fibreoptics and seismometers
Adequate source localization is essential for any back-migration detection method. Theoretically, the better the source localization, the higher the peak energy observed in the space–time search space and hence the more likely an event is to be detected above the ambient noise level. Fig. 7 investigates the importance of network geometry and the spatial extent of seismic wavefield sampling for source localization and hence detection. A synthetic set-up is compared to an approximately identical real-world set-up from Gornergletscher, including source frequency content. To study variations in fibre geometry, we use the same icequake and simply add and remove receivers to simulate differing geometries. Three geometries are studied: two linear segments of fibre (Figs 7a and d), a 2-D fibre grid (Figs 7b and e), and a 2-D fibre grid and 29 seismic nodes placed at fibre intersections and surrounding the fibre deployment (Fig. 7). The linear geometry is typical of many fibre deployments, for example that of the Iceland volcanic eruption data set in this study. The 2-D grid represents a best-case surface deployment fibre geometry.

Importance of spatial coverage for back-migration earthquake detection. Panels (a)–(c) show normalized coalescence maps with increasing network spatial extent for a synthetic event. Lighter-shaded (green) and darker-shaded (black) stars in panels (a)–(c) indicate the true source location and peak coalescence corresponding to the inferred source hypocentre from QuakeMigrate. Panels (d)–(f) show normalized coalescence maps with increasing network spatial extent for an icequake at Gornergletscher, Switzerland. Star in panels (d)–(f) indicates the peak coalescence corresponding to the inferred icequake hypocentre from QuakeMigrate. Black dashed line indicates the corresponding 95 per cent contour. (g, h) Comparison of maximum single-pixel coalescence values through time for the synthetic source results (a–c) and real icequake (d–f).
The synthetic and real-world data are in approximate agreement for the full 2-D DAS grid and when also including nodes. In each case, the source hypocentres are in agreement within uncertainty, with the synthetics resolving the true location in each case. However, while the linear fibre geometry also resolves the hypocentre for the synthetic case (see Fig. 7a), when applied to the real icequake data set the icequake hypocentre is poorly resolved (see Fig. 7d). Instead of resolving the icequake at its likely true location as found from the 2D DAS grid, the icequake hypocentre is instead located at the end of the fibre. We suggest that this is a result of noise impacting the coalescence performance, with artificial focusing of noise at the end of the fibre due to fibre sensitivity effects. Such effects are theoretically described from an integrated sensing perspective in Fichtner et al. (2022). In all cases, the real-world data exhibit smaller coalescence peaks that have a greater spatio-temporal extent, representing greater spatio-temporal uncertainty. This is likely due to a combination of SNR, velocity model error and poorly coupled channel contributions. One further interesting result is that spatial uncertainty increases when seismic nodes are included (Fig. 7e versus Fig. 7f). One might expect the additional information gained by including more receivers to decrease uncertainty. However, we interpret this increase in uncertainty, which is not observed in the synthetic data set (Figs 7b and c), to be caused by additional noise introduced to the system by inclusion of the nodes. The nodes have a different sensitivity to DAS channels and natively measure acceleration, so it is plausible that noise levels might not change linearly with signal levels when including this data.
The results of Fig. 7 show how important maximizing spatial sampling coverage is for source localization and hence detection. For glacier deployments, it is conceivable to deploy dense 2-D geometries, but in other situations it is likely that maximizing spatial coverage will often require the combination of conventional seismometers and/or nodes in addition to fibre.
3.1.4 Stacking and coupling
Two immediate challenges of processing fibreoptic data sets are processing large data volumes resulting from inherently dense spatial sampling and fibre-medium coupling (Hudson et al. 2025). Fig. 8 summarizes the effects of stacking fibre channels to reduce data volumes and accounting for coupling effects by removing poorly coupled channels, for the glacier data set.

Effect of stacking and coupling on detection algorithm performance. (a) Reference coalescence map, using all available fibre channels for detection of an example icequake. (b) 10-fold (10 channel) semblance-stacking result for the same icequake as in panel (a). (c) Detection result for same icequake as in panel (a) except with poorly coupled channels removed. (d) Comparison of maximum single-pixel coalescence values through time for each setup in panels (a) to (c).
The effect of stacking multiple fibre channels is shown in Fig. 8(b). The motivation for stacking is to reduce data volumes and hence increase computational efficiency. The result in Fig. 8(b) applies semblance stacking to every 10 channels (16 m), aiming to preserve spatio-temporal coherency information while spatially downsampling the data. The results show that downsampling the fibre via stacking does not have a significant effect on the hypocentral location but does result in a spatially more constrained peak in the coalescence. This result is interesting since it shows that the semblance stacking not only preserves the coherency information but acts to reduce noise effects between individual channels, enhancing the overall coherency of the solution in space and hence improving detection performance. Semblance stacking therefore not only gives a computational performance gain (both in terms of efficiency and memory usage), but also enhances detection algorithm performance, at least in this instance.
The effect of removing poorly coupled channels on detection performance is shown in Fig. 8(c). Here, we remove channels that traverse crevasses and are hence poorly coupled to the ice and comparatively well coupled to the atmosphere. Not only are these channels therefore approximately insensitive to subsurface seismic energy, but actually have higher noise amplitudes due to atmospheric effects such as wind, for example. The results are remarkably similar to those of Fig. 8(b), with no distinguishable difference in the location of the coalescence peak, but the 95 per cent contour becoming better constrained spatially. One might expect removing poorly coupled channels to have a greater effect. However, we attribute the relatively insignificant change in performance to be a result of a key strength of back-migration, in that poorly coupled channels represent incoherent noise that theoretically should not contribute significantly to event detection. At least in this example, accounting for coupling appears to not be of first-order importance.
3.1.5 Including fibreoptic sensitivity
The influence of accounting for the effects of fibreoptic measurement sensitivity are shown in Fig. 9 for both a glacier icequake (Figs 9a–d) and a volcano-tectonic earthquake (Figs 9e–i). For the glacier icequake, accounting for sensitivity affects the coalescence, including both the peak and extent of the 95 per cent contour, moving the peak and tightening the spatial constraint (see Fig. 9b versus Fig. 9a). The effect is more extreme than the effects of stacking or removing poorly coupled channels (Fig. 8). Accounting for sensitivity for the volcanic earthquake example has a smaller effect. This is despite the geometry being far more linear than in the icequake example. While the 95 per cent contour encloses a smaller spatial extent, the earthquake hypocentre moves only |$\sim 100$| m, relative to the 8 km fibre. We attribute this behaviour to the fact that the somewhat linear fibre geometry here has a highly non-uniform sensitivity to both P and S waves, and so the network is only sensitive to seismic energy from these regions already, so the additional sensitivity constraint we impose has little effect. This is contrary to the icequake example, where the overall network has an approximately homogeneous sensitivity to incoming energy from any source location. We argue that it is therefore debatable whether one should impose any sensitivity constraint in practice, since it is already incorporated into the analysis and if the velocity structure is poorly constrained then the sensitivity maps will also be poorly constrained. However, in any case, plotting up the sensitivity maps for various seismic waves (Figs 9c, g and h) is insightful and should always be compared to the distribution of earthquake hypocentres output from a detection and location algorithm.

Effect of fibreoptic sensitivity on detection performance. (a, b) Coalescence maps for a glacier icequake example, without and with sensitivity compensation, respectively. (c) P-wave sensitivity map for the Gornergletscher network corresponding to the results of (b). (d) Comparison of maximum single-pixel coalescence values through time with and without sensitivity compensation. (e, f) Coalescence maps for a volcano-tectonic example, without and with sensitivity compensation, respectively. (g, h) P- and S-wave sensitivity maps, respectively, for the Reykjanes Peninsula network corresponding to the results of panel (e). For each channel and seismic phase, regions with sensitivities |$\lt 0.1$| are masked for individual receivers. (i) Same as (d) but for the volcano-tectonic example.
3.1.6 A note on the spatial coalescence prominence trade-off
Figs 5 to 8 show the effect of various data properties on the prominence (relative amplitude of maxima relative to background noise levels) of both the temporal and spatial coalescence values. While the coalescence time-series typically always have high prominence, the spatial coalescence prominence varies considerably, with maxima often only 5 to 10 per cent greater than the background values for the marginalized coalescence space (summed over the entire time window, see Fig. 5a versus Fig. 5d, for example). The prominence of the marginalized coalescence space is proportional to the window length. For example, at the extremes, a window length of one sample would give the highest prominence with a marginalized spatial coalescence equal to the instantaneous value (see Fig. 5a) whereas an infinitely long window would give an approximately homogeneous marginalized coalescence space, assuming a homogeneous coherent noise wavefield. There is therefore a trade-off between window length and prominence: too short a time window to sum over and the spatial coherence peak and hence optimal location may only capture a local minimum and underestimate the spatial uncertainty, while too long a time window would reduce the prominence to a level where the event location becomes indeterminable spatially above the noise. While the prominence is low in some of our examples (for example, Fig. 7), the events remain detectable. Therefore, we find that the time window duration does not have a significant effect on detection but can significantly affect hypocentral location. We prefer the philosophy of overestimating the uncertainty and so use longer rather than shorter time windows to perform the marginalization over. However, we recommend that users investigate this trade-off themselves for each data set, since it is highly dependent on seismic source parameters and local noise conditions.
3.2 Generating earthquake catalogues
3.2.1 Glacier
Fig. 10 shows the icequake catalogue from the fibreoptic deployment at Gornergletscher in the Swiss Alps. The majority of these icequakes are likely generated by near-surface crevassing (Walter et al. 2009; Hudson et al. 2020). First, the 2-D fibre geometry results in no apparent bias in the spatial distribution of seismicity. In this example, we use both P and surface waves to contribute to event detection. The surface fibre deployment is sensitive to P waves due to the approximately homogeneous velocity structure of ice, with no shallow slow velocity firn layer present. Such slow velocity layers at glaciers have proven problematic for P-wave detection previously (Hudson et al. 2021). S-wave energy generated by crevassing is expected to be minimal (Hudson et al. 2020), so is not used here. Since the ice column is assumed to be of |$\lt 100$| m thickness and the dominant icequake generation mechanism is expected to be near-surface crevassing, we also use surface waves for detection. The site has several sources of noise, including wind and subsurface fluids that flow through some of the fractures. Coupling directly to glacier ice in such conditions can be challenging, but in this experiment the fibre is generally well-coupled to the ice since the fibre froze in within the first 12 hr of the deployment (Hudson et al. 2025). The quality of coupling can be seen in Fig. 10(b), with most channels showing clear P- and surface-wave arrivals for an icequake, but with a number of channels showing only noise where they are suspended above a crevasse (at |$\sim 470$|, |$\sim 750$| and |$\sim 970$| m, for example). Here, all channels are included for detection. This is based on the finding that the algorithm is unaffected, at least to first order, by poorly coupled channels with spatially uncorrelated noise (see Fig. 8c).

Glacier example from a crevasse field at Gornergletscher, Switzerland. (a) Gornergletscher icequake catalogue (2023-10-22T04:00:00 to 2023-10-22T10:00:00). Red stars are icequakes, black triangles are DAS channels and gold triangles are seismic nodes. Background images are from an unmanned aerial vehicle during field deployment and from Swiss Topo (accessed: 2024 June 10). (b) Example icequake signal measured on the fibre. Black-white colour scale is normalized strain. Red and yellow points and error bars show P-wave and surface-wave arrival time picks and associated arrival time uncertainty estimates, respectively, by QuakeMigrate. (c) Phase arrival picks for P waves (red) and surface waves (yellow) for three channels in more detail. (d) Example of P-wave arrival detected on a seismic vertical single-component node for comparison.
173 icequakes are detected in six hours. While there are likely more icequakes in the data set, we opt to lower the detection threshold only to a point where we are confident that we minimize false detections. Figs 10(b)–(d) show results for one icequake. The detection algorithm generally picks both P and surface wave first breaks where one would manually identify them, after accounting for uncertainty, although uncertainties are large (of the order of the dominant surface-wave period). Seismic energy arriving at one of the nodes is also shown in Fig. 10(d), evidencing that the detection algorithm performs adequately on both conventional and fibreoptic data in combination, one of the key aims of this work.
3.2.2 Volcanic eruption
Fig. 11 shows an earthquake catalogue during one episode of the ongoing Sundhnúkagígar eruption, Iceland. The dark fibre interrogated during this experiment has a somewhat linear geometry, typical of many dark fibre geometries, following a road from a geothermal power plant to the coast. Both the geothermal plant (at the fibre origin) and the coast (at the far end of the fibre) generate coherent noise. Coherent noise sources can affect the performance of the detection algorithm if they contain energy within the bandwidth of interest. These noise sources detrimentally affect phase arrival identification from 0 to 300 m and beyond 7900 m along the fibre. The presence of both natural and anthropogenic noise, along with the numerous earthquakes that occur over the time period make this data set an ideal case study.

Volcanic eruption example from Sundhnúkagígar on the Reykjanes Peninsula, Iceland. (a) Earthquake catalogue from one eruptive period (2023-12-18T00:00:00.0 to 2023-12-20T00:00:00.0). Fibre receivers are shown by gold line, seismometer receiver is shown by gold triangle, earthquakes detected and located in this study are red stars and matched Icelandic Met Office events are shown by blue stars. Digital elevation model is from the ArcticDEM (Porter 2023). (b) Example earthquake signal measured on the fibre. Black–white colour scale is normalized strain rate. Red and blue points and uncertainty bars show P-wave and S-wave arrival time picks and their associated estimated uncertainty, respectively, from QuakeMigrate. (c) Phase arrival picks for P-waves (red) and S-waves (blue) for three channels in more detail. (d) Example of P-wave and S-wave arrivals detected on a conventional seismometer operated by IMO, for comparison.
The fibreoptic data combined with a single seismometer detects 886 earthquakes within the region shown in Fig. 11 a on the 2023 December 18, compared to 826 earthquakes detected within the same region by the permanent regional monitoring network (operated by the Icelandic Met Office). However, although the energy from the earthquakes coalesces sufficiently to make detections, the locations typically remain poor. We expect the majority of seismicity to align with the opening rift (Sigmundsson et al. 2024), but instead find that the seismicity clusters near one end of the fibre. This is likely for several reasons. First, many of the earthquakes detected are close to the noise level, affecting the accuracy of individual channel arrival time picks. Secondly, and likely more significantly, the geometry of the fibre provides poor location constraint (see Figs 9c and d). This is partly due to poor azimuthal coverage but also likely the result of low sensitivity to P waves from certain regions of the seismically active rift. Poor azimuthal constraint likely results in poor locations, primarily because the regional velocity model used is likely faster than the true shallow velocity structure.
Even though locations are poorly constrained, phase arrival times for some events are promising. For example, the earthquake shown in Figs 11(b)–(d) shows P and S arrival time picks close to the first break, with realistic uncertainties, even though the P-wave amplitude is close to the noise level. This is particularly clear in Fig. 11(c). It is surprising that one can even observe P-waves on the fibre at all, given the crustal setting. However, we attribute this to be due to the fibre being deployed within metres of the bedrock, removing much of the effect of a steeply varying near surface velocity gradient that would otherwise refract P-waves towards vertical incidence. The earthquake arriving at a conventional seismometer also included for detection is shown in Fig. 11(d), again confirming the promising performance when processing hybrid fibreoptic-seismometer data sets.
While it is encouraging that the detection algorithm works well even when locations are poorly constrained, our findings from this data set illustrate the challenges associated with using fibreoptic data from dark fibres for source localization.
3.2.3 Geothermal borehole
Fig. 12 shows an earthquake catalogue for one hour during a stimulation test at the Utah FORGE experiment. The deployment is typical of many borehole DAS deployments, with only a vertical fibre cemented into a well, with seismometers deployed at the surface. For the hour of data we analyse, the closest surface seismometers to the injection well have too high noise to observe any subsurface seismicity. We therefore use only one seismometer, FORU, in combination with the downhole fibreoptic data.

Geothermal borehole example from the Utah FORGE experiment, US. (a) Earthquake catalogue from a one hour period during stimulation (17:00 to 18:00 on 2019 April 27). Receivers are shown by gold triangles and earthquakes detected and located in this study are red stars. Digital elevation model is from the NASA Shuttle Radar Topography Mission. (b) Example earthquake signal measured on the fibre. Black–white colour scale is normalized strain rate. Red and blue points and uncertainty bars show P-wave and S-wave arrival time picks and their associated estimated uncertainty, respectively, from QuakeMigrate. (c) Phase arrival picks for P waves (red) and S waves (blue) for three channels in more detail.
We detect 135 earthquakes, compared to 125 earthquakes detected using a combination of standard methods and matched filter processing with a string of 12 borehole geophones (Dzubay et al. 2022). Most earthquakes are present in both catalogues (96 out of 125). One should note that comparing numbers of earthquake detections can be misleading, since one can lower detection thresholds and detect many more earthquakes or vice versa. Obviously one can check candidate events individually, but this is not feasible for data sets with thousands of earthquakes. Here, we try to avoid this by setting the detection threshold to a level where we minimize false detections while still detecting as many real events as possible. However, it is still worth emphasizing that the detection algorithm of this study detects a comparable number of earthquakes to the most sensitive detection method possible, a matched filter method. Although there is no ground truth, we deem earthquake locations to be poor, with a portion of the earthquakes detected locating far from the stimulation well located at approximately |$38.5^o\mathrm{ N}$|, |$112.9^o\mathrm{ W}$|. Similar to the volcanic eruption example, we attribute poor location constraint to be dominantly caused by network geometry. If more surface instruments were usable, locations would likely be constrained better.
Although locations are interpreted to be poor, this does not translate to poor earthquake detections, consistent with our findings from the volcanic eruption results. This is demonstrated for the example earthquake shown in Fig. 12(b). Both P- and S- wave phases are identified with small uncertainties compared to the other data sets presented. This is likely due to a combination of good coupling of the cemented fibre, as well as perhaps lower noise levels in the subsurface and a relatively homogeneous velocity structure in the vicinity of the monitoring well. S-wave arrivals may be incorrectly identified in some cases (see Fig. 12c), but this is to be expected given the noise levels and/or P-wave coda.
4 DISCUSSION
4.1 Benefits of combining fibreoptic sensing and conventional instrumentation for hybrid seismic detection
4.1.1 Maximizing spatial sampling extent and density
Including fibreoptic data for earthquake detection is an obvious way to enhance spatial sampling density. However, as we show in the results, enhancing spatial sampling density alone does not necessarily equate to enhanced earthquake detection performance. Where receivers are placed geographically may be of similar importance as number of receivers deployed (Toledo et al. 2020; Strutz & Curtis 2024). Fibreoptic deployments fall into two categories: fibreoptic cables deployed specifically for a seismological application versus interrogation of existing dark-fibre telecommunication networks. While the first category allows one to design an optimal network geometry, one has no influence of the geometry for the second category. Fibreoptic geometries can severely limit back-migration-based detection methods (e.g. it is impossible to uniquely back-migrate energy from a linear fibre geometry). Overcoming such issues is only possible by including other data, for example, conventional seismometers, which will enhance spatial coverage in almost any scenario.
In summary, maximizing spatial coverage requires using data from as many receivers as possible. Back-migration algorithm performance is then improved by converting all data into the same units (velocity) that both homogenizes frequency content as well as reduces noise and heterogeneous local velocity structure effects.
4.1.2 Exploit signal coherency
The back-migration method inherently exploits coherency, which introduces both benefits and challenges when including fibreoptic data sets. One benefit of exploiting coherency is associated with coupling of fibre to the medium. Ideally, one would quantify coupling and remove poorly coupled channels from any analysis. However, quantifying coupling in fibreoptic deployments remains challenging. Instead, here we simply capitalize on the assumption that poor coupling results in incoherent noise that cannot be back-migrated. Poorly coupled channels will reduce the overall maximum theoretical coalescence value, since the channels may not contribute but crucially poor coupling will not contribute to false triggers. However, the greatest benefit of exploiting signal coherency is the detection of signals with an SNR as low as 1.1 (see Fig. 5). In this instance, it is challenging to see any signal in the synthetic data yet the event can clearly be detected, even if location constraint is imprecise. The vast increase in number of receivers provided by DAS compared to conventional instrumentation increases this benefit. Given that individual DAS channels are typically noisier than individual conventional receivers, back-migration methods could play an important role for earthquake detection using DAS.
4.1.3 Maximizing sensitivity to multiple seismic phases
Using more than one seismic phase reduces the non-uniqueness of the nonlinear earthquake location problem, an essential component of any back-migration method. The single-axis measurement of DAS limits its sensitivity to multiple seismic phases. In certain data sets with a strong shallow velocity variations (Hudson et al. 2021), the fibre is only sensitive to a single seismic phase. However, in all the data sets presented here we can identify at least two seismic phases associated with the seismic sources. In the surface deployments, individual channel sensitivity to particular phases varies considerably, but overall combined sensitivity maps (see Fig. 9) suggest that sensitivity is not as big an issue as previously suggested. Although we provide ways to deal with variations in channel sensitivity here, for example, masking regions of low sensitivity, we find that sensitivity is not as important as other considerations.
4.1.4 Maximize computational efficiency
Back-migration detection methods are inherently computationally expensive compared to simpler, receiver-by-receiver detection methods. However, QuakeMigrate runs in approximately real-time for the experiments presented here (using 8 processors on an Apple M3 Pro CPU). These efficiencies are primarily driven by three contributions: computing lookup tables only once for entire data sets; reading in continuous seismic data in blocks rather than entire files, minimizing read–write operations; and implementing the core back-migration step in the pre-compiled C language. To minimize the additional computational expense of including fibreoptic data, we only compute sensitivity lookup tables once, and perform lookup table masking without the need to subsequently store sensitivity information independently in memory. Secondly, we support reading of a number of native DAS data formats (hdf5, segy, etc.) directly, which are typically split into small, one minute duration files. One can then simply run QuakeMigrate over minute long time windows, optimizing costly read–write processes and memory usage.
4.2 Practical considerations influencing earthquake detection with fibreoptic sensing
The findings of this work emphasize a number of practical considerations affecting earthquake detection using fibreoptic sensing. While we focus on local microseismicity detection using a back-migration method, most of the points below also hold for earthquake detection using fibreoptic sensing more generally.
The spatial sampling extent of the seismic wavefield plays the most important role in earthquake localization, yet back-migration based detection still performs successfully in practice, even if the coherent source of energy is poorly constrained spatially. Specifically, the geometry of the fibreoptic cable plays an important role, evidenced by the glacier 2-D grid deployment of fibre detecting and locating events better than highly linear fibre geometries such as the volcanic eruption data set, for example. Extrapolating from this, we expect 3-D fibre geometries to further improve performance. In practice, many fibreoptic cable geometries are limited by practical constraints, and so combining fibreoptic data with conventional receivers provides an optimal alternative.
Fibreoptic sensing inherently provides dense sampling of the seismic wavefield. Decimation of this data is important for increasing the computational efficiency and minimizing memory usage in practice. We find that using semblance stacking (Porras et al. 2024) to decimate the data preserves wavefield coherency information. Such stacking is applicable where fibre geometries have linear sections at least of the order of the spatial decimation/stacking length.
Conversion from native strain or strain rate to velocity enhances performance, for fibre geometries with substantial linear sections or low curvature. For example, we find a phase pick accuracy gain for the volcanic eruption example but not the glacier example. This gain is primarily attributed to dampening sensitivity to subsurface heterogeneities (Capdeville & Sladen 2024) but also the isolation of integration noise.
Compensating for fibreoptic sensitivity can provide additional spatial constraint of coalescence peaks, but the benefit is limited. This finding is important because it implies that sensitivity may not be a major practical concern for earthquake detection for most deployments.
There exists an optimal spatial resolution of the 3-D search grid. We introduce this specific back-migration concept here, since it can play an important role in our detection algorithm performance. Specifically, too high a resolution grid results distributes coherent energy between more grid cells, due to velocity structure uncertainty, resulting in lower peaks in coalescence. Conversely, too low a resolution grid results in high coalescence values of a single grid, but risks not isolating coherent earthquake signals from coherent or even incoherent noise. Optimal grid cell resolutions for each example in this work are given in Table 1.
There also exists an optimal moving time window with which to process the data. Again, this point is a specific back-migration concept, but important for users of the detection algorithm presented here. Specifically, the time window over which one processes phase arrivals must be sufficiently long that the uncertainty in phase arrivals can be adequately estimated, while being sufficiently short to eliminate phase mis-association. The length of this time window also affects the prominence of the marginalized spatial coalescence, and hence the estimate of the optimal hypocentre and associated spatial uncertainty. The time window is highly dependent on the seismic source frequency content, distance to receivers, and local noise levels, so we recommend users always tune the time window for each unique data set.
Back-migration is sensitive to velocity model error. Although our synthetic results (see Fig. 6) suggest that velocity model errors do not majorly effect event detection, we cannot quantify the effects on real-world data. The dense spatial sampling provided by fibreoptics and the measurement of strain/strain rate rather than displacement/velocity means that such data is more sensitive to local velocity heterogeneities than conventional networks (Capdeville & Sladen 2024). Typically, earthquake detection might be one of the first analyses of a particular region, with these earthquakes needed for body-wave tomography approaches to image subsurface velocity structure more precisely. Best practice might be to first use ambient noise tomography to estimate velocity structure before performing earthquake detection. However, in the absence of any prior velocity model knowledge, we would recommend using as simple a model as possible so that at least any velocity model error effects can be identified as easily as possible.
4.3 Comparison to alternative approaches
The greatest strength of fibreoptic sensing is the orders-of-magnitude denser spatial sampling over conventional instrumentation. Coherency-based back-migration methods such as that described in this study by definition capitalize on coherency between these densely sampled channels in a way that receiver-by-receiver detection methods cannot. Receiver-by-receiver STA/LTA algorithms are susceptible to triggering off incoherent noise. Receiver-by-receiver machine-learning-based phase arrival detection methods are theoretically less sensitive to incoherent noise since they approximately learn to identify the noise field, but this requires the availability of a training data set of existing detected earthquakes. Furthermore, all receiver-by-receiver methods struggle with phase association. Although machine-learning derived phase associators exist (Ross et al. 2019), we find that they do not always perform adequately for local seismicity (within 5 km of a network). The only benefit of receiver-by-receiver methods over back-migration is computational efficiency. However, all the examples presented in this work run faster-than-real-time on a standard computer (8 processors on a Apple M3 Pro CPU). Surpassing the faster-than-real-time benchmark is key for any seismic monitoring application, with any additional gains only a bonus.
Some promising advances in detecting earthquakes using fibreoptics have been made by assessing coherent energy arriving at many fibre channels simultaneously. These include explicit methods such as assessing the curvature of arrivals on linear fibre using semblance methods (Porras et al. 2024), and implicit methods that use machine-learning image recognition algorithms (Stork et al. 2020; Huot et al. 2022) to identify similar features. Although these methods are promising, typically detecting events close to the noise level, there are a number of limitations. First, explicit semblance-based methods require specific, linear fibre geometries, or at least substantial lengths of linear fibre. It is unclear how sensitive machine-learning image recognition algorithms are to fibre geometry, but current implementations would have to be retrained for every new deployment or new fibre geometry. A further limitation is that it is challenging to conceive of ways to include both fibreoptic and conventional receivers in such detection frameworks. Huot et al. (2024) lay foundations for combining fibreoptic and conventional receivers in a machine learning algorithm by running two binary logistic regression models in parallel and identifying events detected by both. They find that this significantly reduces false triggers compared to only using the fibreoptic data, emphasizing the importance of combining data from all available receivers where possible. However, unlike the back-migration method we present here, such methods do not yet fully couple the information provided by multiple receiver types, and therefore do not yet fully realize the associated performance gain.
4.4 Perpetual challenges
A number of challenges remain for using fibreoptics for earthquake detection. Particular challenges that are difficult or even impossible to overcome are as follows: First, fibreoptic cables are sensitive only to in-axis strain. This limits sensitivity to multiple seismic phases, especially in the presence steeply varying shallow subsurface velocity gradients (Hudson et al. 2021). Helically wound fibre has the potential to overcome this issue (Baird 2020), but still represents a pseudo-1-D measurement, is costly, and is never deployed in telecommunications networks. Secondly, fibre-medium coupling remains generally both poorly constrained and for some experiments uncontrollable (Paitz et al. 2021). Recent work allows one to quantify expected coupling response (Celli et al. 2024) and if a fibre is buried, frozen-in or cemented in situ, then coupling is approximately perfect. However, for fibres deployed on the surface or deployed in conduits, for example, dark fibres, coupling is challenging to quantify. Thirdly, our results emphasize the importance of fibre geometries and spatial extent, yet at least for dark fibre deployments one has little or no control over the deployed geometry. Overcoming this issue may be possible by interrogating many fibres in dense, urban environments, but will remain a challenge in rural or subsea regions. Fourthly, data volumes remain challenging. Experiments can generate 100s GB to TBs of data per day. Downsampling data will therefore become essential with increasing deployment duration and/or increased deployment spatial scales. The semblance stacking component described in this study may allow on-the-fly decimation while preserving some directivity information. A final challenge of note is near-field coherent noise, for example, roads or train lines (Dou et al. 2017; Lindsey et al. 2020b; Bozzi et al. 2024). The back-migration method we present would be sensitive to coherent vehicle noise that would likely prevent simultaneous earthquake detection. A key benefit of fibreoptic sensing is measuring the seismic wavefield in urban environments, so going forward this challenge would have to be addressed for urban earthquake detection.
4.5 Future directions
The aforementioned challenges inspire directions for future work. Recording and processing large data volumes (TBs) is non-trivial and limits real-time earthquake detection using back-migration style algorithms. A possible avenue for reducing data volumes could be applying compressive-sensing based approaches (Muir & Zhan 2021) to effectively reduce the number of fibre channels used while retaining sufficient information to detect seismicity. In a similar vain, using non-uniform or cascading coalescence search grids, inspired by those used for earthquake location and tomography (Lomax & Curtis 2001; Thrastarson et al. 2024), could optimize memory usage and compute expense while refining peak coalescence values and hence detection. Here, we only use body waves, except for the glacier data set where surface waves are assumed to be non-dispersive. In most settings, surface waves are dispersive, making back-migration more complex. One method might be to back-migrate surface wave components with different frequencies through different velocity models. However, we leave such an adaption as an area for future work. Another avenue of future work is reducing or removing noise. Recent advances applying machine learning to minimize instrument and/or coherent noise (Martin et al. 2018; Batson & Royer 2019; Lapins et al. 2023; Ma et al. 2023; van den Ende et al. 2023) could readily be applied to our detection workflow, improving detection performance. A remaining endeavour is to perform masking of coherent noise sources. For example, certain regions of the coalescence search grid may correspond to roads that act as temporally varying noise sources. One could envisage adapting our detection method to remove parts of the wavefield corresponding to a coalescence of energy from these locations, in a similar approach to that taken to mitigate fibreoptic sensitivity in this work. Simultaneously arriving lower SNR earthquakes may then be detectable even in the presence of higher local coherent noise sources.
ACKNOWLEDGMENTS
We thank the editor, C. Tape, the reviewer A. Lellouch and an anonymous reviewer for their constructive comments that no doubt improved the manuscript. For the Gornergletscher data set, the Sintella interrogator used was kindly borrowed from M. Kendall at the University of Oxford. We also want to thank T. Kettlety for helping set up the interrogator. F. Walter, E. Wolf, R. Cross and E. Julen provided support in the field, without which the experiment would not have been possible. The fieldwork was funded by a Leverhulme Early Career Fellowship (ECF-2022-499) and Oxford University’s John Fell Fund (0013666). A philanthropist kindly provided logistical support through use of a vehicle for the Gornergletscher fieldwork. For the Iceland DAS experiment, we thank HS Orka, operator of the Svartsengi Geothermal Power Station, for access to the fibreoptic cable. OF was supported by FNRS (Fonds de la Recherche Scientifique, contract number: FC 49429). For the Utah FORGE experiment, we thank the project partners for making the data openly accessible.
DATA AVAILABILITY
Earthquake catalogues for each data set are provided, along with a full working example of how to run the modified QuakeMigrate package, via an archived online repository (Hudson 2024a) (https://doi.org/10.5281/zenodo.13355664). The exact DAS modified version of the QuakeMigrate software is also archived online (Hudson 2024b) (https://doi.org/10.5281/zenodo.13355775), which will be merged with the main QuakeMigrate repository. Data for the example earthquakes from the Gornergletscher glacier and the Iceland volcanic data set are also provided in the same repository. Also provided in this repository are jupyter-notebook examples of running the modified version of QuakeMigrate. FORGE DAS data are available from US DOE Geothermal Data Repository (https://doi.org/10.15121/1603679) and seismometer data from the University of Utah (https://doi.org/10.7914/SN/UU), accessed from Incorporated Research Institutions for Seismology (https://ds.iris.edu/mda/UU/).