Including stress relaxation in point-process model for seismic occurrence

Physics-based and statistics-based models for describing seismic occurrence are two sides of the same coin. In this article we compare the temporal organization of events obtained in a spring-block model for the seismic fault with the one predicted by probabilistic models for seismic occurrence. By optimizing the parameters through a Maximum Likelihood Estimation, it is possible to identify the statistical model which best fits the physical one. The results show that the best statistical model must take into account the non-trivial interplay between temporal clustering, related to aftershock occurrence, and the stress discharge following the occurrence of high-magnitude mainshocks. The two mechanisms contribute in different ways according to the minimum magnitude of the fitted catalog.


INTRODUCTION
The description and understanding of seismic phenomena is fascinating not only as a piece of basic physics knowledge, but also for its social impact. Seismic prediction is a longstanding and debated topic in the geophysical community, with scenarios ranging from the extreme position of a completely random, unpredictable (Poisson) process up to deterministic predictions [1]. The difficulty in settling this debate can be attributed to the complexity of the earthquake process, which presents different mechanisms acting on very different spatial and temporal scales [2]. Currently, epidemic-like probabilistic models [3][4][5] have given, and continue to provide, a fundamental contribution to seismic forecasting. This class of models is able to capture the spatio-temporal clustering of earthquakes, with the majority of earthquakes, usually termed aftershocks, occurring soon after large earthquakes and close in space to their epicenters. Within these models seismic hazard increases after the occurrence of large earthquakes. Nevertheless, it is not possible to obtain information on the timing of the next large earthquake in a given region, because magnitudes are randomly assigned to each earthquake from the empirical magnitude distribution and do not depend on the previous seismic history. On the other hand, physical models originally implement the idea that an earthquake is the sudden relaxation of the stress accumulated over years or decades in a given region. Accordingly, one could expect that the region experiencing the largest slip also relaxes most of the accumulated stress and is therefore less prone to host a new earthquake, as in the original hypothesis of alternation proposed by Gilbert in 1894 [6]. Subsequent models developed along this direction assume that consecutive earthquakes substantially re-rupture the same fault segment, nucleating characteristic earthquakes which are roughly equal in size and roughly periodic in time. These models, usually
termed seismic gap or seismic cycle models [7,8], propose the possibility of a quasi-deterministic prediction of the timing of the next large earthquake. Tests of the seismic gap hypothesis have shown its inefficiency [9,10], even if it is still sometimes used in earthquake forecasting [11]. Furthermore, the seismic gap model does not provide any justification for the occurrence of aftershocks which, conversely, as anticipated, is a very distinctive feature of seismic catalogs. Nevertheless, aftershock occurrence, and as a consequence also short-term clustering, is not incompatible with the seismic gap hypothesis if aftershocks are located outside the region involved in the mainshock slip or, at most, in regions with low levels of mainshock slip. This feature has been documented for 101 large subduction-zone plate-boundary mainshocks with well-determined coseismic slip distributions [12]. This study has indeed revealed a deficit of aftershocks inside the mainshock slip area, consistently with the hypothesis that large slip areas re-lock. A further indication of compatibility between the seismic gap hypothesis and aftershock occurrence is the observation that larger aftershocks occur farther away than smaller aftershocks [13]. Interestingly, the interplay between the stress accumulation/relaxation mechanism underlying the seismic gap hypothesis and the occurrence of aftershocks described by epidemic models has been enlightened in a spring-block model for the seismic fault. The model is a generalization of the OFC model [14], which represents a cellular-automaton version of the spring-block model for the seismic fault [15]. In particular, the considered model, named the afterslip relaxation OFC (AROFC) model, belongs to the class of models [16][17][18][19][20][21][22][23][24] which add a relaxation mechanism introducing aftershock occurrence within the OFC model [14]. The AROFC model describes the seismogenic fault as an elastic layer, as in the OFC
description, and it takes into account its coupling with a more ductile layer where stress is gradually relaxed by a post-seismic deformation, usually termed afterslip [25,26]. The model is very efficient in quantitatively reproducing the main statistical features of seismic catalogs, as well as in explaining the connection between aftershocks and afterslip [22,24] or the effects of Moho depth on the magnitude distribution [27]. Interestingly, in the AROFC model [28], even if the highest seismic rate is observed after the occurrence of a large mainshock, some features of the seismic gap hypothesis are recovered, with large earthquakes usually occurring in gap regions. Nevertheless, the timing of mainshocks is irregular and depends only weakly on the time elapsed in that region since the previous slip. Recent results therefore show that the epidemic description and the gap hypothesis, in a weaker form, can coexist. The main purpose of this study is to combine the two mechanisms in a statistical model for seismic occurrence which, on one side, preserves the epidemic description and, on the other side, takes into account the stress accumulation/relaxation mechanism behind the seismic gap hypothesis. More precisely, we show how the two scenarios can be combined in various statistical models [29][30][31][32]. The problem is that an accurate statistical validation of the different models is not possible in instrumental seismic catalogs, since the occurrence of large earthquakes with overlapping slip regions is a rare event. For this reason we search for the statistical model which best reproduces the temporal organization of earthquakes in the AROFC model, which conversely allows us to generate arbitrarily long simulated catalogs for a full statistical validation. More precisely, we identify the best parameters for each model via a Maximum Likelihood Estimation (MLE), finding a non-trivial dependence of the Log-Likelihood function on the minimum magnitude of the catalog. The paper is organized as
follows. Section II is an overview of the AROFC model that is used for the generation of the synthetic catalog. Section III is dedicated to the introduction of the various statistical models that are used subsequently. The optimization of the parameters by means of MLE and the comparison of the efficiency of the different models in fitting the data are carried out in the following sections. The last section is devoted to conclusions.

II. THE AROFC MODEL
The model is composed of two layers: a layer H and a layer U, which mimic the brittle and the ductile part of the fault, respectively. The layer U is driven by the tectonic dynamics at the velocity V_D and is elastically coupled to the layer H. Each layer can be considered as a set of blocks organized on a lattice of size L_x × L_y. For simplicity, we assume V_D directed along the x-direction; therefore the displacement is confined along x, and is denoted h_i(t) in the layer H and u_i(t) in the layer U. We assume a short-range interaction where each block is elastically connected only with its nearest-neighbor blocks. Within this approximation, the total stress on the block of H at position i can be divided into two contributions: the intra-layer stress f_i = k_h Δh_i and the inter-layer stress g_i = k(u_i − h_i). In the last expressions, Δ is the discretized Laplacian, whereas k_h and k represent the elastic constants of the intra-layer and the inter-layer interactions, respectively. Regarding the friction force, we assume two different forms. For the brittle layer H we adopt a Coulomb failure criterion: as soon as the total force f_i + g_i overcomes a local frictional threshold f_i^th, the position h_i becomes unstable and the i-th block performs a slip of fixed amplitude Δh. The stress then evolves through two contributions: the first takes into account the coupling between the two layers, while the second is the dissipation. The quantity k_h Δh = Δf represents the stress drop and it is extracted from a Gaussian distribution with average Δf and standard deviation σ. For the ductile layer, we assume a velocity-strengthening friction, taking the stationary form of the rate-and-state friction law, σ_N [µ_c + A ln(v/V_D)], where σ_N is the normal stress, µ_c is the friction coefficient when the block U slides at the steady velocity V_D, and A > 0 for a velocity-strengthening material. The slip Δh of h_i induces a coseismic slip q_0 Δh of u_i and q_1 Δh of the blocks u_j (where j is a nearest
neighbor of u_i). An instability occurs when the total stress acting on the i-th block of layer H exceeds the local frictional stress threshold f_i^th, i.e., when f_i + g_i ≥ f_i^th. An in-depth explanation of the dynamics of the model can be found in [22][23][24]. In this article we simulate a fault with system size L_y = 200 and L_x = 400 with absorbing boundary conditions. An earthquake is defined as a series of slips, starting from an initial instability of a block i, whose position defines the epicentral coordinates of the earthquake, and terminating when f_i + g_i ≤ f_i^th. Taking into account that the i-th block can slip more than one time during an earthquake, we introduce the quantity n_k(i) for the number of slips performed by the i-th block during the k-th earthquake. The seismic moment M_k released during the k-th earthquake can be defined as M_k ∝ Σ_i n_k(i), where the sum extends over all lattice sites. We finally define the moment magnitude m_k = (2/3) log_10 M_k, where we have set the arbitrary additive constant to zero. Therefore, with the model described, it is possible to obtain as output a collection of event magnitudes as a function of time, i.e., a seismic catalogue. We want to emphasize that this model correctly reproduces all the statistical laws of earthquakes, such as the Gutenberg-Richter law, the Omori-Utsu law, the m vs log(A) scaling, etc. [23].
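As a minimal sketch, the magnitude computation from the slip counts n_k(i) can be written as follows; note that the overall proportionality constant in M_k is our assumption (the additive constant of m_k is set to zero in the text):

```python
import numpy as np

def moment_magnitude(n_slips, dh=1.0):
    """Seismic moment and moment magnitude of one synthetic earthquake.

    n_slips[i] is the number of slips n_k(i) of block i during the event;
    dh is the fixed slip amplitude. The proportionality constant of M_k
    (here dh) is an assumption of this sketch.
    """
    M = dh * np.sum(n_slips)          # seismic moment M_k
    m = (2.0 / 3.0) * np.log10(M)     # moment magnitude m_k
    return M, m
```

For instance, an event in which 100 blocks slip 10 times each gives M_k = 1000 and m_k = 2.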
We simulated two synthetic catalogues: the first, CAT_1, consists of N = 10^6 events with no constraint on the seismic moment; the second, CAT_2, is the subset of CAT_1 obtained by considering only earthquakes with a seismic moment M smaller than 1000. In Fig. (1) the seismic moment versus the occurrence time is shown. Here, each symbol corresponds to an event with occurrence time t and seismic moment M_k. It is immediate to observe the presence of correlated sequences, called clusters, in the catalog. Each cluster is composed of a mainshock (defined as the largest event of the sequence) and some aftershocks/foreshocks (all the events occurring after/before the mainshock). It is common to observe a cluster whose first event is bigger than all the others; this case is not a pathology of the model, but represents a seismic sequence without foreshocks, which is often observed in instrumental catalogs.

III. SELF TRIGGERING AND STRAIN-RELEASE MODELS
A Point Process (PP) is a very popular mathematical model in seismology, especially for earthquake forecasting, implementing the assumption that an earthquake is an isolated point in time. A PP can be defined by means of the number of events N occurring in a certain time interval Δt. If this time span is small, N can assume a null value (no event) or a unit value (one event). Therefore the process is univocally determined by the probability of observing an event in a certain interval Δt conditioned on the previous history H_t. The limit λ(t|H_t) = lim_{Δt→0} Prob(N[t, t+Δt) = 1 | H_t)/Δt then represents the probability density (usually called the "intensity function") of observing an earthquake at a certain time t, given its history up to time t. Forecasting models also consider a point-process description of the spatial occurrence of earthquakes. In this study, for simplicity, we only consider temporal models.
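A temporal point process with a given conditional intensity can be simulated by the standard thinning method; the sketch below is a minimal illustration (not part of the original model) that works for any intensity bounded by a known constant:

```python
import numpy as np

def simulate_pp(intensity, t_end, lam_max, seed=0):
    """Simulate a temporal point process on [0, t_end] by thinning.

    intensity(t, history) must satisfy intensity <= lam_max; candidate
    times are drawn from a Poisson process of rate lam_max and accepted
    with probability intensity / lam_max.
    """
    rng = np.random.default_rng(seed)
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)   # next candidate time
        if t >= t_end:
            return np.array(events)
        if rng.random() < intensity(t, events) / lam_max:
            events.append(t)
```

For a constant intensity the thinning step accepts every candidate and the output reduces to a homogeneous Poisson process.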

A. Self-Correcting Models
In the classical rebound model, one assumes that stress slowly builds up to the breaking point, when an earthquake occurs. The rupture of the fault following this event, on the other hand, results in a reduction of the stress. In general, the conditional intensity rate of this model can be written as λ(t|H_t) = exp(F(t) − G[0, t)), where G[0, t) is a history-dependent function and F(t) is the stress loading which, in general, is considered a linearly increasing function of time. The exponential form is chosen to ensure positivity of the rate λ_SR. The Stress Release (SR) model belongs to the class of self-correcting models introduced in [30]. In practice, the elastic strain accumulates due to the long-term tectonic loading and is released when it exceeds a certain threshold level. After the release of the stress, a certain period is needed for the re-accumulation of the stress and the genesis of a subsequent event. Therefore Eq. (3) for the SR model, as proposed by [30,33-35], can be written as λ_SR(t|H_t) = exp(α + βt − ξS[0, t)), where F(t) = α + βt is a linear function of time, representing the tectonic loading, whilst ξS[0, t) is related to the reduction of stress due to the past history.
Indicating by M_i the seismic moment of the i-th event, S[0, t) = Σ_{i: t_i<t} S_i, where S_i is the stress released by the i-th earthquake, and the sum extends to all events occurring at t_i < t. A reasonable hypothesis [36,37] is that S_i is proportional to the square root of the released energy, S_i = √M_i. Therefore, the fitting parameter set for the SR model in Eq. (4) is Θ_SR = (α, β, ξ). Sometimes a simplified version of the SR model is used in the description of seismicity. In particular, in refs. [32,38] it has been assumed that, on average, the total stress released is proportional to the number of occurred earthquakes, leading to S[0, t) = N[0, t), where N[0, t) denotes the number of earthquakes in the interval [0, t). With this position, we obtain λ_SC(t|H_t) = exp(α + βt − ξN[0, t)). The fitting parameter set is the same as for the SR model and, from now on, we call this model simply the Self-Correcting (SC) model, since it does not take into account the energy released by previous earthquakes.
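A minimal sketch of the two self-correcting intensities, assuming S_i = sqrt(M_i) consistently with the text's hypothesis that S_i is proportional to the square root of the released energy (argument names are ours):

```python
import numpy as np

def lambda_sr(t, times, moments, alpha, beta, xi):
    """Stress Release intensity: exp(alpha + beta*t - xi*S[0,t)),
    with the released stress S_i = sqrt(M_i) (assumption of this sketch)."""
    S = np.sum(np.sqrt(moments[times < t]))
    return np.exp(alpha + beta * t - xi * S)

def lambda_sc(t, times, alpha, beta, xi):
    """Self-Correcting intensity: exp(alpha + beta*t - xi*N[0,t)),
    where N[0,t) counts the past events."""
    N = np.count_nonzero(times < t)
    return np.exp(alpha + beta * t - xi * N)
```

The only difference between the two rates is whether the discharge term weighs past events by their released stress or simply counts them.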

B. Triggering models
Self-exciting point processes (SEPP) have also been used to model seismic occurrence and for seismic forecasting. A typical Hawkes process [39] can be written as λ(t|H_t) = µ(t) + Σ_{i: t_i<t} ν(t − t_i), where µ(t) represents the background seismicity and ν(t) the triggering intensity function; in practice, ν controls the clustering density. For the functional form of ν(t), many proposals have been made [3,31,40-43].
Eq.( 6) describes seismicity as a branching process where each earthquake can have aftershocks, and these can have their own aftershocks, leading to a cascade seismic process.
In this paper we focus on two types of triggering form.
The first one, suggested by the elastic aftereffect theory, gives for the probability density λ_TR1(t|H_t) = e^α + Σ_{i: t_i<t} φ e^{−θ(t−t_i)}, where we have set µ(t) = e^α and ν(t) = φe^{−θt}. The second one is the ETAS model, which implements the Omori-Utsu law, λ_ETAS(t|H_t) = e^α + Σ_{i: t_i<t} φ/(t − t_i + c)^θ. The parameters to fit in the triggering models depend on the functional choice of ν.
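The two triggering intensities can be sketched as follows (a minimal illustration; argument names are ours):

```python
import numpy as np

def lambda_tr1(t, times, alpha, phi, theta):
    """Exponential (elastic aftereffect) triggering:
    e^alpha + sum over t_i < t of phi * exp(-theta*(t - t_i))."""
    dt = t - times[times < t]
    return np.exp(alpha) + phi * np.sum(np.exp(-theta * dt))

def lambda_etas(t, times, alpha, phi, c, theta):
    """Omori-Utsu (ETAS-type) triggering:
    e^alpha + sum over t_i < t of phi / (t - t_i + c)^theta."""
    dt = t - times[times < t]
    return np.exp(alpha) + phi * np.sum((dt + c) ** (-theta))
```

Both share the constant background e^α and differ only in how fast the triggering kernel decays: exponentially for TR1, as a power law for ETAS.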

C. The ETASLC model
We propose to combine triggering with self-correcting models to obtain the so-called Epidemic Type Aftershock Sequence Long-term Correcting (ETASLC) model. Analytically, two terms of the conditional intensity rate must be modified. The first step is to replace the background seismicity µ(t) of the triggering models with the temporal stress loading proposed in Eq. (7), that is µ(t) = e^{α+βt}, while the stress-discharge mechanism after a large earthquake can be reasonably implemented in the functional form of ν(t) by simply subtracting a constant quantity ξ, which represents the new parameter to be fitted, as in Eqs. (4,5).
Overall, putting all this information together and employing the exponential triggering form, the conditional rate of the ETASLC model can be written as λ_ETASLC1(t|H_t) = e^{α+βt} + Σ_{i: t_i<t} [φ e^{−θ(t−t_i)} − ξ], while, implementing the Omori-Utsu law for the triggering part, λ_ETASLC2(t|H_t) = e^{α+βt} + Σ_{i: t_i<t} [φ/(t − t_i + c)^θ − ξ] (11). It is evident that the constant ξ, present inside the sum, is responsible for a correcting term proportional to the number of earthquakes occurred up to time t, playing a role similar to that of N[0, t) in Eq. (5).
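A minimal sketch of the ETASLC conditional rate with the Omori-Utsu triggering form (argument names are ours):

```python
import numpy as np

def lambda_etaslc2(t, times, alpha, beta, xi, phi, c, theta):
    """ETASLC intensity with Omori-Utsu triggering:
    exp(alpha + beta*t) + sum over t_i < t of [phi/(t - t_i + c)^theta - xi].
    The -xi inside the sum subtracts a term proportional to N[0,t)."""
    dt = t - times[times < t]
    return np.exp(alpha + beta * t) + np.sum(phi * (dt + c) ** (-theta) - xi)
```

The exponential-triggering variant (ETASLC1) is obtained by replacing the power-law kernel with phi * exp(-theta * dt) inside the sum.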

IV. LOG-LIKELIHOOD (LL) MAXIMIZATION
For asynchronous data, the likelihood of a non-stationary Poisson process characterized by an intensity function λ(t) is L(Θ|Y) = [Π_{i=1}^{N} λ(t_i)] exp(−∫_{t_in}^{t_fin} λ(t) dt) (12), where Y = (y_1, ..., y_N) are the N observed data, t_i is the occurrence time of the i-th event in the time interval [t_in, t_fin] and Θ is the parameter set. The optimization of the parameters, for all models, is carried out by a Maximum Likelihood Estimation (MLE) [4,44-46]. The parameter optimisation procedure is carried out via a component-wise Markov-Chain Monte Carlo (MCMC) procedure. The MLE algorithm used is presented in App. A. The model that best reproduces the data of the simulated catalog is the one with the minimum AIC value. The AIC value of a certain model is AIC = 2k − 2LL, where k is the number of parameters in the model. In practice, this criterion avoids under-fitting by choosing the model with the highest LL, but assigns a penalty for each additional parameter.

V. COMPARING MODEL AND SIMULATIONS
For the AROFC catalogue, the parameter estimates for the different fitting models are reported in Tabs. (I,II); the set-up values for the MCMC algorithm are shown in Tab. (III) in App. B. In Figs. (2,3,4,5,6,7) we plot the evolution of the parameters during the optimization, for all the models considered. The algorithm is carried out with 10^6 Monte Carlo steps and it clearly converges to a stationary value for all parameters and all models. We first analyse the complete catalogue: the results of Tab. (I) clearly show that the triggering model performs better than both the self-correcting model and the stress release model, but combining the two previous models results in an AIC_ETASLC2 much smaller than all the others; moreover, both ETAS and ETASLC2 have an AIC value smaller than the corresponding models defined with an exponential triggering function. These results give us two important pieces of information. The first, as already well known, is that the Omori-Utsu form is the most plausible triggering function ν(t) for the description of the short-term triggering effects. This also confirms the correct implementation in the physical model of the relaxation function after each event. The second is that both the ETASLC1 and ETASLC2 models outperform their respective triggering models. Interestingly, if we carry out the same study considering only the events of the numerical catalog below the seismic moment M_max = 1000, we observe that the AIC values for both ETASLC models dramatically increase, becoming comparable to those of the triggering model ETAS. This is strong evidence that events of larger magnitude contribute much more to the stress decrease than small events alone; in fact, the AIC values of the triggering models remain almost unchanged (see Tab. (II)) when restricting the catalogue magnitude. Again, a comparison of the triggering models TR1 and ETAS shows how the Omori-Utsu functional form is a more appropriate choice for describing short-term
seismicity. The results also confirm the recent findings of [28], which show only a weak regularity in the recurrence times of two subsequent mainshocks. The magnitude dependence is also observed by comparing the simple Self-Correcting and Stress Release models. In fact, only in the case in which large events are included do we note an improvement of the AIC for the SR model, which contains precisely the information on the magnitude of the earthquakes. In the case of minor events, on the other hand, the AIC remains almost unchanged between the two models.

VI. CONCLUSIONS
Two different types of models have been presented in this article: a physical spring-block model and various point-process models. Both are widely employed in seismology, for the intrinsic description of the physical phenomenon and for the practical purpose of statistical forecasting. Trying to bridge the two types of models could strengthen both sides. In particular, in this study a numerical catalog was produced by means of a cellular-automaton algorithm from a spring-block model based on [23]. This catalog has been fitted with different types of point-process models. On one side, the self-correcting model and the stress release model, which take into account that when the fault ruptures the amount of strain present around the earthquake location decreases. On the other side, the self-exciting model, which implements the temporal clustering property of earthquakes. In the middle, there exists a combination of the two previous types of PP models, called Epidemic Type Aftershock Long-term Correcting models, which take into account both the short-term triggering effect and the release of strain by earthquake occurrence. In the triggering model the background seismicity µ is assumed constant, whereas for the other models µ is a function of time. By means of the Log-Likelihood maximization procedure, it is possible to find the best set of parameters of each model, i.e., the one which best fits the synthetic catalog. The results of this study show that the choice of the best statistical model is not trivial and can depend on the minimum seismic moment at which the seismic catalog is cut. In fact, when large events are present in the catalog, it is necessary to introduce the stress release ingredient in order to better fit the catalog. It is important to underline that a numerical catalog is characterized by the absence of incompleteness. Thanks to this, it is possible to attribute this non-trivial result certainly not to the incompleteness of experimental data, but to
clustering phenomena and/or preponderant background effects. We hope that our results will contribute to the development of a new generation of PP models.

Appendix A: The MLE algorithm

For the LL function maximization, we employ an MCMC algorithm. Calling Θ^k = {θ_i^k} the parameter set for model k, an initial value θ_0^k is chosen. Then, the algorithm is carried out through the following steps.
• A parameter θ_{i*}^k is randomly chosen and updated following the rule θ_{i*}^k → θ_{i*}^k + ε_{i*}, where ε_{i*} is tuned by hand in order to obtain convergence of the procedure while avoiding a very slowly moving Markov chain. The update is kept only if it increases the likelihood, and the procedure stops when it is no longer possible to find a likelihood value greater than the previous one. To ensure the convergence of the algorithm, we produce a Markov chain composed of 100000 steps. For the SC model, the intensity of the non-stationary Poisson point process is given by Eq. (5). Then the Likelihood function for the Self-Correcting model can be written as in Eq. (12) with λ = λ_SC. To simplify the calculation, the Likelihood function is converted into the Log-Likelihood (LL) function LL(Y|α, β, ξ) = Σ_i ln λ(t_i) − ∫_{t_in}^{t_fin} λ(t) dt. In a very similar way, we can obtain the LL(Y|α, β, ξ) related to the SR model by replacing N[0, t_i) with S[0, t_i) in the definition of λ. The optimized parameters are listed in Tab. (I), whilst the MCMC dynamics for each parameter of the model is plotted in Figs. (2,3).
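The component-wise procedure above can be sketched as a greedy stochastic search; the fixed step budget and the Gaussian proposal widths are our simplifications:

```python
import numpy as np

def mcmc_maximize(ll_fun, theta0, step_sizes, n_steps=100000, seed=1):
    """Component-wise stochastic maximization of a log-likelihood.

    At each step one randomly chosen parameter is perturbed by a Gaussian
    move of hand-tuned width step_sizes[i]; the move is kept only if the
    log-likelihood increases (the greedy acceptance rule described above).
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    best = ll_fun(theta)
    for _ in range(n_steps):
        i = rng.integers(len(theta))
        trial = theta.copy()
        trial[i] += rng.normal(0.0, step_sizes[i])
        ll = ll_fun(trial)
        if ll > best:
            theta, best = trial, ll
    return theta, best
```

On a smooth unimodal surrogate, e.g. a negated quadratic, the chain converges to the maximizer within a few thousand steps.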

Triggering model
As for the SC and SR models, the Log-Likelihood function for the triggering model (TR1) of Eq. (7) is obtained by inserting λ_TR1 into the general form of Eq. (12).

FIG. 1: The numerical earthquake catalogues obtained with the AROFC model. (Top panel) Black circles indicate the seismic moment M_i of each event i occurring at time t_i. (Bottom panel) Red squares indicate the seismic moment M_i of each event i occurring at time t_i with the extra constraint M_i ≤ 1000.

Appendix B: Log-Likelihood estimation

1. SC and SR models

TABLE III: Initial values and standard deviations employed for the MCMC algorithm in all the different models (including the ETASLC2 model parameters).

For the Omori-Utsu triggering form, the Log-Likelihood function reads LL(Y|α, φ, c, θ) = Σ_i ln[e^α + Σ_{t_j<t_i} φ/(t_i − t_j + c)^θ] − ∫_{t_in}^{t_fin} [e^α + Σ_{t_j<t} φ/(t − t_j + c)^θ] dt (B4). The best parameter sets (α, φ, θ) and (α, φ, c, θ) compatible with the recorded data Y are the ones that maximize the likelihood. The optimized parameters are listed in Tab. (I).

FIG. 2: Parameter optimization for the Self-Correcting model. Red curves (a), (b) and (c) represent the evolution of the parameters α, β, ξ, respectively, during the MCMC algorithm for the catalogue with only M ≤ 1000. Black curves (d), (e) and (f) represent the evolution of the parameters during the MCMC algorithm for the whole numerical catalogue.

FIG. 3: Parameter optimization for the Stress Release model. Red curves (a), (b) and (c) represent the evolution of the parameters during the MCMC algorithm for the catalogue with only M ≤ 1000. Black curves (d), (e) and (f) represent the evolution of the parameters during the MCMC algorithm for the whole numerical catalogue.

TABLE I :
The parameters of the models used in this article, obtained by means of the Maximum Likelihood Estimation procedure.

TABLE II :
The parameters of the models used in this article, obtained by means of the Maximum Likelihood Estimation procedure for the numerical catalog with M ≤ 1000.
Initial values and standard deviations for the SC model parameters. Columns: Parameter, Initial Value, Std. Dev.