The strongest aftershock in seismic models of epidemic type

We consider an epidemic-type aftershock model (ETAS($F$)) for a large class of distributions $F$ determining the number of direct aftershocks. This class includes Poisson, Geometric, Negative Binomial distributions and many other. Assuming an exponential form of the productivity and magnitude laws, we find a limiting distribution of the strongest aftershock magnitude $m_a$ when the initial cluster event $m_0$ is large. The regime can be either subcritical or critical; the initial event can be dominant in size or not. In the subcritical regime, the mode of the limiting distribution is determined by the parameters of productivity and the magnitude laws; the shape of this distribution is not universal and is effectively determined by $F$. For example, the Geometric $F$-distribution generates the logistic law, and the Poisson distribution (studied earlier) generates the Gumbel type 1 law. The accuracy of these laws for moderate initial magnitudes is tested numerically. The limit distribution of the Bath's difference $m_0$-$m_a$ is independent of the initial event size only if the regime is critical, and the ratio of exponents in the laws of magnitude and productivity is contained in the interval $(1,2)$. Previous studies of the $m_a$-distribution have dealt with the traditional Poisson $F$ model and with arbitrary (not necessarily dominant) initial magnitude $m_0$.

1 Introduction 1.1 ETAS model.The epidemic-type aftershock sequence model (ETAS) is a standard tool for assessing and analyzing seismicity [e.g., Ogata, 1988;Ogata and Zhuang, 2006;Nandan et al, 2021].Traditionally, the model is defined by means of the conditional intensity of an event at a point of the phase space S= (time, location, magnitude), given the past of the seismic process up to the current time.The model in this form is convenient for modeling and forecasting, but is not flexible enough to describe direct aftershocks [Shebalin et al, 2020].This shortcoming can be easily overcome by describing a cluster of seismic events in S as a Galton-Watson tree [Molchan et al, 2022].In the future, the coordinates (time, place) are irrelevant.Therefore, we define the model in projection to the magnitude.
The initial event of magnitude m generates a random number ) (m


of direct aftershocks according to the off-spring law F: Each of these events is assigned an independent magnitude in accordance with the law Each event of the first generation independently generates new ones, following the described rule for its parent, etc.The process continues indefinitely.It stops with probability 1 if the criticality index [Harris, 1963], where is the productivity of the m-event.The values 1  n and 1  n correspond to the subcritical and critical regimes, respectively.
We get the standard ETAS model if F has a Poisson distribution (P), [e.g., Baró, 2020], that is Shebalin et al [2020] found that the Geometric distribution, F=G: ) agrees better with the empirical data.
Accepting this fact, the more general ETAS(F) models deserve attention.The choice of model F is not straightforward because of the difficulty of identifying direct aftershocks.The practical choice of model can be influenced by both the goals and the ease of modeling.A natural step in this situation is to expand the verification of the ability of models to reproduce various properties of real seismicity.For this purpose, we chose Båth's law, the analysis of which within the framework of ETAS(F) models is of independent interest.
Zaliapin and Ben-Zion [2013] showed the spatial variability cluster characteristics on the scale of tens of kilometers with respect to heat flow and other properties determining the effective viscosity of the region.This may explain the bimodality of the m  -distribution for circum-Pacific earthquakes (M7-8.4) with modes 1.2 and 1.8, [Tsapanos, 1990] , and for Taiwan earthquakes (Ms > 5) with modes 0.3 and 0.9 [Chen, Wang, 2012].An addition difficulty for Båth's law is the fact that substantial part (~20%) of global main shocks or main shocks in Japan turn out to be a part of doublets of earthquakes with difference not more than 0.4 units of magnitude and anomalously close in space-time (Kagan and Jackson, 1999;Grimm et al., 2023).
The theoretical analysis of the a m -distribution for clusters in the traditional ETAS(P) model was undertaken in [Saichev, Sornette ,2005;Zhuang, Ogata, 2006;Vere-Jones, 2008; Vere-Jones, Zhuang,2008;Luo & Zhuang,2016].The analysis refers to the case when the initial event 0 m is arbitrary (not necessarily dominant) and large, 1 0  m .In this case the limit a m -distribution is doubly exponential, which is typical for the theory of extreme values.
1.3 RT property.Below we will significantly expand the theoretical analysis by considering a special class of distributions F having the so-called Random Thinning (RT) property [Renyi, 1956;Molchan et al, 2022].Roughly speaking, the RT property assumes that the off-spring distribution after a random loss of a given proportion p of direct aftershocks changes only its average value  to  ) 1 ( p  , but does not change its type, say P or G defined above.
The choice of F distributions with RT property is quite natural in the face of unavoidable errors in identifying direct aftershocks.The Generating function of such distribution has the following representation The case 1   corresponds to the Geometric distribution, while the limiting case , corresponds to the Poisson distribution.The class of distributions with RT property is quite wide.For example, the following operations on RT- , where allow to obtain new types of distributions with RT property [Molchan et al, 2022].

1.4
The problem.Båth's law can be interpreted in different ways, depending on the definition of the main event.It can be the first and strongest (in the future dominant) event or only strong enough relative to subsequent events.Therefore, we will consider clusters with the initial magnitude 0 m of both types when 0 m is dominant or not.Our task is to find in these cases the distribution of the maximum aftershock magnitude under the condition 1 0  m .The case with the dominant initial magnitude refers to the original interpretation of Båth's law.
The main result and its discussion can be found in Sections 2.2 and 2.3.The actual accuracy of the limit laws and the difficulty of choosing a model F are briefly discussed in Sections 2.4, 2.5.
All auxiliary mathematical statements and proofs are placed in the Appendix.

The main equation.
We will consider clusters in which the number of all events, including the initial one, is at least two, i.e. the initial event 0 m and the strongest aftershock a m are different as events.
Statement 1. a) Consider AM-cluster in the ETAS(F) model with the generating function


is the inverse of the following function: is given by the equation : Remark.In the case F=      0 ), ( NB function ( 9) is explicitly reversible as follows where The ) ( 0 m DM cluster in the ETAS(NB( ))-model can be considered as an AM-cluster in the same type of model but with new where Let a m be a maximal magnitude in the AM/DM cluster and n is the criticality index.
A. Under 1 0  m condition, the following regression relationships are valid: The random component in (17-19) has the limit distribution regardless of the cluster type DM or AM.
For x>0 , relation (20) has the following meaning : , defined by the following equation B1.For AM-cluster, where the random component has the limit distribution (20) , in this case we will have B2.For DM cluster, the limit a m distribution is where

The critical points
2) Non-random of a m -regression component.
In the subcritical regime, 1  n , the non-random component in ( 17) can be represented as follows is the average size of the cluster with the initial 0 m -event (not necessarily dominant).As (Helmstetter, Sornette, 2003) noted, where That is the average number of cluster events well predicts the mode in the a m distribution.However, in the critical regime (n=1) or    this property disappears.
The universality (relative to F) of the a m component in sub-critical regime is due to the fact that the average cluster size with initial magnitude 0 m is determined by the laws of magnitude and productivity common to all F-models considered.
3) The random component of the a m -regression.
The distribution of the random component of the a m regression is not universal and is determined by the off-spring F law.In the case F=NB ) ( , the limit distribution ( 20) is known as the generalized Logistic distribution [Gupta , Kundu,2010] : with the mode at x=0.The standard Logistic distribution corresponds to the case 1   , i.e. to the Geometric distribution, F=G.In the Poisson case (  


) we have the Gumbel type- Johnson et al, 1995).Its truncated analogue was used in (Grimm et al, 2023), although the authors proceeded from the hypothesis that the off- .This is true for the case related to the Negative-Binomial distribution of the direct aftershocks.
Therefore, the non-random component of a m in the regression relations (17)(18)(19)23) for ETAS(F=NB( )) models can be interpreted as the mode of the a m distribution.
Despite the diversity of the random components, in the subcritical regime, the distributions of   / have a universal exponential behavior at large values: However, the left tail of this distribution loses versatility.In the model with the Negative Binomial distribution, F=NB ) ( , the left  -tail can be both stronger This fact can be useful to test F.

4) DM and AM clusters.
The probabilistic structure of a cluster with a dominant initial event is also described by the DM and AM clusters become close.Therefore, it is quite expected that the limit distributions of the maximum aftershock in both cases will be the same.These intuitive considerations proved themselves in the subcritical regime and even in the critical one, if . Some simplification is possible for small , the a m distribution looks as follows:

5) The Båth's law
The mode of the Båth's gap and logarithmically, Felzer et al.(2004) considered the    condition as necessary for the independence of the Båth's gap from the magnitude of the main event.If we do not neglect the logarithmic dependence of m  ˆ on 0 m , then condition    is not sufficient.Moreover, this condition is not necessary for all our models.The limit distribution of the Båth's difference 0 ma m is independent of the initial event size only if the regime is critical, 1  n , and . This fact is unexpected.The necessity of condition n=1 was previously noted in [Saichev, Sornette, 2005].

Numerical testing
The theoretical analysis of the a m -distribution is obtained for .The method for calculating the exact a m -distributions is described in Statement 1.
Numerical parameters.The initial magnitude 0 m varied in the range 2-6, assuming that the lower magnitude threshold is taken as 0. The parameters of the productivity and magnitude laws are similar to those used by Helmstetter and Sornette [2003] for California: Figure 1 shows the


-distribution densities for two F-models (P/G) and two cluster types (AM/DM).In the limiting case, the m  -distribution is symmetric for the case F=G and asymmetric for F=P, and the left tail of the distribution is exponential.These features can also be clearly seen in the prelimit distributions, especially in the case of AM clusters.
The limit a m -distribution, , is determined by the noise limit distribution in the regression (17): we get the linear relation for all options under consideration are shown in Figure 2. In doing so, we used the ) (M   distribution variants (see (A20,A34)), which takes into account that the cluster size is at least 2. In this case the support of In Figure 2, the differences between models F= G and F= P disappear in the region of central values of distributions and the expected linearity of regressions (32) are well reproduced.The regression parameters themselves are close for the AM and DМ clusters, but may differ from the limit values (see Table 1).All this suggests that the result about the noise component in the limit a m -distribution is quite acceptable for the real application., AM/DM cluster type, and off-spring distribution F: Poisson(P) and Geometric (G).In the limit case, . The prelimit (A,C) coefficients are shown in the figures.To give a definition, let us consider a chain of events connected by causal relationships.Such a chain begins with the cluster initial event, each subsequent event in the chain is a direct aftershock of the previous one, and the last event has no aftershocks.The number of the chain events determines its 'depth', d, and their empirical average value over all chains in the cluster is the statistics <d>.Next, we will assume that the depth refers to the conditional situation when the cluster contains at least two events.

AM clusters DM clusters
Obviously, <d> is closely related to the 0-probability controls the chain condition.For ETAS(F) models (see Appendix 5) where n is the criticality index.For all the difficulties of empirically estimating <d>, analysis of the California data [Zaliapin,Ben-Zion,2013] showed that the range of <d> is much wider in real catalogue than in the synthetic one, based on ETAS(P) model; namely, (1,25) vs. (1,5).The authors concluded that ETAS models do not fully describe the mechanism of cluster formation.At the same time, the results of [Zaliapin, Ben-Zion,2013]  if we change its 0-probability as follows: This operation will preserve the shape of F for k>0 and increase the average cluster depth and the new criticality index does not exceed 1: , and n<1.
The most of the conclusions about the limit distribution of a m in the ETAS( .The proof literally repeats the arguments for the case F.

Conclusion
Within the ETAS(F) model with regular the magnitude and productivity characteristics, we obtained limit distributions for the strongest aftershock a m of the initial event 1 0  m .The problem was solved depending on the type of criticality of the regime, the dominance of the initial event size, and the off-spring distribution F.
In the subcritical regime, the mode of the limiting distribution is determined by the parameters of productivity and the magnitude law; the shape of this distribution is not universal and is effectively determined by F.
The limit distribution of the Båth's gap is independent of the initial event only if the regime is critical, and the ratio of exponents in the laws of productivity and magnitude,   / , is contained in the interval (1/2, 1).We have identified a number of universal properties of the limit m  -distribution; in particular, it's the mode and the exponential tail of Staying within the ETAS(F) model, the choice between Poisson and Geometric F-distributions remains controversial.This problem requires an analysis of the zero probability Assume that for any is an increasing function of Similarly, , since the product under the expectation sign is non-negative at integer values of ) (


. As above, we can conclude, that 0 ) ( .The existence of derivatives 2 , 1 ), 0 ( ensures their continuity at point 0. We have 0 Let us estimate the function Next, and .The lemma is proven.

Appendix 2.
Statement 4. Consider ETAS(F) model with magnitude and productivity characteristics be the generating function for direct aftershocks of an m-event.
Then ) ( 0 m DM cluster in the ETAS(F) model can be viewed as ) ( 0 m AM cluster in the ETAS( F ˆ) model such that the magnitude law has the form and the off-spring law F ˆ is contiguous with F: The generating function looks as follows In the case F= )) ( NB , any ) ( 0 m DM cluster in the ETAS(F) model can be viewed as an cluster of the same model, but with the new magnitude law (A1) and the productivity law of any initial m -event realized under the condition Therefore the conditional probability density of the direct aftershocks is are given by ( A1) and (A2) .Hence In the case F= NB( )  this relation looks as follows Hence, the F law all ( direct and indirect) off-springs of the initial event  have a magnitude <M .
We will find the equation for the distribution of the strongest aftershock a m in terms of and As a result, we will have , where 0 m is the initial cluster event.
events of 1-st generation.Then probability density where by denoting } {... m we fix the magnitude m of the initial event. Since By definition, the number of events in a cluster is greater than 1.Therefore, considering the event for the initial event 0 m , we must take into account that 0 m has a positive number of events of first generation, that is Integrating (A5) by m with weight ) ( 1 m f , we get the second equation: , (A6) is converted into an equation with respect to the function ,where ) ( M is the  - quantile of the a m -distribution.

AM(
, we can simplify some elements of the ETAS(F) model: -the criticality index Consider the main equation , the equation can be represented as follows


is defined by the equation According to Statement 3, the ) (w F  function increases on the semi-axis w<0, and therefore the inverse function Substituting (A13) into (A11), we have where In the case    , direct integration yields The left-hand part (l.p.) in (A14) can be found explicitly: Using the resulting estimates (A14-A18), we can find an approximate solution to equation (A11, A12).

4.1a The case
in (A17, A18), where x is fixed, we have In other words From (A20) follows a more accurate approximation The limitation M from below here is not surprising, since the approximation (A21) is obtained for the central (peak) values of a m .

4.1b The case
In this case , and therefore By setting and using ( A17), we have , we get again but with a new normalization of a m , namely, where the random component  has the ) ( x F e    distribution. To obtain the analogue of (A20) we have to replace n in this formula with Let's introduce the notation , x<0 and changing the variable under the integral: where we replaced , because the integral summand in (A25) is of order 2 V . Since , we have by inversion where the random component  has the limit distribution ) ( x F e    .
As above, we can get a more accurate approximation )) ( / ln( 0 In the case under consideration, ) /( 1 . Therefore the left-hand part of ( A14) is The right-hand part of (A14) is The o(1) estimate follows from (A29).The continuity of F    at zero implies that ) From equality (A28) and (A30) it follows that After integration we will have the following: As above, we can conclude that ) where the random component  has the ) ( x F e    distribution.
According to Statement1 the DM( ) , 0 F m cluster can be considered as an ETAS( ) F model, the main characteristics of which are related to the original ones as follows , both characteristics converge asymptotically to the characteristics of ETAS( ) F .Therefore, it is natural to expect that the limit a m -distributions for AM and DM clusters will be the same if the peak value of a m in the AM cluster is much smaller than 0 m .The method of proof remains the same, though more cumbersome.Therefore, only its main steps are given below.
The main equation.Using (A32), (A33) and repeating the previous way, we can find the main equation in a form similar to (A14,A15):

4.2a,b The cases:
where c is specified in (A19) and (A22) .Since 0 m M  , the asymptotics of (A14) and (A34), are the same, i.e. (l.p.A34)= ) Let's estimate (A35).Under conditions (A.37) and 0 . Here we used the relation . Hence, the integrand term in square brackets in (A35) admits the estimate Consequently, the limit distributions of a m for AM and DM clusters coincide.In particular, in the subcritical regime Using the notation (A38, A39), we represent (A35) as follows: Finally, combining (A41) with (A42-A44), we get In the notation (A25), we have: .For small y , y y   ) (  and therefore for small The proof is complete.
This case both by proof and by result does not differ from the analogous case for the AM cluster To confirm this, it is enough to compare (A34, A35) with (A28, A30) given Dividing by 0 m and going to the limit, we obtain the same relation as for AM cluster: Statement 3 in Appendix 1.An important example of a distribution with the RT property is the Negative Binomial distribution F= ) ( NB .Its type depends on the  -parameter since in which the normalization of a m and mode of the limit a m -distribution are continuous across the parameters in the problem.The critical values of the initial event  is random and has a non-truncated exponential distribution: as  /increases, the first and second moments of this statistic become infinite, starting at 1 . Therefore, for x<0, the scatter of the noise component for F=G and F=P is visually different, because it is roughly described by tails of the form )

Figure 1 .
Figure 1.Distribution density of the Båth's magnitude difference

Figure 2 .
Figure 2. Noise component test, depending of the initial magnitude 6 2 0   m, AM/DM cluster type, and off-spring distribution F: Poisson(P) and Geometric (G).In the limit case, ) 1 /( ln ) ( The universal nature of a number of properties of the limiting distribution of the strongest aftershock makes it difficult to choose the most preferable ETAS(F) model in relation to Båth's law.Above we noted a practically useful but not strong enough feature to distinguish F=P model from F=G: this is the symmetry and asymmetry of the a m distribution for F=G and F=P, respectively.Therefore, let us turn to an important characteristic, <d>, of an epidemic-type seismic model introduced by Zaliapin and Ben-Zion [2013], which we will call average cluster depth (instead of average leaf depth).
do not exclude the admissibility of the ETAS(P) model to describe seismicity in regions with low heat flow.Thus, the statistic <d> testifies in favor of choosing the traditional Poisson model from the family with F= NB( )) while the data, related of the off-spring distribution F, testifies to the opposite.This means that the )

m
distribution to the left of the m E value.For the practical choice of model F, the contrast of the scatter of the data in the area of large and small values of m  is important.For example, in the Poisson model, the probability of large m  values sharply decreases and the m  -distribution is asymmetric, whereas for the geometric model we have symmetry of the m  -distribution and the exponential tails.Numerical testing of the noise component of the a m distribution has shown that it is possible to transfer the conclusions for large initial events 0

)
, A15) in the new notation has the form:


is non-negative (seeStatement 1 )  and therefore) (V  is an increasing function of V .It is clear, that ) (V increases from 0 to infinity, which guarantees the solvability and uniqueness of equation (A25) for any 0 0   .Moreover, the non-negativity of the integral term in (A25) entails magnitudes, we obtain the distribution of the chain depth : the empirical average of d over all cluster chains, then Ed d inequality, we can refine the estimate (A52) for the traditional ETAS model

Table 1
. (A,C) coefficients from Fig.2and their limit values The upper bound in (33) is sharp estimate of Ed for the whole