Rescaled earthquake recurrence time statistics: application to microrepeaters

Christian Goltz,1,2 Donald L. Turcotte,2 Sergey G. Abaimov,2 Robert M. Nadeau,3 Naoki Uchida4 and Toru Matsuzawa4 1Kiel University, Germany. E-mail: sgabaimov@gmail.com 2Department of Geology, UC Davis, Davis, CA, USA 3Berkeley Seismology Laboratory, UC Berkeley, Berkeley, CA, USA 4Research Center for Prediction of Earthquakes and Volcanic Eruptions, Graduate School of Science, Tohoku University, Sendai, Japan


I N T RO D U C T I O N
Because of their complexity, earthquakes satisfy several scaling laws with considerable robustness. The most famous of these is the Gutenberg-Richter frequency-magnitude scaling. It can be shown that this is equivalent to a fractal (power law) scaling between the number of earthquakes and the linear dimension of rupture. The cumulative number N c with rupture areas greater than A r scales as N c ∝ A −D/2 r , where the fractal dimension D ≈ 2. Another property of seismicity that has been shown to satisfy a universal scaling law is the statistical distribution of interoccurrence times. All earthquakes in a specified region and specified time window with magnitudes greater than m are considered to be point events. Based on studies of properties of California seismicity, a unified scaling law for the temporal distribution of recurrence times between successive earthquakes was proposed by Bak et al. (2002). Two distinct scaling regimes were found. For short times, aftershocks dominate the scaling properties of the distribution, decaying according to the modified Omori's law. For long times, an exponential scaling regime was found that can be associated with the occurrence of main shocks. To take into account the spatial heterogeneity of seismic activity, it has been argued that the second regime is not an exponential but another power law (Corral 2003). Shcherbakov et al. (2005) showed that the observed distribution of interoccurrence times can be well approximated by a distribution derived from assuming that aftershocks follow non-homogeneous Poisson statistics with the rate given by Omori's law.
The main focus of this paper is the scaling of recurrence times. It is important to distinguish between the statistical distribution of recurrence times between earthquakes on a specific fault or fault section, and the interoccurrence times between earthquakes on all of the faults in a region. This difference is clearly illustrated by the behaviour of the San Andreas fault in California. Quasi-periodic sequences of great or moderate earthquakes include great earthquakes on the northern section of the San Andreas fault (the 1906 San Francisco earthquake), the southern section of the San Andreas fault (the great 1857 earthquake) and moderate earthquakes (m ≈ 6) on the Parkfield section of the San Andreas fault in central California. Offsets on these faults are dominated by the largest earthquakes. These earthquakes are generally known as characteristic earthquakes. Detailed discussions of the relationship between recurrence times and characteristic earthquakes have been given by Turcotte et al. (2007) and Abaimov et al. (2007aAbaimov et al. ( ,b, 2008. Characteristic earthquakes are associated with quasi-periodicity, but can also have considerable variability. A measure of the variability of recurrence times on a fault or fault segment is the coefficient of variation C V (the ratio of the standard deviation σ to the mean μ). For strictly periodic earthquakes on a fault or fault segment, we would have σ = C V = 0. For a random (i.e. exponential with no memory) distribution of recurrence times, we would have C V = 1, (i.e. σ = μ). Ellsworth et al. (1999) analysed 37 series of recurrent earthquakes and suggested a provisional generic value of the coefficient of variation C V ≈ 0.5. A number of alternative distributions have been proposed for the purpose of specifying the statistics of recurrence times. These include the exponential (Poisson), the log-normal, Brownian passage-time (inverse Gaussian) and Weibull (stretched exponential) distributions (Davis et al. 1989;Sornette & Knopoff 1997;Ogata 1999;Matthews et al. 2002).
The statistical distribution of recurrence times of characteristic earthquakes is important in defining the seismic hazard (Working Group on California Earthquake Probabilities 2003). For example, the last characteristic earthquake on the northern section of the San Andreas fault in 1906 was responsible for the destruction of much of San Francisco. It has been almost 100 yr since that characteristic earthquake and the mean recurrence interval of these characteristic earthquakes is estimated to be about 200 yr. If these characteristic earthquakes are periodic, then the next earthquake can be expected in about 100 yr. If these characteristic earthquakes are Poissonian, that is, have an exponential distribution, then the future probability distribution would be exponential with a mean of about 200 yr independent of when the last earthquake occurred.
Ideally, observed sequences of earthquakes on a fault should be used to establish the applicable statistical distribution for recurrence times. However, the number of events in observed earthquake recurrence sequences is not sufficient to establish definitively the validity of a particular distribution (Savage 1994). In this paper, we introduce a rescaling technique in which sequences of interval times can be superimposed. The result is that larger numbers of data points can be used to test alternative statistical distributions. Our approach can be used for the Weibull and log-normal distributions but not for the Brownian passage-time distribution.
To test our approach we consider the recurrence time statistics of microrepeaters, small earthquakes (m ≈ 1-2) which recur at the same locations on a fault. We consider sequences that are not associated with aftershocks and find that the recurrence time statistics are stationary and have a similar behaviour to the recurrence time statistics of characteristic earthquakes on major faults. Schaff et al. (1998), Peng et al. (2005 and Peng & Ben-Zion (2006) have carried out such studies during aftershock sequences and find that the interval statistics are consistent with Omori's law.

A P P L I C A B L E D I S T R I B U T I O N S
One objective of this paper is to determine whether there is a preferred distribution for recurrence time intervals of microrepeaters. Two widely used statistical distributions for recurrence time intervals are the log-normal and Weibull. We will compare each of them with our data.
A primary objective of this paper is to superimpose recurrence times for large numbers of microrepeaters to test alternative statistical distributions. We will illustrate how this will be done for the Weibull and log-normal distributions.
The probability distribution function (pdf) for a Weibull distribution is given by (Patel et al. 1976) where β and τ are fitting parameters. The mean μ and the aperiodicity (coefficient of variation) C V of the Weibull distribution are given by where (x) is the gamma function of x. The corresponding cumulative distribution function (cdf) for the Weibull distribution is given by If β = 1 the Weibull distribution becomes the exponential (Poisson) distribution with σ = μ and C V = 1. In the limit β → +∞, the Weibull distribution becomes a δ-function with σ = C V = 0. In the range 0 < β < 1, the Weibull distribution is often referred to as the stretched exponential distribution. An important property of the Weibull distribution is the powerlaw behaviour of the hazard function The hazard function h(t 0 ) is the probability that an event will occur during an interval δt 0 at a time t 0 after the occurrence of the last event. For the Poisson distribution, β = 1, the hazard function is constant h(t 0 ) = τ −1 and there is no memory of the last event. For β >1, the hazard rate increases as a power of the time t 0 since the last event. We argue that the hazard rate must increase after a characteristic earthquake. A characteristic earthquake reduces the regional stress. Tectonic loading slowly increases the regional stress until the next characteristic earthquake occurs. Complexity introduces fluctuations in stress but the risk of the next earthquake must increase with time. As a specific example consider 1906 San Francisco earthquake. The plate motion is loading the San Andreas fault resulting in an increased probability of the next great San Francisco earthquake.
Our rescaled combination approach for the Weibull correlations is carried out as follows. We consider m sets of recurrence times for m sets of microrepeater earthquakes. For the jth set, we have n j recurrence times t ij with i = 1, 2, . . . , n j . We order these times from the shortest to the longest and construct the cumulative distribution of the t ij . We then obtain the best fit of the cumulative Weibull distribution from eq. (4) to these values using a maximum likelihood test and obtain τ j and β j . Using these values we have a renormalized set of recurrence times given by (t i j /τ j ) β j . This is repeated for the m sets of microrepeaters.
The values of the rescaled times are next ordered from smallest to largest for all sets of microrepeaters. The cumulative distribution of these values is then obtained. If the distributions are well approximated by the Weibull then there will be a good fit to the exponential relation given in eq. (4).
The pdf for a log-normal distribution of recurrence times t is given by (Patel et al. 1976) The log-normal distribution can be transformed into a normal distribution by making the substitution y = ln t; μ y and σ y are the mean and standard deviation of this equivalent normal distribution. The mean μ, standard deviation σ and aperiodicity (coefficient of variation) C V for the log-normal distribution are given by The corresponding cdf P(t) (fraction of recurrence times that are shorter than t) for the log-normal distribution is where erf (x) = 2 √ π x 0 e −y 2 dy is the error function. Our rescaled combination approach for the log-normal correlations is very similar to the approach used for the Weibull correlations. Again we consider the m sets of recurrence times for m sets microrepeater earthquakes. The recurrence times t ij are ordered for each j set and the cumulative distribution is plotted. We then obtain the best fit of the cumulative log-normal distribution from eq. (8) to the distribution by using a maximum likelihood test and we find μ yj and σ yj . Using these values we have a rescaled set of recurrence times given by (ln t − μ yj )/2 1/2 σ yj . This is repeated for each of the m sets of microrepeater earthquakes.
The values of the renormalized times are next ordered from smallest to largest for all sets of microrepeaters. The cumulative distribution of these values is then obtained. If the distribution is well represented by the log-normal distribution then there will be a good fit to the error function relation given in eq. (8).
Although the Weibull and log-normal distributions are the most widely used distributions for recurrence time statistics, the Brownian passage-time distribution has recently been used (Matthews et al. 2002). We do not consider this distribution in this paper because it is not possible to rescale it as we have done for the Weibull and log-normal distributions. The reason for this is that the mean μ enters the Brownian passage-time distribution both as an additive and multiplicative factor. In the Weibull and lognormal distributions, the fitting parameters are either multiplicative or additive, they are not both. Thus it is not possible to rescale the Brownian passage-time distribution as we have done for the Weibull and log-normal distributions.

C A L I F O R N I A M I C RO R E P E AT E R E A RT H Q UA K E S
Repeating sequences of small earthquakes in central California have been widely studied. Some 300 clusters of these repeating earthquakes on the San Andreas fault have been recognized (Nadeau et al. 1995;Schaff et al. 1998;Nadeau & McEvilly 1999, 2004. Some clusters contain as many as 20 similar regularly occurring 'characteristic' micro-earthquakes. Each sequence has a common hypocentre and very similar waveforms. They appear to be the result of periodic ruptures of an asperity on a creeping section of a fault. Time intervals between characteristic micro-earthquakes range from months to years with the intervals scaling with earthquake magnitude (Nadeau & Johnson 1998;Chen et al. 2007). For similar repeaters on the Calaveras fault, Vidale et al. (1994) and Marone et al. (1995) correlated time intervals with rupture duration.
In some cases, the microrepeaters occur during aftershock sequences and rates of occurrence scale with Omori's law (Schaff et al. 1998;Peng et al. 2005;Peng & Ben-Zion 2006). In this paper, we will concentrate our attention on sequences that exhibit stationarity. We eliminate sequences that exhibit systematic trends in the recurrence times. We will also only consider sequences that include at least 10 intervals to obtain significant statistics for the distribution of recurrence times.
We will consider sequences on the creeping section of the San Andreas fault in central California. These sequences have been previously studied by Nadeau et al. (1995), Nadeau & Johnson (1998) and Nadeau & McEvilly (2004). We have studied in detail 28 sequences. The locations of these sequences are given in Fig. 1. Each sequence contained at least 11 events and covered about 20 yr, the earliest event was in 1984 and the most recent in 2005.
The majority of locations are on the creeping section north of the Parkfield segment with only three locations on the locked Parkfield segment. The numbering scheme seen in Fig. 1 will be used to discuss individual sequences below. Some sequences, for example, 21 and 24, show a shortening of waiting times towards the end, that is, acceleration in the failure process. There is also evidence of deceleration and missing data in sequence 12.
We estimate the number of missing characteristic events to be on the order of 5 per cent in general. The reason is that both catalogue entries and waveforms from the Northern California Seismic Network (NCSN) are required to identify members of microrepeater sequences (characteristic sequences, CSs), however, the procedure is not yet optimized for the smaller microrepeating events. Catalogue entries are generally available before their corresponding waveforms. The waveforms are available only after an analyst reviews the data and finalizes the event, and in some cases, waveforms are not made available even after review due to the small magnitudes of the events considered. However, as with any seismological network, when large event aftershock sequences occur, small events are missed due to network saturation and in addition the review and finalization process falls behind. For the data sets considered here, we therefore expect about 15-20 per cent missing events during the first month after the 2003 December 22 San Simeon M = 6.5 earthquake for example.
Only six sequences could be considered to be stationary without any data manipulation. These are marked in red in Fig. 1. No satisfactory fits of any distribution can be obtained for recurrence time data of non-stationary sequences. In the following, we will concern ourselves with stationary data only.
Properties of the six sequences we consider are given in Table 1. Included in this table are the set number, the number of events n, mean magnitudem, mean recurrence time μ, coefficient of variation C V and the Weibull parameters τ and β for each sequence. The number of events range from n = 12-14, the magnitudes range  Note: included are the set number, number of events n, mean earthquake magnitudem, mean recurrence time μ, coefficient of variation C V and the Weibull parameters τ and β. fromm = 1.36 − 2.22, the mean recurrence times range from μ = 500-655 days, and the coefficients of variation range from C V = 0.20-0.38. The mean of the coefficients of variation is C V = 0.278.
As a typical example we illustrate set 9 in Fig. 2. The cdf P(t) of the 13 recurrence times t is given. Also included is the best-fit Weibull distribution from eq. (4). The fitting parameters are τ = 572 days and β = 2.91. The values of (t/τ ) β are then obtained for the 13 recurrence times. For example, the shortest recurrence time is t = 252 days so that (t/τ ) β = 0.092. When this process is carried out for the six sets we have 80 values of (t/τ ) β . The cdf P([t/τ ] β ) of these values of (t/τ ) β is given in Fig. 3. Based on the applicability of the eq. (4) the fit of these data to the exponential distribution is also given in Fig. 3. The log likelihood of this fit is −74.0.
We next consider the fit to a log-normal distribution. To illustrate our approach we again consider set 9 illustrated in Fig. 2. From Table 1 we have μ = 509.92 days and σ = 195.31 d. From eq. (7) we find that μ y = 6.166 and σ y = 0.370 with μ and σ in days. The values of [(ln t-μ y )/2 1/2 σ y ] are then obtained for the recurrence times. For example, the shortest recurrence time is t = 252 days so that [(ln t-μ y )/2 1/2 σ y ] = −0.741. When this process is carried  out for the six sets we have 80 values of [(ln t-μ y )/2 1/2 σ y ]. The cdf P([(ln t-μ y )/2 1/2 σ y ]) of these values is given in Fig. 4. Based on the applicability of eq. (8) the best fit of the data to the error function distribution is also given in Fig. 4. The log likelihood of this fit is −76.2. This value is very close to the value −74.0 for the Weibull distribution so that it is not possible to clearly prefer one distribution over the other.

M I C RO R E P E AT E R E A RT H Q UA K E S I N T H E N O RT H E A S T E R N JA PA N S U B D U C T I O N Z O N E
Repeating sequences of small earthquakes in the northwestern subduction zone have also been widely studied (Matsuzawa et al. 2002;Igarashi et al. 2003;Uchida et al. 2003Uchida et al. , 2006. More than 300 Figure 4. The cdf P[(ln t-μ y )/2 1/2 σ y ] of the 80 values of (ln t-μ y )/2 1/2 σ y for the six Parkfield data sets that we have considered. Also included is the expected fit to the error function dependence given in eq. (8) for a log-normal distribution.
clusters of these repeating earthquakes have been recognized. It has been hypothesized that these earthquakes are caused by slips on small asperities surrounded by aseismic slip on the plate boundary. We have applied the same approach described for California to select 27 data sets that exhibit steady-state behaviour.
Properties of 27 data sets we consider are given in Table 2. Included in this table are the set number, the number of events n, the mean magnitudem, mean recurrence time μ and the coefficient of variation for each sequence. The number of events ranges from n = 10-28, the mean magnitudes range fromm = 2.30−3.54, the mean recurrence times range from μ = 276 days to 825 days and the coefficients of variation range from C V = 0.362-1.17. The mean of the coefficients of variation is C V = 0.564. It is interesting to note that this is more than twice as large as the value for the California data. The variability of the recurrence times on the strike-slip fault is much smaller than the subduction zone fault. The difference may be attributed to the greater complexity of the subduction zone microrepeater earthquakes relative to the near planar strike-slip behaviour of the California microrepeater earthquakes.
As a typical example we illustrate set 294 in Fig. 5. The cdf P(t) of the 23 recurrence time t is given. Also included is the best-fit Weibull distribution from eq. (4). The fitting parameters are τ = 370 days and β = 3.89. The values of (t/τ ) β are then obtained for the 23 recurrence times. When this process is carried out for the 27 data sets we have 405 values of (t/τ ) β . The cdf P([t/τ ] β ) of these values of (t/τ ) β is given in Fig. 6. Based on the applicability of the eq. (4) the fit of these data to the exponential distribution is also given in Fig. 6. The log likelihood of this fit is −377.
We next consider the fit to a log-normal distribution. For set 294 illustrated in Fig. 5, we have μ = 334 days and σ = 102 days from Table 2. From eq. (7) we find that μ y = 5.77 and σ y = 0.297 with μ and σ in days. The values of [(ln t-μ y )/2 1/2 σ y ] are then obtained for the 23 recurrence times in this set. When this process is carried out for the 27 data sets we have 405 values of [(ln t-μ y )/2 1/2 σ y ]. The cdf P([(ln t-μ y )/2 1/2 σ y ]) of these values is given in Fig. 7. Based on the applicability of eq. (8) the best fit of the data to the error function distribution is also given in Fig. 7. The log likelihood of this fit is −390. Again this value for the fit of the log-normal distribution is close to the value −377 for the Weibull distribution so again it is not possible to definitely prefer one distribution over the other.

D I S C U S S I O N
The purpose of this paper has been to consider two aspects of recurrence time statistics of characteristic earthquakes on a fault or fault segment. These are (1) the application of a rescaling technique and (2) the behaviour of the recurrence time statistics of microrepeater earthquakes. Recurrence time statistics of characteristic earthquakes play an important role in seismic hazard assessment. But the number of well-established sequences of characteristic earthquakes is small and the number of earthquakes in each sequence is also small. Thus it has been difficult to establish the validity of alternative statistical distributions. In this paper, we introduce a rescaling technique in which sequences of interval times can be superimposed. The result is that larger numbers of data points can be used to test alternative statistical distributions. Our approach can be used for the Weibull and log-normal distributions but not for the Brownian passage-time distribution. The reason for this is that the two fitting parameters in the Weibull and log-normal distributions are either additive or multiplicative but not both. For the Weibull distribution we rescale utilizing the rescaled times t i jrs = (t i j /τ j ) β j Figure 7. The cdf P[(ln t-μ y )/2 1/2 σ y ] of the 405 values of (ln t-μ y )/2 1/2 σ y for the 27 Tohoku data sets that we have considered. Also included is the expected fit to the error function dependence given in eq. (8) for a log-normal distribution.
where t ij are the i interval times of data set j and τ j and β j are the Weibull fitting parameters for set j. For the log-normal distribution we rescale utilizing the rescaled times t ijrs = (ln t ij − μ yj )/2 1/2 σ yj , where μ yj and σ yj are the log-normal fitting parameters for data set j. For the Brownian passage-time distribution the fitting parameter μ additive and multiplicative (see eq. 12 in Matthews et al. 2002) so that rescaling is not possible. As a test of our rescaling approach we consider six stationary sequences of microrepeater earthquakes on the San Andreas fault in central California and 27 stationary sequences of microrepeater earthquakes in northeastern Japan.
We consider the two sets separately and introduce our rescaled technique for both the Weibull and the log-normal distribution. For each sequence we obtain the best-fit fitting parameters for the two distributions. This constitutes a rescaling and the six (27) sequences are then combined to form a single statistical set each. For the Weibull distribution the cdf of the values of (t/τ ) β is expected to fit an exponential distribution. These fits are given in Fig. 3 for the California data and in Fig. 6 for the Japanese data. For the lognormal distribution the cdf of the values of [(ln t-μ y )/2 1/2 σ y ] is expected to fit an error function distribution. These fits are given in Fig. 4 for the California data and in Fig. 7 for the Japanese data. The log-likelihood fits for the four data sets are: California Weibull −74.0 and log-normal −76.2, Japan Weibull −377 and log-normal −390. Although the fits are quite good it is not possible to clearly establish a preference for one of the two distributions based on these data sets. This is despite analysis of data sets of unprecedented lengths obtained from our novel method of rescaled combination. The main reason is the absence of extremely long waiting times, that is, the lack of samples from the tails of the distributions. The tails of the two distributions differ while they are very similar otherwise. It would be interesting to apply rescaled combination to other timeseries such as volcanic eruption sequences.
It is also of interest to compare the statistical behaviour of the recurrence times of the microrepeater earthquakes with related phenomena. We have shown that the distributions of recurrence times for our two sets of data can be well approximated by either the Weibull or the log-normal distribution. This is also the case for characteristic earthquakes under a variety of circumstances (Abaimov et al. 2007a(Abaimov et al. ,b, 2008Turcotte et al. 2007). A measure of the variability of recurrence times is the coefficient of variation. For the six sequences we considered for the San Andreas fault we found that the mean value of the coefficient of variation is C V = 0.278. The coefficient of variation for the seven characteristic earthquakes that have occurred on the Parkfield section of the San Andreas fault is C V = 0.378 (Abaimov et al. 2008). These values tend to be somewhat lower than the usual value near C V = 0.5 (Ellsworth et al. 1999). Abaimov et al. (2007a) studied the recurrence time statistics of slip events on the creeping section of the San Andreas fault in central California. The slip events typically had magnitudes of a few millimetres and were aseismic. The distributions were shown to fit Weibull statistics to a good approximation. The 67 events recorded at 10min intervals at the Cienega Winery creepmeter had a coefficient of variation C V = 0.554. Clearly the values for the creepmeter slip events are higher than the values for the microrepeater earthquake sequences. For the 27 sequences of microrepeater earthquakes we studied from northwestern Japan the mean value of the coefficient of variation is C V = 0.564. This is a value that is typical of sequences of characteristic earthquakes.

A C K N O W L E D G M E N T S
CG is funded through a EU Marie Curie Outgoing International Fellowship and currently hosted by U. C. Davis. RMN was funded through National Science Foundation Grants EAR-0337308, EAR-0544730 and EAR-0510108.