ABSTRACT

Computing the agreement between 2 continuous sequences is of great interest in statistics when comparing 2 instruments or one instrument with a gold standard. The probability of agreement quantifies the similarity between 2 variables of interest, and it is useful for determining what constitutes a practically important difference. In this article, we introduce a generalization of the PA for the treatment of spatial variables. Our proposal makes the PA dependent on the spatial lag. We establish the conditions for which the PA decays as a function of the distance lag for isotropic stationary and nonstationary spatial processes. Estimation is addressed through a first-order approximation that guarantees the asymptotic normality of the sample version of the PA. The sensitivity of the PA with respect to the covariance parameters is studied for finite sample size. The new method is described and illustrated with real data involving autumnal changes in the green chromatic coordinate (Gcc), an index of “greenness” that captures the phenological stage of tree leaves, is associated with carbon flux from ecosystems, and is estimated from repeated images of forest canopies.

1 INTRODUCTION

The comparison of 2 sequences is a fundamental problem in several scientific disciplines, and it can be addressed in many different ways. For example, the Student’s t-test, the correlation coefficient, and the Wilcoxon 2-sample rank test are 3 techniques that, under different assumptions, provide information about some aspects of comparisons between 2 independent populations (Lin et al., 2012). When the goal is to measure the agreement between 2 variables to validate an assay, a process, or a newly developed instrument, it may be relevant to evaluate whether its performance is concordant with other existing ones or a “gold standard” (Lin et al., 2002).

The probability of agreement (PA) measures the level of agreement between 2 continuous sequences. It was first introduced in a series of papers by Stevens et al. (Stevens and Anderson-Cook, 2017; Stevens et al., 2017; 2018, 2020; Stevens and Lu, 2020) as an alternative approach to some existing agreement measures, including the concordance correlation coefficient between measurements generated by 2 different methods (Lin, 1989) and its extensions. Subsequently, Leal et al. (2019) studied the PA in a context of local influence and de Castro and Galea (2021) developed Bayesian PA methods to compare measurement systems with either homoscedastic or heteroscedastic measurement errors. We note that the literature on PA is extensive and many relevant uses for it in different scenarios have been published. However, in all cases, the general definition is always the probability that the difference between 2 statistical quantities is negligible and not statistically distinguishable. A salient example is by Ponnet et al. (2021), who adapted the probability of concordance or C-index to the specific needs of a discrete frequency and severity model that is typically used during the technical pricing of a nonlife insurance product.

In this paper, we generalize the PA (sensu Lin et al., 2002; Stevens and Anderson-Cook, 2017; Stevens et al., 2017) to the case of 2 georeferenced sequences in the plane, sequences that also may vary in time. We were motivated to extend and generalize the PA into the spatiotemporal domain because of the widespread need to analyze near-Earth remotely sensed data that are used to make inferences about ongoing climatic change and its effects on environmental dynamics at a range of spatial extents (eg, Richardson et al., 2007; 2018; Yang et al,, 2013). Our exemplar is the problem of comparing and assessing the PA of repeated images taken from fixed platforms to determine changes in the times of year when trees in the northern hemisphere emerge from dormancy and produce new leaves (“spring green-up”) and when the leaves of these same trees senesce in the fall before the trees go dormant for the winter. There is substantial evidence that because of ongoing, anthropogenically driven climate change, spring green-up is, on average, occurring earlier in the year (eg, Richardson et al., 2007; Keenan et al,, 2014) and leaf senescence is occurring later in the fall (eg, Moon, 2022). During the “growing season” between spring green-up and fall leaf senescence, forests in the northern hemisphere remove a substantial amount of carbon dioxide from the atmosphere (Barichivich et al., 2012). Changes in the carbon cycle have important implications not only for ecological dynamics and forest processes, but also for economic activities including tourism and markets for carbon offsets (eg, Le Quéré et al., 2009; Piao et al., 2019). Our spatiotemporal generalization of the PA provides a method to quantify whether these changes are non-negligible and meaningful and test their statistical significance.

The generalization we describe here makes the PA dependent on a spatial lag, similar to how the variogram and the covariance functions are used in spatial statistics. Specific conditions for the variance of the difference between the 2 sequences are imposed so that the PA is a decreasing function when the norm of the spatial lag increases. This monotonic feature of the PA is established for the bivariate Matérn and Wendland covariance functions. We then extend the PA for the case of spatiotemporal processes so that two images of the same scene taken at different times—such as annual images of trees leafing out or senescing—can be compared. In our spatiotemporal extension of the PA, we consider that the process includes a temporal and, perhaps, a spatial trend, and random noise. The resulting process is nonstationary in the mean and is flexible enough to account for a number of different trends in time.

Estimation of the PA is addressed via plug-ins and the delta method, assuming that the estimates of the parameters of the covariance function exist and are asymptotically normal (Mardia and Marshall, 1984). A simple expression for the asymptotic variance of the sample PA is also derived. To assess the properties of the PA for finite sample sizes, we carried out 2 numerical experiments: a sensitivity study that illustrates that PA decreases as a function of the norm of the spatial lag, and a Monte Carlo simulation study to estimate the PA and the relevant parameters of 2 spatial covariance models. We then applied our spatial PA to a temporal sequence of images of a forest canopy and estimated the PA of a sequence of images taken of the same scene at different times (described in Section 2). Images like these are used routinely to identify seasonal changes in the unfolding, maturation, and senescence (with accompanying fall colors) of individual trees and entire forest canopies and to estimate fluxes of carbon, water vapor, and other gases between the forest and the atmosphere (eg, Richardson et al., 2018).

In Section 3, we give some additional, albeit brief, background on PA. We then introduce the idea of PA for spatial processes and establish the main theoretical results for stationary processes (Section 4) and spatiotemporal processes (Section 5). Section 6 discusses the estimation of PA, which is then illustrated with numerical simulation experiments (Section 7) and an analysis of the forest-canopy images (Section 8). We conclude with an outline for future research and developments in this area (Section 9). Proofs of the main results are given in the Appendix, the rest of the proofs are relegated to the Supplementary Material of this paper.

2 IMAGERY

We analyzed a series of 15 annual images of the same scene taken in mid-October from 2008 to 2022 (Figure 1A). This is the time of year when leaves of deciduous trees are senescing, which leads to the spectacular display of fall colors in New England (USA), Japan, and other parts of the northern hemisphere. We chose images taken in a 3-day window each year (October 12-15), as this represents the approximate historical “peak” of fall colors in New England. As the regional climate has warmed, however, this peak has begun to shift toward later dates. Identifying the rate and spatial patterning of this shift is of interest to ecologists, foresters, tourism boards, and economists (eg, Moon, 2022).

A. The PhenoCam image taken by a stationary camera from the EMS tower at Harvard Forest, Massachusetts, USA on October 15, 2008 at 13:31 (UTC-4). A rectangular section (570 × 660 pixels) of the image was clipped (B; black outline in A) and then down-scaled (to 38 × 44) pixels (C) for estimation of the spatial PA. Note that our rectangular image is different from the clipped “region of interest” (ROI; unmasked area in D) analyzed by the PhenoCam network.
FIGURE 1

A. The PhenoCam image taken by a stationary camera from the EMS tower at Harvard Forest, Massachusetts, USA on October 15, 2008 at 13:31 (UTC-4). A rectangular section (570 × 660 pixels) of the image was clipped (B; black outline in A) and then down-scaled (to 38 × 44) pixels (C) for estimation of the spatial PA. Note that our rectangular image is different from the clipped “region of interest” (ROI; unmasked area in D) analyzed by the PhenoCam network.

The 15 images we used were taken from the database of the PhenoCam Network, (Richardson, 2023) a network of more than 700 fixed observation sites across North America and elsewhere in the world that since 2008 has collected high-frequency and high-resolution imagery with networked digital cameras to track the timing of vegetation change (“phenology”) in a range of ecosystems (Seyednasrollah, 2019). We used images from the Harvard Forest, where they have been captured at 30-60-minute intervals since 2008 with a 2048 × 1636-pixel CMOS sensor in an outdoor StarDot NetCam XL 3MP camera (Richardson, 2023).

Our interest is in changes in the forest canopy, so we clipped a 570 × 660-pixel rectangular section from each image (Figure 1B). The edges of the clipped image (black rectangle in Figure 1A) were chosen to maximize the size of the clipped image while avoiding sky (top), wires (left), and other instrumentation on the extended boom (lower right). The high-resolution clipped image (Figure 1B) was then downscaled ≈ 200-fold (to 38 × 44 pixels; Figure 1C) for further analysis and estimation of spatiotemporal PA. Downscaling was done by rasterizing the image using adjoining 15 × 15-pixel windows; we used the mean red, green, and blue (RGB) value from these windows in the rasterized image (Figure 1C). This downscaling was done for 2 reasons. First, “reasonable” values of the spatial lag |$\Vert \boldsymbol h\Vert$| (ie, <15) would have been within a single leaf of the high-resolution image, and it is of more interest to look at changes among leaves and among entire trees. Second, we determined that estimating the PA of the 2 high-resolution clipped images would require ≈6 PB of RAM, whereas the estimation of the downscaled images, which were close to the same size as those used for our simulation studies (Figures S4 and S5), preserved sufficient visual differences among trees, while being computationally more manageable.

We note that our clipped rectangle is different in shape from, but approximately the same size as, the “region of interest” (ROI) defined and analyzed by researchers who use these images for phenological studies (the unmasked area in Figure 1D). The ROI for each PhenoCam site is identified as the area of the image that maximizes the amount of vegetation of interest (ie, deciduous forest at Harvard Forest) while avoiding the sky, topographic features, instrumentation, other human artefacts (eg, buildings, wires, etc), and other areas of the image that could give seasonally biased results (eg, soil covered by snow) (Richardson et al., 2007). The original, clipped, and rasterized images and the code used for rasterizing and analyzing these images are all available on GitHub at Acosta (2024).

3 MATHEMATICAL BACKGROUND AND PRELIMINARIES

Following Leal et al. (2019), we assume that (X1, Y1), …, (Xn, Yn) is a random sample from a bivariate normal distribution with mean vector |$\boldsymbol \mu =(\mu _X,\mu _Y)^{\top },$| and covariance matrix

where (·) means transposition. Then, a method to quantify the degree of agreement between the variables X and Y relies on the differences (Di) between their corresponding values: Di = XiYi, i = 1, …, n. The PA is defined as:

(1)

where c denotes the maximum acceptable difference from a practical perspective, also called the equivalence margin. In such a case the interval (− c, c) is often called the region of practical equivalence (ROPE). In clinical studies this interval is called the clinically acceptable difference (CAD).

The selection of c is an important decision that requires careful consideration and domain expertise. In general, the value of c is chosen by the practitioner and is dependent on the specific application and the relative risks that are being considered. In any specific study, the investigator should evaluate the sensitivity of the PA to the chosen value of c. Stevens et al. (2017) created an R Shiny app that enables a user to evaluate the sensitivity of the PA to the chosen value of c. Plotting the PA as a function of the parameters of the model, including the value of c, also can provide valuable information. Likewise, knowledge acquired from Monte Carlo simulation studies can be highly valuable in understanding the dependency of the PA on c in various scenarios. Further analysis and criteria regarding the use of clinical tolerance limits to evaluate agreement have recently been explored by Taffé (2022).

Because of the normality assumption, the PA as given in (1) takes the form

(2)

where Φ(·) denotes the cumulative distribution function of the standard normal distribution, μD = μX − μY and |$\sigma _D^2= \sigma ^2_{X} + \sigma ^2_{Y} - 2\sigma _{XY}$|⁠. How large the PA should be to consider the variables interchangeable is up to the practitioner; Stevens et al. (2017) suggest using ψc ≥ 0.95 as a guideline.

Under the assumption of normality, inference for |$\boldsymbol \mu$| and |$\boldsymbol \Sigma$| can be addressed via maximum likelihood (ML) (Anderson, 2003, Section 3.2). By substituting such estimates into (2), an ML estimate for ψc, denoted by |$\widehat{\psi }_c$|⁠, can be obtained. Under mild assumptions, Leal et al. (2019) established the asymptotic normality of |$\widehat{\psi }_c$|⁠, which relies on the asymptotic distribution of the ML under normality and the delta method. It is then straightforward to obtain approximate confidence intervals and test hypotheses about ψc.

A random field, denoted by |$\lbrace Y(\boldsymbol s):\boldsymbol s\in \mathcal {D} \subset \mathbb {R}^{d}\rbrace$| or |$Y(\boldsymbol s)$|⁠, is a collection of random variables indexed by |$\mathcal {D}$|⁠, where each |$\boldsymbol s\in \mathcal {D}$| is a location (Cressie, 1993). The mean of |$Y(\boldsymbol s)$| is the function |$\mu :\mathbb {R}^{d} \rightarrow \mathbb {R}$| defined through |$\mu (\boldsymbol s)=\operatorname{E}(Y(\boldsymbol s))$| and the covariance function |$C:\mathbb {R}^{d}\times \mathbb {R}^{d}\rightarrow \mathbb {R}$| is defined as |$C(\boldsymbol s_1,\boldsymbol s_2) = \operatorname{cov}\left(Y(\boldsymbol s_1),Y(\boldsymbol s_2)\right)$|⁠. |$Y(\boldsymbol s)$| is called a weakly stationary random field if the mean function is constant, that is, |$\mu (\boldsymbol s)=\mu \in \mathbb {R}$|⁠, and the covariance function satisfies |$C(\boldsymbol s_1,\boldsymbol s_2)=C(\boldsymbol s_1-\boldsymbol s_2, \boldsymbol 0)=C_{0}(\boldsymbol s_1-\boldsymbol s_2)$|⁠. The covariance function of a stationary process will be denoted as |$C(\boldsymbol h)$|⁠, where |$\boldsymbol h= \boldsymbol s_1-\boldsymbol s_2$|⁠. Thus, |$\operatorname{var}(Y(\boldsymbol s)) = C(\boldsymbol 0)=\sigma ^2\gt 0$|⁠, and |$\rho (\boldsymbol h)=C(\boldsymbol h)/C(\boldsymbol 0)$| is the correlation function. |$Y(\boldsymbol s)$| is said to be an intrinsically stationary random field if |$Y(\boldsymbol s+\boldsymbol h)-Y(\boldsymbol s)$| is stationary. For intrinsically stationary random fields, the mapping |$2\gamma (\boldsymbol h) = \operatorname{var}\left( Y(\boldsymbol s+\boldsymbol h)-Y(\boldsymbol s)\right)$| is the variogram. For a stationary random field, the relationship |$\gamma (\boldsymbol h)=C(\boldsymbol 0)-C(\boldsymbol h)$| holds. The random field is called isotropic if the covariance function (or the variogram) depends exclusively on the distance between the spatial locations. In such a case, we write |$C(\Vert \boldsymbol h\Vert )$| or |$\gamma (\Vert \boldsymbol h\Vert )$|⁠, where ‖ · ‖ denotes the Euclidean norm. The sill represents the variance of the random field, and it is obtained through |$\lim _{\Vert \boldsymbol h\Vert \rightarrow \infty }\gamma (\Vert \boldsymbol h\Vert )=C(0)=\sigma ^2$|⁠. The range is the distance for which any pair of observations are not correlated, and is obtained as the smallest u such that γ(u) = sill. In situations when u → ∞, it is useful to define the practical range as the smallest u for which the semivariogram reaches the 95% of the sill; that is, the value of u that satisfies γ(u) = 0.95σ2.

4 PROBABILITY OF AGREEMENT FOR STATIONARY PROCESSES

In this section, we introduce the PA in the context of georeferenced variables, where |$X(\boldsymbol s)$| and |$Y(\boldsymbol s)$| are the variables of interest and |$\boldsymbol s$| represents the spatial coordinates in the plane. Formally, let |$\boldsymbol Z(\boldsymbol s)=(X(\boldsymbol s), Y(\boldsymbol s))^{\top }$|⁠, |$\boldsymbol s\in \mathcal {D}\subset \mathbb {R}^2$|⁠, be a bivariate second-order stationary random field with |$\boldsymbol s, \boldsymbol h \in \mathbb {R}^2$|⁠, mean (μX, μY), and covariance function

where |$C_{X}(\boldsymbol h) = \operatorname{cov}(X(\boldsymbol s), X(\boldsymbol s+\boldsymbol h))$|⁠, |$C_{Y}(\boldsymbol h) = \operatorname{cov}(Y(\boldsymbol s), Y(\boldsymbol s+\boldsymbol h))$| and |$C_{XY}(\boldsymbol h) = C_{YX}(\boldsymbol h)= \operatorname{cov}(X(\boldsymbol s), Y(\boldsymbol s+\boldsymbol h))$|⁠. Define the difference

(3)

This difference measures the discrepancy between the processes when there is a separation vector equal to |$\boldsymbol h$| between them. In particular, if |$\boldsymbol h=\boldsymbol 0$| the usual difference is recovered and the PA that uses the difference |$X(\boldsymbol s)-Y(\boldsymbol s)$| cannot characterize the spatial dependence effectively.

Assume that |$\boldsymbol Z(\boldsymbol s)$| is a Gaussian process with mean |$\boldsymbol \mu = (\mu _X,\mu _Y)^\top$| and covariance function |$\boldsymbol C(\boldsymbol h)$|⁠, |$\boldsymbol h \in \mathcal {D}$|⁠. Then, |$D(\boldsymbol s,\boldsymbol h)\sim \mathcal {N}(\mu _D, \sigma ^2_D(\boldsymbol h))$|⁠, where μD = μX − μY and |$\sigma ^2_D (\boldsymbol h)=C_X(\boldsymbol 0)+C_Y(\boldsymbol 0 )-2C_{XY}(\boldsymbol h)$|⁠. Then, the PA between processes |$X(\boldsymbol s)$| and |$Y(\boldsymbol s+\boldsymbol h)$| is

(4)

which takes the form

(5)

The probabilities in (4) and (5) can be written, without loss of generality, as |$\psi _c(\boldsymbol h)=\psi _c(\Vert \boldsymbol h\Vert )$|⁠, indicating the dependency of the PA on the norm of |$\boldsymbol h$| when processes |$X(\boldsymbol s)$| and |$Y(\boldsymbol s)$|⁠, and also |$\sigma _D(\boldsymbol h)$|⁠, are all isotropic. In that case, the PA can be plotted as a function of |$\Vert \boldsymbol h\Vert$| in a similar way as the covariance function is plotted for several parametric processes. Then, the PA in (1) can be obtained as a particular case of (4). In fact, |$\psi _c=\psi _c(\boldsymbol {0})$|⁠.

In the remainder of this section, we emphasize the monotonicity of the PA as a function of the spatial lag |$\boldsymbol h$|⁠. If the PA exhibits a decreasing trend, we can determine the minimum value of |$||\boldsymbol h||$| at which PA stabilizes or is negligible, akin to how the spatial range is defined for the semivariogram. Moreover, the monotonicity of the PA allows us to construct hypothesis tests about the true PA for a fixed |$\boldsymbol h$|⁠. As we will see in Section 6, the monotonicity of the PA allows us to simplify the test.

We first consider the Matérn covariance function (Matérn, 1986) to illustrate (5). This function is widely used in spatial statistics because of its theoretical properties and its flexibility for modeling local behavior of spatial correlations (Stein, 1999). The Matérn covariance function takes the form

(6)

where Γ(·) denotes the gamma function, Kν(·) is the modified Bessel function of the second kind, a > 0 is a parameter that controls the rate of decay of the correlation, and ν > 0 is the smoothing parameter that is related to the behavior of the correlation near the origin. A special case of the Matérn function is when ν = m + 1/2, |$m \in \mathbb {N} \cup \lbrace 0\rbrace$|⁠. Then,

(7)

By choosing m = 0, we have the simplest form |$M(\boldsymbol h,1/2,a) = \exp (-a \Vert \boldsymbol h\Vert )$|⁠.

For a bivariate Gaussian random field, the Matérn covariance function has been extended (Gneiting et al., 2010) as

(8)
(9)

where |$\sigma _X^2\gt 0, \sigma _Y^2\gt 0,$| and ρXY is the co-located correlation coefficient between |$X(\boldsymbol s)$| and |$Y(\boldsymbol s)$|⁠. Specific conditions for the parameters are required so that the covariance model described in (8) and (9) is positive definite. In this model, we have that |$\sigma _D^2(\Vert \boldsymbol h\Vert )=\sigma _X^2+\sigma _Y^2-2\rho _{XY}\sigma _X \sigma _YM(\boldsymbol h,\nu _{XY},a_{XY})$|⁠.

For illustrative purposes, consider σX = 1, σY = 2, σXY = 1.8, aXY = 2, ρXY = 0.9 and ν = νXY = {0.5, 1.5, 2.5}. We plot |$\psi _c(\Vert \boldsymbol h\Vert )$| versus |$\Vert \boldsymbol h\Vert$| for |$\Vert \boldsymbol h\Vert \in \lbrace 0,1, \ldots ,15\rbrace$|⁠, c = {1.5, 2, 2.5}, and using the Matérn covariance function (Figure S1). In all cases, |$\psi _c(\Vert \boldsymbol h\Vert )$| decreases as a function of |$\Vert \boldsymbol h\Vert$| and the curves decay more rapidly to 0 as ν decreases. This is a consequence of the monotonic property of the Matérn covariance as shown in the following example.

 
Example 1
Let |$\boldsymbol Z(\boldsymbol s)$| be a bivariate second-order stationary Gaussian process with the Matérn covariance function given in (8) and (9). Assume that νXY = m + 1/2 and, without loss of generality, assume that the Gaussian process has mean 0 and a = 1 in (7). Then,
This implies that
(10)

where φ(·) is the probability density function of a standard normal random variable. Because the right hand side of (10) is positive, to prove that |$\psi _c(\Vert \boldsymbol h\Vert )$| is decreasing as a function of |$\Vert \boldsymbol h\Vert$|⁠, it is enough to get the sign of the derivative of |$M(\boldsymbol h,\nu _{XY},a_{XY})$| with respect to |$\Vert \boldsymbol h\Vert$|⁠.

It should be noted that |$\psi _c(\Vert \boldsymbol h\Vert )$| will not necessarily be a decreasing function of |$\Vert \boldsymbol h\Vert$|⁠. As an example, consider a bivariate process with mean (μ, μ) and an isotropic separable covariance function |$\boldsymbol C(\Vert \boldsymbol h\Vert )$|⁠, where |$C_X(\Vert \boldsymbol h\Vert )=C_{Y}(\Vert \boldsymbol h\Vert )=\sigma ^2 (\phi /\Vert \boldsymbol h\Vert ) \sin (\Vert \boldsymbol h\Vert /\phi ),$| and |$C_{XY}(\Vert \boldsymbol h\Vert )=\rho _{XY}\sigma ^2(\phi /\Vert \boldsymbol h\Vert )\sin (\Vert \boldsymbol h\Vert /\phi )$|⁠. For simplicity, set σ2 = 1 and ϕ = 1. Then, |$\sigma _D^2(h)=2\left(1-\rho _{XY} \sin (\Vert \boldsymbol h\Vert )/\Vert \boldsymbol h\Vert \right).$| For c = 1 and ρXY = 1/2, ψc(π/2) = 0.6082, ψc(3π/2) = 0.4986 and ψc(5π/2) = 0.5351, hence |$\psi _c(\Vert \boldsymbol h\Vert )$| is not decreasing in |$\Vert \boldsymbol h\Vert$|⁠.

However, for certain parametric models, as in Example 1, |$\psi _c(\Vert \boldsymbol h\Vert )$| is a monotonic function. A sufficient condition for a parametric covariance model that ensures that |$\psi _c(\Vert \boldsymbol h\Vert )$| is a decreasing monotonic function is given in the following result.

 
Theorem 1

Suppose that |$\psi _c(\Vert \boldsymbol h\Vert )$| is as in (5). If |$\sigma _{D}(\Vert \boldsymbol h\Vert )$| is an increasing function of |$\Vert \boldsymbol h\Vert$|⁠, then |$\psi _c(\Vert \boldsymbol h\Vert )$| is a decreasing function of |$\Vert \boldsymbol h\Vert$|⁠.

Theorem 2 shows that for the bivariate Matérn covariance function in (8) and (9), |$\sigma _D^2(\Vert \boldsymbol {h}\Vert )$| is an increasing function of |$\Vert \boldsymbol {h}\Vert$|⁠.

 
Theorem 2

Suppose that |$\sigma _D^2(\Vert \boldsymbol {h}\Vert )$| is obtained using the Matérn covariance model and assume that ρXY ≥ 0. Then |$\sigma _D(\Vert \boldsymbol {h}\Vert )$| is an increasing function of |$\Vert \boldsymbol {h}\Vert$|⁠. In consequence, the conditions of Theorem 1 are satisfied and |$\psi _c(\Vert \boldsymbol h\Vert )$| is a decreasing function of |$\Vert \boldsymbol h\Vert$|⁠.

Example 1, therefore, is a direct consequence of Theorems 1 and 2. Moreover, in Theorem 2 it is not necessary to restrict the smoothness parameter of bivariate Matérn covariance model, νXY = m + 1/2, as in Example 1.

The generalized Wendland family of covariance functions (Gneiting, 2002) is defined, for an integer κ > 0, as

(11)

where B(·, ·) denotes the beta function and μ must be positive with a lower bound as given in Gneiting (2002). By continuity, for κ = 0, we have

Specific conditions for μ can also be found in Bevilacqua et al. (2019).

For a bivariate Gaussian random field, the Wendland-Gneiting covariance function has been extended (Daley et al., 2015) and is defined as:

(12)
(13)
(14)

where |$\sigma _X^2\gt 0, \sigma _Y^2\gt 0$| and ρXY is the co-located correlation coefficient between |$X(\boldsymbol s)$| and |$Y(\boldsymbol s)$|⁠. Specific conditions for the parameters cii, bii, γii, i = 1, 2, b12, c12, γ12, ν and κ are needed so that the covariance model described in (12)-(14) is positive definite (Daley et al., 2015). In this model, we have that

(15)
 
Theorem 3

Suppose that |$\sigma _D^2(\Vert \boldsymbol h\Vert )$| is as in (15), and assume that ρXY ≥ 0 and c12 ≥ 0. Then, |$\sigma _D(\Vert \boldsymbol h\Vert )$| is an increasing function of |$\Vert \boldsymbol h\Vert$|⁠. In consequence, the conditions of Theorem 1 are satisfied and |$\psi _c(\Vert \boldsymbol h\Vert )$| is a decreasing function of |$\Vert \boldsymbol h\Vert$|⁠.

5 PROBABILITY OF AGREEMENT FOR SPATIOTEMPORAL PROCESSES

Assume that, |$Z(\boldsymbol s, t)$|⁠, |$\boldsymbol s\in \mathcal {D}\subset \mathbb {R}^2$|⁠, |$t\in \mathbb {Z}_0^{+}$| is a stationary Gaussian spatiotemporal process with mean 0 and covariance function |$C(\boldsymbol h, u)=\operatorname{cov}(Z(\boldsymbol s, t), Z(\boldsymbol s+\boldsymbol h, t+u))$|⁠, where |$\boldsymbol s=(s_1, s_2)^\top$| represents the spatial coordinates, and t denotes the temporal coordinate.

The correlation function and the semivariogram of the spatiotemporal process are defined analogously to the spatial case, that is, |$\rho (\boldsymbol h,u)=C(\boldsymbol h, u)/C(\boldsymbol 0,0)$|⁠, and |$\gamma (\boldsymbol h, u)=C(\boldsymbol 0,0)-C(\boldsymbol h,u)$| (Sherman, 2011). The practical spatial range, in this case, is the value of |$||\boldsymbol h||$| at which the semivariogram reaches the 95% of the sill (i.e. |$C(\boldsymbol 0,0)=\sigma ^2$|⁠); that is, the value of |$||\boldsymbol h||$| that satisfies |$\sigma ^2\lbrace 1-\rho (||\boldsymbol h||, u)\rbrace =0.95\sigma ^2$|⁠, where |$\rho (||\boldsymbol h||,u)$| denotes the correlation function of the spatiotemporal process.

Let |$Y(\boldsymbol s, t)=\mu (\boldsymbol s, t)+Z(\boldsymbol s, t)$|⁠, where the mean function |$\mu (\boldsymbol s, t)=\boldsymbol F(\boldsymbol s, t) ^\top \boldsymbol \beta$|⁠, with |$\boldsymbol F(\boldsymbol s, t)$| known and |$\boldsymbol \beta$| is a vector of unknown parameters. For example, if |$\boldsymbol F(\boldsymbol s, t)=(1, s_1, s_2, t) ^\top$| and |$\boldsymbol \beta =(\beta _0,\beta _1,\beta _2,\beta _3)^{\top }$|⁠, then |$\mu (\boldsymbol s,t)=\beta _0+\beta _1s_1+\beta _2s_2+\beta _3t$| represents a linear trend in the spatial and temporal coordinates.

Similarly to the spatial case, let us look at the increments in time and space, defining the difference

(16)

The quantity defined in (16) measures the discrepancy between the process and itself for a spatial separation |$\boldsymbol h$| and temporal separation u. Then, under the Gaussian assumption, |$D(\boldsymbol s, t, \boldsymbol h, u)\sim \mathcal {N}(\mu _D(\boldsymbol h, u), \sigma ^2_D(\boldsymbol h, u))$|⁠, where |$\mu _D(\boldsymbol h, u) = \lbrace \boldsymbol F(\boldsymbol s+\boldsymbol h, t+u)-\boldsymbol F(\boldsymbol s, t)\rbrace ^\top \boldsymbol \beta$|⁠, |$\sigma ^2_D(\boldsymbol h, u) = 2C(\boldsymbol 0, 0)-2C(\boldsymbol h, u) := 2\gamma (\boldsymbol h, u)$|⁠, and |$\gamma (\boldsymbol h, u)$| is the spatiotemporal semivariogram (Sherman, 2011). Consequently, a natural extension of the PA between |$Y(\boldsymbol s, t)$| and |$Y(\boldsymbol s+\boldsymbol h, t+u)$| is

(17)

Therefore, the PA takes the form

(18)

where Φ(·) is as in (2). As an illustration, if |$\mu (\boldsymbol s, t)=\beta _0+\beta _1t$| (linear trend in time) and |$C(\boldsymbol h, u)=\sigma ^2\exp (-\Vert \boldsymbol h\Vert /\phi _s)\exp (-| u|/\phi _t)$| (separable exponential covariance model) the mean and the variance are

Thus, (18) can be written as:

No matter the sign of β1, |$\psi _c(\boldsymbol h, u)$| decreases as u increases, which is in agreement with the fact that the PA become smaller when separation over time is enlarged.

Note that we can also use Theorems 1-3 and (18) to consider 2 different spatial processes in time. In this case, the covariance function has the same structure: |$C_X(\boldsymbol {h})=C_Y(\boldsymbol {h})$|⁠. Indeed, by replacing |$\sigma _D(\Vert \boldsymbol {h}\Vert )$| by |$\sigma _D(\Vert \boldsymbol h\Vert ,u)$| in Theorem 1, we obtain the same result if |$\sigma _D(\Vert \boldsymbol {h}\Vert ,u)$| is a increasing function of |$\Vert \boldsymbol {h}\Vert$| for fixed u. Moreover, for fixed |$\Vert \boldsymbol h\Vert$|⁠, if |$\sigma _D(\Vert \boldsymbol h\Vert ,u)$| is an increasing function of u then |$\psi _c(\Vert \boldsymbol h\Vert ,u)$| is a decreasing function of u. The proof is virtually identical to that of Theorem 1.

In particular, for the separable exponential covariance model, the negative exponential function is always strictly decreasing. Therefore in this case, |$\sigma ^2_D(\boldsymbol h,u)$| is a strictly increasing function of |$\Vert \boldsymbol h\Vert$| for any fixed u, and u similarly increases for any fixed |$\boldsymbol h$|⁠.

6 ESTIMATION

The purpose of this section is to describe the estimation of the PA defined in (5). Here, we emphasize that the variance of the difference depends on the parameters of the correlation structure, which we denote as |$\sigma _D(\boldsymbol {h})=\sigma _D(\boldsymbol {h},\boldsymbol \theta )$|⁠, where |$\boldsymbol \theta \in \mathbb {R}^q,~\mathrm{and}~ q\in \mathbb {N}$| is a parameter vector associated with the covariance function. The next definition stresses the dependence of the PA on |$\boldsymbol \theta$|⁠.

 
Definition 1
Suppose that |$(X(\boldsymbol s), Y(\boldsymbol s))^{\top }$| is a bivariate second-order stationary random field with |$\boldsymbol s, \boldsymbol h \in \mathbb {R}^2$|⁠, mean (μX, μY), and parametric covariance function |$\boldsymbol C(\boldsymbol h;\theta )$|⁠. The PA between processes |$X(\boldsymbol s)$| and |$Y(\boldsymbol s+\boldsymbol h)$| is defined through

where μD = μX − μY and |$\sigma ^2_D(\boldsymbol h;\boldsymbol \theta )=C_X(\boldsymbol 0;\boldsymbol \theta )+C_Y(\boldsymbol 0;\boldsymbol \theta )-2C_{XY}(\boldsymbol h;\boldsymbol \theta ).$|

If |$\widehat{\mu }_D$| and |$\widehat{\boldsymbol \theta }$| are estimators of μD and |$\boldsymbol \theta$|⁠, respectively, obtained from the sample |$(X(\boldsymbol s_1), Y(\boldsymbol s_1)) ^\top$|⁠, …, |$(X(\boldsymbol s_n), Y(\boldsymbol s_n)) ^\top$|⁠, then the plug-in estimator of |$\psi _c(\boldsymbol h;\mu _D,\boldsymbol \theta )$| is denoted as |$\widehat{\psi }_c(\boldsymbol h) :=\psi _c(\boldsymbol h;\widehat{\mu }_D,\widehat{\boldsymbol \theta })$|⁠.

 
Lemma 1
If |$\boldsymbol V^{-1/2}_{\boldsymbol \theta }(\widehat{\boldsymbol \theta }-\boldsymbol \theta )$| is consistent and |$\boldsymbol V^{-1/2}_{\boldsymbol \theta }(\widehat{\boldsymbol \theta }-\boldsymbol \theta ) \overset{d}{\longrightarrow } \mathcal {N}_q(\boldsymbol 0,\boldsymbol I_q)$| as n → ∞, then |$\widehat{\sigma }_D(\boldsymbol h;\boldsymbol \theta )=\sqrt{C_X(\boldsymbol 0;\widehat{\boldsymbol \theta })+C_Y(\boldsymbol 0;\widehat{\boldsymbol \theta } )-2C_{XY}(\boldsymbol h;\widehat{\boldsymbol \theta })}$| is consistent and asymptotically Gaussian with |$\operatorname{E}(\widehat{\sigma }_D(\boldsymbol h;\boldsymbol \theta )) \approx \sqrt{C_X(\boldsymbol 0;\boldsymbol \theta )+C_Y(\boldsymbol 0;\boldsymbol \theta )-2C_{XY}(\boldsymbol h;\boldsymbol \theta )}$| and
(19)

where |$V_{\boldsymbol \theta }$| is a q × q definite positive matrix, |$\boldsymbol I_q$| is the identity matrix of order q, and ∇ is the gradient operator.

 
Theorem 4
Suppose that |$(X(\boldsymbol s), Y(\boldsymbol s))^{\top }$| is a bivariate stationary Gaussian random field. Assume that |$\widehat{\mu }_D$| is consistent, |$(\widehat{\mu }_D-\mu _D)/\sqrt{V_{\mu _D}(\boldsymbol \theta )} \overset{d}{\longrightarrow } \mathcal {N}(0,1)$|⁠, |$\widehat{\boldsymbol \theta }$| is consistent, |$\boldsymbol V^{-1/2}_{\boldsymbol \theta }(\widehat{\boldsymbol \theta }-\boldsymbol \theta ) \overset{d}{\longrightarrow } \mathcal {N}_q(\boldsymbol 0, \boldsymbol I_q)$| and |$\widehat{\mu }_D$| is independent of |$\widehat{\boldsymbol \theta }$|⁠. Then, |$\widehat{\psi }_c(\boldsymbol h)$| is consistent and asymptotically Gaussian with |$\operatorname{E}(\widehat{\psi }_c(\boldsymbol h)) \approx \psi _c(\boldsymbol h;\mu _D,\boldsymbol \theta )$| and

As a consequence of the limiting distribution established in Theorem 4, an approximate hypothesis test for the PA can be constructed. Consider the null hypothesis |$\text{H}_0:\psi _c(\Vert \boldsymbol h\Vert , \mu _D,\boldsymbol \theta )=\psi _c^{(0)}$|⁠, |$0 \le \psi _c^{(0)} \le 1$|⁠, versus one of the following three alternative hypotheses |$\text{H}_1:\psi _c(\Vert \boldsymbol h\Vert , \mu _D,\boldsymbol \theta ) \ne (\gt \, \text{or}\, \lt )\, \psi _c^{(0)},$| this hypothesis test is relevant because it compares the PA with the nominal value suggested by Stevens et al. (2017) for a fixed |$\boldsymbol h$|⁠. In practice, under the conditions of Theorem 1, the test of |$\text{H}_0:\psi _c(0, \mu _D,\boldsymbol \theta )=0.95$| versus |$\text{H}_1 :\psi _c(0, \mu _D,\boldsymbol \theta ) \lt 0.95$| can be considered; if H0 is rejected, then |$\text{H}_0:\psi _c(\Vert \boldsymbol h\Vert , \mu _D,\boldsymbol \theta )=0.95$| is rejected for all |$\Vert \boldsymbol h\Vert$|⁠, because of the monotone property of the PA.

In a spatiotemporal context, denote the covariance function parameterized by |$\boldsymbol \theta$| as |$C(\boldsymbol h, u;\boldsymbol \theta )$|⁠. Then, the PA in this case is |$\psi _c(\boldsymbol h, u; \boldsymbol \beta ,\boldsymbol \theta )$|⁠, similar to that defined in (18), with |$\sigma _D(\boldsymbol h,u)=\sigma _D(\boldsymbol h,u;\boldsymbol \theta )$|⁠. The plug-in estimator of |$\psi _c(\boldsymbol h, u; \boldsymbol \beta ,\boldsymbol \theta )$| is |$\widehat{\psi }_c(\boldsymbol h, u)=\psi _c(\boldsymbol h, u; \widehat{\boldsymbol \beta },\widehat{\boldsymbol \theta })$|⁠. Thus, if |$\widehat{\boldsymbol \beta }$| and |$\widehat{\boldsymbol \theta }$| are consistent estimators of |$\boldsymbol \beta$| and |$\boldsymbol \theta$|⁠, respectively, Theorem 4 applies to |$\widehat{\psi }_c(\boldsymbol h, u)$| considering |$X(\boldsymbol s)=Y(\boldsymbol s, t)$| and |$Y(\boldsymbol s)=Y(\boldsymbol s, t+u)$| for a fixed u. The analogous hypothesis testing problem in this context is |$\text{H}_0:\psi _c(\Vert \boldsymbol h\Vert , u, \boldsymbol \beta ,\boldsymbol \theta )=\psi _c^{(0)}$|⁠, |$0 \le \psi _c^{(0)} \le 1$|⁠, versus one of the following three alternative hypotheses |$\text{H}_1:\psi _c(\Vert \boldsymbol h\Vert , u, \boldsymbol \beta ,\boldsymbol \theta ) \ne (\gt \, \text{or}\, \lt )\, \psi _c^{(0)}.$|

7 NUMERICAL EXPERIMENTS

We carried out 2 numerical experiments to gain more insights into the properties of the PA for finite sample sizes. The first was a sensitivity analysis that examined how variations in key parameters affected the estimation of the PA for Gaussian random fields with a specific covariance function. The second was a Monte Carlo simulation study, based on 500 runs, that considered spatiotemporal processes with a linear trend and either separable or nonseparable covariance structures. In all cases, the estimates were obtained using a pairwise ML method implemented in the R programming language version 4.0.5 (R Core Team, 2022). The code is available at Acosta (2024).

7.1 Bivariate gaussian random field

Let |$\boldsymbol Z(\boldsymbol s)=(X(\boldsymbol s), Y(\boldsymbol s))^{\top }$| be a bivariate stationary Gaussian random field with a Matérn covariance function as in (8) and (9), where |$\sigma _X^2=1$|⁠, νX = νY = νXY = 0.5, aX = aY = 1, aXY ∈ {0.1, 0.15, 0.2, 0.25, 0.3}, μX = 1, μY ∈ {0, 0.25, 0.5, 0.75, 1}, ρXY ∈ {0, 0.25, 0.5, 0.75, 1}, and |$\sigma ^2_{Y}\in \lbrace 0.8,0.9,1,1.1,1.2\rbrace$|⁠. Assuming that c = 1 and a fixed covariance parameter, we examined the behavior of the PA, |$\psi _c(\Vert \boldsymbol {h}\Vert )$| (Figure 2). In all cases, we observed that |$\psi _c(\Vert \boldsymbol {h}\Vert )$| is a decreasing function of |$\Vert \boldsymbol {h}\Vert$| and for large |$\Vert \boldsymbol {h}\Vert$|⁠, reaches a fixed value corresponding to the uncorrelated case. As expected, for a fixed |$\Vert \boldsymbol {h}\Vert$|⁠, PA increases with the correlation in the data (ρXY). Finally, as either μD or σD increases, PA decreases.

The effect of variation in ρXY (top left), μD (top right), ϕXY (bottom left), and σY (bottom right), on the behavior of PA as a function of lag ‖h‖ in a bivariate Gaussian random field with a Matérn covariance structure. Note differences in range limits of the y-axis among the 4 panels.
FIGURE 2

The effect of variation in ρXY (top left), μD (top right), ϕXY (bottom left), and σY (bottom right), on the behavior of PA as a function of lag ‖h‖ in a bivariate Gaussian random field with a Matérn covariance structure. Note differences in range limits of the y-axis among the 4 panels.

Based on 1000 runs of a bivariate Gaussian random field with separable and nonseparable Matérn covariance functions, the size and the power of the test |$\text{H}_0:~\psi _c(\boldsymbol h,\theta )\ge 0.95$| versus |$\text{H}_1:~\psi _c(\boldsymbol h,\theta )\lt 0.95$| were evaluated for several different scenarios, considering a well/misspecification of covariance functions. Tables S1 and S2 show the estimations, and Figures S1 and S2 show the rejection rates, in all cases the misspecification has a higher rejection rate. More details of the experiment can be found in Appendix C.1.

7.2 Gaussian spatiotemporal process

The Monte Carlo simulation study considered a spatiotemporal process as described in Section 5. For the purposes of this study, we assumed that |$\mu (\boldsymbol s, t)=a_0+a_1 t$| and for the correlation structure, we considered both separable and nonseparable cases:

where αs, αt, and β are positive parameters for Iacocesare model (De Iaco et al,, 2002). Because |$\rho (\boldsymbol h,u)$| is a strictly decreasing function in |$\boldsymbol h$| and u, Iacocesare’s nonseparable covariance function implies that |$\sigma _D^2(\boldsymbol h,u)$| is increasing in |$\Vert \boldsymbol h\Vert$| for any fixed u, and increasing in u for any fixed |$\boldsymbol h$|⁠. In the absence of a nugget, the covariance function is |$C(\boldsymbol h, u)=\sigma ^2 \rho (\boldsymbol h,u)$|⁠.

The estimates of the spatial scale (extent) parameter ϕs in the covariance terms had very large ranges, extending from <0 to greater than the maximum size of the grid. Thus, we only estimated PA for values of ϕs such that 0 < ϕs < the maximum size of the grid (in this case, 50). More details of the Monte Carlo simulations are given Appendix C.2.

The parameter estimates for the spatiotemporal process with a separable covariance structure and 2 different values for the trend parameter a1 (−0.1 and 0.1) are given in Table 1; the estimators were practically unbiased and consistent. The corresponding estimates of PA for different values of h, u and c, and fixed parameters given in Table 1 are shown in Figures S4- S7. The corresponding results from the simulations of a spatiotemporal process with a nonseparable covariance structure and identical values for the trend parameter a1 (−0.1 and 0.1) are given, respectively, in Table 1 and Figures S8-S12, and in Table 1 and Figure 3. As with the separable case, the estimators were practically unbiased and consistent. The model with the Iacocesare covariance had a slightly better performance than the model with the exponential-separable covariance and a higher percent of valid cases used for estimating the parameters. In Tables S6-S12 we show that the PA is decreasing as a function of |$|| \boldsymbol h||$| for the spatiotemporal Gaussian processes.

Estimates of the PA as a function of $\Vert \boldsymbol h\Vert$, u and c for a spatiotemporal Gaussian process with a positive linear trend and a nonseparable Iacocesare covariance structure with fixed parameters given in Table 1, and for Ns = 50. Note differences in range limits of the y-axis among the 9 panels.
FIGURE 3

Estimates of the PA as a function of |$\Vert \boldsymbol h\Vert$|⁠, u and c for a spatiotemporal Gaussian process with a positive linear trend and a nonseparable Iacocesare covariance structure with fixed parameters given in Table 1, and for Ns = 50. Note differences in range limits of the y-axis among the 9 panels.

TABLE 1

Mean and SD of the estimates for the spatiotemporal process with an exponential separable and a nonseparable Iacocesare covariance structure.

CovariancePercent
model(NS, NT)a0a1ϕsϕtσ2αsαtβvalid
ExponentialTrue0.500−0.1006.6761.0000.100
(20,10)Average0.510−0.1017.7140.7890.09077.6
SD0.1450.0233.3420.2310.014
(50,10)Average0.504−0.1017.9440.9520.09774.0
SD0.0700.0113.8160.1260.008
True0.5000.1006.6761.0000.100
(20,10)Average0.5110.0987.4370.8300.09082.4
SD0.1450.0242.9740.2260.014
(50,10)Average0.5060.0997.5480.9400.09684.4
SD0.0630.0103.7060.1130.007
IacocesareTrue0.500−0.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.513−0.1027.2500.9870.0911.8912.1802.53462.0
SD0.1320.0192.4750.8870.0092.2462.1061.730
(50,10)Average0.506−0.1017.0921.1040.0961.6981.3582.26885.6
SD0.0860.0112.4070.7500.0051.7380.8951.209
True0.5000.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.4930.0996.8960.9790.0901.4852.4132.88167.2
SD0.1320.0202.2720.6410.0092.0092.1732.299
(50,10)Average0.5040.1016.7781.0510.0961.4521.4972.21782.8
SD0.0820.0122.0720.5790.0051.7651.1281.055
CovariancePercent
model(NS, NT)a0a1ϕsϕtσ2αsαtβvalid
ExponentialTrue0.500−0.1006.6761.0000.100
(20,10)Average0.510−0.1017.7140.7890.09077.6
SD0.1450.0233.3420.2310.014
(50,10)Average0.504−0.1017.9440.9520.09774.0
SD0.0700.0113.8160.1260.008
True0.5000.1006.6761.0000.100
(20,10)Average0.5110.0987.4370.8300.09082.4
SD0.1450.0242.9740.2260.014
(50,10)Average0.5060.0997.5480.9400.09684.4
SD0.0630.0103.7060.1130.007
IacocesareTrue0.500−0.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.513−0.1027.2500.9870.0911.8912.1802.53462.0
SD0.1320.0192.4750.8870.0092.2462.1061.730
(50,10)Average0.506−0.1017.0921.1040.0961.6981.3582.26885.6
SD0.0860.0112.4070.7500.0051.7380.8951.209
True0.5000.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.4930.0996.8960.9790.0901.4852.4132.88167.2
SD0.1320.0202.2720.6410.0092.0092.1732.299
(50,10)Average0.5040.1016.7781.0510.0961.4521.4972.21782.8
SD0.0820.0122.0720.5790.0051.7651.1281.055

The cases of positive and negative slope are included. “Percent valid” is the percentage of simulations for which estimates of ϕs were greater than 0 and less than the maximum size of the grid.

TABLE 1

Mean and SD of the estimates for the spatiotemporal process with an exponential separable and a nonseparable Iacocesare covariance structure.

CovariancePercent
model(NS, NT)a0a1ϕsϕtσ2αsαtβvalid
ExponentialTrue0.500−0.1006.6761.0000.100
(20,10)Average0.510−0.1017.7140.7890.09077.6
SD0.1450.0233.3420.2310.014
(50,10)Average0.504−0.1017.9440.9520.09774.0
SD0.0700.0113.8160.1260.008
True0.5000.1006.6761.0000.100
(20,10)Average0.5110.0987.4370.8300.09082.4
SD0.1450.0242.9740.2260.014
(50,10)Average0.5060.0997.5480.9400.09684.4
SD0.0630.0103.7060.1130.007
IacocesareTrue0.500−0.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.513−0.1027.2500.9870.0911.8912.1802.53462.0
SD0.1320.0192.4750.8870.0092.2462.1061.730
(50,10)Average0.506−0.1017.0921.1040.0961.6981.3582.26885.6
SD0.0860.0112.4070.7500.0051.7380.8951.209
True0.5000.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.4930.0996.8960.9790.0901.4852.4132.88167.2
SD0.1320.0202.2720.6410.0092.0092.1732.299
(50,10)Average0.5040.1016.7781.0510.0961.4521.4972.21782.8
SD0.0820.0122.0720.5790.0051.7651.1281.055
CovariancePercent
model(NS, NT)a0a1ϕsϕtσ2αsαtβvalid
ExponentialTrue0.500−0.1006.6761.0000.100
(20,10)Average0.510−0.1017.7140.7890.09077.6
SD0.1450.0233.3420.2310.014
(50,10)Average0.504−0.1017.9440.9520.09774.0
SD0.0700.0113.8160.1260.008
True0.5000.1006.6761.0000.100
(20,10)Average0.5110.0987.4370.8300.09082.4
SD0.1450.0242.9740.2260.014
(50,10)Average0.5060.0997.5480.9400.09684.4
SD0.0630.0103.7060.1130.007
IacocesareTrue0.500−0.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.513−0.1027.2500.9870.0911.8912.1802.53462.0
SD0.1320.0192.4750.8870.0092.2462.1061.730
(50,10)Average0.506−0.1017.0921.1040.0961.6981.3582.26885.6
SD0.0860.0112.4070.7500.0051.7380.8951.209
True0.5000.1006.6761.0000.1001.0001.0002.000
(20,10)Average0.4930.0996.8960.9790.0901.4852.4132.88167.2
SD0.1320.0202.2720.6410.0092.0092.1732.299
(50,10)Average0.5040.1016.7781.0510.0961.4521.4972.21782.8
SD0.0820.0122.0720.5790.0051.7651.1281.055

The cases of positive and negative slope are included. “Percent valid” is the percentage of simulations for which estimates of ϕs were greater than 0 and less than the maximum size of the grid.

As a consequence of the numerical experiments carried out in Sections 7.1 and 7.2 and Appendices C.1 and C.2, the value of c is strictly related to the scale of the data, thus we suggest that c should be proportional to the scale. That is, |$c=\tilde{c}*\sigma$|⁠, where σ is the scale of the random field, and |$\tilde{c}\in [l,2.0]$|⁠, with l = 1.6 (nonseparable case) or l = 1.7 (separable case). In practice, this simple rule might be emulated for other covariance structures.

8 PROBABILITY OF AGREEMENT OF THE FOREST IMAGES

8.1 Values used for the comparisons

We calculated the green chromatic coordinate (Gcc) of each pixel in the ROI of the images (Figure 1C). Gcc is an index of “greenness” that captures the phenological stage of tree leaves, is associated with carbon flux from ecosystems, and is estimated from repeated images of forest canopies (Richardson et al., 2018). For a pixel in an RGB image, Gcc is calculated as Gcc = GDN/(GDN + RDN + BDN), where (·)DN is the digital number of the green (G), red (R), and blue (B) channels, respectively, assigned to each pixel in a digital image. The PhenoCam network calculates Gcc for each pixel and reports the mean, and the 50th, 75th, and 90th percentiles of the Gcc for the ROI of each image, and their 1-day and 3-day running means and percentiles (Richardson et al., 2018). Our estimates of the mean Gcc calculated by the PhenoCam network for the corresponding ROI fell within the range of Gcc values for each pixel of our clipped and rasterized images (Figure 4).

Green chromatic indices (Gcc) of each of the rasters in the 15 downscaled images (gray symbols) and the mean Gcc estimated for the entire region of interest (ROI; masked area in Figure 1 D) by the PhenoCam network. Shown are the slopes and intercepts of the temporal trend in the Gcc by the PhenoCam network (Richardson et al., 2018) (red line), and using our spatiotemporal PA with separable (green), and nonseparable (blue) spatial covariances.
FIGURE 4

Green chromatic indices (Gcc) of each of the rasters in the 15 downscaled images (gray symbols) and the mean Gcc estimated for the entire region of interest (ROI; masked area in Figure 1 D) by the PhenoCam network. Shown are the slopes and intercepts of the temporal trend in the Gcc by the PhenoCam network (Richardson et al., 2018) (red line), and using our spatiotemporal PA with separable (green), and nonseparable (blue) spatial covariances.

8.2 Estimating the PA of the forest images

The empirical variogram of the original data showed strong temporal dependence, so we used linear regression independent of spatial covariance to remove the effect of a deterministic trend (slope = 0.00235, intercept = 0.3854; P-value <0.0001 for both). This change in Gcc corresponds to a delay in leaf senescence on the order of 0.5-0.7 days/yr, marginally larger than the delay estimated through 2010 by Gill et al. (2015) when climatic warming was somewhat slower than it is now. Figure S13 shows the marginal empirical variograms after this trend had been removed (ie, the empirical variogram of the residuals of the simple linear regression).

To model Gcc, we assume that Gcc is a realization of a spatiotemporal process, where the spatial coordinates, |$\boldsymbol s=(s_1,s_2)^\top$| are the pixel positions, and t is the temporal instant when the photo was taken. We also assume that the mean is linear and the covariance function is isotropic and stationary. That is, |$G_{cc} := G_{cc}(\boldsymbol s,t) = \beta _0 + \beta _1 t + Z(\boldsymbol s,t)$|⁠, where |$Z(\boldsymbol s,t)$| is a stationary Gaussian spatiotemporal process with mean 0 and covariance function |$C(\Vert \boldsymbol h\Vert , u)$| of the spatial lag |$\Vert \boldsymbol h\Vert$| and the temporal lag u. In this case, |$F(\boldsymbol s, t)=(1, t) ^\top$| and |$\boldsymbol \beta = (\beta _0, \beta _1)^{\top }$|⁠, so |$\mu (\boldsymbol s, t)=\beta _0+\beta _1 t$|⁠. We considered different covariance models (separable and nonseparable), each with fixed nugget effect equal to 0.

The values that maximize the objective function (log composite likelihood) for the exponential and Iacocesare models, respectively, were 19668351.95 and 19682852.64, and the values of the associated pseudo-Akaike information criterion were, respectively, −39336693.90 and −39365689.29. These results suggested a better fit to the data when using the Iacocesare nonseparable covariance model. The parameter estimates (and the values we used to initialize the estimation routine) for the temporal trend and the covariance model are given in Table 2.

TABLE 2

Initial values and parameter estimates for the linear temporal trend and the Iacocesare covariance structure.

|$\widehat{a}_0$||$\widehat{a}_1$||$\widehat{\alpha }_s$||$\widehat{\alpha }_t$||$\widehat{\beta }$||$\widehat{\phi }_s$||$\widehat{\phi }_t$||$\widehat{\sigma }^2$|
Initial0.500000.010001.000001.000002.000006.676161.000000.00100
Estimate0.384530.002581.062970.953711.948676.531702.083160.00044
|$\widehat{a}_0$||$\widehat{a}_1$||$\widehat{\alpha }_s$||$\widehat{\alpha }_t$||$\widehat{\beta }$||$\widehat{\phi }_s$||$\widehat{\phi }_t$||$\widehat{\sigma }^2$|
Initial0.500000.010001.000001.000002.000006.676161.000000.00100
Estimate0.384530.002581.062970.953711.948676.531702.083160.00044
TABLE 2

Initial values and parameter estimates for the linear temporal trend and the Iacocesare covariance structure.

|$\widehat{a}_0$||$\widehat{a}_1$||$\widehat{\alpha }_s$||$\widehat{\alpha }_t$||$\widehat{\beta }$||$\widehat{\phi }_s$||$\widehat{\phi }_t$||$\widehat{\sigma }^2$|
Initial0.500000.010001.000001.000002.000006.676161.000000.00100
Estimate0.384530.002581.062970.953711.948676.531702.083160.00044
|$\widehat{a}_0$||$\widehat{a}_1$||$\widehat{\alpha }_s$||$\widehat{\alpha }_t$||$\widehat{\beta }$||$\widehat{\phi }_s$||$\widehat{\phi }_t$||$\widehat{\sigma }^2$|
Initial0.500000.010001.000001.000002.000006.676161.000000.00100
Estimate0.384530.002581.062970.953711.948676.531702.083160.00044

The practical spatial range, defined as the distance at which 95% of the sill (ie, σ2 in this case) is reached, will depend on the temporal separation. For the Iacocesare covariance model, this distance can be obtained by using the estimates of the parameters in this equation:

Using the estimates given in Table 2 and setting u = 0 (no time lag), the estimated practical range is 22.1 pixels. In addition, for u = 8 the practical range was approximately 0.

Finally, Figure 5 illustrates estimates of the PA of the Gcc between images as a function of spatial lag |$\Vert \boldsymbol h\Vert$| for 4 different time lags (u) and four different maximum acceptable differences c. Regardless of the values of u and c, PA decreases with increasing |$\Vert \boldsymbol h\Vert$|⁠, but the rate of decrease declines rapidly with increasing u. We note that if u = 0, we are working with a single image (not comparing two images at different times); in such a case PA computes the probability that the leaves have the same color at different spatial locations in that image. In contrast, when u > 0, we are comparing images through time, and indeed, the PA (ie, color of a leaf either at the same location or at different locations) get worse as u increases. The PA is more sensitive to u than |$\Vert \boldsymbol h\Vert$| because the mean of the process has a linear temporal trend. This is consistent with the hypothesis test described in Appendix C.3. In all cases the P-value is 0 (see Figure S15), hence the hypothesis that |$\psi _c(\Vert \boldsymbol h\Vert ,u)$| is equal to 0.95 is rejected, except for u = 0, where, for example, for c = 0.04 the null hypothesis is not rejected until |$\Vert \boldsymbol h\Vert \approx 4$|⁠. The observed results imply that the timing of leaf senescence is changing (as indicated by lack of agreement in Gcc at a given location through time). In this case, because Gcc is increasing through time (Figure 4), leaves are staying greener later and senescence is delayed relative to previous years. We note that for u > 1, PA is practically constant as a function of |$\Vert \boldsymbol h\Vert$|⁠, which we interpret to mean that the PA of the spatiotemporal processes separated by 2 or more years is not affected by temporal or spatial changes.

PA as a function of $\Vert \boldsymbol h\Vert$ for four time lags u with different maximum acceptable differences c.
FIGURE 5

PA as a function of |$\Vert \boldsymbol h\Vert$| for four time lags u with different maximum acceptable differences c.

9 DISCUSSION AND FUTURE WORK

The PA has been generalized for the analysis of concordance between 2 georeferenced variables. This extension possesses monotonic properties—the PA declines as a function of the norm of the spatial lag—and thus quantifies the effective (or practical) spatial range. Our spatial PA is meaningful for isotropic processes. The hypothesis testing developed in Section 6 allows the estimation of the PA as a function of the spatial range, and provides a way to rule out spatial agreement when the null hypothesis is rejected for all ‖h‖. Our theoretical extension of the PA also works for spatiotemporal processes with a trend, allowing for the analysis of nonstationary processes in the mean.

The monotonic properties of the PA for finite sample size were supported by Monte Carlo simulation experiments, which also showed that the parameter estimates using the composite likelihood have a small bias and low variance. The value of c plays a crucial role in the estimation and the PA is sensitive to the choice of it. In practice, the value of c needs to be scaled by the square root of the sill in order to account for the scale of the data.

The application presented in Section 8 illustrated that the PA can describe the change of a spatiotemporal variable in time while accounting for spatial and temporal dependence. In this particular case, the PA also provided information about the dependence of the trend on the spatial information of the past realizations of the process; such information may be of value in modeling the trend and developing such models should be a focus of future work. Although parameter estimates were not very sensitive to the choice of the covariance function, identification of the best covariance model (eg, separable or non-separable, and types of each) can still be improved. If the trend cannot be modeled easily with a well-known function, prior exploration will be needed to characterize a parametric function that accurately identifies observed patterns. Nonparametric trend estimation (eg, Strandberg et al., 2019) could also be used.

The computational efficiency of the composite likelihood method used in Sections 7 and 8 deserves attention. The computational implementations used in this article worked well for images of the order of size 50 × 50; we had to rasterize and downscale the original images used in Section 8 by 2 orders of magnitude before we could estimate the relevant parameters. Overcoming memory limitations to enable parameter estimation from much larger images (more than 106 pixels) will require innovative parallelizable algorithms.

We also note that the Monte Carlo simulations and the application were developed and illustrated using spatial processes (images) defined on regular grids. However, there is no apparent reason that the proposed methods could not also be applied to irregularly spaced spatial data.

The normality assumption of the bivariate process is a challenging aspect because the available data consists of a single realization of a bivariate Gaussian process, as is typically assumed in spatial statistics. Based on this information, only the marginal normal distributions can be examined using standard statistical tools. However, this analysis does not provide useful information for assessing the bivariate normality of the vector |$(X(\boldsymbol s), Y(\boldsymbol s))^{\top }$|⁠. There are extensions in the literature of the Kolmogorov-Smirnov test for the marginal distributions when the data are spatially correlated (see for instance, Pardo-Igúzquiza and Sowd, 2004). If the variance of |$\widehat{\psi }_c$| is large, the normal approximation would not be suitable for values near the boundaries. An alternative nonparametric estimator could be implemented for ψc.

For practical applications of the PA, we recommend comprehensive exploratory data analysis to determine the feasibility of using linear trends in time and to measure the sensitivity of the covariance structure under different parametric models.

Finally, a natural but unexplored extension of the present work is the definition of the PA for spatiotemporal marked point processes. Because the randomness in point processes is in the location (as well as in the marks), and repeated observations of a given point process will yield a new set of locations, it is of theoretical interest to estimate the PA of point patterns that evolve through time. Such estimation would have immediate applicability to ecological spatial datasets, which are predominantly samples of point processes, not rasters (eg, Plant, 2019).

Acknowledgement

The authors thank a Co-Editor, an Associate Editor, and 3 reviewers for the careful reading of the manuscript and for the comments that helped us to enhance the quality of our work.

FUNDING

This work has been partially supported by the AC3E, UTFSM, under grant FB-0008, and from USM PI-L-18-20. R. V. also acknowledges financial support from CONICYT through the MATH-AMSUD program, grant 20-MATH-03 and from Fondecyt Grant 1230012. A. M. E. work on this project was supported by a Fulbright Specialist Grant, Fulbright Chile, and the Universidad Técnica Federico Santa María. M. D. C. work was partially funded by CNPq, Brazil. J. A. acknowledges financial support from Fondecyt Grant 11230502.

CONFLICT OF INTEREST

The authors declare no conflicts of interest.

DATA AVAILABILITY

The data generated in the simulations and the data used in the application can be accessed on the GitHub repository at https://github.com/JAcosta-Hub/Comparing-two-spatial-variables-with-the-probability-of-agreement.

References

Anderson
 
T. W.
(
2003
).
An Introduction to Multivariate Statistical Analysis, 3rd Edition
.
Wiley
:  
New York
.

Barichivich
 
J.
,
Briffa
 
K. R.
,
Osborn
 
T. J.
,
Melvin
 
T. M.
,
Caesar
 
J.
(
2012
).
Thermal growing season and timing of biospheric carbon uptake across the Northern Hemisphere
.
Global Biogeochemical Cycles
,
26
,
GB4015
.

Bevilacqua
 
M.
,
Faouzi
 
T.
,
Furrer
 
R.
,
Porcu
 
E.
(
2019
).
Estimation and prediction using generalized Wendland covariance functions under fixed domain asymptotics
.
The Annals of Statistics
,
47
,
828
856
.

Cressie
 
N.
(
1993
).
Statistics for Spatial Data
.
Wiley
:  
New York
.

Daley
 
D. J.
,
Porcu
 
E.
,
Bevilacqua
 
M.
(
2015
).
Classes of compactly supported covariance functions for multivariate random fields
.
Stochastic Environmental Research and Risk Assessment
,
29
,
1249
1263
.

de Castro
 
M.
,
Galea
 
M.
(
2021
).
Bayesian inference for the pairwise probability of agreement using data from several measurement systems
.
Quality Engineering
,
33
,
571
580
.

Gneiting
 
T.
(
2002
).
Compactly supported correlation functions
.
Journal of Multivariate Analysis
,
83
,
493
508
.

Gneiting
 
T.
,
Kleiber
 
W.
,
Schlather
 
M.
(
2010
).
Matérn cross-covariance functions for multivariate random fields
.
Journal of the American Statistical Association
,
105
,
1167
1177
.

De Iaco
 
S.
,
Myers
 
D. E.
,
Posa
 
D.
(
2002
).
Nonseparable space-time covariance models: some parametric families
.
Mathematical Geology
,
34
,
23
42
.

Gill
 
A. L.
,
Gallinat
 
A. S.
,
Sanders-DeMott
 
R.
,
Rigden
 
A. J.
,
Gianotti
 
D. J. S.
,
Mantooth
 
J. A.
 et al. (
2015
).
Changes in autumn senescence in northern hemisphere deciduous trees: a meta-analysis of autumn phenology studies
.
Annals of Botany
,
116
,
875
888
.

Keenan
 
T. F.
,
Gray
 
J.
,
Friedl
 
M. A.
,
Toomey
 
M.
,
Bohrer
 
G.
,
Hollinger
 
D. Y.
 et al. (
2014
).
Net carbon uptake has increased through warming-induced changes in temperate forest phenology
.
Nature Climate Change
,
4
,
598
604
.

Leal
 
C.
,
Galea
 
M.
,
Osorio
 
F.
(
2019
).
Assessment of local influence for the analysis of agreement
.
Biometrical Journal
,
61
,
955
972
.

Le Quéré
 
C.
,
Raupach
 
M. R.
,
Canadell
 
J. G.
,
Marland
 
G.
,
Bopp
 
L.
,
Ciais
 
P.
 et al. (
2009
).
Trends in the sources and sinks of carbon dioxide
.
Nature Geoscience
,
2
,
831
836
.

Lin
 
L.
(
1989
).
A concordance correlation coefficient to evaluate reproducibility
.
Biometrics
,
45
,
225
268
.

Lin
 
L.
,
Hedayat
 
A.
,
Sinha
 
B.
,
Yang
 
M.
(
2002
).
Statistical methods in assessing agreement: models, issues, and tools
.
Journal of the American Statistical Association
,
97
,
257
270
.

Lin
 
L.
,
Hedayat
 
A.S.
,
Wu
 
W.
(
2012
).
Statistical Tools for Measuring Agreement
.
Springer
:  
New York
.

Mardia
 
K.
,
Marshall
 
R.
(
1984
).
Maximum likelihood of models for residual covariance in spatial regression
.
Biometrika
,
71
,
135
146
.

Matérn
 
B.
(
1986
).
Spatial Variation, second edition
.
Springer
:  
New York
.

Moon
 
M.
,
Richardson
 
A. D.
,
O’Keefe
 
J.
,
Friedl
 
M. A.
(
2022
).
Senescence in temperate broadleaf trees exhibits species-specific dependence on photoperiod versus thermal forcing
.
Agricultural and Forest Meteorology
,
322
,
109026
.

Pardo-Igúzquiza
 
E.
,
Sowd
 
P.
(
2004
).
Normality tests for spatially correlated data
.
Mathematical Geology
,
36
,
659
681
.

Piao
 
S.
,
Liu
 
Q.
,
Chen
 
A.
,
Janssens
 
I. A.
,
Fu
 
Y.
,
Dai
 
J.
 et al. (
2019
).
Plant phenology and global climate change: Current progresses and challenges
.
Global Change Biology
,
25
,
1922
1940
.

Plant
 
R. E.
(
2019
).
Spatial Data Analysis in Ecology and Agriculture Using R, 2nd edition
.
CRC Press
:
Florida
.

Ponnet
 
J.
,
Van Oirbeck
 
R.
,
Verdonck
 
T.
(
2021
).
Concordance probability for insurance pricing models
.
Risks
,
9
,
178
.

R Core Team
(
2022
).
R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
.

Richardson
 
A.
(
2023
).
PhenoCam images and canopy phenology at the Harvard Forest EMS Tower since 2008. Harvard Forest Data Archive: HF158 (v.15). Environmental Data Initiative
:  https://doi.org/10.6073/pasta/be4990fb2a64b65ec4f244bd6a690aef.

Richardson
 
A. D.
,
Jenkins
 
J. P.
,
Braswell
 
B. H.
,
Hollinger
 
D. Y.
,
Ollinger
 
S. V.
,
Smith
 
M.-L.
(
2007
).
Use of digital webcam images to track spring green-up in a deciduous broadleaf forest
.
Oecologia
,
152
,
323
334
.

Richardson
 
A. D.
,
Hufkens
 
K.
,
Milliman
 
T.
,
Aubrecht
 
D. M.
,
Chen
 
M.
,
Gray
 
J. M.
 et al. (
2018
).
Tracking vegetative phenology across diverse North American biomes using PhenoCam imagery
.
Scientific Data
,
5
,
180028
.

Seyednasrollah
 
B.
,
Young
 
A. M.
,
Hufkens
 
K.
,
Milliman
 
T.
,
Friedl
 
M. A.
,
Frolking
 
S.
 et al. (
2019
).
Tracking vegetation phenology across diverse biomes using PhenoCam imagery: The PhenoCam Dataset v2.0
.
Scientific Data
,
6
,
222
.

Sherman
 
M.
(
2011
).
Spatial Statistics and Spatio-Temporal Data: Covariance Functions and Directional Properties
,
Wiley
:
United Kingdom
.

Stein
 
M. L.
(
1999
).
Interpolation of Spatial Data: Some Theory for Kriging
,
Springer
:
New York
.

Stevens
 
N. T.
,
Anderson-Cook
 
C. M.
(
2017
).
Comparing the reliability of related populations with the probability of agreement
.
Technometrics
,
59
,
371
380
.

Stevens
 
N. T.
,
Lu
 
L.
(
2020
).
Comparing Kaplan-Meier curves with the probability of agreement
.
Statistics in Medicine
,
39
,
4621
4635
.

Stevens
 
N. T.
,
Lu
 
L.
,
Anderson-Cook
 
C. M.
,
Rigdon.
 
S. E.
(
2020
).
Bayesian probability of agreement for comparing survival or reliability functions with parametric lifetime regression models
.
Quality Engineering
,
32
,
312
332
.

Stevens
 
N. T.
,
Steiner
 
S. H.
,
MacKay
 
R. J.
(
2017
).
Assessing agreement between two measurement systems: An alternative to the limits of agreement approach
.
Statistical Methods in Medical Research
,
26
,
2487
2504
.

Stevens
 
N. T.
,
Steiner
 
S. H.
,
MacKay
 
R. J.
(
2018
).
Comparing heteroscedastic measurement systems with the probability of agreement
.
Statistical Methods in Medical Research
,
27
,
3420
3435
.

Strandberg
 
J.
,
de Luna
 
S.
,
Mateu
 
J.
(
2019
).
Prediction of spatial functional random processes: comparing functional and spatio-temporal kriging approaches
.
Stochastic Environmental Research and Risk Assessment
,
33
,
1699
1719
.

Taffé
 
P.
(
2022
).
Use of clinical tolerance limits for assessing agreement
.
Statistical Methods in Medical Research
,
32
,
195
2006
.

Yang
 
P.
,
Gong
 
P.
,
Fu
 
R.
,
Zhang
 
M.
,
Chen
 
J.
,
Liang
 
S.
 et al. (
2013
).
The role of satellite remote sensing in climate change studies
.
Nature Climate Change
,
3
,
875
883
.

Appendix

Proof of Theorem 1

Without loss of generality, we assume that the Gaussian process has mean |$\boldsymbol 0$| and |$h=\Vert \boldsymbol h\Vert$| in (5). First notice that

Now, let h1 > 0 and h2 > 0 be such that h1h2. By hypothesis, σD(h1) ≤ σD(h2), then cD(h2) ≤ cD(h1) because c > 0. Finally, since Φ(·) is an increasing function, then

Therefore, ψc(h2) ≤ ψc(h1) for all h1h2. |${\square}$|

Proof of Lemma 1 Let |$\widehat{\sigma }_D(\boldsymbol h;\boldsymbol \theta )=g(\widehat{\boldsymbol \theta })$|⁠. Applying a Taylor expansion of order 1 for |$g(\widehat{\boldsymbol \theta })$| around |$\boldsymbol \theta$|⁠, we have that |$\widehat{\sigma }_D(\boldsymbol h;\boldsymbol \theta )\approx g(\boldsymbol \theta ) + \nabla g(\boldsymbol \theta )^{\top }(\widehat{\boldsymbol \theta }-\boldsymbol \theta )$|⁠. Then, |$\operatorname{E}(\widehat{\sigma }_D(\boldsymbol h;\boldsymbol \theta )) \approx g(\boldsymbol \theta )$| and |$\operatorname{var}(\widehat{\sigma }_D(\boldsymbol h;\boldsymbol \theta )) \approx \nabla g(\boldsymbol \theta )^{\top }\boldsymbol V_{\boldsymbol \theta }\nabla g(\boldsymbol \theta )$|⁠. Now, because |$g(\boldsymbol \theta )=\sqrt{\sigma ^2_D(\boldsymbol h;\boldsymbol \theta )}$|⁠, then |$\nabla g(\boldsymbol \theta )=\nabla \sigma ^2_D(\boldsymbol h;\boldsymbol \theta ) / \lbrace 2g(\boldsymbol \theta )\rbrace$|⁠, where the i-th element of |$\nabla \sigma ^2_D(\boldsymbol h;\boldsymbol \theta )$| is given by |$\partial \sigma ^2_D(\boldsymbol h;\boldsymbol \theta ) / \partial \theta _i = \partial C_X(\boldsymbol 0;\boldsymbol \theta ) / \partial \theta _i +\partial C_Y(\boldsymbol 0;\boldsymbol \theta ) / \partial \theta _i - 2 \partial C_{XY}(\boldsymbol h;\boldsymbol \theta ) / \partial \theta _i$|⁠. |${\square}$|

Proof of Theorem 4 Denote |$\sigma _D=\sigma _D(\boldsymbol h, \boldsymbol \theta )$|⁠, |$\widehat{\sigma }_D=\sigma _D(\boldsymbol h, \widehat{\boldsymbol \theta })$|⁠, |$\psi _c(\boldsymbol h;\mu _D,\boldsymbol \theta )=\psi _c(\boldsymbol h;\mu _D,\sigma _D)$|⁠, and |$\psi _c(\boldsymbol h;\widehat{\mu }_D,\widehat{\boldsymbol \theta })=\psi _c(\boldsymbol h;\widehat{\mu }_D,\widehat{\sigma }_D)$|⁠. Let |$V_{\sigma _D}(\boldsymbol h) = \operatorname{var}(\sigma _D(\boldsymbol h;\widehat{\boldsymbol \theta }))$| given in (19). By Lemma 1, we have that |$\sigma _D(\boldsymbol h;\widehat{\boldsymbol \theta })$| is consistent, and |$\lbrace \sigma _D(\boldsymbol h;\widehat{\boldsymbol \theta })-\sigma _D(\boldsymbol h;\boldsymbol \theta )\rbrace / \sqrt{V_{\sigma _D}(\boldsymbol h)}\overset{d}{\longrightarrow }\mathcal {N}(0,1)$|⁠. Now, using the delta method, it follows that

(A1)

 where

Applying expected value and variance in both sides of (A1), the result follows. |${\square}$|

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)