Spatial confidence regions for combinations of excursion sets in image analysis

Abstract The analysis of excursion sets in imaging data is essential to a wide range of scientific disciplines such as neuroimaging, climatology, and cosmology. Despite growing literature, there is little published concerning the comparison of processes that have been sampled across the same spatial region but which reflect different study conditions. Given a set of asymptotically Gaussian random fields, each corresponding to a sample acquired for a different study condition, this work aims to provide confidence statements about the intersection, or union, of the excursion sets across all fields. Such spatial regions are of natural interest as they directly correspond to the questions ‘Where do all random fields exceed a predetermined threshold?’, or ‘Where does at least one random field exceed a predetermined threshold?’. To assess the degree of spatial variability present, our method provides, with a desired confidence, subsets and supersets of spatial regions defined by logical conjunctions (i.e. set intersections) or disjunctions (i.e. set unions), without any assumption on the dependence between the different fields. The method is verified by extensive simulations and demonstrated using task-fMRI data to identify brain regions with activation common to four variants of a working memory task.


Introduction
The collection and analysis of imaging data, modelled as a random field sampled on a spatial domain, is central to a broad range of scientific disciplines such as neuroimaging, climatology, and cosmology.Often, spatial data drawn from n observations of a random field are combined to obtain an estimate, μn , of some spatially varying target function µ.
The target function µ is defined on a closed spatial domain, S ⊂ R N , and maps spatial locations to some variable of interest in R (e.g.seismic activity, infrared heat, changes in blood oxygenation in the brain, etc.).In such applications, it is typically desirable to ask "At which locations does the target function exceed a certain value, c?".For example, "Where has a significant change in temperature occurred?";"Where do significant heat readings indicate the presence of celestial bodies?".Such questions are addressed by the study of excursion sets, i.e. sets of the form {s ∈ S : µ(s) ≥ c}.
A wealth of literature has previously focused on documenting the geometric and topological properties of excursion sets of random functions (c.f.Adler [1981], Torres [1994], Cao [1999], Azas and Wschebor [2009], Adler and Taylor [2009]).These properties include, for example, the Euler characteristic (a measure of topological structure), the Hausdorff dimension (indicating how fractal the set may be), Lipschitz Killing curvatures (describing high-dimensional volumes and areas) and Betti Numbers (describing how many stationary points appear above the threshold, c) (Worsley [1996], Adler [1977], Adler et al. [2017], Pranav et al. [2019]).Much of this work, though, is limited to homogeneous stationary processes, i.e. those with zero mean and a spatial covariance structure which is dependent upon only distance.Only the most recent literature, including the present paper, concerns processes with a non-zero mean function and a potentially heterogeneous covariance function.
In addition, at present there is little published concerning how excursion sets may be compared to one another.In many applications, it is typical for a researcher to collect data from two or more study conditions and contrast the results to draw some meaningful inference (e.g.temperature changes in winter vs summer, heat measurements gathered using different imaging modalities, brain activity in healthy subjects across a range of tasks, etc.).Such settings are akin to having M spatially aligned, potentially correlated, estimates, μ1 n , μ2 n , ...μ M n : S → R, of M target functions, µ 1 , µ 2 , ...µ M .Often, pertinent questions that arise in such settings may be formally expressed using logical statements which involve conjunctions (logical 'and's), negations (logical 'not's) and disjunctions (logical 'or's).For example, a climatologist may ask where significant changes in temperature were observed during either winter or summer (i.e."Where does µ 1 or µ 2 exceed c?').Alternatively, a neuroscientist may ask "At which locations in the brain did subjects exhibit signs of cognitive behaviour during one task and not another?"(e.g."At which locations does µ 1 , and not µ 2 , exceed c?").
One approach to answering such questions is to employ null-hypothesis testing.A hypothesis test of a disjunction of nulls to assess evidence for a conjunction of alternative hypotheses can be conducted with an 'intersection-union' test.An intersection-union test rejects at level α when all of the individual nulls are rejected at level α.This approach has proven particularly useful for tests of bioequivalence (Berger and Hsu [1996], Berger [1997]), but does not consider the spatial aspect that is our focus here.
It is common practice for researchers to compare images corresponding to different study conditions via rudimentary visual inspection.(In fMRI, just a few recent examples of qualitative assessment of 'overlap' and 'differences' are Zhang et al. [2021], Dijkstra et al. [2021], Ferreira et al. [2021]).Such a subjective practice may lead to biased or misleading results.In recent years, much literature has been published on the theory of confidence regions, which focuses on quantifying spatial uncertainty by providing, with a fixed confidence, sub-and super-sets bounding the excursion set of a single target function (Sommerfeld et al. [2018], Bowring et al. [2019], Bowring et al. [2021]).However, the theory of confidence regions does not currently offer a means for investigating logical conjunctions and disjunctions of exceedance statements (i.e.intersections and unions of excursion sets).This work addresses this issue by first proposing a theory of spatial confidence regions for 'conjunction inference' (i.e.inference for logical statements involving conjunctions) in image analysis.Following this, the proposed method is extended to allow the investigation of statements containing disjunctions and negations.
Formally, our work primarily focuses on the intersection of M excursion sets (i.e. the region over which the conjunction statement "µ 1 (s) ≥ c and µ 2 (s) ≥ c and ... and µ M (s) ≥ c" holds).We denote M = {1, ..., M }, and for each i ∈ M (i.e. for each study condition) define the i th true excursion set and i th estimated excursion set as: respectively.Next, we define F c to be the intersection of the sets {A i c } i∈M , and Fc to be the intersection of the sets { Âi c } i∈M .In other words, the spatial region we are interested in, and our estimate of that region, are given by: respectively.An illustration of F c in a setting in which M = 2 is provided by Fig. 1.
We note here that the above construction can be adjusted to allow c to vary across study conditions (i.e."µ 1 (s) ≥ c 1 and ... and µ M (s) ≥ c M " for c 1 , ..., c M ∈ R).Such an adjustment would involve simply translating each field upwards or downwards (e.g.substi- For ease in the proceeding text, we shall assume c is equal for all study conditions.Our goal is to obtain confidence regions F+ c and F− c such that the below probability holds asymptotically; for a predefined tolerance level α (α = 0.05, for example).To generate such regions, we build on the theory of Sommerfeld et al. [2018], to show that under an appropriate definition of F+ c and F− c , the above probability may be approximated using quantiles of a well-defined random variable.Using a wild t−bootstrapping procedure, we will then demonstrate that the relationship between this random variable and the above probability can be used to generate F+ c and F− c for any desired value of α.Unlike much of the previous literature on random excursion sets, in this work no assumption is made on the processes' means or spatial covariance structure, and we do not make any assumption of between-'study condition' independence (i.e.{μ i n } i∈M may be correlated with one another).In the following sections, we first describe the notation and assumptions upon which our theory relies.Following this, we provide the central theoretical result of this work, which relates Equation (1) to an exceedance statement for a well-defined random variable.
Next, via the use of the wild t−bootstrap, we detail how this result may be employed to obtain confidence regions for conjunction inference in the setting of linear regression modelling.Finally, we validate our results with simulations and a real data example based on fMRI data taken from the Human Connectome Project (Van Essen et al. [2013]).Proofs of the theory presented in this work, alongside further illustration and extensive simulation results, are provided as supplementary material.

Notation
In the following sections, we shall need notation to describe the numerous possible low dimensional sub-manifolds which can arise from intersecting the boundaries of M excursion sets.To do so, we first denote the power set of a finite set, A, as P(A), and define P + (A) as P + (A) = P(A) \ {∅}.For each i ∈ M, we define ∂A i c as the level set {s ∈ S : µ i (s) = c}.Similarly, we define ∂F c as the level set {s ∈ S : min i∈M µ i (s) = c}.For a spatial set, B ⊆ S, its complement in S is denoted B c = S \ B, its topological closure is denoted B, its interior is denoted B • and its boundary is denoted ∂B.In addition, for closed B, the notation B 1 and B −1 shall represent B and B c , respectively.We note here that the notation ∂A i c and ∂B may conflict with one another.This conflict is resolved, however, by the assumptions of Section 2.2 which ensure that the boundaries of {A i c } i∈M and F c are equal to the level sets {∂A i c } i∈M and ∂F c (at least locally, in the vicinity of ∂F c , which is sufficient for our purposes).
We can now partition the level set ∂F c into sub-manifolds using the set of possible intersections of the M excursion set boundaries.For all α ∈ P + (M), we define the boundary segment ∂ α F c as follows; We note it is possible that, for some α ∈ P + (M), the boundary segment ∂ α F c is empty (in fact, this is often the case in practice).This notation is crucial to the statement of Theorem 2.1 and is illustrated by Fig. 2.

Assumptions
In this section, we outline and discuss the assumptions upon which our theory relies.
Additional discussion of why each assumption is required, alongside illustration, is given in Supplementary Theory Section S2.
Assumption 2.2.1.For each i ∈ M, there exists a bounded function σ i : S → R + and positive sequence τ n → 0, such that the below Central Limit Theorem (CLT) holds: where {G i (s)} s∈S,i∈M is a well-defined multi-variate random field with continuous sample paths in S and Remark.This assumption introduces the notation σ i (s) and τ n .In typical applications, σ i (s) represents standard deviation across observations (for study condition i, at spatial location s) and τ n is a decreasing function of sample size.For conciseness in the following text, we now define {g i } i∈M and {ĝ i n } i∈M as; Assumption 2.2.2.We assume that: (a) The functions {g i } i∈M are continuous on S.
(b) For n large enough, the functions {ĝ i n } i∈M are continuous on S.
Remark.In practice, assumptions of this form are already common in the imaging literature, as in many applications {µ i } i∈M , {μ i n } i∈M and {σ i } i∈M are assumed to be continuous across space.For further remarks, see Supplementary Theory Section S2.2.
Assumption 2.2.3.We assume that: where, if M \ α is empty, the last term is defined to be equal to S.

(b)
There is an open neighbourhood of ∂F c over which the functions {g i } i∈M and, for n large enough, {ĝ i n } i∈M are C 1 with finite, non-zero, gradients.
(c) For every point s ∈ ∂F c and α partitions every sufficiently small open ball around s into exactly two components, each of which is path connected.
Remark.The above statements ensure that F c is non-empty and has a well defined boundary which is equal to the level set ∂F c = {s ∈ S : min i∈M g i (s) = 0}.All three statements ensure that no changes in topology occur at the level c by requiring that ∂F c does not contain plateaus (spatial regions of non-zero measure over which min i∈M g i (s) = 0 exactly), local minima or maxima.In addition, each statement handles pathological cases which the other statements do not.For brevity, we do not list such cases here but instead provide further discussion in Supplementary Theory Section S2.3.c) and (f )).The last two observations are of practical relevance as, in many applications, it may be expected that the target functions for different study conditions exhibit a strong similarity.For instance, in the neuroimaging example, it may be expected that some anatomical regions of the brain display evidence of activation for all of the study conditions.We note here that there is no requirement that F c be (globally) path-connected;

Each of the examples presented in
F c may consist of several disconnected components without violating the assumptions of this section.

Theory
In this section, we present the central result of this work, Theorem 2.1, alongside two corollaries, Corollary 2.1 and Corollary 2.2.To do so, we now define the nested sets F− c and F+ c by thresholding the statistic τ −1 n min i∈M ĝi n (s): using some constant a ∈ R + .To find an appropriate value of a, such that Equation (1) holds for a desired tolerance level (e.g.α = 0.05), Theorem 2.1 is required.
Theorem 2.1.Under the assumptions of Section 2.2, the below holds: where the variable H is defined as follows; Conceptually, the event {H > a} may be thought of as the situation in which, at some point, s, inside some boundary segment, ∂ α F c , a large value was observed for the statistic Therefore, Theorem 2.1 tells us the probability that the statement F+ c ⊆ F c ⊆ F− c is violated is directly determined by such points.If we can find a value of a such that P[H ≤ a] = 1 − α then asymptotically, the inclusion probability, P[ F+ c ⊆ F c ⊆ F− c ], will also equal 1 − α.In other words, if the (1 − α) th quantile of the distribution of H is known, then confidence regions F− c and F+ c may be constructed which satisfy Equation (1).We shall take up the task of evaluating the quantiles of H in Section 3.
Theorem 2.1 is key to generating confidence regions for conjunction inference (intersections of excursion sets).However, Theorem 2.1 may also be extended to allow investigation of other logical statements involving negations or disjunctions.Below we provide two corollaries, each of which is illustrated via a worked example.Corollary 2.1 extends Theorem 2.1 to account for logical negations, whilst Corollary 2.2 provides an equivalent statement for inference performed upon logical disjunctions (unions of excursion sets).
Example 2.1.Suppose, given two target functions, µ 1 and µ 2 , we were interested in the region defined by the logical conjunction 'µ 1 ≥ c and µ 2 ≤ c' (i.e."Where did a variable of interest exceed a threshold under one study condition and not under another?").In the notation of Section 2.1, this region is readily seen to be given by (A 1 c ) 1 ∩ (A 2 c ) −1 .To apply the result of Theorem 2.1, we first define μ1 = µ 1 and μ2 = 2c − µ 2 and note that the logical statement 'µ 1 ≥ c and µ 2 ≤ c' is identical to the statement 'μ 1 ≥ c and μ2 ≥ c'.By using the latter statement, Theorem 2.1 may now be applied to obtain the desired confidence regions (assuming that an appropriate mechanism exists for evaluating the quantiles of H).However, during this process, the limiting variables {G i } i∈{1,2} , which correspond to the target functions {µ i } i∈{1,2} , must be replaced by variables corresponding to the functions {μ i } i∈{1,2} , { Gi } i∈{1,2} .By noting the definitions of {μ i } i∈{1,2} and In summary, a procedure for generating confidence regions corresponding to the logical conjunction 'µ 1 ≥ c and µ 2 ≤ c' would be identical to that previously described, except that G 2 would be substituted for −G 2 and ∂F c would be substituted for ∂((A 1 c ) 1 ∩ (A 2 c ) −1 ).In general, this example may be extended to allow for inference to be performed upon arbitrary conjunctions of statements and negated statements.To achieve this, the definitions of F c , F− c and F+ c must be extended as follows.
Corollary 2.1.Let {δ i } i∈M be an arbitrary sequence of integers in {−1, 1} and define F c as Similarly, extend the definition of the sets F+ c and F− c to: respectively and, for each α ∈ P + (M), define ∂ α F c as before.If F c satisfies the assumptions of Section 2.2, then the below statement holds asymptotically: Example 2.2.Suppose, given two target functions, µ 1 and µ 2 , interest lay in the logical disjunction of 'µ 1 ≥ c or µ 2 ≥ c' (i.e."Where did at least one of the variables of interest exceed the threshold?").The region of space over which this statement holds is given by , and shall be denoted G c .To generate confidence regions for G c , we first assume that G c satisfies Assumptions 2.2.1-2.2.3.By Assumptions 2.2.2 and 2.2.3 and De Morgan's law, it can be seen that It must be noted that Assumptions 2.2.2 and 2.2.3 are necessary for this statement to hold due to the fact that (A i c ) −1 = (A i c ) c and not (A i c ) c .By employing Corollary 2.1, for a fixed value of α, confidence regions, F− c and F+ c , may be obtained which satisfy the following; By taking the closure of each of the sets in the above probability, and again via careful consideration of Assumptions 2.2.2 and 2.2.3, the below is obtained; In other words, the sets ( F− c ) −1 and ( F+ c ) −1 serve as confidence regions for disjunction inference on µ 1 and µ 2 .Note that, as in the previous example, great attention must paid to the signs which appear in front of the variables {G i } i∈M in the definition of H.As this example began by generating confidence regions for (A 1 c ) −1 ∩(A 2 c ) −1 , it can be seen that the variables G 1 and G 2 must be appended with negative signs (i.e. when applying the result of Corollary 2.1, both δ 1 and δ 2 were set to −1).This example may be expanded upon to obtain similar results for larger values of M , as shown by Corollary 2.2.
3 Spatial Conjunction Inference for the Linear Model In this section, we build upon the work of Sommerfeld et al. [2018] and Bowring et al. [2019], in which methods for generating confidence regions were presented for a single target function, µ(s), derived from a linear regression model.In Section 3.1 we formally describe the spatially-varying Linear Model (LM) for M study conditions and, in Section 3.2, we explain how the wild t-bootstrap may be applied to estimate quantiles of the variable H.Further implementation details, for when data are sampled from a discrete lattice rather than across a continuous space, alongside pseudocode, are provided in Supplementary Results Section S2.This section focuses solely on conjunction inference.However, the methods described here may equally be applied to perform inference on other logical statements via the use of the corollaries and examples presented in Section 2.3.

Model Specification
For each i ∈ M (i.e. for each study condition), we assume that the data follow a spatiallyvarying Linear Model (LM) defined at location s ∈ S as: The known quantities in the model are; the (n×1) vector of responses, Y i (s), and the (n×p) design matrix, X i .The unknown model parameters are; the (p × 1) vector of regression coefficients, β i (s), and the (n × n) random error covariance matrix, Σ i (s).For each spatial location, s ∈ S, the Generalized Least Squares (GLS) estimator of the parameter vector β i (s), βi (s), is given by: In this context, under the i th study condition, at each spatial location s, our interest lies in assessing whether linear relationships hold between the elements of the parameter vector β i (s).Such relationships may be expressed using a contrast vector L i of dimension (p × 1).
In the notation of the previous sections, the target and estimator functions for the spatiallyvarying LM are given as µ i (s) = L i β i (s) and μi n (s) = L i βi (s) respectively, and τ −1 n σ i (s) is the contrast standard error, given as: Whilst in the above notation we have presented { βi (s)} i∈M as being estimated separately using M different models, we note that nothing in our theory prevents the same model being used across all M study conditions.For example, it is possible for the model matrices, {Y i (s)} i∈M and {X i } i∈M , and parameter vector, {β i (s)} i∈M , to be equal for all i ∈ M, with the only distinction between study conditions being represented by the contrast vector L i .This observation is noteworthy as, in many applications, it is common for researchers to compile all study conditions into a single LM to obtain a heightened statistical power.
To apply Theorem 2.1 to the above LM, we now assume that Assumptions 2.2.1-2.2.3 hold.Such assumptions are commonly satisfied by standard conditions placed on the continuity and boundedness of the increments and moments of {β i (s)} s∈S and { i (s)} s∈S (see Sommerfeld et al. [2018] for further detail).We note here that the covariance matrix, Σ i (s), is usually unknown in practice, meaning the function τ −1 n σ i (s) cannot be calculated.As demonstrated in Sommerfeld et al. [2018], however, Σ i (s) can be replaced by any consistent estimator Σi (s) and the assumptions given in Section 2.2 are still satisfied.Such estimation is common in the statistics literature and is typically achieved by assuming that some fixed correlation structure applies to Σ i (s) (e.g.diagonal independence, auto-regressive, etc.) so that the number of independent variance parameters which must be estimated is small.

The Wild t−bootstrap
In practice, the distribution of H is unknown and must be estimated.In this section, for the model specification described in Section 3.1, we employ a wild t−bootstrap resampling scheme to obtain the quantile, a, of H which satisfies P[H ≤ a] = 1 − α.
To do so, we first define the (n × 1)-dimensional residual vector, R i , for the LM of the i th study condition (i ∈ M) as: The wild t−bootstrap proceeds by, in each bootstrap instance, generating n i.i.d.Rademacher random variables, {r 1 , ..., r n }, (random variables which take the values −1 and +1 with equal probability) independently of the data.For the i th study condition, at spatial location s, a new bootstrap sample is then obtained by multiplying the elements of the residual vector by the Rademacher variables (i.e. the new sample is given by {r 1 R i 1 (s), ..., r n R i n (s)}).Once the boostrap sample has been generated, the standard deviation of the bootstrap sample, σ * ,i (s), is calculated.A bootstrap instance of the variable G i , Gi , is now generated as follows; .
A bootstrap instance of the variable H, H, may then be computed using the bootstrap variables { Gi (s)} s∈S,i∈M as follows: Quantiles of the distribution of H may now be estimated empirically from the observed sampling distribution of H. Discussion of how the above supremum, taken across continuous space, is evaluated in practice is deferred to Supplementary Results Section S2.1.
However, we do briefly note here that, as the location of the true boundary segment ∂ α F c is in practice unknown, the boundary segment ∂ α F c has been replaced by an estimate ∂ α Fc .
It should be noted that, as a result of this substitution, our proposed method may not be applied when Fc is empty.
Several strong references exist that argue and verify the correctness of using the wild t−bootstrap for estimating quantiles of maxima distributions in the above manner.Of particular note are the works of Chang et al. [2017] and Bowring et al. [2019], in which the wild t−bootstrap is proposed for the estimation of Gaussian maxima distributions and the generation of confidence regions, respectively.Discussion of the wild t−bootstrap in an imaging context may also be found in Telschow and Schwartzman [2021].Other significant references which serve as precursors to this work include Chernozhukov et al. [2013] and Sommerfeld et al. [2018], in which the Gaussian multiplier bootstrap was investigated for estimating Gaussian maxima distributions and generating confidence regions, respectively.
More general introductions to such bootstrapping procedures may be found in Wu [1986], Efron and Tibshirani [1994] and Hesterberg [2014].
In the form that it has been presented here, the wild t−bootstrap requires that {G i } i∈M be symmetric.Further, as noted above, the wild t−bootstrap has been extensively verified for estimating maxima distributions only in settings in which {G i } i∈M are Gaussian.Here we stress that, whilst the theory presented in Section 2 makes no assumptions on the distribution of {G i } i∈M , the bootstrap theory presented throughout Section 3 requires {G i } i∈M to be Gaussian.In order to utilise the results of Section 2 when the random variables {G i } i∈M are not Gaussian, alternative methods must be employed for estimating the quantiles of H.

Simulation Settings
To assess the correctness and performance of the method, a range of simulations were conducted using synthetic data.Each simulation was designed to investigate how the empirical coverage (i.e. the proportion of simulation instances in which the inclusion F+ c ⊆ F c ⊆ F− c held) was affected by various secondary factors of practical interest.In this section, we describe three simulations designed to investigate how the methods' performance was influenced by; (1) the degree of overlap between excursion sets, (2) the presence of correlation between the noise fields for different study conditions, and (3) the magnitude of the spatial rate of change of the signal.In all simulations, for i ∈ M, the model used to generate the synthetic data took the form; where n was allowed to vary from 40 to 500 (in increments of 20).The aim, across all simulations, was to assess the performance of the method when employed for conjunction inference on the true signal {µ i } i∈M (i.e. when asked the question "Where do all of the {µ i } i∈M exceed a predefined threshold?").
The mechanism used to generate the true signal, {µ i } i∈M , varied across simulations.
In Simulations 1 and 2, {µ i } i∈M were generated using two binary images of circles and squares, respectively, positioned in a 'Venn diagram' arrangement.Each binary image was scaled by a predefined amount and then smoothed using an isotropic Gaussian filter with a Full-Width Half Maximum (FWHM) of 5 pixels.In Simulation 3, µ 1 and µ 2 were simulated as a horizontal and vertical linear ramp, respectively, with predefined gradients.
Each simulation was performed twice, once using noisier 'low-Signal-to-Noise Ratio (SNR)' synthetic data and again using less noisy 'high-SNR' synthetic data.To generate the high-SNR synthetic data for Simulations 1 and 2, all simulated {µ i } i∈M were scaled by a factor of 3 prior to smoothing.This data generation process is identical to that employed in Bowring et al. [2019], in which it is noted that synthetic data generated using these parameters strongly resembles that of an fMRI analysis when voxels are of dimension 2mm 3 .In Simulation 3, for the high-SNR data, the linear ramps were initially simulated with a gradient of 8 per 50 pixels.Across all simulations, the low-SNR data was generated by reducing the signal magnitude of the high-SNR data by a factor of 4. In all simulations, thresholds of c = 1/2 and c = 2 were employed for the low-SNR data and high-SNR data, respectively.
As Simulation 1 aimed to investigate the method's performance as the overlap between excursion sets increased, in this simulation, the distance between the centre of the circles was varied, ranging from 0 pixels (full overlap) to 50 pixels (barely overlapping) in increments of 2 pixels.Similarly, as Simulation 2 aimed to assess the effect of correlation between 1 and 2 , in this simulation, this correlation was varied between −1 and 1 in increments of 0.1 (in all other simulations, the noise fields were generated independently).As Simulation 3 served to assess how sensitive the empirical coverage was to the rate of change of µ 1 and µ 2 over space, in this simulation, the initial ramp gradients were multiplied by a factor of k, where k was varied from 0.25 to 1.75 in 0.05 increments.
All simulation results were obtained as averages taken across 2500 simulation instances and compared to the nominal coverage using 95% binomial confidence intervals.In all simulation instances, the image dimensions were (100 × 100) pixels, confidence regions were computed for a tolerance level of α = 0.05 using 5000 bootstrap realizations, σ i (s) was computed using the ordinary least squares estimator and τ n was computed as τ n = n −0.5 .In each simulation instance, the assessment of whether the inclusion statement, Supplementary Results Section S5).However, in all cases, the degree of agreement between the empirical and nominal coverage improved as the sample size increased.Furthermore, no such over-coverage was observed for the equivalent high-SNR simulations.This obser-vation matches expectation, as Theorem 2.1 holds only asymptotically, and is supported by the previous literature on confidence regions, in which similar results were observed for the single-'study condition' setting (c.f.Sommerfeld et al. [2018], Bowring et al. [2019]).
The presence of strong negative correlation between the study conditions and the rate at which the signal varied across space both also appeared to influence the methods' performance (c.f.Fig 3, Simulations 2 and 3).In both instances, it is likely that the observed over-coverage is due to difficulties in estimating the true location of the boundary ∂F c , as each of these factors make it harder to identify the locations at which small changes in min i∈M µ i occur.When high-SNR data was used in the place of low-SNR data, both of these factors had a substantially reduced impact on the empirical coverage.
Despite the above observations, the simulation results overwhelmingly supported the claim that the proposed method is robust to a range of secondary factors.Included in the list of such factors are; variation in the shape and size of F c , spatial correlation in the noise, variation in the lengths of individual boundary segments, between-'study condition' correlation, realistic levels of variation in the magnitude of the noise, large numbers of study conditions and situations in which boundary segments are shared by many excursion sets.For further detail, see Supplementary Results Sections S5 and S6.
In terms of time efficiency, the wild t−bootstrap provided extremely fast computational performance.For instance, using an Intel(R) Xeon(R) Gold 6126 2.60GHz processor with 16GB RAM, and averaging over the 2500 simulation instances conducted for n = 500 high-SNR observations, in Simulation 1, the time taken to generate confidence regions for a circle separation of 20 pixels was 5.98 seconds.For a comprehensive summary of computation times, see Supplementary Results Section S7.

Real Data Application
To demonstrate the method in practice, we apply it to fMRI data drawn from the Human Connectome Project (HCP) dataset (Van Essen et al. [2013]).In this example, task fMRI data were collected as part of a block design from 80 healthy, unrelated, young adults as part of a working memory task that had four distinct components, each using a different stimuli type: pictures of places, tools, faces and body parts.In Supplementary Results Section S3, we provide a brief overview of the imaging acquisition protocol, task paradigm, preprocessing stages and first-level analysis employed for generating this dataset (for exhaustive detail, see Barch et al. [2013] and Glasser et al. [2013]).Following first-level analysis, the data consisted of 4 images for each of the 80 subjects, measuring the %BOLD response to each of the 4 stimuli types.In this example, our interest lies in identifying, at the group-level, which regions of the brain are associated with working memory regardless of stimuli type (i.e."Which regions are active in response to all four working memory stimuli types?").
As this work has predominantly focused on two-dimensional excursion sets, a single slice of the brain was chosen for analysis (z = 46mm, covering portions of the frontal gyrus involved in working memory).For each stimuli type, an n = 80 group-level linear model was constructed.Using the proposed method, confidence regions were then generated at the 5% confidence level to assess where the group-level percentage BOLD change exceeded 1% for all four stimuli types.To compare the proposed method to standard fMRI inference procedures, a group-level contrast was also generated using the Big Linear Model toolbox for each of the four stimuli types.In addition, single-'study condition' confidence regions were also generated using the methods of Sommerfeld et al. [2018] for each of the four stimuli types.
The results are shown in Fig. 4. The conjunction inference identified several localized regions, including the Superior Frontal Gyrus, which is well known for its involvement in working memory (c.f.Boisgueheneuc et al. [2006], Vogel et al. [2016], Alagapan et al. [2019]), and the Angular Gyri, which are well known to be involved in a range of tasks including memory retrieval, attention and spatial cognition (c.f.Seghier [2013], Bréchet et al. [2018]).Whilst the confidence regions for conjunction inference illustrated in Fig. 4 exhibit a clear resemblance to the four contrast images, we note that the red set, F+ c , is very small in comparison to the blue set, F− c , which is large and diffuse.This observation conveys important information about the spatial variability of the regions identified.In particular, it can be seen that, in the regions immediately surrounding the Superior Frontal Gyrus and Angular Gyri, there is a much higher resemblance between the blue ( F− c ) and yellow ( Fc ) sets than is seen across the rest of the brain.This resemblance provides an insight into the degree of spatial variation present surrounding these regions.In this case, the strength and localization of this resemblance suggest that the spatial variability of Fc is less severe in and around the aforementioned anatomical regions than it is across the rest of the brain.It may be concluded that the estimated yellow 'blobs' corresponding to the Superior Frontal Gyrus and the Angular Gyri have been more reliably localized than the other yellow 'blobs' appearing in the image.
In general, the single-'study condition' confidence regions can be seen to be much larger than those observed for the conjunction inference.This observation matches expectation as the overlap of the four excursion sets is smaller than each set individually.In addition, in this example, the conjunction method employed the entire experimental design for analysis (as opposed to the single-'study condition' method which employed only the data concerning the stimuli of interest) and is therefore expected to have a higher statistical power and, thus, tighter confidence bounds.

Conclusion and Discussion
In this work, we have produced a method for generating confidence regions for intersections and unions of multiple excursion sets.The confidence regions generated by the method generalise the notion of confidence intervals to arbitrary spatial dimensions and possess the usual frequentist interpretation that, were the 0.95 procedure repeated on numerous samples, the proportion of confidence regions, F+ c and F− c , that correctly enclose F c would tend to 0.95.Such confidence regions serve as an indicator of the reliability of Fc as an estimate of F c .Confidence regions which closely resemble the set Fc may be interpreted as signifying that Fc is a reliable estimate of F c , whilst little resemblance may suggest that there is a high degree of spatial variability present in the data and that the estimated shape, size and locale of Fc are not particularly reliable.Such statements are of particular value in imaging applications where the aim of a statistical analysis is typically to assess how reliably some form of activity may be localized to a particular spatial region.
We stress here that, whilst Fc is equal to the intersection of { Âi c } i∈M , the confidence regions F± c are not the intersections of the single-'study condition' confidence regions { Â±,i c } i∈M which would be obtained using the methods of Sommerfeld et al. [2018].This can be seen by noting that { Â±,i c } i∈M are defined using separate thresholds from different bootstraps and can be heavily influenced by the behavior of {∂A i c } i∈M in spatial regions far from F c .In fact, treating the naive intersection of the (1 − α) confidence regions { Â±,i c } i∈M as confidence regions for F c is not a valid method in general and can result in undesirable asymptotic coverage lying anywhere inside the range [1 − M α, 1].This claim is further discussed in Supplementary Theory Section S4.To our knowledge, the only valid approach for generating confidence regions for conjunction inference is that outlined in this work.
One potential limitation of the specific methods of Section 3 is that they allow for confidence regions to be generated only when the target functions, {µ i } i∈M , are linear combinations of regression parameter estimates.As noted by Bowring et al. [2021], it is often preferable for an analysis to consider standardized effect sizes, such as Cohen's d (Cohen [2013]) or Hedges' G (Hedges [1981]), instead of regression parameter estimates, as standardized effect sizes allow for information about statistical power to be incorporated into the analysis.To this end, Bowring et al. [2021] outlined an approach for generating confidence regions (in the single-'study condition' setting) for images of Cohen's d effect sizes.Here, we suggest that a potential avenue for future work is to combine the methods of Bowring et al. [2021] and those proposed in this work to allow for conjunction, or disjunction, inference to be performed on standardized effect sizes rather than regression parameter estimates.
Another limitation noted here is that, whilst our proposed method theoretically works for data of arbitrary dimensions, the simulations of Section 4 verify the performance of our method only for two-dimensional data.Whilst it is typical for many practical applications to focus on two-dimensional data (e.g.climate maps, surface images), higher-dimensional datasets are abundant in a wealth of disciplines (e.g.brain images, astrological maps).
For this reason, we intend to investigate further the performance of the method in higher dimensions in future work.

Figure 1 :
Figure 1: Illustration of the set F c for a setting in which M = 2. Shown in (a) are two spatially-varying target functions, µ 1 (s) (blue) and µ 2 (s) (green), overlaid and thresholded at the level c.In (b), the regions at which µ 1 (s) ≥ c and µ 2 (s) ≥ c are displayed as red and yellow circles, respectively, with the intersection set, F c , illustrated in purple.In (c), potential confidence regions, F+ c (grey) and F− c (orange), for F c are illustrated.

Figure 2 :
Figure 2: Annotated images of 'overlapping' excursion sets (top) with corresponding illustrations of boundary partitions (bottom) for settings in which the number of study conditions, M , is equal to 2 (left) or 3 (right).
(a) Every open ball around a point on the boundary ∂F c has a non-empty intersection with (F c ) • .In addition, for all α ∈ P + (M) every open ball around a point in ∂ α F c has a non-empty intersection with the set: Fig. 2 satisfy Assumptions 2.2.2 and 2.2.3 and highlight several features worthy of note.Firstly, we do not assume ∂F c is a C 1 curve, but rather piece-wise C 1 (c.f.examples (d) and (e)).Secondly, the assumptions allow for the possibility that the excursion sets could be nested within one another (c.f.examples (c) and (f )).Thirdly, the assumptions permit the excursion sets to share common boundaries (see examples (b), (

,
Corollary 2.2.Let {δ i } i∈M be an arbitrary sequence of integers in {−1, 1} and define G c as G c = i∈M (A i c ) δ i .Define the sets Ĝ+ c and Ĝ− c as: respectively and, for each α ∈ P + (M), define ∂ α G c analogously to ∂ α F c in the previous sections.If G c satisfies the assumptions of Section 2.2, then the below statement holds asymptotically:

Figure 3 :
Figure 3: Simulation results generated using the low-SNR synthetic data.Nominal coverage is shown as a dashed line, with a corresponding binomial confidence interval also shaded.All data points are averages taken across 2500 simulation instances.

Figure 4 :
Figure 4: %BOLD images (top) and 95% confidence regions (bottom left) for the HCP working memory task for each of the four visual stimuli types, alongside conjunction confidence regions assessing the overlap of the four thresholded %BOLD images (bottom right).Displayed is axial slice z = 46mm.The thresholds employed were c = 1% BOLD change, for all images, and α = 0.05, for the confidence regions.Upper and lower confidence regions are displayed in red and blue, respectively.Across the bottom row, the yellow sets are the point estimates Â1 c , ... Â4c and Fc respectively.The red conjunction set has localized regions in the Superior Frontal Gyrus (top) and the Angular Gyri (left/right), for which we can assert with 95% confidence that, for all four study conditions, there was (at least) a 1.0% change in BOLD response.