Hydrodynamical simulations of merging galaxy clusters: giant dark matter particle colliders, powered by gravity

Terrestrial particle accelerators collide charged particles, then watch the trajectory of outgoing debris - but they cannot manipulate dark matter. Fortunately, dark matter is the main component of galaxy clusters, which are continuously pulled together by gravity. We show that galaxy cluster mergers can be exploited as enormous, natural dark matter colliders. We analyse hydrodynamical simulations of a universe containing self-interacting dark matter (SIDM) in which all particles interact via gravity, and dark matter particles can also scatter off each other via a massive mediator. During cluster collisions, SIDM spreads out and lags behind cluster member galaxies. Individual systems can have quirky dynamics that makes them difficult to interpret. Statistically, however, we find that the mean or median of dark matter's spatial offset in many collisions can be robustly modelled, and is independent of our viewing angle and halo mass even in collisions between unequal-mass systems. If the SIDM cross-section were sigma/m = 0.1cm^2/g = 0.18 barn/GeV, the 'bulleticity' lag would be ~5 percent that of gas due to ram pressure, and could be detected at 95 percent confidence in weak lensing observations of ~100 well-chosen clusters.


INTRODUCTION
Galaxy clusters grow by merging with each other.During a merger, their three major constituents behave differently.Galaxies are pointlike on this scale and act as collisionless test particles affected only by gravity.Diffuse gas experiences ram pressure, so is decelerated and disassociated from the galaxies.Dark matter (DM) follows a trajectory determined by whichever fundamental forces act on it.If DM interacts only via gravity, it should remain with the cluster galaxies.However, if it has a non-zero cross-section for collision with other DM particles, this self-interacting DM (SIDM) can also lag behind the galaxies (Clowe et al. 2006;Robertson et al. 2017a).Observationally, galaxies are visible in (a smoothed map of their) optical emission, while the diffuse gas is visible in X-ray emission or via the Sunyaev-Zel'dovich (1970) effect.The DM can be mapped via gravitational lensing.
The Bullet Cluster (1E0657-558) is the best known example of colliding clusters.The 'Bullet' refers to the smaller cluster, which has passed through and is now moving away from the main cluster.Early weak lensing measurements of the offset between its galaxies and DM implied a self-interaction cross-section per unit mass ★ E-mail: ellen.sirks@sydney.edu.au/ < 5 cm 2 g −1 , when calibrated against an analytic model of SIDM dynamics (Markevitch et al. 2004).This was improved by Randall et al. (2008) who ran SIDM-only simulations of Bullet Cluster-like systems.Combined with higher precision strong lensing measurements, Bradač et al. (2008) found / < 1.25 cm 2 g −1 .Including ordinary matter in simulations of SIDM reduces its effect, by steepening the gravitational potential well at the cluster core (e.g.Mastromarino et al. 2023), and fully hydrodynamic simulations of the Bullet Cluster relaxed the constraint to / < 2 cm 2 g −1 (Robertson et al. 2017a).
Particle colliders do not stop collecting data after one event.Astrophysical constraints should improve with statistical measurements from a large sample of merging clusters (Massey et al. 2011).Furthermore, because the average velocity  of DM particles increases with halo mass, measurements of collisions between galaxy clusters, galaxy groups, or individual galaxies could also characterise any velocity-dependence of the interaction: this would constrain the mass of the force mediator particle (Adhikari et al. 2022).In a first attempt at this measurement, Harvey et al. (2015) adopted a strategy of analysing as many cluster mergers as possible, all with equal weight.When calibrated against an analytical model, the 30 mergers in the Hubble Space Telescope (HST) archive at the time yielded a constraint on the cross-section of / < 0.47 cm 2 g −1 at  ∼ 1000 km s −1 .In the future, this strategy can be easily extended to all-sky, monochromatic lensing surveys like Euclid.With additional telescope time, Wittman et al. (2018) showed that multi-colour imaging can be used to reduce noise by better identifying components of clusters that entered a merger together.Some systems give anomalously high measurements; some anomalously low.Simultaneously re-weighting to account for the fact that some have more statistical power than others, the constraint changed to / < 2 cm 2 g −1 .
To calibrate future observations, this paper uses hydrodynamical simulations of galaxy clusters with both SIDM and ordinary matter, in a cosmologically expanding volume.We study merging clusters in simulated universes with different DM interaction strengths (always with a massive mediator particle; Fischer et al. 2022), and test whether an observable offset between DM and stars could indeed be used to measure the interaction cross-section between DM particles.
The value of the SIDM cross-section is unconstrained across many orders of magnitude (Kusenko & Steinhardt 2001;Duffy & van Bibber 2009;Loeb & Weiner 2011;Kamada et al. 2020).A 'natural' scale for models invoking a dark-sector analogue of the strong force (e.g.Mohapatra et al. 2002;Foot 2014;Hochberg et al. 2015) is the same order of magnitude as nuclear interactions, / ∼ 0.6 cm 2 g −1 = 1 barn GeV −1 .A particularly meaningful and potentially achievable goal is to test whether / is significantly more or less than 0.1 cm 2 g −1 .If it is greater than this, DM particles that scatter off of each other are gradually ejected from dense regions of the Universe, reducing the density in the centres of haloes and slowing their gravitational collapse (Peter et al. 2013;Vogelsberger et al. 2016;Tulin & Yu 2018).From the perspective of Occam's razor, this effect would then be sufficient to solve the seemingly unrelated 'small-scale crisis' in the standard model of cosmology -that simulations produce too much substructure that is too dense (Zavala et al. 2013;Vogelsberger et al. 2014;Elbert et al. 2015).The long time that it takes for scattering to fully erode a DM cusp would also provide a natural mechanism (Creasey et al. 2017) to explain the observed diversity of DM density profiles (Oman et al. 2015;Oldham & Auger 2018).For full reviews of SIDM, see Adhikari et al. (2022) or Tulin & Yu (2018).
This paper is organised as follows: we present our suite of cosmological simulations in Section 2, and our methods for locating different types of matter in Section 3. We present results in Section 4, including prospects for future observations.We summarise and conclude in Section 5.

The BAHAMAS simulations
We use the BAHAMAS suite of cosmological simulations (McCarthy et al. 2017).These use a modified version of the GADGET-3 code to model DM and baryonic physics including radiative cooling, star formation, chemical evolution, and stellar and AGN feedback.Each simulation volume is a periodic box, 400 ℎ −1 Mpc on a side.This contains 2×1024 3 particles, with DM particles of mass  DM = 5.5× 10 9 M ⊙ , gas particles initially of mass 1.1×10 9 M ⊙ , and gravitational softening length ℎ grav that is fixed in comoving coordinates at  > 3 then constant at ℎ grav = 5.7 physical kpc thereafter.They assume the WMAP 9-year cosmology (Ω m = 0.2793, Ω b = 0.0463, Ω Λ = 0.7207,  8 = 0.812,   = 0.972 and ℎ = 0.700; Hinshaw et al. 2013).

Colliding cluster sample selection
In each simulation volume we identify the 300 most massive clusters in the simulation snapshot at redshift  = 0, then select those with one or more subhaloes of ⩾ 5 per cent the total mass  cl within 4 Mpc.This yields ∼100 clusters and ∼135 subhaloes in each simulation (Table 1).To investigate the effects of DM self-interactions on both scales, we shall analyse both the main cluster haloes and subhaloes.
A similarly inclusive selection strategy could be employed by a future analysis of all-sky surveys.Without having selected simulated systems based on their dynamics, we find relatively small separations between their components of matter.Denoting the 3D distance between galaxies ('stars') and gas as  SG , our sample has mean component separation ⟨ SG ⟩ = 33±2 kpc, with rms scatter 49 kpc (which will later be needed as variance ⟨ 2 SG ⟩ = (2.4 ± 0.3) × 10 3 kpc 2 ).We also find that more subhaloes have yet to reach first pericentre within their host cluster than have passed it.
Previous studies with finite telescope time have instead preferentially observed systems with high  SG (because these turn out to be most sensitive to DM interactions: see Section 3.3).For example, Harvey et al. (2015) selected clusters with bimodal distributions of X-ray emission and the largest separations between galaxies and gas that fitted within the field of view of the Hubble Space Telescope (HST).Their observed sample had 2D separations with mean ⟨ SG ⟩ = 83 ± 9 kpc with rms 114 kpc or variance ⟨ 2 SG ⟩ = (12.9± 3.0) × 10 3 kpc 2 (alternatively, at mean redshift ⟨⟩ = 0.4, ⟨ SG ⟩ = 17.7 ± 2.1 arcsec with rms 25 arcsec or variance ⟨ 2 SG ⟩ = 632 ± 157 arcsec 2 ).Future studies adopting a similar selection strategy should easily be able to maintain or increase these values, because new clusters with large separations between galaxies and gas continue to be found in X-ray or SZ surveys (Kubo et al. 2009;Okabe et al. 2010;Tempel et al. 2017;Haines et al. 2018;Zenteno et al. 2020;Fu et al. 2024).Notably, this includes nearby clusters whose separations appear huge on the sky but whose gravitational lensing signal is spread over an area too large to be observed easily by HST (e.g. McCleary et al. 2020).Nearby clusters are particularly promising for our test.Unlike weak lensing measurements of DM mass (where signal-to-noise follows lensing sensitivity in peaking at redshift ∼0.3),weak lensing measurements of DM position (in kpc) are optimal at ∼0.05 so long as the telescope has a sufficiently large field of view to capture the broader (in arcminutes) shear field (Kubo et al. 2007;Massey et al. 2011).

METHOD
It is possible to quantify the offset between a cluster's different components using their position (Massey et al. 2011) or quadrupole moments (McDonald et al. 2022).We choose the former, and first need a definition of position.In observational studies, the methods to find the positions of the gas, galaxies and DM all differ.In simulations, we can split the particles by type and access their distributions directly.

Measuring the location of components of matter
We use the shrinking-spheres method to determine the 3D location of each (star, gas, DM) component (see e.g.Power et al. 2003).A first sphere is constructed at the centre of potential returned by subfind (Springel et al. 2001;Dolag et al. 2009), with initial radius 0.35 200 for each halo or its equivalent for each subhalo1 .The centre is then moved to the centre of mass of particles of a given type within the current sphere, and the radius is shrunk by a factor  = 0.9.This process is repeated until the sphere would contain fewer than 100 particles of that type.The recorded position is the centre of mass of all particles of that type within the final sphere.
The shrinking-spheres method occasionally fails, by meandering to an incorrect local peak.Failures happen most frequently for gas particles, and more often in main haloes than subhaloes.The effect creates what appears to be either a mismatch between stellar and gas clumps that were not together at the start of infall, or the misidentificiation of a centroid analogous to that found by George et al. (2012) for real observations of stellar light.Such failures lead to a 3D separation between stars and gas much larger than the typical values (of order 50 kpc), and we mitigate them by excluding from all further analysis the ∼1 halo per simulation box for which we measure  SG > 250 kpc.The precise value of this cut is fairly arbitrary and does not affect our results.
In observational studies, only 2D projected positions can be measured.We project the measured 3D positions along ,  and -axes by discarding one coordinate in turn, then record three independent configurations for each system.If we instead use a shrinking-circles measurement as specified by Robertson et al. (2017a), we find results with consistent mean and uncertainty, but which move around within the full extent of 68 per cent error bars.This suggests that the two measurements are effectively independent, with noise that is dominated by chance projection of substructures.We shall carry out analyses in both 3D and different implementations of 2D, but no such choices affect our final conclusions.

Measuring spatial offsets between components of matter
Consider a triangle (Fig. 1) with vertices at the centre of mass of stellar matter (S for 'stars'), gas (G), and DM (D).The vector from the gas to the stars, − SG , defines the system's expected direction During a collision between galaxy clusters, the galaxies (S for 'stars'), gas (G), and DM (D) can become separated.Ram pressure on the cluster's gas means that the vector from a cluster's gas to its stars approximately indicates its direction of motion.We define the positive-definite length of this vector  SG .To test whether pressure also acts on DM, we measure the distance from the DM to the stars, in components parallel to the direction of motion,  SI , and perpendicular to it,  DI . of motion, and the 'base' of the triangle.Whether the locations are defined in 3D or 2D, the spatial offset from the stars to the gas is merely the length of this vector, which is positive-definite and has measurement uncertainty due to uncertainty  S and  G in the locations of stellar and gaseous material along a coordinate direction (we assume this to be isotropic).
The location of DM is offset from the location of stars, with components in the direction of motion and perpendicular to it where (I) is the point at which a perpendicular from (D) passes through the base of the triangle.Both equation ( 4) and ( 5) can be either positive or negative.In 2D, we take the sign of  DI to be the sign of the cross-product of  SG and  SD (numerator in equation ( 5)), resulting in a positive  DI in Fig. 1.In 3D, we use the sign of the cross-product of  SG and  S , dotted with  SD .Through standard error propagation, these offsets have measurement uncertainty and It is informative to calculate the fractional offset of DM, or 'bulleticity' This dimensionless ratio has two advantages.First, even though we can observe only the projection of a 3D offset onto the plane of the sky, this quantity is independent of the 3D orientation.Second, an approximate, analytic model of SIDM dynamics suggests that although offsets of each component gradually increase after a merger (see fig. 6 of Robertson et al. 2017a), the ratio  ∥ should be constant for all merger configurations, at all times during the merger.This implies that measurements of  ∥ from different systems can be averaged (Harvey et al. 2015, we shall discuss this model in more detail in Section 4).Measurement noise for individual systems (which may produce  ∥ < 0 or even  ∥ > 1) propagates to uncertainty on  ∥ of where our equation (9) recovers equation (1) of Wittman et al. (2018).
As a control test, we also renormalise the perpendicular offset of DM, which should be consistent with zero on average, if the Universe does not have a handedness (and in the absence of systematics).
Measurement uncertainty propagates into uncertainty on  ⊥ of

Combining measurements from many collisions
If  ∥ is universal, it should be possible to measure and interpret the average value ⟨ ∥ ⟩ from a large number of  halo merging haloes.
Assuming that a given survey will have approximately constant measurement uncertainty  D and  S , the standard error on the mean of equation ( 8) is Some merging systems have more discriminating power than others (Wittman et al. 2018).A measurement of  ∥ is a calibration of the location of DM, some distance  SI along a ruler of length  SG .If measurement precision is constant, systems with a long ruler offer high dynamic range and high signal-to-noise.Measurement precision is roughly constant in terms of angle on the sky, so systems with the longest rulers are those near the viewer, those with a collision aligned in the plane of the sky, and those with timing such that the separation is maximised.When analysing observations, Wittman et al. (2018) found it helpful to average systems using an inverse-variance weight which is obtained from equation ( 14) while ignoring terms O ( 2 ∥ ) both because they are small, and to avoid biasing the measurement of  ∥ itself.The error on the weighted mean then becomes In this paper, we shall use neither means nor weighted means.We shall instead quote measurements of  ∥ using a median, and '1like' uncertainties (separation between the 16 th and 84 th percentiles of the distribution).The median is similar to the mean within scatter, but in simulations rather than observations, our automated methods produce unstable scatter in the mean -and even more scatter in the weighted mean.A small fraction of the time this is because the shrinking spheres method failed catastrophically (see Section 3.1); it is difficult to exclude failed measurements via cuts on  SG , because those cuts would exclude haloes with the highest signal to noise -and the weight becomes strongly dominated by the system with the largest  SG that survives the cut.For example, a single subhalo with large  SG in our sample increases the weighted mean by a factor of nearly 10 compared to the value if it is excluded.A weighted mean might be more suitable when clean measurements are available, i.e. when random statistical errors dominate over systematic errors.Curiously, we find that our results are stable when using weight   ∝  SG,i (see Appendix A).This is not motivated mathematically, but empirically we find that it would be worth investigating in the future.

RESULTS
Measurements of  ∥ from simulated merging clusters follow an approximately Gaussian distribution with an asymmetric tail to positive values (see Fig. 2, and Appendix B for measurements of individual components), aside from the outliers discussed above.Measurements of  ⊥ are similar but without the tail.Distributions of  ∥ are remarkably consistent between subhaloes (leftmost two columns in Fig. 2) and main haloes (rightmost two columns), and almost indistinguishable whether measured in 2D (first and third columns) or 3D (second and fourth columns).We fit a Gaussian perturbed with skewness  and kurtosis  (the sinh-arcsinh normal distribution, Jones & Pewsey 2009)

Scatter of DM offsets
We first recover the result noticed by Kim et al. (2017) and Harvey et al. (2017Harvey et al. ( , 2019) ) that scatter in measurements of  ∥ increases with SIDM cross-section / (see Fig. 3; the same is true for  ⊥ ).This is likely due to the collision giving an impulse to the BCG that sets it oscillating.Since it oscillates within a gravitational potential that is dominated by DM, and SIDM clusters have a 'core' of constant density, the BCG is less tightly bound and its oscillations have a larger amplitude.The observed scatter in offsets samples random phases of oscillation at the moment when it is measured.We discover that the scatter of  ∥ and  ⊥ in subhaloes also increases with SIDM cross-section /, although less than that in main haloes.This may be simply because subhaloes are tidally disrupted before their offsets increase to the same extent as main haloes.Scatter in offsets could be used to measure the SIDM cross-section in the real Universe (Harvey et al. 2019).However, there is no null test for CDM, and no control test for systematics -so its interpretation will rely entirely on calibration via simulations.We shall not consider it further in this paper.

Typical DM offset
The median value of  ∥ increases with SIDM cross-section /.Measurements are generally consistent for both main haloes and subhaloes, so we show the median of the combined sample (Fig. 4).Measurements of the perpendicular control test,  ⊥ , are consistent with zero as expected.Importantly, results are virtually indistinguishable whether quantities are calculated in 3D after projection into 2D.
In this sense, the measurement should therefore be as accessible to observations as it is to simulations.
Our measurements are remarkably well fit by the model predicted by Harvey et al. (2014, see their equation 33 and fig.2).This model interpolates between two well-understood extremes.At low , the halo is optically thin, and the effective drag force grows linearly with the interaction cross-section.The constant of proportionality  reflects the relative interaction strengths of DM and gas (or the characteristic cross-section at which a halo becomes optically thick).At high , DM particles at the front of the halo always scatter incoming DM and shield particles behind, so the halo becomes optically thick.The drag then becomes a constant, with parameter  that depends on the geometry of the halo.The model predicts that  ∥ is notably independent of infall velocity, impact parameter, and time.We fit free parameters  and  to our data using the Python function scipy.optimize.minimizeand an asymmetric loss function to account for asymmetric errors bars (Table 2, with best-fitting models overlaid in Fig. 4).Our only measurement not well fit by model ( 18) is that, for subhaloes in CDM (/ = 0) simulations, we measure median  ∥ > 0 at ∼95 per cent confidence level.That we would not expect CDM to be offset from galaxies is particularly important as a null test.With our present sample size, we tentatively ascribe this as a statistical fluke or a limitation of simulation resolution (for  SG ≈ 23 kpc, a value  ∥ = 0.03 represents a measurement of  SI to ∼12 per cent of the BAHAMAS softening length).We believe this because our measure-  ∥ in Fig. 2. Variation between clusters grows with SIDM cross-section for main haloes (red squares and dashed line for 2D, green triangles and dashdotted lined for 3D) and subhaloes (blue circles and solid line for 2D, black diamonds and dotted line for 3D).The errors are the 1 uncertainties returned by the fit to the distributions in Fig. 2. The straight lines plotted here were fitted to the data points using the Python function scipy.optimize.curve_fit.The 3D data points have been slightly offset for clarity.
ments of higher mass CDM main haloes (not shown by themselves) are consistent with zero offset, and Schaller et al. ( 2015)'s measurements of lower mass individual galaxies also show zero offset in CDM simulations with better resolution.Furthermore, the mean (rather than median) value of  ∥ happens to be slightly positive for main haloes and slightly negative for subhaloes.Nonetheless, this should be re-measured in future work.Robertson et al. (2017a) measured no offset in idealised simulations, so it is feasible that this is a new effect particular to fully cosmological simulations, caused by the varied impact parameters, ongoing star formation, or chance projection of substructures (which dominate measurement noise; see Section 3.1).Finally, it is interesting to note that Robertson et al. (2017a) report a small bias when measuring the position of one halo in the outskirts of another, because the second halo contributes a gradient of particles across the shrinking circle.We confirm this, by comparing positions measured (by default) using all particles, with positions measured using only bound particles.We find that positions move by a comparable amount in the same direction, such that the effect on  ∥ is negligible.Using angle brackets to denote medians, we measure in CDM simulations a decrease from ⟨ ∥ ⟩ all 2D = 0.0307 +0.011 −0.005 to ⟨ ∥ ⟩ bound 2D = 0.0302 +0.011 −0.006 , and an increase from ⟨ ∥ ⟩ all 3D = 0.0265 +0.016 −0.007 to ⟨ ∥ ⟩ bound 3D = 0.0273 +0.017 −0.008 .

Selection effects
Our interpretation of the average measurement from many colliding systems presupposes that ⟨ ∥ ⟩ is universal (see the start of Section 3.3).The components of our simulated mergers are typically closer than those in observations (see Section 2.2).Furthermore, the offsets of individual components of matter vary with time since collision (Robertson et al. 2017a).If  ∥ also varies with time, it matters whether clusters are observed before or after pericentric passage.
Unfortunately, the BAHAMAS simulations do not include enough cluster mergers to reach discriminating statistics if we split our sam- Red circles and blue triangles show similar calculations using information available in 3D or that projected onto a 2D sky; for clarity, 3D data points are horizontally offset by a small amount.Error bars show '1-like' uncertainty, between the 16 th and 84 th percentiles.Curves show the best fits of model ( 18), with parameters tabulated in the bottom row of Table 2.  18) to our measurements of median  ∥ , from quantities accessible in either 2D or 3D.Rows show results from the entire sample, just the main haloes, just the substructures, and haloes in 2-body systems, split by the sign of their relative velocity (i.e.whether they are approaching pericentre or receding after it).
[cm 2 g ple, nor enough snapshots to measure them at different times since collision.Indeed, it is not always clear from the single snapshot available whether clusters have already reached pericentre.For systems with only two haloes, splitting by whether the net velocity of their particles is approaching or receding yields unstable measurements of parameter  and much larger uncertainties in parameter  (Table 2).For two-halo systems in SIDM1 simulations anal- ysed in 2D (using angle brackets to denote medians, and propagating 16/84 percentile uncertainties like standard deviations), we find ⟨ ∥ ⟩ recede /⟨ ∥ ⟩ approach = 1.39 +0.62 −0.46 (or 0.39 +0.48 −0.30 split further into main haloes, and 2.84 +3.17 −0.87 for subhaloes).Future investigation of this would be interesting with larger simulations.If it makes a significant difference to predictions of ⟨ ∥ ⟩, the selection of simulated clusters should be matched to selection biases in observational samples.Alternatively, observational samples could be selected carefully, inferring the direction of motion using a combination of optical and X-ray data, or shock fronts in e.g.radio emission.

Future prospects
Measurements of  ∥ are a promising way to test the interaction cross-section of DM.Symmetries are expected to provide a null result  ∥ = 0 in the case of CDM (/ = 0) and a perpendicular test for systematics,  ⊥ .Future surveys may be able to combine measurements of  ∥ and  ⊥ from a large number of observed merging galaxy clusters.
To forecast a future survey's ability to rule out the null hypothesis / = 0, we first consider the expected uncertainty on ⟨ ∥ ⟩  (equation 16), setting / = 0, and bootstrapping the other parameters from observational experience with single-band HST data (Harvey et al. 2015).A large sample of merging clus-Table 3. Assumed characteristics of hypothetical astronomical observations that could be used to measure the offset of DM from ordinary matter, as proposed in this paper.For various telescopes, columns indicate: the density of resolved galaxies behind merging clusters,  gal , the precision with which it is possible to measure the location of stellar material,  S , and DM,  D .The final column indicates the number of clusters,  cluster , that must be observed to potentially rule out the hypothesis / = 0 with 95 per cent confidence.2015)'s sample mean2 of 1/⟨ 2 SG ⟩ = 1.6 × 10 −3 arcsec −2 (as discussed in Section 2.2, but converting the number into angular units), with  halo = 2.3 cluster .The observationally achieved uncertainty on the location of stellar material,  S = 0.6 arcsec, is both subdominant and unlikely to change significantly because it depends on astrophysical effects such as confusion between multiple BCGs (George et al. 2012).Analysis of mock images suggests the achieved uncertainty on the location of DM,  D = 11.4 arcsec, is reduced by approximately 30 per cent by perfect separation between galaxies in front of or behind the cluster using multicolour photometry, and falls proportionally to 1/ √  gal , the density of resolved background galaxies (Harvey et al. 2013).Table 3 collates expectations of  gal for the James Webb Space Telescope (JWST, Gardner et al. 2006;Casey et al. 2023), Super-Pressure Balloon-borne Imaging Telescope (SuperBIT, Romualdez et al. 2016;Shaaban et al. 2022), Euclid (Laureĳs et al. 2011;Euclid Collaboration et al. 2022) and the Vera C. Rubin Observatory (LSST, Chang et al. 2013;Ivezić et al. 2019).

Telescope
For each future survey we calculate the single-tailed 95 per cent confidence limit on using the values in Table 3 and assuming a Gaussian error distribution (which Fig. 3B of Harvey et al. 2015, suggests to be reasonable).We estimate the standard error on the median by multiplying the standard error of the mean by a factor of √︁ /2 (for details see chapter 4 of Maindonald & Braun 2010).We finally convert this to a singletailed 95 per cent confidence limit on / using equation ( 18) with parameter values from Table 2 (all haloes, 2D).These two confidence limits are shown, as a function of the number of observed systems, in Fig. 5.The number of typical merging clusters that must be observed by any telescope to reach the target particle physics sensitivity of 0.1 cm 2 g −1 can be read from the bottom panel of Fig. 5, and is listed in Table 3.Indeed, SuperBIT's design characteristics were optimised to meet this goal as its primary science driver (McCleary et al. 2023).
This forecast is valid for surveys measuring (or placing upper limits on) SIDM cross-section 0.05 ≲ / ≲ 0.5 cm 2 g −1 .At lower crosssections, either simulation resolution, noise, or cosmological effects (see Section 4.2) inhibit simulated values of  ∥ reaching zero.Such tight constraints would require observations of approximately three times more clusters than listed in Table 3.At higher cross-sections, the limited number of simulated clusters leads to measurement uncertainty such that sometimes ⟨ ∥ ⟩ > , which is incompatible with equation ( 18).That model cannot be used to map ⟨ ∥ ⟩ onto /, until sufficient clusters have been observed such that (in this case) 95 per cent of the posterior probability is at values of ⟨ ∥ ⟩ < .

CONCLUSIONS
Terrestrial particle physics experiments have established that DM interacts very weakly, if at all, with Standard Model particles.However, terrestrial experiments are unable to test whether DM particles interact with each other -as predicted for many proposed models of self-interacting DM (SIDM, e.g.Kusenko & Steinhardt 2001;Duffy & van Bibber 2009;Loeb & Weiner 2011;Kamada et al. 2020), including a large class containing a dark-sector analogue of the strong force (Mohapatra et al. 2002;Foot 2014;Hochberg et al. 2015).In the latter models, the natural scale of the interaction cross-section is the same order as for nuclear interactions, / ∼ 0.6 cm 2 g −1 .
Using cosmological simulations, we have measured the effect of DM self-interactions on the major mergers of galaxy clusters.We find that the offset between DM and galaxies (as a fraction of that between gas and galaxies) is a promising test of SIDM.This 'bulleticity',  ∥ , increases with cross-section in a way that matches the predictions from an analytic model (equation 18) originally proposed by Harvey et al. (2014).Because it is a fractional offset, the same measurements can be accessed using either 3D or 2D projected data.Symmetries provide a null test ⟨ ∥ ⟩ = 0 for non-interacting DM, and an orthogonal test ⟨ ⊥ ⟩ = 0 for systematics or to measure scatter.
Three challenges remain with theoretical predictions.First, because of instabilities in our identification of peak positions, we find the median offset of DM haloes to be more robust than the mean.It would be interesting to repeat this analysis with methods for peak finding that are closer to those used with observational data.These may differently weight the distribution of matter close to a peak or at distance from it, which is important if the distribution is skewed.Second, we find that the median (and mean)  ∥ is slightly positive for all values of interaction cross-section, even / = 0 (Fig. 4).This implies that it is more likely for DM to be found between the stars and gas, rather than leading the stars.Other CDM simulations (Schaller et al. 2015;Robertson et al. 2017a), and an analytic model (equation 18) predicted ⟨ ∥ ⟩ to be consistent with zero when / = 0. Our measurement of an offset in the non-interacting CDM case might be a statistical anomaly, might be caused by the matching of DM to galaxy and gas peaks when starting shrinking spheres from subfind positions, or it might be a symptom of more complex dynamical processes.Third, our simulated survey volume is too small to contain sufficient systems for the sample to be usefully split by e.g.mass ratio, impact speed, or time before/after first pericentric passage.It will be important to test whether  ∥ is a universal function of / for a range of these parameters, as predicted.We have no evidence from BAHAMAS to indicate that it is not, but are running larger simulation volumes precisely to test this.In future simulations, it will also be interesting to measure the  ∥ produced by SIDM interactions mediated by low-mass particles that produce more frequent but smaller momentum exchange scattering events, which manifests on macroscopic scales as something closer to a drag force (Fischer et al. 2022).
Finally, we made predictions for limits on the DM self-interaction cross-section that could be ascertained by future telescopes.The test is promising, with astronomical observations of ∼100 merging clusters yielding constraints relevant to particle physics.If  ∥ is not perfectly universal, larger simulations will also be important to interpret observations, by reproducing sample selection effectse.g. with larger values of ⟨ SG ⟩ and more systems just after first pericentric passage.Reproducing selection effects would be even more important if averaging samples with a mean or weighted mean rather than a median, because these are so strongly dominated by a small number of systems., but now for the distance from stellar matter to the gas (see Fig. 1).Note that by definition  SG is always positive.

APPENDIX B: OFFSETS OF INDIVIDUAL COMPONENTS
Figures B1, B2 and B3 show the distributions of distances used to calculate the distance ratios  ∥ and  ⊥ (which are themselves shown in Figures 2 and 4).Uncertainties on the number of counts in each bin are assumed to be the square root of the counts plus one.One unexplained curiosity is that anomalously few systems have  DI = 0 in SIDM simulations.We cannot explain this, and merely speculate that the galaxies may be oscillating within the DM halo on a radial orbit, and thus spend very little time at pericentre.
This paper has been typeset from a T E X/L A T E X file prepared by the author.

Figure 1 .
Figure1.During a collision between galaxy clusters, the galaxies (S for 'stars'), gas (G), and DM (D) can become separated.Ram pressure on the cluster's gas means that the vector from a cluster's gas to its stars approximately indicates its direction of motion.We define the positive-definite length of this vector  SG .To test whether pressure also acts on DM, we measure the distance from the DM to the stars, in components parallel to the direction of motion,  SI , and perpendicular to it,  DI .

Figure 2 .
Figure2.The fractional offset of DM from galaxies,  ∥ , in simulations of colliding galaxy clusters.Columns of panels separate the offset in main haloes or subhaloes, and using quantities accessible from 2D projections on the sky or 3D simulations.Rows of panels show results from simulated universes with SIDM cross-section / = [0, 0.1, 0.3, 1] cm 2 g −1 from top to bottom.Red dashed lines show the best-fitting perturbed Gaussian, see Section 4. Fits use the MCMC sampler emcee, and assume Poisson noise on each bin.

Figure 3 .
Figure 3.The width of the best-fitting perturbed Gaussian to distributions of

Figure 4 .
Figure 4. Median fractional offset between galaxies and DM in simulated colliding clusters, as a function of the SIDM cross-section / = [0, 0.1, 0.3, 1] cm 2 g −1 (top panel), and the perpendicular control test that should be consistent with zero in the absence of systematics (bottom panel).Red circles and blue triangles show similar calculations using information available in 3D or that projected onto a 2D sky; for clarity, 3D data points are horizontally offset by a small amount.Error bars show '1-like' uncertainty, between the 16 th and 84 th percentiles.Curves show the best fits of model (18), with parameters tabulated in the bottom row of Table2.

Figure 5 .
Figure 5. Forecast upper 95 per cent confidence limits on  ∥ (top) and / (bottom) for future observations with JWST (gold hexagon), HST single-and multi-band imaging (green triangle pointing left and purple triangle pointing right), SuperBIT (blue cross), Euclid (red square) and LSST (grey triangle pointing up), as a function of the number of merging clusters observed.Most predictions assume multicolour imaging; this experiment has only ever been performed in practice with single-band HST imaging.

Figure B1 .Figure B2 .Figure B3 .
Figure B1.The distance between stellar matter and DM in simulations of colliding galaxy clusters, parallel to the direction of motion (see Fig.1).Columns of panels separate the distance in main haloes or subhaloes, and using quantities accessible from 2D projections on the sky or 3D simulations.Rows of panels show results from simulated universes with SIDM cross-section / = [0, 0.1, 0.3, 1] cm 2 g −1 from top to bottom.Note that the -axis range is different in each column.