Coalescing black hole binaries from globular clusters: mass distributions and comparison to gravitational wave data from GWTC–3

We use our cluster population model, cBHBd , to explore the mass distribution of merging black hole binaries formed dynamically in globular clusters. We include in our models the e ﬀ ect of mass growth through hierarchical mergers and compare the resulting distributions to those inferred from the third gravitational wave transient catalogue. We ﬁnd that none of our models can reproduce the peak at m 1 (cid:39) 10 M (cid:12) in the primary black hole mass distribution that is inferred from the data. This disfavours a scenario where most of the detected sources are formed in globular clusters. On the other hand, a globular cluster origin can account for the inferred secondary peak at m 1 (cid:39) 35 M (cid:12) , which requires that the most massive clusters form with half-mass densities ρ h , 0 (cid:38) 10 4 M (cid:12) pc − 3 . Finally, we ﬁnd that the lack of a high mass cut–o ﬀ in the inferred mass distribution can be also explained by the repopulation of an initial mass gap through hierarchical mergers. Matching the inferred merger rate above (cid:39) 50 M (cid:12) requires both initial cluster densities ρ h , 0 (cid:38) 10 4 M (cid:12) pc − 3 , and that black holes form with nearly zero spin. A hierarchical merger scenario makes speciﬁc predictions for the appearance and position of multiple peaks in the black hole mass distribution, which can be tested against future data.


INTRODUCTION
The analysis of gravitational wave (GW) observations has identified structures in the mass distribution of the observed population (Tiwari & Fairhurst 2021).Some of these structures already emerged from the analysis of the second gravitational wave transient catalog (GWTC-2; Abbott et al. 2021b;Abbott 2020a).However, thanks to the increased number of events in the new GWTC-3 (Abbott et al. 2021c,a), we are now more confident of their statistical significance.In particular, three important features in the underlining BH mass distribution have been uncovered: (i) the distribution of primary BH masses has a strong peak at about 10M ; (ii) there is clear evidence for a secondary peak at 35M ; and (iii) there is no evidence for any mass gap above ≈ 40-60M , which is predicted by stellar evolution models due to pulsational pairinstability and pair-instability in massive stars (e.g., Woosley 2017;Spera & Mapelli 2017;Olejak et al. 2022).In this article, we perform a large number of cluster simulations to understand whether E-mail: AntoniniF@cardiff.ac.uk (i), (ii) and (iii) can be explained by a globular cluster (GC) origin for the sources.
The mass distribution of coalescing BH binaries produced in GCs has been investigated in several studies (e.g., Rodriguez et al. 2016;Askar et al. 2017;Park et al. 2017;Antonini & Gieles 2020a;Mapelli et al. 2022;Zevin & Holz 2022).Previous work suggests that GCs are an environment where BH binaries can efficiently assemble and merge, providing one of the main formation channels of BH binary coalescences in the Universe (Portegies Zwart & McMillan 2000).In particular, it has been argued that due to the high escape velocities of GCs, BH mass growth can occur through consecutive mergers, populating any mass gap created by stellar processes (Rodriguez et al. 2018;Kimball et al. 2020;Rodriguez et al. 2020;Doctor et al. 2020;Kimball et al. 2021;Tagawa et al. 2021a).In this scenario, a BH that is formed from a previous merger and is retained inside the cluster, sinks back to the cluster core where it dynamically couples with another BH and merges with it after a series of binary-single encounters.If this process repeats multiple times, significant mass growth can occur (Antonini & Rasio 2016;Fishbach et al. 2017;Gerosa & Berti 2017).A direct comparison of model predictions to data, however, are rare (Mould, Gerosa, & Taylor 2022).It remains therefore an open question whether a GC origin provides a plausible explanation for the inconclusive evidence for an upper mass gap in the GW data, and whether the other features of the inferred BH mass distribution can also be reproduced.A putative successful GC model will then provide useful constraints on the properties of GCs and their BHs at birth.
In this work, we adopt our new fast method for the evolution of star clusters and their BBHs, cBHBd (Antonini & Gieles 2020b), to study the mass distribution of BHs produced dynamically in GCs, including the effect of hierarchical mergers and a novel recipe for sampling masses of the BBH components and the interlopers.Our efficient approach allows us to address how model assumptions affect the final results, place error bars on merger rate estimates, compare to the distributions inferred from the new GW data catalog GWTC-3, and, finally, asses a hierarchical merger origin for the formation of the most massive BHs detected by LIGO and Virgo.
In Section 2 we describe our methodology and approximations.In Section 3 we describe our main results, and the importance of model assumptions.We conclude and summarise our results in Section 4.

CLUSTER MODELS WITH HIERARCHICAL MERGERS
We simulate the evolution of BH binaries in star clusters using our code cBHBd, which we modify in order to include hierarchical mergers.We define here hierarchical mergers as binary mergers in which at least one of the two BH components is a BH remnant that was formed from a previous merger.
Our method is based on Hénon's principle (Hénon 1975) which states that the rate of heat generation in the core is a constant fraction of the total cluster energy per half-mass relaxation time.Thus, the energy production rate in the core, which we assume is produced by BH binaries, is regulated by the energy demand of the entire system (Breen & Heggie 2013).The lifetime and the merger rate of BHs in the cluster can be linked to the evolution of the cluster itself as described in details in Antonini & Gieles (2020b).Then, three ingredients are needed in order to determine the formation of BH binaries, their merger rate and their properties: (i) a model for the evolution of the cluster global properties; (ii) a model for binary BH dynamics; and (iii) a realistic set of initial BH masses.
We start by sampling the masses of the stellar progenitors of BHs from a standard mass function, φ(m ) ∝ m −2.3 (Salpeter 1955;Kroupa 2001), with masses in the range 20M to 130M .For a given cluster metallicity, Z, we evolve the stars to BHs using the Single Stellar Evolution (SSE) package (Hurley et al. 2002), which we modified to include updated prescriptions for stellar winds and mass loss (following Vink et al. 2001), and for pair-instability in massive stars (following Spera & Mapelli 2017).We therefore evolve the BH progenitors as single stars, assuming a zero binary fraction initially.At the end of this phase, the total number of BHs in a cluster model is calculate by assuming a Kroupa (2001) initial mass function in the mass range 0.1M to 130M .The value of the largest BH mass formed in the model depends on metallicity and varies between 25M for Z = 2 × 10 −2 and 55M for Z = 1 × 10 −4 .For each BH we compute a natal kick velocity from a Maxwellian distribution with dispersion 265km s −1 (Hobbs et al. 2005), lowered by the amount of mass that falls back into the forming compact object (Fryer et al. 2012).In most of our models, we start the BHs all with the same value of the spin angular momentum, χ, where χ = S /m 2 is the dimensionless spin of the BH and S is the spin angular momentum in units of m 2 .In one model the initial value of χ is sampled from a distribution that is consistent with that inferred from the GW data and is given by the median distribution shown in figure 15 of Abbott et al. (2021c).
Then, we initialise and evolve the cluster model.The initial conditions are determined by three parameters: the cluster density, ρ h,0 ; the cluster mass, M 0 ; and the total mass in BHs, M BH,0 .The latter is set equal to the total mass in BHs obtained with SSE, assuming a Kroupa initial mass function in the range 0.08 − 130 M and taking into account that a fraction of the BHs are ejected from the cluster by a natal kick.The time evolution of the cluster properties is then obtained as in Antonini & Gieles (2020a).Briefly, we integrate a set of first order differential equations which determine the time evolution of M, M BH , and the cluster half mass radius, r h .These models include simple prescriptions for mass loss and expansion due to stellar evolution and cluster 'evaporation', while BHs are assumed to be lost through dynamical ejections.
Finally, we dynamically evolve the BH binaries that form via three-body processes in the cluster core.Our treatment of binary BH formation and evolution follows closely Antonini & Gieles (2020b).The first binary BH forms after the cluster core-collapse time where t rh,0 is the initial cluster half-mass relaxation time (for the definition, see equation 10 in Antonini & Gieles 2020b).We assume that the binary is formed with a semi-major axis at the softhard boundary, a h Gµ/σ 2 , with µ = m 1 m 2 /(m 1 + m 2 ), where m 1 and m 2 are the masses of the binary components, and m 1 > m 2 .The expression of a h above is only approximate, and valid under the assumption of equal mass components.Later in Appendix A we introduce the quantity β and equipartition among BHs of different masses, then the definition is a h = 0.5Gm 1 m 2 β.
The pairing of BHs is done by sampling their masses from the set of BHs still left inside the cluster.We first draw two mass values from the power law probability distributions p(m 1 ) ∝ m α 1 1 and p(q) = q α 2 , with α 1 = 8 + 2α, α 2 = 3.5 + α and q = m 2 /m 1 .Here α is the power law index of the BH mass function, which also evolves with time as the BH population is depleted.The two BH components are then selected by choosing the two BHs that have the mass closest to the values drawn from p(m 1 ) and p(q) (or p(m 2 )).
Once selected, the binary is evolved through a sequence of binary-single encounters.Similarly, we find the mass of the third BH interloper from the power law distribution p(m 3 ) ∝ m α 3 3 with α 3 = α + 1/2.The adopted expressions for p(m 1 ), p(q) and p(m 3 ) are motivated below in Appendix A. The power law exponent, α, is obtained at the start of the integration for each cluster from a fit to the initial BH mass function after removing BHs that are ejected by natal kicks.The value of α as well as the lower and upper bound of the BH mass function are then recalculated after each time-step.Specifically, the lower bound of the BH mass function is set equal to the mass of the lightest BH in the cluster, and the upper bound is the mass of the most massive BH.This procedure allows to take into account the evolution of the BH mass function with time due to ejections and the growth of BHs through hierarchical mergers.
Following Samsing (2018), we divide each binary-single encounter in a set of N rs = 20 resonant intermediate states and assume that the eccentricity of the binary after each state is sampled from a thermal distribution (2) a merger occurs through a GW capture before the next intermediate binary-single state is formed, where R S = 2G(m 1 + m 2 )/c 2 and h is a constant of order unity.
If the binary survives the 20 intermediate resonant states, we compute: (i) the new binary semi-major axis, assuming that its binding energy decreases by the fixed fraction ∆E/E = 0.2 (Samsing 2018).(ii) the recoil kick due to energy and angular momentum conservation experienced by the binary centre of mass (Antonini & Rasio 2016) with q 3 = m 3 /(m 1 + m 2 ), and (iii) the recoil kick experienced by the interloper: If v bin > v esc , the binary is ejected from the cluster; if v 3 > v esc , the interloper is also ejected from the cluster.If the binary is ejected from the cluster, we compute its merger timescale due to GW energy loss using the standard Peter's formula (Peters 1964).If v bin < v esc , and the binary angular momentum at the end of the triple interaction is such that (Antonini & Gieles 2020b) then the BH binary merges before the next binary-single encounter takes place.Binaries that undergo this type of evolution are often named 'in-cluster inspirals' (Samsing 2018;Rodriguez et al. 2018).
We then assign the new remnant BH a GW recoil kick, v GW , and compute its new spin and mass following Rezzolla et al. (2008).
If v GW > v esc the remnant is ejected from the cluster, otherwise we compute the dynamical friction timescale to sink back to the cluster core where (Antonini et al. 2019) and only allow the BH to form a new binary after this time.If v bin < v esc , but condition equation ( 5) is not satisfied, then a new interloper is sampled from the given distribution and the binary is evolved through a new binary-single interaction.Each binary is evolved through a sequence of binary-single encounters until either a merger occurs or it is ejected from the cluster.Then a new binary is formed.The cluster is assumed to live in a state of balanced evolution in which the binary disruption rate is equal to the binary formation rate.Under this assumption, the lifetime of a binary, or the timescale until a new binary is formed, is simply where m ej is the total mass ejected by the binary, and ṀBH is the BH mass loss rate given by the cluster model.We continue selecting new binaries and evolve them through binary-single encounters until either all BHs have been ejected from the cluster, or until a maximum integration time of t = 13Gyr has passed.

Cluster initial mass function, formation time, and metallicity
In order to generate predictions for BH binary mergers, we need a GC initial mass function, and a model for how the formation rate and metallicity of clusters evolve with redshift.These ingredients of our models are described below.For this we follow the approach of Antonini & Gieles (2020a).The cluster initial mass function is obtained by fitting an evolved Schechter mass function to the observed GC mass function in the Milky Way today (Jordán et al. 2007) This gives the posterior distribution for the parameters M c and ∆.Adopting a simple model for cluster evaporation and mass loss due to stellar evolution, the corresponding initial GC mass function is given by: The corresponding fractional mass loss due to evaporation and stellar evolution is The spread in K provides an estimate of the uncertainty in the fractional mass loss from cluster until today, given the 156 Milky Way GC masses.The cluster mass formed per unit volume integrated over all times is (Antonini & Gieles 2020a) The large error bars here imply that ρ GC0 is uncertain by a factor of 2. In the next sections we include this uncertainty as well as the uncertainty on K in the predictions for the merger rate.
We obtain the distribution of cluster formation times from the semi-analytical galaxy formation model of El-Badry et al. (2019).The resulting cluster formation history peaks at a redshift of ∼ 4, which is earlier than the peak in the cosmic star formation history Local distributions of primary BH mass and mass ratio for merging BH binaries in our best-case scenario where clusters start with high densities, ρ h,0 = 10 5 M pc −3 , and the BHs are all initialised with zero dimensionless spin parameter χ = 0. Top, middle and bottom panels correspond to the delayed supernova mechanism, the rapid supernova mechanism, and the BH mass distribution of Belczynski et al. (2008), respectively.Black lines show the corresponding distributions when hierarchical mergers are not included in the calculation, i.e., it is assumed that a BH formed from a previous merger is always ejected from the cluster.Thick lines show median values of the merger rate value, and thin lines the corresponding 90% confidence intervals.The green lines and hatched regions show the median and corresponding 90% confidence regions inferred from the GWTC-3 (Abbott et al. 2021a).
(redshift ∼ 2, Madau & Dickinson 2014).We sample the formation redshift of our cluster models from the total cosmic cluster formation rate given by the fiducial model of El-Badry et al. (2019) and integrated over all halo masses.This corresponds to the formation rate per comoving volume of their Fig. 8 with their parameters β Γ = 1 and β η = 1/3, where β Γ sets the dependence of the cluster formation efficiency on surface density, and β η the dependence of the star formation rate on the halo virial mass.Here, we renormalise the cluster formation rate, φ z (z), such that 0 ∞ φ z dz = 1.Thus, we only sample the cluster formation time from El-Badry et al. ( 2019), and then rescale the cluster formation rate such to reproduce the total mass density given by our equation ( 12).We note that the cluster formation model has a negligible impact on the local merger rate and properties of the merging binaries.In Antonini & Gieles (2020a) we showed that unrealistic models where all clusters are assumed to form at the same time (e.g., z = 3) produce similar results than models in which the cluster formation rate is varied with redshift.
For the cluster metallicity, we fit a quadratic polynomial to the observed age-metallicity relation for the Milky Way GCs (Vanden-Berg et al. 2013), to obtain the mean metallicity log(Z mean /Z ) 0.42 + 0.046 t Gyr − 0.017 t Gyr 2 . (13) Given the cluster formation redshift, we then assume a log-normal distribution of metallicity around the mean with standard deviation σ = 0.25 dex.This takes into account the large spread found in the observed age-metallicity relation for the Milky Way GCs.

Merger rates and their error bars
Finally, we construct a library of cluster models over a 3dimensional grid of formation time, metallicity and cluster mass.We sample the cluster formation redshift over the range z ∈ [10; 0] with step-size ∆z = 0.5; at a given redshift, the metallicity of the cluster is sampled in the range Z ∈ [10 −4 ; 0.02] with logarithmic step size ∆ log Z = 0.1; finally, for a given formation time and metallicity, the initial mass of the cluster is varied in the range M 0 ∈ [10 2 ; 2 × 10 7 ]M , with step size ∆ log M 0 /M = 0.1.The merger rate is then calculated over the grid of cluster models as: where Ṅ(τ; M 0 , r h,0 , Z) is the BH binary merger rate at a look-back time τ corresponding to a cluster with an initial mass M 0 , metallicity Z and that formed at a redshift z.
In order to take into account the uncertainties in the initial cluster mass function, we sample 100 values over the posterior distributions of the parameters M c and ∆ obtained from the MCMC fit to the Milky Way GC mass function.We also take into account the uncertainty on the mass density of GCs in the Universe, ρ GC .We assume that the parameter ρ GC follows a Gaussian distribution with mean 7.3 × 10 14 M Gpc −3 and dispersion σ = 2.6 × 10 14 M Gpc −3 .We sample 100 values from this Gaussian distribution and for each of them we use equation ( 15) to determine a merger rate estimate for each of the [M c , ∆] values, and thus obtain a distribution of merger rate density values.Since in this work we are interested in the mass distribution of local BH binary mergers, we consider the differential merger rate in the local universe dR(z = 0)/dm 1 and dR(z = 0)/dq, which we compare to the distributions inferred from GWTC-3.

Primary BH mass and mass ratio distributions
In Antonini & Gieles (2020a) we found good agreement between model predictions and the inferred distribution of primary BH Figure 2. Differential local merger rate as a function of the initial cluster mass.We also show the initial cluster mass function (in arbitrary units) for our best fit value of Schechter mass, log M c /M = 6.26.The delayed supernova mechanism has been adopted here.
masses within the range of values m 1 = 15M to 40M .Outside this range, the binary BH merger rate was found to be several orders of magnitude smaller than inferred.The first question we address here is whether the inclusion of hierarchical mergers can reduce the discrepancy at m 1 40M between models and the inferred astrophysical distributions.
In Fig. 1 we plot the distributions of m 1 and q for three different assumptions about the initial BH mass function.In the upper panels we use the delayed supernova mechanism, in the middle panels the rapid supernova mechanism (Fryer et al. 2012), and in the lower panels we use the BH mass distribution from Belczynski et al. (2008).These prescriptions produce somewhat different initial BH mass functions, and lead also to different natal kick values.In these models, all clusters are initialised with the same half-mass radius density of ρ h,0 = 10 5 M pc −3 and the BHs are all started with zero dimensionless spin parameter, χ = 0.
In the left panels of Fig. 1, we see that the new models produce mergers above the ∼ 50M threshold.These mergers are produced by BHs that grow hierarchically through mergers -the vast majority of them are mergers between a first generation BH and a second generation BH.When we include these mergers in our calculation, we find good agreement between the models and the inferred distributions at m 1 50M , in the sense that the models give a merger rate that is comparable to the inferred value.However, we also note that a simple power-law profile above this mass threshold is not a good representation of the model distributions.Above m 1 50M , the model distributions are characterised by several peaks.Such higher mass peaks are related to peaks in the BH mass distribution at lower masses.From Fig. 1 we see that the merger rate between first generation BHs peaks at 30M and 40M .Thus, mergers between first and second generation BHs lead to additional peaks at (30+30)M = 60M , (30+40)M = 70M and (40 + 40)M = 80M .The presence of peaks within the pairinstability mass gap and their relation to lower mass peaks in the Figure 3. Dependence of mass and mass ratio distributions on initial cluster half-mass density, and initial BH spins.The delayed supernova model is assumed here.Top panels use χ = 0 and the half-mass density of the cluster is varied as indicated.In the bottom panel we take ρ h,0 = 10 5 M pc −3 and change the initial value of χ; the black histograms show the results for a model where the initial value of χ is sampled using the inferred distribution of BH spins shown in Fig. 15 of Abbott et al. (2021c).In all the other models, the BHs all form with the same value of χ as indicated.
BH mass distribution is a clear prediction of a hierarchical merger model for the origin of the binaries.
The black histograms in the left panels of Fig. 1 show the results from models in which any remnant BH formed from a previous merger is ejected from the cluster 'by hand'.In these models, the distributions are sharply truncated at ∼ 50M since BHs cannot grow hierarchically above this mass value.The merger rate at m 1 10M derived from all models is about two orders of magnitude smaller than the inferred rate.This lower-mass peak can be explained, however, through other scenarios, including formation in the galactic field (e.g., Broekgaarden et al. 2022;van Son et al. 2022) and formation in young and open star clusters because of their higher metallicity (e.g., Banerjee 2021b;Chattopadhyay et al. 2022).On the other hand, our models reproduce the inferred merger rate near m 1 30M , which can therefore be explained by a GC origin.This peak in the mass distribution is due to mergers involving first-generation BHs, and it is not related to hierarchical mergers.
By comparing the results in Fig. 1 with the models in Antonini & Gieles (2020a), we find that the latter generated a merger rate at m 1 20 M higher by a factor 2. The reason for this difference is due to the adopted new recipe for sampling the black hole binary components and the interloper masses.In Antonini & Gieles (2020a) we had assumed that m 1 = m 2 = m 3 =m max , where m max is the mass of the most massive BH in the cluster.The distributions in Appendix A mean instead that in the current models m 3 m 1 m 2 .Thus, each binary ejects more low-mass BH interloopers lowering the overall BH merger rate at low masses.
In the right panels of Fig. 1 we consider the distribution of the mass-ratio q.The new models result in a significantly higher rate of merging binaries with small mass ratio, q 0.5, providing a better match to the inferred distribution than models without hierarchical mergers.Most of these additional low-q systems are mergers between a first generation BH and a BH that formed through a previous merger above the pair-instability mass limit.At high values of q, instead, both models with and without hierarchical mergers produce a similar merger rate, which, at q 0.8, is about one-order of magnitude smaller than inferred from the data.
In Fig. 2 we show the differential contribution to the local merger rate with respect to the initial cluster mass.This allows us to identify in which type of clusters most of the mergers are formed.The contribution to the total merger rate is nearly constant in log bins between M 0 = 5 × 10 4 M and 5 × 10 6 M , while it decrease exponentially above this range because of the truncation of the initial GC mass function at log M c /M 6.26.We then show the same cluster mass distribution, but only considering mergers with m 1 50M (red histogram).These mergers, involving a primary BH above the mass gap limit, are mostly produced in clusters with relatively large masses, between ∼ 2 × 10 5 M and 5 × 10 6 M .The fraction of these higher mass mergers to the total number of mergers in each cluster mass bin increases with cluster mass.By comparing the blue and the red histograms in the figure, we see that at M 0 ∼ 10 7 M , between 10 to 30% of mergers have a primary with m 1 > 50M .The percentage goes down to ∼ 1% in clusters with an initial mass lower than 10 6 M .Finally, we show the M 0 distributions for the most massive mergers produced in our models, m 1 100M .These BHs originate from at least two previous mergers since their mass is larger than twice the initial mass cut-off at 50M .These systems are formed in the most massive GCs, with initial mass well above the Schechter mass.

Effect of initial cluster density, initial spins, and other model variations
The number of heavier BHs produced by a cluster through hierarchical mergers is affected by both the cluster density and the initial spin of the BHs.A larger cluster density means a larger merger rate and escape velocity, and therefore a larger probability that a remnant BH is retained inside the cluster following a recoil kick.Similarly, if the BHs have negligible spins, this translates into a smaller recoil velocity and higher retention probability.While in the previous section we have looked at a somewhat optimistic scenario in which clusters all form with high densities and the BHs have initially zero spins, in this section we vary these assumptions and investigate their effect on the BH binary merger rate and properties.We adopt here the delayed supernova mechanism, but similar results are obtained with the rapid supernova prescription and the Belczynski et al. (2008) mass distribution.
In the upper panels of Fig. 3, we vary the initial cluster halfmass density within the range ρ h,0 = 10 2 to 10 5 M pc −3 , and assume that the BHs have zero spins initially.The results illustrate that although our models can in principle account for most mergers above m 1 20M , this is only true under some specific conditions.As we lower the initial cluster half-mass density the merger rate goes down significantly at all values of mass and mass-ratio.The depletion is more significant at masses above the cut-off mass of 50M and for q 0.5.Thus, a scenario where most merging BH binaries with m 1 20M form in GCs would imply a typical initial cluster density ρ 0 10 4 M pc −3 .It is important to note that this condition would however only apply to clusters with initial mass M 0 5 × 10 4 M , where most of the merging binaries are formed (see Fig. 2).
In the lower panels of Fig. 3 we show how the results change with changing the initial BH spins.In these models we keep the initial density to the fixed value ρ 0 = 10 5 M pc −3 .We see that the merger rate density distributions are not affected significantly for m 1 50M and q 0.5.This is because the majority of these binaries are made of first generation BHs.Hence, their merger rate is not affected by the recoil kick velocity and by the initial choice of BH spin.On the other hand, the number of BHs formed via hierarchical mergers decreases significantly when higher initial spins are used due to the larger recoil kicks.This leads to a lower merger rate at m 1 50M when χ is increased.Even relatively modest initial spins, χ = 0.1, lead to a distribution that does no longer match the inferred distribution.The constrains on χ seems therefore quite strong as a hierarchical origin for all mergers with m 1 50M would require that BHs are formed with nearly zero spin.
Finally, we consider six additional model realisations.In one model, we assume that the BHs receive no kick at formation and that the initial density is the same for all clusters, ρ h,0 = 10 5 M pc −3 .In another model, we assume that the cluster halfmass radius scales as log r h,0 pc = −3.56+ 0.615 log This latter relation was derived by Gieles et al. (2010) from the results of Has , egan et al. ( 2005) who fit this Faber-Jackson-like relation to ultra-compact dwarf galaxies (UCDs) and elliptical galaxies.Gieles et al. (2010) derived the initial mass-radius relation correcting for mass loss and expansion by stellar evolution and correcting radii for projection.We consider an additional model realisation where we did not include any prescription for pair instability so that the initial BH mass function has no upper gap and BHs can form above 50M .Moreover, we consider two models where the initial mass function above 0.5 M is assumed to scale as φ(m ) ∝ m −2 (top-heavy) and φ(m ) ∝ m −2.6 (bottom heavy), respectively.Finally, we evolve two additional models where our standard Wolf-Rayet winds based on Hamann & Koesterke (1998) and Vink & de Koter (2005) are multiplied by a factor f WR = 0.1 and f WR = 5 (e.g., Broekgaarden et al. 2022).Unless otherwise specified, all the other model parameters are the same as before, i.e., delayed supernova mechanism, χ = 0, fallback kicks, etc. Fig. 4 shows that the mass properties of the BH binaries produced in the new models without birth kicks and with the new r h -M relation are similar to those found previously in Section 3.1.The fact that adopting the mass-radius relation equation ( 16) does not change significantly our results is not surprising.Clusters with an initial mass M 0 ∼ 10 6 M contribute the most to the merger rate (see Fig. 2).The initial half-mass density of these clusters as derived from equation equation ( 16) is ρ h,0 5 × 10 4 M pc −3 .This is comparable to the constant density value of 10 5 M pc −3 adopted previously.Interestingly, the results of models with no birth kicks show that assuming zero velocity kicks at birth increases slightly the merger rate at the lower mass end of the m 1 distribution and the number of merging binaries with asymmetric masses.On the other hand, the shape and normalisation of the distributions at masses higher than m 1 20M remain virtually the same as in the fallback kick model.
The model without pair instability physics leads to a mass distribution which is significantly different from the other model realisations, showing how our results can depend on the assumptions about stellar evolution and the adopted prescriptions.In this case, the mass distribution still peaks at m 1 ∼ 30 M , while the other peak near 40 M is not longer present.A secondary peak is found near 70M .For masses larger than this value, the merger rate drops and becomes much smaller than the rate inferred from the GW data.
In the model with a top-heavy stellar mass function, the overall merger rate is higher than for our standard models due to the larger number of BHs formed.On the other hand, for a bottomheavy mass function the total merger rate is significantly reduced due to the fewer massive stars formed.Our model with modified Wolf-Rayet wind mass-loss rate lead to results that are qualitatively similar to those obtained under our more standard assumptions.
That our resulting mass distributions are sensitive to the initial  .6, respectively.In the red and orange histograms we have multiplied our standard wind mass loss rate on the Wolf-Rayet stage by a factor f WR = 0.1 and f WR = 5.We have used the delayed supernova mechanism, and, unless otherwise specified, all the other model parameters are the same as in Fig. 1.
BH mass function, and therefore to the uncertain stellar evolution prescriptions is expected.It is interesting, however, that most of our models share similar properties.Specifically: (i) the inferred peak in the merger rate at 10M is much lower than the one inferred from the data, and (ii) the distribution of m 1 presents a main peak at near 35M .The main reason why there are so few mergers with small masses is because of the relatively low number of light BHs in the initial mass function.This is due to the low metallicity of GCs, which results in low wind mass loss and large BH masses.The other reason why the mass distribution of merging binaries peaks at relatively high values is dynamics.The masses of the binary components tend to be sampled near the top end of the BH mass function, due to the high value of the power law exponents that appear in the density probability functions p 1 and p 2 (see Section A).On the other hand, the flatter p 3 distribution means that the ejected BH interlopers will be on average lighter than the binary BH components.These lighter BHs are therefore no longer available for merging.

CONCLUSIONS
In this work we have used our fast cluster evolution code, cBHBd, to investigate the mass distribution of BH binaries produced dynamically in dense GCs.We compared our results to the astrophysical distribution of BH binary masses inferred from GWTC-3 to make inference about the astrophysical origin of the sources.For the first time, we have included hierarchical mergers in our models.This allowed us to address the question of whether a dynamical formation scenario is a feasible explanation for the detected BHs within the so called 'upper mass gap'.Such a mass gap in the initial BH mass function is predicted by stellar evolution theories, and in our models is located at 50M .Because cBHBd is highly efficient compared to other techniques (e.g., Monte Carlo, N-body), we were able to systematically investigate the impact of model assumptions on or results.Our main conclusions are summarised below: i) A purely GC formation scenario for the BH binaries detected by LIGO and Virgo is inconsistent with the 10M peak in the primary BH mass distribution that is inferred from the data.This likely excludes a scenario where the majority of the sources were formed in GCs.
ii) A GC origin can easily account for the secondary mass peak at m 1 35M inferred from the data.This requires that clusters form with initial half-mass density 10 4 M pc −3 .Assumptions about the initial BH spins and the supernova mechanism have no effect on this conclusion.
iii) Dynamical formation in GCs can explain the inferred merger rate of all BH binaries with m 1 20M and q 0.8, including binaries with component masses lying above the assumed mass limit due to pair-instability.For this to be true we require that both the most massive GCs, M 0 10 5 M , form with half-mass density 10 4 M pc −3 , and that the birth spins of BHs are nearly zero.Even small deviations from this latter condition lead to a merger rate above 50M that is orders of magnitude smaller than the inferred rate.iv) A hierarchical merger scenario predicts the appearance of multiple peaks in the primary BH mass distribution and within the upper mass gap due to a pile-up of mergers between first and second generation BHs.Inter-generation mergers lead to a simple relation between the mass value of any of such peaks and that of peaks found at masses lower than the pair-instability limit.These features can be tested against future GW data to place constrains on a GC origin for the sources.
Additional constraints on the formation of BH mergers can be placed by exploring correlations between binary parameters, which we have not considered here, but plan to study in a future work.For example, a BH formed from a previous merger will have a spin χ 0.7.We expect therefore a change in the value of the typical effective and precession spin parameters of binaries with components within the upper mass gap (e.g., Tagawa et al. 2021a;Baibhav et al. 2020) and an increase in spin magnitude for systems with more unequal mass ratio.Binaries formed dynamically will also have larger eccentricities, which can lead to a positive correlation between eccentricity, spin and binary mass in the overall population.The analysis of the data from GWTC-3 has shown marginal evidence that the spin distribution broadens above 30M , and that the mass ratio and spin are correlated in the sense that spins are larger for more asymmetric binaries (Abbott et al. 2021a).The evidence for these correlations remain weak, but it suggests that future analysis based on larger data sets will soon be able to provide more stringent constrains.The residual eccentricity of a binary is by itself another potentially powerful tool for identifying sources formed in clusters.Romero-Shaw et al. (2022) suggest that a significant fraction of the detected GW sources in GWTC-3 show support for eccentricity 0.1 at 10Hz.Their results indicate that densely-populated star clusters may produce the majority of the observed mergers.
Finally, it is worth noting that in our work we used the pairinstability prescriptions from Spera & Mapelli (2017).This gives an upper limit in the initial BH mass function of about 50M .However, there are several uncertainties in the modelling, and different assumptions can lead to significantly different values for the high mass cut-off, generally in the range 40M to 70M (Giacobbo et al. 2018;Farmer et al. 2020;Costa et al. 2021;Fryer et al. 2022).Exploring the effect of these uncertainties is beyond the scope of this paper, but should be considered in future work.

Figure 1 .
Figure1.Local distributions of primary BH mass and mass ratio for merging BH binaries in our best-case scenario where clusters start with high densities, ρ h,0 = 10 5 M pc −3 , and the BHs are all initialised with zero dimensionless spin parameter χ = 0. Top, middle and bottom panels correspond to the delayed supernova mechanism, the rapid supernova mechanism, and the BH mass distribution ofBelczynski et al. (2008), respectively.Black lines show the corresponding distributions when hierarchical mergers are not included in the calculation, i.e., it is assumed that a BH formed from a previous merger is always ejected from the cluster.Thick lines show median values of the merger rate value, and thin lines the corresponding 90% confidence intervals.The green lines and hatched regions show the median and corresponding 90% confidence regions inferred from the GWTC-3(Abbott et al. 2021a).

Figure 4 .
Figure4.Results for six alternative models.Left panels: blue histograms are for a model in which the initial cluster half-mass radius is assumed to scale with cluster mass as in equation (16); the blue histograms correspond to a model in which the BH birth kicks are zero; the red histograms correspond to a model where the recipes for pulsation pair instability (PPI) were switched off.Right panels: blue and black histograms are the results obtained assuming that the initial stellar mass function for massive stars scales as φ(m ) ∝ m −2 and φ(m ) ∝ m −2.6 , respectively.In the red and orange histograms we have multiplied our standard wind mass loss rate on the Wolf-Rayet stage by a factor f WR = 0.1 and f WR = 5.We have used the delayed supernova mechanism, and, unless otherwise specified, all the other model parameters are the same as in Fig.1.

Figure 1 .
Figure 1.PDFs for m 1 (left) and q (right) for different BH mass function upper masses (m up ) and logarithmic slopes (α).In all cases m lo = 5.The results discussed in the text (equations A4 and A6) are shown as full lines, while the dots with errors bars show Monte Carlo realisations (as a check) obtained by drawing 10 6 pairs from p I (m I ) (equation A3) and the black dashed lines show simple power-law approximations.