Kinematic substructure in star clusters constrains star cluster formation

The spatial-kinematic structure of 48 young star clusters and associations is investigated. Moran's $I$ statistic is used to quantify the degree of kinematic substructure in each region, and the results are compared to those expected assuming the hierarchical or monolithic models of star cluster formation. Of the observed regions, 39 are found to have significant kinematic substructure, such that they are compatible with the hierarchical model and incompatible with the monolithic model. This includes multiple regions whose $Q$ parameter shows the region to be centrally concentrated and clustered. The remaining nine are compatible with both models. From this it is concluded that the kinematic substructure of the observed star clusters represents strong evidence in favour the hierarchical model of star cluster formation over the monolithic model.


INTRODUCTION
Most stars, including the Sun, were born in clusters of hundreds to millions of stars (Elmegreen 2000;Lada & Lada 2003).The high stellar densities and extreme radiation fields in these clusters has a major impact on the star and planet formation process (e.g., Bonnell et al. 2001;Adams et al. 2006).As such, the question of how stars form is virtually inseparable from the question of how star clusters form.
There are two models of star cluster formation, the hierarchical mergers model and the monolithic model.In the former, star clusters are thought to form from mergers between substructures that form over a larger volume, but are brought together by gravity (Vázquez-Semadeni, González-Samaniego & Colín 2017;Rodríguez, Baume & Feinstein 2020;Sabbi et al. 2022).In the latter, it is thought that star clusters form monolithically from approximately spherical, dense concentrations of gas, which must collapse first (Kroupa, Aarseth & Hurley 2001;Krumholz & Tan 2007;Farias & Tan 2023).The models effectively differ in whether material is brought together into a dense configuration before (for the monolithic model) or after (for the hierarchical model) star formation occurs.
These two models make contrasting predictions about the spatial and kinematic structure of young star-forming regions (Banerjee & Kroupa 2014;Longmore et al. 2014;Vázquez-Semadeni et al. 2017).These differing predictions can be used to try to answer the question of which model is best supported by observational evidence, by comparing them to the structure of real star-forming regions.This goal is obstructed by the fact that spatial structure (which has historically been easier to observe than kinematic structure) is erased quickly as clusters evolve (Goodwin & Whitworth 2004;Arnold, Wright & Parker 2022).Kinematic structure, however, may yet be of use to delineate between these models observationally.
This paper follows on from Arnold et al. (2022) (hereafter A22), ★ E-mail: r.j.arnold.uk@gmail.comwhich makes use of -body simulations of star cluster formation.It shows that kinematic substructure can be identified and quantified using Moran's  statistic (Moran 1950) (a multidisciplinary statistical method used to quantify structure of given properties) even after spatial substructure has been erased.This statistic was found by A22 to be reliable even in the face of highly imperfect data (e.g.large uncertainties, high contamination, low completeness).
In A22 the evolution of Moran's  in simulations based on the monolithic and hierarchical models of star formation is compared.The initial conditions for the monolithic cases are generated as Plummer spheres (Plummer 1911), resulting in centrally-concentrated, smooth, and spherical initial conditions without initial kinematic substructure.A22 finds that Moran's  for such regions is zero within a very small margin or error, and remains so for at least 10 Myr unless the region is expanding.In that case Moran's  increases from  = 0 before plateauing.
The initial conditions for simulations based on the hierarchical model are generated using box fractals, in a method analogous to that used in Allison et al. (2010), Daffern-Powell & Parker (2020) and Blaylock-Squibbs et al. (2022) which results in spatial and kinematic substructure.A22 finds that Moran's  is significantly above zero at  = 0 in these cases, and remains so for up to 10 Myr, even though it decreases over time.Expanding regions are an exception.Like in the monolithic case Moran's  increases and plateaus over time, however the hierarchical expanding simulations maintain higher Moran's  values than their monolithic counterparts.
The aim of this paper is to use Moran's  statistic to measure kinematic substructure in young star clusters.We then aim to determine if the observed levels are consistent with those expected by the hierarchical merger or monolithic star cluster formation models.To do this, data is gathered for 48 currently or recently star-forming regions, and their spatial-kinematic structure is quantified using Moran's  statistic.These values are then compared to the predictions of A22.The relationship between the Moran's  values of different regions and their other properties, such as their spatial structure, is also examined.
In Section 2 the statistical methods used to analyse these data are described, and in section 3 the observational data is outlined.Section 4 presents the results of the analysis, and section 5 summarises the conclusions drawn from those results.

METHODS
In this section, the statistical methods used to assess the spatial and kinematic structure of the observed star-forming regions are outlined.

Moran's 𝐼 statistic
The principle statistic used in this paper to assess the spatialkinematic structure of star clusters is Moran's  (Moran 1950).This statistical tool is applied to spatial data of a given property in order to investigate the degree of structure in the spatial arrangement of that property.It is used in a wide variety of fields from epidemiology (e.g., Isazade et al. 2023, who investigated the spatial distribution of Covid-19 cases) to waste management (e.g., Tsui et al. 2022, who investigated how re-use rates vary across the Netherlands) and geography (e.g., Banasick, Lin & Hanham 2009, who investigatesd the spatial distribution of small manufacturing firms).
Moran's  for  data points is calculated as follows.First, a weight,    , is calculated for every possible pair of data points,  and , where  ≠ .The weight is the inverse of the physical distance between the data points.Once this has been done the Moran's  statistic for a given parameter  can be calculated as follows: . (1) The resulting Moran's  statistic varies between 1 and -1.Positive values indicate the presence of spatial correlation in the distribution of , i.e. similar values of  tend to be found in close proximity to one another.Negative values of Moran's  indicate the presence of spatial anti-correlation, i.e. dissimilar values of  tend to be found in close proximity to one another.If  has no spatial structure, the expected value of Moran's  is near zero 1 .
In A22 it is demonstrated that Moran's  statistic can be applied to star-forming regions to quantify the spatial correlation of their stellar kinematics.A number of different kinematic properties can take the role of , enabling this statistic to study a variety of different aspects of the spatial-kinematic structure in a data set.The parameters used in A22 are velocity in the  direction,   , velocity in the  direction,   , and speed .The calculated Moran's  values of these parameters are referred to as (  ), (  ), and () respectively.
To reduce the impact of the frame of reference choice and produce a less noisy metric (  ) and (  ) are averaged to produce ( 2 ).For real rather than simulated data, proper motion velocities ((  * ) and (  )) are used in the place of (  ) and (  ).Other parameterisations used in this paper are (  ), where   is radial velocity, ( 3 ), which is analogous to ( 2 ) but incorporates   , ( 2 ), where  2 is the stellar speed considering only proper motions, and ( 3 ) is the same but also incorporating   .Note that because  is always positive ( 2 ) and ( 3 ) quantify quite different aspects of 1 Strictly speaking, the expected Moran's  assuming no structure is −  −1 , but given the large number of stars included in the samples used in this work, and the uncertainties on the kinematic data, the approximation of () to zero is valid.kinematic structure to ( 2 ) and ( 3 ).This is expanded upon in section 4.5.1.

The LISA statistic
LISA (Local Indicators of Spatial Association, Anselin 1995) is a method of decomposing global statistics such as Moran's .This statistic is calculated for each data point individually (rather than for the region as a whole), and quantifies how much they contribute to the overall global statistic.In this context it is useful for highlighting which stars or subregions are exhibiting the most or least substructure, and aids understanding of the overall dynamics of a region.
As with Moran's , increasingly positive values of the LISA statistic indicate the data point's net contribution is to increase the observed degree of substructure in the region.Likewise, negative values of the LISA statistic indicate the data point contributes negatively.A dissimilarity between Moran's  and the LISA statistic is that the LISA statistic is not confined to a specific range, whereas Moran's  can only exist between -1 and 1.
The LISA statistic for Moran's  is: for a given data point  and a given parameter .

The 𝑄 parameter
The  parameter (Cartwright & Whitworth 2004) is a measure of the 2D (projected) spatial distribution of stars that is commonly used to quantify the degree of substructure in star clusters (e.g., Wright et al. 2014).The  parameter is defined as  = m/ s, where m is the normalised mean edge length of the minimum spanning tree of all stars in a cluster, and s is the normalised mean separation between the same sample of stars.Values of  can range from about 0.4 up to 2.0, though the vast majority of measured values in the literature are between 0.6 and 1.0.Low values of the  parameter ( < 0.8) indicate a clumpy or substructured spatial distribution, while high values ( > 0.8) indicate a smooth, centrally concentrated distribution (Cartwright & Whitworth 2004).

Data collection and data properties
A data set of currently or recently star-forming regions was assembled from three sources.The first is the recent 3D kinematic study of young stellar groups from the Gaia-ESO Survey (Wright et al. 2024, hereafter GES), with cluster members determined using spectroscopic indicators of youth, such as Lithium absorption or surface gravity indicators.The second source is the APOGEE-2 study of the Orion complex (Kounkel et al. 2018, hereafter K18), wherein cluster members are identified using a clustering algorithm applied to a sample of previously-identified YSOs.The third is the sample of nearby open clusters from Cantat-Gaudin & Anders (2020) and Cantat-Gaudin et al. ( 2020) (hereafter CG20), which applied a clustering algorithm to Gaia DR2 data to identify open clusters and then used a neural network to derive cluster properties.The latter list was limited to clusters with ages < 10 Myrs and distances < 2 kpc to ensure a high kinematic data quality and maximise the chances of detecting residual kinematic structure.
A final cut was made to the combined list of regions assembled from these three sources; regions with fewer than 30 identified members have been excluded as low-number statistics mean results they produce have reduced reliability.Trumpler 14 is also excluded, as its data appear to suffer from significant contamination from the surrounding Carina association.
A complete list of the 48 regions that were ultimately selected is shown in Table 1 with cluster ages and Gaia distances taken from each catalogue paper.Absolute stellar ages for young stars are notoriously difficult to estimate accurately, due to uncertainties in both stellar evolutionary models and the stellar properties necessary to compare with such models.Ages for individual star clusters should therefore be used with caution.An additional source, Franciosini et al. (2022), provides an updated age for 25 Orionis.
In addition to name, age, and distance, Table 1 lists a number of other properties.The  parameter was measured for all clusters, with measured values ranging from approximately 0.6 to 1.0,This is consistent with typical measurements in the literature and indicates a range of spatial morphologies from substructured to centrallyconcentrated.
The fraction of the observed stars that are expected to be contaminants is also listed, alongside the estimated completeness of each sample.The contamination of the regions presented in K18 is estimated in that work to be 6 per cent.In the case of the other regions presented here we estimated the contamination of the regions to be 5 per cent, consistent with estimates from astrometric or spectroscopic membership studies (Jackson et al. 2022).These estimates are based on the telescope used to gather the data (i.e.), and the distance to the region.Completeness is likewise estimated based on the telescope and the region's properties.Finally, the source of the data for a given region and its calculated Moran's  value, are listed in the final two columns of the table.
Four regions (25 Orionis, Barnard 30, Barnard 35, and Lambda Orionis) occur in both the GES and K18 data sets.In these cases the two data sets are combined to produce a more complete membership list.The age and distance estimates of the two studies are averaged, and stars which appear in both data sets are identified and de-duplicated.

Removing outliers
Inspection of the data shows that many clusters contain extreme kinematic outliers.These kinematic outliers may be contaminants or runaway stars, but regardless they are not representative of the kinematic substructure of the region and need to be removed from the sample before it is analysed.We therefore remove all stars with speeds more than 2.5 times the interquartile range (IQR) from the median speed (equivalent to excluding stars more than approximately 3  from the centre of a Gaussian distribution).The median and IQR are used in the place of the mean and standard deviation as they are less vulnerable to distortion by outliers.
The effects of this cut are evaluated using 20 simulated datasets from A22 that consist of highly substructured, subvirial regions of 1500 stars evolved using the -body integrator kira from the starlab code (Portegies Zwart et al. 1999Zwart et al. , 2001) ) to an age of 4 Myr (so some dynamical evolution can take place and the data is more comparable to the observed regions).The ( 2 ) of these uncontaminated regions,  True , is calculated for each simulated region.
Twenty extreme kinematic outliers are then added to each pristine data set, their directions chosen at random and their speeds drawn from a Gaussian distribution with a width ten times that of the pristine data set2 .The 2.5× IQR cut is then applied to each simulated data set to remove the kinematic outliers.This process was repeated fifty times for each of the twenty simulations.( 2 ) is calculated ( Calc ) for each of the the contaminated and cleaned data sets.
Fig. 1 shows how the true values of Moran's  statistic compare to those calculated from the contaminated and cleaned simulations.It is apparent that the effect of adding kinematic outliers is to reduce the measured value of Moran's  statistic (similar to the effects of most observational biases on Moran's , as found by A22).The IQR cut, however, brings most measurements of Moran's  statistic back in agreement with the true value.For some of the regions with a Moran's  statistic > 0.06 the IQR cut results in an over-estimation of the value of .We note however that the N-body simulations that result in such over-estimations had all undergone atypical evolution and as a result have unusual spatial and kinematic structures (e.g. one had fragmented into three separate clusters).
The number of genuine and contaminant stars removed by the IQR cut was also recorded from this experiment.On average 12 ± 11 (0.81 ± 0.76 per cent) of real stars were removed by the cut, while 17 ± 2 (86 ± 9 per cent) of the simulated outliers were removed.These results make it clear that, while the 2.5 IQR cut is not perfect, it does a good job of removing the majority of extreme outliers with relatively little infraction upon the genuine stars.
More stringent cuts are considered, but are rejected as they would inevitably result in cutting more genuine stars alongside contaminants.This trade off is deemed not worth taking as A22 shows Moran's  can give noisy results at low completeness fractions, but demonstrates high resilience against contamination.More stringent kinematic cuts are also rejected on the grounds that removing kinematic outliers from a study into kinematic structure risks removing the very structure that is being studied.This relatively permissive 2.5 IQR cut was therefore used to remove extreme kinematic outliers from all datasets.These cleaned data sets were then used to produce all the results presented in this paper.

𝐼(𝑣 2𝐷 ) of the regions
Following A22, ( 2 ) is calculated for each of our observed regions.To estimate the uncertainty on the calculated ( 2 ) values, simulated 'observed' velocities are drawn for each star.These are drawn from a Gaussian distribution centered on the velocity that was actually observed, with a standard deviation equal to the uncertainty on the velocity measurement for that star.This creates a plausible alternative data set and ( 2 ) is recalculated.This process is repeated 1000 times, and the standard deviation of the 'observed' ( 2 ) values is taken as the uncertainty on the ( 2 ) calculated from the original data.
The measured values of ( 2 ) and their estimated uncertainties are shown in Fig. 2, ordered from highest to lowest ( 2 ).The range of measured values of ( 2 ) from ∼0 to ∼0.25 is consistent with the predictions from A22 (see the left-hand panels of Figure 2 Table 1.A list of the regions studied in this paper and their properties.Ages and distances are taken from the respective catalogue papers.The  parameter is calculated in two dimensions. gives the number of identified members in each cluster, while   gives the number of stars qualifying as members after the removal of outliers (see section 3.2), which are used in the calculation of Moran's .The estimated contamination and completeness rates are presented as fractions, and the measured ( 2 ) of the region is recorded in the final column.( 2 ) are measuring similar aspects of the level of substructure in a region.

Comparison with simulation predictions
Shown in orange on Fig. 2 are the expected range of ( 2 ) values for each region if the region formed monolithically.The expected range of values is calculated using the set of twenty simulations without initial kinematic or spatial substructure presented in A22.
The calculation is performed using snapshots taken at the age of the relevant region3 .Measurement uncertainty, contamination and incompleteness are all incorporated, as described in A22, at the levels listed in Table 1.
From inspection of Fig. 2 it is apparent that for 39 out of the 48 regions studied, ( 2 ) is at least one standard deviation above the range expected if the regions formed in a manner consistent with the monolithic model, but is consistent with formation by hierarchical mergers.In the case of five of the remaining nine regions (UPK 402, Gulliver 2, Bochum 13, Berkeley 87, and NGC 6383) ( 2 ) is above the shaded orange area, but the error bar extends down into the monolithic-compatible zone.This indicates they most likely formed with kinematic structure, but could plausibly have formed without substructure.The error bars for ASCC 21 and Gulliver 10 extend above, throughout and below that zone, meaning they are potentially compatible with having formed with kinematic structure, without kinematic structure, and with kinematic anti-structure.It is therefore determined that no conclusions can be drawn from the structure of those regions.The data points and error bars for the remaining two regions (Gamma Velorum, and NGC 2362) exist solely within the orange region, indicating they show no signs of kinematic substructure.
Gamma Velorum has an estimated age of 19.5 Myr (Jeffries et al. 2016), and given its relatively compact nature at this age (its radius is 1.9 pc, Wright et al. 2024), it is presumed to be bound.These two factors mean that the region has most likely undergone a great deal of dynamical evolution, so it is to be expected that no signature of initial kinematic structure could be recovered (even if it was initially present).No conclusions can be meaningfully drawn from the nondetection of significant structure in Gamma Velorum.
NGC 2362 does not have any obvious features which would explain its low ( 2 ).However, A22 demonstrated that simulated regions initialised with statistically identical levels of kinematic structure evolve stochastically, resulting in a wide range in ( 2 ) values.After being evolved to 5.75 Myr (the age of NGC 2362) the observed values of ( 2 ) in these simulations are, by a small margin, potentially compatible with the assumption of no initial substructure.This means it is plausible that NGC 2362 did form with kinematic substructure despite that structure's present absence.Of course, the possibility also remains that NGC 2362 did not form with kinematic structure.
The remaining 39 of the 48 regions have ( 2 ) values significantly in excess of what would be expected had the regions been initially unsubstructured.Notably there are six regions whose  parameter implies a circular and centrally-concentrated morphology ( > 0.9), while their ( 2 ) values imply that they have significant levels of kinematic substructure.These regions are Sigma Orionis, IC 1805, Epsilon Orionis, NGC 6193, Lambda Orionis and 25 Orionis.The majority of these are well-known and quite compact young (Sigma Ori) or open clusters (NGC 6193 and Lambda Ori), the latter of which are old and compact enough to very likely be gravitationally bound.Only Epsilon Ori is young and sparse enough to potentially be gravitationally unbound, while IC 1805 is suspected to have an expanding halo that dominates its kinematic substructure (see Section 4.5.1). 4n A22 it was shown that spatially clustered but kinematically substructured conditions are incompatible with having formed via the monolithic model of star cluster formation, but are compatible with the hierarchical model.This observational evidence provides a direct constraint on star formation models; star forming regions must, and least predominantly, produce stars with spatial-kinematic substructure.Further, these must be formed in such a manner that this structure can survive the erasing effects of dynamical evolution for at least 5 and up to 10+ Myr.

Alternative methods of quantifying kinematics
In section 2.1 several additional kinematic parameterisations of Moran's  beyond ( 2 ) are outlined.These metrics are calculated for each of the observed regions, and the regions are ranked from highest to lowest Moran's  according to each metric.From those results it is clear that the Moran's  values of different kinematic parameters are generally similarly ranked for a given region.
This reinforces the key finding of section 4.1; kinematic structure is present in the vast majority of the observed regions.If this were not the case, and the identified structure was due to chance alignments, then it is highly unlikely structure would be present to similar extent in transverse velocity components, and the other presented paramaterisations.See appendix A for more details of this analy- Initially unsubstructured Q < 0.7 0.7 < Q < 0.9 Q > 0.9 Figure 2. ( 2 ) of the observed regions in descending order.The range of expected ( 2 ) values for each region, assuming that they formed without initial substructure, is shown in orange.The colour of each point indicates its approximate  parameter, divided between low ( < 0.7), intermediate (0.7 <  < 0.9) and high ( > 0.9).
sis and Figure A1 for a comparison of the rankings of the different metrics.

A closer look at individual regions
The kinematic structure of regions of particular interest is now discussed.To aid the interpretation of this structure, in this analysis the LISA statistic is applied to each region.Recall from section 2.2 that the LISA statistic quantifies the degree to which each data point contributes to the substructure observed in the region.In this section the regions are plotted with the stars and their velocity vectors colour coded according to their LISA statistic, to highlight the most significant kinematic substructures.

Expanding regions
Several regions with high ( 2 ) have clear visual indications of expansion in their spatial morphology and kinematics, notably NGC 6871, Barnard 35, RCW 33, UBC 323, UBC 344 and IC 1805.NGC 6871 is depicted in Fig. 3.In the left hand panel of this figure the stars are colour coded according to the LISA statistic of ( 2 ), and in the right hand panel the colour coding is done according to the the LISA statistic of ( 2 ).Comparable plots for the other five regions can be found in appendix B. In these figures the velocity vectors are in km s −1 but re-scaled uniformly such that the vectors fit well within the boundaries of the plot, rather than being too large/small to be easily interpreted.This is done the make the kinematic structure of the regions, which is the focus of this paper, as apparent as possible.
From the left hand panels of these six figures, it is evident that the stars at the largest radii tend to make the most significant contributions to the kinematic structure as quantified by ( 2 ).These stars are predominately those with the strongest radial trajectories, which is why they are found at such large distances from the centre (assuming they have expanded ballistically).This indicates that expansion is the central feature driving the kinematic substructure of these regions.
In the right hand panels of these six figures the stars are colour coded by their LISA values of ( 2 ) instead of ( 2 ).In each of these regions it is strikingly apparent that highly significant structure is present in their cores.This is particularly true for the four oldest regions, RCW 33, NGC 6871, UBC 323, and IC 1805, all of which have ages ≥ 5 Myr.The cores appear divorced from their host regions kinematically, in a very harsh, precipitous manner.This is in contrast to the comparatively gentle transition in their spatial structure from high density to low density.Note that while the right hand panels of these figures highlight that these core regions contain stars with similar speeds, these stars are not identifiable in the left hand panels as having structure in their velocities.There is also a fraction (which varies in size between these regions) of stars which are not highlighted in either figure, indicating they have no kinematic substructure whatsoever.These primarily reside at intermediate radii between the core and the expanding edges.
The most plausible explanation for this three-tiered structure (core, unsubstructured halo, expanding fringe) is that stars with sufficient initial kinetic energy to escape their host region's potential well did so very quickly.They then make up the stars driving the observed expansion, as has already been discussed.In contrast, stars with insufficient kinetic energy to escape formed a dense, bound core.Due to their low kinetic energy these core stars have similar speeds (close to zero in the rest frame in the region), so are identified as a spatialkinematic substructure by LISA of  2 .Here the rest frame is taken as the median stellar velocity of the observed members.The frequent dynamical interactions inevitable in such a bound core continually randomise the directions of the star's velocities, accounting for the lack of structure in  2 apparent there.The intermediate region, i.e. the unsubstructured halo, is most likely made up of stars whose initial velocities fell on the borderline of the regions' escape velocity at early times.These stars have likely continued to mix in the halo, accounting for the lack of structure in  2 and  2 .

NGC 6530
NGC 6530 is an extremely young region with an estimated age of just 1.5 Myr.It is also large and relatively well sampled, with 513 observed members that pass our quality cuts.NGC 6530 has highly significant substructure as quantified by ( 2 ) and shown in Fig.Other observable properties of this outwards spiral structure are that stellar density appears to decrease along its length, and stellar speeds appear to increase.
The 'spiral' is investigated further to verify whether it is a single, cohesive stream as it appears.This is done by producing a colour magnitude diagram of apparent members of the 'spiral', as well as examining their parallaxes and kinematics.From this it is determined that this structure is in fact a chance alignment of four sub-regions, rather than a cohesive stellar stream.The existence of these identified kinematic substructures within NGC 6530 re-enforces the main finding of section 4.1; young regions contain significant spatial-kinematic substructure.This is in keeping with the hierarchical model of star formation, and in opposition to the monolithic model.

DISCUSSION AND CONCLUSIONS
Spatial-kinematic data for 48 young clusters and star forming regions has been analysed.Moran's  statistic is used to quantify the degree of kinematic structure in each region, and the results are assessed in the context of the predictions made by A22, which are: • If star clusters form according to the monolithic model their kinematic structure, as quantified by ( 2 ), is expected to be zero within a very small, region-dependent margin of error.
• If star clusters form according to the hierarchical model their kinematic structure, as quantified by ( 2 ), is expected to be significantly above zero for ages up to and potentially beyond 8 Myr.At older ages ( 2 ) is expected to occupy a significantly enlarged range of possible values compared to the monolithic model.
The regions observed are predominantly (39 out of 48 regions) found to have highly significant levels of kinematic substructure, to a degree that is incompatible with those regions having formed according to the monolithic model of star cluster formation.The results for the remaining nine regions are potentially ambiguous.The relationship between spatial and kinematic substructure is investigated by comparing Moran's  and the  parameter.It is found that there is a strong correlation between the presence and strength of spatial and kinematic substructure, but that there are multiple clusters that lack spatial substructure, but have kinematic substructure.This is also predicted by, and therefore supports, the hierarchical model of star cluster formation.We conclude that on the whole the observational evidence demonstrates that stars are, at least predominantly, born with a high degree of kinematic substructure, as expected by the hierarchical mergers model of star cluster formation.Even if the hierarchical model is not assumed, star formation models must be able to account for the early and enduring kinematic substructure observed in these regions.

Figure 3 .
Figure 3.A plot of NGC 6871 colour coded by the stellar LISA values of ( 2 ) (left) and ( 2 ) (right).To make the velocity vectors of the stars more easily interpretable, a vector showing 1 km s −1 is shown in black.
4. Examination of this figure shows what appears to be an outward moving spiral structure of stars.This structure begins at an RA of 270.95 • and Dec of -24.40 • , before proceeding anticlockwise by approximately 180 • , ending at around RA = 271.00• , Dec = -24.15• .

Figure 4 .
Figure 4. NGC 6530 with stellar velocity vectors colour coded according to their LISA values of ( 2 ).To make the velocity vectors of the stars more easily interpretable, a vector showing 1 km s −1 is shown in black.
Comparison of the true and calculated Moran's  statistic for data sets containing outliers (blue) and after an IQR cut has been applied with the goal of removing those outliers (red).The true Moran's  statistic values ( True ) are shown on the -axis, and the calculated Moran's  statistic values ( Calc ) are shown on the -axis.A black line is used to indicate where these two values are equal.
(Smirnov 1939der, unbound systems (see the bottom-right panel Figure2of A22), which are generally considered to be OB associations and are not well represented in our sample.4.2 ( 2) compared to the  parameter Fig.2also shows the approximate  parameter of each cluster.It is clear that spatially substructured regions  < 0.7 tend to have the highest levels of kinematic substructure, while centrally concentrated spatial distributions ( > 0.9) tend to have low levels of kinematic substructure.A Kolmogorov-Smirnov test(Smirnov 1939) confirms that the ( 2 ) distributions of regions with  < 0.7 are significantly different from those with  > 0.9 with a -value of 0.002.This shows, as would be expected theoretically, that the  parameter and