GASTON: Galactic Star Formation with NIKA2. Evidence for the mass growth of star-forming clumps

Determining the mechanism by which high-mass stars are formed is essential for our understanding of the energy budget and chemical evolution of galaxies. By using the New IRAM KIDs Array 2 (NIKA2) camera on the Institut de Radio Astronomie Millim\'etrique (IRAM) 30-m telescope, we have conducted high-sensitivity and large-scale mapping of a fraction of the Galactic plane in order to search for signatures of the transition between the high- and low-mass star-forming modes. Here, we present the first results from the Galactic Star Formation with NIKA2 (GASTON) project, a Large Programme at the IRAM 30-m telescope which is mapping $\approx$2 deg$^2$ of the inner Galactic plane (GP), centred on $\ell$=23.9$^\circ$, $b$=0.05$^\circ$, as well as targets in Taurus and Ophiuchus in 1.15 and 2.00 mm continuum wavebands. In this paper we present the first of the GASTON GP data taken, and present initial science results. We conduct an extraction of structures from the 1.15 mm maps using a dendrogram analysis and, by comparison to the compact source catalogues from Herschel survey data, we identify a population of 321 previously-undetected clumps. Approximately 80 per cent of these new clumps are 70 $\mu$m-quiet, and may be considered as starless candidates. We find that this new population of clumps are less massive and cooler, on average, than clumps that have already been identified. Further, by classifying the full sample of clumps based upon their infrared-bright fraction - an indicator of evolutionary stage - we find evidence for clump mass growth, supporting models of clump-fed high-mass star formation.


INTRODUCTION
The stellar initial mass function (IMF) is an essential ingredient in cosmological simulations of galaxy formation and evolution, and its origin remains one of the most fundamental and important questions in the field of astronomy. In the post-Herschel era, a link between filamentary structures in the interstellar medium (ISM) and starformation has been firmly established (Molinari et al. 2010;André et al. 2010André et al. , 2014. At the low-mass end of the IMF, most stars with masses in the range * ∼ 0.1 to a few are found to form within dense filaments (Könyves et al. 2015(Könyves et al. , 2020Marsh et al. 2016). If the mass-per-unit-length exceeds a critical value set by the local sound speed, gravitational instabilities can develop, leading to the fragmentation of the filament, and the formation of stellar systems from this mass reservoir (Inutsuka & Miyama 1997; Clarke ★ E-mail: rigbya@cardiff.ac.uk † E-mail: peretton@cardiff.ac.uk et al. 2017). This core-fed scheme of fragmentation of trans-or supercritical virialized filaments may explain the origin of the IMF from ∼ 0.1 to ∼ 5 (Lee et al. 2017;André et al. 2019). However, outside this range, the proposed formation mechanisms for brown dwarfs (Whitworth et al. 2007;Luhman 2012), and the formation of high-mass ( * 8 ) stars (Tan et al. 2014;Motte et al. 2018), remain controversial.
It is well established that the Jeans-like fragmentation of molecular clouds is unable to produce dense cores which are sufficiently massive to be the progenitors of high-mass stars (Bontemps et al. 2010;Sanhueza et al. 2017), and so the formation pathway must incorporate additional mechanisms. There are broadly two families of models for high-mass star formation that remain actively debated within the field: ones in which the formation of high-mass stars resembles a scaled-up version of the low-mass star formation models (e.g. McKee & Tan 2003), where the high-mass protostellar object accretes material from a compact ( 0.1 pc) fixed mass reservoiri.e. a core supported by turbulence -and ones in which protostars grow in mass as a result of the large-scale ( 1 pc) and hierarchical gravitational collapse of its parent molecular clump (e.g. Bonnell & Bate 2006;Vázquez-Semadeni et al. 2009. These two families of high-mass star formation models may be classified as core-fed and clump-fed scenarios, respectively (Wang et al. 2010).
A growing weight of observational evidence supports a picture in which large-scale gravitational collapse, resulting in large infall rates, plays a key role in the formation of high-mass stars. Systems with the so-called 'hub-filament' configuration (Myers 2009) are also routinely found in high-spatial resolution observations of highmass clumps, with stellar protoclusters (Schneider et al. 2012;Liu et al. 2016), or the most massive dense cores found at the filament intersections (e.g. Peretto et al. 2013Peretto et al. , 2014. The filaments within these systems have also been seen to exhibit longitudinal velocity gradients that could indicate large-scale gas flows (Kirk et al. 2013;Peretto et al. 2014;Williams et al. 2018;Lu et al. 2018; ?) that supply additional kinetic support at the hub centre, whilst funnelling sufficient mass to the high-mass cores on short timescales. However, it is neither clear whether longitudinal filamentary accretion and global collapse in high-mass star-forming hub-filament systems is universal, nor at which masses the transition from the solar-type star forming mode to the high-mass mode occurs.
While the execution of such high-spatial resolution studies across the entire Galaxy remain unfeasible, large and unbiased samples of dense clumps now exist from single dish Galactic Plane surveys (Schuller et al. 2009;Aguirre et al. 2011;Moore et al. 2015;Molinari et al. 2016a). The Apex Telescope Large-Area Survey of the Galaxy (ATLASGAL; Schuller et al. 2009) at 870 µm identified all of the high-mass clumps ( > 1000 ) within the Galaxy (Urquhart et al. 2018), and Jackson et al. (2019) showed that the majority of high-mass clumps are undergoing gravitational collapse. However, limitations in sensitivity or angular resolution prevent these surveys from being able to detect the precursors of intermediate-mass star-forming clumps, where any transition between the core-fed and clump-fed regimes is expected to occur. Studies of clump morphology with higher-angular resolution single dish facilities may allow progress on this front, (e.g. Rigby et al. 2018), but large samples must still be acquired.
In this paper we present the first results from GASTON, a new 200-hour Large Programme (PI: Peretto) being undertaken at the IRAM 30-m telescope using the NIKA2 camera (Bourrion et al. 2016;Calvo et al. 2016;Adam et al. 2018;Perotto et al. 2020). The GASTON project is comprised of three parts aimed at investigating the enigmatic origin of the IMF from different angles: i) a ∼ 2 deg 2 survey of a section of the inner Galactic plane searching for the transition between the solar-type and high-mass star forming scenarios; ii) a search for pre-brown dwarf cores, iii) a study of dust properties in nearby resolved pre-stellar cores. NIKA2 is a state-of-the-art focal plane array at the Institut de Radio Astronomie Millimétrique (IRAM) 30-m telescope at Pico Veleta in Spain, which completed its commissioning campaign in April 2017. NIKA2 observes in 260 GHz and 150 GHz (hereafter 1.15 mm and 2.00 mm) continuum wavebands simultaneously by means of a dichroic mirror, with angular resolutions of 11.1 and 17.6 arcsec, respectively. There are two 1.15 mm arrays, each made up of 1140 kinetic inductance detectors (KIDs), and there are 616 detectors on the 2.00 mm array, filling the 6. 5-diameter instantaneous field of view (FoV) in each case. The noise-equivalent flux densities (NEFDs) are approximately 30 mJy s 1/2 at 1.15 mm and 9 mJy s 1/2 at 2.00 mm which, when combined with the large instantaneous FoV, result in mapping speeds that are an order of magnitude larger than the previous generation of instrumentation in operation at the IRAM 30-m telescope.
In this paper, we describe the observations and present the first results from the GASTON GP project, in which we explore the potential for the NIKA2 observations to identify signatures of the core-fed or clump-fed scenarios of high-mass star formation. The paper is structured in the following way. In Sect. 2, we describe the observations and data reduction, while in Sect. 3 we describe our analysis of the 1.15 mm data and present the results, in Sect. 4. We summarise our results in Sect. 5

Observing strategy
The GASTON GP field, centred on ℓ = 23. • 9, = 0. • 05, was selected for the project due to its extremely high density of compact dust continuum sources, as identified by the Herschel infrared Galactic Plane Survey (Hi-GAL; Molinari et al. 2016a), as well as a large of number of infrared-dark clouds (IRDCs; Peretto & Fuller 2009). The GP part of GASTON has been allocated a total of 73.6 hours of telescope time. The approximately 2 deg 2 field was divided into three sub-maps, centred on ℓ = 23. • 3, 23. • 9, and 24. • 5, and = 0. • 05, which were observed using an on-the-fly mode, and combined to form the survey area. Pairs of observations with alternating position angles at these three positions are repeated in order to obtain the total integration time, and in such a manner that an approximately uniform sensitivity should be achieved over a large fraction of the map.
The pairs of rectangular raster-scanned maps, are performed with alternating position angles in an effort to reduce striping in the reduced maps and are scanned across the plane, as illustrated in Fig.  1, to ensure that the start-point of each scan is in a position of low emission. Initially, we adopted a strategy in which scans were alternated by ±15 • with respect to the normal to the Galactic plane, as used by ATLASGAL (Schuller et al. 2009), but a ±30 • strategy was finally adopted for most scans, representing a compromise between reducing striping artefacts from data reduction, and maintaining a good mapping efficiency. For the first of the observing runs (N2R12 and N2R14), scans were made with a length of 78 arcmin, with 195 arcsec row spacing (i.e. one half of the 6. 5 FoV) between a total of 11 adjacent sub-scans, and with a scanning speed of 60 arcsec per second. Following the second observing run (N2R14), some minor adjustments were made: i) the scanning speed was increased to 70 arcsec per second, in order to assist with the recovery of extended emission; ii) the sub-scan spacing reduced to 180 arcsec to achieve more consistent coverage between sub-scans; iii) the sub-scan length was changed to 87 arcmin to achieve the same latitude coverage as the earlier scans.
The maps presented in this study are the result of a combined 30 hours of integration time, and consist of a total of 108 individual observations taken over five observing pools: N2R12 (2017 October 27-30), N2R14 (2018 January 17-23), N2R15 ( 2018 February 17-18), N2R26 (2019 January 15-22), and N2R28 (2019 February 12-19). The observations were carried out with sky opacities ranging from 225 GHz = 0.08 to 0.44, and a mean value of 0.19, as measured by the IRAM 30-m taumeter. The mean source elevation was 39. • 9, and the majority ( 80 per cent) of the observations were obtained between the hours of 08:00-14:00, indicating that the primary beams are likely to be relatively stable around their nominal values of 11.1 (1.15 mm) and 17.6 (2.00 mm) arcsec (Perotto et al. 2020, se their 2 Image generated using the package : https:// github.com/pjcigan/multicolorfits

Figure 1.
Scanning strategy for the GASTON GP field. Rectangles show the area covered by the central pixel of the arrays in the various scans centred on the white crosses. Dotted white rectangles show the initial strategy, and solid white rectangles show the final adopted strategy. The background image is a four-colour composite 2 consisting of Herschel Hi-GAL (Molinari et al. 2016a) 70 µm, 160 µm, and 350 µm images in cyan, yellow, and magenta, and NIKA2 1.15 mm signal-to-noise ratio (to exclude noisy edges) in orange (this study). The circle in the lower left corner shows the extent of the NIKA2 FoV. Fig. 12). The remainder were observed between 15:00-19:00, during which time the primary beams tend to degrade slightly to ∼ 12.5 and 18.0 arcsec, respectively. A detailed description of the operation, calibration and data reduction methods for NIKA2 can be found in Adam et al. (2018) and Perotto et al. (2020).

Data reduction
The data were reduced using the IDL pipeline developed by the NIKA Core Team, which converts the raw time-series data into calibrated maps. While a detailed description of this pipeline will be presented in Ponthieu et al. (in prep), we summarise the main steps here. First, individual KIDs that do not meet the performance criteria, are masked from the timelines, as is also done for samples associated with detected cosmic ray impacts. Next, the pointing information from the telescope control system is used in conjunction with a record of the positional offsets of each KID with respect to a reference position to determine the position of each KID on the sky as a function of time. The KIDs are then inter-calibrated, using coefficients determined from their relative gains, before applying a correction for the instantaneous atmospheric opacity, and finally applying an absolute calibration determined for the specific observing run (see Sect. 2.3).
At this stage, a model of the correlated low-frequency noise components, which consist of electronic noise that is coupled to the readout electronics of subsets of KIDs, and atmospheric noise, which is common to all KIDs, is created; this is the so-called 'commonmode'. We use the 'Most Correlated Pixels' method (Perotto et al. 2020), which works by first discarding any samples in each timeline that, when combined with the telescope pointing information and a source mask, are thought to coincide with an astrophysical source. Next, cross-correlation coefficients between all KIDs are determined and, for each KID, a model of the low-frequency noise is built by co-adding the timelines of that KID and the 15 most correlated KIDs. A linear fit of this common mode is made and subtracted from each timeline, leaving, in principle, the true astrophysical signal.
Since the GASTON GP field is known to contain a large number of bright and diffuse structures with complex morphologies, we adopt an iterative method to define the source mask used in the calculation of the common mode. For the initial iteration, we use a source mask created 3 from a Hi-GAL map of the region at 500 µm (Molinari et al. 2016a). On subsequent iterations, the source mask is determined from the astrophysical signal from the previous iteration, masking out all samples where the signal-to-noise ratio (S/N) is greater than 2, and the process is repeated to improve the estimate of the common mode, and thereby the astrophysical signal. We used a total of 55 iterations, at which point, the change in the flux is negligible compared to the map produced by the penultimate iteration.
Finally, the timelines, when combined with the telescope pointing 3 The source mask is created by first rescaling the 500 µm Hi-GAL map to 1.15 mm, assuming a greybody spectral energy distribution with a dust emissivity spectral index of = 1.8 and a dust temperature of 12 K. We then subtract a version of those data smoothed to 6.5 arcmin to approximate the NIKA2 spatial filtering, and convert the resulting map into a binary source mask by identifying those pixels which exceed half of the expected sensitivity of 1.55 mJy beam −1 .
information, are projected (tangential projection) onto a pixel grid with 2.5 arcsec pixels. Inverse-variance noise weighting is used to assign the data samples onto the pixel grid, using a nearest grid point method, and to combine the maps associated with each individual scan into a mosaic of the GP field. The correlated noise removal is the origin of the spatial filtering within the reduced maps, which is common to all submillimetre and millimetre-continuum imaging from ground-based telescopes. The level of spatial filtering applied here is typically on the order of the instantaneous FoV which, for NIKA2, is approximately 6. 5. We do not calculate the pipeline transfer function that describes the ratio of the power spectrum within the reduced image to an estimate of the absolute source power spectrum over the sky region in this paper. An example of the 2.00 mm NIKA2 transfer function towards a galaxy cluster may be found in Ruppin et al. (2018, see their Fig. 3), who found that around 95 per cent of emission at angular scales between the 2.00 mm beam size and the FoV is recovered, falling off rapidly at larger scales, and we expect a similar level of recovery here.

Calibration
Uranus and Neptune are used as the primary flux calibrators, and so colour corrections must be applied to our photometric maps to account for the difference between the spectral energy distributions (SEDs) of those planets, and the approximate SEDs of the sources is the NIKA2 imaging. We apply colour correction factors of 1.02 and 0.97 to the 1.15 and 2.00 mm bands, respectively, which were interpolated from Table 12 of Perotto et al. (2020), assuming an SED of the form ∝ ( / 0 ) 2+ , with the mean Galactic plane value for the dust emissivity spectral index of = 1.8 (Planck Collaboration 2011). The total calibration uncertainties are determined from the quadrature sum of the 5 per cent planet model uncertainties, the 5.7 and 3.0 per cent point-source rms calibration uncertainty, the 0.6 and 0.3 per cent systematic calibration uncertainties, and the 3 and 2 per cent uncertainties in the reference beam efficiencies, giving a total of 8 and 6 per cent calibration uncertainty on the 1.15 and 2.00 mm integrated flux densities, respectively. A detailed description of the absolute calibration procedure can be found in Perotto et al. (2020).

Ancillary data
We make use of two spectral line data sets in order to determine velocities to continuum sources in Sect. 3.3. Firstly, we use 13 CO (1-0) data (110.201 GHz) from the FOREST unbiased Galactic plane imaging survey with the Nobeyama 45 m telescope (FUGIN; Umemoto et al. 2017), which have an angular resolution of 21 arcsec, and an rms of ( * A ) = 0.87 K per 0.65 km s −1 velocity channel. The spatial extent of the FUGIN data covers the entire GASTON GP field, and the velocity range of −100 < LSR < 200 km s −1 , covers all of the Galactic spiral arms along this sight-line (Dame et al. 2001).
We also make use of data from the pilot study for the Radio Ammonia Mid-Plane Survey (RAMPS; Hogge et al. 2018), a Galactic plane survey of ammonia and water maser emission carried out with the Green Bank Telescope. The L23 and L24 regions of the pilot study have a considerable overlap with the GASTON GP field, spanning 22. • 5 ℓ 24. • 5, with | | 0. • 4, and we make use of the moment 1 maps of NH 3 (1,1) inversion emission at 23.694 GHz. These data have an angular resolution of 32 arcsec, and 0.018 km s −1 spectral channels.
Data from Hi-GAL (Molinari et al. 2016a) at 160 and 250 µm are used in Sect. 3.5 for colour temperature determinations. The 160 µm PACS (Poglitsch et al. 2010) data have an angular resolution of 12 arcsec, and the 250 µm SPIRE (Griffin et al. 2010) data have an angular resolution of 18 arcsec. In both cases the uncertainties are dominated by calibration, which we take as 7 per cent for PACS 4 and 6.5 per cent for SPIRE 5 , including a 1 per cent contribution for extended sources due to uncertainty in the SPIRE beam.
We also use 8 µm data from the Galactic Legacy Infrared Mid-Plane Survey Extraordinaire (GLIMPSE; Benjamin et al. 2003;Churchwell et al. 2009) in Sect. 4.3. These Spitzer data have an angular resolution of 2.0 arcsec.

Photometric maps
In Fig. 2, we present the NIKA2 photometric maps, after cropping to the largest contiguous rectangular field where the sensitivity is approximately uniform. The total integration time obtained so far is ∼30 hours in 108 scans, corresponding to roughly 40 per cent of what will be acquired upon the completion of GASTON.
Although we do not use the 2.00 mm map in the analysis presented in this paper, we present it in Fig. 2 for the sake of completeness, and to display the data quality. We will make use of these data in future GASTON studies and, in particular, the 1.15 and 2.00 mm data will be used in combination to study the dust emissivity spectral index, , as in Rigby et al. (2018).

Source extraction
A source extraction was carried out on the 1.15 mm maps using the 6 package, which uses a dendrogram (Rosolowsky et al. 2008) scheme to segment the emission while maintaining the ability to track its hierarchical structure. The dendrogram separates emission features when they are considered to be significant given their size, and difference in their intensity with respect to the background and to overlapping emission features.
The 1.15 mm map was first cropped to 22. • 95 ℓ 24. • 85 and −0. • 55 ℓ 0. • 65 (as in Fig. 2) in order to exclude the noisier edge regions. The map was then smoothed to 13 arcsec resolution in order to increase the S/N. Measuring the sensitivity in such deep maps of the inner Galactic plane is problematic because of source confusion, and so several methods were used to measure the rms noise level. The preferred method was to perform a Gaussian fit to the distribution of pixel values taken from the null maps within a sample of four octagonal sky apertures with radii of ∼1.9 arcmin (displayed in Appendix A) located across the main survey region. The four sky apertures were carefully placed across the map in order to both sample the range of integration depth in a representative manner, and to avoid regions with residual emission. The null maps are produced by the data reduction pipeline, and are made by alternately adding and subtracting the maps generated by individual scans of the full observation set. In principle, this produces a map free of emission structures whilst preserving the noise characteristics of the map. This method yields the most robust noise estimate in a line-of-sight with emission structures at all positions. The rms determined in this way was 4.18 mJy 13-arcsec-beam −1 .
Based on the estimated rms noise level, we set the minimum value (min_value) above which emission is considered to be real, and the minimum difference between substructures (min_delta) to 3 times the global rms, while the minimum structure size (min_npix) to be 16 pixels -equivalent to half of the 13-arcsec beam area. The latter choice was made for the minimum size criterion because pure point sources will be reported as smaller than the beamsize due to the clipping of the wings that extend below the minimum intensity level (as in the 'bĳection' scheme described in Rosolowsky et al. 2008), and so would be filtered from the data if the full beam area was used. This is especially true for low S/N sources. The application of a single average rms noise level over the entire area results in a small number of spurious detections in the noisier-than-average regions and, conversely, some undetected compact sources are visible in the residuals in the regions where the sensitivity is greater than this average value. To enable us to eliminate spurious sources, we calculate the local S/N of each source in each waveband. As mentioned earlier, the null maps contain some residual positive or negative emission, and are unsuitable for use as noise maps for all sources. We therefore produce a noise map in each waveband, that is calculated from the weight maps produced by the IDL pipeline. The per-pixel weights are proportional to the inverse variance of the flux densities measured at each position on the pixel grid from the contributing observations. The noise at each pixel position, , is therefore calculated as = 0 √︁ 0 / , where 0 is the rms measured from the pixels within the four sky apertures, 0 is the mean value of the pixel weights within the four sky apertures, and is the per-pixel weight. We apply a 'pruning' scheme in the dendrogram construction such that any objects that do not have a local S/N that exceeds 6 -in keeping with the dendrogram's input parameters -in at least one of their constituent pixels are removed.
In the dendrogram nomenclature, emission features may described as 'trunks', 'branches', or 'leaves', depending on their position within the hierarchy. The trunks are the lowest-lying structures in the hierarchy, at the minimum detection threshold and thus describe the maximum extent of any isolated emission region, while branches are at a higher contour level within a trunk, and may contain multiple leaves. In the following analysis, we refer to the dendrogram leaves -those emission features that contain no further discernible substructure -as 'clumps', whereas we use the term 'source' to refer to any emission feature extracted using the dendrogram, regardless of its position within the hierarchy. That is to say that a source may contain multiple clumps. The dendrogram extraction found a total of 2446 sources of all kinds, consisting of 488 isolated emission structures (trunks) with 1467 clumps (leaves). We summarise the basic properties -flux densities and source sizes -in Appendix B.

Source velocity assignments
Determining the systemic velocity, and thereby a kinematic distance, for each extracted source is a necessary step in calculating many of its physical properties. However, due to both the high sensitivity of the GASTON GP maps at 1.15 mm, and the particularly complicated line-of-sight for the survey region, source confusion presents a significant challenge in their determination. We therefore adopted a procedure with several stages to establish the most reliable possible velocities, which are summarised in this Section, and further details may be found in Appendix C.
To determine a velocity for each GASTON source, first, an integrated 13 CO (1-0) spectrum is extracted from the FUGIN data of all pixels that lie within the source footprint. Many of the spectra exhibit multiple emission features, and so the velocity with the greatest intensity is assigned as the centroid velocity, cen , of each source. In cases where the line of sight contains multiple velocity components, this velocity is assumed to trace the highest-column density material which is most likely to be associated with the dust continuum source. A quality flag was assigned to each velocity assignment, with a value of 3 given to the most robust assignments, and 1 to the poorest, and a value of 2 is given for intermediate quality assignments.
The main advantage of this method is its simplicity, but it may obviously provide wrong velocity assignments in some cases, primarily due to the difference in densities traced by the molecular and continuum data. In principle, the dust continuum emission traces the full column density of molecular hydrogen, but the spatial filtering applied by the data reduction pipeline removes much of the emission from diffuse gas, leaving the more compact, and higher columndensity gas. Observations of dense gas tracers would provide more suitable velocities for the GASTON GP sources than 13 CO (1-0), and so, as a second stage, we use data from the RAMPS pilot study (Hogge et al. 2018) of NH 3 (1,1), which has a critical density of crit ∼ 3 × 10 3 cm −3 to determine dense-gas velocities. We adopt velocities for 517 (21 per cent) of the sources from the first component from the line-fitting procedure, and a further 513 (21 per cent) from the first-moment maps of Hogge et al. (2018), assigning a quality flag of 3 for all ammonia-derived velocities.
The hierarchical structure of the 1.15 mm emission is recorded within the dendrogram catalogue, and since each source now has an individual velocity assignment, along with a quality flag, the structure can be used to inform us about likelihood of the more dubious velocity assignments being the correct ones. For example, if a velocity assignment of 50.5 km s −1 is given for a source with a poor quality flag, but its parent structure has a robust velocity assignment of 50.9 km s −1 , then it is considered to be part of that same velocity group. If a poor velocity assignment is made to a source at a significantly different velocity to its neighbours within the hierarchy that have robust assignments, then we adopt the median velocity of those neighbours for that source. In this way, the velocity assignments for all sources (where possible) with a quality flag of either 1 or 2 were refined, and we adopt the quality flag of the most robust velocity within that group for all its constituent sources. We refer to the velocities refined in this way as group . In Tab. 1 we present the distribution of quality flags assigned to the velocities determined and refined in this Section.
Finally, in order to ensure that velocity-coherent structures are retained, we perform a friends-of-friends grouping. Clusters of sources were identified by recursively linking groups that lie within a tolerance of 0. • 06 in the ℓ and axes, and within 2.5 km s −1 in line-of-sight velocities, group , which approximates the characteristics of small (10 pc-diameter) molecular clouds (e.g. Roman-Duval et al. 2010) at a distance of 5 kpc -the median near-kinematic distance for the revised source velocities along this sight-line. This allows, for example, the prominent large-scale filament that extends from [ℓ, , group ] = [24. • 16, 0. • 40, 94 km s −1 ] to [23. • 57, 0. • 59, 98 km s −1 ] to be identified as a single cluster. 2249 (92 per cent) of all GASTON sources are connected to one of the 211 clusters identified in this way, which contain up to 233 individual sources, with a median size of 25 sources. For each cluster, we also calculate the median group velocity, cluster , and the associated standard deviation, cluster , using only the sources which have the most reliable velocity flags from that cluster.
The usage of 21 arcsec-resolution spectral line data with our 13 arcsec resolution element means that there is an element of beamaveraging that is likely to blend together velocity components that would be distinct at matching resolution. We have not made use of available survey data in 12 CO (3-2) (Dempsey et al. 2013) at 16.6 arcsec resolution, due to optical depth considerations. While Table 1. Distribution of quality flags for each stage in the velocity determination process. For each flag value, the total number of sources with that value ( ) and the percentage of the sample made up by those sources (%) is given.  Frayer et al. 2020) hold much promise, unbiased Galactic plane surveys in dense gas tracers such as N 2 H + , HCN, NH 3 or even C 18 O at sufficiently high sensitivity and angular resolution are not currently available, and the FUGIN and RAMPS data present the best data for the moment. We note that CHIMPS2 (Eden et al. 2020) aims to cover the GASTON GP field in 13 CO (3-2) which, at an angular resolution of 15 arcsec and with a higher-critical density molecular gas tracer than that of the FUGIN, should complement GASTON upon its completion, though it is limited to | | < 0. • 5. We note that we do not use data from the Galactic Ring Survey (GRS; Jackson et al. 2006), which have velocity resolution and sensitivity superior to the FUGIN data, because the higher angular resolution of the latter are more important in this case.

Distance determination
Distances to sources lying within the Galactic plane are typically determined by measuring their radial velocity, and by assuming that the source is following a circular orbit described by a model of the Galactic rotation curve. For sources lying within the Solar Circle, each radial velocity corresponds to two kinematic distance solutions. In order to determine whether a source lies at the near-or far-kinematic distance, one either requires additional information, or some assumptions must be made. We determined kinematic distance solutions to each source using the Galactic rotation model of Reid et al. (2019) based on their ℓ, , group coordinates. As a first estimate, we use version 2.4.1 of the Bayesian distance calculator 7 (Reid et al. 2016), adopting the recommended weightings for the priors based upon the spiral arm model, Galactic plane scale height, and proximity to sources with a measured parallax. This gives an independent distance estimate to each source (regardless of position within the hierarchy), but we note that this preferentially concentrates sources into the spiral arms. However, since the spiral structure of the Milky Way is very much an ongoing matter of research, we do not adopt these distance solutions directly. Rather, we use the distances calculated with these various priors to distinguish between the near and far kinematic distances. 2274 (93 per cent) of the 2446 sources are assigned to the near kinematic distance in this way, with only 120 (5 per cent) assigned to the far distance, and the remainder are located at the tangent point ( ≈ 7.5 kpc, assuming the Reid et al. 2019 distance to the Galactic centre of 8.15 kpc), where the near and far solutions are equal. By comparison, 77 per cent of the ATLASGAL clumps within the catalogue of Urquhart et al. (2018) have distances less than the tangent distance, after reducing the sample to those within the GASTON field, and adjusting for the difference in tangent distance associated 7 http://bessel.vlbi-astrometry.org/ However, in the latter case, there is a bias towards far distances, as objects for which data (such as extinction data) that may allow a near determination to be made were unavailable are assumed to be located at the far distance. After removing clumps with this default assumption from the Hi-GAL catalogue, the number rises sharply to 89 per cent. We show the resulting distribution of distances in Fig.  3, and compare the values to their counterparts in other catalogues in Appendix E. We find that two-thirds of our distance estimates match with those of the other catalogues to within 1 kpc and can be considered to be in general agreement. It is not surprising that a large fraction of sources are found to be located at the near distance as a result of the combination of Malmquist bias, and the fact that the ℓ = 24 • sight-line covers nearby sections of the Sagittarius, Scutum, and Norma spiral arms (Reid et al. 2019). However, we caution that the friends-of-friends analysis used here will inevitably lead to cases where sources at the far distance appear in clusters of sources for which our methodology favours a near distance.
In a manner similar to the determination of the cluster velocities, the median distance to each cluster is determined from the set of distances stemming from its constituent sources with the most robust velocity measurements, and then assigned to all sources within the cluster. Finally, we account for clusters which host maser sources with a known parallax distance. A total of 11 masers with parallax distances are located in the field, and after identifying the cluster associated with the , , position of each maser source, we assign the parallax distance to all sources within that cluster. There are a total of nine clusters that contain maser sources and are assigned distances in this way. Two of the eleven maser sources, G023.00-0.41 ( = 79 ± 5 km s −1 , = 4.88 ± 0.36 kpc) and G023.20-0.37 ( = 82±10 km s −1 , = 4.18±0.61 kpc), are associated with objects within the same cluster, and since the associated distances are significantly different, a choice must be made. For this cluster, we adopt the parallax distance associated with G023.00-0.41, which is in stronger agreement with both the mean of the predetermined kinematic distances ( = 5.20 kpc) and systemic velocities (77.5 ± 0.2 km s −1 ). A further two of the maser sources, G023.65-0.12 and G23.6, are located within the same cluster, but their distances and velocities are identical, and so present no conflict. The median discrepancy between the kinematic and maser parallax distances to these seven clusters is 0.41 kpc, though the largest difference is 1.73 kpc.
At this point, we have derived two distances for each object within the dendrogram, group and cluster , both of which are derived principally from the sources' line-of-sight velocities, and the assumption that they simply follow circular motions about the Galactic centre. However, it is well known that line-of-sight streaming motions of up to ∼ 10 km s −1 can be present for molecular clouds, and peculiar motions within molecular cloud complexes further complicate this picture. To mitigate these effects in the subsequent analysis, the cluster distances were, therefore, used to determine any distance-dependent properties, and we note that the largest internal velocity dispersion of the clusters identified using the friends-of-friends grouping is 2.3 km s −1 . The difference between the two sets of distances is minimal and, when considering the population of clumps, 98 per cent have a difference between group and cluster of less than 0.2 kpc.

Physical properties
Estimating dust temperatures ( d ) and column densities ( H 2 ) of molecular hydrogen from dust continuum imaging is typically done by fitting a modified blackbody to the spectral energy distribution (SED): where is the specific intensity, H 2 is the mean molecular weight per hydrogen molecular (with a value of 2.8), H is the mass of a hydrogen atom, is the specific dust mass absorption coefficient, and is the Planck function evaluated at the frequency for the dust temperature. Three or more photometry points taken from imaging at submillimetre or millimetre wavelengths, convolved to the same resolution, are usually required (assuming both and are fixed), providing a fit at the resolution of the longest wavelength being used -typically 36 arcsec for Herschel 500 µm data. However, the usage of dust temperatures derived at 36 arcsec-resolution for the 13 arcsecresolution scales of the 1.15 mm data would mean that local minima and maxima would be beam-diluted. The different spatial frequencies probed by the ground-and space-based observatories also presents a further limitation.
To overcome these limitations, we adopt colour temperatures derived from the ratio of Hi-GAL 160 to 250 µm flux densities (e.g. Peretto et al. 2016;Rigby et al. 2018), which have an effective reso-lution of 18 arcsec. The colour temperature is related to the ratio of flux densities in the following way: ( 2) where is the source-integrated flux density = ∫ Ω. By adopting a fixed value of = 1.8, and using 160 and 250 µm photometry from Hi-GAL imaging, we sampled col from a grid of values ranging from 2.7 to 50 K, with 0.1 K intervals. Uncertainties on the Herschel photometry result in uncertainties on col of ∼1.5 K. We first performed a convolution to effectively match the PSF of the 160 µm image to that of the 250 µm image following Aniano et al. (2011), and both images were high-pass filtered to remove spatial frequencies larger than 6.5 arcmin using the application of the Cambridge Astronomy Survey Unit Tools software package 8 to approximate the spatial filtering present in the NIKA2 data.
Many of the sources in the GP field consist of a mixture of compact and extended structure, and the flux densities of compact sources, therefore, usually contain contributions both of the compact object itself, and the extended emission structure on which they reside. We have therefore adopted the 'clipping' scheme (Rosolowsky et al. 2008) of flux estimation for (and thereby the mass) for each source, in which the local background level of flux is subtracted from the total, or put another way, only flux above the contour level defining the structure is counted in the integrated flux. Although Rosolowsky et al. (2008) prefer the 'bĳection' scheme, in which the entirety of the flux density integrated across the source's sky position is used for the mass calculation within molecular clouds, the clipping scheme is used here, since we are most interested in tracing the mass of the smallest structures within the hierarchy. We determine the appropriate background level for each source by first performing a binary dilation on each of the source masks, which expands the perimeter by one pixel, and allows the encompassing perimeter pixels to be isolated (i.e. pixels just below the 1.15 mm contour level at which the source was identified). The background flux density at 1.15 mm is the contour level at which the source was separated from its parent structure, and was determined as the mean of the minimum pixel value within the source, and the maximum pixel value within the perimeter pixels. A slightly different approach must be taken to determine the background flux densities for the 160 and 250 µm, because the morphology of emission at those wavelengths may differ from the morphology at 1.15 mm at which the source was defined. In contrast to the 1.15 mm case, some of the pixels within the boundary region may be brighter than some of those within the source area. At these wavelengths, the background value is therefore taken as half of the sum of the mean value of pixels within the perimeter, and the minimum pixel value within the source, thus approximating the 1.15 mm contour.
The masses of the sources identified in the dendrogram are calculated from , the monochromatic flux densities integrated over the solid angle subtended by the source at 1.15 mm using where the dust absorption coefficient is given by = 0.1( /1THz) (Beckwith et al. 1990). We use a value evaluated at 1.15 mm (260 GHz) of 260 = 0.009 cm 2 g −1 , assuming = 1.8, and incorporating a gas-to-dust mass ratio of 100. This value is close to the value of 0.008 cm 2 g −1 used in our previous NIKA study (Rigby et al. 2018), and that adopted by Hi-GAL in Elia et al. (2017). We discuss the impact of different values of upon our results in Sect. 4.4. For all sources, we use their cluster distance, cluster , as determined in Sect. 3.4, and we decrease the integrated flux densities by 4 per cent to account for the median level of CO (2-1) contamination (determined in Appendix D), adding a 4 per cent contribution to the corresponding uncertainties. Uncertainties are calculated using Monte Carlo methods for the primary quantities for each source (i.e., distance, absolute flux calibration uncertainty), and propagated to all derived quantities, such as mass and colour temperature. We note that the contribution of the factor of ∼ 2 uncertainty on is not included in our mass uncertainties since its effects are likely to be mostly systematic across the region, and will consequently not have an impact on the analyses presented here.
We note that, as a result of the 18 arcsec resolution of the colour temperature estimates, smaller-scale temperature variations will not be detected. This may result in the over-estimation of source masses where compact sources warmer than the beam-smeared average are present, and vise-versa for colder compact sources. At large radii, the temperature of protostellar sources is of the form ( ) ∝ 2/(4+ ) , where is the dust emissivity spectral index (Terebey et al. 1993) and, by adopting = 1.8, we can expect to underestimate dust temperatures of point sources at 18 arcsec by up to ∼ 11 per cent compared to those at 13 arcsec. For resolved sources, this effect is unlikely to play a significant role, but we conservatively increase the uncertainties on col for all sources by 11 per cent to account for this effect.
We discuss the resulting distributions of mass and temperature in Sect. 4.2 and Sect. 4.3, and present the overall distribution in Fig. 6.

Comparison to other ground-based (sub-)millimetre data
This particular region of the Galactic plane is well covered in many recent surveys of (sub-)millimetre continuum emission. The most similar ground-based surveys to our NIKA2 observations are the 1.1 mm BGPS at 33 arcsec resolution (Aguirre et al. 2011;Ginsburg et al. 2013) and, at 870 µm and with 19.2 arcsec resolution, AT-LASGAL (Schuller et al. 2009). While the data from these surveys are very well matched to GASTON in terms of wavelength, they were designed to cover far wider regions of the GP (covering 192 and 360 deg 2 , respectively), and so their sensitivities are lower than those of the data presented here.
We have identified a total of 1467 GASTON clumps (dendrogram leaves), compared to 346 ATLASGAL compact sources (Urquhart et al. 2018) and 164 from the BGPS (Ellsworth-Bowers et al. 2015) within the same sky area. Of the 164 sources from the BGPS, only 2 sources do not have centroid coordinates lying within three convolved beam radii 9 of a GASTON clump centroid. In both cases the BGPS source is located within a large-scale diffuse region in which the GASTON image has partially resolved and detected clumps around the periphery, and have centroid coordinates located far away from the particular region of emission extracted from the BGPS data. All of the ATLASGAL clumps lie within three convolved beam radii of a GASTON compact source, and they all fall within the boundaries of emission extracted by the dendrogram analysis in this work. The centroid coordinates of 1237 (84 per cent) and 1114 (75 per cent) out of the 1467 GASTON clumps are located further than three convolved beam radii of any BGPS or ATLASGAL source centroid, respectively. Although this positional matching based on centroid coordinates will be less effective for sources that are not centrally concentrated, ∼95 per cent of GASTON clumps have angular radii of less than three beam radii (i.e. 39 arcsec), and so it is clear that the majority of GASTON clumps are new identifications.
In Fig. 4 A more sensitive study of this region was recently carried out with the Large Millimetre Telescope at 1.1 mm using AzTEC (Heyer et al. 2018). Those observations were centred on ℓ = 24. • 5, = 0. • 0, with a radius of 0. • 55 -partially overlapping with the GASTON GP field -inside which they reached a median sensitivity of 9 mJy beam −1 with a resolution of 8.5 arcsec. After rescaling to 1.15 mm, assuming a spectral energy distribution of the form ∝ 3.8 , to match our NIKA2 observations, the point-source sensitivity would correspond to ∼ 7.6 mJy beam −1 , making them around half as sensitive as those presented here. There are a total of 993 AzTEC compact sources within the region that overlaps with GASTON, after restricting the source catalogue to those considered by Heyer et al. (2018) to be robust, i.e. probability of a false-positive detection of < 2 per cent, compared to 693 compact GASTON clumps, and 1175 GASTON sources of all kinds. 698 (70 per cent) of the AzTEC source centroids fall upon a pixel associated with an extracted GASTON source, and the remaining 30 per cent are likely to be a result of: i) the greater resolution of the 32-m LMT Alfonso Serrano at a 1.1 mm (8.5 arcsec) compared to the 30-m IRAM observations at 1.15 mm (11.1 arcsec) and ii) the more pronounced spatial filtering in the AzTEC observations (removing spatial scales > 50 arcsec) compared to that of NIKA2 (∼ 6. 5); iii) the area of maximum sensitivity in the AzTEC map extends into regions of higher noise in the GASTON map, where our detection rate is lower. Furthermore, we have adopted dendrogram parameters in this study optimised to detect emission structures on all scales, and a treatment using harsher spatial filtering like that applied in the AzTEC data, or those of the ArTéMiS observations of Peretto et al. (2020) would be likely to increase the overall number of compact sources detected.
The statistics discussed in this Section are summarised in Tab

A new population of clumps
NIKA2 is expected to be more sensitive to cold and compact dust sources within the Milky Way than Herschel, owing to its greater angular resolution at longer wavelengths. To determine whether we have identified compact structures within the GASTON GP data that were not identified using the various extraction techniques applied to the Herschel data, we compared our catalogue to two sets of Hi-GAL compact source catalogues. We performed a series of cross-matching exercises to identify any 1.15 mm GASTON clumps that have not been identified by either the five monochromatic source catalogues of Molinari et al. (2016a), or the 'high-reliability' band-merged (BM) catalogue of Elia et al. (2017).
To carry out the source matching, we adopted the elliptical source sizes from the monochromatic catalogues, and the circular source sizes for the BM catalogue -after convolving with a Gaussian kernel representative of the 1.15 mm NIKA2 beam -to look for overlaps with the footprint of each dendrogram structure, and record the number of matches in each case (see Tab. 2). By comparison to the monochromatic catalogues, we found the lowest fraction of GAS-TON clumps that do not match to any Hi-GAL compact source in the 250 µm band, with 321 GASTON clumps (22 per cent) without a 250 µm counterpart, and 765 (40 per cent) with no bandmerged counterpart. We compare the Herschel 250 µm data and the GASTON GP 1.15 mm data for a 0. • 5 × 0. • 5 region centred on ℓ = 24. • 35, = 0. • 15 in Fig. 5, overlaying markers representing catalogued sources from the 1.15 mm dendrogram extraction and the two Hi-GAL catalogues. The highest match fraction with sources in the 250 µm was expected, as these particular Herschel observations represent a compromise between the greater angular resolution available at the shorter wavelengths, and the greater sensitivity to the coldest structures at the longer wavelengths. Although the 500 µm catalogue represents the closest match in wavelength, the large difference in angular resolution results in a substantially different segmentation of the emission under the different source extraction methods.
In Fig. 6, we compare the distributions of mass and colour temperature, col , for the full sample of sources. Their statistical properties are summarised in Tab. 3. The new clumps identified by our NIKA2 observations with no counterpart in the Hi-GAL catalogues fall, on average, towards lower mass and temperature than those associated with Herschel-catalogued sources. We compare the log 10 ( / ) and col distributions for the clumps that do have 250 µm Hi-GAL matches and those that do not, using two-sample Anderson-Darling. The results show that the distributions of both properties in the two populations are inconsistent at a high confidence level, with -values of < 0.02 in both cases.
There are 14 clumps with masses in excess of 100 M that are new detections. We note that to compare these values to other survey data, such as ATLASGAL, one must consider the masses measured in the same way, and the ATLASGAL clump masses of Urquhart et al. (2018) adopt a method more similar to the 'bĳection' scheme (as discussed in Sect. 3.5). The approximate conversion between the 'clipped' masses reported in this study, and the equivalent bĳected masses, calculated from combined Herschel and NIKA2 column density maps (following the method of Rigby et al. 2018) is bijection ≈ 41 × 0.67 , with a simple linear least-squares fit in log-space. In this metric, 33 of the 321 new clumps have bijection > 1000 , and may be considered as high-mass. 256 of the 321 new clumps (80 per cent) have no compact 70 µm counterpart found by matching to the Molinari et al. (2016a) catalogue, and so may be considered as candidates for starless sources, though we caution that this is an upper limit, and a lack of a 70 µm match does not guarantee a lack of embedded star formation (e.g. Traficante et al. 2017).
We point out here that the various catalogues used in the crossmatching exercises in this Section and in the previous one, were generated using different source extraction techniques. We caution that we expect the total number of sources in the different data sets to vary with the methodology, but it is clear that we detect a substantial  number of previously undetected sources, with the 321 sources that do not coincide with 250 µm compact sources representing the most conservative estimate.

The evolution of star-forming clumps
To characterise our sample, we have computed the infrared-bright fraction for each source, IRB , based on GLIMPSE 8 µm imaging (Benjamin et al. 2003;Churchwell et al. 2009). The infrared-bright fraction is defined as the fraction of pixels within the source boundary that are brighter than the median value of all pixels within a 4.8arcmin box centred on each pixel (see Watkins et al. in prep for a full description). The filter size was selected so that the most extended sources within the 8 µm image -typically bubbles associated with H regions -could be compared to their local background. Although this filter scale represents a different spatial scale for the nearest and farthest sources within the region, it offers robust measurements for sources with angular sizes less than half of the filter scale, which is true for all GASTON clumps. We note that a very similar the classification scheme was used by Battersby et al. (2011), who used GLIMPSE 8 µm images to classify a sample of Hi-GAL sources.
In Sect. 4.2, each clump was also matched to the 70 µm catalogue of Molinari et al. (2016a) where 70 is the sum of the integrated flux densities of any compact sources from the catalogue of Molinari et al. (2016a) that lie within one beam-radius of a GASTON clump boundary. The infrared-bright fraction, IRB , may serve as an indicator for the evolutionary stage of a source: sources that are completely infrareddark ( IRB = 0) are likely to be at the earliest stages of evolution, containing no observable indications of star-formation in terms of their 8 µm flux and, conversely, sources that are completely infraredbright ( IRB = 1) are likely to be sources exhibiting signs of advanced star-formation, most commonly in the form of emission from young stellar objects, H regions, or heated polycylcic aromatic hydrocarbon (PAH) molecules. Although a more detailed analysis will be presented in Watkins et al. (in prep.), we demonstrate the basic utility of IRB as an evolutionary indicator in several ways in Fig. 7. Firstly, there is a positive correlation (Pearson correlation coefficient = 0.57, −value 0.1) between IRB and the source temperature as traced by col . Secondly, there is also a positive correlation ( = 0.57, −value 0.1) between IRB and log 10 ( bol / ). Temperature and bol / are both generally regarded as robust tracers of the time-evolution of clumps (e.g. Urquhart et al. 2014Urquhart et al. , 2018Molinari et al. 2016b;Svoboda et al. 2016;Elia et al. 2017). Finally, we show that the fraction of GASTON sources that are associated with compact 70 µm sources from Hi-GAL (Molinari et al. 2016a) also increases with IRB , and reaches a plateau of ∼ 85 per cent for sources with IRB 0.5.
It is important to be aware of the potential for distance biases using the IRB parameter, since darkness at 8 µm usually requires a bright background against which the radiation can be absorbed. We therefore reduce our sample in the following analysis to include only those clumps that lie at distances between 3.5 and 7.0 kpc. This criterion serves a dual function; firstly, it limits the effect of the distance bias in IRB since infrared-dark clouds are generally located at the near kinematic distance (e.g. Ellsworth-Bowers et al. 2013), and the tangent for kinematic distances at this longitude range is ∼ 7.5 kpc. Secondly, this limits the role of a varying spatial resolution element in the source extraction process, and these particular limits mean that the most distant sources have only a factor of 2 lower spatial resolution than the nearest ones. Conveniently, these limits remove only ∼10 per cent of the sample.
We define four categories through which we divide the sample of clumps based upon their infrared-bright fractions that can be used to broadly trace evolutionary phase: i) IRB < 0.040 for predominantly infrared-dark clumps; ii) 0.040 IRB < 0.229 for clumps that are mostly infrared-dark; iii) 0.229 IRB < 0.663 for clumps that are partially infrared-bright; iv) IRB 0.663 for clumps that are predominantly infrared-bright. The specific values of IRB were chosen such that the four sub-samples are of equal size, each containing a total of 321 clumps. The fraction of clumps in each category that are associated with compact 70 µm counterparts, 70 also increases monotonically with the proposed evolutionary sequence, with 21, 33, 59 and 69 per cent of the clumps having 70 µm counterparts in samples i)-iv), respectively.
Tracing the precise time-evolution of star-forming regions is not a trivial matter, but although we caution that IRB does not give an absolute value of 'evolutionary stage' or age for individual sources, it may be regarded as relative age indicator and, therefore, a suitable measurement when dealing with large samples of sources. Using a similar methodology to that used in this paper, Battersby et al. (2017) estimated the fraction of a sample of dense molecular regions above a column density threshold associated with high-mass star formation that were starless or star-forming. By comparing the relative fractions of these categories to those associated with methanol masers and ultra-compact H regions -which have established lifetimes -the authors estimated the lifetime of the starless and star-forming phases as 0.2-1.7 and 0.1-0.7 Myr, respectively. These two timescales may broadly map onto stages i)-ii) (predominantly starless) and iii)-iv) (predominantly star-forming), respectively, indicating a total duration of 2.5 Myr for the evolution from IRB = 0 to IRB = 1.
In Fig. 8, we display these four samples in (logarithmic) mass and colour temperature space. By plotting the mean position of the four sub-samples in the two axes, a progressive movement of the distributions is seen as the clumps get brighter at 8 µm. These mean values suggest that the clumps, on average, gain a moderate amount of mass -increasing by ∼ 0.2 dex between stages i) and iii), reflecting a 60 per cent increase -before losing the same amount of mass, all the while increasing in temperature. The errors on the mean values indicate that although these changes in mass are subtle, they are significant at the 3.1 level between stages i) and ii), at the 4.3 level between stages i) and iii), and at the 4.0 level of significance between stages iii) and iv). The increase in source counts as the GASTON GP project progresses will allow these values to be revisited with greater statistical power. The mean values and their associated standard errors are given in Tab. 4. A further striking feature of Fig. 8 is the increasing spread in temperatures along the evolutionary sequence. This behaviour is expected as a consequence of the more rapid evolution of the highest-mass sources, in addition to the mass function. The more numerous low-mass sources are expected to evolve relatively slowly, only increasing in temperature, and losing mass at later stages as the clump is dispersed. The rarer high-mass clumps are expected to rapidly gain in mass and temperature at early sages, before losing mass at later stages.  This reflects the behaviour that might be expected from a clumpfed model of star formation, whereby the mass gain of the clump is the result of accretion from its direct environment, and the clump is subsequently dispersed by feedback as star formation progresses. By contrast, in a core-fed scenario, such clumps do not exhibit an initial increase in mass, since all of the mass is in situ before star-formation begins, which seems incompatible with Fig. 8. Furthermore, the movement of the mean points in Fig. 8 bears a striking resemblance to the mass-temperature tracks described by the clump-fed models of Peretto et al. (2020, see their Fig. 7), supporting the coeval mass growth of clumps and embedded protostars within. However, it is important to note that the sizes of the GASTON sources presented here are, on average, larger than those presented in Peretto et al. (2020). This might suggest that the same accretion process from large to small scales occur over a relatively wide range of scales. A qualitatively similar behaviour as that seen in Fig. 8 is observed by Elia et al. (2017, see their Fig. 22). In that study, clumps were categorised as starless or star-forming based upon the presence of 70 µm emission, with the star-forming sample further divided ac-cording to the detection or non-detection of 21-24 µm counterparts, and coincidence with H regions.
We highlight that each of the four subsamples is likely to contain a mix of both core-fed and clump-fed sources, and so the signal present here in the progressive movement of the mean position will be moderated by the core-fed sources toward the low-mass end of the mass distribution. If the clump-fed scenario is more likely to be important for the highest-mass clumps, then this behaviour could be reflected in the upper envelope of the distributions. The 95th percentile in mass changes by +0.08, −0.03, and +0.06 dex, while the 95th percentile in temperature increases by 0.7 K, 2.7 K and 4.4 K between stages i)-ii), ii)-iii), and iii)-iv), respectively. The lack of a reduction in the 95th percentile in logarithmic mass in the final stages is more difficult to interpret, as it suggests that the clump dispersal phase of the most massive clumps is either not being sampled well here, or lasts a relatively long time. The former explanation could easily be true, as the number of sources in the 95th percentile is only 16 in each stage, and thus the measurement suffers from small numbers. Urquhart et al. (2014) argued that, although the accretion phase for high-mass protostars is probably very rapid, they may then take a relatively long time to begin dispersing the enveloping clump, resulting in the clustering of sources at the apex of their mass-luminosity evolution. Qualitatively, this would result in the same behaviour seen here with these highest-mass clumps spending much of their time in the high-mass, high-temperature part of this diagram. This is also supported by the clump-fed models of Peretto et al. (2020), whose evolutionary tracks for the highest-mass stars shows an accretion phase that is relatively short-lived compared to the lifetime approaching their maximum mass.
It is unlikely that this sequence arises as a result of observational effects such as sensitivity or completeness. The point-source sensitivity limits are displayed in each panel of Fig. 8, and do not display any signs of correlation with the movement of the mean position of the distributions in stages i)-iii); lower-mass clumps are more likely to be observable at later stages when temperatures are higher, and so this effect may in fact be suppressing mean value of the mass in the middle stages, and the average mass gain in high-mass sources is therefore likely to be greater than the value of 60 per cent deduced from this study.
The source extraction, which was carried out on the 1.15 mm map, is completely independent of the IRB parameter, and so we do not expect any systematic bias in terms of completeness as a function of evolutionary stage. The effects of mass completeness manifest themselves in this parameter space by the position of the peak of the distribution. The real distribution of clump masses is expected to consist of a power law ∝ , where < 0, and so the turnover close to the mean value reflects the completeness that is a function of the sensitivity and source extraction methodology. Since both of these factors are consistent across all samples, it is difficult to conceive a scenario in which differences in completeness could be dominating the position of the mean.

Caveats on the proposed evolutionary scenario
In the previous Section we have provided evidence, acquired from the catalogue of clumps identified at 1.15 mm, that the overall population is most massive at intermediate evolutionary stages, as traced by the IRB proxy for relative evolution. In this Section, we explore various caveats to this analysis, and test the robustness of the observed trend.
Firstly, our methods for acquiring clump distances in Sect. 3.4 have resulted in a large fraction -93 per cent -of sources that are located at the near kinematic distance. We expect a high fraction . Mean evolutionary trend after altering the samples to: a) displace a randomly-selected third of clumps to the far kinematic distance (using three sets); b) adopt group as the distance estimate; c) adjust the distance limits to 4 < cluster < 6 kpc; d) normalise the clump masses to the mean distance of 5.35 kpc; e) adopt equally-spaced IRB limits for the subsamples; f) use alternative values of . In each case, the original trend is shown as grey squares, with mean positions of the altered samples overlaid in coloured circles. The shaded regions show the ±1 confidence limits.
along this line of sight, and it is, perhaps, not surprising that a higher fraction is recovered than for clumps in ATLASGAL (77 per cent), BGPS (79 per cent), or Hi-GAL (89 per cent of the 'goodquality' assignments), due simply to the greater mass sensitivity of GASTON, and the greater prevalence of low-mass compared to high-mass clumps. However, we have tested the robustness of the proposed evolutionary trend by repeating the analysis three times after first randomly assigning 1/3 of the population of clumps to their far kinematic distance solutions. The resulting trends are illustrated in Fig. 9, and the overall effect is, in two of the three cases, to slightly reduce the significance of the differences between stages i)ii), i)-iii), and iii)-iv) compared to the original distance assignments, while the significance is increased in the third case. This tendency to slightly reduce the significance can be mainly ascribed to the resulting smaller sample sizes, as the clumps now assigned to the far-kinematic distance fall outside the distance limits of the sample, consequently increasing the error on the mean. Another choice affecting the results was the decision to adopt the cluster-averaged distances, cluster , as opposed to the dendrogram group-averaged distances, group . If we adopt the latter in place of the former, and repeat the analysis, we see almost no change in the results, since the average difference in the two distances in on the order of one hundred parsecs. In fact, as can be seen in panel b) of Fig. 9, the evolutionary trend is slightly strengthened in this scenario with the largest change coming between stages i) and iii), now at the 4.5-level of significance, compared to 4.3-in the original analysis.
The choice of distance limits used could also have an effect upon the recovered trend, and we repeated the experiment by adopting more conservative distance limits, such that a maximum factor of 1.5 difference in the spatial resolution at the near and far cut-off was used, i.e. 4 < cluster < 6 kpc. The effect here is, again, to reduce the sample size, now with 259 or 260 clumps in each stage compared to the original 321, and thereby reduce the significance of the trend. In this case, the largest change in log 10 ( / ) between stages i) and iii) is still significant at the 3.7-level (panel c) of Fig. 9).
We also note the presence of potential biases in both distance and angular size as a function of IRB . Relative to one another, the average values of both quantities shows the same behaviour as the average mass throughout the four proposed evolutionary stages. A bias towards nearer distances for the more infrared-dark sources can be understood by their requirement for a bright infrared background against which they can be seen, though it is not clear why there should be a bias for the brightest sources to also be located closer to the observer. The bias towards larger sources at intermediate values of IRB can be understood by the nature of the median filter, by which sources should tend towards IRB = 0.5 as their size approaches the filter size, by construction. We illustrate these biases in Fig.  10, indicating the mean values and standard deviations. One way to explore the effect of the distance bias is to normalise all of the clump masses to the same distance, and we do this by rescaling the masses as if they were located at the mean distance within the sample, multiplying them by a factor of (5.35 kpc/ cluster ) 2 , and we include the modified evolutionary trend in panel d) of Fig. 9. The bias in distance is extremely mild, and has little effect upon our conclusions. However, the effect of the bias in angular size is more difficult to test, and so we caution that there may be an effect at play here though, again, the effect is probably very mild, considering the overlapping distributions. In Fig. 10, we also show the lack of any significant bias in IRB as result of the latitude of the clumps, indicating that this particular line of sight has a sufficiently bright infrared-background so as to make any latitude-dependent bias negligible.
In Sect. 4.3, the limits to IRB were selected to equalise the size of the four samples, and thereby eliminate the statistical effects of varying sample sizes. Since IRB is expected to trace the relative evolution of each clump (however non-linearly), a secondary effect of this choice of samples is that a similar fraction of the clumps' total lifetimes may be covered in each stage. We can, however, investigate whether these particular boundaries of IRB that define the evolutionary stages, are responsible for the apparent sequence. Therefore, we conducted a further test, by defining the subsamples with equallyspaced IRB limits at IRB = 0.25, 0.50, 0.75. The four samples i)-iv) now contain 656, 202, 148, and 248 clumps, and, again, the same general trend is present (panel e) of Fig. 9), though at a reduced significance, with 2.35-, 1.57-and 2.73-changes in the value of log 10 ( ) between stages i)-ii), i)-iii), and iii)-iv), respectively. that equal time is spent -on average -in each stage. It is conceivable that other IRB limits could be adopted to either amplify or diminish the significance of the proposed evolutionary tend, but with a lack of physical or statistical justification, we explore these no further here.
Our calculation of temperatures and masses in Sect. 3.5, could also be having an effect upon the evolutionary trend. We adopted a value of = 1.8, based upon the Galactic plane average value, as measured by Planck (Planck Collaboration 2011), and thereby calculated a value of 260 . However, the value of has been shown to vary as a function of frequency (e.g. Planck Collaboration: et al. 2014), and so we test the impact of a two-valued upon the reported evolutionary trend. Following Planck Collaboration: et al. (2014), we adopt a farinfrared value of FIR = 1.88 ± 0.08 for our calculation of colour temperatures (as per Eq. 2), and a millimetre value of mm = 1.60 ± 0.06, for the determination of 260 used in the mass calculation (Eq. 3). In panel f) of Fig. 9, it can be seen that the relative evolutionary trends are unaffected by these changes, with the exception that the value of log 10 ( / ) is systematically shifted downwards by ∼0.1, accompanied by a small reduction in temperature of 0.2-0.4 K.

SUMMARY AND CONCLUSIONS
In this paper, we present the Galactic Star Formation with NIKA2 (GASTON) project, a guaranteed-time large programme using the IRAM 30 m telescope's new millimetre continuum camera, NIKA2. The GASTON large-programme consists of two projects aimed at probing the origin of different regions of the stellar initial mass function: i) a survey of high-to intermediate-mass star-forming clumps within a 2 deg 2 region of the inner Galactic plane (GP) centred on ℓ = 23. • 9, = 0. • 05; ii) a search for pre-brown dwarf cores; and a third project: iii) consisting of a study of dust property variations in nearby well-resolved cores.
We have described the observing strategy for the GASTON GP project, and presented the first results obtained primarily through analysis of the 1.15 mm photometric maps after ∼ 40 per cent of the total integration time. A dendrogram extraction technique has allowed us to isolate resolved and unresolved compact sources as well as more extended structures, whilst maintaining a description of the hierarchy of emission complexes. By extracting 13 CO (1-0) spectra for each source using FUGIN (Umemoto et al. 2017) and RAMPS (Hogge et al. 2018) data in combination with the latest models of the Galactic rotation curve (Reid et al. 2019), we have assigned kinematic distances to each structure and, in concert with Hi-GAL (Molinari et al. 2016a) data, have been able to calculate their masses.
Of the 2446 dendrogram structures that we have identified, a total of 1467 structures are either compact or unsubstructured (within the constraints of the observations) and 22 per cent of these appear to be a new population of clumps with no counterpart in the Herschel Hi-GAL catalogues (Molinari et al. 2016a;Elia et al. 2017). These new clumps are, on average, less massive, and colder than their 250 µm-detected counterparts, and represent a population of sources for which the Herschel resolution element at wavelengths in excess of 250 µm is not sensitive to. The majority of these sources are candidates for starless clumps, with 80 per cent having no compact 70 µmcounterparts. By the end of the GASTON GP observing campaign, we expect this number to increase significantly.
We have proposed a categorisation that describes the relative evolutionary stage of the clumps in terms of their infrared-bright fraction, IRB . The mean temperature of the clumps, along with the fraction of 70 µm-bright sources, and the ratio bolometric luminosity to mass in each stage increases monotonically with the proposed sequence, supporting the use of IRB as an evolutionary indicator. In mass-temperature space, the mean position of the distribution of clumps follows a path that agrees well with clump-fed scenarios for high-mass star formation, in which high-mass star-forming cores accrete mass from their parsec-scale environment, before finally losing mass at later stages as the dense gas initially associated with the star-forming region is dispersed and accreted.
Upon completion of the GASTON GP field, we will publish the full catalogue of sources, and revisit the measurements we have made in this study. sources at the detection threshold of our dendrogram extraction. None of the sources in our dendrogram extraction have a contrast value of less than unity although, in principle, point sources located in regions with a bright background could result in values of less than one. Figure B1 illustrates the relationship between peak flux density, and the two types of integrated flux density. The relationship between peak and bĳected flux densities shows that the majority of sources lie above the line of equality, which isolated point sources at very high S/N would be expected to follow. Sources lying above this line are extended sources, as can be seen by the colour scale indicating the sources' equivalent radii, eq (the radius of a circle with the equivalent area). In the case of the clipped flux densities, sources lying in regions with significant backgrounds may lie below the line of equality, and indeed the most significant departures occur for sources located in the region around ℓ = 23. • 4, = −0. • 2. The minimum expected clipped integrated flux density is 4.1 mJy, for point sources at the detection threshold, which could occur at any value of peak intensity if the local background is sufficiently bright. Some sources lie just below this line, which may occur where sources are either not point-like, or have local noise levels slightly higher than the adopted average rms in the extraction parameters. The relationship between clipped and bĳected integrated intensities is also shown, in which it can be seen that sources with the highest contrast, i.e., sources in regions with relatively low backgrounds, lie closer to the 1:1 relation. A linear least-squares fit in log-space finds an average relationship between the clipped and bĳected flux densities of bijected = 9.6 0.8 clipped . In Fig. B2 we display histograms of the peak and integrated flux densities, and equivalent radii of all sources identified by the dendrogram. The distributions of further subsamples are shown, and we show the population of clumps (sources containing no further observed substructures), and complexes (dendrogram 'trunks' containing more than one clump). Note that in the case of the peak flux densities, the peak flux within a complex is the same value as the peak flux of its brightest constituent clump, and so the complex distribution is a sub-set of the clump distribution. The brightest clumps have peak flux densities of ∼ 10 Jy beam −1 , and the largest complexes have integrated flux densities of ∼ 100 Jy. Source sizes range from the minimum detectable size of eq = 5.6 arcsec to 6.5 arcmin.

APPENDIX C: VELOCITY ASSIGNMENTS
In Sect. 3.3, we derived an initial velocity assignment for each source based on the channel within a source area-averaged spectrum with the maximum intensity. A level of robustness for each velocity assignment was estimated by determining the standard deviation of the velocities associated with the three brightest channels within each spectrum. In cases where this standard deviation is less than 1 km s −1 (i.e. the three brightest channels are almost perfectly adja-  Figure C1. Example of the velocity determinations of four sources within a sub-set of the data. In the left panel, the 1.15 mm flux density map is shown (after smoothing to 13 arcsec resolution), with contours overlaid corresponding to the footprints of catalogued sources with ID numbers 1289, 1620, 1705, and 1938. In the central panel, the corresponding 13 CO (1-0) spectra are presented, which were extracted over the footprint of the sources, with the velocity associated with the maximum intensity is assigned to that source indicated by a solid line above the spectra, and any velocity information determined from either the first-fitted velocity component or first moment maps of the RAMPS data are indicated by dot-dashed and dotted lines below the spectra. The right-hand image shows the RAMPS NH 3 (1,1) first moment (intensity weighted coordinate) map from Hogge et al. (2018), with the same source outlines overlaid. In the Table, details of the 13 CO-derived velocity ( CO ), initial velocity flag, the S/N of the 13 CO (1-0) spectrum, and the source velocities derived from the RAMPS NH 3 (1,1) first moment ( m1 ) and first velocity component ( c1 ), are given for each source. In each case, an asterisk denotes the adopted velocity, cen . cent), the assignment is considered to be robust and assigned a flag value 3; where the standard deviation is more than 5 km s −1 , the assignment is considered to be poor and assigned a flag value of 1; cases between those limits are assigned an intermediate robustness flag value of 2. In practice, these flags allow the most robust velocities -where there is a single dominant peak -to be isolated from cases with either a low S/N or from spectra in which there are multiple velocity components of similar strength. In this initial velocity assignment, we record 1680 (69 per cent) of sources with a robustness flag of 3, 305 (12 per cent) sources with an intermediate robustness flag, and 461 (19 per cent) with a poor robustness flag. Following this initial velocity assignment, we make further velocity estimates by using the more appropriate -but limited in coverage -data of the NH 3 (1,1) inversion transition from the RAMPS survey. The RAMPS pilot study (Hogge et al. 2018) presented, as data products, maps of source velocities derived from a line-fitting procedure, as well maps of the first moment, with the latter having a greater extent, but the former being more accurate. For each GASTON source, and for each RAMPS velocity map, we record the mean velocity of pixels that fall within the source area. After excluding measurements where either the mean error (in the case of the fitted first components) or the standard deviation on the mean velocities (in the case of the first moments) are greater than 5 km s −1 , we record a total of 523 first-component velocities and 1097 first moment velocities. Where available, we replace the 13 CO-derived cen values with their NH 3 -derived counterparts, preferring the line-fitted measurements over the first moment-derived velocities, and adopt a robustness flag of 3. At this stage, the cen assignments are made up of 1407 (58 per cent) 13 CO-derived velocities, 516 (21 per cent) with NH 3 first moment-derived velocities, and the remaining 523 (21 per cent) with velocities derived from the NH 3 fitted first velocity components. There are 1913 (78 per cent), 174 (7 per cent) and 359 (15 per cent) of cen assignments with quality flags of 3, 2, and 1, respectively. 888 (85 per cent) of the NH 3 -derived velocities agree with their initial 13 CO-derived assignment to better than 5 km s −1 , indicating that the initial assignments are generally reliable.
In Fig. C1, we illustrate the method used in Sect. 3.3 for the initial velocity assignments for each of the sources identified in Sect. 3.2. Where possible, we also display the velocity derived from the available RAMPS data. At this point, we refer the reader back to the main text of Sect. 3.3 for a description of how the emission structures are used to refine the individual velocity measurements.

APPENDIX D: LINE CONTAMINATION
The 1.15 mm NIKA2 band spans a range in frequency from 210-300 GHz (taking these edges as 20 per cent transmission), and as such, the continuum may contain important contributions from the CO (2-1) line at 230.538 GHz, especially for observations of the inner