The SEDIGISM survey: first data release and overview of the Galactic structure

The SEDIGISM (Structure, Excitation and Dynamics of the Inner Galactic Interstellar Medium) survey used the APEX telescope to map 84 deg^2 of the Galactic plane between l = -60 deg and l = +31 deg in several molecular transitions, including 13CO(2-1) and C18O(2-1), thus probing the moderately dense (~10^3 cm^-3) component of the interstellar medium. With an angular resolution of 30'' and a typical 1-sigma sensitivity of 0.8-1.0 K at 0.25 km/s velocity resolution, it gives access to a wide range of structures, from individual star-forming clumps to giant molecular clouds and complexes. The coverage includes a good fraction of the first and fourth Galactic quadrants, allowing us to constrain the large scale distribution of cold molecular gas in the inner Galaxy. In this paper we provide an updated overview of the full survey and the data reduction procedures used. We also assess the quality of these data and describe the data products that are being made publicly available as part of this first data release (DR1). We present integrated maps and position-velocity maps of the molecular gas and use these to investigate the correlation between the molecular gas and the large scale structural features of the Milky Way such as the spiral arms, Galactic bar and Galactic centre. We find that approximately 60 per cent of the molecular gas is associated with the spiral arms and these appear as strong intensity peaks in the derived Galactocentric distribution. We also find strong peaks in intensity at specific longitudes that correspond to the Galactic centre and well known star forming complexes, revealing that the 13CO emission is concentrated in a small number of complexes rather than evenly distributed along spiral arms.

(sub-)millimetre (ATLASGAL, Schuller et al. 2009; BGPS, Aguirre et al. 2011; JPS, Moore et al. 2015; Eden et al. 2017), and radio range (CORNISH, Hoare et al. 2012; Purcell et al. 2013; THOR, Beuther et al. 2016; Wang et al. 2020; GLOSTAR, Medina et al. 2019). These continuum surveys have been complemented by spectral line surveys, which are essential for estimating distances and determining physical properties. Some examples include the 12CO (1-0) survey from Dame et al. (2001), the CO Boston University-Five College Radio Astronomy Observatory Galactic Ring Survey (GRS, Jackson et al. 2006), the Census of High- and Medium-mass Protostars (CHaMP, Barnes et al. 2011), the CO High-Resolution Survey (COHRS, Dempsey et al.), the CHIMPS survey (Rigby et al. 2016, 2019), the FOREST unbiased Galactic plane imaging survey with Nobeyama (FUGIN, Umemoto et al. 2017), the Mopra Southern Galactic Plane CO Survey (Burton et al. 2013; Braiding et al. 2018), the Three-mm Ultimate Mopra Milky Way Survey (ThrUMMS, Barnes et al. 2015), the Milky Way Imaging Scroll Painting (MWISP, Su et al. 2019), and the Forgotten Quadrant Survey (Benedettini et al. 2020). In Table 1 and Fig. 1, we present a summary of these recently completed CO surveys and show their coverage on a schematic top-down view of the Milky Way.
This wealth of data has greatly enhanced our view of the Galactic structure and its major components. However, there is still no consensus on the exact structure of the Galaxy. For instance, it is not firmly established how many spiral arms are present or where exactly they are located (e.g. Taylor & Cordes 1993; Reid et al. 2014, 2016; Vallée 2017; Drimmel 2000; Siebert et al. 2011; García et al. 2014; Gaia Collaboration et al. 2018), nor what the exact size and orientation angle of the central bar are (e.g. Bissantz et al. 2003; Pettitt et al. 2014; Li et al. 2016b), which makes it difficult to pinpoint and study the large-scale distribution of molecular gas in the Galaxy.
Progress is also being made in the characterisation of the earliest phases of (high-mass) star formation (e.g. Urquhart et al. 2013a,b, 2014b; Csengeri et al. 2014, 2017; Traficante et al. 2017; Elia et al. 2017; Urquhart et al. 2018; Pitts et al. 2019), but the role of large scale structures and the interplay between the various phases of the interstellar medium (ISM) are still not well constrained. Key questions remain that are also relevant to the study of star formation in external galaxies, such as: what role do the spiral arms play in the formation of molecular clouds and star formation, and what controls the star formation efficiency (SFE)? Observations of nearby spiral galaxies have revealed a tight correlation between dense molecular gas and enhancements of star formation activity within spiral arms (e.g. Leroy et al. 2017). In the Milky Way, too, it is clear that spiral arms are rich in molecular gas. For example, recent results from the THOR survey reveal an increase by a factor of 6 in the atomic-to-molecular gas ratio from the arms to the inter-arm regions. However, it is not clear whether this is due to the collection of molecular clouds that fall into the gravitational potential of the arms (e.g. Foyle et al. 2010), or whether the molecular gas forms within the spiral arms themselves. Furthermore, it is unclear whether the enhanced star forming activity observed (Urquhart et al. 2014a) is directly attributable to the presence of spiral arms or is simply the result of source crowding within the arms.
The SFE is the result of a number of stages: the conversion of neutral gas to molecular clouds, then to dense, potentially star-forming clumps, and finally to proto-stars and young stellar objects. Each of these stages has its own conversion efficiency. Identifying the stage that is primarily affected by the environment could bring constraints on the dominant SF-regulating mechanism. Some progress has been made recently to investigate the effects of environment on the SFE in our Galaxy, based on limited samples of objects (Eden et al. 2012; Moore et al. 2012; Longmore et al. 2013; Ragan et al. 2018). In order to extend these studies to much larger samples, we have performed a large scale (∼84 deg^2) spectroscopic survey of the inner Galactic disc: the SEDIGISM (Structure, Excitation and Dynamics of the Inner Galactic Interstellar Medium) survey. The spectroscopic data provide essential information on the distribution of interstellar matter along the line of sight, thus complementing the existing continuum surveys. These data allow us to achieve an unbiased view of the moderately dense ISM over a large fraction of the Galactic disc. The SEDIGISM survey covers a large portion of the fourth quadrant at high velocity and angular resolution and will make a significant contribution to our understanding of Galactic structure. The survey has been described in Schuller et al. (2017, hereafter Paper I). It consists of spectroscopic data covering the inner Galactic plane in the frequency range 217−221 GHz, which includes the 13CO (2-1) and C18O (2-1) molecular lines, at 30 arcsec angular resolution. Thus, this survey complements the other spectroscopic surveys that have been previously mentioned. This is the first of three papers that describe the survey data and present the initial results.
In the present paper, we provide an overview of the survey data and a first look at the connection between the molecular gas and large scale structural features of the Galaxy (we will refer to this as Paper II). In the accompanying papers, we present a catalogue of giant molecular clouds (GMCs) and investigate their properties with respect to their star formation activity and their Galactic distribution (Duarte-Cabral et al. submitted; hereafter Paper III); and we investigate the dense gas fraction and star formation efficiency as a function of Galactic position (Urquhart et al. submitted; hereafter Paper IV).
The structure of this paper is as follows: we describe the SEDIGISM observations and data quality in Sect. 2. We present the large scale distribution of 13CO and C18O in Sect. 3 and investigate the association between molecular gas and spiral arms. We discuss some interesting regions in Sect. 4, and demonstrate the usability of other molecular transitions within the spectral range covered by the data for scientific exploitation in Sect. 5. Finally, we summarise our conclusions in Sect. 6.

Observations
Observations were done with the 12 m diameter Atacama Pathfinder Experiment telescope (APEX, Güsten et al. 2006), located at 5100 m altitude on Llano de Chajnantor, in Chile. The observational setup and observing strategy have been described in Paper I and the key observational parameters are summarised in Table 2; here we provide a brief overview of the most important features of this survey.
The prime target lines are 13CO (2-1) and C18O (2-1), but the 4 GHz instantaneous bandwidth also includes a number of transitions from other species (H2CO, CH3OH, SO, SO2, HNCO, HC3N, SiO). The observations were carried out in tiles of 0.25° × 0.50° using position-switching in the on-the-fly mapping mode. Each position in the survey is covered by at least two maps observed in orthogonal scanning directions, along Galactic longitude and latitude. The 0.25° × 0.50° tiles oriented along ℓ or b were sometimes observed under different conditions. As a result of this plaiting, only 0.25° × 0.25° sub-cubes were observed with roughly constant conditions and show a uniform noise level, as visible in Fig. 2. Some fields were observed in 0.5° × 0.5° maps at the beginning of the survey, also with two orthogonal scanning directions.
The reference positions for each field were selected to be ±1.5° off the Galactic mid-plane to avoid contamination. While this was sufficient to ensure the C18O (2-1) data were clean, it was not always the case for the brighter 13CO (2-1) transition. Therefore, we have systematically performed pointed observations towards the reference points, using an off position further from the Galactic plane. More details can be found in Appendix A and in Table A1.
The full survey coverage is ∼84 deg^2 (cf. Fig. 2): the main part of the survey, as described in Paper I, covers 300° ≤ ℓ ≤ +18°, with |b| ≤ 0.5° (78 deg^2). Because additional observing time was available for this project in 2016 and 2017, we were able to slightly increase the survey coverage. We have extended the coverage in latitude to |b| ≤ 1° around the Galactic Centre (358° ≤ ℓ ≤ +1.5°) and mapped a 2 deg^2 region covering the extreme star forming region W43 (+29° ≤ ℓ ≤ +31°, with |b| ≤ 0.5°). We have also increased the latitude coverage up to +0.75° at ℓ = 348° to cover RCW 120, and down to −0.75° for 338° ≤ ℓ ≤ 339° to improve the coverage of the Nessie giant filament (Jackson et al. 2010). The data taken towards W43 allow for a comparison with existing surveys in the northern hemisphere, such as the HERO survey performed with the IRAM 30-m telescope in the same spectral lines (Carlhoff et al. 2013). Here we present all the data that have been taken with APEX for this survey between 2013 and 2017.

Data reduction
The data provided by the APEX telescope consist of spectra calibrated in antenna temperature scale (T_A^*), written in files readable by the CLASS software from the GILDAS package. We have developed a dedicated pipeline in GILDAS/CLASS, which consists of standard data reduction steps, such as conversion to the T_mb scale (i.e. T_mb = T_A^* / η_mb), spectral resampling, removing a spectral baseline, and gridding the spectra onto a data cube. In particular, a critical step is the automatic detection of emission features in order to define the windows to be masked when subtracting baselines; this is described in detail in Paper I.
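The conversion and baseline steps described above can be sketched in Python as follows. This is a minimal illustration and not the GILDAS/CLASS pipeline itself: the function names, the polynomial order, and the main-beam efficiency passed in by the caller are assumptions made for the example, and the emission windows are supplied by hand rather than detected automatically as in Paper I.

```python
import numpy as np

def to_main_beam(t_a_star, eta_mb):
    """Convert antenna temperature T_A* to main-beam temperature T_mb.

    eta_mb is the main-beam efficiency; its value must be supplied by the
    caller (no value is assumed here).
    """
    return np.asarray(t_a_star, dtype=float) / eta_mb

def subtract_baseline(velocity, spectrum, windows, order=1):
    """Fit a polynomial to the line-free channels and subtract it.

    windows : list of (v_min, v_max) velocity ranges that contain emission
              and are therefore masked during the fit (hypothetical input).
    order   : polynomial order of the baseline (assumed, not from the paper).
    """
    velocity = np.asarray(velocity, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    line_free = np.ones(velocity.shape, dtype=bool)
    for v_min, v_max in windows:
        line_free &= ~((velocity >= v_min) & (velocity <= v_max))
    coeffs = np.polyfit(velocity[line_free], spectrum[line_free], order)
    return spectrum - np.polyval(coeffs, velocity)
```

Masking the emission windows before the fit is what keeps a bright line from being absorbed into the baseline; if the windows are too narrow, the fitted polynomial is pulled upwards and the line flux is underestimated.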
As previously mentioned, in some cases the initial off-position was found to contain emission that would contaminate our maps, and so these positions were checked against a more distant position, one degree further away perpendicular to the Galactic plane, and reduced independently. Emission in the reference position appears as an absorption feature that is constant over the extent of a given map. When coinciding with the velocity range of real emission in the map, it leads to underestimating the gas column density, and could also impact the observed velocity pattern of the emission. Where the reference position has been found to show emission, we corrected for this by adding the spectrum measured for the reference position to each spectrum in the map. This operation is illustrated in Fig. A1 in Appendix A. Since the rms noise measured on these spectra was typically around 0.07 K (see Table A1), this operation does not significantly increase the noise in the map data. The pipeline used for the current first data release (DR1) is almost identical to the procedure described in Paper I. However, an additional, non-standard data processing step was necessary: we noticed that when an on-the-fly map and the corresponding reference position were observed on different days, the Doppler corrections applied to the reference position and to the map data could differ by up to three velocity channels (0.75 km s^-1). Therefore, to properly account for the absorption feature in the spectra where necessary, we computed the difference between the Doppler corrections for the map centre and for the reference position observed at a different time. After shifting the reference spectrum accordingly, we then added the modified reference spectrum to each position of the map.
This step was only necessary where the reference position has been found to show significant emission (see Table 3 for details).
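The reference-position correction can be illustrated with a short Python sketch. It assumes the Doppler difference is applied as a whole-channel shift and that a positive difference moves the reference spectrum towards higher channel indices; both choices, and the function name, are assumptions for illustration rather than details taken from the pipeline.

```python
import numpy as np

def add_reference(cube, ref_spectrum, dv_doppler, channel_width=0.25):
    """Add a Doppler-shifted reference spectrum to every spectrum of a map.

    cube          : array of shape (n_chan, ny, nx), map data in T_mb
    ref_spectrum  : array of shape (n_chan,), emission seen at the off position
    dv_doppler    : difference between the Doppler corrections of the map
                    and of the reference observation, in km/s
    channel_width : channel spacing in km/s (0.25 for the main lines)
    """
    shift = int(round(dv_doppler / channel_width))
    shifted = np.roll(np.asarray(ref_spectrum, dtype=float), shift)
    # np.roll wraps around; blank the channels that wrapped.
    if shift > 0:
        shifted[:shift] = 0.0
    elif shift < 0:
        shifted[shift:] = 0.0
    # Broadcast the 1-D spectrum over the two spatial axes.
    return cube + shifted[:, None, None]
```

Because the reference emission shows up as the same negative feature in every spectrum of the map, adding one shifted spectrum everywhere is enough to cancel it.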

Data quality
The output of the pipeline is a set of data-cubes calibrated to the T_mb scale, centred on the most relevant spectral lines possibly detected in the band-pass, and projected on 9.5'' pixels. The velocity resolution in the final cubes is 0.25 km s^-1 for the 13CO (2-1) and C18O (2-1) lines, and 0.5 km s^-1 for all the other transitions: H2CO (3_{0,3} − 2_{0,2}) and (3_{2,1} − 2_{2,0}), HC3N (24 − 23), SO (6_5 − 5_4), SiO (5 − 4), HNCO (10_{0,10} − 9_{0,9}), and CH3OH (4_{2,0} − 3_{1,0}). Due to variations in weather conditions, the elevation of the telescope during observations, and the performance of the instrument, the local noise level σ_rms varies slightly over the survey area, as illustrated in Fig. 2. For each pixel in each 13CO (2-1) data cube, σ_rms is computed as the standard deviation of the signal in the first 50 spectral channels (i.e. −200 km s^-1 < v_lsr < −187.5 km s^-1), as this part of the velocity range is almost always devoid of emission, except possibly for some small regions near the Galactic centre. In Fig. 3 we show the distribution of the median noise values for all of the individual 0.25° × 0.25° sub-cubes, which shows that the noise ranges from 0.5 to 1.6 K (T_mb), with most fields (61 per cent) showing a noise below 1.0 K; the median and mean values are 0.94 and 0.96 K respectively. While the sensitivity achieved is sufficient to map the distribution of the relatively bright 13CO (2-1) line emission, we are only able to detect C18O (2-1) towards the densest regions. Transitions from other species are likely to be detected only towards the brightest and most compact dense cores.
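The per-pixel noise estimate described above amounts to a standard deviation along the spectral axis over the first 50 channels. A minimal Python sketch, assuming the cube is held as a NumPy array with the spectral axis first (the array layout and function name are assumptions):

```python
import numpy as np

def rms_map(cube, n_channels=50):
    """Per-pixel noise: standard deviation over the first n_channels.

    cube : array of shape (n_chan, ny, nx) in T_mb, with channel 0 at the
           most negative velocity. NaNs (e.g. unobserved pixels) are ignored.
    """
    return np.nanstd(cube[:n_channels], axis=0)

# Velocity range spanned by the first 50 channels of a cube that starts at
# -200 km/s with 0.25 km/s channels: 50 * 0.25 = 12.5 km/s.
v_first, dv, n = -200.0, 0.25, 50
v_upper = v_first + n * dv   # upper edge of the line-free window, in km/s
```

The median of such a map over each 0.25° × 0.25° sub-cube would then give one noise value per field, as used for the distribution discussed above.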
Another issue worth highlighting is that subtracting baselines in regions with bright, broad emission lines is known to be difficult and error-prone, particularly when the baseline exhibits variations over a velocity range comparable to the line-width. The central region of the Galaxy is clearly the most extreme case in that respect, and the data for this region should be used with particular caution. In addition, there are a few other regions that are affected by baseline ripples and absorption features that we have not been able to fully remove by adding the spectrum of the corresponding reference position. We identify these regions in Table 3 and recommend extreme caution when using data for these fields. However, this issue only emerges when averaging spectra over large regions; the individual spectra do not show such artefacts after correction with the reference position. Finally, the C18O (2-1) data show a spike at v_lsr ∼ −48 km s^-1 that appears in a single channel with a varying intensity over all fields. We have removed this spike from the individual spectra using a sigma-clipping method; however, an artefact may still show up when averaging the spectra over large areas.
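The text does not specify how the sigma-clipping was implemented, so the following Python sketch shows only one plausible variant, under stated assumptions: the affected channel index is known in advance, the noise is estimated robustly from the other channels, and a clipped channel is replaced by the mean of its two neighbours. The function name, the threshold and the replacement rule are all illustrative choices.

```python
import numpy as np

def clip_spike(spectrum, channel, n_sigma=5.0):
    """Replace a single-channel spike if it exceeds n_sigma times the noise.

    spectrum : 1-D array of intensities
    channel  : index of the suspect channel (must not be the first or last)
    n_sigma  : clipping threshold in units of the robust noise estimate
    """
    spectrum = np.array(spectrum, dtype=float)   # work on a copy
    others = np.delete(spectrum, channel)
    median = np.median(others)
    # Robust standard deviation from the median absolute deviation.
    sigma = 1.4826 * np.median(np.abs(others - median))
    if abs(spectrum[channel] - median) > n_sigma * sigma:
        spectrum[channel] = 0.5 * (spectrum[channel - 1] + spectrum[channel + 1])
    return spectrum
```

A spike that stays below the threshold in every individual spectrum is left untouched by this kind of test, which is consistent with the caveat above that a residual can reappear once many spectra are averaged.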
In order to check the consistency of our calibration and the data quality, we have compared the SEDIGISM data for the W43 complex with the W43-HERO survey (Carlhoff et al. 2013), which covered a good fraction of the 2 deg^2 field centred at ℓ = 30° in the same transitions as SEDIGISM, 13CO (2-1) and C18O (2-1), with the IRAM 30 m telescope. The results from this comparison show no systematic differences between the two data sets for the 13CO (2-1) line and indicate that the calibration is consistent between the two surveys, although the distribution of the 13CO emission reveals some minor differences towards the brightest areas. Since the C18O SEDIGISM data only trace the most compact, dense clumps and are not sensitive to the more diffuse, extended material, no statistical comparison is possible (see Appendix B for more details).

Public data release
The reduced data, processed with the current version of the dedicated pipeline, are now available to the community. This first public data release (DR1) consists of 78 cubes in 13CO (2-1) and 78 cubes in C18O (2-1), where each cube covers 2° in longitude over ±0.5° in latitude (or more in the few regions with extended coverage, see Sect. 2.1). Cubes are separated by 1° in longitude, providing 1° overlap between adjacent cubes. The pixel size is 9.5'', the velocity resolution is 0.25 km s^-1, and the velocity range covers −200 to +200 km s^-1. The DR1 data can be downloaded from a server hosted at the MPIfR in Bonn (Germany): https://sedigism.mpifr-bonn.mpg.de. The 4 GHz of instantaneous bandwidth of the spectral tuning used to observe SEDIGISM includes several transitions from other molecules (see Table 1 in Paper I). In particular, six lines are detected when spatially averaging the data corresponding to the brightest emission regions in 13CO (2-1); this will be discussed in more detail in Sect. 5. Therefore, we also provide data cubes covering these transitions as part of the DR1, where the spectra have been smoothed to 0.5 km s^-1 velocity resolution in order to increase the signal-to-noise ratio. The velocity range also covers −200 to +200 km s^-1 for all lines, except for the SiO (5-4) transition, which is located near the edge of the spectral set-up so that only the −200 to +150 km s^-1 velocity range is available.
In Paper III, we have extracted ∼11,000 molecular clouds and complexes from the 13CO (2-1) data from this data release, using the Spectral Clustering for Interstellar Molecular Emission Segmentation (SCIMES, Colombo et al. 2015) algorithm. The catalogue with the distance estimates and the derived physical properties for all clouds, as derived in Paper III, as well as the masks representing these clouds in the 13CO (2-1) cubes are also made publicly available alongside the main data release we present here (https://sedigism.mpifr-bonn.mpg.de).
Some work is still ongoing aimed at improving the data quality and at solving known issues, such as artefacts for some transitions or a proper estimate of the baselines in regions with complex, extended emission. We plan to provide data cubes with improved quality as part of future data releases.

GLOBAL DISTRIBUTION OF MOLECULAR GAS
In this section, we use the 13CO (2-1) and C18O (2-1) data to constrain the Galactic structure on the largest scale. Some peculiar regions will be discussed in more detail in Sect. 4, while compact objects and individual molecular clouds are the topic of subsequent papers.

Integrated emission
We show maps of the 13CO emission integrated over the ±200 km s^-1 velocity range for the full survey in Fig. 4. These maps reveal that the emission from molecular gas is extended over much of the inner part of the survey region (330° < ℓ < 15°) but becomes much more patchy at larger angular distances from the Galactic centre. The brightest emission is associated with the Central Molecular Zone (CMZ; Morris & Serabyn 1996), which extends over |ℓ| ≤ 1.5°, or a radius of ∼200 pc. Outside of the Galactic centre the brightest emission is concentrated in distinct regions, all associated with prominent star forming regions.
Our latitude coverage is rather narrow: only ±0.5° in most directions. By comparing with other surveys with larger latitude coverage, like ThrUMMS or ATLASGAL, it is clear that we miss a number of molecular clouds and complexes, especially the nearby ones (see also Alves et al. 2020, regarding nearby clouds in the outer Galaxy). A few known complexes up to ∼3 kpc are also not included, e.g. NGC 6334 and NGC 6357. Another exception is gas lying in the longitude range 300° < ℓ < 318° at a Galactocentric distance of 8 kpc and beyond, where the Galactic plane descends below a latitude of −0.5° due to the Galactic warp (e.g. Chen et al. 2019; Romero-Gómez et al. 2019), so that this area of the Galaxy is not well covered by our survey. However, the area covered by SEDIGISM still encompasses the vast majority of the molecular gas in the inner Galaxy; Rigby et al. (2016) also concluded from a comparison of the CHIMPS data (with the same latitude coverage as SEDIGISM) with the GRS survey (|b| < 1°) that the molecular line emission drops off quickly with distance from the mid-plane.

A four-spiral-arm model
To facilitate the discussion that follows concerning the distribution of molecular material in the Milky Way, Fig. 5 presents a schematic plot of the Galaxy as seen from the northern Galactic pole that includes the spiral arms and the Galactic bar. We have chosen to use the spiral arm loci derived by Taylor & Cordes (1993) and updated by Cordes (2004), as these have been determined independently of the distribution of molecular gas, unlike the loci used in the recent work by Reid et al. (2019), which have been fitted by hand to the 12 [...] due to a lack of reliable maser parallax distances. A single Galactic bar is shown for illustrative purposes, with orientation and length scales in line with contemporary measurements (Bland-Hawthorn & Gerhard 2016). The near and far 3 kpc arms are not included in the Cordes (2004) model and have been added in as small, 2-fold symmetrical arm segments (Dame & Thaddeus 2008) with pitch angles of 1.5°, expanding radially at a velocity of 55 km s^-1, with the near 3 kpc arm aligning with the arm segment from Bronfman et al. (2000) (the exact nature of these features is still somewhat unknown, see Green et al. 2011). There is some evidence that the far 3 kpc arm is expanding slightly faster, but this is inconsequential for our analysis as there is effectively no emission seen in this region in the SEDIGISM data, due to the limited sensitivity. Fig. 5 shows that the SEDIGISM survey (indicated by the grey shading) covers large parts of three of the main spiral arms (Norma, Sagittarius, and Scutum-Centaurus arms), and almost all of the 3 kpc arms, thus allowing us to refine our understanding of the structure of the Galaxy.
Figure 5. Schematic showing the loci of the spiral arms derived by Taylor & Cordes (1993) and updated by Cordes (2004), with an additional bisymmetric pair of arm segments added to represent the 3 kpc arms. The grey shaded areas indicate the regions covered by the SEDIGISM survey. The star shows the position of the Sun and the numbers identify the Galactic quadrants. The bar feature is merely illustrative and does not play a role in our analysis. The smaller slice in the first quadrant corresponds to the W43 region.

SEDIGISM ℓv maps
In the upper and middle panels of Fig. 6 we present longitude-velocity (ℓv) maps of the 13CO (2-1) and C18O (2-1) transitions, produced by integrating the emission over |b| < 0.5°. Only voxels above a 3σ_rms threshold were considered to produce these ℓv maps, where the local noise σ_rms is estimated as discussed above (Sect. 2.3). While much of the complex emission seen in the ℓb map (Fig. 4) is the result of many giant molecular clouds being blended along our line of sight across the inner Galactic disc, we find that these clouds are well separated in velocity, making it easier to break down the emission into distinct molecular structures. It is clear from these maps that while the 13CO (2-1) is detected over a wide range of velocities, the C18O (2-1) emission is much less extended and is likely to only be tracing dense clumps, which cannot be easily detected beyond a few kpc due to beam dilution. But even if the C18O (2-1) is less useful for studies of large scale structures, it is indispensable for detailed studies of the physical properties of dense structures such as filaments (Mattern et al. 2018) and clumps (Paper IV). On the 13CO (2-1) ℓv map shown in Fig. 6, we also overlay the spiral arm loci derived by Taylor & Cordes (1993) and Cordes (2004). The x and y positions given by Taylor and Cordes have been converted to ℓ and v_lsr using a three-component rotation curve (bulge + disc + dark halo) tailored to the data of Eilers et al. (2019). The shape of the rotation curve towards the Galactic centre is somewhat uncertain, and so we simply adopt a steeply rising bulge component matching that adopted in Eilers et al. (2019, Fig. 3). The Reid et al. (2019) values for the solar position and the circular velocity at the orbit of the Sun, 8.15 kpc and 236 km s^-1, are used for projection into line-of-sight velocity, simply assuming pure circular rotation.
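The projection of arm loci into longitude-velocity space under pure circular rotation can be sketched as follows. This is an illustration only: it substitutes a flat rotation curve for the three-component curve used in the paper, and the coordinate convention (Galactic centre at the origin, Sun at (0, R0), longitude increasing towards +x) and all function names are assumptions made for the example. Only R0 = 8.15 kpc and Θ0 = 236 km s^-1 come from the text.

```python
import numpy as np

R0 = 8.15       # kpc, Galactocentric distance of the Sun (Reid et al. 2019)
THETA0 = 236.0  # km/s, circular velocity at the solar orbit (Reid et al. 2019)

def flat_rotation(r):
    """Stand-in rotation curve: constant circular velocity (assumption)."""
    return np.full_like(np.asarray(r, dtype=float), THETA0)

def xy_to_lv(x, y, rotation_curve=flat_rotation):
    """Project in-plane positions (kpc) to (longitude in deg, v_lsr in km/s).

    Uses v_lsr = (Theta(R) * R0 / R - Theta0) * sin(l) for b = 0.
    Positions at the Galactic centre itself (R = 0) are not handled.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.hypot(x, y)                 # Galactocentric radius
    lon = np.arctan2(x, R0 - y)        # longitude measured at the Sun
    v_lsr = (rotation_curve(r) * R0 / r - THETA0) * np.sin(lon)
    return np.degrees(lon) % 360.0, v_lsr
```

With this sign convention, gas inside the solar circle has positive velocities in the first quadrant and negative velocities in the fourth, which matches the behaviour described for the ℓv maps; any other rotation curve can be passed in through `rotation_curve`.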
Comparing the CO emission to the loci of the spiral arms, we find very good agreement outside the Galactic centre region (|ℓ| ≤ 5°). The only significant region of emission that is not closely associated with a spiral arm is the CMZ, but this is known to have extreme non-circular velocities (this region is discussed in more detail in Sect. 4). We also note that the majority of the CO emission is located within the solar circle (i.e. v_lsr < 0 for ℓ > 300° in the fourth quadrant and v_lsr > 0 for ℓ > 0° in the first quadrant), and so there is very little emission seen towards the far parts of the Perseus, Sagittarius, or Scutum-Centaurus arms. This is likely the result of the sensitivity limit of the survey and beam dilution, which means that we are probably only able to detect the most massive clouds outside the solar circle on the far side of the Galaxy.
Our simple projection of arms into ℓv space assumes purely circular motions for the primary arms, and thus will not perfectly align with structures like the Norma arm in the inner Galaxy (which appears to move towards us with a v_lsr of roughly −30 km s^-1, Sanna et al. 2014). The response of gas to spiral arms alone creates non-circular motions, forming some peculiar features towards the inner Galaxy seen in ℓv space (e.g. Gómez & Cox 2004; Pettitt et al. 2015). Our modern, Gaia-era understanding of the Galactic bar also suggests slower pattern speeds than assumed in earlier works, which places corotation as far out as 6 kpc (Sanders et al. 2019; Bovy et al. 2019). The ISM responds strongly to the motion of the bar out to corotation, and even as far as the more distant Outer Lindblad Resonance for certain models of bars (Sormani et al. 2015; Pettitt et al. 2020). Any spiral arm-like features are thus inherently coupled to the bar within at least corotation, and more sophisticated modelling is required to fully understand the kinematics of the gas.
In the lower panel of Fig. 6 we show the total 13CO (2-1) and C18O (2-1) emission as a function of Galactic longitude; the 13CO (2-1) emission profile reveals a number of significant peaks, the most prominent of which is associated with the Galactic centre region. Many of the others are associated with well known star-forming complexes such as W33, G333 and G305; these are located at ℓ = 13°, 333° and 305°, respectively. The integrated C18O (2-1) emission also reveals peaks that are correlated with the same star-forming regions, indicating that these regions have either higher optical depth and column densities than elsewhere in the Galactic plane, or that they contain enough gas at high temperature to produce strong emission in the J=2-1 lines.
Analysis of the dense gas traced by the ATLASGAL survey by Urquhart et al. (2018) has shown that approximately 50 per cent of the current star formation in the disc of the inner Galaxy is taking place in a relatively small number of very active regions (∼30). Indeed, nearly all of the peaks seen in the lower panel of Fig. 6 are associated with one of these regions (see also Urquhart et al. 2014a). If we define all emission above a value of 10 per cent of the highest peak value (at ℓ ∼ 0°) as being associated with a complex, we find that these complexes are responsible for ∼70 per cent of all the emission. The 13CO emission traces the lower density diffuse gas in which the dense clumps are embedded and thus allows us to probe the structure, kinematics, and physical properties of these regions. This highlights the survey's ability to support detailed studies of the molecular gas associated with some of the most intense star formation regions in the Galaxy, and to put them in a global setting with respect to the large scale structural features of the Galaxy.
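The 10 per cent criterion above is a simple threshold on the longitude profile. A minimal Python sketch, assuming the profile is available as a 1-D array of integrated intensity per longitude bin (the function name and the strict inequality are illustrative choices):

```python
import numpy as np

def complex_fraction(profile, rel_threshold=0.10):
    """Fraction of the total emission found in longitude bins whose
    integrated intensity exceeds rel_threshold times the peak value.

    profile : 1-D array, integrated intensity as a function of longitude
    """
    profile = np.asarray(profile, dtype=float)
    in_complex = profile > rel_threshold * profile.max()
    return profile[in_complex].sum() / profile.sum()
```

For example, `complex_fraction([10, 0.5, 0.5, 2, 0.5, 0.5])` counts the bins with values 10 and 2, giving 12/14 of the total. The result depends on the longitude binning of the profile, so it is best read as an approximate figure.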
Figure 6. Galactic longitude-velocity distribution of the SEDIGISM survey between 300° < ℓ < 18°. The greyscale image shows the distribution of molecular gas as traced by the integrated 13CO (2-1) and C18O (2-1) emission (upper and middle panels). To emphasise the weaker extended emission we have used a log scale and have masked the emission below 3σ_rms. The intensity in T_mb scale has been integrated over the ±0.5° range in Galactic latitude. The locations of the spiral arms are shown as curved dotted-dashed lines, coloured to identify the individual arms; colours are as shown in Fig. 5. For the C18O (2-1) line, the values of a horizontal row of three pixels centred on −48.5 km s^-1 have been set to zero due to the presence of a spike that appears at this velocity when large areas are integrated together (for more details see Sect. 2). Lower panel: Integrated 13CO (2-1) and C18O (2-1) intensity as a function of Galactic longitude (black and red respectively). The intensities have been integrated over ±200 km s^-1 in v_lsr and the ±0.5° range in latitude for each longitude. The flux scale has been normalised to the peak intensity of the 13CO (2-1) emission. The C18O (2-1) spectrum has been multiplied by 5 and an offset of −0.2 has been applied to make the profile clearer.
In Fig. 7 we show channel maps towards W33, a large complex representative of high-mass star forming regions in the Galactic disc. According to maser parallax measurements, this complex is located in the Scutum-Centaurus arm at a distance of 2.4 kpc (Immer et al. 2013). In the upper-left panel of this figure we show the integrated emission over the entire velocity range where emission is detected (0-60 km s^-1). Each of the subsequent panels shows a channel map where the emission has been integrated over 6 km s^-1 in velocity.
These maps reveal that the 13CO (2-1) emission seen towards this region consists of dense clumps, diffuse larger clouds and numerous filamentary structures spread out over 60 km s^-1. It is worth noting that there is a wealth of intricate features that emerge in individual channel maps, but that do not appear or are washed out in the integrated intensity map. This also implies that, even if most of the molecular gas is associated with a few major complexes as discussed above, there are plenty of other smaller features detected in the SEDIGISM data that are not associated with known complexes.
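Channel maps of this kind can be produced by summing groups of adjacent channels. The sketch below assumes a cube stored as a NumPy array with the spectral axis first, a first channel at −200 km s^-1 and 0.25 km s^-1 channels as in DR1; the function name and the treatment of channel edges are assumptions for illustration.

```python
import numpy as np

def channel_maps(cube, v_first, dv, v_start, v_stop, width):
    """Integrate a cube over consecutive velocity slices.

    cube    : array of shape (n_chan, ny, nx) in K (T_mb)
    v_first : velocity of channel 0 in km/s
    dv      : channel width in km/s
    v_start, v_stop : velocity range to split, in km/s
    width   : width of each channel map in km/s
    Returns a list of ((v_lo, v_hi), map) with maps in K km/s.
    """
    n_per_map = int(round(width / dv))
    n_maps = int(round((v_stop - v_start) / width))
    i0 = int(round((v_start - v_first) / dv))
    maps = []
    for k in range(n_maps):
        lo = i0 + k * n_per_map
        hi = lo + n_per_map
        v_range = (v_start + k * width, v_start + (k + 1) * width)
        maps.append((v_range, cube[lo:hi].sum(axis=0) * dv))
    return maps
```

With 0.25 km s^-1 channels, each 6 km s^-1 map sums 24 channels, and the 0-60 km s^-1 range yields ten maps. Structures narrower than one slice stand out here, whereas they are diluted when all 240 channels are summed at once.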

Correlation between molecular gas and spiral arms
The ℓv-map presented in Fig. 6 clearly shows that the molecular gas is broadly correlated with the spiral arms. Properly mapping the distribution of the molecular gas across the disc requires determining distances, which is beyond the scope of the current paper but is discussed in the accompanying paper by Duarte-Cabral et al. (Paper III). However, it is possible to examine the intensity distribution as a function of the Galactocentric distance. This is accomplished by calculating the kinematic distance for each pixel in the ℓv-map above a 3σ_rms threshold using a three-component rotation curve. Although this produces two distances for sources located within the solar circle (i.e. with a Galactocentric radius R_gc < 8.15 kpc), equally spaced on either side of the tangent distance (referred to as the near and far distances), it provides a unique distance from the Galactic Centre, which allows us to investigate the distribution of the integrated intensity as a function of Galactocentric distance. Given that the spiral structure is different in the 1st and 4th quadrants, and that the SEDIGISM survey has only covered a small portion of the 1st quadrant, we have restricted this analysis to the 4th quadrant. We have also excluded the Galactic Centre region (|ℓ| < 10°) as kinematic distances are unreliable in this part of the Galaxy. Even outside this region, this approach cannot provide very accurate distances because of non-circular motions that deviate from the rotation curve, but it allows us to roughly estimate the fraction of molecular gas that is associated with the spiral arms.
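The conversion from (ℓ, v_lsr) to Galactocentric radius described above can be illustrated with a minimal Python sketch. It is not the procedure used for the survey: it assumes purely circular motion at b = 0 and swaps in a flat rotation curve with an assumed rotation speed of 236 km s^-1 in place of the three-component curve, keeping R_0 = 8.15 kpc as quoted in the text. The function names are hypothetical.

```python
import numpy as np

R0 = 8.15       # kpc, solar Galactocentric radius quoted in the text
THETA0 = 236.0  # km/s, ASSUMED flat rotation speed (not the three-component curve)

def galactocentric_radius(l_deg, v_lsr):
    """Galactocentric radius (kpc) of gas at longitude l_deg (deg) and v_lsr (km/s).

    For circular rotation at b = 0: v_lsr = R0 sin(l) [Theta(R)/R - Theta0/R0].
    With a flat curve, Theta(R) = Theta0, this inverts to the expression below.
    """
    sin_l = np.sin(np.radians(l_deg))
    return R0 * THETA0 * sin_l / (v_lsr + THETA0 * sin_l)

def near_far_distances(l_deg, r_gc):
    """Near and far heliocentric distances (kpc) for a source inside the solar circle."""
    l = np.radians(l_deg)
    # Half-length of the chord cut by the line of sight through the circle of radius r_gc;
    # this is NaN when r_gc is smaller than the tangent-point radius R0 |sin l|.
    half_chord = np.sqrt(r_gc**2 - (R0 * np.sin(l))**2)
    tangent = R0 * np.cos(l)
    return tangent - half_chord, tangent + half_chord

r = galactocentric_radius(330.0, -50.0)
d_near, d_far = near_far_distances(330.0, r)
print(f"R_gc = {r:.2f} kpc, d_near = {d_near:.2f} kpc, d_far = {d_far:.2f} kpc")
```

The two heliocentric solutions are symmetric about the tangent distance R_0 cos ℓ, whereas R_gc is single-valued, which is why the intensity distribution can be built without resolving the near/far ambiguity.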
In Fig. 8, we show the integrated 13CO (2-1) intensity as a function of Galactocentric distance, normalised to the peak of the distribution. This plot shows that the emission from the molecular gas is highly structured with strong peaks seen at approximately 4.25, 5, 5.5 and 6.5 kpc; the first of these roughly corresponds to the tangent with the Norma arm, the second and third correspond to the far side of the long-bar where it intersects with the Perseus arm (Bland-Hawthorn & Gerhard 2016), and the fourth with the tangent of the Scutum-Centaurus arm. The vast majority of the emission is contained between 4 and 7.5 kpc. The lack of emission below 2 kpc is due to the restricted longitude range selected for this analysis, while the lack of emission at distances greater than 8 kpc likely reflects the poor sensitivity to molecular material on the far side of the solar circle and beyond, where beam dilution certainly plays an important role. This thick emission zone is analogous to the thick ring of material seen in the 1st quadrant (often referred to as the 5 kpc molecular ring), but as pointed out by Jackson et al. (2006), is likely to arise from a complicated combination of column density and velocity fields and may not actually represent a real ring-like structure (see also Dobbs & Burkert 2012). The highly structured nature of our 13CO (2-1) emission lends further support to a 4-arm model of the Galaxy (Urquhart et al. 2014a), which can nevertheless co-exist with a ring-like structure.
In order to quantify what fraction of the total emission is associated with the spiral arms we have calculated the minimum offset from the arms for each pixel on the ℓv-map above 3σ_rms. In Fig. 9 we show the cumulative distribution of the integrated 13CO (2-1) emission as a function of velocity offset from the spiral arm loci shown in the upper panel of Fig. 6. When matching the pixels with the spiral arms we allowed for a variation of ±0.5° in Galactic longitude as the spiral arm tangents are not well constrained. We consider pixels within Δv < 10 km s^-1, which is of the order of the amplitude of streaming motions around the spiral arms (∼7-10 km s^-1; Burton 1971; Stark & Brand 1989; Reid et al. 2009), to be associated with a spiral arm. This plot reveals that approximately 60 per cent of the molecular emission is closely associated with a spiral arm. This proportion is a little lower than the value of 80 per cent derived by Urquhart et al. (2018) from a similar analysis of GRS clouds identified by Rathborne et al. (2009). However, as pointed out by Roman-Duval et al. (2009), only approximately two-thirds of the emission in the GRS was accounted for in the source extraction, with diffuse emission below the detection threshold accounting for the rest. The strong correlation we have found between the 13CO emission and the spiral arms is consistent with the findings of Roman-Duval et al. (2009) and Rigby et al. (2016). Nevertheless, this analysis also indicates that a significant amount of molecular gas (up to 40 per cent) is located in the inter-arm regions.
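The matching step can be sketched in Python as follows. This is an illustrative toy rather than the analysis code: the arm loci are hypothetical constant-velocity arrays, the emission is uniform, and the ±0.5° longitude tolerance is omitted, so only the minimum velocity offset at the same longitude is used.

```python
import numpy as np

def arm_fraction(intensity, velocity, arm_velocities, max_offset=10.0):
    """Fraction of integrated intensity within max_offset (km/s) of any arm.

    intensity      : 2-D array (n_vel, n_lon), already masked below the noise cut
    velocity       : 1-D array (n_vel,) of channel velocities in km/s
    arm_velocities : 2-D array (n_arms, n_lon), arm v_lsr at each longitude
    """
    # |v - v_arm| for every (arm, channel, longitude), then the minimum over arms.
    offsets = np.abs(velocity[None, :, None] - arm_velocities[:, None, :]).min(axis=0)
    associated = offsets < max_offset
    return intensity[associated].sum() / intensity.sum(), offsets

# Toy example (hypothetical values): two arms, three longitude pixels.
velocity = np.arange(-100.0, 1.0, 5.0)            # channels from -100 to 0 km/s
arm_velocities = np.array([[-80.0, -80.0, -80.0],
                           [-30.0, -30.0, -30.0]])
intensity = np.ones((velocity.size, 3))           # uniform emission for simplicity
fraction, offsets = arm_fraction(intensity, velocity, arm_velocities)

# Cumulative distribution of intensity with increasing velocity offset.
order = np.argsort(offsets.ravel())
cumulative = np.cumsum(intensity.ravel()[order]) / intensity.sum()
print(f"fraction within 10 km/s of an arm: {fraction:.2f}")
```

Reading the cumulative curve at an offset of 10 km s^-1 gives the same fraction as the direct cut, which is how a figure such as Fig. 9 and the quoted percentage relate to each other.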

NOTEWORTHY REGIONS
Although it is clear that the majority of the 13CO emission outside the CMZ is closely associated with the spiral arms, there are a number of interesting features seen in the ℓv-map that are worth discussing in some detail.

Galactic Centre
Figure 9. Cumulative integrated 13CO (2-1) intensity as a function of velocity offset from the nearest spiral arm for longitudes between 300° < ℓ < 350°. The velocity offsets have been calculated by finding the minimum velocity difference to a spiral arm for each pixel in the ℓv-map with a flux above 3σ_rms.
In Fig. 10 we show the ℓv-map of the Galactic Centre region. The emission in this region can be separated into two distinct types: a narrow horizontal strip of emission centred around v_lsr = 0 that stretches across the whole map, and a region of more complex emission between 359° < ℓ < 2° and velocities −150 km s^-1 < v_lsr < 150 km s^-1. The horizontal strip around v_lsr = 0 is the result of foreground and background emission within the Galactic disc, while the CMZ itself is responsible for emission over a large range of v_lsr. The CMZ is a peculiar region of the inner Galaxy that includes a number of large molecular complexes such as Sagittarius A (ℓ = 0°, v_lsr = 50 km s^-1), Sagittarius B (ℓ = 0.6°, v_lsr = 50 km s^-1), Sagittarius C (ℓ = 359.3°, v_lsr < 0 km s^-1) and Sagittarius D (ℓ = 0.9°, v_lsr = 80 km s^-1), each covering more than 10 arcmin^2; these molecular complexes are labelled in Fig. 10. This map nicely shows the complex kinematics in the Galactic centre region, in particular the presence of non-circular motions and gas emission at forbidden velocities (negative for ℓ > 0° and positive for ℓ < 0°; Riquelme et al. 2010 and references therein).
In addition to these two large-scale features, we can also see some finer scale detail such as the narrow absorption features at ℓ = 359.5° with v_lsr ∼ −50 km s^-1, −30 km s^-1 and 0 km s^-1; these features are due to absorption of the strong emission emanating from the hot gas in the CMZ by the colder foreground segments of the 3 kpc, Norma, and Sagittarius arms (previously observed in HCO+ and HCN; e.g. Fukui et al. 1977, 1980; Linke et al. 1981; Riquelme et al. 2010). On this plot we have also overlaid the loci of the 3 kpc arms. Comparing the molecular emission with the loci of the near 3 kpc arm (indicated by the lower dashed-dotted yellow line shown in Fig. 10), we find good agreement with the absorption feature seen at ℓ = 359.5° and −50 km s^-1, which we have already attributed to this arm. We also see some association with molecular emission along its length, although this emission is weak and rather sporadic. It is also interesting to note that the absorption feature associated with the Norma arm has a velocity of ∼ −30 km s^-1, while the model loci of this arm pass very close to v_lsr = 0 at this longitude.

Bania molecular clouds
Figure 10. Longitude-velocity map of the Galactic centre region. The lower and upper yellow dashed-dotted lines show the loci of the near and far 3 kpc arms respectively (see text for details). On this map we label some of the more significant molecular clouds and absorption features that have been attributed to foreground spiral arms; some of these are discussed in the text.
Figure 11. Longitude-velocity map of the Bania Clouds. The features described in the text are labelled following the nomenclature used by Bania (1977).
In Fig. 11 we show the Bania complex of molecular clouds (Bania 1977; Bania et al. 1986), which consists of three large (40-100 pc) distinct molecular complexes located in a narrow longitude range close to the Galactic centre (ℓ between 354.5° and 355.5°) with v_lsr velocities of 68, 85 and 100 km s^-1. Adopting the nomenclature from the original papers, these are known as Clump 4, Clump 3 and Clump 1, as labelled in Fig. 11 (the reference Clump 2 was given to another object located at ℓ ∼ 3°). This region is unusual in that it is associated with velocities that are forbidden by Galactic rotation models. Even if this complex were located outside the solar circle on the far side of the Galaxy at a distance of 40 kpc, the maximum velocity that we would expect in this direction would be ∼16 km s^-1 (Burton & Gordon 1978).
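The order of magnitude of this limit can be checked with a short Python sketch. It assumes circular rotation at b = 0 with a flat rotation curve (assumed speed 236 km s^-1) and R_0 = 8.15 kpc; these are not the parameters of the Burton & Gordon (1978) model, so only approximate agreement with the quoted ∼16 km s^-1 should be expected. The function name is hypothetical.

```python
import numpy as np

R0 = 8.15       # kpc, solar Galactocentric radius used elsewhere in the text
THETA0 = 236.0  # km/s, ASSUMED flat rotation speed

def v_lsr_circular(l_deg, distance):
    """v_lsr (km/s) expected from circular rotation at longitude l_deg (deg)
    and heliocentric distance (kpc), for b = 0 and a flat rotation curve."""
    l = np.radians(l_deg)
    # Galactocentric radius from the law of cosines.
    r_gc = np.sqrt(R0**2 + distance**2 - 2.0 * R0 * distance * np.cos(l))
    return THETA0 * np.sin(l) * (R0 / r_gc - 1.0)

v_far = v_lsr_circular(355.0, 40.0)    # far outside the solar circle
v_inner = v_lsr_circular(355.0, 5.0)   # inside the solar circle
print(f"l = 355 deg: v_lsr(d = 40 kpc) = {v_far:.1f} km/s, v_lsr(d = 5 kpc) = {v_inner:.1f} km/s")
```

Under these assumptions gas inside the solar circle at ℓ = 355° has negative v_lsr, and even very distant gas reaches only a modest positive velocity, well below the 68-100 km s^-1 observed for the Bania clouds.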
These clouds were originally mapped in 12CO (1-0) where all of the clouds were detected with good signal to noise (Bania 1980, 1986). However, in the less abundant 13CO (2-1) tracer, Clump 3 and Clump 4 are only weakly detected. Clump 1 is much brighter and appears to be elongated, extending over 1° in longitude. This cloud is also the only one of the three that is associated with an H II region (G354.67+0.25, Caswell & Haynes 1982), which is located at the western edge of the cloud. Bania (1986) suggested that this complex could be associated with a feature that he refers to as the 135 km s^-1 arm, which can be reproduced by a Galactocentric ring of material with a radius of 3 kpc rotating at a velocity of 222 km s^-1 and expanding from the Galactic centre at a speed of 135 km s^-1. Clump 1 is located at the southern terminus of this structure, at a distance of 11.4 kpc. This large scale structure is not seen in our ℓv-map but is clearly seen in the ℓv-map of Dame et al. (2001) (see Fig. 2 from Jones et al. 2013). However, the nature of this 135 km s^-1 arm is contentious (see discussion by Jones et al. 2013) and it is not clear if Clump 1 is part of this structure or is entering the dust lane (Liszt & Burton 1980). Modern simulation efforts often attribute these features to gas approaching the far end of the bar, about to begin the journey back towards the Galactic centre (Baba et al. 2010; Li et al. 2016b; Sormani et al. 2018).

Population of nearby wispy clouds
Figure 12. This map is a zoom of a region of the 13CO (2-1) ℓv-map presented in Fig. 6, which contains three elongated clouds that have very narrow line-widths (FWHM ∼ 0.5-0.75 km s^-1). We have classified them as wispy clouds.
Figure 13. 13CO (2-1) emission integrated over the line-width for Cloud 4 (see Table 4 for more details).
Examination of the ℓv-map has revealed the existence of a population of unusual clouds. These appear as very narrow horizontal lines in the ℓv-map (see Fig. 12 for some examples), so much so that we initially thought them to be artificial, perhaps caused by spikes in the spectrometers or by artefacts introduced during the data reduction procedure. However, on closer examination these were found to be extended over large areas (∼0.5-1° in diameter) and to have morphologies typical of molecular clouds (see Fig. 13 for an example of their structure). These clouds have three primary characteristics: they are large in size, they have very narrow line-widths (FWHM ∼0.5-1 km s^-1) and they tend to have velocities close to the solar one (i.e. v_lsr close to zero). In Table 4, we summarise the positions and velocities for seven of these clouds clearly seen in the ℓv-map. Their velocities and large angular sizes would suggest that the majority of these are local clouds. However, we note that one (Cloud 2) has a velocity that would place it at a larger distance. Given their narrow line-widths it is possible that these types of clouds have been missed in previous surveys where the velocity resolutions were > 0.5 km s^-1, as they would only be 1 or 2 velocity channels wide and would have been discarded as artefacts. It would therefore be interesting to investigate these objects in more detail; however, their proximity to the Sun makes kinematic distances unreliable and, without these, determining physical properties is not possible.
Typical molecular clouds have FWHM line-widths of a few km s^-1 (e.g. Paper III) and given that the thermal contribution is of the order of 0.3 km s^-1 (assuming a temperature of 10-20 K) most of the motion in these clouds is non-thermal in nature, and often attributed to turbulence. The clouds identified in this Section are unusual in that their line-widths are much narrower than typically found for molecular clouds, and, therefore, the thermal and non-thermal components appear roughly balanced. In Fig. 14 we show an example of the line-width for Cloud 4; this has been produced by integrating the emission seen in the ℓv-map in longitude (i.e. along its length). Given that the non-thermal energy can work to support clouds against gravitational collapse, such low values could indicate that these clouds would be potentially unstable to collapse, if they were associated with sufficient mass. In the absence of a robust distance estimate, we cannot determine the masses of the clouds, and thus are limited in our ability to make any further analysis on the nature of these clouds. Nevertheless, the fact that these are not seen in the C18O data suggests that they either have low excitation temperatures or low column densities and are, therefore, rather diffuse and perhaps transient.
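The size of the thermal contribution can be checked with a few lines of Python. The text does not state which particle mass underlies the quoted ∼0.3 km s^-1, so this sketch evaluates two common choices as assumptions: the isothermal sound speed for a mean molecular weight of 2.37, and the thermal FWHM of the 13CO molecule itself (mass taken as 29 atomic mass units).

```python
import numpy as np

K_B = 1.380649e-23    # J/K, Boltzmann constant
AMU = 1.66053907e-27  # kg, atomic mass unit

def thermal_sigma_kms(temperature, mass_amu):
    """One-dimensional thermal velocity dispersion sqrt(kT/m) in km/s."""
    return np.sqrt(K_B * temperature / (mass_amu * AMU)) / 1e3

FWHM_FACTOR = np.sqrt(8.0 * np.log(2.0))  # converts a Gaussian sigma to a FWHM

for t in (10.0, 20.0):
    c_s = thermal_sigma_kms(t, 2.37)                      # mean particle, ASSUMED mu = 2.37
    fwhm_13co = FWHM_FACTOR * thermal_sigma_kms(t, 29.0)  # 13CO molecule, ~29 amu
    print(f"T = {t:.0f} K: sound speed = {c_s:.2f} km/s, "
          f"13CO thermal FWHM = {fwhm_13co:.2f} km/s")
```

Both quantities are a few tenths of a km s^-1 at 10-20 K, i.e. comparable to the 0.5-1 km s^-1 line-widths of the wispy clouds but far below the few km s^-1 typical of molecular clouds.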
We note that the most striking of these clouds are located in the 4th quadrant. This potentially highlights a subtle difference between [...] Table 4). [...] Fig. 6), for instance a chain of clouds running from +10 to +15 km s^-1 in v_lsr over the 32.5° ≤ ℓ ≤ 35.5° longitude range.
In order to improve our sensitivity to weaker emission we have also performed a stacking analysis to search for emission from other species using spectral-cube^5. We analysed the signal-to-noise ratio in the extracted lines as a function of the intensity threshold in the 13CO (2-1) line in steps of 5 K, and we found that most detected lines peaked at a threshold of 30 K. We then selected all voxels within the 13CO (2-1) cube covering the G334 field where the 13CO (2-1) brightness was above this threshold of 30 K. For each pixel in those regions, we used the first moment map to select the peak velocity. The weak line spectra (i.e., HNCO and SiO) corresponding to these positions were then shifted by the 13CO (2-1) velocity and averaged, such that any signal is expected to have a velocity offset close to zero. In Fig. 15 we show the results of our stacking analysis, which reveals that we have indeed detected a weak feature in the HNCO (10-9) line at 5σ in integrated intensity. We have also detected a stronger feature in the SiO transition (∼8σ), but this feature is not peaked at 0 km s^-1 and is much broader; since high-excitation SiO primarily traces outflows, this broad, offset feature may represent the average of several outflow features over the G334 field.
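The shift-and-average procedure can be sketched with NumPy alone. This is a simplified illustration on synthetic data, not the spectral-cube based analysis used for the survey: the function and variable names are hypothetical, spectra are shifted by whole channels only, and the synthetic cubes, line amplitudes and noise level are arbitrary assumptions.

```python
import numpy as np

def stack_spectra(weak_cube, ref_cube, velocity, threshold):
    """Shift-and-average weak-line spectra using a bright reference line.

    weak_cube, ref_cube : arrays of shape (n_chan, n_y, n_x) on the same grid
    velocity            : 1-D array of channel velocities (uniform spacing)
    threshold           : brightness cut applied to ref_cube (same units)
    """
    dv = velocity[1] - velocity[0]
    mask = ref_cube > threshold                 # bright reference voxels
    keep = mask.any(axis=0)                     # pixels with at least one bright voxel
    weights = np.where(mask, ref_cube, 0.0)
    wsum = weights.sum(axis=0)
    # First moment (intensity-weighted mean velocity) of the masked reference line.
    mom1 = np.full(keep.shape, np.nan)
    mom1[keep] = (weights * velocity[:, None, None]).sum(axis=0)[keep] / wsum[keep]

    centre = len(velocity) // 2
    stacked = np.zeros(len(velocity))
    for y, x in zip(*np.nonzero(keep)):
        # Whole-channel shift that moves the reference velocity to the centre channel.
        shift = centre - int(round((mom1[y, x] - velocity[0]) / dv))
        stacked += np.roll(weak_cube[:, y, x], shift)  # note: np.roll wraps at the edges
    n = int(keep.sum())
    offsets = (np.arange(len(velocity)) - centre) * dv
    return offsets, stacked / max(n, 1), n

# Synthetic demonstration: a weak line buried in noise, at a different velocity per pixel.
rng = np.random.default_rng(0)
velocity = np.arange(-50.0, 50.0, 0.5)
centres = rng.uniform(-20.0, 20.0, size=(8, 8))

def gaussian_cube(amplitude, sigma):
    return amplitude * np.exp(-0.5 * ((velocity[:, None, None] - centres[None]) / sigma) ** 2)

ref_cube = gaussian_cube(40.0, 2.0)                                   # bright reference line
weak_cube = gaussian_cube(0.5, 2.0) + rng.normal(0.0, 0.5, ref_cube.shape)
offsets, stacked, n = stack_spectra(weak_cube, ref_cube, velocity, threshold=30.0)
print(f"stacked {n} spectra; peak at offset {offsets[np.argmax(stacked)]:.1f} km/s")
```

Because each spectrum is aligned on the reference velocity before averaging, the noise averages down while any co-moving signal adds coherently near zero offset.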
Using this approach, we can also increase the signal-to-noise of the detection of the more prominent species (e.g. SO, H2CO and HC3N; see Fig. 15). The success of this semi-blind stacking approach suggests that detailed studies of Galactic-scale cloud chemistry will be possible despite the survey's relatively short integration times per position.
Finally, we want to highlight that the most extreme star forming regions of our Galaxy exhibit extended emission in several of these weaker lines. We illustrate this in Fig. 16, where we show an example of the integrated intensity maps of all lines towards a region around Sgr B2, one of the most active star forming sites in our Galaxy, located close to the Galactic Centre (see Fig. 10 for location), where we detect extended emission in the SO, SiO, HNCO (10_{0,10}-9_{0,9}), CH3OH (4_{2,0}-3_{1,0}) and H2CO (3_{0,3}-2_{0,2}) lines.
^5 https://spectral-cube.readthedocs.io/

PERSPECTIVES AND CONCLUSIONS
Here we have presented the first public data release of the SEDIGISM survey, which covers 84 deg^2 of the inner Galactic plane in 13CO (2-1) and C18O (2-1), at 30'' angular resolution and a typical noise level of order 1 K (T_mb) at 0.25 km s^-1 resolution. Future data releases may address remaining issues such as the baseline subtraction in complex regions and other artefacts not fully addressed with the current reduction pipeline (e.g. a spike near v_lsr ∼ −48 km s^-1 in C18O (2-1) spectra). All data products extracted from this data set, in particular a catalogue and masks of molecular clouds (Paper III), are also being made publicly available alongside this data release, thus providing the community with high added-value products that are complementary to other surveys. This will constitute an invaluable resource for Milky Way studies in the southern hemisphere.
In this paper we have provided an up-to-date description of the data reduction procedure and data products, and highlighted some known issues with some of the fields. We also discussed the Galactic distribution of the molecular gas and investigated its correlation with known star-forming complexes and the large scale structural features of the Galaxy such as the spiral arms. Overall, the data appear consistent with a 4-arm model of the Galaxy. Using the model from Taylor & Cordes (1993), as updated by Cordes (2004), we found that ∼60 per cent of the 13CO (2-1) emission is tightly associated with the spiral arms (i.e. within 10 km s^-1 of an arm) and that very clear peaks, which can be attributed to specific spiral arms, can be seen in the distribution of intensities with Galactocentric distance.
We have also shown how the velocity information allows us to analyse the complex nature of the molecular gas, which can be separated into different scale structures (filamentary, diffuse and compact structures). This also allows us to investigate the large scale dynamics of the interstellar medium. We have also demonstrated the feasibility of using transitions from less abundant molecular species for science exploitation. Finally, we have highlighted some interesting regions where the SEDIGISM survey can either provide a new perspective (i.e. the Galactic centre region and the Bania molecular complex) or identify a new population of local molecular clouds, which appear as large (∼1°) features near v_lsr = 0 km s^-1 and with very narrow (<1 km s^-1) line-widths.
A systematic exploitation of the full survey data is now under way. We have also used the SEDIGISM data to confirm the nature of filaments previously identified in ATLASGAL (Li et al. 2016a) and to explore their kinematics and mass per unit length (Mattern et al. 2018). In parallel to this work we are publishing a catalogue of nearly 11,000 molecular clouds and complexes extracted with SCIMES (Colombo et al. 2015), for which we derived distances and physical properties (Paper III), and a systematic assessment of the dense gas fraction and star formation efficiency as a function of environment (Paper IV).
Figure 15. Stacked spectra over the G334 field for the transitions SO (5-4), H2CO (3_{0,3}-2_{0,2}), H2CO (3_{2,1}-2_{2,0}), HC3N (24-23), HNCO (10-9) and SiO (5-4). The red line in each panel shows the best-fit Gaussian profile.
In addition to these studies there are a number of ongoing projects aimed at extracting and characterising very long filaments directly from the spectral cubes. We also plan to constrain further the large-scale Galactic structure (number and position of the spiral arms, orientation of the bar), by comparing the SEDIGISM data with the results of simulations. By exploiting the SEDIGISM and ThrUMMS (Barnes et al. 2015) data together, we will also characterise the excitation conditions of the interstellar medium over a large fraction of the Galaxy, extending the exploratory work presented in Paper I to the full survey coverage. Other topics under study include the analysis of the dynamical properties and turbulence in giant molecular complexes.
Clearly, a lot of novel studies can be carried out based on the SEDIGISM data alone, and the exploitation of this survey combined with data from other spectroscopic or continuum surveys opens new perspectives for a detailed investigation of the structure and physical conditions of the interstellar medium.
CONICYT project Basal AFB-170002. HB acknowledges support from the European Research Council under the Horizon 2020 Framework Program via the ERC Consolidator Grant CSF-648505. HB also acknowledges support from the Deutsche Forschungsgemeinschaft (DFG) via Sonderforschungsbereich (SFB) 881 "The Milky Way System" (sub-project B1). This work was partially funded by the Collaborative Research Council 956 "Conditions and impact of star formation" (subproject A6), also funded by the DFG. MW acknowledges funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 796461. TCs has received financial support from the French State in the framework of the IdEx Université de Bordeaux Investments for the future Program. CLD acknowledges funding from the European Research Council for the Horizon 2020 ERC consolidator grant project ICYBOB, grant number 818940. This document was prepared using the Overleaf web application, which can be found at www.overleaf.com.

DATA AVAILABILITY
The data presented in this article are available from a dedicated website: https://sedigism.mpifr-bonn.mpg.de
Table A1. Spectra observed towards the reference positions: measured rms and spectral lines detected. The positions (Col. 1) give the Galactic coordinates of the associated on-source maps. The full table is available in the online version of the paper.
To compare the data from the SEDIGISM and HERO surveys we first smoothed and interpolated both data sets to a common resolution of 35'' and 0.5 km s^-1. To do so we used a Python package and the smoothing procedures described in its documentation^6. Since the two data sets do not cover exactly the same region, we regridded the HERO data onto the SEDIGISM data grid. To consider only the significant emission in the comparison, we masked the data using a dilate masking technique (Rosolowsky & Leroy 2006). This technique consists of generating two masks, at relatively low and high signal-to-noise ratio (SNR). Connected regions in position-position-velocity (PPV) space in the low-SNR mask that do not contain a region of the high-SNR mask are eliminated from the final mask. The result is effectively an expansion of the high-SNR regions to emission of lower significance, without the inclusion of noisy peaks. For our purposes we consider SNR ≥ 10 for the high-SNR mask and SNR ≥ 5 for the low-SNR one. We show the resulting integrated intensity maps (integrated over the full available range of v_lsr) for the two data sets in Fig. B1 for 13CO (left column) and C18O (right column).
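The two-threshold masking described above can be sketched in NumPy as a constrained region-growing loop. This is an illustrative reimplementation rather than the code used for the comparison: the function name is hypothetical, and face-neighbour connectivity in PPV space is an assumption, since the connectivity used is not stated.

```python
import numpy as np

def dilated_mask(snr, high=10.0, low=5.0):
    """Keep low-SNR regions only if they are connected to a high-SNR voxel.

    Starting from the high-SNR mask, the mask is grown by one voxel at a time
    along each axis and clipped to the low-SNR mask until it stops changing.
    """
    low_mask = snr >= low
    mask = snr >= high
    while True:
        grown = mask.copy()
        for axis in range(mask.ndim):
            lo = [slice(None)] * mask.ndim
            hi = [slice(None)] * mask.ndim
            lo[axis] = slice(None, -1)
            hi[axis] = slice(1, None)
            grown[tuple(hi)] |= mask[tuple(lo)]  # grow towards increasing index
            grown[tuple(lo)] |= mask[tuple(hi)]  # grow towards decreasing index
        grown &= low_mask                        # never leave the low-SNR mask
        if np.array_equal(grown, mask):
            return mask
        mask = grown

# Toy PPV cube: one bright peak with fainter wings, plus an isolated faint peak.
snr = np.zeros((1, 1, 10))
snr[0, 0, 2:5] = [6.0, 12.0, 7.0]   # wings at SNR 6-7 attached to a peak at SNR 12
snr[0, 0, 7] = 6.0                  # isolated SNR 6 peak, not attached to any SNR >= 10 voxel
mask = dilated_mask(snr)
print(mask[0, 0].astype(int))
```

In the toy example the wings of the bright peak are retained while the isolated faint peak is rejected, which is the behaviour the technique is designed to give.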
It is clear from Fig. B1 that the HERO data have a lower noise than SEDIGISM, since fewer pixels are masked. Besides this, the 13CO integrated maps from the two surveys appear largely alike (Fig. B1, left column). To avoid biases due to the different sensitivity in the two surveys, we have also computed integrated 13CO maps considering only voxels above a fixed value of T_mb = 5 K (Fig. B2). The ratio between the two integrated intensity maps is mostly consistent with unity (within the calibration uncertainty), which demonstrates that both data sets are consistent. Since the C18O (2-1) line emission is much weaker than 13CO, we can only compare the brightest regions between both data sets (Fig. B1, right column). We notice some slight differences in the distribution of the C18O emission between SEDIGISM and HERO, but there is no pronounced systematic difference visible.
Figure B1. Integrated intensity maps from SEDIGISM (top) and HERO (bottom) 13CO (left) and C18O (right) data. SEDIGISM and HERO data have been homogenized and masked as described in Section B.
Figure B2. Integrated 13CO (2-1) intensity maps from SEDIGISM (left) and HERO (middle), and absolute ratio between the two maps (right). Only voxels above a threshold of 5 K have been considered.