A mock redshift catalogue of the dusty star-forming galaxy population with intrinsic clustering and lensing for deep millimetre surveys

We present a new cosmologically motivated mock redshift survey of the Dusty Star-Forming Galaxy population. Our mock survey is based on the Bolshoi– Planck dark-matter halo simulation and covers an area of 5.3 deg 2 . Using a semi-empirical approach, we generate a light cone and populate the dark-matter haloes with galaxies. Infrared properties are assigned to the galaxies based on theoretical and empirical relations from the literature. Additionally, background galaxies are gravitationally lensed by dark-matter haloes along the line-of-sight assuming a point-mass model approximation. We characterize the mock survey by measuring the star formation rate density, integrated number counts, redshift distribution, and infrared luminosity function. When compared with single-dish and interferometric observations, the predictions from our mock survey closely follow the compiled results from the literature. We have also directed this study towards characterizing one of the extragalactic legacy surveys to be observed with the TolTEC camera at the Large Millimeter Telescope: the 0.8 sq. degree Ultra Deep Survey, with expected depths of 0.025, 0.018 and 0.012 mJy beam − 1 at 1.1, 1.4 and 2.0 mm. Exploiting the clustering information in our mock survey, we investigate its impact on the effect of flux boosting by the fainter population of dusty galaxies, finding that clustering can increase the median boosting by 0.5 per cent at 1.1 mm, 0.8 per cent at 1.4 mm and, 2.0 per cent at 2.0 mm, and with higher dispersion.


INTRODUCTION
A key piece to unravel the formation and evolution of galaxies is the study of the dusty star-forming galaxy (DSFG) population.These galaxies are highly obscured by dust, preventing the estimation of total star formation rates (SFR) at UV-optical wavelengths.However, at submillimetre wavelengths, we can observe the emission of dust grains heated by stars.The camera SCUBA in the James Clerk Maxwell Telescope (JCMT) discovered a subset of DSFG known as submillimeter galaxies (SMG, e.g., Smail et al. 1997;Barger et al. 1998;Hughes et al. 1998).These galaxies are found mainly at high redshifts, with a peak at  ∼ 2 − 3 (Chapman et al. 2003;Aretxaga et al. 2003;Chapman et al. 2005;Aretxaga et al. 2005;Stach et al. 2018;Franco et al. 2020;Gómez-Guĳarro et al. 2022), and they exhibit high infrared luminosities ( IR > 10 12 L ⊙ , Smail et al. 1997;Hughes et al. 1998;Magnelli et al. 2013;Gruppioni et al. 2020;Casey et al. 2013), corresponding to extreme SFRs of hundreds and even thousands of M ⊙ yr −1 (e.g.Smail et al. 1997;Hughes et al. 1998;Barger et al. 2014;da Cunha et al. 2015).For a review on this population see (Casey et al. 2014).The SFR density (SFRD) is highly obscured at  ≲ 3, with a contribution from DSFGs reaching approximately fifty per cent at  ≈ 3 (Bourne et al. 2017; Dudzevičiūtė ★ E-mail: aracelinavam@gmail.comet al. 2020).However, at  ≳ 4 their contribution is still not well constrained since the samples of DSFGs at these high redshifts remain incomplete (e.g.Casey et al. 2014;Gruppioni et al. 2020;Khusanova et al. 2020;Zavala et al. 2021;Fudamoto et al. 2021;Algera et al. 2022).Measuring reliable photometric or spectroscopic redshifts of DSFGs often requires counterpart associations across UV to midinfrared wavelengths.However, detecting high− galaxies at these wavelengths becomes challenging due to their flux densities rapidly dropping with redshift.Additionally, the relatively poor angular resolution and sensitivity of the single-dish (sub-)mm telescopes further complicate the association between the DSFGs and their UV/optical counterparts.The single-dish telescopes observe only the most luminous galaxies in large areas, e.g., SCUBA/SCUBA-2 on the JCMT (Geach et al. 2017), MAMBO on IRAM (Bertoldi et al. 2007), SPT (Vieira et al. 2010), and AzTEC on ASTE (Aretxaga et al. 2011).Interferometric telescopes as ALMA have shown that a considerable fraction of these luminous galaxies are actually the result of blending fainter sources within the larger beam of single-dish telescopes (e.g.Karim et al. 2013;Simpson et al. 2015;Stach et al. 2018): some cases being the result of projected sources in the line-of-sight, while others have been confirmed to be physically associated galaxies potentially tracing over-dense regions of the Universe.Due to their high angular resolution and sensitivity, interferometers allow more detailed studies of DSFGs.However, compared to single-dish observations, interferometric surveys are limited to mapping relatively smaller areas of the sky (hundreds of square arcminutes, e.g.Lindner et al. 2011;Fujimoto et al. 2015;Dunlop et al. 2017;Franco et al. 2018).To obtain a complete sample of DSFGs on both small and large scales and to understand their role not only in the SFRD (Madau & Dickinson 2014;Dunlop et al. 2017;Magnelli et al. 2019;Gruppioni et al. 2020) but also in their physical evolution, there is a need for a link between single-dish and interferometric telescopes.
The new camera TolTEC1 (Wilson et al. 2020) installed in the 50m Large Millimeter Telescope Alfonso Serrano (LMT2 , Hughes et al. 2020) observes at three different wavelengths simultaneously (1.1, 1.4 and 2.0 mm), with angular resolutions of 5.0, 6.3 and 9.5 arcsec.TolTEC will image a series of Extragalactic Legacy Surveys, two of which have already been defined: the Ultra Deep Survey (UDS) and the Large Scale Structure Survey (LSS) (Montaña et al. 2019).
The UDS survey will be sensitive to typical star forming galaxies (SFR > 10 M ⊙ yr −1 and  IR > 10 11 L ⊙ ) with 1 rms depths of 0.025, 0.018 and 0.012 mJy beam −1 at 1.1, 1.4 and 2.0 mm.The goal is to study the history of stellar mass build-up and metal production in massive galaxies during the last 13 Gyr of cosmic time (Montaña et al. 2019).The total survey area will cover ∼ 0.8 deg 2 , observing the CANDELS fields in UDS, GOODS-S, and a 0.5 sq.degree area in COSMOS.On the other hand, the LSS survey will cover a larger area of the sky to investigate the spatial distribution of the starforming galaxies through the cosmic web.It will study galaxies with SFR > 100 M ⊙ yr −1 and  IR > 10 12 L ⊙ at 1.1, 1.4 and 2.0 mm with depths of 0.25, 0.18 and 0.12 mJy beam−1 at 1 in rms.The observed areas will be of 3.5 to 10 deg 2 for target individual fields and a total of 40-60 deg 2 for the full survey (Montaña et al. 2019).These two extragalactic legacy surveys will close the gap between single-dish and interferometric observations.Works based on N-body cosmological simulations and different methods to populate galaxies within the dark matter halos (e.g., Cowley et al. 2014;Béthermin et al. 2017;Popping et al. 2020) or in hydrodynamics simulations (e.g., Lovell et al. 2021), have attempted to recreate the IR properties of the DSFGs and consequently reproduce the integrated number counts.These simulations allow us to understand better the DSFG population, and can be used to study the impact of observational effects (flux boosting, confusion noise, source blending, completeness, cosmic variance, etc.) on the detection of galaxies and different observables, e.g., the measured number counts (Cowley et al. 2014;Béthermin et al. 2017) and angular correlation functions (Cowley et al. 2016(Cowley et al. , 2017)).The flux boosting effect is present when unresolved neighbouring galaxies fall within the same telescope beam and increase the measured flux density.The flux boosting corrections to individual galaxies are commonly estimated through simulations that randomly distribute the galaxies in the observed area (e.g., Coppin et al. 2005Coppin et al. , 2006;;Geach et al. 2017;Zavala et al. 2017;Chen et al. 2022).These latter simulations do not consider the clustering of the DSFGs.However including clustering potentially has an impact on the flux boosting effects since it depends on the likelihood of having nearby galaxies.
In this paper, we present a new mock redshift survey of the DSFG population.This mock survey incorporates clustering properties and the gravitational lensing magnification caused by the dark-matter haloes in the line-of-sight on background galaxies, as it is built on a dark-matter cosmological simulation.As discussed above, previous mock surveys have overlooked both of these effects.We take advantage of this new mock redshift survey to characterize the TolTEC/UDS extragalactic legacy survey and to explore the impact that the clustering of DSFGs might have in the determination and correction of flux density boosting effects.
In section 2 we describe our procedure to generate our mock redshift survey of the DSFG population.In section 3 we characterize our mock redshift survey and compare it with the SFRD, redshift distribution, number counts, and the luminosity function of DSFGs.In Section 4 we show our predictions for the UDS extragalactic legacy survey of TolTEC and we discuss the different observational effects present in this survey.In section 5 we analyze the impact in the corrections of flux boosting when the clustering of galaxies is taken into account.Finally, we present our conclusions in the Section 6.

SIMULATION AND MOCK REDSHIFT SURVEY
During the past years, the Simulated Infrared Dusty Extragalactic Sky (SIDES, Béthermin et al. 2017) has been the largest (2 deg 2 ) publicly available simulation of the DSFG population.SIDES accurately reproduces the FIR-mm number counts of the DSFGs and, since it is based on the Bolshoi-Planck cosmological simulation, it therefore includes their clustering properties.SIDES has been widely used to compare and analyze observations of the DSFG population (Béthermin et al. 2020;Chen et al. 2022;Bing et al. 2023;Traina et al. 2024) and has recently been updated to predict their emission line properties (CONCERTO, Béthermin et al. 2022).The 2 deg 2 covered by SIDES, however, restricts its use to the analysis of relatively small area surveys, limits the predictions of cosmic variance effects, and misses the brightest and more extreme SMGs.The larger areas of future millimetre wavelength surveys, particularly those to be conducted with the TolTEC camera, has motivated the development of the new larger area (5.3 deg 2 ) mock redshift survey presented in this work.
In this section we describe the procedure followed to build our mock redshift survey of the DSFG population.First we describe the construction of a lightcone of dark matter haloes and how these haloes are populated with galaxies.We latter explain how starforming galaxies are identified within the mock redshift survey, and the process followed to assign their infrared properties.Finally, we estimate gravitational lensing magnifications and the observed flux densities for each galaxy.In Appendix A1 we highlight the main differences between our model (described in §2.1 - §2.4) and that from the SIDES simulation.

Lightcone
We use the Bolshoi-Planck cosmological simulation (Klypin et al. 2016;Rodríguez-Puebla et al. 2016b), which has a side length of 250ℎ −1 Mpc, with 2048 3 dark matter particles and a mass resolution of 1.55×10 8 ℎ −1 M ⊙ .The periodic conditions of the Bolshoi-Planck simulation allows us to build a lightcone, covering a wide range of redshifts ( ≲ 7) by repeating the volume of the box in the  direction as many times as necessary.The projection of the Bolshoi-Planck box results in a 5.3 deg 2 area.This size is large enough to generate multiple synthetic observations of the maps expected for the UDS.The maximum number necessary to cover a square area, with a side length , is given by: where Integer(x) rounds the value of  to the nearest integer.Thus, for a square area of 5.3 deg 2  repetitions = 26, the corresponding comoving volume covered by the lightcone is  ≈ 1.15 × 10 5 ℎ −3 Gpc 3 up to redshift 7. Due to the box repetitions, the light-cone projection on the sky can present false radial patterns.To avoid this, we include random displacements of the boxes no greater than 10% of the length of the box in one of the axis as well as several random rotations and permutations of each box.A visual inspection of the lightcone projection shows these are satisfactory solutions.To establish a connection between the galaxies and their host dark-matter haloes, we use an updated version to the semi-empirical galaxy-halo connection developed by Rodríguez-Puebla et al. (2016a, 2017).The general idea is to match the halo number density and the galaxy number density through a particular combination of their properties while at the same time modelling the mass assembly and star formation history of the galaxies.Here we use the maximum circular velocity attained along the main progenitor branch of the halo merger trees ( peak ) and the stellar mass ( ★ ) of the galaxies to make the connection.The above is well known that reproduces the auto-correlation functions of galaxies as a function of stellar mass (see e.g., Reddick et al. 2013;Dragomir et al. 2018;Calette et al. 2021) and when dividing into star-forming and quiescent galaxies (Kakos et al. 2024).
SFRs are determined following the growth in mass of the host haloes and their associated galaxies through the merger trees of the Bolshoi-Planck simulation, dividing the stellar mass growth into two components: 1)  −  star formation and 2)  −  star formation due to the merger of smaller galaxies.The history of stellar mass growth is calibrated to reproduce the evolution of the stellar mass function of all galaxies from  ∼ 0.1 to  ∼ 10, the fraction of quiescent galaxies from  ∼ 0.1 to ∼ 4, and the mean of the star-forming main sequence of galaxies from  ∼ 0.1 to  ∼ 8 (Rodríguez-Puebla et al. in prep.).In this way, each dark-matter halo and subhalo in our lightcone hosts a galaxy with a defined  ★ and SFR.
It is important to note that the semi-empirical approach does not introduce any modelling of astrophysical processes (as in the case of semi-analytical models) or recipes to physically link halo evolution with galaxy evolution.The approach is based on a continuous statistical matching of the simulated halo and observed galaxy distributions over a wide redshift range.As a result, the average stellar mass growth (in-situ and ex-situ) and the SFR history of galaxies bound to their dark matter haloes in the simulation are predicted.However, despite the empirical nature of the approach, it contains some underlying assumptions, for example, that the IMF (Initial Mass Function) is universal.Remarkably, after introducing observed systematic and statistical uncertainties in the galaxy-halo connection modelling, the integration in time of the SFR histories of galaxies in the mock catalogues is fully consistent with the evolution of the galaxy stellar mass function, not being necessary to invoke a -and mass-dependent IMF.
Figure 1 shows the predicted stellar-to-halo mass ratio at different redshifts, where this ratio quantifies the time-integrated efficiency of star formation in halos.As seen, the stellar-to-halo mass ratio is on average independent of redshift.In detail, however, it peaks around  halo ∼ 10 12.0 M ⊙ at  ∼ 0.25, increases slightly with  up to  halo ∼ 10 12.5 at  ∼ 3, and then it decreases to smaller masses at even higher redshifts.At masses below the peak, the role of star formation feedback is supposed to be crucial, while at higher masses, AGN feedback and virial-shocked gas cooling are supposed to be relevant.

Selection of Star-Forming Galaxies
For the purposes of this work, it is necessary to make a selection of galaxies that are actively forming stars.To achieve this, we initially calculate the specific star formation rate (sSFR = SFR/ ★ , a measure of the star formation per unity mass) for each galaxy within our mock catalogue.Subsequently, we apply the criterion of Pacifici et al. (2016) to preselect star-forming galaxies.This criterion parametrizes the redshift evolution of the sSFR threshold between star-forming and quiescent galaxies as: where   () is the age of the Universe in Gyr at redshift  and sSFR min is in Gyr −1 .We calculate sSFR min for each galaxy in our mock catalogue and compare this value with its intrinsic sSFR.If sSFR > sSFR min the object is selected as a candidate star-forming galaxy.We then group our mock galaxies in redshift and stellar mass bins, calculate the mean and standard deviation of the sSFR per bin, and make a linear fit to these values.The selected starforming galaxies will be those that are above the fit minus 0.5 dex.For a better selection, the mean values are recalculated using this last selected subset of galaxies, and a new linear fit is performed.This sigma clipping process continues until the linear fit converges and no longer exhibits significant variation (Δ < 0.05).The last fit defines the main sequence of our star-forming galaxies.
Figure 2 shows the sSFR- ★ plane for four different redshift bins, and the corresponding main sequences obtained following the procedure described above.We also include a projected density map (number of galaxies per bins of stellar mass and sSFR) corresponding to the galaxies selected as star-forming.For comparison, we show the main sequences by: Speagle et al. (2014), where the authors investigate its evolution in terms of  ★ , SFR and time (up to  ∼ 6), using a compilation of 25 studies from the literature; Popesso et al. (2022), who followed the same procedure as Speagle et al. (2014) but with a more extensive collection of studies including SFR indicators from the UV, mid-infrared, and far-infrared; Cardona-Torres et al. ( 2023) who used the HST-CANDELS data to constrain the main sequence in the Extended Groth Strip and characterize a sample of faint SMGs selected at 450 and 850 m from the SCUBA-2 Cosmology Legacy Survey.Our predictions are in good agreement with the main sequences by Speagle et al. (2014) and Popesso et al. (2022), although the latter one drops at  ★ > 10 11 M ⊙ , a region that is not well constrained by our mock catalogue.Our main sequence is steeper at  = 0.3 − 0.6 than that of Cardona-Torres et al. ( 2023), consistent at  = 1.2 − 1.5, and slightly shallower at  = 2.5 − 3.0.

Infrared Properties of the Galaxies
In this subsection we describe the methodology followed to establish the infrared emission of the galaxies.Throughout the paper we adopt a Chabrier (2003) IMF.

Dust-obscured SFR and Infrared Luminosities
To determine the fraction of SFR enshrouded by dust (  obs = SFR IR /SFR) for galaxies at 0.5 <  < 2.5, we adopt the empirical relation by Whitaker et al. (2017), who derived  obs for a mass complete sample of star-forming galaxies with  ★ ∼ 10 8.7 -10 11 M ⊙ , selected from the 3D-HST treasury program and Spitzer/MIPS 24 m maps in the CANDELS fields, finding a strong dependence between  obs and  ★ and a negligible evolution with .For  > 2.5 galaxies we use the results from Dunlop et al. (2017), where the authors combined ALMA 1.3 mm observations of the Hubble Ultra Deep Field and HST data in the UV to trace the IR and total SFRD as a function of .We make a linear fit to their SFRD and SFRD IR and estimate  obs as the ratio between both fits (SFRD IR /SFRD).This is later scaled to match the  obs predictions from Whitaker et al. (2017) at  = 2.5 for the different stellar masses.Finally, for galaxies at  ≤ 0.5 we extrapolate the relation of Whitaker et al. (2017), with an upper limit of  obs = 0.75 for  ★ > 10 10 M ⊙ galaxies, motivated by the analytical approximation from Rodríguez-Puebla et al. (2020).The resulting recipe to estimate  obs is: (5) where  = (1.96± 0.14)×10 9 and  = -2.277± 0.007.We use  obs from equations ( 3)-( 6) to estimate SFR IR (=  obs • SFR) and derive infrared luminosities for our galaxies following the relation of Kennicutt (1998) scaled for a Chabrier (2003) We impose a 2000 M ⊙ yr −1 upper limit to the obscured SFRs.While in large area surveys there are galaxies with apparent flux densities that suggest SFRs larger than this value, follow-up observations have confirmed they are gravitationally amplified systems (e.g., Zavala et al. 2018;Vieira et al. 2010;Negrello et al. 2010;Vieira et al. 2013;Harrington et al. 2017).The intrinsically brightest galaxies found in these surveys have equivalent SFRs close to the limit we adopt (e.g., Karim et al. 2013;Barger et al. 2014).
Table 1 summarizes the general properties of our mock redshift survey.

Dust Temperature
Once  IR has been estimated, the corresponding dust temperature ( dust ) of the galaxies can be inferred.Casey et al. (2018) used a compilation of dusty star-forming galaxies to model the relation between the intrinsic IR luminosity of galaxies and the observed peak of their SED: where  0 = 102.8± 0.4 m,  t = 10 12 L ⊙ and  = −0.068± 0.001.We follow the above relation to associate the intrinsic  IR of our galaxies with their corresponding  peak and, finally, estimate their  ′ dust using an approximation to Wien's displacement law: The  ′ dust obtained with equation ( 9) must be corrected for heating effects due to the Cosmic Microwave Background (CMB).The temperature of the CMB at  > 4 approaches and can exceed the temperature of the dust grains at z = 0 in the galaxies, rising their temperature.This increase in  ′ dust boosts the dust continuum emission which implies that the flux densities of galaxies will be higher than without considering this effect.Following da Cunha et al. (2013) the correction in the temperature of the galaxies is given by: where  ′ dust is the dust temperature of the galaxy in absence of the CMB,  CMB = 2.725 K is the CMB temperature at  = 0 and   = 1.8 is the dust spectral emissivity index (value adopted from Casey et al. 2018).

Observed Flux Density
To estimate the observed flux density (  ) of our mock galaxies we assume that their spectral energy distribution (SED) is well modeled by a modified black body given by:

This work
where () = (/ 0 )  E is the optical depth,  0 = 3 THz is the frequency at which the modified black body changes from optically thick to optically thin and   () is the Planck function.The observed flux density of our galaxies at frequency  is estimated through the relation: where   is the intrinsic SED of galaxies (equation 11),  IR is their infrared luminosity,  is their redshift and  L is the luminosity distance.The total infrared luminosity is defined as the integral of the SED from 8 to 1000 m.

Gravitational Lensing
To introduce gravitational lensing effects in our mock redshift survey, we follow Narayan & Bartelmann (1996) and consider amplifications produced by dark-matter haloes in the line of sight (under a pointmass model approximation) on the background galaxies.
In order to identify halo-galaxy alignments, we set a search radius  b = 5 arcsec, with origin at the position of each halo.We find that background sources at angular distances larger than this value will have amplifications smaller than 20 per cent for the most massive haloes ( halo > 10 14 M ⊙ ), and therefore do not have a strong impact in our results.For each halo with identified background galaxies, we can calculate its corresponding Einstein radius assuming a pointmass approximation (Narayan & Bartelmann 1996): with  the gravitational constant,  the speed of the light, and  d ,  ds and  s the angular distances between observer and lens, lens and source, and observer and source, respectively.The angular separation () of the lensed images is given by: with  l being the angular separation between the optical axis between the observer and the lens, and the true source position.Combining equations ( 13) and ( 14), the magnification of the image is: with the total magnification given by  = | + | + | − |.

TESTS OF THE MOCK CATALOGUE
In order to characterize the performance of our mock redshift survey, we measure the predicted cumulative number counts, star-formation rate density, total infrared luminosity function and redshift distribution, and compare them with observational results.

Number Counts
Figure 3 shows the number counts estimated from our 5.Given the current uncertainties and the dispersion between the different observational results, our mock catalogue accurately reproduces the measured number counts.At the faint end ( 1.1 ≲ 0.1 mJy), our predictions fall below the 1.2 mm ALMA results from Fujimoto et al. (2015).Their measurements, however, were derived from 120 pointings (mainly archival data) towards different astronomical sources, potentially biasing their results.Furthermore, the surveyed area available to estimate the number counts at  1.1 ≲ 0.1 mJy was ∼ 2 sq.arcmin, which may introduce sampling variance uncertainties.On the other hand, at  1.1 ∼ 0.1 mJy, the 1.2 mm ALMA number counts from González-López et al. ( 2020 2022) identified faint sources around ALMA calibrator maps (ALMACAL project).Both studies are based on the analysis of small-area individual maps, and are also potentially affected by sampling variance effects3 .
The bright end of the number counts has been mainly traced by single-dish observations, e.g. the 5 sq.degreeSCUBA-2 Cosmology Legacy Survey including seven independent fields (0.07 -2.22 sq.deg; Geach et al. 2017), the 1.6 sq.degree AzTEC JCMT-ASTE survey of seven blank-fields (0.08 -0.72 sq.deg) and 3.1 sq.degree towards 37 potentially overdense regions (Scott et al. 2012), and the recent 0.32 sq.degree NIKA2 Cosmological Legacy Survey including GOODS-N and COSMOS.Interferometric observations have explored this bright end through follow-up observations of the brightest sources detected by single-dish telescopes.For example, Simpson et al. (2015) presented 870m ALMA pointed observations of 30 galaxies selected at 850m from the SCUBA-2 Cosmology Legacy Survey, finding that 61 per cent of the selected sources were the result of blending two or more fainter DSFGs.Karim et al. (2013) reached a similar conclusion through 870m ALMA follow-up observations of 122 SMGs selected from the Large Apex BOlometer CAmera (LABOCA) Extended Chandra Deep Field South submillimetre survey (LESS), with all of the brightest sources ( 870 ≳ 12 mJy) comprising emission from multiple fainter galaxies.At the bright end ( 1.1 ≳ 5 mJy), our mock catalogue starts to deviate from the general trend of the observed number counts and from our predictions if no gravitational lensing effects are introduced.This transition therefore indicates where gravitational lensing effects starts to have a strong impact on the measured number counts, although a small contribution from intrinsically bright galaxies is also expected given the larger area of our mock redshift survey compared to the observational measurements (i.e., observing larger areas increases the probability of finding brighter sources and strongly lensed galaxies; Clements et al. 2010).The bottom panel of Figure 3 shows the ratio between our mock redshift survey with and without lensing: at  1.1 > 5 mJy, the number counts start to be dominated by gravitationally lensed galaxies, which can increase the un-lensed number counts by a factor of up to 26 at  1.1 ∼ 10 mJy.
The number counts from our mock redshift survey are also in good agreement with those from SIDES (Béthermin et al. 2017) at  1.1 ≳ 0.05 mJy.The difference at lower flux densities can be the result of the different mass resolution between the mock catalogues: SIDES includes galaxies with  ★ ≥ 10 8 M ⊙ whereas our new redshift survey has a resolution of  ★ ≥ 10 8.75 M ⊙ .It is worth noting that while the Bolshoi-Planck simulation resolves haloes down to  peak ∼ 60 km/s, which corresponds to haloes of  ★ ∼ 4 × 10 7 M ⊙ , the scatter (the product of intrinsic evolution plus random errors) around the  ★ − peak relation allows for a conservative completeness in stellar mass terms down to  ★ ≥ 10 8.75 M ⊙ .
In order to explore the relative contribution of galaxies with different stellar masses to the overall population of DSFGs, Figure 4 shows the 1.1, 1.4, and 2.0 mm number counts measured from our mock redshift survey for different  ★ bins.As expected, a general trend is seen for the three wavelengths, with the faint end (e.g. 1.1 ≲ 0.1 mJy) being dominated by the lower mass log( ★ /M ⊙ ) = 9.5 − 10.0 population, and more massive galaxies increasingly contributing towards larger flux densities.Interestingly, the interval  1.1 ∼ 1 − 10 mJy, which has been mainly probed by typical single-dish surveys, has an almost equivalent contribution by the three stellar-mass bins in the range log( ★ /M ⊙ ) = 9.5 − 11.0, with a small contribution from log( ★ /M ⊙ ) = 11.0 − 11.5 galaxies and the most massive log( ★ /M ⊙ ) > 11.5 population having a negligible contribution to the number counts at all flux densities.Finally, Figure 4 indicates that the  1.1 ∼ 0.1 mJy detection limit of the TolTEC Ultra Deep Legacy Survey will allow us to probe the relatively low mass (9.5 ≲ log( ★ /M ⊙ ) ≲ 10) population of DSFG, and study their properties in a sample of a few tens of thousands of galaxies.

SFR History
The total SFRD (IR+UV) as a function of  measured from our mock redshift survey is presented in Figure 5, showing the relative contributions from the IR and UV (including the effects of dust attenuation) SFR.Our predictions are consistence with different observational results and models from the literature (e.g., Zavala et al. 2021), in particular with the fact that the dust-obscured star formation activity dominates the star formation history of the Universe for the last ∼ 12.4 Gyr (i.e.since  ≈ 4.5).For comparison, Figure 5

Redshift Distribution
To test the redshift distribution of DSFGs predicted by our mock catalogue, we reproduce, in terms of depth and total area, three recent millimeter wavelength surveys that probe the faint and bright population of DSFG: Brisbin et al. (2017), Gruppioni et al. (2020), andFranco et al. (2020).For each of these surveys, we generate ten independent realizations and estimate the average redshift distribution.Figure 6 presents our results and the comparison is discussed below.As an additional reference, the redshift distribution measured on the complete 5.3 sq.degree mock redshift survey and using the total area of SIDES are also shown.
As it has been mentioned, Gruppioni et al. (2020) studied a sample of 56 serendipitously detected sources in the ALPINE survey of high redshift ( = 4.4 − 5.9) galaxies.ALPINE reached a sensitivity equivalent to that expected for the TolTEC UDS survey ( 1.1 ∼ 0.1 mJy), and the total area where serendipitous sources were detected (i.e.excluding the central region of the high- target) corresponds to 25 sq.arcmin.Gruppioni et al. (2020) estimated a median redshift of  med = 2.3 ± 0.3 for their sample, and our mock redshift surveys predict  med = 2.41±0.12(a similar value is measured over the whole 5.3 sq.deg mock redshift survey).While these median redshifts are in good agreement, it should be noted that the results from Gruppioni et al. (2020) may be biased by groups of galaxies gravitationally lensing the ALPINE targets or sources associated to these high- systems.These potential biases, combined with sampling variance effects due to the small survey area, can impact the overall shape of the observed redshift distribution.On the other hand, the SIDES simulation predicts a larger number of low- DSFGs resulting in a lower median redshift of  med = 2.1.
In the case of Franco et al. (2020), the authors took advantage of 3.6 and 4.5 m Spitzer/IRAC and 3 GHz Very Large Array data to extend the catalogue of sources from the 69 sq.arcmin GOODS-ALMA survey (Franco et al. 2018), reducing the detection limit down to  1.1 ≥ 0.63 mJy.The final catalogue includes 35 DSFGs with photometric and spectroscopic redshifts in the range  = 0.65 − 4.73.Interestingly, including the fainter population, identified by means of the near-IR and radio data, decreased the median redshift of the sample from  med = 2.73 to  med = 2.40.In our case, the mean distribution of our mock redshift surveys has a median value of  med = 2.67 ± 0.11, and  med = 2.74 for the complete mock redshift survey.Our predictions are therefore slightly higher but still consistent with the observational results, particularly with the original catalogue of  1.1 ≥ 0.88 mJy ALMA sources.The SIDES mock catalogue, on the other hand, shows a larger population of lower redshift galaxies and predicts  med = 2.44, in better agreement with the extended catalogue of Franco et al. (2020).As in the case of Gruppioni et al. (2020), the redshift distribution shows sensitivity to variations around the peak value due to the small map sizes, where the cosmic variance dominates.
Finally, Brisbin et al. (2017) presented ALMA 1.25 mm follow-up observations of 129 sources detected with  1.1 ≳ 3.5 mJy in the 0.72 sq.degree AzTEC/ASTE maps of COSMOS (FWHM ∼ 30 arcsecond).The high sensitivity (1 1.25 ∼ 0.15 mJy) and angular resolution of the ALMA observations resulted in the detection of 152 galaxies, 143 of them with either spectroscopic or accurate photometric redshifts, and with a median redshift  med = 2.48 ± 0.05.
Our mock redshift survey predicts a considerably higher value of  med = 2.95 ± 0.09 while the SIDES mock catalogue predicts a closer value of  med = 2.66.

Infrared luminosity function
To measure the infrared luminosity function in our mock redshift survey we directly count the number of galaxies in  IR and  bins.Figure 7 shows the results compared to different observational studies.We find that our mock redshift survey predicts more luminous galaxies than the infrared luminosity function from Koprowski et al. (2017), which is based on the ∼1.5 sq.degree SCUBA-2 Cosmology Legacy Survey (Geach et al. 2017) and the deep 1.3 mm ALMA observations of the HUDF (covering 4.5 arcmin 2 ; Dunlop et al. 2017) to constrain the faint end.Gruppioni & Pozzi (2019), however, suggest that the single SED template used by Koprowski et al. (2017) to estimate luminosities may have biased their results.Furthermore, the 850 m SCUBA-2 observations are less sensitive to 'warm' galaxies, a population that tends to dominate the bright end of the luminosity function, as suggested by 70, 100 and 160 m Herschel observations (Gruppioni et al. 2013).On the other hand, compared to our predictions, the luminosity function derived from the analysis of serendipitous detections around the ALPINE targets (Gruppioni et al. 2020), shows a systematic excess of luminous galaxies ( IR ≥ 10 12 L ⊙ ) through all redshift bins.Towards fainter luminosities, however, their curve flattens and starts falling below our predictions, particularly at  < 2.5.Finally, Rodríguez-Puebla et al. ( 2020) used a compilation of several observational results at  < 4.2 (see Table 2 of that work) to constrain analytical models of the infrared luminosity function.Their compilation includes data from Spitzer (24 and 70m), AKARI (90m), Herschel (70, 100, 160, 250, 350 and 500m), and SCUBA-2 (450m).The luminosity function predicted by our mock redshift survey is in very good agreement, given the uncertainties and dispersion of the data, with this multi-wavelength compilation.
To better constrain the evolution of the luminosity function, larger statistical samples are still required, particularly at the higher redshifts (i.e. > 2.5).In §4.3 we take advantage of our mock redshift survey to make predictions on the overall improvement that the TolTEC UDS will represent in our understanding of the evolution of the luminosity function.

Clustering characterization
To characterize the clustering properties of the DSFG in our mock redshift survey we estimate their angular two-point auto-correlation function ().This statistical tool measures the excess probability of finding a pair of galaxies separated by an angular scale , compared to a uniform non clustered distribution.To compute  we use the numeric estimator presented in Landy & Szalay (1993): where ,  and   are the number of galaxy pairs in the clustered, uniform and cross-matched distributions respectively.To improve the efficiency of the calculations, considerably reducing computing time, the algorithm of He ( 2021) was adapted to analytically estimate .We also include integral constraint corrections (Roche & Eales 1999), which are particularly important to consider when  is measured at angular scales of order of the map size. ★ > 10 10.5 M ⊙ ) identified in the 2 sq.degree COSMOS field using a machine-learning technique trained with deep SCUBA-2 and ALMA data.Due to the large mapped area, sample size, and available optical/IR photometric redshifts, this currently represents the most statistical robust clustering measurements of the DSFG population.Figure 8 compares the angular two-point auto-correlation functions measured by An et al. (2019) in the  = 2 − 3 redshift bin, and that of  ★ > 10 10.5 M ⊙ sources in a 2 sq.degree area of our mock redshift survey.The uncertainties in our measurements are estimated through a bootstrap analysis of 50 maps drawn from the larger 5.3 sq.degree mock redshift survey, therefore taking into account cosmic variance effects due to the finite size of the survey area.As a complementary comparison, we estimate the clustering of DSFG from the SIDES simulation, using 1 sq.degree maps for the bootstrap analysis due to the smaller size of the simulation.
Figure 8 shows that the clustering properties of the DSFG in our mock redshift survey are in good agreement with those measured for the SMGs and SFGs identified in the COSMOS field.We further fit a power-law of the form () =  −0.8 to the different data sets, and find equivalent clustering amplitudes (within the uncertainties) between them:  = 4.38 ± 0.67 for our mock catalogue, and  = 4.66 ± 0.59 and 5.52 ± 0.38 for the SFGs and SMGs samples in An et al. (2019).As mentioned above, our error bars are larger since they are estimated from 50 realizations of 2 sq.degree maps and hence consider cosmic variance.A similar agreement is found for the other two redshift bins ( = 1 − 2 and  = 3 − 5) presented in An et al. (2019), although affected by smaller statistics at the higher redshifts.

PREDICTIONS FOR THE UDS/TOLTEC SURVEY
In this section we take advantage of the mock redshift survey developed in §2 to make predictions of the constrains that the TolTEC UDS will make on the number counts, infrared luminosity function, and redshift distribution of the DSFG population.The UDS will consist of deep (1 1.1 ≈ 0.025 mJy) observations on a larger 0.5 sq.degree area of COSMOS and smaller (≳ 200 arcmin 2 ) maps towards other CANDELS fields.We adopt these two map sizes and, for each area, generate 10 independent realizations in order to provide mean values and sampling variance for each observable.Considering a 4 detection limit, the UDS will be sensitive to the faint end ( 1.1 ∼ 0.1 mJy) of the number counts, probing a regime that has only been sampled by interferometers (e.g.Fujimoto et al. 2015;Hatsukade et al. 2016;González-López et al. 2020;Zavala et al. 2021;Chen et al. 2022).Although these observations benefit from higher angular resolutions, and therefore a reduced confusion noise that makes them sensitive to a fainter population, their relatively small areas (≲ 100 sq.arcminutes) have limited the studied samples to ≲ 150 sources.Most of these surveys are therefore potentially affected by sampling variance effects.With a ∼ 30 times larger area, the UDS will detect thousands to tens of thousands of DSFGs with  1.1 = 0.1 − 1.0 mJy, allowing to impose strong constrains in this faint regime of the number counts.

UDS/TolTEC number counts
On the other hand, although the total area of the UDS will not be able to impose statistically robust constrains on the brightest and more extreme population of SMGs ( 1.1 ≳ 10 mJy), it will sample and constrain the range of flux densities probed by typical single-dish surveys ( 1.1 ∼ 1 mJy).Furthermore, the improved angular resolution (FWHM 1.1 ≈ 5 arcsecond) of the TolTEC observations will reduce the impact of source-blending in the measured number counts, an effect that has been confirmed to bias high the measurements from lower angular resolution (FWHM> 10 arcsecond) single-dish observations (e.g.Karim et al. 2013;Brisbin et al. 2017;Barger et al. 2022).The UDS will therefore provide an important link between the faintest population of DSFGs probed by the deepest small-area interferometric surveys (e.g.Fujimoto et al. 2015;González-López et al. 2020), and the brighter, gravitationally lensed SMGs identified in larger-area but lower angular resolution single-dish observations (particularly at the longer wavelengths, e.g.Vieira et al. 2010).

The cosmic variance in the UDS/TolTEC surveys
In Figure 10 we present the number counts at 1.1 mm for ten independent areas of 200 arcmin 2 and 0.5 deg 2 (representative of the smaller and larger areas considered for the UDS) along with the number counts for the total area of the mock redshift survey.We also add the residual fraction ( N ) obtained by subtracting the total number counts from each realization and normalizing by the total.Our mock surveys predict that, for the smaller 200 arcmin 2 maps, cosmic variance can introduce small relative variations of  N < 0.3 at  1.1 < 2 mJy, systematically increasing with flux density and reaching  N ∼ 2 at 2 mJy < S 1.1 < 7 mJy.For higher flux densities the variation can be more significant,  N ∼ 45.In the case of the 0.5 sq.degree map, the The shaded area represent the 1 dispersion from ten realizations, and the vertical lines indicate the expected 4 detection limit of the UDS.As an additional reference, we include the total number counts of the mock redshift survey (black solid line).The empty squares and circles are observations from interferometric and single-dish telescopes, respectively.The solid circles correspond to SPT measurements, known to be dominated by a gravitationally lensed population of galaxies (Vieira et al. 2010).TolTEC/UDS will place strong constrain to the number counts between  1.1 ∼ 0.1 to a few mJy, linking the results from large-area single-dish surveys sensitive to the rare and brighter SMGs down to the fainter population of DSFGs probed by small-area deep interferometic observations.cosmic variance starts to noticeable affect the counts ( N ∼ 1) at ∼10 mJy, reaching  N ∼ 4.3 for the highest flux.

UDS/TolTEC Luminosity Function
Similar to the number counts, the different areas and depths of the TolTEC extragalactic legacy surveys will allow us to study different regions of the luminosity function.In particular, the UDS has an infrared luminosity lower limit of ∼ 10 11 L ⊙ , enabling the detection of Luminous Infrared Galaxies (LIRGs), with SFRs of dozens of M ⊙ yr −1 .Figure 11 shows our predictions of the luminosity function we expect to be sensitive to in the two characteristic areas of the UDS.The results are divided in four redshift bins, spanning from  = 0.5 to  = 4.5, and are compared against the same data sets included in Figure 7.The UDS will therefore constrain very well (i.e. with little cosmic variance) the shape of the luminosity function in the less luminous regime, addressing some discrepancies noted in previous studies (Koprowski et al. 2017;Gruppioni et al. 2020;Rodríguez-Puebla et al. 2020).At  IR > 4 × 10 12 L ⊙ , the 200 arcmin 2 area exhibits more variance than the 0.5 deg 2 map, as it becomes challenging to capture brighter galaxies.Nevertheless, the 0.5 deg 2 area will provide more reliable luminosity function estimates up to ∼  IR ∼ 10 13 L ⊙ .While the UDS survey will contribute to probe part of this brighter end of the LF, its capabilities remain limited in this regard.More precise insights into this region of the luminosity function will be obtained with the shallower (1 1.1 ∼ 0.25 mJy) but ∼ 75 times wider LSS/TolTEC surveys.

UDS/TolTEC redshift distribution
Our predictions of the DSFG redshift distribution expected in the larger 0.5 sq.degree area of the UDS are presented in Figure 12.To explore the dependence of the redshift distribution with flux density, the population of DSFGs in our mock redshift survey is divided in two flux density intervals, a fainter sample with 0.1 <  1.1 [mJy] < 1.0 and a brighter sample with  1.1 ≥ 1.0 mJy.A clear difference between the redshift distributions can be seen, with the fainter population dominating at lower redshifts ( med = 2.16 ± 0.01) and brighter sources being more abundant at higher redshifts ( med = 2.81±0.01).This behavior, which is present in some of the observational results discussed in §3.3, can be attributed to the known cosmic downsizing.That is, the fact that more massive galaxies form earlier and faster than less massive ones (see e.g., Firmani & Avila-Reese 2010).
The area and depth of the UDS, combined with the ∼ 5 arcsecond angular resolution of TolTEC on the 50-LMT and deep multiwavelength data available in the targeted fields, will result in a large sample of ∼ 10, 000 galaxies with robust Optical/IR counterparts and photometric redshifts.This will allow us to measure the redshift distribution of DSFGs, and study its dependence with different physical properties, at a level not yet possible with current data sets.The colored shadow areas correspond to the 1 error for the mean.The results for 5.3deg 2 is also presented (black solid line).The different areas with different depths allow us to study the luminosity function in a wide interval of luminosity.We compare our luminosity function with previous results (Koprowski et al. 2017;Gruppioni et al. 2020;Rodríguez-Puebla et al. 2020) mark as different colored symbols (brown squares, yellow circles and green triangles).The discontinuous lines with the same colors than the symbols are fits to the respective data.The vertical line indicates the detection threshold in luminosity for the UDS/TolTEC survey.

THE IMPACT OF GALAXY CLUSTERING ON FLUX BOOSTING
The measured flux densities of sources in a real observation can sometimes be overestimated due to several factors, including the contribution from faint unresolved sources and the influence of neighbouring sources that may be blended together within the telescope beam.This effect, commonly referred to as 'flux boosting', is known to be stronger in sources detected at low signal-to-noise ratio (S/N), as discussed in previous studies (e.g.Hogg & Turner 1998).Most studies rely on simulations to estimate and correct for flux boosting effects (e.g.Zavala et al. 2017;Geach et al. 2017;Smail et al. 2021).
Most of these simulations, however, randomly distribute sources on noise maps without taking into account any clustering information, which could potentially bias the flux boosting of sources.
Taking advantage of the clustering information in our mock redshift survey, we investigate its impact on the estimation of flux boosting.We measure the effect of flux boosting at the three TolTEC wavelengths (1.1, 1.4 and 2.0 mm), adopting the corresponding angular resolution (FWHM = 5.0, 6.3 and 9.5 arcsec) and expected r.m.s.noise level ( r.m.s.= 0.025, 0.018, and 0.012 mJy) of the UDS survey.We then compare our results to a mock redshift survey with randomly distributed sources.
We categorize the galaxies in our mock redshift survey into two distinct groups.The first group, referred to as 'candidates', comprises galaxies with flux densities exceeding the r.m.s.noise level in the UDS (i.e.,  1.1 > 0.025 mJy,  1.4 > 0.018 mJy,  2.0 > 0.012 mJy).Fainter galaxies, unlikely to be individually detected but still contributing to the background noise, are denoted as 'boosting sources' and are included in the second group.
Each candidate source, with intrinsic flux density  instrinsic , is convolved with a Gaussian telescope beam in order to incorporate the flux density contribution from surrounding fainter sources.The resulting flux density is denoted as  blend .The noise contribution The resulting 'synthetic' flux density of each source is given by  synthetic =  blend +  noise .Finally, the boosting factor is defined as the ratio between the synthetic and the intrinsic flux densities ( synthetic / intrinsic ).The same methodology is applied to the mock redshift survey with the galaxies randomly distributed (i.e.removing clustering information).Figure 13 shows the median boosting factor, with 1 errors estimated through a bootstraping analysis, as a function of synthetic flux density ( synthetic ) and signal-to-noise ration (S/N).Additionally to model the behavior of the boosting factor in the mock redshift survey we fit to our data a power law following Geach et al. (2017) with the form: with the fitted parameters for each data set given in Table 2.As expected, fainter galaxies (i.e.S/N ∼ 4) are more likely to be strongly boosted by unresolved neighbour sources, with median boosting factors of 1.14 +0.41  −0.23 , 1.20 +0.50 −0.27 and 1.26 +0.63 −0.30 for the clustering map for the three wavelengths.The median values of the boosting factors are very similar for both clustered and non-clustered populations at 1.1 mm and 1.4 mm.At 2.0 mm, we observe departures at S/N > 7. The larger beam size allows for a larger chance of blending sources.This is the dominant effect in the abrupt departures from the non-clustering solution seen at S/N > 60 (i.e. 2.0 > 0.72 mJy, see Figure 13), where several sources of similar brightness are blended within the same beam.At the largest S/N there are no differences in the median values, since the surface density of bright galaxies is smaller.
The errors of the median values of the boosting factors are systematically larger when clustering is considered, specially for the larger S/N values.This indicates the high uncertainty to correct for boosting individual high S/N sources.This is in contrast with the findings of previous studies (Zavala et al. 2017;Geach et al. 2017;Smail et al. 2021), where only faint galaxies were observed to be significantly affected by boosting in maps without clustering.
To quantify our results, Figure 13 shows the ratio between the boosting factors estimated for a non-clustered and a clustered population of DSFGs.On average, clustering increases the boosting factor by 0.5±0.1 per cent at 1.1 mm, 0.8±0.1 per cent at 1.4 mm, and 1.6±0.2 per cent at 2.0 mm.Despite being a relatively small correction, including clustering in the flux boosting determinations will systematically improve the flux density measurements, particularly for the lower angular resolution observations.

2.0mm
Ratio between errors Figure 13.Left: Median boosting factor vs signal-to-noise ratio for an area of 1 deg 2 obtained at 1.1, 1.4 and 2.0 mm (top, middle, and bottom panels).The magenta dots represent the median boosting factor measured in a map including galaxy clustering, while the blue dots correspond to the median boosting factor when sources are assigned random positions.The pink and blue solid curves are a fits to the data with and without clustering.The vertical bars represent the 1 error in the medians.Middle: ratio between the median boosting factor values without and with clustering of the DSFG population.The red horizontal lines represent the median value of the ratios, with the shaded area representing the 1 error of the median.Right: ratio between positive errors in the median boosting factor.

SUMMARY AND CONCLUSIONS
In this paper, we present a new cosmological mock redshift survey of the DSFG population based on a lightcone built from the Bolshoi-Planck dark matter halo simulation (Klypin et al. 2016;Rodríguez-Puebla et al. 2016b).The mock redshift survey covers a total area of 5.3 sq.degree and a redshift range  = 0 − 7. The merger trees of dark matter haloes are used in order to model the mass assembly and star formation histories of galaxies based on the updated version of semi-empirical model by Rodríguez-Puebla et al. (2017).We used a sigma clipping process to define a star forming main sequence from our mock survey and found that it reproduces previous determina-tions from the literature (Speagle et al. 2014;Popesso et al. 2022;Cardona-Torres et al. 2023).The infrared properties of the galaxies are assigned based on empirical determinations of the dust-obscured fraction of SFR (Whitaker et al. 2017;Rodríguez-Puebla et al. 2020) and the SFRD (Dunlop et al. 2017), the Kennicutt (1998) relation between SFR and  IR , the relation between the observed wavelength peak,  peak , of the IR SED and  IR (Casey et al. 2018) and the Wien's displacement law to obtain the dust temperature (along with the assumption of modified black body for the SED), which are corrected by heating effects due to the CMB (da Cunha et al. 2013).The alignment between dark matter haloes and background galaxies is used to incorporate gravitational lensing magnifications (Narayan & Bartelmann 1996).
This new mock redshift survey accurately reproduces the mm wavelength number counts, from the faint end ( 1.1 ∼ 0.1 mJy) probed by small area interferometric observations to the bright end ( 1.1 ∼ 10 mJy), mostly explored by larger single-dish surveys.Although the total area of the mock redshift survey is not large enough to include a large sample of the brighter and more extreme SMGs ( 1.1 > 10.0 mJy), it does reproduce the expected flattening (at  1.1 ≳ 8 mJy) due to the strongly gravitationally lensed population of galaxies.Furthermore, although the evolution of the IR luminosity functions is still not well constrained by observations, particularly at  > 2.5, our mock redshift survey predicts a luminosity function that falls well within the uncertainties and dispersion of the different observational results, from  = 0.5 to 4.5 and over two orders of magnitude (log[ IR /L ⊙ ] = 11 − 13).
Additionally, our mock redshift survey reproduces the total and IR measurements of the SFR density from  = 0−7.Compared to recent studies that have estimated the redshift distribution of faint DSFGs ( 1.1 ≳ 0.2 mJy), our mock redshift survey systematically predicts slightly (≲ 10 per cent) higher median redshift values, suggesting that we might be missing a small fraction of faint low- DSFGs or overestimating the high- population.It should be noted, however, that most of these observational results are based on relatively small (< tens of sq.arcminutes) interferometric surveys, and can therefore be affected by cosmic variance effects.Our mock redshift survey further predicts the effect of cosmic downsizing, with brighter DSFGs being more abundant in the past and a fainter population dominating at lower redshifts.
It should be noted that the predictions of our mock redshift survey for the fainter population of DSFGs (i.e. 1.1 ≲ 0.05 mJy and  IR ≲ 10 11 L ⊙ ) are potentially underestimated and limited by the mass resolution of the dark matter cosmological simulation (1.55 × 10 8 ℎ −1 M ⊙ ).Similarly, the brighter population (i.e. 1.1 ≳ 10 mJy and  IR ≳ 10 13 L ⊙ ) is poorly sampled by the total area of the survey (5.3 sq.degree).A larger mock redshift survey ( ∼ 100 sq.degree) to probe this brighter population and its clustering properties is under development and will be presented in a forthcoming paper.
Mock redshift surveys like the one presented in this work, are an essential tool for the design of future extragalactic surveys and to predict the different properties of the population of galaxies that they will probe.Furthermore, they can be used to develop and test source extraction techniques and data analysis tools, characterize different biases affecting real observations, estimate the expected confusion noise, between others.In this work we take advantage of our mock redshift survey to make predictions for the TolTEC Ultra Deep Survey ( ≈ 0.8 sq.degree, 1 1.1 ≈ 0.025 mJy) and estimate the impact of clustering in flux boosting determinations.Below, we summarize our main results regarding these questions: • The UDS will detect ∼ 24, 000 DSFGs above a 4 threshold (i.e. 1.1 ≳ 0.1 mJy).This sample will result in a comprehensive view of the 1.1, 1.4, and 2.0 mm number counts between  1.1 ∼ 0.1 and 30 mJy.The  1.1 ∼ 0.1 − 1.0 mJy regime will be strongly constrained, providing a robust connection between deep interferometric observations and the wider but shallower single-dish surveys.Towards the brighter end, cosmic variance will have a ∼ 20 per cent impact on the counts at  1.1 ∼ 5 mJy, increasing to > 40 per cent for  1.1 > 10 mJy.
• The UDS sample will be dominated by relatively low mass DSFGs with  ★ = 10 9.5 − 10 10.5 M ⊙ , with ∼ 17 per cent of the sample corresponding to  ★ < 10 9.5 M ⊙ galaxies.This will allow us to explore a population an order of magnitude less massive than the typical SMGs identified with previous single-dish surveys.
• The area and sensitivity of the UDS, combined with the ∼ 5 arcsecond angular resolution of TolTEC on the 50-LMT and deep multi-wavelength data available in the UDS fields, will result in a large sample of DSFGs with robust Optical/IR counterparts.This will allow us to estimate accurate physical properties and photometric redshifts, and study the evolution of the IR luminosity function in a wide redshift range (0.5 <  < 4.5), particularly in the LIRG regime (10 11 <  IR /L ⊙ < 10 12 ).Furthermore, the UDS will constrain the redshift distribution of relatively faint DSFGs and its dependence with different physical properties.This will provide important insights on the effect of downsizing in this population of galaxies.
• The median flux boosting of sources detected with a S/N = 4 in the UDS will be 1.14 +0.41  −0.23 , 1.20 +0.50 −0.27 and 1.26 +0.63 −0.30 at 1.1, 1.4, and 2.0 mm.As expected, flux boosting has a smaller impact on brighter sources detected with a higher signal-to-noise, e.g. for 10 sources, the mean boosting factors and their dispersion are reduced to 1.04 +0.15  −0.10 , 1.05 +0.18 −0.10 and 1.08 +0.23 −0.12 .• Including clustering has a relatively minor impact on the determination of flux boosting factors, with no evident dependence on source S/N.Compared to a randomly distributed population, clustering increases flux boosting by 0.5±0.1,0.8±0.1 and 1.6±2.0 per cent at 1.1, 1.4, and 2.0 mm.The relative increase with angular resolution suggests that clustering might have a stronger impact on lower angular resolution observations (e.g.FWHM≳ 10 arcsec).Clustering, however, increases the uncertainties associated to the flux boosting factor determination, and should be propagated to the flux density uncertainties.

Figure 2 .
Figure 2. Main sequences of our mock catalogue (magenta solid line) obtained following the selection of star-forming galaxies presented in section 2.2.The projected density map represents the selected star-forming galaxies.For comparison, we include the main sequences derived by Speagle et al. (2014, orange dashed line), Popesso et al. (2022, green dashed line) and Cardona-Torres et al. (2023, blue dashed line).

Figure 4 .
Figure 4. Number counts at 1.1, 1.4, and 2.0 mm.The black solid line represents our measurement accounting for all the galaxies in the mock redshift survey.The color solid lines show the contributions to the number counts where we separate the galaxies by stellar mass ranges.The vertical lines are the 4 detection limits in the UDS surveys.
includes: the compilation of UV and IR measurements fromMadau & Dickinson (2014), where the UV observations have been corrected for dust attenuation; the total SFRD presented inDunlop et al. (2017), derived combining deep ALMA observations and rest-frame UV data of the 4.5 arcmin 2 HUDF; the results fromKhusanova et al. (2020), who estimated the SFRD at  = 4.5 and 5.5 from the UV-selected galaxies in the ALMA Large Program to INvestigate CII at Early Times (ALPINE); and theGruppioni et al. (2020) SFRD derived from serendipitous detections around the ALPINE targets.Also shown is the total SFRD fromKoprowski et al. (2017), estimated by integrating their measured IR luminosity function and including data fromParsa et al. (2016) to incorporate the UV contribution.

Figure 5 .
Figure 5.Total SFRD measured in our mock redshift survey (black solid line), and its relative contribution from the unobscured UV and dust-obscured IR components (blue and pink solid lines respectively).Our results are in good agreement with the observational data from Madau & Dickinson (2014, pink and cyan squares), Dunlop et al. (2017, olive green squares), Khusanova et al.(2020, yellow squares) andGruppioni et al. (2020, orange squares).We note that the compilation of UV measurements fromMadau & Dickinson (2014) have been corrected for dust attenuation in order to recover the total SFRD.We also plot the SFRD estimate byKoprowski et al. (2017, brown dashed-dotted  line).

Figure 7 .Figure 8 .
Figure 7. Infrared luminosity function measured in our 5.3 deg 2 mock redshift survey (black solid line, with shaded regions indicating the 1 Poisson error).The brown squares represent the sample by Koprowski et al. (2017) and brown dashed line they fit.The yellow circles are the data by Gruppioni et al. (2020) and the yellow dotted line their fit.The green triangles are the compilation data by Rodríguez-Puebla et al. (2020) calibrated to the same cosmology.The Béthermin et al. (2017) simulation is indicated with the gray dashed-dotted line.The vertical line indicates the limit in luminosity that the UDS/TolTEC survey will reach at S/N=4.

Figure 10 .Figure 11 .
Figure 10.Number counts for ten independent areas (colored lines) and their mean number counts for 200 arcmin 2 and 0.5 deg 2 (black dashed line in the left and right panels respectively), showing the effect of cosmic variance.The bottom panels show the residual fraction in the number counts obtained by subtracting the mean number counts from each independent realization and normalizing by the mean value.

Figure 12 .
Figure12.Redshift distribution at 1.1 mm for selected DSFGs in our mock redshift survey in an area of 0.5 sq.degree to be observed as part of the UDS/TolTEC survey.The cyan and golden distributions correspond to galaxies with flux densities of 0.1 <  1.1 [mJy] < 1.0 mJy and  1.1 > 1 mJy respectively.The errors bars represent the 1 dispersion from the ten 0.5 sq.degree map realizations.The median values for each distribution are indicated by the upper vertical lines.The difference between the bright and faint DSFG redshift distribution can be attributed to the known downsizing effect.

Table 1 .
Main properties and range of parameters of the galaxies in our mock redshift survey.
Figure 9. Mean 1.1, 1.4 and 2.0 mm number counts predicted by our mock redshift survey for the areas of the UDS/TolTEC survey, i.e. 200 arcmin 2 (red) and 0.5 deg 2 (blue).

Table 2 .
Parameters fitted to equation (17) to model the boosting factor in maps with and without clustering.