Probing Cosmic Dawn : Ages and Star Formation Histories of Candidate $z\geq$9 Galaxies

We discuss the spectral energy distributions and physical properties of six galaxies whose photometric redshifts suggest they lie beyond a redshift $z\simeq$9. Each was selected on account of a prominent excess seen in the Spitzer/IRAC 4.5$\mu$m band which, for a redshift above $z=9.0$, likely indicates the presence of a rest-frame Balmer break and a stellar component that formed earlier than a redshift $z\simeq10$. In addition to constraining the earlier star formation activity on the basis of fits using stellar population models with BAGPIPES, we have undertaken the necessary, but challenging, follow-up spectroscopy for each candidate using various combinations of Keck/MOSFIRE, VLT/X-shooter, Gemini/FLAMINGOS2 and ALMA. Based on either Lyman-$\alpha$ or [OIII] 88 $\mu$m emission, we determine a convincing redshift of $z$=8.78 for GN-z-10-3 and a likely redshift of $z$=9.28 for the lensed galaxy MACS0416-JD. For GN-z9-1, we conclude the case remains promising for a source beyond $z\simeq$9. Together with earlier spectroscopic data for MACS1149-JD1, our analysis of this enlarged sample provides further support for a cosmic star formation history extending beyond redshifts $z\simeq$10. We use our best-fit stellar population models to reconstruct the past rest-frame UV luminosities of our sources and discuss the implications for tracing earlier progenitors of such systems with the James Webb Space Telescope.


INTRODUCTION
An important goal of contemporary cosmology is to locate the epoch when the universe was first bathed in starlight, popularly referred to as 'cosmic dawn'. According to numerical simulations, dark matter halos sufficiently massive to induce star formation could develop as early as 150-250 Myr after the Big Bang, corresponding to the redshift range 15-20 (Wise et al. 2014;Villanueva-Domingo et al. 2018). With some assumptions about their ionising capabilities, studies based on the demographics of early galaxies (Robertson et al. 2015) have claimed to be compatible with independent constraints on early reionisation based on analyses of the optical depth of electron scattering to the cosmic microwave background (CMB;Planck Collaboration et al. 2018), as well as tentative measures of 21cm absorption at a redshift [16][17][18][19] in the CMB possibly asso-★ E-mail: nl408@cam.ac.uk ciated with UV photons produced by early star formation (Bowman et al. 2018). Recent deep imaging with the Hubble and Spitzer Space Telescopes has extended our redshift horizon to 9-11 (Ellis et al. 2013;Oesch et al. 2018). At this redshift, the universe is less than 600 Myr old and the presence of main sequence stars older than 250 Myr would imply galaxy formation originating before a redshift 14. Such a case of 'age-dating' a high redshift galaxy was presented by Zheng et al. (2012) for the lensed galaxy MACS1149-JD1 based upon a prominent photometric excess seen in the Spitzer/IRAC 4.5 m band. For the assumed photometric redshift of =9.6 ± 0.2, the IRAC excess would imply the presence of a Balmer break indicative of a burst of star formation some 200-300 Myr earlier. However, central to this conclusion is excluding the possibility of a redshift below 9 where such an IRAC excess could arise from contamination of the 4.5 m band by intense rest-frame optical [O III] 5007 Å emission consistent with a much younger stellar population (Labbé et al. 2013;Roberts-Borsani et al. 2016). An early attempt to secure a spectroscopic redshift of MACS1149-JD1 using grism capabilities onboard the Hubble Space Telescope (HST) (Hoag et al. 2018) provided a tantalising hint of a Lyman-break at > 9, however the lack of an emission line prevented a secure redshift constraint. A convincing redshift of = 9.11 was eventually determined through Lyman-and [O III] 88 micron emission by Hashimoto et al. (2018), using ALMA and VLT/X-Shooter. The subsequent re-analysis of the object using the secure spectroscopic redshift and improved IRAC photometry resulted in an estimated age of 290 ± 150 Myr for the dominant stellar component, consistent with a formation redshift of 15.4 ±2.3.
Various studies have discussed whether MACS1149-JD1 might be representative of the galaxy population at early times and hence provide a realistic first estimate of when cosmic dawn occurred. However, accurate interpretation of the associated photometry is not without its challenges. For instance, although Hashimoto et al. (2018) derived a stellar mass of 10 9 for MACS1149-JD1, this relied on an assumed and uncertain lensing magnification, thus introducing important uncertainties when comparing its properties to theoretical predictions. Furthermore, a fundamental difficulty has been that a prominent Balmer break requires a declining star formation rate with cosmic time for at least some portion of the early history of a galaxy, something that is difficult to reconcile with contemporary numerical simulations (Binggeli et al. 2019).
Observationally, several 7.5 < < 9 objects (both lensed and unlensed) with secure spectroscopic redshifts display a significant IRAC excess which could be indicative of the presence of a Balmer break and early episodes of star formation (e.g., Tamura et al. 2019,Bakx et al. 2020, Strait et al. 2020 Hashimoto et al. (2018), they nonetheless concluded that, collectively for these four sources, nearly half the stellar mass was produced prior to 10.
Even for sources believed to be beyond 9, it is possible that an IRAC excess may not entirely arise via a Balmer break. For MACS1149-JD1 Hashimoto et al. (2018) Katz et al. (2019) predicted the presence of three massive > 9 galaxies with similarly strong IRAC colours and matched photometry to MACS1149-JD1. While their results also point to likely episodes of extended star formation, they conclude that dust may contribute to the IRAC excess, thereby reducing the strength of the Balmer break.
Given the implications for the birth of galaxies, determining the amount of star formation prior to 9 is an important, but challenging endeavour. While ambiguities of interpretation may remain given the signal/noise of the available photometric data, our motivation in this paper is to further explore the Balmer break interpretation for six promising ≥9 candidates (including MACS1149-JD1) each with a 4.5 m IRAC excess. A major advance is a concerted attempt to secure their spectroscopic redshifts. Our goal is to explore the possible extent of star formation activity prior to a redshift 10. We consider this timely given it may have important consequences for exploring this early period further with the James Webb Space Telescope.
A plan of the paper follows. In Section 2, we describe the selection criteria we applied to construct our sample of ≥9 objects. In Section 3 we present the relevant spectral energy distributions (SEDs) and use these to infer the likely photometric redshifts and their star formation histories. This analysis provides a first assessment of the early star formation history for an enlarged population of 9 galaxies. The results of our spectroscopic follow-up campaign are described in Section 4. Given the challenges of working at these limits, we are not able to convincingly secure a redshift for all of our candidates. Section 5 returns to a discussion of the physical properties of the sample in the light of the limited spectroscopic data. We also discuss the extent to which our IRAC excess sample may be representative of the broader population of 9 galaxies and the frequency with which Lyman-emission is being located deep in the reionisation era. Finally we discuss the implication of our findings in the context of direct searches for earlier activity with the James Webb Space Telescope in Section 6.

SELECTION OF Z>9 SOURCES
To make further progress in charting the cosmic star formation history beyond the limits directly accessible to HST, we have selected further candidates for detailed study from three near-infrared surveys: CANDELS (Grogin et al. 2011) and the Frontier Fields (Lotz et al. 2017) from Hubble Space Telescope imaging, and the UltraVISTA ground-based campaign (McCracken et al. 2012).
We applied the following selection criteria to likely 9 candidates with H-band < 26.5 mag (see Section 3): (i) a 2 nondetection in all bands bluewards of the Lyman-break at the putative redshift; (ii) a 5 detection in two consecutive bands redwards of the Lyman-break; (iii) red Spitzer/IRAC colours (3.6 m -4.5 m > 0.5); (iv) a photometric redshift probability distribution permitting ≥9 at 1 . We used publicly available catalogues for the CAN-DELS and UltraVISTA fields. AstroDeep catalogues were used for four Frontier Field clusters (A2744, MACS0416, MACS0717 and MACS1149) and we constructed our own catalogues for the clusters A370 and AS1063 following the method described in Laporte et al. (2016). The Spitzer/IRAC data used in this paper are drawn from several surveys and therefore not uniform in depth. CANDELS IRAC data have a 5 depth of 25.5 mag; the 5 depth of the Ultra-VISTA IRAC data is 25.8 mag and the Frontier Fields data reach a 5 depth of 26.5 mag. We extracted the Spitzer photometry using a 1.2 arcsec radius aperture centred at the position of the source on the F160W image, and applied the usual aperture correction. However noting that, for blended sources, this may overestimate the IRAC flux, in those cases we extracted the photometry more carefully, taking into account the shape of the galaxies with GALFIT (Peng et al. 2010) assuming a Sersic profile.
Including MACS1149-JD1, these criteria led to six suitable sources distributed as follows : two (presumably lensed) sources viewed behind a Frontier Fields cluster, three in the CANDELS fields GOODS-N and GOODS-S, and one in UltraVISTA survey (see Table 1). The most restrictive criterion in their selection is that relating to the red IRAC colour as this can only be applied to relatively bright targets given the depth difference between ∼1 m data and the IRAC images. All these sources have already been identified as promising high-redshift candidates by previous studies (Oesch et al. 2018;Ishigaki et al. 2018;Bowler et al. 2020).

PHOTOMETRIC ANALYSIS
The spectral energy distributions (SEDs) and photometric redshift likelihood distributions for our selected six > 9 candidates are presented in Figure 1. The SEDs utilise photometry drawn from publicly-available imaging data taken with Hubble and Spitzer Space Telescopes and, where available, the VISTA deep surveys. We matched the Point-Spread Function (PSF) of HST images using TinyTim models (Krist et al. 2011). We extracted the total SED following the method described in Finkelstein et al. (2013) with an aperture correction computed from the Sextractor MAG_AUTO measured on the F160W image from WFC3/IR or the H-band from VISTA. The IRAC/Spitzer photometry following method described in the previous section . Two sources in our sample, MACS0416-JD and MACS1149-JD1, are located behind Frontier Fields clusters and thus are gravitationally-lensed. We estimated the magnification for these using the online magnification calculator for the Frontier Field survey 1 which is based on lensing models developed by  Grillo et al. (2015). This yields a magnification of 1.90 +0.03 −0.02 and 28 +44 −11 respectively for MACS0416-JD and MACS1149-JD1. The latter estimate represents a revision of the magnification adopted by Hashimoto et al. (2018).
We simultaneously determine the optimal photometric redshift and a range of physical properties for the six selected galaxies using BAGPIPES (Carnall et al. 2018). We allow the photometric redshift to lie in the range 0.0 < phot < 10.0 which, for ≤ 9.0, permits the IRAC excess to be formed, in part, from [O III] line emission (Roberts-Borsani et al. 2020). BAGPIPES generates HII regions from CLOUDY photoionisation code (Ferland et al. 2017) and assumes that the nebular emission is the sum of emission from HII regions of different ages following Byler et al. (2017). In particular, it is capable of reproducing intense [O III] emission with equivalent widths EW[O III+H ]>1000Å , sufficient for consideration of dominant contributions to an IRAC excess at high redshift. BAGPIPES assumes that the gas-phase metallicity is similar to that of the stars producing the ionising radiation (with a range of ∈[0.0:2.5] ). Finally, BAGPIPES accounts for dust emission via the Draine & Li (2007) model, with a reddening law following the Calzetti (2001) relation. We fit the SED of our candidates assuming a single-component model with four possible star-formation histories (SFH) : delayed, exponential, constant or burst-like. The best SED-fit is selected to be that with the largest likelihood and is presented as the grey curve for each candidate in Figure 1, with the associated physical parameters listed in Table 1. The best fit is always obtained for the delayed or constant SFH. Although we tabulate the age of the best-fit stellar population, a more meaningful measure is the fraction (expressed as a percentage) of the observed stellar mass that was assembled prior to a redshift = 10. For the two gravitationally-lensed galaxies, the luminosities and stellar masses have been corrected for the magnifications assumed above. Table 2 shows the range of parameters adopted in BAGPIPES for each particular SFH.
Despite the different HST and ground-based datasets from 1 https://archive.stsci.edu/prepds/frontier/lensmodels/#magcalc which our six IRAC excess sources were selected, they span a limited range in luminosity and stellar mass. As a consequence of its detection to the shallower magnitude limit with a ground-based telescope, UVISTA-1212 is the most luminous and massive galaxy in the sample. Later in Section 5 we discuss how representative is our sample with respect to the larger population of sources thought to be at 9. For all of the sources except UVISTA-1212, the most probable photometric redshift derived is above 9 and thus a Balmer break may be the most likely explanation for the IRAC excess. Nonetheless, as discussed by Hashimoto et al. (2018) and Roberts-Borsani et al. (2020), alternative explanations based on contributions from strong nebular emission lines or dust reddening can be considered.
Although [O III] 5007 Å is redshifted out of the IRAC 4.5 m band for > 9.1, Hashimoto et al. (2018) considered the contribution of H and [O III] 4959Å. Even for a very young starburst with a hard radiation field (log U = -2.0) and low metallicity (20% solar), it is not possible to reproduce a significant IRAC excess. Furthermore, for the youngest ages, the IRAC 3.6 m flux is raised by the presence of a significant nebular continuum. Likewise, [O II] 3727 Å will enter the 4.5 m band for redshifts above 9.7. The photometric likelihood distribution does extend to such a high redshift for some of our sources ( Figure 1). However, the IRAC excess would only be matched if [O II] had a rest-frame equivalent width of over 1000 Å which seems improbable. Finally, it is possible to match the SEDs by combining a delayed SFH with dust extinction. In this case, to match the IRAC excess, the extinction would typically need to exceed >1.7 mag, which is larger than that inferred in previous measures of reionisation era galaxies (e.g. Schaerer & de Barros 2010, Strait et al. 2020. In Figure 1 we show the posterior distribution of the age and stellar mass to indicate the reliability of concluding there was star formation prior to = 10 as listed in Table 1. Despite uncertainties in the available photometry, the shape of the posterior is reasonably well-defined for ages greater than 200 Myr, indicating the likelihood of significant earlier star formation. The BAGPIPES solution for MACS1149-JD1 can be compared with the solutions derived by Hashimoto et al. (2018)  Myr) are consistent with the range of early estimates. We include a separate entry in Table 1 for the case where the redshift is fixed at its spectroscopic value of =9.11 (Hashimoto et al. 2018). We emphasise that the inferred contribution of early star formation is uncertain in many cases, particularly for GN-z10-3 whose IRAC excess is the least prominent. Nonetheless, overall our sample is consistent with having significant star formation prior to 10, as was concluded for MACS1149-JD1. We will return to a further discussion of this in Section 6.

SPECTROSCOPIC FOLLOW-UP
Although Lyman-is the most prominent rest-frame UV emission line in star-forming galaxies at intermediate redshift (Stark et al. 2010(Stark et al. , 2011, its visibility deep in the reionisation era at 9 is Figure 1. Spectral energy distributions and best-fit BAGPIPES (Carnall et al. 2018) models for the 6 galaxies in Table 1. Blue points represent photometric data or limits, and red points represent the predicted fluxes. All models incorporate contributions from nebular emission (see text  Table 1. Bright phot ≥9 galaxies with red IRAC colours (3.6 m -4.5 m > 0.5) with physical properties determined using BAGPIPES (Carnall et al. 2018). Photometric redshift uncertainties represent 1 values and absolute magnitudes and stellar masses are corrected for lensing magnification where appropriate (see text). The final column displays the inferred percentage of the presently-observed stellar mass formed before =10.  Table 2. The BAGPIPES parameter ranges used to fit the SEDs of the 6 targets studied in this paper. The age (or maximum age) is chosen to be less than 1 Gyr given the selection criteria applied to build our sample. The ionisation parameter range we used is large to account for extreme values (log U>-2) as observed in previous high-redshift galaxies (Stark et al. 2017 (Laporte et al. 2017;Mainali et al. 2020). Surprisingly, for 11 galaxies with spectroscopic redshifts above = 7.5, Lyman-was detected in 7 with additional UV metallic lines in only 4 cases. In all cases, detecting the intrinsically fainter metallic lines has been more challenging than searching for Lyman-.
Far infrared emission lines, such as [O III] 88 m arising from ionised gas and [C II] 158 m from the neutral gas, have been pivotal in spectroscopic confirmation at high redshifts with the ALMA interferometer (Inoue et al. 2014). For the same 11 > 7.5 galaxies of which 7 are accessible with ALMA, [O III] and/or [C II] emission have been detected in 5 cases. Moreover, in cases where both ALMA and near-infrared spectrographs on ground-based telescopes were both used in similar exposure times (e.g., Laporte et al. 2017, Hashimoto et al. 2018, the ALMA detections were more significant. For this reason, we adopted a multi-facility approach in attempting to secure spectroscopic redshifts for the targets introduced in Table  1. We discuss the results of our spectroscopic campaigns for each target in turn below. A summary of the results of our spectroscopic campaign is given in Table 2. The offset between the centroid of the UV continuum and the peak of [OIII]88 m emission is 0.40 arcsec, corresponding to an observed physical separation of 1.8 kpc at ∼9.28. Such a small displacement is consistent with previous [OIII]88 m observations at high-(e.g. Carniani et al. 2017, Harikane et al. 2020). We measured the integrated flux on a map created using the immoments CASA recipe. We determine =0.298±0.051 Jy.km.sec −1 which is within the range values found previously in ≥8 galaxies (Tamura et al. 2019, Hashimoto et al. 2018. Assuming a Gaussian profile, we measure a FWHM of 313±22 km sec −1 . The significance of the line can be estimated via a histogram of the pixel by pixel signal to noise ratio (SNR) at the spatial position of the target. In the absence of emission, this will approximate a Gaussian distribution. If emission is present, a second peak will be seen indicating its significance. The lower panel of Figure 2 reveals such a second peak at SNR 6 consistent with the analysis above. No dust continuum was detected to a 1-level of 18 Jy/beam.
We also observed this target in service mode with X-Shooter/VLT between August and September 2019 (ID: 0104.A-0028(A), P.I. : R. Ellis). The target was centred using a small blind offset (<20 arcsec) in the 0.9 arcsec JH slit to reduce the background and a nod length of 4 arcsec was used. The total on source exposure time in the near-infrared arm was 10hrs in excellent seeing condition (<0.8 arcsec). The data were reduced using the latest version of the X-shooter pipeline (3.5.0) and inspected visually by two authors (NL and RAM). No emission line was found in any of the three X-Shooter arms. Given the ALMA-based redshift, the absence of Lyman-implies a 1 upper limit of 1.8×10 −18 cgs assuming a FWHM=200km/s (Figure 3).

GN-z10-3 and GN-z9-1
We observed GN-z10-3 with MOSFIRE/Keck undertaking a 5hrs 20min on source exposure in April 2019 (Keck 2019A_U128, PI: B.E. Robertson). In order to monitor the seeing and transparency, two stars were included in the MOSFIRE mask. For the final analysis, only frames with seeing less than 0.8 arcsec FWHM were selected representing a total exposure of 4.9hrs. Data reduction was undertaken using the 2018 version of the MOSFIRE DRP. As the target was centred in the MOSFIRE slit, emission lines will feature a positive signal accompanied by two negative counterparts separated by the nodding scale of 1.25 arcsec. We identified a convincing (>   (Figure 4). No other line is present in the entire J-band spectra. Assuming this line is Lyman-, it corresponds to a redshift of =8.78. In order to verify its reliability, we examined data comprising independent halves of the total exposure time. The line was retrieved with suitably reduced signal/noise on both spectra. The line width is comparable with that measured for other > 7 Lyman-detections with MOSFIRE (e.g., Zitrin et al. 2015, Finkelstein et al. 2013, Hoag et al. 2017).
The spectroscopic data refines the redshift value within the photometric redshift likelihood distribution (Figure 1). Although the IRAC 4.5 m excess at this improved redshift below = 9 implies a likely contribution from [O III] 5007 Å emission, we will return to the analysis of its SED in Section 5. Further sub-mm observations are currently on-going with NOEMA to detect [O I] 145 m, one of the brightest FIR emission lines (Katz et al. 2019).
During the same observing run, a separate mask was designed for GN-z9-1. A total on source exposure time of 5.6hrs was secured in good seeing conditions. The data was reduced and analysed in the manner described above. However, in this case, no clear (>5 ) emission line over the entire J-band. Reduced the significance threshold to 3 , a faint feature is present at =13241.2Å with a positive signal at the expected position of the target in the slit and two negative counterparts on each side. We estimated a flux of (2.2±0.7)×10 −18 cgs. Assuming this line is Lyman-, the redshift of GN-z9-1 would be =9.89 ( Figure 5). However, although consistent with the photometric redshift likelihood distribution (Figure 1), we consider this a marginal claim. Deeper spectroscopic observation, such as those that will become feasible with JWST, are needed to confirm this redshift.

UVISTA-1212
UVISTA-1212, one of the brightest ≥8 candidates known to date, was observed with FLAMINGOS-2 on the Gemini-South telescope via the Fast Turnaround programme (ID: GS-2020A-FT-102, PI: Roberts-Borsani). Observations were accomplished in service mode in February 2020 in J-band using the R3K grism and a 4 pixel wide slit. From an allocation of 9.8 hours, a 8.8 hour on source exposure was taken in good seeing conditions. The data were reduced using the FLAMINGOS-2 Python cookbook for longslit observations (specifically the reduce_ls script, which primarily uses the PyRaf module as a wrapper around IRAF functions to reduce the data). More specifically, the nsflat, nsreduce and nscombine were used to reduce the dark, flat and arc frames, as well as the standard star and science observations. The final 1D spectrum is shown in Figure 6. After a careful inspection of the reduced spectra, no clear emission line was found over the entire J-band spectrum.

GS-z9-1
GS-z9-1 has a photometric redshift of 9.26 +0.41 −0.42 and we have obtained 9.8 hours of observing time (8.0 hours on source) from an allocation of 13.5 hours with X-Shooter/VLT (ID : 106.20ZH.001, PI: R. Ellis). Observations were undertaken in service mode in good seeing conditions (<1.1 arcsec) between October 2020 and January 2021 2 We carefully checked the centring of the source in the X-Shooter slit and applied the same data reduction procedure as described in section 4.1. Unfortunately no clear emission line is seen after a first inspection of the data (Figure 7). At the expected position of Ly-at ℎ , we measure a 1 upper limit of 7×10 −19 cgs.

Updating the Properties
We now reconsider the physical properties of our sample, taking into account, where available, the newly-acquired spectroscopic redshifts (Table 3). In addition to MACS1149-JD1, we now have additional redshifts for MAC0416-JD and GN-z10-3, representing half of our original sample. Although we have marginal evidence that GN-z9-1 lies beyond 9, for it, GS-z9-1 and UVISTA-1212, we will continue to assume the photometric redshifts given in Table  1. A re-analysis of the SED is particularly relevant for GN-z10-3 whose spectroscopic redshift is now confirmed to lie below = 9, implying a possible significant contribution to the IRAC excess from rest-frame optical emission lines. The updated results based on BAGPIPES fits to the SEDs, now constrained with a fixed spectroscopic redshift, are given in Table 4. In the three cases where the spectroscopic redshift is confirmed to be above 9, the derived properties are naturally similar to those given in Table 1. With the exception of UVISTA-1212, which appears to be an outlier in our sample, all galaxies have a stellar mass ranging from ∼ 1 to 4×10 9 M , a star formation rate ranging from ∼10 to 30 M /yr and a low dust attenuation (<0.5 mag). However, for GN-z10-3, despite a spectroscopic constraint of =8.78, a re-analysis of the SED incorporating this value may still be consistent with a mature stellar population. This may appear surprising given the IRAC excess could be readily explained with strong [O III] 5007 Å emission. Figure 8 shows a re-analysis of the SED shown in Figure 1 with the redshift appropriately constrained at = 8.78. The posterior distribution in age and stellar masses is now bimodal and thus it is not possible to distinguish between a young population whose [O III] 5007 Å emission contributes to the IRAC excess, and an older solution with a contribution from a Balmer break. Similar interpretational ambiguities were seen for < 9 sources with In the case of nondetection, the histogram shape should be consistent with a gaussian. The yellow region highlights a deviation from a gaussian shape at SNR 6.
IRAC excesses by Roberts-Borsani et al. (2020) who argued this degeneracy may be broken with independent measures of [O III] 88 m emission. In the case of the young solution for GN-z10-3, the required equivalent width (EW) of [O III]+H is ∼1000 Å which is consistent with the range observed in some high redshift galaxies (e.g., Strait et al. 2020, Bowler et al. 2020. The flat continuum between F160W and 360 m would then imply that [O II] has an  EW > 50Å. Recent studies suggest this is close to the maximum possible for a young starburst (e.g. Yang et al. 2017). Although we are unable to distinguish between these two fits with the current data, we adopt the properties of an older stellar population in Table  4.

The ∼9 Lyman-fraction
We now consider the question of the visibility of Lyman-deep in the reionisation era. This can be done in terms of a Lyman- emission fraction for our sample. Thus we have two targets with a secure Lyman-detection and one (GN-z9-1) which is tentative (see Section 4). All candidates have reliable photometric redshifts at ∼ 9 or a far infrared emission line, suggesting the likelihood of contamination of the sample by low redshift interlopers is small. Despite obvious uncertainties for such a small sample, the Lymanemission fraction calculated for the sample of 6 ∼ 9 galaxies presented in this paper is very high: LAE = 0.33 +0.28 −0.21 if we include GN-z9-1 or LAE = 0.50 +0.26 −0.26 otherwise. This high fraction of detections continues a trend of recent results (Roberts-Borsani et al. 2016;Stark et al. 2017;Mason et al. 2018) that have weakened the originally-claimed rapid decline above 7 (Pentericci et al. 2011;Schenker et al. 2012Schenker et al. , 2014. Our results are compared with these earlier trends in Figure 9). As with Roberts-Borsani et al. (2016) and Stark et al. (2017), we explicitly selected bright galaxies amenable to ground-based spectroscopy with red Spitzer/IRAC colours (3.6 m -4.5 m > 0.5, Section 2). However, while the 100% visibility of Lyman-in Roberts-Borsani et al.   (2016)'s sample might be attributed to a correspondingly strong rest-frame optical [O III] emission responsible for the IRAC excess, for the three > 9 targets in the present sample, only a Balmer break synonymous with a mature stellar population (e.g., Hashimoto et al. 2018) could be responsible. This suggests that the enhanced Lyman-fraction arises from the fact that some luminous galaxies, regardless of their [O III] emission, are capable of producing early ionised bubbles. As discussed by Mason et al. (2018), much of the early work probing the redshift-dependent fraction necessarily targeted sub-luminous galaxies in order to efficiently exploit multislit spectrographs with limited fields of view. There is good evidence that intense [O III] emitters at low and intermediate redshifts have high LyC escape fractions (Izotov et al. 2018;Fletcher et al. 2019;Nakajima et al. 2020) and/or ionising efficiencies (e.g., Nakajima et al. 2016), consistent with the creation of ionised bubbles at high-redshift. What remains unclear is how > 9 galaxies with IRAC excesses indicating a mature stellar population can emit enough Lyman continuum photons to ionise large bubbles. The Strömgren radius of the ionised bubble created by a galaxy scales with ∝ ( em esc,LyC ) 1/3 (e.g., Cen & Haiman 2000), where esc,LyC is the escape fraction of Lyman continuum photons and em the time the galaxies has been emitting such photons.
Assuming an average ionising efficiency and esc,LyC < 3%, the galaxies in our ∼ 9 sample, which all sustained star formation over 100 − 200 Myr (see Table 1), can create an ionised bubble as large as those of COLA1 (Hu et al. 2016;Matthee & Schaye 2018) or A370p_z1 (Meyer et al. 2021), two actively star forming > 6 galaxies with very high escape fractions. Such a large ionised bubble may explain why Lyman-is observed in MACS1149-JD1 blueshifted by ∼ −450km s −1 with respect to [O III] 88 m. In summary therefore, the high detected fraction of Lyman-emission at 9 in luminous galaxies may offer further support of their mature stellar populations.

How Representative is our Sample?
Early evidence for mature stellar populations at > 7.5 (Hashimoto et al. 2018;Roberts-Borsani et al. 2020) has been hard to reconcile with numerical simulations of the first galaxies. While Binggeli et al. (2019) conclude that ∼ 9 Balmer breaks must be very rare, Katz et al. 2019 find in the contrary that they can be formed easily in their simulations, although dust might complicate the interpretation. In light of our results, it is thus important to assess observationally the prevalence of ∼ 9 galaxies with older stellar populations.
While we cannot compare Balmer-break objects with a larger population of spectroscopically confirmed ∼ 9 galaxies, we can assess the prevalence of Balmer-break objects in a parent sample of > 9 photometric candidates. In order to investigate the relative abundances of the two populations, we have carefully selected > 9 galaxies candidates using the same procedure and magnitude cuts as in Section 2, but removing the IRAC excess criterion. The candidates were visually inspected by two authors (NL, RAM). Due to the absence of filters bluewards of < 8500 Å in ULTRAVISTA, we restricted our search to the Frontier Fields and CANDELS. This additional search results in 14 photometric candidates (including the 5 Balmer-break objects presented above). Balmer-break objects thus appear to represent a significant fraction( 0.36 +0.17 −0.14 ) of the > 9 bright galaxy population, in contrast to the results of numerical simulations. Indeed this Balmer-break fraction could represent a lower limit since the denominator comprises galaxies without spectroscopic redshifts: several could be low redshift interlopers or 8 < < 9 galaxies (given the large photometric redshift errors) whose IRAC excess may not arise from a Balmer Break. This is also the case for the nominator (e.g. GS-z9-1), although to a lower extent.
We can compare the rest-frame UV luminosities of this parent sample with those of our Balmer break galaxies. As most of the candidates are drawn from the GOODS-North/South fields(11/14), we restrict the comparison to the CANDELS fields, ensuring we can avoid the need to make corrections for lensing magnification in the cluster fields. We derive the rest-frame UV magnitude of our candidates using the F125W magnitude, the best-fit photometric redshift and a UV slope = 2, ( ∝ ) for the K-correction. We compare the rest-frame UV magnitudes for the two samples in Figure 10. Although the statistics are poor, our objects with red IRAC colors are not notably brighter or fainter than blue IRAC color objects. A two-sided KS test is inconclusive and the null hypothesis that the two distributions are different cannot be rejected ( = 0.12). Thus, given the modest sample size to date, our > 9 Balmer break sources remain broadly representative of the larger > 9 galaxy population. We note that all > 9 candidates represent the brightest objects existing at that early time due to obvious limited depth of the imaging data. The hypothesis that Balmer break objects are less frequent in the fainter population is plausible, but can only be tested with JWST.

DISCUSSION
In this article, we have argued that from a sample of six 9 galaxies with Spitzer/IRAC excess for which spectroscopic redshifts have now been secured for three, possibly four, we infer past star formation histories consistent with significant contributions beyond a redshift = 10.
We can estimate the earlier rest-frame UV luminosity for each of the six galaxies in our sample assuming the best-fit star formation history (SFH). As described in Section 3, the best SED fit is obtained for most of the objects with an evolved stellar population. Table 1 shows the best-fit parameters for all galaxies, updated in Table 4 with the spectroscopic redshift for 3 of them. Based on these SFHs we compute the rest-frame L 1500 luminosity as a function of redshift from ∼15 to the observed redshift taking into account the dust reddening. For those galaxies without spectroscopic redshifts, the photometric value is assumed. Figure 11 shows the evolution of m 1500 as a function of redshift and demonstrates that each galaxy is as luminous in its earlier life. While these past star formation histories are clearly illustrative, they serve to demonstrate the feasibility of searches for earlier progenitors with JWST (e.g., Roberts-Borsani et al. 2021).
From our SFHs, we can also determine the redshift-dependent history of the assembled stellar mass providing independent insight into the evolution of the UV luminosity density within the first 500 Myr (Figure 12). Two contrasting hypotheses are still debated in the literature: a rapid (Oesch et al. 2018) or a smooth (McLeod et al. 2016) decline of the UV luminosity density over 8 < < 11. For the case of a fast decline (∝(1+ ) −11 ), the luminous galaxies probed would have formed ∼50% of their stellar mass by redshift =10, whereas for the smooth decline case (∝(1+ ) −4 ) ∼80% of the stellar mass is formed at =10. Although our sample is admittedly small, it represents the only relevant data currently available. Averaging the SFHs of our 6 galaxies we determine that they formed ∼70% of their mass by =10, which seems to favour a smooth decline.

DATA AVAILABILITY
The data underlying this article will be shared on reasonable request to the corresponding author. Figure 11. Earlier redshift evolution of the rest-frame 1500Å luminosity computed from our SED fits for the six galaxies in our sample. Filled regions show the 5 sensitivity of NIRCam/JWST filters (F150W, F200W and F277W) in 3hrs.

Figure 12.
Redshift evolution of the assembled stellar mass computed from the SFH deduced from our SED fits and averaged over the 6 galaxies discussed in this paper (red). This is compared with that predicted for two contrasting measures of the rate of decline in the UV luminosity density deduced from large photometric surveys (Oesch et al. 2014, McLeod et al. 2016. We estimate that 70% of the stellar mass was already in place by = 10 for the 6 galaxies in our ≥9 sample.  Table 4. Updated properties for those sources in Table 1 with a spectroscopic redshift. The final column displays the percentage of the presently-observed stellar mass already formed before =10. was supported in part by NASA program HST-GO-14747, contract NNG16PJ25C, and grant 80NSSC18K0563, and NSF award 1828315. We thank Adam Carnall for providing an updated version of BAGPIPES allowing high ionisation parameter. Based on observations made with ESO Telescopes at the Paranal Observatory under programme ID 0104.A-0028. This paper makes use of the following ALMA data: ADS/JAO.ALMA#2019.1.00061.S. ALMA is a partnership of ESO (representing its member states), NSF (USA) and NINS (Japan), together with NRC (Canada), MOST and ASIAA (Taiwan), and KASI (Republic of Korea), in cooperation with the Republic of Chile. The Joint ALMA Observatory is operated by ESO, AUI/NRAO and NAOJ. Based on observations obtained at the international Gemini Observatory, a program of NSF's NOIRLab, which is managed by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation on behalf of the Gemini Observatory partnership: the National Science Foundation (United States), National Research Council (Canada), Agencia Nacional de Investigación y Desarrollo (Chile), Ministerio de Ciencia, Tecnología e Innovación (Argentina), Ministério da Ciência, Tecnologia, Inovações e Comunicações (Brazil), and Korea Astronomy and Space Science Institute (Republic of Korea). The data presented herein were obtained at the W. M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation. The authors wish to recognize and acknowledge the very significant cultural role and reverence that the summit of Maunakea has always had within the indigenous Hawaiian community. We are most fortunate to have the opportunity to conduct observations from this mountain. ID: 2019