Deep rest-frame far-UV spectroscopy of the giant Lyman-alpha emitter 'Himiko'

We present deep 10h VLT/XSHOOTER spectroscopy for an extraordinarily luminous and extended Lya emitter at z=6.595 referred to as Himiko and first discussed by Ouchi et al. (2009), with the purpose of constraining the mechanisms powering its strong emission. Complementary to the spectrum, we discuss NIR imaging data from the CANDELS survey. We find neither for HeII nor any metal line a significant excess, with 3 sigma upper limits of 6.8, 3.1, and 5.8x10^{-18} erg/s/cm^2 for CIV $\lambda$1549, HeII $\lambda$1640, CIII] $\lambda$1909, respectively, assuming apertures with 200 km/s widths and offset by -250 km/s w.r.t to the peak Lya redshift. These limits provide strong evidence that an AGN is not a major contribution to Himiko's Lya flux. Strong conclusions about the presence of PopIII star-formation or gravitational cooling radiation are not possible based on the obtained HeII upper limit. Our Lya spectrum confirms both spatial extent and flux (8.8+/-0.5x10^{-17} erg/s/cm^2) of previous measurements. In addition, we can unambiguously exclude any remaining chance of it being a lower redshift interloper by significantly detecting a continuum redwards of Lya, while being undetected bluewards.


INTRODUCTION
An increasingly large number of galaxies is found by their Lyman-α (Ly α) emission in narrowband imaging surveys at redshifts up to z ∼ 7.3 (e.g. Ouchi et al. 2010;Shibuya et al. 2012) 1 . Searches are ongoing to find Ly α emitters (LAE) at redshifts z ∼ 7.7 and z ∼ 8.8 (e.g. Clément et al. 2012;McCracken et al. 2012;Milvang-Jensen et al. 2013), but first results beyond z ∼ 7 indicate a rapid decline in the fraction of star-forming galaxies with strong observable Ly α emission (e.g. Konno et al. 2014). This is in agreement with the ⋆ E-mail:johannes@dark-cosmology.dk 1 Currently, the spectroscopically confirmed LAE with the highest redshift (z = 7.5, Finkelstein et al. 2013) has been found with HST/WFC3 broadband data low number of Ly α detections in spectroscopic follow-ups for Lyman-break selected galaxies (e.g. Stark et al. 2010;Treu et al. 2013;Caruana et al. 2012Caruana et al. , 2014Pentericci et al. 2014). Such an evolution can be caused either by an increased amount of neutral hydrogen in the vicinity of the galaxies or by a change in galaxy properties, e.g. in the escape of the ionising continuum (e.g. Dijkstra et al. 2014).
Typical LAEs at redshift z ∼2-3 are compact and faint (e.g. Nilsson et al. 2007; Grove et al. 2009), but a population of LAEs with emission extending up to 100 kpc has been found (e.g. Fynbo, Møller & Warren 1999;Steidel et al. 2000;Francis et al. 2001;Nilsson et al. 2006). Currently the most distant object showing characteristics of a Ly α blob (LAB), despite the effects of cosmological surface brightness dimming, is the source Himiko found by Ouchi et al. (2009) at a redshift of 6.6 with Subaru/NB921 imaging.
While low surface brightness extended Ly α halos are identified to be a generic property around LAEs (e.g. Steidel et al. 2011;Matsuda et al. 2012), several mechanisms are theoretically proposed to support the much stronger extended Ly α emission of LABs. Each of them might be responsible either alone or in combination. The suggested possibilities include cooling emission from gravitationally inflowing gas (e.g. Haiman, Spaans & Quataert 2000;Scarlata et al. 2009;Dekel et al. 2009;Dijkstra & Loeb 2009;Faucher-Giguère et al. 2010), superwinds produced by multiple consecutive supernovae (e.g. Taniguchi & Shioya 2000), photoionisation by AGNs (e.g. Haiman & Rees 2001), extreme starbursts in the largest overdensities, where the individual galaxies in the proto-cluster are jointly contributing to make up a blob (Cen & Zheng 2013), and starbursts within major mergers (Yajima, Li & Zhu 2013).
Observational evidence from individual objects suggests that several of these mechanisms may contribute. Hayes, Scarlata & Siana (2011) find based on polarisation measurements evidence for Ly α photons to be originating from a central source and being scattered at the surrounding neutral hydrogen. In other cases, evidence for an AGN as a central ionisation source is found directly (e.g. Kurk et al. 2000;Weidinger, Møller & Fynbo 2004;Bunker et al. 2003). In a few cases due to the absence of an apparent central ionising source (starburst or AGN) it has been argued for gravitational cooling radiation as the only remaining scenario (Nilsson et al. 2006;Smith & Jarvis 2007). However, this is a conclusion which can be challenged (Prescott et al. 2015) as halos producing the required amount of cooling radiation would be expected to have a star-forming galaxy at their centre. Prescott et al. (2015) have found a possible ionising source for the Nilsson et al. (2006) object in a hidden AGN offset from peak Ly α emission.
Substantial observational efforts have already been devoted to Himiko (e.g. Ouchi et al. 2009Ouchi et al. , 2013Wagg & Kanekar 2012). In this paper we present deep VLT/XSHOOTER (Vernet et al. 2011) spectroscopy of this remarkable object, extending over the full range from the optical to the near-infrared (NIR) H-band. The main purpose of the observation is to search for other emission lines than Ly α, which helps to shed further light on the origin of the extended Ly α emission. In particular, both a very hot stellar population (e.g. Schaerer 2002;Raiter, Schaerer & Fosbury 2010), as expected for a metal-free Population III (Pop III), and gravitational cooling radiation (e.g. Yang et al. 2006) would give rise to relatively strong He II λ1640 emission. By contrast, due to preceding metal-enrichment, an AGN is expected to display in addition to He II highionisation emission lines from C, Si or N. We supplement the spectroscopic data by analysing CANDELS JF 125W and HF 160W archival imaging (Grogin et al. 2011;Koekemoer et al. 2011).
In section 2.1, we describe our spectroscopic observations, while we give details about the data reduction in section 2.2, followed by a discussion of the photometry done on archival data (sec. 2.3). Results for the Ly α spatial distribution, the spectral rest-frame UV continuum, Ly α flux and profile, and non-detection limits for the rest-frame far-UV lines are presented in sections 3.1 through 3.4. Subsequently, we discuss in sections 4.1 and 4.2 implications from the broadband SED, in section 4.3 a possible interpretation Table 1. Number of exposures and exposure times per observing block are listed. Except in OB1 and OB9, exposures were taken with 1200 s both in the VIS and UVB arm. In the NIR, each of the exposures was split into two sub-integrations with half the exposure time (e.g. 1200 s VIS ⇒ 2x600 s NIR). Where not all exposures could be used for the reduction due to passing clouds, both the used and the total number are stated. Numbers in square brackets indicate the exposure times included in the NIR stack, if different from the VIS stack.

Date
Obs of the Ly α shape, and finally and most important in section 4.4 the implications from our non-detection limits for the mechanisms powering Himiko. Throughout the paper a standard cosmology (ΩΛ,0 = 0.7, Ωm,0 = 0.3, H0 = 70 km s −1 ) was assumed. All stated magnitudes are on the AB system (Oke 1974). Unless otherwise noted, all wavelengths are converted to vacuum wavelengths and corrected to the heliocentric standard. A size of 1 ′′ corresponds at z = 6.595 to a proper distance of 5.4 kpc. The universe was at that redshift 800 Myr young. When stating in the following 'He II', 'C IV', 'C III]', and 'N V', we are referring to He II λ1640, C IV λλ1548, 1551, [C III]C III] λλ1907, 1909, and N V λλ1239, 1243, respectively.

Spectroscopic Observations
The XSHOOTER data has been taken at  in the second half of the nights starting on 2011 September 2, 3, and 4, subdivided into nine different observing blocks (OB) with integration and observing times summarised in Table 1. We used the same set of slits for XSHOOTER's three spectral arms throughout: 1. ′′ 6x11 ′′ (UVB), 1. ′′ 5x11 ′′ (VIS), and 0. ′′ 9x11 ′′ JH (NIR), which is a slit including a filter blocking wavelengths longer than 2.1 µm, effectively reducing the impact of scattered light in the NIR spectra. All used data was taken under atmospheric conditions classified as either thin cirrus (TN) or clear (CL). After excluding OB1 for being affected by thick clouds, the total usable exposure time was 35480 s. In our NIR reduc-OB 8 OB 2 OB 11 OB 12 OB 6 OB 10 OB 9 OB 3 Figure 1. Alignment of the slit in the different OBs w.r.t to Himiko. A 5. ′′ 7 x 8. ′′ 3 colour composite including CANDELS WFC3 H F 160W and J F 125W as red and green channels, respectively, and the N B921 image as blue channel is shown in the left panel. N B921 is a ground based image with a seeing indicated by the 0. ′′ 8 diameter cyan circle. The 1. ′′ 5x11 ′′ slit, as used in the VIS arm, is included with the different orientations used during the observation. In the right, the 0. ′′ 9x11 ′′ slits, which were used in the NIR arm, are overplotted on a 12. ′′ 1 x 12. ′′ 1 J F 125W cutout. The legend lists the OBs in counterclockwise order (along columns). The positive slit direction is in all OBs towards the bottom of the figure, so mainly towards the south. OB11 and OB12 cannot be separated in the plot, as exactly the same position angles were used. 5"x11" -VIS 5"x11" -NIR 0.9"x11"JH -NIR Figure 2. Accuracy of flux calibration based on cross-calibrating the spectro-photometric standard LTT7987, taken on September 4, against the response function from our main standard Feige110, taken on the night before. The shown curve gives the ratio between the measured LTT7987 flux and its expectation. In the NIR, results are included both for an observation using the same 5 ′′ x11 ′′ slit for LTT7987 and FEIGE110 and for and observation of LTT7987 with the 0. ′′ 9x11 ′′ JH slit. Thin vertical lines indicate wavelengths used by the pipeline for fitting a spline to the standard star in the response function calculation.
tion, we used only frames with the same exposure time of 600 s, resulting in a slightly smaller total time of 33600 s. Acquisition on our target's narrowband image (NB921; Ouchi et al. 2008Ouchi et al. , 2009 centroid in the slit's centre was obtained through a blind offset from a star located 48. ′′ 14 west and 8. ′′ 99 north. We can claim that the pointing accuracy, at least along the slit, was in each of the OBs better than 0. ′′ 1, as can be concluded from the spatial centroid of Ly α in each of the OBs (cf. Table 2). As the NIR spectrum is observed in XSHOOTER simultaneously with Ly α in the VIS arm, we can exclude the possibility of non-detections due to pointing problems.
Spectra were taken with a nod throw of 5. ′′ 0 and a jitter box size of 0. ′′ 5, allowing for an optimal skyline removal. In order to minimise the slit loss, the position angle was set to the parallactic angle at the start of each OB. This is mainly relevant in the NIR, as the VIS and UVB arms are equipped with atmospheric dispersion correctors. The corresponding position angles are shown in Fig. 1.
Assuming that emission lines are co-aligned with one or all of the three continuum bright sources, our decision to use the parallactic angle might have resulted in a higher than necessary slit loss. This can be seen from Fig. 1 (right). At the time of the observations no HST/WFC3 data was available.
It is not possible to measure the seeing directly from the Ly α spectrum, both due to resonant effects of Ly α and as Himiko is not a point source. Therefore, we needed to extract information about the seeing from the FITS header, which has an uncertainty of about 0. ′′ 2. 2 Corrected to both airmass of the observations and the observed wavelength of Ly α and averaged over sub-integrations, we estimate a seeing in the individual OBs between 0. ′′ 6 and 0. ′′ 8 (FWHM ; Table 2), with a mean value of 0. ′′ 7 for the stacked OBs, or 0. ′′ 6 corrected to the wavelength expected for He II wavelength.

Data reduction
We performed our final reduction using XSHOOTER pipeline version 2.3.0 (Modigliani et al. 2010), where we made a small modification to the pipeline code. This was to additionally mask all pixels neighbouring those pixels identified by the pipeline as cosmic ray hits. Without this precaution, artefacts remain in the data, which are not indicated in the pipeline quality map. Otherwise, we used mainly standard parameters.
The echelle spectra were rectified to a pixel size of 0.4Å and 1Å in wavelength direction and 0. ′′ 16 and 0. ′′ 21 in slit direction for VIS and NIR arm, respectively. We are only making use of the spectra from XSHOOTER's VIS and NIR arm, as the wavelength range covered by the UVB arm does not contain any information for this object.
While we obtained our final NIR reduction by automatically combining all frames with the pipeline using the nodding recipe, 3 we could improve the result in the VIS reduction somewhat by using our own script. In the latter case we first reduced the VIS frames in nodding pairs with the pipeline, and then combined the frames based on a weighted mean, using the inverse square of the noise maps produced by the pipeline as weights.
A nodding reduction is commonly considered as essential for a good skyline subtraction in the NIR. Nevertheless, we tried also a stare reduction both in the VIS and the NIR, which could give in an idealised case a √ 2 lower noise. In the case of the NIR spectrum, where the use of dark frames taken with the same exposure time as the science frames 2 We used the FITS header keyword HIERARCH ESO TEL IA FWHM. We tested the use of this keyword with a number of standard and telluric stars. We found that the FWHM values based on this keyword on average agree with a scatter of about 0. ′′ 2 with the Gaussian FWHM measured from the spatial profile.  . Pixel to pixel noise in skyline free regions close to the expected wavelength of He II in the NIR arm. We have successively added 4 frames corresponding to 2 nodding positions to the stack and calculated the κ − σ (κ = 4) clipped standard deviation for each of the steps. The result is the solid line. The dotted line shows the noise level as expected from a 1/ √ N dependence based on the noise in the first step (4 frames). It is perfectly in agreement with the data. In addition, the noise prediction from the pipeline noise model is included.
is necessary for a stare reduction, we used a large enough number of dark frames to not be limited by their noise. 4 While the stare reduction worked for the VIS arm, we experienced in the deep NIR stack spatially abruptly changing residual structures, which we could not safely remove by modelling with slowly changing functions. Consequently, we had to decide that a safe stare reduction was not feasible at this point.
By contrast, the pixel to pixel noise decreases as expected with the square root of the number of exposures in the case of the nodding reduction, as shown in Fig. 3 for a region around the expected He II line and using only pixels not affected by skylines and bad pixels. Reassuringly, this indicates that the structure seen in the stare reduction is at least within the individual nodding sequences temporarily stable and therefore removed by the nodding procedure. As the relevant VIS arm wavelength range redwards of Ly α is affected by telluric emission lines, which requires a robust sky subtraction, we decided finally to use a nodding reduction in the VIS arm, too.
In addition to those pixels flagged as bad in the pipeline, we masked skylines, which we automatically identified by iterative sigma clipping against emission in a stare reduced and non-background subtracted spectrum, and pixels with unexpected high noise, defined as having absolute counts larger than 10 times the 1 σ error. Except in the region of Ly α, where we did not mask these outliers, this assumption is safe in not clipping away any source signal. Additionally, when determining the S/N within extraction boxes over a certain wavelength region, we excluded those parts of the spectrum with a noise either 1.5 times or 2.0 times larger than the minimal noise within 200Å of the region's centre. The decision between 1.5 and 2.0 was made based on the amount of pixels remaining for the analysis.
Comparing the calculated pixel to pixel noise to the prediction from the pipeline's noise model, we find for the stack of all 56 NIR frames a rms noise of 1.2 × 10 −19 erg s −1 cm −2Å −1 per pixel, while the direct pixel to pixel variations have a standard deviation of 8.1 × 10 −20 erg s −1 cm −2Å −1 . As we are using the error spectra based on the pipeline throughout this paper, stated uncertainties for several quantities might be overestimates. On the other hand, there is correlation in the spectrum due to the rectification, which is difficult to quantify, especially as it varies with the position in the spectrum. A full characterization of the noise would require the propagation of the covariance matrix (e.g. Horrobin et al. 2008), which is currently not available in the pipeline.
The instrumental resolution at the position of Ly α and He II was determined based on fitting Gaussians to nearby skylines. We get in the two cases R=5300 (56 km s −1 ) and R=5500 (55 km s −1 ), respectively.
For the determination of the response function, we used a nodding observation of the standard star Feige110 taken with 5 ′′ slits during the night starting on 2011 September 3, directly before beginning with the observation of OB Himiko 1-6. We used this response function for all OBs taken within the three nights. In the pipeline, the response function is obtained by doing a cubic spline interpolation through knots at wavelengths having atmospheric transmission close to 100 per cent. For ensuring a very good response function close to Ly α, we had to remove a knot from the default list at 9270Å and add instead another one at 9040Å.
In order to avoid possible issues of temporal variability of the NIR flat-field illumination, we decided to use the same flat-field observations both for the standard star and the science frame observations, even though they were taken with different slits. 5 Stability and accuracy of the response function were tested by calibrating an observation of the flux standard LTT7987, taken in the second night of our program, based on the response function determined from Feige110 in the first night. The result of this test is shown in Fig. 2. 6 We can conclude that the accuracy of the spectrophotometry is about 5 per cent.
In order to reach the maximum possible depth for our science frames, we mainly avoided spending time on telluric standards. Only in the beginning of the observations in the first night we took one telluric standard with the same slit set-up as chosen for our science observations. We used this frame to fit a model telluric spectrum with ESO's Molecfit package (Smette et al. 2015), using the input parameters suggested in Kausch et al. (2015). Based on the obtained atmospheric parameters, we created model telluric spectra for the airmass of each individual nodding pair. Here, we need to make the strong assumption of constant atmosphere over a time-scale of several hours, which is unlikely completely correct. Nevertheless, the obtained accuracy is appropriate for our purpose. For the second and third night, we used telluric standards taken for other programs right before the start of our observations to fit the atmospheric parameters. While these observations were based on differing slits, we could use the wavelength solution and line kernel obtained Table 2. Results of a Gaussian fit to the spatial Ly α profiles as measured in the individual OBs (cf. Fig. 5) and the stacked spectrum. The 'centre' column gives the displacement w.r.t to the expected position. In addition, the seeing of the individual OBs, corrected to observed wavelength and airmass, is stated.
for night one to create appropriate telluric spectra for nights two and three. We adjusted the error spectra after applying telluric corrections.
All 1D spectra were extracted from the rectified 2D frames. As there is no detectable trace in the NIR spectrum, we needed to assess the accuracy of the position of the trace in the rectified frame based on a check on the reduced standard star. We find that the centre of the trace does not differ more than one pixel in slit direction from the expected position over the complete range of the NIR arm.

Photometry on archival data
As located in the Subaru/XMM-Newton deep field, Himiko is covered both by very deep ground and space based imaging, from X-ray to radio, including the N B921/Subaru narrowband image, which allowed to identify Himiko as a giant Ly α emitter (Ouchi et al. 2009).
Deep HST imaging is available from the CANDELS survey (Grogin et al. 2011;Koekemoer et al. 2011). This includes data from ACS/WFC in VF 606W and IF 814W and in JF 125W and HF 160W with WFC3. We performed in this work photometry on the CANDELS data.
It is noteworthy, that recently, both Jiang et al. (2013) and Ouchi et al. (2013) have published photometry in JF 125W and HF 160W based on two other observations (HST GO programs 11149, 12329, 12616 and GO 12265, respectively), supplemented by deep IRAC 1 & 2 data from Spitzer /IRAC SEDS (Ashby et al. 2013). Especially the analysis of Ouchi et al. (2013) has been targeted at studying Himiko, including additional WFC3/F 098M intermediate band data allowing to identify peaks in the Ly α distribution and ALMA [C II] observations. Ground based near-infrared (NIR) data in J, H, and K is available from the ultra-deep component (UDS) of the UKIRT Infrared Deep Sky Survey (UKIDSS; Lawrence et al. 2007). We are using the UKIDSS data release 8 (UKIDSSDR8PLUS), which has significantly increased depth compared to previous releases. 7 For our SED fitting in section 4.1, we are using NIR photometry determined by us from the CANDELS and the UKIDSS data and supplement it by the IRAC SEDS (1&2) and IRAC SpUDS (3&4) (Dunlop et al. 2007) photometry presented by Ouchi et al. (2013).
We determined magnitudes in several apertures on all optical and NIR images with SExtractor's double image mode (Bertin & Arnouts 1996), with CANDELS JF 125W as detection image. Without unwanted resampling, it is due to the requirement of the same pixel scale in double mode not directly possible to use JF 125W as detection image for all other images. Therefore, we put fake-sources in images with the appropriate pixel scale at the positions determined from the JF 125W image and used these as input in double mode. As the UKIDSS photometric system is in VEGA magnitudes, we use the VEGA to AB magnitude corrections of 0.938, 1.379, 1.900 as stated in Hewett et al. (2006) for J, H, and K, respectively.
Consistent with the visual impression ( Fig. 1), we detect in the JF 125W image three distinct sources at the position of Himiko. We refer in the following to these sources as Himiko-E (east), Himiko-C (centre), Himiko-W (west). Their coordinates are stated in Table 3. The transverse distance between Himiko-E and Himiko-W is 1. ′′ 18 (6.4 kpc), while the distance between Himiko-C and Himiko-W is 0. ′′ 46 (2.5 kpc).
In order to obtain accurate error estimates for the fluxes measured within circular apertures, 8 we determined several measurements with apertures of the same size as used for the object in non-overlapping source free places. For images exhibiting correlated noise, as being the case for the drizzled or resampled mosaic images used for our analysis, this is the appropriate way to account for the correlation. From the κ-σ clipped standard deviation of these empty-aperture measurements, we obtained the 1 σ background limiting flux in the chosen aperture. The clipping with a κ = 2.5 and 30 iterations makes sure that apertures including strong outlying pixels or which despite the method to find source free places are not really source free are rejected.
We defined source free regions based on the SExtractor segmentation maps and masking of obvious artefacts like spikes or blooming. In addition, we made sure that the used regions have approximately the same depth as the region including Himiko. Depending on the available area, we had for the different images between 26 and 190 non overlapping empty apertures. Safely assuming that the noise for the objects is not dominated by the objects, we use these values as errors on the determined fluxes. All stated magnitude errors have been converted from the flux errors by the use of: In Table 3, measurements within 0. ′′ 4 diameter apertures centred on each of the three peaks are listed for the HST images, while magnitudes within 2. ′′ 0 apertures are stated for all used images in Table 4. In the latter case, the apertures are centred on the same position as that stated in Ouchi et al. (2009Ouchi et al. ( , 2013. In addition, aperture corrected magnitudes are included. To calculate the appropriate aperture corrections for the images with larger PSFs, we assumed that in all the considered bands the flux is coming from the three main sources and that the flux ratio between the three peaks is the same as in the JF 125W HST image. This allows us to convolve this flux distribution with the PSF 9 and consequently determine the fraction of the total flux included in the aperture on the created fake image by using SExtractor in the same way as for the science measurements. For the PSF profiles, we assumed in the case of the UKIRT/WFCAM NIR images Gaussians determined by a fit to nearby stars. We get for J,H,K FWHMs of 0. ′′ 74, 0. ′′ 74, and 0. ′′ 69, respectively. Finally, Tables 3 and 4 also include the UV slope β (f λ ∝ λ β ). We calculated it based on the estimator β = 4.43(JF 125W − HF 160W ) − 2 (Dunlop et al. 2012). Both the total object and the two eastern components seem to have very steep slopes of −3.0 ± 0.9, −3.3 ± 0.6, −3.4 ± 0.8, respectively. However, the uncertainties are large.
Interestingly, there seems to exist a slight tension between the JF 125W in the data used by us (CANDELS) and that obtained by Ouchi et al. (2013) with 24.71 ± 0.13 and 24.99 ± 0.08, respectively. This corresponds to a difference of about 1.8 σ. Consequently, they infer a less steep slope of β = −2.00 ± 0.57. While Jiang et al. (2013) have not derived magnitudes for the three individual sources, their total magnitude is with 24.61 ± 0.08 deviating even more. However, they have been using SExtractor MAG-AUTO measurements. Therefore, the comparison between their values and those derived by Ouchi et al. (2013) and us should be treated with caution. On the other hand, the ground based UKIDSS J magnitude is with 25.09±0.26 closer to the value obtained by Ouchi et al. (2013).
The greatest difference between the CANDELS data and that from Ouchi et al. (2013) is in the central component. They measure from their data 27.03 ± 0.07 for JF 125W and we derive 26.66 ± 0.10. A subjective visual inspection of the Jiang et al. (2013) data seems to rather confirm the relatively blue colour in the central blob.
For all stated coordinates, we use the world coordinate system as defined in the CANDELS JF 125W image (J2000). We find that this coordinate system is slightly offset w.r.t to the coordinate system used by Ouchi et al. (2009). Their coordinate 10 corresponds to RA, DEC = +2 h 17 ′ 57 ′′ .581, −5 • 08 ′ 44 ′′ .72 in the CANDELS JF 125W astrometric system. The UKIDSS data appears within the uncertainties well matched to the CANDELS astrometry. Therefore, we do not apply any correction here. Fig. 1 (left) includes a R,G,B composite using HF 160W , JF 125W (both CANDELS), and N B921, respectively. Cutouts around the position of Himiko for all used images are shown in Fig. 4.

Spatial flux distribution and slit losses
The different observing blocks (OBs) taken with different position angles allow to compare the spatial extent of Ly α along the slit to the N B921 image for several orientations. . Ly α spatial profiles for three individual example OBs (2, 8 and 10) and for the complete stack of all OBs. Included are profiles directly extracted from the spectrum over the wavelength range from 9227 to 9250Å and profiles extracted from images under the assumption of the 1. ′′ 5 slit. The latter is shown both for the NB921 image and a J F 125W image, which was corrected to the seeing of the individual spectroscopic OBs. Finally, for the stack the expected continuum profile is also shown for the 0. ′′ 9 slit. The legend is split between the different panels, but applies to all four sub-panels.
To do so, we extracted for each OB a Ly α spatial profile averaged over the wavelength range from 9227 to 9250Å, and replicated this for the N B921 image by assuming an aperture with the shape of the 1. ′′ 5 slit and calculating a running mean over pseudo-spatial bins of 0. ′′ 2. The 0. ′′ 8 FWHM seeing of the N B921 image (Ouchi et al. 2009) is slightly larger than that expected in the different OBs of our spectral data (cf. Table. 2), but within the uncertainties comparable.
We quantified the centroid and spatial width of Ly α in each of the individual OBs by fitting Gaussians to the spatial profiles. Resulting values and the integrated Ly α flux over the same range are stated in Table 2. The centroids imply an excellent pointing accuracy.
For comparison, the expected spatial profile for the continuum within the slit was estimated based on fake JF 125W images (cf. Section 2.3), convolved with the estimated seeing for each OB. All three profiles are shown for three example OBs (2, 8, and 10) in Fig. 5.
The profiles extracted from the spectrum are as expected in good agreement with the N B921 image. For those OBs, like OB8, which are nearly perpendicular to the alignment of the three individual sources, the Ly α profile is significantly more extended than the estimated continuum profile. The main excess in the emission seems to be towards the north. On the other hand, for those OBs like OB10, with the slit aligned closer to the east-west direction, there seems to be an offset of the continuum towards the west. This indicates Ly α emission more concentrated on the eastern parts, in agreement with the F098M/WFC3 observations of Ouchi et al. (2013). We note that the conclusions would not change when assuming different seeing values within ±0. ′′ 2 of the values taken from the header.
Finally, the spatial profile of the combined Ly α stack is shown in the lower right panel of Fig. 5. The shown N B921 and continuum profiles have been calculated as the average of all contributing frames with their respective position angles. In this panel, we are showing in addition the profile for the continuum as expected in the 0. ′′ 9 slit. Noteworthy, we expect the continuum offset by 0. ′′ 2 towards positive slit directions w.r.t to the Ly α centroid.
Slit losses both for Ly α and the continuum and both for the 0. ′′ 9 and the 1. ′′ 5 slits were calculated based on the NB921 or the seeing convolved fake JF 125W image by determining the flux fraction within slit-like extraction boxes. Identical to the profile determination, we treated the individual OBs separately, and simulated the stack, by combining the slit losses for the individual OBs weighted with the appropriate exposure times.
Assuming the 1. ′′ 5 VIS arm slit and an extraction width of 4 ′′ along the slit, being close enough to no loss in slitdirection, results in a slit loss factor of 0.7 (0.4 mag) for Ly α based on the NB921 image.
We determined the optimal extraction-mask size for the expected continuum distribution under the assumption of background limited noise. We did so by maximising the ratio between enclosed flux and the square root of the included pixels. The maximum value is reached in the 1. ′′ 5 VIS slit between 7 and 9 pixels, when keeping the centre of the extraction mask at the formal centre of the slit. For an eight pixel (1. ′′ 28) extraction width, we derive a slit loss factor of 0.74 ± 0.09 (0.32 ± 0.13 mag). The stated uncertainties are resulting from the assumed uncertainty on the seeing.
The spatial distribution for possible He II or metal line emission is not known. Therefore, we consider hypothetically both the cases that it is co-aligned with the continuum or with the Ly α emission. Assuming the two distributions, we obtain optimal extraction widths in the 0. ′′ 9 slit with around six and eight pixels (1. ′′ 26 and 1. ′′ .68), respectively, fixing the trace centre at the formal pointing position in both cases. As a compromise, we assumed for the default extraction a width of seven pixels, corresponding in the continuum and the N B921 case to slit loss factors of 0.53 +0.10 −0.05 (0.7 +0.11 −0.09 mag) and 0.4 (1.0 mag), respectively. Certainly, alternative scenarios are possible which could lead to lower or higher slit losses, e.g. if line emission would be originating mainly from the central or the western-most source, respectively.

Spectral continuum
We made the continuum hidden in the noise of the rectified full resolution spectrum visible by strongly binning the telluric corrected 2D spectrum in wavelength direction from an initial pixel scale of 0.4Å pixel −1 , as produced with the pipeline, to a pixel scale of 11.2Å pixel −1 . Instead of taking a simple mean of the pixels contributing to the new wavelengths bins, we calculated an inverse variance weighted mean, with the variances taken from the pipeline's errorspectrum. This allows to obtain a low resolution spectrum with relatively high S/N in a region strongly affected by telluric emission or absorption. One caveat with this approach is that the flux is not correctly conserved for wavelength ranges where both the flux density and the noise changes quickly. This is the case at the blue side of the Ly α line (cf. Fig. 7). Therefore, we used for our binned spectrum a simple mean in the region including Ly α.
A faint continuum is clearly visible in the resulting spectrum redwards of Ly α and due to IGM scattering not blue-   wards, as expected for Himiko's redshift (Fig. 6). A similar effort in the NIR did not reveal an unambiguous continuum detection.
The detection allows to directly determine the continuum flux density close to Ly α. For the interval from 9400 to 9750Å (1238 to 1285Å rest-frame), chosen to be in a region with comparably low noise in our optimally rebinned spectrum, we get with the 1. ′′ 28 extraction mask a flux density of 10.1 ± 1.4 × 10 −20 erg s −1 cm −2Å −1 (cf. Fig. 6). We used in the calculation a κ-σ clipping with a κ = 2.5, rejecting one spectral bin.
The determined flux density is equivalent to an observed magnitude of 25.18 ± 0.15, or after applying the aperture correction of 0.32 mag, of 24.85 ± 0.15, corresponding to a rest-frame absolute magnitude of M1262;AB = −21.99±0.15. The magnitude is slightly fainter than the the CANDELS JF 125W measurement (24.71 ± 0.13) and slightly brighter than the HF 160W magnitude (24.93 ± 0.15), but within the errors consistent with both of them.  (2011) [LA11] at this redshift is plotted. As a test, we have applied the median transmittance to the Gaussian fit, shown as red dashed line. Finally, the magenta Gaussian in the left indicates the instrumental resolution.

Ly α
The final 2D VIS spectrum in the wavelength region around Ly α is shown in Fig. 7, from which we extracted the 1D Ly α spectrum. This was done in an optimal way (Horne 1986), using a Gaussian fit to the spatial profile. The resulting 1D spectrum is plotted in Fig. 8. As a sanity check, we compared this result to extractions based on simple apertures of different widths. Large enough apertures converged within the errors to the result from the optimal extraction. We determined several characteristic parameters of the directly measured Ly α line. All stated errors are the 68 per cent confidence intervals around the directly measured value based on 10000 MC random realisations of the spectrum using the error spectrum. It needs to be noted that this approach overestimates the uncertainties, as the noise is added twice, once in the actual random process of the observation and once through the simulated perturbations. This means that the resulting perturbed spectra are effectively representations for a spectrum containing only half the exposure time. Yet, the values based on this simple approach allow to get a sufficient idea of the accuracy of the determined parameters.
The pixel with the maximum flux density is at a λ vac;hel of 9233.5 +1.2 −0.4Å . This corresponds to a redshift of 6.5953 +0.0010 −0.0003 . 11 The peak Ly α redshift, z peak , is due to Ly α radiative transfer effects likely different from the systemic redshift of the ionising source (cf. sec Furthermore, we calculated the skewness parameters S and Sw (Kashikawa et al. 2006). For a wavelength range from 9227 to 9250Å we get values of 0.69 ± 0.07 and 12.4 +1.4 −1.8Å for S and Sw, respectively. This is again consistent with the result obtained by Ouchi et al. (2009): S = 0.685±0.007 and Sw = 13.2±0.1Å. Using an increased wavelength range from 9220 to 9260Å, we measure values of 0.9 ± 0.2 and 16 ± 4Å for the two quantities, being consistent with the results for the smaller range, but possibly indicating a somewhat larger value. Indeed, the spectrum might show some Ly α emission at high velocities (cf. Fig. 7). However, this is weak enough to be due to skyline residuals and we do not discuss it further.
Integrating the extracted spectrum over the wavelength range from 9227Å to 9250Å, the obtained flux is 6.1 ± 0.3 × 10 −17 erg s −1 cm −2 . After subtracting the continuum level, the Ly α flux is 5.9 ± 0.3 × 10 −17 erg s −1 cm −2 , or 8.8±0.5×10 −17 erg s −1 cm −2 after correcting for the slit loss factor of 0.67, as derived in section 3.1. This corresponds to a luminosity of 4.3±0.2×10 43 erg s −1 . By comparison, Ouchi et al. (2009) derive a fLy α of 7.9 ± 0.2 × 10 −17 erg s −1 cm −2 and 11.2±3.6×10 −17 erg s −1 cm −2 from the z ′ /N B921 photometry and their slit loss corrected Magellan/IMACS spectrum, respectively. This is in good agreement with our result, considering that the slit loss as calculated from the N B921 image will only be approximately correct, as the seeing in our observation is not known with certainty (cf. sec. 3.1).
From the continuum flux measured in the spectrum and the measured Ly α flux, we derive a Ly α (rest-frame) equivalent width (EW0) of 65 ± 9Å, nearly identical to the 78 +8 −6Å stated by Ouchi et al. (2013). Jiang et al. (2013) have in their recent study derived a Ly α EW0 of only 22.9Å for Himiko. The explanation for this discrepancy is that they have fitted a fixed UV slope based on their relatively blue JF 125W -H160W and extrapolated this slope to the position of Ly α. However, the continuum magnitude derived from our spectrum is not consistent with this assumption.

Detection limits for rest-frame far-UV lines
As we do not unambiguously detect any of the potentially expected emission lines except Ly α, we focus on determining accurate detection limits. The instrumental resolution of 55 km s −1 allows to resolve the considered lines for expected line widths. Therefore, detection limits depend strongly both on the line width, and, due to the large number of skylines of different strengths, on the exact systemic redshift. As the systemic redshift is not known exactly from the Ly α profile alone and the search for [C II] by Ouchi et al. (2013) resulted in a non-detection, too, all limits need to be determined over a reasonable wavelength range.
Studies to determine velocity offsets between the systemic redshift and Ly α for LAEs around z ∼ 2-3 find Ly α offsets towards the red ranging between 100 km s −1 and 350 km s −1 (e.g. McLinden et al. 2011;Hashimoto et al. 2013;Guaita et al. 2013;Chonis et al. 2013), while the typical velocity offset for z ∼ 3 LBGs is with about 450 km s −1 higher ). On the other hand, in LABs also small negative offsets (blueshifted Ly α) have been observed (McLinden et al. 2013).
Whereas offsets at the redshifts of the aforementioned studies are produced by dynamics and properties of the interstellar (ISM) and circumgalactic medium (CGM), at the redshift of Himiko an apparent offset can also be produced by a partially neutral IGM (cf. sec. 4.3).
Motivated by the lower redshift studies, we formally searched for emission from the relevant rest-UV lines for −1000 km s −1 < ∆v < 1000 km s −1 from peak Ly α. We calculated for this range the noise, σm, for extraction boxes with widths up to 1000 km s −1 . The boxes' height in spatial direction was in all cases corresponding to the 1. ′′ 47 trace. .
applied when stating detection limits (2) Here, σi,j are the noise values for the individual pixels from the pipeline's error model. Skylines, and bad and high-noise pixels, determined as described in sec. 2.2, were excluded in the sum, effectively leaving a certain fraction f included of a box's pixels. The signal-to-noise (S/N) was obtained by dividing the spectrum's flux integrated over the same nonexcluded pixels by the determined noise. When stating detection limits, we rescaled the calculated noise by the inverse of the fraction of included pixels (cf. eq. 2).
In the case of the N V λλ1238, 1242, C IV λλ1548, 1551, and [C III], CIII] λλ1907, 1909 doublets we formally calculated values jointly in two boxes centred on the wavelengths of the two components for a given redshift. Wide extraction boxes merge into a single box. Where we state wavelengths instead of redshift or velocity offset, we refer to the centralwavelength of the expected 'blend'. Widths are stated for the individual boxes, meaning that the effective box widths are larger.
While we test over a wide parameter space, we refer in the following several times to somewhat arbitrary fiducial detection limits based on a narrow 200 km s −1 and a wider 600 km s −1 extraction box, assuming a systemic redshift 250 km s −1 bluewards of peak Ly α, which is as mentioned above a typical value for LAEs. This redshift is also marked in several plots throughout the paper as green dashed line.
We took account for the continuum by removing an estimated continuum directly in the telluric corrected rectified 2D frame. The assumed spatial profiles in cross dispersion direction for the 0. ′′ 9 NIR and 1. ′′ 5 VIS slits are those estimated from the seeing-convolved HST imaging (cf. sec. 3.1). While we assumed for the region around N V λ1240 in the VIS arm a continuum with f λ (λ) = 10.2 × 10 −20 erg s −1 cm −2Å −1 , being the flux measured directly from the spectrum as described in sec. 3.3 and corrected for aperture loss, we were using for the NIR spectrum a f λ (λ) = 3 × 10 −20 erg s −1 cm −2Å −1 within the slit at the effective wavelength of H160W and a spectral slope of β = −2.
The used NIR flux density is close to that of the measured HF 160W . Due do the difference between our JF 125W and the measurement based on UKIDSS J and the JF 125W by Ouchi et al. (2013), we decided for the conservative option 12 not to follow the profile shape seen by our data and use a continuum flat in fν instead, even so the corresponding β = −2 is only at the upper end of the uncertainty range allowed from the measurement in our work (cf. also sec. 2.3).
In Table 5, 5 σ detection limits, extracted fluxes, and the fraction of non-rejected pixels are stated all for N V, C IV, He II, and C III] in three different extraction apertures. The spectra for the relevant regions are shown in Fig. 9.

He II
The production of He II λ1640Å photons, which originate like Balmer-α in H I from the transition n = 3 → 2, requires excitation at least to the n = 3 level or ionisation with a following recombination cascade. Whereas the ionisation of neutral hydrogen, H I, requires only 13.6 eV, a very high ionisation energy of 54.4 eV is necessary to ionise He II and consequently only very 'hard' spectra can photo-ionise a significant amount. Such hard spectra can be provided by AGNs, having a power-law SED with significant flux extending to energies beyond the Lyman-limit, nearly or completely metal-free very young stellar populations with an  Figure 11. Signal-to-Noise ratio (S/N) measured within extraction boxes equivalent to those included in Fig. 10. No continuum and telluric correction was applied for this plot. Maximum values are reached for boxes with λ hel;vac = 12453Å with a velocity width of 430 km s −1 (cyan hexagon) and λ hel;vac = 12447Å with a velocity width of 100 km s −1 (magenta triangle), respectively. Table 6. 2D spectrum and fake-source analysis for a wavelength range corresponding to the expected He II wavelength. In the row observed, the 2D spectrum in the region around the possible He II excess is shown. The grayscale extends linearly between ±1 × 10 −19 erg s −1 cm −2Å −1 (white < black). Above a Gaussian smoothed version is shown with the same scale, at the top of which a sky-spectrum for the same region is included. Both the initial mass function extending to very high masses, or the radiation emitted by a shock. In Fig. 10, the 5 σ noise determined as described in sec. 3.4 is shown for the relevant central-wavelength/box-width space around the expected He II position and the S/N calculated over the same parameter space is shown in Fig. 11. While there might be indication of some excess at velocities between −450 and 0 km s −1 , we do not find sufficient signal to claim a detection. The two masks giving the highest S/N 13 are a wider one with a width of 430 km s −1 at a central wavelength of 12453Å (z = 6.591, ∆v = −160 km s −1 ) and a narrower one with a width of 100 km s −1 at a central wavelength of 12447Å (z = 6.588, ∆v = −300 km s −1 ). The S/N in the two cases is after continuum subtraction 1.9 and 1.7, respectively. A somewhat higher S/N can be reached by using a narrower trace more centred on the position of the expected continuum.
The 2D spectrum over the relevant velocity offset range is shown in Table 6 in the row labelled 'observed'. The position of the default trace is indicated as magenta dashed lines in the smoothed figure, which is identical in the two columns, while the two extraction boxes giving the highest S/N are indicated by cyan vertical lines in the two columns, respectively. Additionally, a sky-spectrum over the same region and a Gaussian smoothed version are included. In the Gaussian smoothing, we excluded pixels being masked in our master high-noise pixel mask. For a guided eye, it might be possible to identify the excess visually. Yet, it is certainly possible that noise is seen and it cannot be considered a detection. It is noteworthy that there is a triplet of weak skylines in the centre of the region. While these skylines should in principle not increase the noise by much, as also being consistent with the error spectrum, this would assume an ideal sky subtraction. As can be seen in the figure, there are however some unavoidable residuals, extending over the complete slit.
We tried to understand which line flux would be required for an excess to be considered visually a safe detections. We did this by adding Gaussians with FWHMs of 100 km s −1 and 400 km s −1 centred on the wavelengths of the two extraction boxes leading to the highest S/N and assuming a spatial profile as expected for the continuum. The results are shown in Table 6. After visual inspection of four authors, we concluded that an additional flux of 2 × 10 −18 and 4 × 10 −18 erg s −1 cm −2 would be required in the two cases for a detection considered to be safe, corresponding as expected to approximately 5σ detections.
Finally, we estimated the 3σ upper limit on the He II EW0 in our fiducial 200 km s −1 extraction box, using the same continuum estimate as used for the continuum subtraction. This is a continuum flux density at the He II wavelength of 4.0 ± 0.5 × 10 −20 erg s −1 cm −2Å −1 , assuming the appropriate slit loss. This results in an upper limit of the observed frame EW of 75 ± 10Å, which corresponds to a rest-frame EW0 of 9.8 ± 1.4Å. The errors are due to the uncertainty in HF 160W , not including the uncertainty in the continuum slope β. Direct 2d spectrum telluric corrected Noise map Gaussian convolution of 2d spectrum Figure 13. Region around the N V λ1240 doublet, as expected based on the Ly α redshift. Upper: Telluric corrected 2D spectrum, scaled between ± the minimum rms noise, σ min , in the shown region. Middle: Noise map linearily scaled between σ min and 3σ min (white < black). The high noise regions are both due to skylines and due to regions requiring telluric correction. Lower: Spectrum smoothed with a Gaussian kernel with a standard deviation of one pixel.

High ionisation metal lines
If Himiko's Ly α emission was powered either by a 'type II' or less likely by a 'type I' AGN, being disfavoured from the limited Ly α width, relatively strong C IV λ1549 emission would be expected. This would be accompanied by somewhat weaker N V λ1240, C III] λ1909, He II λ1640, and Si IV λ1400 emission lines (e.g. Vanden Berk et al. 2001;Hainline et al. 2011).
3.6.1 N V N V λ1240 is a doublet consisting of two lines at 1238.8Å and 1242.8Å, respectively. With their oscillator strength ratio of 2.0 : 1.0, the effective blend wavelength is 1240.2Å.
The spectrum around N V is shown in the left most panel of Fig. 9. The hatched wavelength ranges mark the regions for the two N V lines under the assumption of our fiducial 200 km s −1 wide box. Weak and blended skylines, which are not marked by our skyline algorithm, and telluric corrected absorption causes the noise to vary strongly over the region.
Using the extraction aperture as shown in Fig. 9 and subtracting the continuum in the 2D frame as described in sec. 3.4, we derive for the wider and narrower of our two fiducial boxes excesses of 0.7 and 0.5 σ, respectively, where we need to exclude a relatively high fraction of pixels due to high noise (cf. Table 11). The result is consistent with zero. Exploring the ∆v-width parameter space for the S/N (Fig.  12), velocity offset and box-width can be chosen in a way to get a higher S/N. E.g. boxes at +450 km s −1 with a width of 840 km s −1 per doublet component, corresponding to a single merged extraction box of 1324 km s −1 , have a S/N of 3.4 with an included fraction of 42%. However, such a relatively large offset towards the red from the Ly α redshift seems not feasible. Restricting the analysis to a more likely range of ∆v from -500 to 0 km s −1 , we would find a maximum S/N of 2.3 for box widths of 400 km s −1 at no velocity offset w.r.t to Ly α, with an included fraction in the box of 45%. The flux within non-excluded pixels is 1.5 × 10 −18 erg s −1 cm −2 . A visual inspection of the relevant region does not allow for the identification of any line (Fig. 13).

Other rest-frame UV emission lines
In Fig. 9 we are also showing cutouts for two further lines (C IV and C III]). The relevant wavelength range for the two components of C IV λλ1548, 1551, which have an oscillator strength ratio of 2.0:1.0, is located in a region of high atmospheric transmittance within the J band. While there are a few strong skylines, especially between the two components, there is enough nearly skyline-free region available. Visually, we do not see any indication for an excess. Also statistically, considering again the ∆v range from −500 to 0 km s −1 w.r.t to the Ly α peak redshift, we find a maximum excess of 1.4 σ after continuum subtraction.
By contrast, the C III] doublet is located between the J and H band, suffering from strong telluric absorption (cf. Fig. 9). Nevertheless, it could be possible to detect some signal in the gaps between absorption. Especially, the relevant part bluewards of the Ly α redshift has a relatively high transmission. Both the visual inspection and the formal analysis of the telluric and continuum corrected spectrum indicate no line. Detection limits for our fiducial boxes are stated in Table. 5.
Si IV λ1403 is also in a region of high atmospheric transmission. However, here the overall background noise in the spectrum is relatively high and as expected for this compared to C IV usually weak line, we do not see an excess.
Another line is N IV] λ1486, which has in rare cases been found relatively strong both in intermediate and high redshift galaxies (e.g., Christensen et al. 2012;Vanzella et al. 2010). Unfortunately, the spectrum is at the expected wavelength for N IV] covered with strong skylines (not shown).
The same holds for the [O III] λλ1661, 1666 doublet, which can be relatively strong in low mass galaxies undergoing a vigorous burst of star-formation (e.g. Erb Table 7. Parameters of the used SSPs, which are the Yggdrasil burst models from Zackrisson et al. (2001Zackrisson et al. ( , 2011 and the BC03 Bruzual & Charlot (2003)  any excess in the relevant wavelength range. As the region is in addition affected by several bad pixels, we do not state formal detection limits for this doublet.

SED fitting
We performed SED fitting including JF 125W , HF 160W , K, and IRAC1-4, in total seven filters, where for IRAC3-4 only upper limits are available. As UKIDSS K and the IRAC data are not resolving the three components and a profile fit is with the available S/N not feasible, we fitted the three sources jointly by using the aperture corrected 2 ′′ diameter photometry (Table 4). Throughout our SED fitting we fixed the redshift to z = 6.590, a reasonable guess for the systemic redshift based on the Ly α line.
We used our own python based SED fitting code CONIECTO, which allows both for a MCMC and a grid based analysis. Here, we derived our results with the grid based option. The code requires as input single age stellar populations (SSP) for a set of metallicities. Additionally, a precalculated nebular spectrum including continuum and line emission needs to be specified for each age-metallicity pair, and can be added to the respective SSP with a scale-factor between 0 and 1. This scale-factor can be understood as covering fraction, fcov, or 1 − f esc ion , with f esc ion being the escape fraction of the ionising continuum. The inclusion of nebular emission has been found crucial for the fitting of redshift z ∼ 6 − 7 LAEs and LBGs (e.g. Schaerer Figure 14. SED fitting results under the assumption of continuous SFH using the Yggdrasil models. Upper panel shows the lowest possible χ 2 at a given point in the age-E (B-V) plain. Four physical quantities for the best models are shown below. For orientation, the same χ 2 best contours from the top plot are replicated as dotted line in each of the subplots. The model with the absolute minimal χ 2 is marked as cross.
SEDs for a given SFH history are obtained by integrating the single age stellar populations. We were restricting our analysis to instantaneous bursts and continuous starformation. Due to the lack of deep enough rest-frame optical photometry in IRAC3-4, which would not be affected by strong emission lines at our object's redshift, it does not make sense to use more complicated SFHs. Already the used set of parameters allows for over-fitting of the models.
We applied reddening to the integrated SEDs using the Calzetti et al. (2000) extinction law, assuming the same reddening both for the nebular and the stellar emission, motivated by evidence for the validity of this assumption in the high redshift universe (Erb et al. 2006).
As main input SSPs we used the models 'Yggdrasil' (Zackrisson et al. 2011), which include metallcities all from zero (Pop III) to solar (Z⊙ = 0.02). Their code consistently treats both nebular line and continuum emission (Zackrisson et al. 2001) by using Cloudy (Ferland et al. 1998) on top of stellar populations. We are here referring to SSPs as their publicly available instantaneous burst models. More details are summarized in Table 7.
For reasons of comparison to the SED fitting done by  Ouchi et al. (2009Ouchi et al. ( , 2013, we also obtained results with input models as used in these works. These are based on BC03 (Bruzual & Charlot 2003) models, where a nebular emission is calculated by the prescription presented in Ono et al. (2010). We are referring in the following to these models as BC03-O10.
After calculating synthetic magnitudes on a large age-E(B-V)-Z-fcov grid, we have minimized χ 2 w.r.t to mass for each parameter set. The grid covered ages between 0-800 Myr, with the upper limit being the age of the universe at z = 6.59, and E(B-V) from 0.00 to 0.45. Used metallicities were those available in the input models (Table 7) and for fcov we allowed for three different values (0,0.5,1.0).
In Fig. 14, the results are shown for continuous starformation using the Yggdrasil models. For each point in the age-E(B-V) space the Z-fcov tuple allowing for the minimal χ 2 , χ 2 best , is chosen. The resulting χ 2 best contours are indicated in all five subplots, with the five subplots showing χ 2 best , stellar mass, star-formation rate averaged over 100 Myr (SF R100) 14 , metallicity, and fcov, respectively. Stellar masses refer to the mass in stars at the point of observation.
The global best fit SED over the explored parameter space is a very young 3 +32 −2 Myr stellar population, which is strongly star-forming (SF R100 = 2.3 +26.2 −1.2 × 10 2 M⊙ yr −1 ) and has a stellar mass of 7 +32 −3 × 10 8 M⊙ with a χ 2 = 1.2. 15 This best fit model is shown as cross in the maps and as SED 14 If the age of the stellar population is smaller than 100 Myr, it is averaged over the population age. SF R 100 is identical to the instantaneous SFR for a constant SFH. 15 Not reduced χ 2 in Fig. 15. The uncertainties on the estimated parameters were determined based on the range of models having χ 2 < 6. A value of six can be understood as a good fit for seven filters.
Clearly, a large range of parameters is allowed. It is only for metallicity and reddening that relatively strong constraints can be inferred, with mainly Z = 0.2Z⊙ models giving good fits and some Z = 0.4Z⊙ models being allowed within χ 2 < 6. Models with E(B-V ) greater than 0.25 are unlikely. Further, at least partial nebular contribution is required. The wide range of solutions can be understood through the interplay of a somewhat evolved population with a 4000Å break and strong nebular emission, combined with small amounts of dust extinction.
The χ 2 best plots are compared for burst and continuous star-formation both using the Yggdrasil and the BC03-O10 model sets in Fig. 16. Burst models give acceptable fits only for very young ages, where they are basically identical to the continuous star-formation models. Both model sets give consistent results, with small differences mainly existing where specific metallicities are only available in one of the two sets. E.g., the area of relatively low χ 2 best at high ages for burst models using Yggdrasil is absent in the BC03-O10 models, as they require zero metallicity (Kroupa and log-normal IMF). However, these SEDs would require stellar masses in excess of 10 11 M⊙ and can hence be considered infeasible. In general, biases are possible due to discretised metallicities.
We also tested the impact of substituting our JF 125W measurement with that of Ouchi et al. (2013). The fainter JF 125W flux shifts the χ 2 best contours slightly towards higher reddening and hence higher required SFRs, or alternatively higher ages. Ouchi et al. (2013) have found from their SED fitting with BCO3-O10 models under the assumption of continuous SFR a best fit model with a mass of 1.5 +0.8 −0.2 × 10 9 M⊙, an age of 4.2 +0.8 −0.2 × 10 6 yr, and an E(B − V ) = 0.15 +0.04 −0.03 , consistent with our results. 16 If Himiko's extended Ly α emission was driven by merger activity between the three continuum bright objects, SFRs as high as a few 100 M⊙ yr −1 are expected, as seen e.g. in the simulations by Yajima, Li & Zhu (2013) focusing on LABs, and consistent with our best fit model. Importantly, a burst with the same properties as our best fit model having happened 100 Myr before the observed burst, would have faded in all detected filters by at least a factor 20. Hence, there could easily be a moderately young stellar population of a few 10 9 M⊙ without significantly contributing to the SED.
While the present work was under revision, Schaerer et al. (2015) also published results from SED fitting for Himiko. Most importantly, they made use of the upper limit on the far-infrared luminosity, which can be estimated from the ALMA upper limit for the 1.2mm (observed-frame) con-  Fig. 14. For comparison, the χ 2 best result using the BC03+O10 models and the case of burst SFH is shown. tinuum presented by Ouchi et al. (2013), in order to constrain the maximum allowed dust extinction.
Their obtained upper limit corresponds assuming a Calzetti et al. (2000) extinction law to E(B − V ) < 0.05 ± 0.03, with the uncertainty due to the unknown dust temperature assumed for the conversion from continuum flux density to FIR luminosity. Using this upper limit, a huge part of the allowed parameter space can be ruled out. Schaerer et al. (2015) could argue, using for the SED fitting the photometry of Ouchi et al. (2013), that a young and heavily star-forming solution is disvafoured. However, due to the somewhat bluer F 125W -F 160W measured from the CAN-DELS data compared to that measured in the data of Ouchi et al. (2013), our fit requires somewhat less extinction and hence very young models consistent with their upper limit can be found.

Further SED considerations
As shown in section 4.1, the observed broadband magnitudes are compatible with models over a wide age, mass, and reddening range, where the single stellar population fit is likely an oversimplification of the problem. When looking at the JF 125W -HF 160W colour of the individual components (cf .  Table 3), there exists a significant difference between the two more eastern and the western-most component, with the latter being about 0.3 mag redder. This can all indicate differing stellar populations, differing escape fractions of the ionising continuum, f esc ion , or differing amounts of dust. Interestingly, the JF 125W -HF 160W ∼ −0.3 ± 0.1 of the two eastern components, corresponding to a β = −3.3 ± 0.6, is somewhat difficult to explain with a Pop II when taking account for nebular emission under the assumption of low ionising continuum escape f ion esc (e.g. Bouwens et al. 2010). Only for complete escape of the ionising radiation, a population with a metallicity as in our best fit SED (Z = 0.2Z⊙) would produce a slope as steep as β = −3, whereas for f ion esc = 0 the steepest β is a about -2.5. By contrast, the best fit SED model requires a very low f ion esc to explain the IRAC magnitudes.
While each of the components does not differ more than 1.5σ from this β = −2.5, the fact that Ouchi et al. (2013) find at least for the two outer components similar results based on their independent data, increases the probability that the steep slope of Himiko-E is real. Certainly, possibilities exist to reconcile the blue colour of the individual components with the low escape fraction required by the SED model. E.g., there could be anisotropic ionising escape in our direction, or the ionising radiation could escape the star-forming regions and the nebular continuum emission is produced somewhat further out, making the nebular emission more extended than the stellar continuum.
Both the visual inspection of Fig. 4 and the spatial profiles in the individuals OBs (Fig. 5) indicate an offset of the overall continuum light w.r.t to the N B921 light distribution, which is dominated by the Ly α emission. The Ly α emission seems least strong around Himiko-W . This has been confirmed by the WFC3 / F098M imaging of Ouchi et al. (2013), who find the strongest Ly α emission originating from Himiko-E, which is with EW0 = 68 +14 −13 however not as high that it would put constraints on the escape fraction.
Two-photon, 2γ, continuum (Breit & Teller 1940), emitted by transitions between the 2s and 1s states of the hydrogen atom, is in the case of nebular emission powered by a central ionising source one among other mechanism contributing to the continuum. In the case of cooling radiation, where the hydrogen atoms are collisionally excited, it would be the sole continuum contribution, though (Dijkstra 2008). Our data point provided by the spectrophotometry for the rest-frame wavelength range from 1238 to 1285Å could compared to F 125W and F 160W resemble the typical 2γ dip close to Ly α. Using the frequency-depended emissivities as stated in Table 1 of Spitzer & Greenstein (1951), we calculate the expected JF 125W − HF 160W colour for the two photon continuum at z = 6.590. We find a value of JF 125W − HF 160W = −0.08. Therefore, it cannot be solely responsible for the found very blue JF 125W −HF 160W . In addition, we estimate for the peak flux density expected from cooling radiation based on the predictions of Dijkstra (2008) and the measured Ly α flux for Himiko a value of 28.3 mag. This is more than three magnitudes fainter than the measured flux. Therefore, 2γ emission from cooling radiation is unlikely to significantly contribute to the continuum.
Finally, we note that Ono et al. (2010) have favoured in their SED fitting for the composite of 91 N B921 selected LAEs at z = 6.6 a very young (∼ 1 Myr) and very low mass (∼ 10 8 M ⊙ ) model, with significant nebular contribution (f ion esc = 0.2). They have only included those objects, which are not detected individually in IRAC 1, 17 therefore excluding objects like Himiko. The IRAC 1 magnitude in their median stack is 26.6 mag. For Himiko, the total magnitude after the slightly uncertain aperture correction is 23.69 ± 0.09. Simplifying assuming that the flux in Himiko is equally dis-tributed between the three sources, each of them would have a magnitude of 24.9. This is only a factor five higher than the median stack. Therefore, the individual components are not as extremely different from the typical z = 6.6 Ly α emitters as the joint photometry suggests.

Lyman alpha profile
Due to resonant scattering the Ly α profile can be modified significantly both in the ISM/CGM and in the intergalactic medium (IGM). A shaping of the profile within the ISM/CGM is likely, with Ly α possibly entering the IGM with a typical double peaked profile observed for lower redshifts LAEs (e.g. Christensen et al. 2012;Krogager et al. 2013) as predicted by theory (e.g. Harrington 1973;Neufeld 1990;Verhamme, Schaerer & Maselli 2006;Laursen, Razoumov & Sommer-Larsen 2009). The blue or red peak is suppressed in the case of outflows or inflows, respectively.
On the other hand, Ly α could leave the CGM also with a nearly Gaussian profile, as seen in several cases for lower redshift LABs (e.g. Matsuda et al. 2006), and explainable theoretically by fluorescent Ly α emission in a fully ionised halo (e.g. Dijkstra, Haiman & Spaans 2006).
At z ≈ 6.5, the IGM is expected to always suppress the blue part of the Ly α line completely, while an extended damping wing might also suppress the red part to some extent, as for example found by Laursen, Sommer-Larsen & Razoumov (2011), in accordance with previous observational (e.g. Songaila 2004) and analytical results (e.g. Dijkstra, Lidz & Wyithe 2007). Therefore, even when leaving the CGM as a Gaussian, the line might be reprocessed to the observed shape through scattering in the IGM. We tested the feasibility of this scenario for Himiko.
Assuming that the red slope of the profile is the nearly unprocessed Gaussian, we fit as a first test a line to this part only, similar to the approach used by Matsuda et al. (2006). We always added to the Gaussian a continuum as measured from the spectrum redwards of Ly α. The wavelength interval used for the fit is shown in Fig. 8 and ranges from 9236.4 to 9250.0Å.
The results from the formal fit are a redshift of 6.589 +0.003 −0.013 , a line FWHM of 768 +317 −91 km s −1 corrected for instrumental resolution, and a line flux of 2.2 +8.1 −0.7 × 10 −16 erg s −1 cm −2 for this full Gaussian. The best fit flux value would correspond to a Ly α luminosity of 1.7 × 10 44 erg s −1 after slit loss correction, and is a factor 3.8 larger than the actually measured one.
The obtained F W HM is not unrealistically high. Matsuda et al. (2006) have found for LABs at z = 3.1 F W HM s of 500 km s −1 . These high values were also confirmed by simulations (Yajima, Li & Zhu 2013). For high redshift radio galaxies (HzRGs), the determined width would be still at the very lower limit (e.g. van Ojik et al. 1997).
Then, we tested whether the IGM absorption could indeed produce the observed Ly α profile of Himiko by fitting to a model where we apply the median IGM absorption curve of Laursen, Sommer-Larsen & Razoumov (2011) 18 to a Gaussian, assuming that the Gaussian's peak corresponds to the systemic redshift and convolving the result with the 18 reionization starting at z = 10 instrumental resolution (Fig. 8). For the fit of this combined model, we used a wavelength range including the full profile, ranging from 9220.3 to 9250.0Å.
Under these assumptions the best fit underlying Gaussian has a redshift of 6.59052 +0.00006 −0.00004 , a line FWHM of 702 +12 −13 km s −1 , and a line flux of 1.76 +0.02 −0.02 × 10 −16 erg s −1 cm −2 , where the uncertainties do not take account for the range of possible IGM absorption curves. The result, which is shown in Fig. 8 both before and after applying the median IGM absorption curve of Laursen, Sommer-Larsen & Razoumov (2011), is surprisingly close to the observed profile save some small discrepancy at the peak. This demonstrates that IGM absorption alone is a valid option for shaping the Ly α profile of Himiko.
Summing it up, it is clear that from the observed Ly α shape alone little can be concluded about the full profile before entering the IGM. Therefore, the fraction of Ly α scattered out of the line of sight by the IGM is highly uncertain and upper limits on the flux ratios between rest-frame far-UV lines and those of Ly α, as required for the discussion in sec. 4.4, need to be treated with caution.
The appropriate loss for Ly α can range from nearly zero, as would be the case for a single red peak produced by scattering of Ly α at an expanding optically thick sphere, to an enormous fraction in the case that we are only seeing a strongly suppressed red peak resulting from an infalling medium. Assuming the Gaussian as used for the model shown in Fig. 8 and comparing the flux to the actually measured one, about 38 per cent would pass the IGM.
As a compromise we state in the following upper limits assuming 50 per cent flux loss in the IGM or alternatively, and more conservatively w.r.t to upper limits, zero flux loss to the IGM. In addition to the IGM absorption, the Ly α emission might also be reduced by dust, even so Himiko has been constrained to be very dust poor (Schaerer et al. 2015).

Implications from upper limits on rest-frame far-UV lines
As already outlined in the introduction, there is a range of possible mechanisms proposed to explain the extended emission around LABs. The most popular are photo-ionisation either by a starburst, a (hidden) AGN, or radiative shocks caused by a burst of Sn IIe following the onset of starformation. Alternatively, the powering mechanism could be gravitational cooling radiation. Possibly, a contribution from several mechanisms is jointly powering Himiko. While our upper limit measurements are not suited to identify minor contributions, we can compare them to the expectations for the different mechanisms assuming these as dominating. For reference, measurements or upper limits on N V, C IV, He II, and C III] from the literature for a selection of composite spectra and interesting individual LABs and LAEs are listed in Table 8 and compared to the upper limits obtained by us for Himiko. The stated values are normalised by the respective Ly α fluxes.
As Ly α and the other lines might have different spatial extent, we calculated the ratio between these lines and Ly α for Himiko in two different ways. In both cases we converted before taking the ratio both Ly α and the upper limits to aperture corrected values, but while assuming in the first case that potential emission is as extended as Ly α, we as- sumed in the second case that the emission from other lines is co-aligned with the continuum. The values for the first case are identical to those one would obtain from taking the ratios with Ly α measured in the same apertures 19 as used for the determination of the line flux upper limits.

Stellar population
The range of SED models which give reasonable good fits allows for SF R100 ranging from about 100 M⊙ yr −1 to extreme 2600 M⊙ yr −1 . The Ly α fluxes and rest frame equivalent widths for these models are between 6 × 10 −17 and 37 × 10 −17 erg s −1 cm −2 and between 72 and 296Å. These values were extracted from the Yggdrasil SEDs by using the same approach as for the observation, meaning a continuum measurement at a rest-frame wavelength of 1262Å. Further, the same extinction law as for the rest of the galaxy was assumed for Ly α. These values mean that Himiko's observed Ly α flux and EW0 can easily be accounted for by the strong Pop II star-formation. Even a relatively strong IGM correction or destruction of Ly α by dust in the ISM would not pose a problem. Still, an interesting question is whether in this heavily star-forming galaxy Pop III star-formation might be ongoing 800 Myr after the Big Bang or whether our upper limit for He II allows to rule it out. The comparison of a measured He II EW 0 to theoretical models can put very strong constraints on the allowed IMF-metallicity-age parameter space. Combining the information about He II with an intrinsic Ly α EW 0 would allow to tighten the constraints 19 Meaning same slit-width and extraction aperture height. even further. However, Ly α's susceptibility to resonance effects weakens its usefulness for this purpose. He II is not affected by this problem. As we do not detect He II, we can only test whether we would expect for Pop III star-formation He II flux above our non detection limit. 20 Using the EW 0 predictions for He II and Ly α calculated by Raiter, Schaerer & Fosbury (2010), and following the approach by Kashikawa et al. (2012) for the Lyman-α emitter SDF-LEW-1, we applied this tool to Himiko. In Fig. 17, the predictions are shown for two different initial mass functions (IMFs) and six different metallicities ranging from zero to solar, and assuming constant star-formation. In addition, both for zero metallicity and Z = 0.004 = 0.2Z⊙ values are included for burst models. Both IMFs, Salpeter and B, are power law IMFs with Salpeter (1955) slope of α = 2.35, however differing in the mass-range with 1-100M ⊙ and 1-500M ⊙ , respectively.
In the figure, both the directly observed Ly α EW0 and the one obtained for the Gaussian fit to the red wing (sec. 4.3) are shown as horizontal lines. The range of allowed EW0's from the SED fitting is indicated as shaded area. For He II, the 3σ upper limit for our fiducial 200 km s −1 box is represented as shaded area.
Assuming that the intrinsic Ly α EW0 is the directly measured one, even a solar metallicity (Z⊙ = 0.02) population with standard Salpeter IMF and continuous starformation would be independent of the age above the observed EW0. The approximate correctness of the red wing fit would allow for significantly stronger constraints. Then, basically only Pop III and very metal poor models up to Z = 10 −5 would be allowed for continuous ages larger than about 10 Myr. Populations with higher metallicity would need to be younger. As indicated by our SED fitting, young ages are certainly possible, and very low metallicities are even in the case of strong IGM absorption not necessarily required.
For a Salpeter IMF with 1-100M ⊙ , only an extremely young (< 1 Myr) metal free population could produce a He II EW 0 above the 3σ detection limit of 9.8Å, independent of IMF, while for the high mass IMF B He II flux would be detectable over a longer period at least for a Pop III. Here, it is important to note that detectable He II would be accompanied by very high Ly α EW0. Summing it up, it is only a very young Pop III with an IMF producing very heavy stars, which can be ruled out based on the He II upper limit.
C III], [O III] λλ1661, 1666, and at least in some cases also C IV emission, created by photo-ionisation in the H II regions surrounding young high-mass stars, is now understood to be relatively common in low mass galaxies with high specific star-formation rates and low metallicities (e.g. Fosbury et al. 2003;Erb et al. 2010;Christensen et al. 2012;Stark et al. 2014). While not as strong as in AGNs, the typically strongest of these lines is C III], which has been found with EW0 up to ∼ 15Å (Stark et al. 2014). 21 A detection of this line seems to be correlated with high Ly α EW, with the correlation probably being a result of both line strengths depending on metallicity (Stark et al. 2014).
Our 3 σ upper limit for C III] EW0 is due to the lines' location between J and H band with EW0 < 23Å for the 200 km s −1 extraction box larger than the observed values. Only if the lines were close to unresolved and the velocity offset with respect to peak Ly α would allow for relatively high transmittance (cf. Fig. 9), a detection would have been feasible. Similar, one or both of the [O III] λλ1661, 1666 lines, which can be almost as strong as C III], would be only for very specific velocity offsets w.r.t to Ly α within the skylines gaps. Therefore, little can be concluded from these nondetections about the presence of a substantial population of young massive stars, which in principle could help to break the degeneracy in the SED fitting.

Illumination by (hidden) AGN
Quasars are well known to be responsible for strong Ly α emission in so called extended emission line regions, EELRs, spreading in some cases over several 100 kpc. When being switched on during the phase of cold gas accretion, Haiman & Rees (2001) predicted this emission to originate from the illumination of primordial gas. However, at least at lower redshifts, the illuminated gas is more likely ejected galaxy material driven out by either SNe or the jets of the AGN itself (e.g. Villar-Martín 2007), resulting in the illumination of metal enriched gas and hence a multitude of emission lines.
With Himiko's three spatially distinct components all located within about 6 kpc transverse distance and no strong evidence for much difference in the line of sight direction, they are likely in the process of merging and hence triggered AGN activity is not unlikely. Further, Ly α emission powered by AGNs is a common and expected phenomenon. When considering AGNs, it can be useful both to subdivide between radio-quiet and radio-loud (e.g. McCarthy 1993), and between obscured ('type II') and non-obscured ('type I') by a dusty torus. Extended emission has been found for all cases (e.g. Matsuoka 2012). In the following we are discussing all four options for Himiko, especially w.r.t to our non-detection limits.
As discussed by Ouchi et al. (2009), neither of the existing X-ray (XMM-Newton), 24µm (Spitzer MIPS), and submm (850µm SCUBA) data is deep enough to rule out a type I quasar, when scaling the Elvis et al. (1994) quasar template to the rest-frame VIS IRAC1 flux. However, there is a strong argument against a type I AGN in the Ly α width, which is even after accounting for IGM absorption too narrow to be originating from a type I AGN (cf. sec. 4.3). Another argument against a quasar, at least for the two eastern components, is the continuum slope. Davis, Woo & Blaes (2007) find for SDSS quasars a mean slope of β = −1.63 22 and very few objects with slopes bluer than β = −2.0 for the wave-length range between 1450Å and 2200Å. Although the redshifts of this sample is limited to 1.67 z 2.09, we do itational lensing between illuminated gas and illuminating stars (e.g. Villar-Martín, Cerviño & González Delgado 2004). 22 Converted from the ν α as stated in Davis, Woo & Blaes (2007) to λ β not expected the quasar spectra to be much bluer for higher redshifts. Therefore, it is unlikely that the two eastern components are dominated by direct continuum emission from the accretion disk. Finally, our 3 σ upper limit on C IV/Ly α of 0.13, using the 600 km s −1 aperture and assuming a crude IGM correction of a factor two is two times below the typical value for type I quasars in the composite spectrum of Vanden Berk et al. (2001). Certainly, this comparison falls somewhat short, as a C IV width of 600 km s −1 is too low for a type I quasar. Much higher widths are not justifiable based on our observed Ly α line width.
By contrast, Himiko's Ly α width is an option for the extended emission line regions around quasars, may it be either a powerful high redshift radio galaxy (HzRG) or a radio quiet type II quasar. While typical Ly α line widths of ∼ 1000 km s −1 seem somewhat high compared to Himiko's measured one, it might be possible to explain such a width under the assumption of strong IGM absorption (cf. sec. 4.3). Further, it is known that HzRGs with very relaxed kinematics exist. MRC 0140-257 has over the complete nebula Ly α F W HM 500 km s −1 , with very narrow emission in the peaks of Ly α surface brightness (F W HM ∼ 250 km s −1 ). Also, while the extended Ly α emission around HzRGs is usually spatially kinematically disturbed by several 100 km s −1 , MRC 0140-257 has comparable low velocity offsets. Himiko would need to be similar to this rather special object, as we see in our stacked spectrum no evidence for strong velocity gradients (cf. both Fig. 6 and 7). We remark that velocity gradients could be smoothed out in our stack, as it is a superposition of observations with different slit angles. However, also in the spectrum of Ouchi et al. (2009), who observed with a fixed slit position, only a small Ly α velocity gradient of 60 km s −1 is found over the east-west direction.
Comparing the rest-frame UV and optical photometry, we find that the magnitudes measured for Himiko are indeed values reasonable for those expected from HzRGs, when taking the strong and extended emission in the EELRs into account. Assuming a flat continuum in fν and HF125W (24.93 mag) as normalisation, we can estimate the flux due to emission lines in IRAC1 and IRAC2 to 14.8 × 10 −17 erg s −1 cm 2 and 4.8 × 10 −17 erg s −1 cm −2 , respectively, corresponding to a ratio of 3.1. Summing up the relevant line fluxes in the HzRG composite spectrum of Humphrey et al. (2008), a typical emission line flux ratio of 3.0 between IRAC1 and IRAC2 would be obtained, 23 consistent with that measured for Himiko. Assuming that about 70 per cent of the line flux in IRAC1 is due to [O III] λ5007, the [O III] λ5007 rest-frame EW0 would be 1400Å. Such a value is not unrealistic for HzRGs (e.g. Iwamuro et al. 2003). Finally, based on the [OIII] flux estimated above, Himiko's Ly α flux would be using the Humphrey et al. (2008) average ratio expected to be a factor 1.7 higher than the actually observed one. This factor is easily within the uncertainty of the IGM absorption correction.
While this similarity in line ratios is interesting, it is not entirely surprising when considering our best fit stel-lar SED model. Humphrey et al. (2008) conclude, that the line emission in these HzRGs is not mainly powered by the interaction of the radio lobes with the gas, but rather by photo-ionisation, and that the illuminated gas has typical metallicities of Z = 0.2Z⊙.
A means to discern between an AGN or a young stellar population as ionising source would be through the strength and ratios of rest-frame far-UV emission lines. The ratios between these lines and Ly α are in general expected to be stronger, when being powered by an AGN.
A complicating factor when interpreting non-detection limits is the extent of the possible line emission. Extended Ly α emission could be a consequence of resonant scattering or of in situ production by ionising radiation escaping the ISM. In the first case other emission lines would be almost co-aligned with the UV continuum, while in the latter case they could be spatially extended as the Ly α emission (e.g. Prescott et al. 2015), resulting for our object in a larger slit loss.
The typical ratio between C IV and Ly α in the Humphrey et al. (2008) composite is with 0.15 similar to the 3σ upper limit for our target, when assuming the 600 km s −1 extraction box and a crude Ly α IGM correction of a factor two, which corresponds to C IV/Ly α < 0.13 and < 0.18 assuming a spatial extent similar to the continuum or the Ly α emission, respectively. Also, the typical He II/Ly α of 0.09 would be detected with at least 3σ even for He II emission as extended as Ly α, assuming again the crude IGM correction and the wider extraction box. The advantage of He II compared to C IV is that it would be present even when assuming primordial composition of the illuminated gas.
Another problem with the interpretation of Himiko as a HzRG is the available 100 µJy 1.4 GHz VLA upper limit (Simpson et al. 2006). Assuming example radio spectral indices of −0.5, −1.0, and −1.5, with ultra steep slopes more typical for high-z galaxies, this corresponds to L1.4GHz of 1.8 × 10 25 , 5.0 × 10 25 , 1.4 × 10 26 W Hz −1 . Converted to 5 GHz the upper limit would be around 1-2×10 25 W Hz −1 and hence at the very lower limit of what would still be considered a radio-loud galaxy according to the classification by Miller, Peacock & Mead (1990). On the other hand, if the main ionising source for the EELRs is photo-ionisation by the quasar and not the radio jets, the non detection in the radio data is not ruling out the possibility of a hidden AGN as powering source. Indeed, EELRs have been found around the less known population of radio-quiet type II quasars (e.g. Gandhi, Fabian & Crawford 2006;Villar-Martín et al. 2010).
For several among the z ≈ 2-3 LABs, a hidden AGN has been identified at least as a partial contributor to the Ly α ionising flux. E.g., the blob of Dey et al. (2005) at z ≈ 2.7 shows clear indication of a dust-enshrouded AGN, producing both He II and C IV emission with relatively narrow width (∼ 365 km s −1 ). Scaling the strength of the lines to our Ly α surface brightness, both lines should be detected.
For the narrow-lined AGNs among those UV selected galaxies by Steidel et al. (2004) at z ∼ 2-3 (Hainline et al. 2011), the ratios between the rest-frame far-UV lines and Ly α would be even higher.
In addition to the constraints presented here, Baek & Ferrara (2013) have argued for a criterion to discriminate between LABs being powered by either star-formation, Compton-thin, or Compton-thick AGNs based on the com-bined information about the observed surface brightness profile and skewness of the Ly α line. They seem to conclude that Himiko is not in the right region of the parameter space for having an AGN as source.

Gravitational cooling radiation
Early semi-analytical models assumed that the gas feeding a halo is heated to the virial temperature. However, cosmological SPH simulations indicate that the majority of the infalling gas might never reach these high temperatures (Fardal et al. 2001). Under the assumption of primordial element composition, meaning basically only H and He, the major cooling channel of the gas is consequently not through Bremsstrahlung, as for gas with temperatures above 10 6 K, but through collisional excitation of hydrogen and, depending on the temperature, also through He II. Collisional excitation cooling peaks at 10 4.3 K and 10 5 K for H I and He II, respectively.
For several observed LABs in the literature, it has been suspected that cooling radiation is the major driver of the extended Ly α emission. For instance, Nilsson et al. (2006) had favoured this scenario for their LAB at z = 3.16, as they could not identify an obvious counterpart, even in deep GOODS HST imaging. Recent reanalysis of this object based on the extended availability of multi-wavelength data allowed to identify one of the objects in the field, which is located at a distance of about 30 kpc from peak Ly α emission, as an obscured AGN (Prescott et al. 2015). Unfortunately, no spectrum deep enough to seriously probe He II and the other rest-frame far-UV lines is available for this object.
A LAB, where He II has been observed, is the one by Scarlata et al. (2009). While an AGN can also be identified photometrically within this object, they favour gravitational cooling radiation being at least partially responsible for He II because of the non-detection of C IV.
The He II luminosity of 8 × 10 41 erg s −1 within their extraction mask would correspond at z = 6.595 to a flux of 1.6 × 10 −18 erg s −1 cm −2 , a flux which would not be safely detectable by us. On the other hand, scaling He II to the higher Ly α flux in Himiko, we would be able to significantly detect it (cf. Table 8).
The question is whether the corresponding high He II luminosity would be at all feasible for cooling radiation and whether a substantial amount of the Ly α emission could originate from cooling radiation. Insight can be gained by comparison to published results from simulations.
As a relatively narrow line is predicted for He II in the case of cooling radiation (Yang et al. 2006), we compare to the 3σ upper limit of our standard 200 km s −1 aperture. This limit corresponds to a luminosity of LHe II = 1.5 × 10 42 erg s −1 , a value high compared to the luminosities of even the heaviest halos in the simulations of Yang et al. (2006). Here, it needs to be noted that the results of Yang et al. (2006) are for z = 2.3. Yajima et al. (2015) have run a hydrodynamical simulation combined with Ly α radiative transfer, including all radiative cooling and heating, star-formation in a multi-phase ISM, stellar feedback, and UV background. Their simulation traces a halo developing a Milky Way size galaxy over cosmic times. We have extracted from their presented evolution plots several quantities at redshift z = 6.6. A fraction of 65 per cent of Ly α is predicted by them to be due to cooling radiation, mainly created within the cold accretion streams, where the total Ly α luminosity is LLyα = 4.5 × 10 42 erg s −1 . Therefore, even before correcting for IGM absorption and before applying a surface brightness threshold, the Ly α luminosity is more than a factor ten lower than Himiko's Ly α luminosity.
Taken this and the fact that the SED fitting suggest vigorous star-formation, it is unlikely that a major fraction of Himiko's Ly α luminosity is produced by cooling radiation. Therefore, as each particle in the simulations of Yang et al. (2006) has f He IIλ1640 0.1 × fLyα, the He II flux is not very likely to be detectable and no strong conclusions can be drawn from the non-detection.

Shock ionisation
Another possible explanation for strong and extended Ly α emission are large scale shock-super-bubbles (Taniguchi & Shioya 2000), which might be the consequence of galactic winds powered by a large number of core-collapse supernovae within the first few 100 Myr after the onset of the initial starburst (Mori, Umemura & Ferrara 2004;Mori & Umemura 2007). Assuming that the kinetic energy of the supernova ejecta is converted into radiation by fast-shocks and combining information on C IV/He II and N V/He II ratios with a measured line width, one could get insight into the feasibility of this mechanism as contributing power source for the extended Ly α emission through comparison to shock model grids (e.g. Allen et al. 2008). In absence of a significant detection of at least one of the lines, such an analysis is not feasible.

CONCLUSION
With our analysis we add further hints to the puzzle of what is powering the remarkable Ly α emitter Himiko.
First, we have detected a continuum in the spectrum showing a clear break at the wavelength of Ly α, ruling out any remaining chance of it being a lower redshift interloper. The fact that the continuum appears like a single step function indicates that there are no large velocity differences between the three distinct UV bright components.
From SED fitting, including CANDELS JF 125W and HF 160W data, we argue for a young and heavily star-forming stellar population, with a total stellar mass of the order 10 9 M⊙ and a metallicity of Z = 0.2Z⊙, with the bright IRAC1 magnitude explained by very strong [O III] emission. While we find that similar broadband magnitudes would also be produced by lines in the extended emission line regions around high redshift radio galaxies, this scenario is for several reasons more unlikely. Among them is the most important result of this work. Our upper limits on important restframe far-UV lines are clearly disfavouring an AGN as sole powering source for the extended Ly α emission. However, due to the natural lack of knowledge about the appropriate slit loss, line-width, and systemic redshift and the spread in line strength for AGNs, it is not entirely impossible that an AGN has escaped our detection. Table 8. Line ratios of He II, N V, C III], and C IV to Ly α as found in samples and individual objects by various studies. Upper limits are 3 σ. References: [0] : Line ratios as measured in this study for Himiko. The ratios are calculated based on slit loss corrected fluxes. While we have for Ly α assumed a spatial extent given by the narrowband NB921 image, the other lines were corrected under the assumption that they are distributed as the continuum. If they were as extended as the resonant Ly α line, the upper limits would be a factor 1.5 and 1.4 less stringent for N V and the three lines in the NIR arm, respectively. The non-detection limits are stated for our two fiducial boxes. Values in brackets are assuming a crude IGM correction of a factor two. [1] : SDSS quasar composite spectrum (Vanden Berk et al. 2001) [2] : Composite spectra for AGNs among rest-frame UV selected galaxies (Hainline et al. 2011). The sample is split into those with high and low Ly α EW 0 . To convert the values and the spread in the equivalent widths stated by Hainline et al. (2011) to values and spreads for the fluxes, we have used the continuum slopes of β = −0.1 ± 0.4 and β = −0.5 ± 0.3 stated by them for the composite spectra of the high and low EW 0 sample, respectively. [3] : Sample of nine high-redshift radio galaxies (Humphrey et al. 2008) (Cai et al. 2011) [8] : He II non-detection for extreme Ly α EW 0 ≈ 900Å object  [9] : He II non-detection for SDF J132440.6+273607 (Nagao et al. 2005) [10] : 10 9 M ⊙ LAE at z=2.3, showing strong He II and C III] emission likely powered by star-formation