The puzzle of the formation of T8 dwarf Ross 458c

At the lowest masses, the distinction between brown dwarfs and giant exoplanets is often blurred and literature classifications rarely reflect the deuterium burning boundary. Atmospheric characterisation may reveal the extent to which planetary formation pathways contribute to the population of very-low mass brown dwarfs, by revealing if their abundance distributions differ from those of the local field population or, in the case of companions, their primary stars. The T8 dwarf Ross 458c is a possible planetary mass companion to a pair of M dwarfs, and previous work suggests that it is cloudy. We here present the results of the retrieval analysis of Ross 458c, using archival spectroscopic data in the 1.0 to 2.4 micron range. We test a cloud free model as well as a variety of cloudy models and find that the atmosphere of Ross 458c is best described by a cloudy model (strongly preferred). The CH4/H2O is higher than expected at 1.97 +0.13 -0.14. This value is challenging to understand in terms of equilibrium chemistry and plausible C/O ratios. Comparisons to thermochemical grid models suggest a C/O of ~ 1.35, if CH4 and H2O are quenched at 2000 K, requiring vigorous mixing. We find a [C/H] ratio of +0.18, which matches the metallicity of the primary system, suggesting that oxygen is missing from the atmosphere. Even with extreme mixing, the implied C/O is well beyond the typical stellar regime, suggesting a either non-stellar formation pathway, or the sequestration of substantial quantities of oxygen via hitherto unmodeled chemistry or condensation processes.


INTRODUCTION
Brown dwarfs span the mass range between stars and giant exoplanets and are found as isolated objects, binary systems, and as companions to stars. Most discussions of the formation of brown dwarfs consider them as the substellar extension of the star formation process (e.g. Whitworth et al. 2007;Stamatellos & Whitworth 2009;Whitworth 2018, and references therein). However, the growing sample of isolated planetary mass brown dwarfs (e.g. Faherty et al. 2013;Liu et al. 2013;Schneider et al. 2016;Theissen et al. 2018;Bouy et al. 2022), wide-orbit (resolved) planetary mass companions (e.g. Faherty et al. 2021;Miles-Páez et al. 2017;Burgasser et al. E-mail: j.jensen@herts.ac.uk † 51 Pegasi b Fellow 2010), and giant exoplanets with masses above the deuterium burning limit (e.g. Bakos et al. 2009;Rosenthal et al. 2021) raises the question as to the contribution of planetary formation pathways to the low-mass end of the substellar population.
The compositions of brown dwarfs may provide clues to their formation mechanism. For objects that are companions to stars, we might expect a shared carbon-to-oxygen (C/O) ratio if both formed at the same time and from the same material. Whereas a C/O ratio that differs might suggest an alternative formation path to that of the star. For example, very low-mass brown dwarfs may have formed like gas giant planets in disks, through either core accretion or gravitational instabilities (Bate et al. 2002). Different oxygen-bearing species condense at different locations, removing oxygen from the vapor and yielding a higher C/O ratio in the gas phase. As a result, at the water ice line the C/O ratio of grains would decrease, and then again increase at the CO ice line (Öberg et al. 2011). An object formed in such a differentiated protoplanetary disk may thus inherit a C/O ratio that is quite unlike its parent star. For the brown dwarf population that represents the substellar extension of the star formation process, we can expect the distribution of C/O ratios to follow that of the stellar field population. This distribution is generally tight and close to the solar value of 0.54 (Asplund et al. 2005), with a C/O > 0.8 very unusual (e.g . Fortney 2012;Nissen 2013;Nakajima & Sorahana 2016). If a significant number of the the lowest-mass brown dwarf companions, on the other hand, formed via a planet-like formation scenario, we might expect their C/O distribution to be altered by comparison.
In this paper we present the analysis of the first target of a wider study of the local brown dwarf and substellar companion populations. Ross 458c was discovered by Goldman et al. (2010) and followed up by Burgasser et al. (2010) and Burningham et al. (2011), and is a young T8 dwarf. Comparisons to evolutionary models suggest that it may be a planetary mass object, while its wide orbit of 1200 AU around an M dwarf binary argues against formation via core-accretion or disk fragmentation (if the object formed in situ).
Another unusual feature of Ross 458c is its apparent cloudiness (Morley et al. 2012;Burgasser et al. 2010;Burningham et al. 2011). Clouds are a possible opacity source in brown dwarfs, as inferred from data, and theoretically expected from condensation of species possible at pressures and temperatures found in substellar objects (see e.g. Kirkpatrick 2005;Kirkpatrick et al. 2021;. Although T dwarfs have been generally thought to be cloud-free (e.g. Kirkpatrick 2005), it has been suggested that sulphide and alkali-salt clouds may start to appear in late-T dwarfs (Morley et al. 2012).
In this paper we use retrieval analysis to investigate the composition and cloudiness of this interesting target. The retrieval technique follows on from and complements analysis using one dimensional, self-consistent, radiative-convective equilibrium "grid models". Grid models attempt to simulate the substellar atmosphere self-consistently using a handful of bulk parameters that typically include T eff , log g, metallicity, cloud parameters, and (sometimes) C/O (e.g. Saumon & Marley 2008;Marley & Robinson 2015;Marley et al. 2021). Also, vertical mixing can drive certain molecular abundances out of chemical equilibrium (e.g. Noll et al. 1997;Saumon et al. 2006;Marley & Robinson 2015). The extent to which this occurs depends on the mixing timescale, and this is often included via some additional parameter (e.g. eddy diffusion coefficient Kzz, Griffith & Yelle 1999;Saumon et al. 2006).
A variety of techniques are used to fit these models to data to draw conclusions, from simple χ 2 minimisation to more sophisticated Bayesian methods (e.g. Zhang et al. 2020). This can result in poor fits to the data, with the grid models inadequate to describe the data ). Due to their complexity, it is difficult to identify the poor assumption or missing physics that is driving the bad fit.
Retrievals make fewer assumptions about the state of the atmosphere, and instead parameterise it over many more dimensions with little if any requirement for self-consistency. Whilst this can result in unphysical outcomes, which should be considered when interpreting the result, it also allows data-driven insights that can highlight, or work around, shortcomings in self-consistent approaches (e.g. Burningham et al. 2021). Another benefit of retrievals is that one gets a measure of the C/O ratio from the observations (here using CH4 and H2O) as a retrieved parameter, as opposed to a prescribed ratio that is hardwired into the model. In this work we will use the Brewster retrieval code that has been demonstrated effectively in cloudy atmospheres in the substellar regime (Burningham et al. 2017;Gonzales et al. 2020;Burningham et al. 2021).
The paper is organised as follows. In Section 2 we provide an overview of current literature on Ross 458c. Section 3 describes Brewster, the retrieval framework. In Section 4 we present the results of the retrievals. Section 5 is a discussion and analysis of the results, and Section 6 is the conclusion.

ROSS 458 SYSTEM
2.1 Ross 458AB The primary system consists of a M0.5 and M7, separated by a distance of 0.5" (about 5 AU) (Beuzit et al. 2004). The metallicity of the primary system is supersolar at [Fe/H] = +0.2 ± 0.05 (Burgasser et al. 2010). Burgasser et al. (2010) places constraints of the age of the Ross 458 system of 150 to 800 Myr, the upper limit based on the magnetic activity and Hα emission of Ross 458A and the lower limit on the absence of low surface gravity indicators (such as the 7000Å KI doublet), which would be expected in young M dwarfs. Eggen (1960) selected the system as a possible member of the Hyades open cluster, with an age of around 625 Myr (Lebreton et al. 1997) whilst Nakajima et al. (2010) selected it as a possible member of the IC 2391 Moving Group, with an age of around 50 Myr. Any membership association would place further constraint of the age of the system. Table 1 summarizes previously published parameters for Ross 458A and Ross 458B.

Ross 458c
Ross 458c is a T dwarf of spectral type T8 (Burgasser et al. 2003) or T8.5p (Burningham et al. 2011) and is a wide orbit companion to the binary Ross 458AB (Goldman et al. 2010). It is separated from its primaries by 102" (1200 AU). The system is at a distance of 11.5 ± 0.2 parsec (Gaia Collaboration et al. 2018). Burgasser et al. (2010) found T eff = 650 ± 25 K from fitting self-consistent grid models. This is in good agreement with the values found by Burningham et al. (2011) (using bolometric luminosity) of 695 ± 60 K. Filippazzo et al. (2015) found a similar value of 721 ± 94 K using their bolometric luminosity method.
One of the key results in those previous examinations of Ross 458c was the somewhat surprising feature of clouds in a late-T dwarf. Burgasser et al. (2010) found that models which included cloud opacity were a better fit to the near-infrared spectrum of Ross 458c. They argued that it may indicate a Ross 458c 3 resurgence of iron and silicate clouds that are thought to have disappeared at the low temperatures of the late-T dwarfs. Morley et al. (2012) investigated possible causes of the apparent cloud opacity. Their best fit model for Ross 458c was a 700 K, log g 4.0 model which incorporated sulphide (Na2S, MnS and ZnS) clouds. Manjavacas et al. (2019) find spectrophotometric variability in Ross 458c with an amplitude at the ≈ 2% level, indicating a partial cloud cover rotating in and out of view.
In more recent work, Zhang et al. (2020) attempted to estimate the parameters using the Sonora Bobcat grid models (Marley et al. 2021), using the prism spectra of Ross 458c. However they struggled to get a good fit. The model spectrum was brighter in the Y and J band, and had fainter flux in the blue wing of the H and K band. They argue that this is likely due to clouds, reduction in vertical temperature gradient, or CO/CH4 or N2/NH3 disequilibrium chemistry. As a result, the radius and inferred mass are small. The model atmospheres used in their study were cloud-free, and with a C/O fixed at solar. These limitations might result in inaccurate estimated parameters. This work will improve upon these shortcomings by having a larger set of less restrictive parameters.
For the retrieval of Ross 458c, we make use of archival data, for full details of the data see Burgasser et al. (2010). We have flux-calibrated the data using the same UKIDSS J band photometry used in Burningham et al. (2011) to provide consistency between the two studies and remove a source of possible discrepancies between our results. Table 2 provides an overview of the literature values of Ross 458c.

RETRIEVAL FRAMEWORK
We use the Brewster retrieval framework, for which a more complete description can be found in (Burningham et al. 2017;Gonzales et al. 2020;Burningham et al. 2021). The retrieval framework consists of the forward model, which produces the spectrum based on a set of parameters, and the retrieval model, which uses Bayesian inference to explore the posterior, estimate parameters and perform model selection.
The following section contains a brief recap of key features, and updates for this work.

Forward model overview
Brewster divides the atmosphere into 64 pressure layers with properties of temperature, gas opacity and cloud opacity. The atmosphere is also parameterised by gravity which sets the scale height. The output of the forward model is the flux at the top of the atmosphere, which must be scaled by the ratio of the squared radius, R, and distance to the target, D (R 2 /D 2 ). As will be discussed in Section 3.2, we use two different algorithms for exploring the posterior: EMCEE (Foreman-Mackey et al. 2013), and nested sampling using PyMultiNest (Buchner et al. 2014). These two methods use slightly different parameterisations due to the differences in how the priors are treated in each algorithm. Specifically mass and radius are retrieved directly in the nested sampling version, whereas they are derived from the (R 2 /D 2 ) scaling parameter and log g in the EMCEE version. The thermal profiles also differ as will be discussion Section 3.1.1.
The forward model parameters are discussed below, and are listed in Table 3 along with their priors.

Thermal structure
As is typical in atmospheric retrievals, we do not directly retrieve the temperature for all 64 layers in our model atmosphere, but instead use a smaller number of parameters to set the thermal profile. In previous works (e.g. Burningham et al. 2017Burningham et al. , 2021Gonzales et al. 2020), we have used the Madhusudhan & Seager (2009) parameterisation for the thermal profile.This scheme treats the atmosphere as three zones: P0 < P < P1 : P = P0e α 1 (T −T 0 ) where P0, T0 are the pressure and temperature at the top of the atmosphere, which becomes isothermal with tempera- Parameter Prior gas fraction (Xgas) log-uniform, log Xgas −12.0, gas Xgas 1.0 thermal profile 1 : α 1 , α2, P 1, P 3, T 3 uniform on resulting T , 0.0 K < T i < 5000.0 K thermal profile 2 : uniform, −10 α 10 cloud decay scale, (∆ log 10 P ) decay uniform, 0 < (∆ log 10 P ) decay < log 10 Ptop + 4 cloud thickness (∆ log 10 P ) thick uniform, constrained by log 10 Ptop log 10 Ptop + (∆ log 10 P ) thick 2.3 cloud total optical depth (extinction) at 1 µm 3 uniform, 0.0 τ 0 100.0 Hansen distribution effective radius, a log-uniform, −3.0 < log 10 a(µm) < 3.0 Table 3. Priors for retrieval model. Notes: 1) EMCEE only; 2) nested sampling only. The tolerance factor, o is to allow to unaccounted sources of uncertainty. The wavelength shift is the shift in wavelength between the data and the model. See Burningham et al. (2017) for a complete description. The cloud decay scale, (∆ log 10 P ) decay , is constrained within the height of the atmosphere by log 10 Ptop + 4. The cloud thickness, (∆ log 10 P ) thick , is constrained by 2.3, the bottom of the atmosphere.
ture T3 at pressure P3. In its most general form, a thermal inversion occurs when P2 > P1. Since P0 is fixed by our atmospheric model, and continuity at the zonal boundaries allows us to fix two parameters, we consider six free parameters α1, α2, P1, P2, P3, and T3. If we rule out a thermal inversion by setting P2 = P1 (see Figure 1, Madhusudhan & Seager 2009), we can further simplify this to five parameters α1, α2, P1, P3, T3. For our new nested sampling version of Brewster we have employed a simple 5 point T-P profile, which sets temperatures at the base (T bot ), top (Ttop), mid-point (T mid and quarter pressure depths (TtopQ, T botQ ) in the atmosphere in log 10 P . These points are then joined by spline-interpolation to define the 64 layer temperatures. The only restrictions on the values these 5 points may take is that they lie in the range 0 < T < 4000 K, and decrease towards shallower pressures.

Gas Opacity
The gas opacity is set by the gas fractions in each layer. In this work we directly retrieve these as vertically constant mixing ratios. Previous retrievals of late-T dwarfs using the assumption of chemical equilibrium to allow for vertically varying abundances found such models poorly ranked (see e.g. Gonzales et al. 2020). Moreover, grid model fits in Burningham et al. (2011) suggested that non-equilibrium chemistry is important for this Ross 458c. As such, we have not performed any retrievals under the assumption of chemical equilibrium in this work.

Opacity data
In this work we consider the absorbing gases H2O, CH4, CO, CO2, NH3, H2S, Na, K. Their opacities are from the compendium (Freedman et al. 2008) and (Freedman et al. 2014). For the Na and K D1 and D2 lines, broadened by collisions with H2 and He, the Lorentzian line profile becomes inadequate, and line wing profiles are taken from (Allard et al. 2007a(Allard et al. ,b, 2016. We also consider the alkali opacities from Burrows & Volobuyev (2003) as an alternative given the uncertainty surrounding the best treatment for this issue, as there is no agreement in literature as to which is the preferred choice. Gonzales et al. (2020) tested both Burrows and Allard alkalis on their L7+T7.5p SDSS J1416+1348AB binary target. The 1416A component (L7) was best fit using the Allard treatment, whilst the 1416B component (T7.5p) was best fit using the Burrows treatment. The Allard alkalis provide consistent abundances between the co-evolving pair, and is hence the best fit for the pair as a whole. We follow Gonzales et al. (2020) in testing both the Burrows and the Allard treatment for alkalis.
We include continuum opacities for H2-H2 and H2-He collisionally induced absorption, using cross sections from Richard et al. (2012); Saumon et al. (2012). We also include Rayleigh scattering due to H2, He and CH4, and continuum opacities due to bound-free and free-free absorption by H − (John 1988;Bell & Berrington 1987) and free-free absorption by H − 2 (Bell 1980).

Cloud opacity
The cloud opacity is parameterised in several ways, which have between 3 and 5 parameters. These parameters fall broadly into two categories: those that describe how the cloud opacity is distributed in the atmosphere and those that describe the cloud's optical properties.
Cloud structures: The clouds' distribution in the atmosphere is parameterised as either a "deck" or "slab" cloud. In the deck cloud, it is not possible to see the bottom of the cloud. The deck cloud has the following parameters: 1) Pressure at which the optical depth passes 1, looking down 2) The decay height, the pressure over which the optical depth drops by a factor of 2 In contrast to the deck cloud, in the slab cloud, it is possible to see the bottom of the cloud. The slab cloud has the following parameters: 1) The total optical depth of the cloud at 1 µm (reached at the base of the cloud) 2) Pressure at the top of the cloud (where τ cloud = 0) 3) The thickness of the cloud slab (in d log P ) Cloud optical properties: The cloud opacity can be parameterised in one of three ways. The optical depth is either grey (wavelength independent), a power law (τ = τ0 × λ α ) where τ0 is the optical depth at 1 µm, or Mie scattering. If the cloud opacity is power law dependent as opposed to grey, an extra parameter is added, the power index α, defining the wavelength dependent cloud opacity as τ ∝ λ α . In both the grey and the power-law case, scattering is neglected, and the cloud is assumed to be absorbing. When using Mie scattering, we have two additional parameters, the Hansen a and b parameters (Hansen 1971). The Hansen a parameter is the particle size distribution effective radius in log(r/µm). Hansen b parameter is the particle size distribution spread. Therefore, for a deck cloud, we then have 4 parameters, and for the slab 5. A variety of condensates are potentially able to form in cool T dwarf atmospheres and contribute to the cloud opacity. Burgasser et al. (2010) hypothesised a reemergence of iron and silicate clouds for late-T dwarfs, which would have broken up at the L-T transition. Of the iron and silicate clouds, it is the silicate clouds that will be found at the lowest temperatures and shallowest pressures. These silicate clouds are expected to be composed of enstatite (MgSiO3), forsterite (Mg2SiO4) or SiO2 (quartz, e.g. Visscher et al. 2010;Burningham et al. 2021). Since we are unlikely to be able to distinguish between these species using only nearinfrared data we have only tested one: MgSiO3. Morley et al. (2012) argues that sulphide clouds may be present in late-T dwarfs and tests Na2S, MnS and ZnS, as they are expected to condense at lower temperatures. The abundance of ZnS is low and they find that it does not create an observable change in the spectrum; its optical depth is negligible, so ZnS is not included in the models we test here. KCl is also expected to condense at cooler temperatures (Morley et al. 2012), and is one of the models tested here. Graphite clouds can appear in Carbon-rich atmospheres with super-solar C/O ratio (Moses et al. 2013a,b). As a proxy for graphite, we used optical data for soot, due to its high carbon content.  Only one cloud type is tested per model run. We don't expect spectral features arising from Mie scattering of particular condensate compositions from any of the expected cloud species to be observable in the 1.0 − 2.5 µm region. So, we judged that the additional computational load and complexity due to adding extra model combinations in the form of multiple cloud models, would not be justified in terms of improvements to the fit.
We calculate scattering and absorption Mie coefficients from real and imaginary refractive indices for our condensate species obtained from the sources of optical data shown in Table 4. For grey and power-law opacity, no optical data are used.

Posterior sampling
In previous works, Brewster used the EMCEE sampler (Foreman-Mackey et al. 2013). However, this method of sampling has some drawbacks such as: uncertainty surrounding convergence; the challenge of exploring the entire probability volume to avoid sensitivity to initial values due to local maxima; and difficulty surrounding the calculation of the Bayesian Evidence for model selection. For this work, Brewster has been updated to also include nested sampling by incorporating the PyMultiNest sampler (Buchner et al. 2014), which avoids these issues. We test both the EMCEE and PyMultiNest posterior sampling techniques to check for consistency between the two methods.

Model selection
The EMCEE sampler does not allow for the evaluation of the Bayesian evidence, but provides a maximum likelihood, which is used to calculate the Bayesian Information Criterion (BIC). The model with the lowest BIC is the preferred model. The strength of the preference of one model over another is defined by Kass & Raftery (1995) as such shown in Table 5.
The PyMultiNest sampler returns the nested importance sampling global log-evidence (Feroz et al. 2019). For two com-∆logEv Strength 0 to 0.5 no preference worth mentioning 0.5 to 1 substantial 1 to 2 strong > 2 decisive peting models, model selection between models 1 and 0 of the preferred model is achieved as follows: Z is the Bayesian evidence. Pr(H1)/Pr(H0) is the prior probability ratio for the two models, which can in most cases, including this, be set to unity. The preferred model is the model with the larger value for Z, with the strength of the preference of one model over another being the ratio of their evidences (Feroz et al. 2019). The MultiNest algorithm returns the log-evidence (logEv), and hence the strength of the preference is defined as the difference between the models' logEv (∆logEv). Narrative statements corresponding to boundaries in the values for ∆logEv are defined by Kass & Raftery (1995) as shown in Table 6.

Summary
We tested both the EMCEE and the PyMultiNest options for sampling the posterior, to confirm consistency between results. The results quoted here are from PyMultiNest, as it is preferred due to sampling the entire parameter space, avoiding being stuck in local minima and maxima. The best fit model for the nested sampling is a cloudy model, with a power-law slab cloud. For nested sampling we use the Bayes factor to compare the different models. The evidence difference to the winning model is listed in Table 7. The evidence difference to the second ranked model is substantial.
For the EMCEE sampling, the top ranked model is the power-law slab cloud, with the alternative treatment of the alkali line broadening. The second ranked model is the powerlaw slab cloud. Both PyMultiNest and EMCEE have the same two models ranked in top, the order is just switched. For EMCEE we use the ∆BIC for model selection. The ∆BIC is shown in Table 8. There is a strong preference for the top ranked model as opposed to the second ranked, and a very strong preference against the lower ranked models.
The top-ranked models are cloudy models with the cloud opacity as a power-law. "Real" clouds with Mie-scattering are rejected, as is the cloud-free model.
We consider two different alkali profiles. For the PyMulti-Nest sampler, the Allard alkalis are ranked second, whereas for the EMCEE sampler, the Allard alkali profiles are ranked first, with the Burrows profile second. We cannot assert which is the right profile to use, as it is dependent on methodology.  Figure 1 shows the spectrum with cloud-free Sonora models over-plotted, of T eff = 800 K and 700 K, and log g = 4.5 and 5.0, and the main absorption features. The models are all "normalised" to the J band. All of the plotted grid models underestimate the flux in the Y , H, and K bands compared to the data for Ross 458c. This may indicate that one of the key issues for the grid models is that their predicted flux in the J band is too bright relative to the rest of the near-infrared spectrum.

Retrieved spectrum
Our retrieval model provides a much better fit to the data across the entire near-infrared wavelength range covered. This likely reflects the greater flexibility available to a retrieval model, and its ability to allow for non-solar abundance ratios. A key feature in our ability to fit the J band peak is the inclusion of clouds (as will be discuss in Section 4.4). The only region that the retrieval model struggles to fit well is the Y band peak, and this likely reflects chal-lenges associated with the pressure-broadened wings of the alkali D-lines, which is shared by the grid models. Figure 2 shows the thermal profile for the winning model along with comparison to self-consistent grid models and cloud condensation curves. The placement of the cloud in pressure space is plotted on the side. The thermal profiles for comparison are the Sonora Bobcat models (Marley et al. 2021) with T eff = 700 K and 800 K and log g = 4.5 and 5.0. The comparison models were selected based on the retrieved T eff and log g.

Thermal profile
Generally, the models match the retrieved profile best at deeper pressures, where they follow the retrieved gradient well. The model that matches the retrieved profile best at deeper pressures is the T eff = 800 K, log g = 5.0. The models do not match closely at shallower pressures, where the differences are significant, as our retrieved profile is warmer (see Figure 2). At the shallowest pressures in our model (log(P/bar) ∼ −3 − −4) there is little contribution to the flux (see Figure 3), and a correspondingly large scatter in the retrieved temperature of atmosphere. However, at slightly deeper pressures (log(P/bar) ∼ −1 − −2) there is significant contribution of flux, and the retrieved thermal profile is quite tightly constrained and diverging from the grid models with a more isothermal gradient.
This difference between the retrieved profile and other grid models at shallow pressures is also seen in other works, such as the warmer L dwarfs in Burningham et al. (2021)

Cloud properties
In agreement with previous works (Burgasser et al. 2010;Burningham et al. 2011;Morley et al. 2012), we find that the atmosphere of Ross 458c is best described with a cloudy model. The winning model is a cloudy one, structured as a slab cloud. Burgasser et al. (2010), using grid models, found that enstatite cloud models provided a good fit to the data, whereas Morley et al. (2012), though not testing enstatite clouds found that sulphide clouds fit the data best. Model selection rejects "real" clouds with Mie scattering, so we are unable to distinguish the cloud species based on optical properties. Figure 2 shows the condensation curves for the tested cloud species alongside our retrieved thermal profile. The curves do not mean that the particular cloud will condense, but rather they indicate where it can condense. So, a given cloud species can condense to the left (cooler) of its condensation curve on Figure 2. Two clouds that could condense at our retrieved cloud location are the MnS and MgSiO3 clouds. KCl and Na2S condense at cooler temperatures. So, our thermal profile and the placement of the cloud deep in the atmosphere is suggestive of enstatite or other silicate clouds due to the retrieved Figure 2. Pressure-temperature profile of winning model (powerlab slab cloud). The dashed lines are condensation curves for the different clouds tested. The curves do not mean that the particular cloud will condense, it just means that it can condense at that pressure and temperature. The right part of the figure is the placement of the cloud in the atmosphere, the grey shaded parts are the 1σ errors. The solid coloured lines are the self-consistent Sonora Bobcat grid models.
cloud location with respect to the condensation curves. We note that the other silicates that were not tested (see Section 3.1.4) have very similar condensation temperatures, so this cloud may actually be a mix of these very similar silicates. This will be difficult to test since these clouds appear to be so deep in the atmosphere. JWST data will likely incorporate the spectral features around 8 − 10 µm arising from Mie scattering of these silicates. However, at those wavelengths clouds are likely to be well below the photosphere. Figure 3 shows the contribution function. This shows where in the atmosphere the flux is coming from. The shading of the area indicates the relative contribution of flux from that layer. The darker shaded the area, the more flux from that layer. The blue line is the gas opacity where τgas = 1.0. The cloud is the purple line (at τ cloud = 1.0), and is placed in the bottom of the atmosphere, below the photosphere, except for an overlap at the J band, at around 1.3 µm.
This suggests that the cloud has the biggest impact in the J band peak, where it reduces the flux from the deep atmosphere that escapes between the deep water and methane absorption bands on either side. This is what drives the much improved fit for our retrieval model compared to the cloud free grid models shown in Figure 1. Figure 5 shows the retrieved bulk properties of Ross 458c, including the retrieved values for the gas abundances, along with log g. Some gases, such as CO, CO2, and H2S, are not well constrained, as seen on the histogram distribution. This has been the case for all near-infrared late-T dwarf retrievals using low resolution near-infrared spectra Gonzales et al. 2020). To date, only Tannock et al. (2022) have detected H2S in a near-infrared T dwarf spectrum, and it required spectroscopy with R ∼ 45000, 100x higher than used here. Even at this high-resolution, CO is still invisible due to masking by stronger CH4 absorption.

Gas abundances
We do detect and constrain the abundances of H2O, CH4, NH3, and Na+K. This is the same set of molecules constrained by other works in this temperature regime Zalesky et al. 2022).
The corner plot also shows the C/O (here using the CH4/H2O ratio), of 1.97, the calculation of which will be discussed in the Section 5.
The abundance plot shown in Figure 4 shows the retrieved gas mixing ratios along with equilibrium predictions from our thermochemical grid models (Visscher et al. 2006(Visscher et al. , 2010Visscher 2012;Marley et al. 2021).
Our retrieved abundance for NH3 is consistent with chemical equilibrium predictions, whilst CO and H2O are not. The abundance profiles for H2O and CH4 are close to vertical, somewhat justifying our vertically constant mixing ratio assumption in the retrieval model. H2O and CH4 match their abundances at 100 bars (2000 K), and this may be suggestive of their chemistry being quenched at this level due to rapid vertical mixing. This is discussed further in Section 5.2.3.

ANALYSIS AND DISCUSSION
In line with earlier studies of Ross 458c, we find that the atmosphere is best described by a cloudy model. The best fit cloud model is a power law slab cloud, this is consistent for both the EMCEE and the nested sampling.

T eff , luminosity, radius and mass
The bulk properties of mass, radius, T eff and luminosity are a mixture of retrieved values, and values extrapolated from the forward model in post processing. The mass and radius are retrieved directly in the PyMultiNest version of Brewster, having been previously inferred from log g and distance scaling in the EMCEE version. The luminosity is found by extrapolating the forward model for the retrieved parameters across a wide wavelength range from 0.5 -20 µm, encapsulating essentially all the flux from the target. The T eff is then found using the extrapolated L bol , and the retrieved radius.
Our extrapolated luminosity for Ross 458c is −5.27 ± 0.03, which is much higher than the value of log(L bol /L ) = −5.61 ± 0.03 that was found in Burningham et al. (2011). This is may be due to missing absorption from CO and/or CO2, which occur in the 4 -5 µm region, but which are unconstrained in our model, since it is based on the near-infrared only. Figure 6 shows that our extrapolated spectrum is a good match to the Spitzer Channels 1 and 2 photometry which are driven by well-characterised CH4 absorption. The WISE W1 band photometry also matches our model flux reasonably well. However, our extrapolated model is much brighter than the WISE W2 photometry. This filter coincides more strongly with the 4.6 µm CO absorption, and this support our assertion that this is the origin of the mismatch between our extrapolated luminosity and that measured previously by Burningham et al. (2011). We also note that our extrapolated spectrum shows a strong CO2 absorption starting at 4.3 µm, which is extrapolated from our near-infrared spectrum where its abundance is unconstrained. This is likely spurious and does not warrant further interpretation.
It also plausible that our poorly constrained temperatures at shallower-pressures are also impacting the flux at longer wavelengths. Our inferred T eff which is based on this suspect luminosity should be similarly treated with caution.
We do not think photometric or astrometric uncertainties have a significant effect on our extrapolated luminosity. The astrometric uncertainties are marginalised over in the re- Figure 6. Extrapolated retrieval spectrum of Ross 458c, with the photometry points from WISE1 and 2, and Spitzer 1 and 2 added, as well as transmission regions of the filters, and the 700 K, log g = 4.5 Sonora model. trieval model, and so are already incorporated into the radius uncertainty. We have not marginalised over the photometric uncertainty contribution to the flux calibration. However, the photometric uncertainty is at the 1 % level and thus smaller than our 3 % uncertainty in luminosity. Figure 5, also shows the values for mass and radius, of 27 ± 4 MJup and 1.45 ± 0.06 RJup, respectively. The mass and radius are retrieved directly. However, the value for the retrieved radius is driven by the retrieved model scaling factor (R 2 /D 2 ) and the Gaia parallax. The retrieved mass value is driven by the influence it has on log g and the retrieved radius.
Burgasser et al. (2010) estimate a lower age limit for Ross 458 system of 150 Myr based on the absence of spectroscopic features suggestive of low surface gravity that are seen in young M dwarfs up to the age of the Pleiades. However, this system has a metallicity of [M/H] = +0.2 (and possibly higher), which is higher than the Pleiades (Soderblom et al. 2009) , and thus may not follow the same trend in spectroscopic features. Burgasser et al. (2010) finds the absence of Li i absorption at 6708Å suggests a more robust lower age limit of 30 -50 Myr, which is consistent with the age of the IC 2391 Moving Group, as found by Nakajima et al. (2010). Figure 7 shows the radius as a function of age, with two lower limits for the age of 50 Myr and 150 Myr highlighted in Burgasser et al. (2010). Our retrieved radius of R = 1.45 RJup coincides with the evolutionary model predictions for [M/H] = +0.5 and an age near lower of the two limits. The retrieved mass and gravity are similarly consistent with this lower age limit. However, if the older age limit of 150 Myr is applied, then both the mass and radius are larger than expected from these models. None-the-less, they result in a value for log g which is consistent with previous estimates, for which values are seen in Table 2. We note also, that the EMCEE based retrievals (see Appendix A) find a value for log g = 5.13 +0.20 −0.37 , which is somewhat higher. Although its large error, particularly towards lower values mean that it also consistent with previous estimates.

C/O ratio
The C/O ratio can be calculated using the CH4 to H2O ratio, as CO and CO2 are undetected in the NIR. This method for calculating the C/O ratio from methane and water is also used in Line et al. (2015Line et al. ( , 2017 and Gonzales et al. (2020) (the latter with the addition of CO for the L dwarf). If we naively apply this here, we arrive at an improbably high value of 1.97 +0.13 −0.14 . This is likely to be outside the scope of any stellar C/O ratio, for which C/O > 1 is rarely seen. Generally the distribution of stellar C/O is tight, with a peak near the solar value (Nissen 2013;Nakajima & Sorahana 2016;Brewer & Fischer 2017).
This section investigates several possibilities to assess their impact on the C/O ratio. We first consider observational and analytical sources of bias, before considering atmospheric origins.

Observational and model biases
Telluric water bands are common in the infrared and originate from water absorption in the Earth's atmosphere, which may bias our C/O ratio. We performed a test to check if removing telluric water bands alters our estimate for the H2O abundance and hence the C/O ratio. This was done by removing all data points in the 1.35 to 1.42 and the 1.80 to 1.95 µm range, where telluric water features are present, and running the nested sampling winning model (power law slab cloud) with the bands removed. Removing the telluric absorption bands did not change the retrieved water abundance significantly, and the CH4/H2O remained high at 1.88.
As listed in Tables 8 and 7, the C/O was consistently high across all cloud models tested and sampling methods tested.  A certain amount of oxygen can be tied up via silicate condensation and would be "missing" from the atmosphere. However, under typical assumptions for T dwarf atmospheres this would not account for all of the missing oxygen. By allowing 25% of oxygen to be tied up in silicate clouds (Burrows & Sharp 1999), the C/O ratio would lower to 1.47, which is still extremely high.

Oxygen depletion by condensation
Water clouds in the atmosphere of Ross 458c would account for the missing water and lower the C/O ratio substantially, but the temperature is too high for water clouds. Water clouds are expected to become a significant opacity source in brown dwarfs of temperatures less than 400 K (Morley et al. 2014).

Non-equilibrium chemistry
The value of the CH4/H2O ratio is likely impacted by nonequilibrium chemistry which will lead to more oxygen being tied up in CO and CO2 than would otherwise be expected at the low-temperatures of late-T dwarfs. Previous work has highlighted the importance of non-equilibrium chemistry, leading to higher CO and CO2 abundances.
In the cooler T dwarfs, convection drives the disequilibrium chemistry. Miles et al. (2020) find that disequilibrium chemistry plays an important role in directly imaged exoplanets and brown dwarfs, leading to important CO absorption in the spectra. Noll et al. (1997) found that for Gliese 229B, CO was also present at larger abundances than expected, such as in the 4.7 micron band, as a disequilibrium species high up in the atmosphere, suggesting transport induced quenching.
CO is favoured as the dominant C-bearing gas at high temperatures (deep in the atmosphere), whereas CH4 is favoured as the dominant C-bearing gas at low temperatures (higher altitudes, Lodders & Fegley 2002;Visscher 2012). Through atmospheric mixing, the chemical composition can be driven out of equilibrium. Rapid vertical mixing can transport a gas to higher altitude (and lower temperatures) before the chemical constituents have had time to reach chemical equilibrium. If the mixing time is faster than the chemical reaction time, disequilibrium can occur.
At the quench point, (where the chemical timescale = mixing timescale), the abundance of the species is "quenched" at a fixed value as it is mixed to higher altitudes. CO + CO2 are subject to transport induced quenching, which may lead to a quenched abundance in the upper atmosphere far in excess of that predicted by equilibrium, if undergoing rapid vertical mixing (Visscher et al. 2010). Figure 4 shows a good match with thermochemical predictions at high pressure and temperature (T ≈ 2000K). This suggests that the observed CH4/H2O ratio can be consistent with C/O = 1.35 (the maximum modelled value in our thermochemical grid), if carbon-oxygen chemistry is quenched at the 100 bar, 2000 K level. However, at 2000 K the chemical reactions converting CO/CH4 will be fast, suggesting that it is unlikely that the observed abundances can be attributed to quench chemistry alone (Visscher & Moses 2011).
We have simulated how the estimated C/O ratio can be affected by incorporating CO and CO2 that might have been missed by our near-infrared spectroscopy. In Figure 8 we show how the C/O ratio decreases as more CO and CO2 are incorporated, as indicated via the associated increase in the [C/H] metallicity. The plot is based on our condensation-corrected C/O= 1.47, based on CH4/H2O alone, which corresponds to [C/H] = +0.18 (see Figure 5). As expected, adding CO drags the C/O ratio closer to 1. However, a substantial (0.5) fraction of CO2 is required to bring the C/O ratio to within the typical stellar range (i.e. ∼ 0.8, see Section 5.2.4).
As discussed in Burningham et al. (2011), there are disagreements as to the full network of CO-CO2-CH4 reactions, and where their equilibria lie. But, even the most pro-CO2 cases in the BT Settl model grids discussed in that paper have CO outnumbering CO2 by a factor of 20, which is the most CO2 poor case illustrated in Figure 8.
In addition to the high abundance of CO2 required to bring the implied C/O ratio towards the expected stellar range, it is also clear that the implied [C/H] must rise significantly also. In all cases, allowing for enough CO and CO2 to bring C/O below 1.0 implies [C/H] > +0.5, which is significantly higher than suggested by the metallicty of the primary stars in the system, and beyond the range of carbon abundances or metallicities seen in the solar neighbourhood (e.g. Nissen 2013; Hinkel et al. 2014).
In the absence of spectroscopy covering the missing CO and CO2 absorption, we argue that the estimate of the C/O ratio made via comparison of our retrieved abundances with our thermochemical model grids (Figure 4) as the most reasonable, i.e. C/O≈ 1.35. We note that this value implicitly includes the impact of rain-out of oxygen in condensates according to phase-equilibrium chemistry at solar abundance ratios.

C/O of primary system
The C/O ratio of the primary system, Ross 458AB, is unknown, and it may not share the solar value of 0.54 (Asplund et al. 2005). However, the range of stellar C/O ratios for solar type stars is relatively narrow, with C/O > 0.8 very rare. As the primaries are not carbon stars, a C/O > 0.8 would be highly unlikely. The peak of the distribution is around 0.47, with a tail towards lower and higher C/O ratios (Nissen 2013;Brewer & Fischer 2017). This makes the high implied value for Ross 458c particularly interesting.

Interpretation of C/O ratio
Taking the above considerations into account, the fact that the C/O ratio appears to be so different from its primaries, may suggest a planetary formation route for Ross 458c. This formation would have to be such that it is not enriched with oxygen. Ross 458c has not necessarily formed in situ but could have migrated to its current position. The protoplanetary disk would have different compositions at different radii, at the various ice lines. In order for the atmosphere to be oxygen poor, the formation must have taken place outside of the H2O snowline, with a process that inhibits accretion of oxygen bearing silicates and allows for more carbon-rich gas to accrete. The suggestion of a planetary formation route for Ross 458c seems unlikely due to its mass ratio with the primaries and our own large retrieved mass (driven up by the large retrieved radius).
This oxygen depletion that we are seeing, is also noted in other retrievals, such as Calamari et al. (2022), who find a supersolar C/O ratio for Gliese 229B. They compare their work to other works done on T-dwarfs, who likewise see the trend of supersolar C/O Zalesky et al. 2019Zalesky et al. , 2022. They speculate that unanticipated chemistry may provide additional oxygen sinks, such as unmodeled magnesium silicates and/or iron-bearing condensates, that could drive an apparently super-solar C/O.

CONCLUSION
In this work we have presented the first retrieval of Ross 458c. Ross 458c is best parameterised by a non-grey (power-law) cloud, structured as a slab. Both the radius and the mass are overestimated, based on evolutionary models. The bolometric luminosity is higher than found in Burningham et al. (2011).
The CH4/H2O ratio is much higher than expected, at 1.97. This is due to missing oxygen from a low water abundance. Comparisons to thermochemical grid models suggest a still high C/O ratio of 1.35, if CH4 and H2O is quenched at 2000 K due to vigorous mixing.
Even with oxygen sequestered into clouds and accounting for transport induced quenching, the C/O ratio remains high. This points to either a planetary origin of Ross 458c, or the presence on an unidentified sink for the gas-phase oxygen.
These retrieval results are the first in an analysis of a larger sample of late-T dwarfs that seek to answer the following questions: Do companion T dwarfs have different composition than their host star? Is the C/O ratio of free-floating T dwarfs different to that of the stellar population and to that of companion T dwarfs? What trends in C/O ratio is present across the T dwarf range?

Future work
In order to compare the C/O ratios of companions to their primary star(s), we need the abundances of the primary. This could be done following the method of Tsuji & Nakajima (2014, which uses high resolution spectra of H2O and CO in the near-infrared bands.
Additionally, by extending the spectrum of Ross 458c to 5 µm, we would include CO and CO2 absorption which might be a significant contributor to carbon and oxygen abundances, altering the C/O ratio. No spectroscopic data at those wavelengths exists yet. The data for Ross 458c which will be obtained through JWST at near-infrared and mid-infrared wavelengths and at greater sensitivity will aid our understanding of absorbers in the atmosphere, such as CO and CO2, and the thermal profile over wider range of pressures.

DATA AVAILABILITY
All data underlying this article are publicly available from the relevant observatory archive, or upon reasonable request to the author.

ACKNOWLEDGEMENTS
J.G. acknowledges support from UK Research and Innovation-Science and Technology Facilities Council (UKRI-STFC) studentships. This research was made possible thanks to the Royal Society International Exchange grant No. IES/R3/170266. E.G. acknowledges support from the Heising-Simons Foundation for this research. This work benefited from the 2022 Exoplanet Summer Program in the Other Worlds Laboratory (OWL) at the University of California, Santa Cruz, a program funded by the Heising-Simons Foundation. This work has made use of the University of Hertfordshire's high-performance computing facility. J.F acknowledges the NSF award 1909776 and NASA XRP Award 80NSSC22K0142. R.L acknowledges the following grants: JWST-AR-01977.007-A, JWST-AR-02232.008A, 80NSSC22K0953 (NASA ROSES XRP). Table A1 shows the comparison of retrieved parameters between the EMCEE and the PyMultiNest sampler. The values are consistent with each other to 1 σ for the most part, with a couple of exceptions that differ by under 2 σ. Figure A1 shows the comparison of the thermal profiles of the EMCEE and PyMultiNest sampler for the power-law slab cloud. They follow similar profiles, with the EMCEE profile being slightly warmer at deeper pressures.

APPENDIX A: COMPARISON OF EMCEE AND PYMULTINEST RESULTS
The location of the EMCEE cloud is unconstrained within the atmosphere at the top of the cloud. Table A2 shows the comparison of CH4/H2O for the different models tested, for both the PyMultinest and the EM-CEE sampler. The CH4/H2O stays consistently high across all models, for both sampling methods.
The two highest ranking models are the same for both samplers, just with the order switched. See Table 7 Table A1. Comparison of parameters between EMCEE and PyMultiNest for the power law slab cloud (the winning model from Py-MultiNest, and the second ranked model from EMCEE, in order to allow for a more direct comparison. Figure A1. Comparison of the thermal profiles with their errors, for the power-law slab cloud of the EMCEE (in blue) and PyMultiNest (in black) samplers with the cloud condensation curves overplotted. Also plotted are the cloud locations in the bars on the rights (cloud N being PyMultiNest and cloud E being the EMCEE sampler), with their associated errors in grey.