Searching for NLTE effects in the high-resolution transmission spectrum of WASP-121 b with Cloudy for Exoplanets

Ultra-hot Jupiters (UHJs) undergo intense irradiation by their host stars and are expected to experience non-local thermodynamic equilibrium (NLTE) effects in their atmospheres. Such effects are computationally intensive to model but, at the low pressures probed by high-resolution cross-correlation spectroscopy (HRCCS), can significantly impact the formation of spectral lines. The UHJ WASP-121 b exhibits a highly inflated atmosphere, making it ideal for investigating the impact of NLTE effects on its transmission spectrum. Here, we formally introduce Cloudy for Exoplanets, a Cloudy-based modelling code, and use it to generate 1-D LTE and NLTE atmospheric models and spectra to analyse archival HARPS WASP-121 b transmission spectra. We assessed the models using two HRCCS methods: i) Pearson cross-correlation, and ii) a method that aims to match the average observed line depth for given atmospheric species. All models result in strong detections of Fe I ($7.5<S/N<10.5$). However, the highest S/N model (LTE) does not agree with the best-matching model of the average line depth (NLTE). We also find degeneracy, such that increasing the isothermal temperature and metallicity of the LTE models can produce average line depths similar to cooler, less metal rich NLTE models. Thus, we are unable to conclusively remark on the presence of NLTE effects in the atmosphere of WASP-121 b. We instead highlight the need for standardised metrics in HRCCS that enable robust statistical assessment of complex physical models, e.g. NLTE or 3-D effects, that are currently too computationally intensive to include in HRCCS atmospheric retrievals.


INTRODUCTION
WASP-121 b is one of the best studied examples of an ultra-hot Jupiter (UHJ), and the first exoplanet to have a stratosphere (thermal inversion) detected in its atmosphere (Evans et al. 2017).Recent JWST phase curve observations have revealed a day-side hot spot eastward of the substellar point, and are consistent with a cloudy night-side (Mikal-Evans et al. 2023).A number of works have also been published inventorying the chemical species in its atmosphere, using high-resolution transmission spectroscopy.Collectively, they have identified neutral atomic species such as H, Li, Na, Mg, K, Ca, Sc, V, Cr, Mn, Fe, Co, Ni, Cu, and Ba, as well as Ca ii and Fe ii (Gibson et al. 2020;Cabot et al. 2020;Hoeĳmakers et al. 2020;Ben-Yami et al. 2020;Merritt et al. 2020Merritt et al. , 2021;;Borsa et al. 2021b;de Regt et al. 2022;Azevedo Silva et al. 2022;Gibson et al. 2022), although Fe ii and neutral Mg are debated in the literature with non detections in high-resolution (Hoeĳmakers et al. 2020;Merritt et al. ★ E-mail: mitchell.young@physics.ox.ac.uk 2021).It is worth noting, however, that both Fe ii and Mg ii have been detected at UV wavelengths with low-resolution observations (Sing et al. 2019).
The advantage of high-resolution spectroscopy is that it is sensitive to the number of spectral lines and line depth ratios, information that might be missed in low-resolution, while also probing higher altitudes than low-resolution can reach.Unfortunately, unlike stellar high-resolution spectroscopy where individual spectral lines stand out well above the level of the noise, exoplanetary spectral features are generally of comparable signal to observational noise or are buried within it.Therefore, rather than study individual spectral lines, it is common to employ high-resolution cross-correlation spectroscopy (HRCCS), which involves cross-correlating a model template with a time series of observations to extract the planetary signal from within the noise.
It is common to make a number of assumptions when generating templates for HRCCS, and when modelling exoplanetary atmospheres in general.These can include: i) modelling the atmospheres in 1-D, ii) assuming they are isothermal, or iii) calculating the thermodynamic properties in local thermodynamic equilibrium (LTE), among others.Indeed, many of the studies performing atmospheric inventories of WASP-121 b make one or more of these assumptions.The motivation for doing so is obvious; by simplifying the calculations involved, the time and resources needed to compute the models is reduced, an important consideration for log-likelihood retrieval frameworks that depend upon producing large numbers of models to find a best fit to the data.But models making use of these simplifications are not always able to match observations.For example, Hoeĳmakers et al. (2020, hereafter Ho20) found that their isothermal LTE models were able to make strong detections at optical wavelengths of a number of neutral Mg, Ca, V, Cr, Fe, and Ni, but upon investigating further, they found that a given species' template under-predicted observed average line depths, by factors of 1.5 to 8 depending on the species.Notably, their models under-predicted the strength of the Fe lines, one of the species that shows the largest divergence from LTE models when relaxing that assumption, by a factor of 4.7, as we show in Section 5.1.Hoeĳmakers et al. (2022) found that a warmer isothermal model provided a better fit to the data, but reduced the depth of the observed signal significantly.The authors note in Ho20 that the under-prediction of line depths may be explained by their models assuming chemical and hydrostatic equilibria, whereas the exoplanet itself is under no such constraint.Indeed, Huang et al. (2023) find an escaping atmosphere adequately explains similar observational discrepancies at UV wavelengths, but their hydrodynamic model allows for a more complex temperature structure and some divergence from LTE.Further, the only optical wavelength features to extend beyond the Roche Lobe in their model are H, H, and the Ca ii H and K lines.While hydrodynamics almost certainly still contributes to the strong optical absorption observed in Ho20, atmospheric escape may only be part of the solution.
Recently, Fossati et al. (2021) and Borsa et al. (2021a) demonstrated the effects of non-local thermodynamic equilibrium (NLTE) in the atmosphere of KELT-9b, with NLTE models fitting observations of H, H, and the infrared O i triplet lines more accurately than their LTE counterparts.Fossati et al. (2021) demonstrated that NLTE causes a dramatic heating of the upper atmosphere (< 10 −4 bar) compared to LTE, with temperature increases of around 2500 K, which led to a more extended atmosphere via the pressure scale height, and thus, deeper transit absorption features.Furthermore, computing the radiative transfer for spectral synthesis in NLTE was shown to increase the absorption depth by an additional ∼ 10%.While such effects in the atmosphere of WASP-121 b, if present, are likely to be more moderate than KELT9-b as the stellar irradiation is less intense, the increased depths of absorption features in NLTE provide another possible explanation of the discrepancy seen in Ho20, particularly for Fe i.
In this paper, we explore the effects of NLTE radiative transfer and variable temperature profiles on the depths of absorption features in the transmission spectrum of WASP-121 b to assess the need to account for these when modelling high resolution transmission spectroscopy.In Section 2, we describe in detail how we prepare our atmospheric models and synthetic transmission spectra for HRCCS analysis.In Section 3, we compare and contrast the differences between our various models, identifying and highlighting NLTE effects.In Section 4, we present our methodology, adapted from that used in Ho20, for the HRCCS analysis of the observational data.In Section 5, we present the results of our HRCCS analysis and discuss their implications.Finally, in Section 6, we present the conclusions drawn from the analysis.

Cloudy
Most publically available exoplanetary atmospheric modelling and radiative transfer codes (e.g.petitRADTRANS (Mollière et al. 2019), TauREx (Al-Refaie et al. 2021), HELIOS-K (Grimm et al. 2021), etc.) are not built to perform calculations in NLTE.Consequently, we used the general astrophysical simulation code Cloudy (v 17.03, Ferland et al. 2017) to model the exoplanetary atmospheric structures and associated transmission spectra in this work.Cloudy is an open-source radiative transfer and microphysics code with a wide range of applications, from planetary nebulae to supernovae winds to the interstellar medium.The primary benefit of Cloudy from an exoplanet modelling point of view is that it is capable of carrying out self-consistent atmospheric structural modelling and spectral synthesis under the conditions of NLTE.While Cloudy was not designed with exoplanets in mind, there is a growing body of work adapting it to this purpose (e.g., Salz et al. 2015Salz et al. , 2018Salz et al. , 2019;;Fossati et al. 2020Fossati et al. , 2021Fossati et al. , 2023;;Turner et al. 2020;Young et al. 2020a,b;Borsa et al. 2021a;Kubyshkina et al. 2023;Linssen et al. 2023).
Cloudy's NLTE calculations are unreliable for densities ≳ 10 15 cm −3 , such as are found in exoplanetary lower atmospheres and interiors, typically between ∼ 10 −3 − 10 −4 bar for hot Jupiters (HJs) and UHJs.This density limit is driven by the approximations involved in the treatment of three-body recombination-collisional ionisation for elements heavier than H and He (Ferland et al. 2017).However, tests indicate that this most strongly affects calculations in which Cloudy is allowed to compute the thermal structure, while those with a fixed temperature profile appear to be significantly less affected (Fossati et al. 2020;Young et al. 2020a).This is because the approximations mostly impact the heating and cooling functions that are only involved in calculating the temperature structure, and not radiative transfer calculations.For densities greater than ∼ 10 19 cm −3 (between ∼ 0.1 − 10 bar for HJs and UHJs), Cloudy's NLTE calculations converge to LTE solutions.For an in depth discussion of the application of Cloudy to exoplanetary atmospheres and NLTE conditions therein, we refer to Section 2.2.1 of Fossati et al. (2021).

Cloudy for Exoplanets
Following on from the work of Fossati et al. (2021), here we formally introduce Cloudy for Exoplanets (CfE), a Python interface for Cloudy that models 1-D hydrostatic NLTE exoplanetary atmospheres and high-resolution transmission spectra by preparing, running, and processing Cloudy simulations.The interface directs Cloudy to compute models and spectra in a two-pass procedure (first solving for the atmospheric structure, then synthesizing the spectrum), similar to other comprehensive atmospheric modelling codes (e. g.PHOENIX, Hauschildt et al. 1997;Barman 2007).

CfE Atmospheric Structure
Cloudy for Exoplanets first solves for the atmospheric structure via an iterative procedure: calculating or fixing atmospheric parameters that are passed as initial conditions to individual Cloudy simulations, then processing the simulation output to generate the atmospheric parameters for subsequent iterations.Unless directed otherwise, CfE initially assumes the atmosphere is isothermal at the equilibrium temperature ( eq ) over a user defined pressure range, and calculates the altitude scale of the atmosphere according to the pressure scale height, given a reference radius ( 0 ) and pressure ( 0 ), as where   is the radius of the th atmospheric layer,   is the Boltzmann constant, is the average temperature between radii   and  +1 ,  + 1 2 is the average mean molecular weight of the atmosphere between radii   and  +1 ,   is the gravitational acceleration at radius   , and   is the pressure at radius   .The radius scale is generated with a variable scale height, reevaluating the mean molecular weight and gravitational acceleration at each pressure sampling point.The gravitational acceleration accounts for the Roche potential, for which the generalised 3D formulation is where   , ,   , , and  , are the , , and  components of the gravitational acceleration at the th atmospheric layer, with where G is the gravitational constant,  is the orbital angular frequency,  =  ★  ★ +  is the ratio of the stellar mass to the total mass of the system, and   = cos() sin(),   = cos() cos(),   = sin() sin(),   = cos() cos(),   = cos(),   = − sin(), for co-latitude 0 ≤  ≤  and longitude 0 ≤  ≤ 2.This simplifies to for  = /2 and  = 0 at the substellar point.
The mean molecular weight of the th layer of atmosphere is calculated as where   is the atomic weight of species , and   is the number density of species .The density of the atmosphere is calculated according to the ideal gas law, where   is the number density of the th layer of atmospheric gas,   is the pressure of the th layer,  is a volume element for which the density is calculated (taken here to be a cubic cm), and   is the atmospheric temperature of the th layer.Given these initial conditions, CfE prepares a Cloudy simulation, passing in the substellar radius scale with the associated H density profile,   .Cloudy then irradiates the atmosphere with a user provided stellar spectrum and, given the orbital separation, , stellar luminosity,  ★ , and heat redistribution parameter,  (Burrows 2014; Koll 2022), solves for the excitation, ionisation, density, pressure, and thermal profiles of the atmosphere.Atmospheric heat redistribution is mimicked by scaling the input stellar spectrum by a factor of  , as in Fossati et al. (2021), as Cloudy does not natively allow for calculation of it.
CfE then prepares a second simulation, using the output from the first in place of the initial conditions, revises the radius scale and density profile, and repeats the above calculations.This process is iterated upon until the - and  differ between iterations by less than 1% for each atmospheric layer, at which point the atmospheric structure is considered converged.In practice, this takes between five to ten iterations for most models.To ensure complete selfconsistency within a given model, a final Cloudy simulation is prepared where the - profile is fixed to the profile of the converged iteration and not allowed to vary, solving only for the excitation and ionisation of the atmospheric constituents.The entire process has a computation time of between several hours to several tens of hours on a single processor, depending on the specifics of the model and system architecture.
As discussed in Section 2.1, the region of Cloudy NLTE validity for HJs and UHJs typically occurs at pressures between 10 −3 − 10 −4 bar, with the LTE convergence density typically occurring between 0.1 − 10 bar.Because the thermal solution in this density range (10 15 <  < 10 19 cm −3 ) is uncertain, CfE replaces this portion of the computed - profile in each Cloudy simulation with an analytic profile using a modified version of the Three-channel Eddington Approximation model of Guillot (2010), formulated as in Fossati et al. (2020).This approximation parameterises the - profile as  () = 0.75 2 3 where  is the atmospheric pressure,  int is the planetary internal temperature,  irr is the temperature of the stellar irradiation at the top of the atmosphere, and  is the thermal optical depth, with where  is the Planck thermal infrared opacity,  is the visibleto-thermal stream Planck mean opacity ratio,  is a catch all term for the albedo, emissivity, and day-night redistribution of thermal energy, and  is a model parameter used to control the slope of the profile's temperature variation.In Equation 10,   is the planetary surface gravity, evaluated at  0 for a spherical planet of mass   .In Equation 11,  2 () is the second order exponential integral function of .In Equation 12,   is the stellar radius,  is the orbital separation, and  eff is the stellar effective temperature.Here, CfE chooses , , , and  such that there is a smooth and continuous transition from the Cloudy modelled profile to the approximated profile across both density limits.It does so via an iterative process wherein all four parameters are varied simultaneously over set ranges, a best matching profile is selected, the ranges for the parameters are refined, and the process is repeated until the temperatures and slopes of the approximated profile match the modelled profile at the density limits within 0.1 K and 0.01%, respectively.

CfE Spectral Synthesis
Next, to produce the transmission spectrum associated with the atmospheric structure, CfE maps the 1-D temperature, density, and composition profiles onto concentric isobaric surfaces around the planet, then has Cloudy compute line-of-sight absorption and scattering of the stellar spectrum through the limb of the atmosphere at varying altitudes.This implicitly assumes a radially symmetric atmosphere for transmission along the line of sight and requires only a single Cloudy simulation to be computed per altitude which is then integrated around the line of sight axis and the disc of the planet.Physical distance through successive layers of atmosphere along line-of-sight at the limb is calculated as where   is the length through the th layer,   is half the chord length along line-of-sight through the th layer,   is the radius of the th layer, and ℎ  is the height down from the th layer of the atmosphere to the chord altitude at the terminator, presented schematically in the upper panel of Figure 1.The radius scale used here is the polar radius scale, calculated according to Equation 1 with   evaluated by setting  = 0 and  = 0 in Equations 3, 4, and 5.This formulation of the distance through the atmospheric layers assumes spherically symmetric atmospheric shells and does not account for deformation by the Roche potential effects.While this is inconsistent with treatment of gravity in the structural modelling phase, absorption and scattering for a given altitude at the limb primarily occur in the deepest layers of the atmosphere that the stellar radiation passes through.Hence, at the terminator, assuming spherical atmospheric layers is a good approximation and has minimal impact on the resultant transmitted spectrum.
The output from the individual Cloudy simulations for the various altitudes are compiled into the transmission spectrum by summing the contributions of the individual spectra, weighted by the relative area of the stellar disk they cover and the respective limb darkening.The limb-darkening law used in CfE is a non-linear relation (Claret 2000), of the form where  () is the line-of-sight intensity at position  = cos  ( is the angle between the line-of-sight and the emergent stellar intensity), and  0 ,  1 ,  2 , and  3 are limb-darkening coefficients (Salz et al. 2019).The stellar disc is divided into rings, similar to the planetary atmosphere, every 0. 01  ★ out to a maximum of 0. 99  ★ , and the limb darkening is evaluated independently for each ring.To generate the summation weights for building the transmission spectrum, we calculate the overlapping areas of the planetary and stellar rings using the formula for intersecting circles, where (, ) is the area of the intersection,  and  are the radii of the two circles, respectively, and  is the distance between the centers of the circles.To convert these circular intersections to ring intersections, we subtract off the circular intersections of the next smaller ring radii of each the planetary and stellar rings, where    is the overlapping area of planetary ring  and stellar ring ,   is the radius of the th planetary ring, and   is the radius of the th stellar ring.An illustrative schematic of a planet in-transit in this framework is presented in the lower panel of Figure 1.
The summation weights are then taken to be where    is the weight of the overlapping th and th rings,    / ★ is the overlap area of the th and th rings relative to the area of the stellar disc, and  (  , )/ ( = 1, ) is the wavelength dependent limb darkening of the th ring.For building the relative flux transit spectrum out of the individual ring spectra, we assume that the distance between the centers of the stellar and planetary discs is the impact parameter,  (i.e. , the planet is at mid transit), and the summation is performed according to where   is the in transit flux spectrum,  ,★ is the out of transit stellar flux spectrum at disc centre, and  , is the flux spectrum of the th planetary ring.

Cloudy Spectral Offset
We note that there is a unique behaviour to Cloudy's spectral output that is resolution dependent1 .Rather than place spectral lines exactly at the wavelengths listed in the line list database, Cloudy will instead place them at the boundary of the wavelength bin in which they fall.This results in a slight offset of all line positions that varies between lines, and is exacerbated at lower spectral resolution (where the wavelength bins are wider) but mitigated at higher spectral resolution (where the bins are narrower).While this behaviour is not ideal for high spectral resolution characterisation or precision radial velocity analysis, performing the radiative transfer calculations at sufficiently high spectral resolution such that the average position offset is less than the velocity resolution of both the observational data and the chosen analysis method should reduce any impact this may have on experimental results.We discuss the impact on the current work in Section 5.2.1.

WASP-121 b
In this work, we used CfE to generate 12 models approximating the atmospheric structure and transmission spectrum of the exoplanet WASP-121 b, with various - profiles including self-consistently (SC) modelled profiles computed with Cloudy, an analytic (A) profile generated according to Equation 9 ( = 3.0,  = 0.95,  = 1.17,  = 1.3), and an isothermal (I) profile at 2000 K, two atmospheric metallicities (1× solar, 20× solar), and two thermodynamic treatments (LTE, NLTE).The A and I profiles are held constant regardless of metallicity or thermodynamic treatment.A full list of model combinations and identifiers can be found in Table 1.The atmospheric models extend from 10 2 bar at the lower boundary, up to 10 −12 bar at the upper boundary.The upper boundary is well into the region of NLTE validity, and the lower boundary is below the point where the calculations converge to the LTE solutions.We include opacities from atoms and ions of elements up to and including Zn, but we choose to only include molecular opacities for H-based molecules (H 2 , H + 3 , etc).This is motivated by previous non-detections of common atmospheric molecules in transmission for WASP-121 b, such as H 2 O, TiO and VO (e.g.Ho20).We use the opacities and line lists natively included in Cloudy (Ferland et al. 2017, and references therein), rather than incorporate additional opacities from outside sources.Spectral synthesis was performed at a spectral resolution of  ∼ 300000, for 3000 Å ≤  ≤ 8000 Å.
For the incident stellar spectrum and calculating the stellar limb darkening, a high spectral resolution PHOENIX model from the Göttigen Spectral Library (Husser et al. 2013), with  eff = 6500 K and log = 4.0, was used.We note that this PHOENIX model does not include computation of the stellar chromosphere, and is therefore incorrect in the estimation of centre-to-limb variations for emission lines forming in the stellar chromosphere or corona.It also does not include significant X-ray or extreme-UV (XUV, collectively) flux, which can lead to underestimating hydrogen ionisation and atmospheric heating in the planetary upper atmosphere.Indeed, Huang et al. (2023) found that ground state hydrogen ionisation is the dominant source of thermospheric heating for WASP-121 b, although photoionisation of excited hydrogen does not significantly impact the thermal structure of the atmosphere.Their best fitting model indicates that the thermosphere lies entirely beyond the Roche lobe at  = 2 × 10 −9 bar, and is thus beyond our hydrostaic cutoff at  = 10 −8 bar, discussed further below.We therefore consider the lack of XUV flux in our input stellar spectrum to not have a significant impact on our - profiles in the hydrostatic region.
To determine where to set the transit radius, we generated a number of models varying the reference pressure over the range 10 −3 <  0 < 10 1 bar.For each of these models, optical continuum absorption in the associated transmission spectrum transitions from optically thin to optically thick over the range 10 −2 <  0 < 10 0 bar, with only minor variations dependent upon choice of  0 .We therefore chose to set  0 = 0.1 bar, consistent with other ultra-hot Jupiter modelling studies (e.g.Fossati et al. 2020).Further planetary, stellar, and system parameters used in the modelling are listed in Table 2.
Because Cloudy is a hydrostatic code, we choose to artificially truncate the upper boundary of our atmospheric models at 10 −8 bar, below the hydrodynamic region, before computing the transmission spectra.While this truncation removes some layers that are at least partially opaque, leading to an under-estimation of line depths for some strong absorption features (discussed further in Section 5.1), Cloudy would produce incorrect absorption depths for the hydrodynamic region regardless, therefore we choose to only include the region where the model is valid.Following the reasoning of Borsa et al. (2021b), we calculate the region where the atmosphere begins to deviate from hydrostatic equilibrium via the Jeans escape parameter where  is the mass of an escaping particle, Δ  is the gravitational potential difference between the th pressure level and the Roche lobe.Jeans escape parameters as functions of atmospheric pressure for the various models are presented in Figure 2.For the A and I models, there is no noticeable difference between LTE and NLTE for either 1× or 20×.We find for the SC and A models that the deviation from hydrostatic equilibrium (as   → 0) occurs between 10 −8 − 10 −10 bar, above our truncation point, while for the I models the deviation occurs beyond the top of the model at 10 −12 bar.We choose to still truncate the I models at 10 −8 bar to remain consistent with the other models.
For the SC - profiles, CfE allows Cloudy to self-consistently compute heating and cooling rates throughout the atmosphere, accounting for contributions from each atmospheric chemical species as well as assorted physical processes.No constraints are placed on the resultant - profiles, and in some cases, multiple thermal in- versions occur in a single profile.While the various self-consistent profiles diverge from one another in the NLTE valid region, at higher pressures they are expected to converge to an equilibrium temperature of ∼ 2350 K, becoming isothermal between ∼ 10 −1 − 10 0 bar and remaining so to the bottom of the atmosphere.The A profile is generated by setting the values of , ,  and  in Equations 10, 11 and 12.While three of these parameters are related to physical properties of the atmosphere (,  and ), we set their values independent of any physical motivation such that instead the resultant - profile approximates a SC LTE profile ( = 3,  = 0.95,  = 1.17,  = 1.3).These - profiles and the associated radius profiles for the models are presented in Figure 3.The atmospheric metallicity was set by adjusting the abundances relative to H for all elements heavier than He.The default Cloudy solar abundance distribution is used, a combination of Allende Prieto et al. (2001Prieto et al. ( , 2002) ) Figure 3. Top -Atmospheric - profiles for the SC NLTE models (blue) and SC LTE models (green), as well as the A profile (red) designed to approximate the SC 1× LTE model, and the I profile (black).NLTE models display increased temperatures at pressures lower than ∼ 10 −2 bar, with differences between ∼1000-2000 K. Bottom -Atmospheric substellar radii for the various models in this work.For the A and I models, the NLTE and LTE radii are identical on the scale of this figure.The horizontal grey line at 0.1 bar in this and the following panels indicates the pressure level at which we set the reference radius equal to the observed planetary radius.The L1 point is located at 1.822  p , just outside the region displayed in the figure.

Hydrodynamic Comparison
To assess the limitations of our assumption of a hydrostatic atmosphere, we compare our atmospheric structures and transmission spectra with the best fitting model of Huang et al. (2023) (Case D in that work).As the authors are primarily interested in investigating atmospheric escape, and consequently investigate different physical processes than those focused on in this work, we draw attention to some noticeable similarities and differences between the two modelling approaches.Initially, Huang et al. ( 2023) start with the same planetary and system parameters as in this work (Delrez et al. 2016), but find adjusting the planetary mass ( P = 1.157 ± 0.07  J ) and transit radius ( P = 1.773 +0.041 −0.033  J ) provides a better fit to the data.While they also account for the Roche potential in a similar fashion as this work, these altered parameters, along with an adjustment to the surface gravity, lead to the L1 point being located slightly closer to the planet than we find, at  L1 = 1.71  P (∼ 2 × 10 −9 bar for Case D) rather than  L1 = 1.822  P .
The most significant difference between the Huang et al. ( 2023) model and those in this work, is their inclusion of hydrodynamic effects in the upper atmosphere ( < 10 −6 bar).At higher pressures, they instead use a hydrostatic model, and do not self-consistently model the - profile, but rather adopt the profile of Mikal-Evans et al. ( 2019) (increasing it by 200 K at 10 bar to avoid Fe condensation), reaching a maximum temperature of ∼ 3300 K near the top of the hydrostatic model region.After the transition to the hydrodynamic model region, the atmosphere initially cools off with increasing altitude before a strong inversion in the thermosphere, beyond 10 −9 bar, that reaches temperatures greater than 12000 K before cooling again.The end result is a - profile that is cooler than any of the ones in this work (with the exception of the I profile) up to our hydrostatic cutoff.
The cooler - profile, along with Huang et al. (2023) setting the transit radius at  = 4 × 10 −3 bar (rather than the 10 −1 bar used here), results in transmission features that are shallower than the SC and A model spectra through most of the optical, even with the contribution from the hydrodynamic layers.Notably, H is deeper in the hydrodyanmic model, likely caused by the transport of excited hydrogen to higher altitudes via hydrodynamic effects.Figure 4 compares the Huang et al. (2023) Case D transmission spectrum with our transmission spectra with the deepest features (SC NLTE 20×) and shallowest features (I LTE 1×).We note that although we do not model lines at wavelengths shorter than 3000 Å, the near-UV spectrum of the hydrodynamic model displays drastically deeper lines than the optical, with transit depths in excess of 30%.However, the optical transmission features are weaker than those of the SC and A models in this work, even though the Huang et al. (2023) model extends to lower pressures.

MODEL COMPARISON
To interpret the results of the HRCCS analysis and draw conclusions regarding the conditions in the atmosphere of WASP-121 b, an understanding is first needed of what differences there are between the various model transmission spectra and templates, and what differences in the atmospheric structures may lead to them.To facilitate   Huang et al. (2023).Spectra from this work have been rotationally broadened as discussed in Section 4.2 and convolved to a spectral resolution of  = 30000 to match the Huang et al. (2023) spectrum.Bottom -- profiles corresponding to the above spectra.We note that the Huang et al. (2023) profile experiences a sharp inversion above the bounds of this panel, increasing to ∼ 12500 K between 10 −9 to 10 −10 bar, before cooling to ∼ 1500 K around 2 × 10 −12 bar.this, we define Δ NLTE , Δ MH , and Δ ISO to be the differences in model quantity  relative to LTE, 20× metallicity, and isothermal models, respectively.The terms are formulated as and where  is an atmospheric property to be compared (e.g.,   , etc), and - in Δ ISO can be either SC or A. These terms serve to demonstrate how the various model parameters affect quantities relative to the parameters chosen in Ho20 as the best fitting model.

Temperature Profiles
The three types of - profile used in this work (SC, A, and I) exhibit large differences in temperature throughout the atmosphere, as shown in Figure 5.The A and SC profiles are approximately isothermal below ∼ 2 bar, within ±25 K of the planetary equilibrium temperature,  eq = 2358 K.The SC LTE profiles are well approximated by the A profile, with the SC 1× LTE profile being on average 0.88% warmer and the SC 20× LTE profile being on average 2.06% warmer.The exact deviations from the SC LTE profiles are shown as functions of atmospheric pressure in Figure 6.
Models with SC profiles display the same trend of NLTE upper atmospheric heating that has previously been demonstrated for KELT9-b (Fossati et al. 2021).For pressures lower than ∼ 10 −2 bar, Δ NLTE  (middle panel of Figure 5) reaches a maximum of 44.5%, or ∼ 1885 K, for 1× models and 33%, or ∼ 1235 K for 20× models, both at 10 −8 bar.The Cloudy density limit was exceeded at pressures higher than ∼ 10 −2 bar, and the SC - profiles were replaced with the analytic form between 10 −2 − 5 bar, at which point the SC NLTE profiles converged with the LTE profiles.The SC 1× models show the profiles have already diverged by ∼ 8% at 10 −2 bar, with Δ     increasing at a roughly constant rate of ∼ 2.5% per  until ∼ 10 −7 bar, where the rate increases to ∼ 20% per .The 20× models are approximately equal in temperature at 10 −2 bar, then diverge by Δ     ∼ 25% between ∼ 10 −3 − 10 −4 bar, after which the difference remains approximately constant within ±4% until the top of the atmosphere.These temperature increases in NLTE will increase the atmospheric scale heights of the models relative to LTE, and therefore the absorption depths as well.This may present differently across the two metallicities for different species, depending on where in the atmosphere the species spectral features form.

Fe Mixing Ratios
The largest factor contributing to the NLTE temperature increase is the population of Fe ii, the dominant source of heating in the upper atmosphere, as discussed in Fossati et al. (2021).In NLTE models, Fe is more heavily ionized than in LTE (see Figure 7), providing additional heating.Note that for the SC 1× models, there is already a large increase in the quantity of Fe ii in NLTE at 10 −2 bar, corresponding to the SC 1× NLTE temperature profile already being hotter than the SC 1× LTE profile at this point, while the 20× models don't show an increase in Fe ii at pressures lower than 10 −2 bar until ∼ 10 −3 , which again corresponds to the point where the SC 20× NLTE - profile begins to diverge from the SC 20× LTE one.While WASP-121 b shows a lesser absolute NLTE temperature increase than the KELT9-b models of Fossati et al. (2021), Δ NLTE  is approximately the same, ∼ 40%.Likewise, the ionisation of Fe shows relative NLTE increases similar to those seen for KELT9-b as well.

Atmospheric Radius
A direct consequence of the differing model temperatures is that the physical extent of the atmosphere will also differ, inflating for hotter temperatures and contracting for cooler, according to the pressure scale height (Eqn.1).The various atmospheric radii are displayed in Figure 8.There was no difference between the NLTE and LTE radii for either the A or I models, and we have chosen to only show one example of each for each metallicity.
Choice of - profile is the primary contributing factor to the differences.Radii are increased relative to the corresponding I models by between 21 − 43.5%, or 0.24 − 0.51   , at the top of the atmosphere, with the SC 1× NLTE model showing the greatest difference and the A 20× models showing the least.The two SC LTE models have slightly larger radii than the corresponding A models, corresponding to the SC LTE - profiles being slightly warmer on average than the A profile.Both the choice of thermodynamic treatment and metallicity have noticeably less impact on the radius.The radii of the SC NLTE models increase relative to the SC LTE models by between 6.9 − 11.3%, or 0.10 − 0.17   , with the 1× model exhibiting the larger increase of the two, and the 1× models increase relative to the 20× models by between 3.1 − 11.8%, or 0.04 − 0.18   , with the SC NLTE models exhibiting the largest increase and the I models exhibiting the least.
Differences in radii across metallicities for the A and I models are attributed to the decreased metallicity of the 1× models decreasing   , which in turn increases the pressure scale height.This also plays a role in the SC models, but acts in addition to the changes brought on by the temperature differences.The source of the differences across thermodynamic treatment for the SC models is primarily the difference in the temperature profiles, although we note that in theory the   should also vary, brought on by changes in ionisation rates between NLTE and LTE models.In practice, these differences in   have negligible impact on the radii, as is evidenced by there being no noticeable difference between the radii of the NLTE and LTE A and I models.

Transmission Spectra
The variations in the model atmospheric structures discussed in Section 3.1 impact the associated transmission spectra in a number of ways: (i) A larger atmospheric radius means that a larger portion of the stellar disc is occluded during transit, increasing the depth of transmission features.
(ii) Differences in excitation and ionization means more or less of an absorbing species is available for specific transitions, changing how many photons are absorbed at the corresponding wavelength(s), and consequently increasing or decreasing the absorption depths of lines and the level of the continuum.
(iii) Temperature plays a role in both previous items, but also the velocity distribution of particles, changing the rate of collisions which in turn affects collisional broadening of absorption features.
In practice, these effects lead to a number of general trends in the transmission spectra, as can be seen in Figure 9.The most obvious trend is absorption line depth increases with increasing - complexity, from I to A to SC, a consequence of (i).Next, for a given type of - profile, NLTE spectra have both deeper and broader absorption features than their LTE counterparts.The difference in feature depth is a combination of (i) and (ii), while the additional broadening, most noticeable in the wings of strong features such as the Fe i line at ∼ 4384.8Å, is credited to (iii).Notably here, the I NLTE spectra exhibit a significantly larger number of absorption features than the I LTE models.At an isothermal temperature of  = 2000 K, the models are too cool to significantly thermally ionise any of the constituent atoms, and so ionised species only appear in the NLTE models.Finally, again for a given type of - profile, metallicity makes little difference in general to the depth of absorption features, but does increase continuum absorption by a small amount.
In more detail, we present Δ ISO , Δ NLTE , and Δ MH of the transmission spectra, the % change in the depths of absorption lines between models, in Figure 10.All of the SC and A spectra universally exhibit greater absorption at all wavelengths than the I spectra, primarily as a result of the increased temperatures increasing the atmospheric scale height.For each SC and A spectrum, the vast majority of lines are only moderately deeper than in the corresponding I spectrum, and by an approximately equal amount across all wavelengths, with the exception of a handful of lines which show a much greater increase in depth.These include H (6562.8Å, more prominent in the 1× spectra), Na i D (5889.9Å and 5895.9Å), Ca ii H & K (3933.7 Å and 3968.5 Å, more prominent in the NLTE spectra), several strong Fe ii lines near 5000 Å (again more prominent in the NLTE spectra), and a small forest of lines between 3000 − 3500 Å, mostly Fe ii again.These can all be attributed to the increase atmospheric radius that comes with the increase in model temperature, while the Fe ii lines can additionally be attributed to the lack of Fe ionisation in the I models, particularly the I LTE models which do not provide a mechanism to noticeably ionise Fe.
The Δ NLTE for the SC spectra appear qualitatively similar to Δ ISO , although generally there is not as much of an increase in absorption in any feature, a few of the prominent lines that were present in Δ ISO are now missing (notably the Na i D lines), and a number of lines between 4000 − 6000 Å are actually weaker in the SC NLTE spectra by ∼ 0.1%.The A and I Δ NLTE spectra are qualitatively similar, showing both increases and decreases in line absorption, with a slight tendency of decreased absorption across the wavelength range, and no prominent features present.These decreases in absorption are caused by the increased ionisation of neutral species in the NLTE models, leaving fewer absorbers present.In the SC spectra, this is offset by the increased NLTE temperatures deepening the absorption features.
For Δ MH , there is little variation in any of the transmission spectra, with only the Balmer series lines appearing consistently for all choices of - profile, and Ca ii H & K for the SC and A spectra.Otherwise, increasing metallicity generally reduces absorption depths by increasing   , leading to smaller scale heights, but the effect is small.
It is also informative to investigate in which region of the atmosphere the bulk of the lines form.To this end, we calculate the transmission contribution function as in Mollière et al. (2019) for each of the 12 Fe i templates.Contribution functions for the SC 1× templates are presented in Figure 11.For the NLTE template, the majority of line formation occurs between 10 −1 to 10 −4 bar for wavelengths bluer than ∼ 5000 Å, and between 1 to 10 −3 bar for redder wavelengths.The continuum largely forms at pressures higher than 1 bar.In LTE, the situation is similar in terms of the regions where the lines form, but the template continuum regions  Figure 7. Top -Volumne mixing ratios of Fe i (blue and green) & ii (cyan and light green) for all SC models.The dominant ionisation state of Fe switches from neutral to singly ionised about 1 dex higher in the atmosphere for the 20× models than it does for the 1× models.The kink around 0.1 bar in the Fe i mixing ratios is the region where H 2 dissociation occurs in the models.Bottom -Relative difference between NLTE and LTE Fe ii mixing ratios for SC models.The 1× model is equally or more heavily ionised in NLTE than LTE throughout the atmosphere, while the 20× model shifts back and forth between more and less ionised in NLTE, with a net effect of being more ionised over the whole atmosphere.can be seen with greater ease, largely due to the greater line wing absorption in the NLTE template.
The other models' contribution functions appear qualitatively similar to their respective SC 1× thermodynamic treatment counterparts and exhibit noticeable trends with metallicity and choice of - profile.For the 20× models, line formation occurs higher in the atmosphere relative to the 1× models by approximately one order of magnitude in pressure.While this would suggest that the 20× models should show stronger absorption than the 1× models, this is counteracted by the positive values of Δ    for pressures lower than 0.1 bar.The A and, to a larger extent, I models show contributions that are less concentrated in the upper atmosphere than their SC counterparts, effectively spreading the absorption out over a larger region deeper into the atmosphere.

DATA ANALYSIS
To assess whether the under-prediction of Fe i absorption seen by Ho20 can be explained by either increasing the complexity of the - profile, adjusting the atmospheric metallicity, or accounting for the effects of NLTE (or some combination therein), we recreate here their HRCCS analysis.To do so, archival observations of WASP-121 b during transit are cross-correlated with the Fe i model templates produced in this work.These are then compared with cross-correlations of the data injected with our model spectra.We outline the procedure in detail below and make note in places where we specifically diverge from Ho20 methodology.

Observations
We obtained archival observations of three transits of the UHJ WASP-121 b, via the ESO Science Archive Facility.The data were collected with the HARPS instrument mounted on the ESO 3.6 m telescope at La Silla Observatory, Chile, across three half-nights on 31st Dec 2017 (hereafter 'Night 1'), 9th Jan 2018 (hereafter 'Night 2') and 14th Jan 2018 (hereafter 'Night 3'), as part of the HEARTS survey (ESO programme: 100.C-0750; PI: Ehrenreich).We excluded a single exposure from the second night of observations from our analysis, due to low signal to noise.

Post-processing
Following Ho20, we used data processed by the HARPS Data Reduction Pipeline (DRS, Lovis & Pepe 2007) in the e2ds format for our analysis.In the e2ds format each exposure is extracted as a 72 x 4096 two-dimensional spectra matrix where each row corresponds to an individual echelle order.We reordered the extracted spectra into 72 two-dimensional orders, where each order covered a unique wavelength range.Each row of these new orders corresponded to a unique exposure and each column (or wavelength channel) corresponded to a unique wavelength value, an example of which can be seen in row 1 of Figure 12.
Telluric removal: We removed the telluric contamination from our data at this stage, when it was in the Earth's reference frame.We used a different approach to Ho20, who applied molecfit to the one-dimensional s1d data products from the HARPS DRS to generate a telluric model for the entire wavelength range covered by HARPS.This was subsequently interpolated onto the wavelength grid of each order and divided out.
In contrast, we create a master telluric spectrum for each spectral order by first removing the continuum from the individual frames for a given night, and then averaging these continuumremoved spectra.We note that the continuua were removed the for the purposes of generating a master telluric spectrum only, and the remaining post-processing continues with the spectra still containing their continua.The result of combining all the frames is a very high signal-to-noise master telluric spectrum.The telluric lines were then removed from each order by dividing out the master telluric spectrum.
This approach works on the premise that the planet spectrum Doppler shifts significantly across the full in-and out-of-transit observing sequence such that it is not removed by the master telluric.While the planet spectrum shifts by 6-7.5 km s −1 between subsequent frames in these data (depending on individual exposure times), each pixel in the observed spectrum corresponds to only 1.7 km s −1 (when accounting for the reduction in effective resolution from planetary rotation2 ).The spectrum of the planet therefore shifts ∼ 4 pixels per frame, such than any in-transit spectra are offset and do not constructively sum into the master telluric.We note that this approach works well here due to the rapid orbital velocity of the planet and the time sampling of the observing sequence, but caution that any data set would need its own assessment before using this method of telluric removal.
Alignment: Following Ho20, we next aligned the data to the rest frame of WASP-121 b's parent star via a single interpolation, combining two velocity components: i) the barycentric velocity values for each exposure provided by the DRS and ii) the stellar reflex velocity ( ★ ) of each exposure.We did not use the  drift values provided by the HARPS DRS to align the data to the stellar rest frame, because the DRS did not provide a sufficiently stable value, oscillating randomly around 19.0 km s −1 for each exposure, indicating that it had not found an accurate  drift solution.This is unsurprising as WASP-121 is a rapidly rotating star ( sin  = 13.5 ± 0.7 km s −1 Delrez et al. 2016), with broadened lines that impact the ability of the DRS to accurately calculate the  drift .Instead, we calculated  ★ =  ★ sin 2 for each spectrum, where  ★ is the star's  semi-amplitude,  ★ = 181 m s −1 (Cabot et al. 2020) 3 , and  is the planet phase at the time the exposure was taken.Following Ho20, this meant that the data was now consistently olution element of / = 300 000/120 000 = 2.5 km s −1 .Each resolution element has a full width at half maximum FWHM = 4.1 pixels, resulting in a velocity resolution per pixel of 2.5/4.1 = 0.6 km s −1 .However, the impact of  proj,p = 7.0 km s −1 can be approximated as an effective reduction in resolving power of / proj,p ≈ 43 000, which results in an effective velocity resolution per pixel of 1.7 km s −1 . 3We note a unit error in Cabot et al. (2020), where the  ★ is stated to be 181 km s −1 as opposed to 181 m s −1 .offset from the stellar rest frame by the systemic velocity ( sys = 38.36km s −1 , Merritt et al. 2020).
Masking: Next, we applied a mask to contaminated and outlying regions of each order.We followed a different procedure to Ho20, who used a two-step masking process of applying a 5- clip to each pixel, followed by a visual inspection.They interpolate outlier individual pixels, whilst contaminated wavelength channels or wavelength channels with over 20% flagged pixels were masked.In contrast, we applied a 2- clip to each wavelength channel, such that if its standard deviation was over two times greater than the median of the whole order, the wavelength channel was masked to zero.This approach led to 5.8%, 6.4% and 6.4% of the data from Nights 1, 2, and 3 respectively being masked, in contrast to the 1.1%, 2.7% and 1.6% affected by masking in Ho20.Whilst our approach led to more data being masked, we found that using a weaker 3- clip led to a marginally weaker Fe i recovery, suggesting that the masked regions were sufficiently noisy or contaminated that they obscured the signal recovery.Our approach has the advantage of providing more objectivity than possible with a visual inspection.
Model injection: If a model spectrum was injected, it was injected at this stage.We followed the injection routine laid out in Ho20.The injected models were first broadened to by convolution with a rotation kernel to account for WASP-121 b's rotation,  proj,p = 7.0 km s −1 .As in Ho20, we used the rotation broadening approach described in Brogi et al. (2016).This accounts for the varying impact of the rotational broadening as the planet transverses the face of the star, and results in the spectral lines having a distinct double-peaked shape (see Figure 13).The model spectrum was next shifted from vacuum to air wavelengths.We then simulated the shape of the WASP-121 b transit4 .To inject the model into the in-transit spectra, we shifted the model spectrum to the expected radial velocity of the planet during that exposure, then multiplied it by the model transit light curve at that orbital phase, and finally multiplied it into the observed data.All injections are made with the all-species model for whichever - profile, metallicity, or thermodynamic treatment being investigated.During the cross-correlation, we use templates that match but instead of all-species the template is for a single species only.For example, for the case where we cross-correlated with the SC 20× NLTE Fe i model, we had injected the SC 20× NLTE allspecies model spectrum (see Table 1 for model identifiers).For our reproduction of the Ho20 results, we injected the all-species 2000 K model spectrum without TiO opacity.From this stage forward in the data analysis, we treated the model-injected data and the data with no model injected (hereafter 'observed data') exactly the same.
Colour-correction: Next, we performed a colour-correction to impose consistent mean flux levels across the spectra in each order, which facilitated the identification and removal of the Doppler shadow after the cross-correlation (see Section 4.3).Following Hoeĳmakers et al. (2020) to colour-correct, we calculated the mean wavelength channel, which gave us the mean flux level of each spectrum.We then took the mean of the mean wavelength chan-nel, which gave the mean flux of the entire order.We divided the mean wavelength channel by the mean flux of the order, which gave the deviation of each spectrum's mean flux level from the order's mean flux.Finally, we divided the order by these flux deviation values.To compensate for this colour-correction, the cross-correlation functions (CCFs) were weighted at the end of the cross-correlation method.

Cross-correlation
At this stage, our data were ready for cross-correlation.We prepared our cross-correlation templates in the same way as Ho20 (see Figure 15).For each cross-correlated template, we first shifted the model spectrum from vacuum to air wavelengths and truncated it such that it only spanned the wavelength coverage of HARPS, while leaving some padding on either end.We next fitted the continuum by sampling the maximum value of the model spectrum at regular intervals and interpolating these points together.This was a different approach to Ho20, who fitted a low-order polynomial to find the continuum.We justify this change in the next paragraph.We subtracted the continuum such that the template flux was centered around zero, and all absorption lines took on negative values.We then clipped to zero any lines shallower than 1 × 10 −4 times the amplitude of the deepest spectral line in the cross-correlation template.Next, we convolved the template with a Gaussian kernel such that the template matched the resolving power  = 120,000 of HARPS.Finally, we normalised the cross-correlation template such that it summed to one, which served to flip the absorption lines.
The step of normalising the cross-correlation template to sum to one preserves the relative line depths, but naturally changes the absolute line depths and average line depth.We found that, because of this, the extent to which the continuum was fitted had a substantial impact on the average line depths of the normalised templates.If the continuum was over-fitted (so that the bases of the deepest lines were misidentified as continuum and removed) then the depth of the deepest lines was reduced and thus the average line depth of the normalised template was increased.If the continuum was underfitted (such that it was not fully removed after subtraction) then the lines remained artificially deep, leading to the normalised template having a too small average line depth.As an example of the scale of this issue, the subtraction of a highly over-fitted continuum led to an nearly 100% increase in the line amplitude of the observed cross-correlation function (CCF) for Mg i, whilst the subtraction of an under-fitted continuum led to a nearly 30% reduction in CCF line amplitude.We found we could not adequately fit the continuum with the use of a low-order polynomial, hence our decision to use an alternative method, combined with manually checking that the continuum was not over-or under-fitted.
We used the Ho20 cross-correlation operation (see Equation 23) to perform the cross-correlation.The normalisation of the cross-correlation template described above means that the crosscorrelation operator is a weighted average of the photon counts registered by the detector.The cross-correlation operation between the full wavelength range of a single exposure (  ()) and the normalised template (  ()) is: where (, ) is the cross-correlation coefficient at planet radial The largest NLTE differences are seen for the SC spectra, again driven by the large differences in temperature, whereas the differences for the A and I spectra are primarily caused by the differences in level populations.
velocity  obtained at time .We searched the region of velocity space Δ ± 300 km s −1 at 1.0 km s −1 intervals (following Ho20).
We Doppler shifted the cross-correlation template according to every Δ value within our velocity grid and performed the crosscorrelation between the shifted template and the model-injected or observed spectrum.The resulting array of correlation coefficient values formed a single CCF.We repeated this process for every spectrum in every order, resulting in 72 CCF matrices.As Equation 23 covers every wavelength channel across all orders, it implicitly includes the summation of the orders, and thus we stacked and summed the CCF matrices to obtain the final CCFs for all the exposures per observing night.At this stage, we combined all three nights of data by rearranging all the summed CCFs according to phase.This gave us a phase-ordered all-nights CCF matrix.
The next stpdf in the analysis involved operations performed directly on the CCFs.The CCFs at this stage showed considerable stellar contamination (see panel 5, Figure 12).Following Ho20, in order to remove this contamination we use the out-transit CCFs.We first divided all the observed CCFs by the column-wise mean of the out-transit CCFs.Next, we found the (row-wise) median value of each CCF, excluding the velocity region that spanned any stellar and planet signal.We then subtracted this median value from each CCF, forcing the CCFs to vary around zero (Hoeĳmakers priv. comm.).
The removal of the stellar contamination revealed the Doppler shadow (Collier Cameron et al. 2010, see panel 6, Figure 12).During transit, an exoplanet moves across the face of its parents star.Assuming that the planet orbits in the same direction as the star rotates on its axis, for the first half of the transit the planet will be moving in front of the region of the star that is emitting blue-shifted light (with respect to the line-of sight as seem from Earth), meaning that overall the light received from the star will appear to be slightly red-shifted.For the second half of the planet's transit, the opposite is true: the planet eclipses a small area of the red-shifted region, causing the overall light received to appear blue shifted.This effect is known as the Rossiter-McLaughlin effect (RM, Rossiter 1924;McLaughlin 1924;Ohta et al. 2005).In addition, the centreto-limb variations (CLV, whereby the perceived brightness of a star varies with the limb angle) cause variations in the stellar line depth across the face of the star, which in turn can introduce variations to the depth of the stellar lines as a planet transits across the face of the star (e.g.Czesla et al. 2015).It is typical in HRCCS analyses to behave as though the stellar lines are uniform and do not vary with time.RM and CLV effects introduce changes in the stellar line depths and wavelength that vary with time whilst the planet is transiting.This means that, when we divide all the CCFs by the out-transit average (see above), the stellar CCF is fully removed, but is over-corrected in some regions and under-corrected in others.This over and under-correction effect in the region of the CCF matrix where the planet is transiting in front of its parent star creates the Doppler shadow.
The Doppler shadow can be removed after the cross-correlation has been performed by modelling the shape of the Doppler shadow directly and then subtracting it from each CCF (e.g.Hoeĳmakers et al. 2020).Alternatively, it can be removed prior to the crosscorrelation, by modelling the impact of the RM and CLV effects on the spectra and removing them directly (e.g.Bello-Arufe et al. 2022).We followed the method described in Ho20, where a Gaussian with two negative side lobes was fitted to the Doppler shadow, scaled appropriately per CCF, and subtracted.We fitted the Gaussian by taking the column-wise summation of the in-transit CCFs, and fitting to the summed Doppler shadow.
We then applied a high-pass filter to the CCFs.We referred to the tayph5 documentation to establish that the high-pass filter took the form of a broad Gaussian kernel.We followed Ho20 and set the FWHM of the kernel to be 70 km s −1 .We convolved each CCF with the Gaussian kernel, and then subtracted the result from the unconvolved CCF.The impact was to remove any residual shape of the continuum from the CCFs.We note that in Hoeĳmakers et al. 2018 andHoeĳmakers et al. 2019 a high-pass filter was used, but was applied to the data prior to cross-correlation.In Ho20, the high-pass filter was applied to the CCFs instead.
We weighted to the CCFs to compensate for the effects of the colour-correction (see Section 4.2).Ho20 follow this step, however, there is room for interpretation as to how this is achieved.The reason for this is that the colour correction is performed on individual orders, whereas the weighting step is performed after the orders have been combined during the cross-correlation.However, whilst the orders are colour corrected separately, the relative correction applied to each exposure is fairly consistent across the orders.Thus, we took the mean of the flux deviation values (see Section 4.2) that each exposure was divided by across all the orders.We then multiplied the corresponding CCF by the mean flux deviation.
Lastly, we shifted each CCF according to the radial velocity of WASP-121 b at the time its exposure was taken ( =   sin 2), and then took the mean of all the shifted CCFs to give a single, combined CCFs for all the orders, exposures and nights.We set the error on the combined CCFs by finding the 1 deviation of the CCF values, excluding the velocity region containing the planet signal (if present).To build the   − Δ maps, we shifted the CCFs according the a range of   values (see Figures 16 and 17).Note that in the Figures displaying the results of the cross-correlation (Figures 12,16 and 17), the stellar contamination and planet trail are consistently offset from the stellar and planet rest frames respectively by ≈ 38 km s −1 .This is because, following Ho20, the data-set is not shifted to account for the systemic velocity of the WASP-121 system.
Finally, we clarify that the model-injected ΔCCF results presented and discussed in Section 5 are the model-injected CCFs

Pipeline Validation and Comparison with Ho20
To validate our pipeline, we reproduce the results in Ho20 that used both the observed data and an injection of a 2000 K model spectrum without TiO opacity, cross-correlating with various neutral species templates.The results are presented in Figure 16.We compare the peak values of the observed ( Obs ) and injected CCFs ( Inj ) in Table 3, and note that for the six species detected, we produce shallower observed line depths than Ho20.However, given the difference between our pipelines, and the associated uncertainties from our work and Ho20, we consider we have recovered the Ho20 detections to reasonable accuracy.
Although the data reduction pipeline and Equation 23 are formulated to produce CCFs that are a measure of the average depth of absorption lines in the planetary spectrum, the calculation is dependent on the particulars of the chosen template.Two Fe i templates with a different number of lines or different line ratios will necessarily produce different CCFs.It is more accurate in this case not to think of the CCFs of the observations as absolute average line depths, but rather weighted average line depths where the weighting function is the cross-correlation template itself.To this end, it is of little value comparing the amplitudes of the planetary CCFs in the  16.The line amplitude (A) is equivalent to the fractional area of the star that is obscured.We have left out Ti i as it was not detected via cross-correlation.We include the results of Ho20 in the rightmost column for comparison.
Template A Obs (10 −4 ) A Inj (10 −4 ) A Obs /A Inj A Obs /A Inj (Ho20) lower panels of Figure 16, and instead their ratio,  Obs / Inj , should be considered and our pipeline produces  Obs / Inj in agreement with Ho20 for all detected species, as seen in the two rightmost columns of Table 3. Ho20 found that the absorption strengths of all atomic species they investigated, including Fe i, were under-predicted by their model and hypothesised that hydrodynamic effects may cause the atmosphere to extend beyond what is expected assuming hydrostatic equilibrium, thus increasing the transmission depths.While this is certainly possible for some of the species investigated, all of the models in this work, which were truncated at 10 −8 bar (below the hydrodynamic region), are largely transparent in the outermost layer over most of the HARPS passband.At this pressure, the models only show absorption from the Balmer series, Na i D, Ca ii H and K, the Mg i triplet around 5170 Å and a number of Fe ii lines.More Fe ii lines appear in the 20× spectra than the 1× spectra, predominantly at wavelengths shorter than ∼ 4500 Å, except for the I LTE spectra where the Fe ii lines lines don't appear at all.Of these species, only Na i and Mg i were among those detected in Ho20.We propose several alternatives that can modify the transit depths of the remaining species and possibly explain the weak injection signals: (i) An isothermal atmosphere inaccurately describes the temperature structure of WASP121-b, as it has already been demonstrated that the atmosphere exhibits a temperature inversion (Evans et al. 2017).Increasing the temperature of the atmosphere, even if it remains isothermal, increases the absorption strength of transmission features by increasing the scale height.
(ii) If the abundance ratios of the atmospheric constituent elements vary with altitude (e.g. via atmospheric transport, mass segregation, etc), absorbing species may be transported higher or lower in the atmosphere, increasing or decreasing their absorption strength.This could particularly explain why different species in Ho20 exhibit a large range in how divergent the injections are from the detections, rather than a single universal value.
(iii) The effects of NLTE can change the excitation and ionisation population ratios in the upper atmosphere, effectively changing line depths and line ratios.This can additionally alter a self-consistently calculated temperature structure by modulating the heating and cooling rates via dependencies on these populations.
It is likely that the source of the under-predicted line strengths in Ho20 is some combination of these, including their proposition of hydrodynamic effects.

Cross-Correlation of Fe i
Figure 17 summarises the CCFs of the 12 Fe i templates in this work, each showing weighted average line profile of varying depth.We compare the peak values of the observed and injected CCFs in Table 4.The detected planetary signal is shallower than the injected signal for all SC and A models, with the SC LTE models exhibiting the greatest overestimation of transit depth.Conversely, the I models all recover shallower injected signals than the observed depths.The I 20× NLTE model provides the closest match between the observed and injected line depths ( Obs / Inj = 1.10),only differing from the observed signal by a factor of ∼ 10%, while the I 20× LTE model, the model closest in design to the Ho20 best model, produces CCFs nearly identical to the Fe i panels in Figure 16.
Although the  20× NLTE model provides the closest match  to the observed signal following the method of Ho20, the amplitude of the observed signal is noticeably shallower than the  20× LTE model finds, and indeed all of the NLTE models produce observed signals that are shallower than their LTE counterparts.While these amplitudes should be taken as weighted average line depths, as mentioned above, we investigate how the strength (i.e.S/N) of the detections compare with one another.To this end, we follow many HRCCS works (see e.g.Birkby 2018, and references therein) and perform Pearson cross-correlations, from which we calculate the S/N of the CCF detection strength by comparing the CCF peak to the standard deviation of the CCF values outside the peak.We did this for all 12 Fe i templates in this work.The resultant S/N maps are shown in Figure 18, with the peak S/N values (  = 217 km s −1 ,   = 38 km s −1 ) for each presented in Table 5.The minimum value for all models in the S/N map remains close to S/N∼-4 as statistically expected (see e.g.Spring et al. 2022).The detections are all substantially above S/N= 4 in the positive direction, indicating that all models deliver a strong detection.The S/N difference between the LTE and NLTE models, for either  or  profiles, is small, ΔS/N< 0.5, indicating that by this metric the data is not able constrain the differences between these models.For the isothermal models however, the LTE models are statistically preferable over the NLTE equivalents by 1.5 ≲ ΔS/N≲ 2.
Visually the NLTE and LTE spectra for the isothermal profile are more distinct than their counterparts for the  and  profiles, as shown in Figure 15, with greater differences in the number of lines, particularly at longer wavelengths, so it is perhaps not surprising that the S/N is able to distinguish them better.This however remains in conflict with the method from Ho20, which indicates the best match occurs with the NLTE models.

Impact of the Cloudy Spectral Offset
To assess the impact of Cloudy's spectral offset (see Section 2.2.2.1) on these results, we cross-correlated the Fe i templates with a binary mask produced from the Fe i line list in Cloudy's database, using the standard Pearson cross-correlation.Cross-correlation was performed for wavelength ranges corresponding to each of the HARPS   5. spectral orders, recording the velocity of the CCF peaks as the individual order offsets.The average and standard deviation of the offsets were calculated, weighted by the number of lines in the binary mask present in each order.
We find an average offset for the CfE templates of 0.90 ± 0.18 km s −1 across all orders.In contrast, the average offset for the Ho20 template is 0.001 ± 0.008 km s −1 , consistent with no offset from the line positions in the Cloudy line list.While the CfE offset is one and a half times the velocity resolution per pixel of HARPS (0.6 km s −1 , see Footnote 2), it is approximately half of the effective velocity resolution per pixel in this case (1.7 km s −1 ), given WASP-121 b's rotation, and approximately equal to the velocity interval of the  P −  s maps (1.0 km s −1 ).Given that the peaks of the  P −  s distributions (  = 217 km s −1 ,   = 38 km s −1 ) are already in good agreement with that of Ho20 (  = 221.1 km s −1 ,   = 38.043km s −1 ), and that individual orders in the CfE templates universally exhibits a red-shifted offset, the spectral offset is expected to have minimal impact on the final results.

3000 K Isothermal Models
To investigate the possibility that NLTE effects are compensating for a 2000 K atmosphere being cooler than WASP-121 b, we follow the procedure in Section 2 to produce two additional models (with corresponding transmission spectra and Fe i templates).These new models, labeled with 3, are identical to the  20× LTE and NLTE models in all aspects with the exception that the isothermal temperature has been raised to 3000 K.They were cross-correlated with the data in the same fashions as above, following the methods of Section 4, with both the Ho20 cross-correlation and the S/N presented in Figure 19.The peak S/N values are also presented in Table 5.
Following the Ho20 method,  Obs for the 3 20× NLTE template is the same as  20× NLTE template within error, but  Obs for the 3 20× LTE template is half of that for the  20× LTE template.The  Inj of the 3 20× models are approximately equal, as they were for the  20× models, but are now approximately double the amplitude of the  20× model injected signals.The 3 20× LTE model does a reasonable job of matching the observed signal, with  Obs / Inj equal to that of the  1× NLTE model (the second best match of the original 12 models), but at more than twice the absolute amplitude.This hotter isothermal profile more closely agrees with the day-side temperature measurement of WASP-121 b from JWST phase curves (Mikal-Evans et al. 2023) which is adopted in later work by Hoeĳmakers et al. (2022).However, the  Obs / Inj metric is unable to distinguish it from the  1× NLTE model in the HARPS high resolution spectra, leaving a degeneracy.
This highlights the need for standardised metrics of compar- ison when assessing the impact of increasing model complexity in HRCCS.Cloudy for Exoplanets is a powerful modelling tool for exploring NLTE effects in exoplanet atmospheres, but its computational time is currently prohibitive to using it in Bayesian frameworks (e.g.Brogi & Line 2019;Gibson et al. 2020) developed for performing retrievals with HRCCS.A similar problem exists for 3D modelling of exoplanet atmospheres, but in the era of highquality, data-rich studies, HRCCS works must find a way to fully explore the information they contain.

CONCLUSION
This paper presents an analysis of an archival high-resolution optical transmission spectrum of WASP-121 b observed with the HARPS spectrograph.Previous work by Ho20 found that while a hydrostatic, isothermal atmospheric model computed under the assumption of LTE was able to detect the presence of Fe i via HRCCS, their model under-predicted the average line strength of Fe i at optical wavelengths by a factor of ∼ 4.7.They propose that hydrodynamic effects in the upper reaches of the atmosphere may serve to extend the atmosphere beyond the range of their model, but leave such an investigation for future work.
In this work, we investigate alternative explanations for the under-predicted line strengths reported in Ho20, including the complexity of the model temperature profile, the atmospheric metallicity, and NLTE effects.We generate 12 models and associated transmission spectra using the astrophysical simulation code Cloudy (Ferland et al. 2017), varying the temperature profile between and isothermal temperature of 2000 K, an analytic function (Guillot 2010), and Cloudy's self consistently computed temperatures, repeated at atmospheric metallicites of 1× and 20× solar metallicity and in both LTE and NLTE.The Fe i transmission spectra were cross-correlated with the observational data, and the resultant CCFs were compared with injections of the full, all species models.We choose to focus exclusively on Fe i as this is one of the species most highly affected by NLTE calculations.The findings of this work are summarised as follows: • NLTE calculations provide additional heating in the upper atmosphere by ionising neutral Fe more heavily than in LTE.This NLTE heating had previously been seen in the atmosphere of KELT9-b (Fossati et al. 2021), but was not confirmed for any less extreme exoplanets.While the absolute increase in heating over LTE is not as extreme, the relative increase in heating at the top of WASP-121 b's atmosphere agrees well with what was found for as for KELT9-b.
• Of the three parameters investigated, temperature profile complexity has the largest impact on the resultant transmission spectrum, primarily through changing the atmospheric scale height.NLTE radiative transfer effects are secondary to this, while atmospheric metallicity has only minor impact on the transmission spectrum in comparison to the others.
• We were able to successfully recreate the analysis methodology of Ho20, and reproduce their detections of Mg i, Ca i, V i, Cr i, Fe i, and Ni i, with the ratios of the observed and injected line amplitudes agreeing with the findings of Ho20 within error.
• All of our  and  models over-predict the line strength of Fe i by factors of ∼1.4 to 4.8, with the  20× NLTE model providing the largest over-prediction.Both of the  LTE models (the models most similar to those of Ho20) under-predict the observed line strengths, by factors of 5.5 and 5.4 for the 1× and 20× models, respectively.The  NLTE models provide reasonably good predictions of line strengths, with the  20× NLTE model providing a match between the observed and injected signals within error, albeit with the weakest overall observed amplitude.
• Contrary to the results of the Ho20 methodology, more traditional HRCCS S/N calculations show that while all of 12 models provide strong detections of Fe i, the  20× NLTE provides the weakest detection at 8.0 while the  20× LTE model provides the strongest detection at 10.4.Generally, all of the NLTE models provide either equivalent or weaker S/N detections than their LTE counterparts, similar to the observed signal amplitudes of the Ho20 methodology, but otherwise do not show any correlation between S/N and amplitude.
• Two additional models were produced, identical to the  20× models with the exception that the isothermal temperature was increased to 3000 K.The 3 20× NLTE model produces a similar observed signal amplitude to the  20× NLTE, but the injected amplitude is more than twice as deep, over-predicting the line depths by a factor of ∼2.2.The 3 20× LTE model produces an observed signal amplitude approximately half as deep as that of the  20× LTE model, but is a reasonable predictor of line depth, on par with the  1× NLTE model (the second best matching of the original 12 models), although at more than twice the observed amplitude.Again contrary to these results, the S/N detections of the two 3 models are equivalent to or weaker than their cooler counterparts (9.1 to 10.4 for LTE and 7.5 to 8.0 for NLTE).
• The two HRCCS analysis methods provide conflicting results in terms of which model best represents the data, and we are unable to determine if there are NLTE effects present in the atmosphere of WASP-121 b.The best matching model in the Ho20 methodology ( 20× NLTE) produces the lowest S/N, while the model that produces the highest S/N ( 20× LTE) is the worst matching in the Ho20 methodology.Additionally, increasing the isothermal temperature of these models results in the 3 20× LTE providing a reasonable match in the Ho20 methodology, at more than twice the amplitude of the  20× NLTE model, but again produces a lower S/N than the  20× LTE model.The results are therefore inconclusive on the matter of NLTE effects in the atmosphere of WASP-121 b, but highlight the need for standardized metrics of comparison for increasingly complex models in HRCCS, particularly as data quality continues to increase.The challenge is how to include computationally-intensive, but powerful modelling codes such as Cloudy for Exoplanets, or 3D magneto-hydrodynamic models, into HRCCS retrievals.

Figure 1 .
Figure 1.Top -Schematic diagram of transmission chords along the line-ofsight for transmission spectrum calculation.The shaded blue rings denote different layers of the planetary atmosphere and the solid blue circle is the bulk disc of the planet.Bottom -Schematic diagram of the line-of-sight view of a transiting planet in our framework.The shaded red circles are the stellar rings of discrete limb darkening.Ring and layer sizes and intensities are not shown to scale.Figure adapted from Young et al. (2020a).

Figure 4 .
Figure 4. Top -Comparison of the transmission spectra from this work with the deepest (SC NLTE 20×) and shallowest (I LTE 1×) transmission features to that ofHuang et al. (2023).Spectra from this work have been rotationally broadened as discussed in Section 4.2 and convolved to a spectral resolution of  = 30000 to match theHuang et al. (2023) spectrum.Bottom -- profiles corresponding to the above spectra.We note that theHuang et al. (2023) profile experiences a sharp inversion above the bounds of this panel, increasing to ∼ 12500 K between 10 −9 to 10 −10 bar, before cooling to ∼ 1500 K around 2 × 10 −12 bar.

Figure 5 .Figure 6 .
Figure 5. Left -Difference in temperature relative to the I - profile.Colours are the same as the top panel of Figure 3. Middle -Relative temperature differences across thermodynamic treatments between the SC 1× (solid) models, and the SC 20× (dot-dash) models.Right -Relative temperature differences across metallicities between SC NLTE (blue) and SC LTE (green) models.

Figure 10 .
Figure10.Δ ISO (top), Δ NLTE (middle), and Δ MH (bottom) for the SC (blue), A (green), and I (red) model transmission spectra in this work.All SC and A spectra show increased absorption relative to the corresponding I spectra, driven by the warmer temperatures increasing the scale height.The largest NLTE differences are seen for the SC spectra, again driven by the large differences in temperature, whereas the differences for the A and I spectra are primarily caused by the differences in level populations.

Figure 11 .
Figure 11.Contribution functions for the SC 1× NLTE (top) and LTE (bottom) Fe i template transmission spectra.Darker regions indicate greater contribution to the template at those wavelengths.

Figure 12 .
Figure12.Data at different stages of post-processing (top four panels, see Section 4.2) and after cross-correlation (bottom four panels, see Section 4.3).Left column: Night 2, order 30 (top four panels) and CCF matrix truncated to show a velocity range of 380 km s −1 (bottom four panels).Right column: Night 2, exposure 18's spectrum (top four panels) and CCF (bottom four panels).Row 1: raw extracted e2ds data processed by the HARPS DRS and subsequently sorted into separate orders.Row 2: order after telluric removal.Row 3: order after alignment to the stellar rest frame and masking.Row 4: order after colour correction.Row 5: in-transit CCF matrix after cross-correlating the data with the Ho20 Fe i cross-correlation template, averaging the CCF matrices for each order, reordering the CCFs from all three observing nights and truncating the out-of-transit CCFs.Row 6: in-transit CCF matrix after column-wise division by the out-transit CCF to remove the stellar signal, and the row-wise subtraction from each CCF of its (out-planet) median value.The path of the planet trail is marked by a thin grey line.Row 7: in-transit CCF matrix after Doppler shadow removal, application of the high-pass filter and weighting to account for the colour-correction.Row 8: as for Row 7, but data aligned to the planet rest frame.The planet trail in aligned vertically.The stellar contamination and aligned planet trail are offset from the stellar and planet rest frames respectively by ≈ 38 km s −1 because, following Ho20, the data-set is not shifted to account for the velocity of the WASP-121 system.The CCF matrices in rows 6, 7 and 8 are on the same colour scale to best visualise the effects of the Doppler shadow removal and application of the high-pass filter.

Figure 13 .
Figure 13.Ho20 all-species 2000 K model spectrum without TiO opacity in vacuum wavelengths, prior to (grey line) and after (green line) with the Brogi et al. (2016) rotational broadening kernel.Note the double-peaked shape in the broadened lines, resulting in the loss of the line cores.

Figure 14 .
Figure 14.Stages of conversion of the Ho20 Fe i model spectrum to the cross-correlation template format.A: The spectrum in air wavelengths.B: The fitted continuum.C: After continuum subtraction and the smallest values clipped to zero.D: After broadening with a Gaussian kernel.E: After forcing the cross-correlation template to sum to one.

Figure 15 .
Figure15.Fe i spectral templates before being processed for cross-correlation as described in Section 4.3.Colours correspond to the type of - profile used to produce the template, either SC (blue), A (green), or I (red).

Figure 16 .
Figure 16.Upper Row -Co-added cross-correlation functions (CCFs) in  P −  sys for the Ho20 various atomic templates.Dashed lines indicate the expected velocities of the planetary signal.Lower Row -Co-added CCFs in the planet rest-frame.Solid grey and red curves are obtained by cross-correlating with the observational data before and after injection of the Ho20 2000 K model spectrum without TiO opacity, respectively.Note that the red curves show the model-injected CCF minus the observational CCF, and thus show the transit depth of the model only.Also note that the scaling of the Mg i transit depth is multiplied by a factor of three, because the Mg i absorption features are significantly deeper than the other atoms.

Figure 17 .
Figure 17.Similar to Figure 16, but for the SC (top rows), A (middle rows), and I (bottom rows) spectra and Fe i templates produced in this work.

Figure 18 .
Figure 18.S/N maps of the 12 Fe i templates used in this work.All maps are plotted on the same colour scale, minimum and maximum S/N values for each are reported in Table5.

Figure 19 .
Figure 19.Top two rows -Similar to Figure 16, but for the 3 Fe i templates.Bottom row -Similar to Figure 18, but for the 3 Fe i templates.

Table 2 .
(Delrez et al. 2016) and System Parameters used in this work.(Delrezet al. 2016) Left -Differences in radii relative to the I model of the corresponding metallicity.Middle -Differences in radii between corresponding LTE and NLTE models.Right -Differences in radii across choice of metallicity.Choice of - profile plays the largest role in changing the atmospheric radius; the smallest Δ ISO difference is nearly double the largest of either Δ NLTE or Δ MH .

Table 3 .
Peak amplitudes of the observed and injected average line depth CCFs in Figure

Table 4 .
Peak amplitudes of the observed and injected average line depths in Figure17.The line amplitude (A) is equivalent to the fractional area of the star that is obscured.The I3 labels refer to 3000 K isothermal models outlined in Section 5.3.

Table 5 .
Minimum and maximum S/N values for the Ho20 Fe i template and the Fe i templates in this work.The I3 labels refer to 3000 K isothermal models outlined in Section 5.3.