Assessing telluric correction methods for Na detections with high-resolution exoplanet transmission spectroscopy

Using high-resolution ground-based transmission spectroscopy to probe exoplanetary atmospheres is difficult due to the inherent telluric contamination from absorption in Earth's atmosphere. A variety of methods have previously been used to remove telluric features in the optical regime and calculate the planetary transmission spectrum. In this paper we present and compare two such methods, specifically focusing on Na detections using high-resolution optical transmission spectra: (a) calculating the telluric absorption empirically based on the airmass, and (b) using a model of the Earth's transmission spectrum. We test these methods on the transmission spectrum of the hot Jupiter HD 189733 b using archival data obtained with the HARPS spectrograph during three transits. Using models for Centre-to-Limb Variation and the Rossiter-McLaughlin effect, spurious signals which are imprinted within the transmission spectrum are reduced. We find that correcting tellurics with an atmospheric model of the Earth is more robust and produces consistent results when applied to data from different nights with changing atmospheric conditions. We confirm the detection of sodium in the atmosphere of HD 189733 b, with doublet line contrasts of -0.64 $\pm$ 0.07 % (D2) and -0.53 $\pm$ 0.07 % (D1). The average line contrast corresponds to an effective photosphere in the Na line located around 1.13 $R_p$. We also confirm an overall blueshift of the line centroids corresponding to net atmospheric eastward winds with a speed of 1.8 $\pm$ 1.2 km/s. Our study highlights the importance of accurate telluric removal for consistent and reliable characterisation of exoplanetary atmospheres using high-resolution transmission spectroscopy.


INTRODUCTION
The era of characterising exoplanet atmospheres has accelerated greatly over the last two decades. With technological advancements in instrumentation, we are now able to monitor exoplanetary systems and characterise in detail both their bulk properties as well as their atmospheres. Numerous planets have been observed with both radial velocity and transit methods, allowing for extensive analyses and characterisation (e.g. Fischer et al. 2014;Sing et al. 2016;Madhusudhan 2019;Pinhas et al. 2019;Welbanks et al. 2019). Observations of transiting systems can give an insight into the chemical properties of the planetary atmosphere. During one orbit there are two windows which can be used for such studies: the primary transit, when the planet passes in front of its host star as viewed from Earth, and the secondary eclipse when the planet is occulted by the star. In both instances, the planetary atmospheric spectrum can be ★ E-mail: adam.langeveld@ast.cam.ac.uk † E-mail: nmadhu@ast.cam.ac.uk deduced by measuring the change in observed flux outside of and during transit (Seager & Sasselov 2000), and absorption or emission due to chemical species may be seen. Such inferences can also be made if phase-resolved spectra are obtained at other parts of the orbit (Stevenson et al. 2014).
The first observation of an exoplanet atmosphere was made by Charbonneau et al. (2002). Using the STIS spectrograph on the Hubble Space Telescope (HST) with a resolution of = 5540, absorption at 589 nm due to atmospheric sodium (Na ) was detected in the transmission spectrum of HD 209458 b. The first groundbased detection was made using the High Resolution Spectrograph (HRS) on the 9.2 m Hobby-Eberly telescope (Redfield et al. 2008). With a resolution of ∼ 60, 000, the sodium doublet was fully resolved and absorption in the atmosphere of HD 189733 b was detected. Shortly after, Snellen et al. (2008) used the High Dispersion Spectrograph (HDS) ( ∼ 45, 000) on the 8 m Subaru telescope to confirm the presence of sodium in HD 209458 b. With a combination of low to medium-resolution space-based data and high-resolution ground-based data, there have been discoveries of almost 20 different chemical species in exoplanet atmospheres to date (Madhusudhan 2019). Among the most common are detections of the alkali species Na and K in hot Jupiters (e.g. Charbonneau et al. 2002;Redfield et al. 2008;Sing et al. 2016;Sedaghati et al. 2016;Nikolov et al. 2016;Wyttenbach et al. 2017;Chen et al. 2018;Casasayas-Barris et al. 2018;Jensen et al. 2018;Deibert et al. 2019;Seidel et al. 2019;Hoeĳmakers et al. 2019).
Many challenges are presented when using ground-based facilities to acquire high-resolution spectra of transiting exoplanets. For methods which derive the planetary transmission spectrum by comparing the in-transit and out-of-transit stellar spectra, observing times must be chosen to ensure that an adequate out-of-transit baseline can be obtained. An insufficient out-of-transit baseline can adversely affect the quality of the transmission spectrum . A significant problem arises from the contamination of spectra due to absorption in the Earth's atmosphere. These telluric lines can merge with stellar lines, or appear at a similar or higher strength as features in the planetary transmission spectrum (Redfield et al. 2008;Snellen et al. 2008). In the optical domain, the predominant sources of absorption are telluric water and oxygen. Therefore, data must be obtained on nights with low water vapour content and when the airmass is low to minimise the impact of the contamination.
Alongside the telluric removal process, it is essential to make further corrections to account for stellar reflex motion, systemic velocity, and the planetary radial velocity. Additionally, a planet which passes in front of a rotating star blocks different amounts of blueshifted and redshifted light. This causes the shape of lines in the integrated stellar spectrum to change as the transit progresses, and is known as the Rossiter-McLaughlin (RM) effect (Rossiter 1924;McLaughlin 1924;Queloz et al. 2000;Triaud 2018). The spectral line shape is also affected by Centre-to-Limb Variation (CLV) which describes the brightness of the stellar disk as a function of limb angle (Czesla et al. 2015;Yan et al. 2017). Together with telluric contamination, these effects may imprint spurious signals within the planetary transmission spectrum and must be corrected for to prevent false identifications of chemical species. A recent study has shown that absorption features in the transmission spectrum of HD 209458 b can be explained purely by the induced CLV and RM signals (Casasayas-Barris et al. 2020), highlighting the importance of correcting for these effects. In the near future, newly developed high-resolution spectrographs will be used to target rocky planets and super-Earths. It is therefore crucial to be able to accurately tackle the many problems associated with ground-based observations in order to distinguish spectral features of these planets. Further characterisation will be possible when these results are used in combination with upcoming James Webb Space Telescope (JWST) data.
Several methods have been used in previous work to remove telluric features from high-resolution spectra. For the first detection of the atmosphere of HD 189733 b from a ground-based facility (Redfield et al. 2008), the spectrum of a telluric standard star was recorded immediately after the observations. A spectrum of absorption in the Earth's atmosphere was recovered by using a stellar template to remove the weak stellar lines of the rapidly rotating hot B-star. This was also employed in subsequent studies (Jensen et al. 2011(Jensen et al. , 2012. Other datasets have been corrected by assuming a linear relationship between airmass and telluric line strength, and deriving a telluric spectrum to remove variation throughout the night (Vidal-Madjar et al. 2010;Astudillo-Defru & Rojo 2013;Wyttenbach et al. 2015). More recently, studies have used models of Earth's atmosphere to correct for the telluric spectrum, e.g. TER-RASPEC (Lockwood et al. 2014), molecfit (Smette et al. 2015;Kausch et al. 2015;Allart et al. 2017), and other custom-built codes (e.g. Yan et al. 2015;Casasayas-Barris et al. 2017). Contamination can also be corrected by removing the time-dependent variation of flux at each wavelength element using linear regression (e.g. Snellen et al. 2010;Brogi et al. 2012), or by removing systematic trends with singular value decompositions (SVDs) or principal component analysis (PCA) (e.g. de Kok et al. 2013;Birkby et al. 2013;Piskorz et al. 2016). Comparison has also shown that some methods are better at correcting H 2 O than O 2 (Ulmer-Moll et al. 2019).
In this study we investigate how different telluric correction methods may affect the measurements of chemical species in exoplanet atmospheres at optical wavelengths. We see inconsistencies in measured results depending on the method employed and the nightly weather variation, so it is important to quantify which techniques are best for analysing high resolution exoplanet spectra. We choose to compare two popular methods: one which derives the telluric transmission solely from the data, and one which uses a model of molecular absorption in Earth's atmosphere. Both of these methods account for the variation of telluric line strength over the multiplehour observing window. This gives an advantage over modelling a telluric standard star, where the resultant spectrum is insensitive to atmospheric variation during the transit and requires extra time to observe. We focus specifically on the quality of telluric corrections in the region around the sodium D-lines, however the analysis is applicable to the full range of the spectrograph. Other methods discussed above may be more suited for Doppler-resolved molecular detections (particularly in the near-infrared) but were beyond the scope of this work.
In section 2 we give an overview of the high-resolution HARPS spectra used for our analysis. We discuss the data reduction and two methods for removing telluric contamination in section 3, and the calculation of the transmission spectrum which is common for both methods in section 4. In section 5 we evaluate the differences in the telluric reduction processes by measuring the absorption parameters of the Na doublet using Gaussian fits and a binned passband analysis. We also measure the net atmospheric wind speed from the blueshift of the line profiles and discuss further atmospheric properties. The results are briefly summarised in section 6 together with a conclusion about the most advantageous method.

OBSERVATIONS
In this paper we use observations of high-resolution transmission spectra of the hot Jupiter HD 189733 b as a case study. Data were obtained from programs 072.C-0488(E), 079.C-0828(A) (PI: Mayor) and 079.C-0127(A) (PI: Lecavelier des Etangs) using the  Wyttenbach et al. (2015). The number of in and out-of-transit exposures are calculated using the mid-exposure times and the ephemeris in Table A1. High Accuracy Radial velocity Planet Searcher (HARPS) echelle spectrograph on the ESO 3.6 m telescope in La Silla, Chile, and made available through the ESO archive. With a resolving power of = 115, 000, HARPS records 72 orders of the echelle spectrum over a range of 380-690 nm. The detector contains a mosaic of two 4k x 4k pixel CCDs; one spectral order from 530-533 nm is lost due to a gap between the CCDs. Two fibres are used when observingfibre A for the target and fibre B for recording simultaneous reference spectra for calibrations. Each fibre has a diameter of 70 m which gives a 1 arcsec aperture on the sky (Mayor et al. 2003). All observations are reduced by the HARPS Data Reduction Software (DRS) v3.5. The pipeline performs calibrations using the dark, bias and flat-field frames taken at the beginning of the night, corrects for the blaze, and uses Thorium-Argon (Th-Ar) reference exposures for wavelength calibration. Finally, each spectrum is remapped onto a uniform grid at a resolution of 0.01 Å in the solar system barycentric rest-frame. High-resolution spectra of HD 189733 were obtained with HARPS during four primary transits and are available through the ESO archive. The observing log in Table 1 shows the dates and program IDs of the observations. The number of in-transit spectra were calculated using the mid-exposure time and ephemeris of Agol et al. (2010). A combination of these datasets was first used to measure the Rossiter-McLaughlin effect (Triaud et al. 2009). They have since been analysed by several other authors to study other atmospheric properties, leading to important results such as: a detection of sodium in the planetary atmosphere , spatially resolved eastward winds (Louden & Wheatley 2015), a temperature gradient of ∼ 0.4 K km −1 (Heng et al. 2015), a measurement of the Rayleigh scattering slope (Di Gloria et al. 2015), and an attempt to constrain the presence of atmospheric water vapour (Allart et al. 2017). As discussed by many of these authors, on 29 July 2006 there are only 12 observations in total, and there are none during the second half of the transit because of bad weather conditions. We were unable to recover a clear planetary transmission spectrum due to the lack of data before and during transit, and therefore chose to ignore this night.
Parameters of the HD 189733 system which were used throughout this study are shown in Table A1. The known stellar radial velocity semi-amplitude was used to model the stellar and planetary radial velocity curves. Spectra with mid-exposure times between the first and fourth contacts of the eclipse are defined as in-transit. As seen in Figure 1, the observations on nights 1 and 3 fully cover the period shortly before, during, and after the transit. The orange and blue lines indicate out of and in-transit observations respectively. The complete transit duration is shown by the shaded region.

DATA REDUCTION
Preliminary corrections need to be made to the spectra before telluric contamination can be considered. We extract the data from the one-dimensional HARPS (s1d) files. The full spectral wavelength range is 3781-6912 Å, however we limit this to 4000-6800 Å to reduce systematics at the edges of the full spectrum where strong telluric contamination or low throughput is evident. A small number of pixels in each spectrum contain substantially higher flux values than the median, which are unlikely to be from the observed astrophysical source. To correct for these, a mask is created for each spectrum by applying a median filter with a width of 9 pixels and subtracting this from the original spectrum. Any pixels in the mask which are greater than 10 standard deviations from the median are flagged, and the corresponding fluxes in the original spectra are corrected to the median of the 10 surrounding pixels. Each spectrum is then normalised by dividing by its median. The errors in the flux values are assumed to be dominated by photon noise and we propagate the Poisson uncertainties throughout the analysis.
All ground-based observations in this wavelength range are polluted with telluric lines predominantly due to H 2 O and O 2 . These lines vary in strength with observing conditions such as airmass and precipitable water vapour. Incorrect removal of tellurics will create false signals in the resultant planetary transmission spectrum. We now consider the removal of telluric contamination using two methods.

Correcting tellurics with airmass
We first employed a method similar to that described by Vidal wavelength step in the spectrum, a linear fit between the natural logarithm of the measured fluxes and their corresponding airmass was made, taking the form where ( ) is the flux at wavelength , is the zenithal optical depth, and is the airmass. The telluric lines vary with airmass but the exoplanet absorption lines do not. Therefore a telluric reference spectrum, ( ), can be built using the value of the gradient in equation 1 ) ( 2) It is important to only use out-of-transit spectra to build the telluric reference spectrum; using in-transit spectra would also correct for absorption due to the planet's atmosphere. Therefore, it is ideal to obtain a number of spectra shortly before and shortly after the transit to ensure a large enough out-of-transit baseline. However, observations on night 2 began when the planet was already in transit and there are not enough out-of-transit observations to produce a good quality telluric reference spectrum. We were unable to recover a clear transmission spectrum using the limited out-of-transit data, so chose to include the in-transit sample as well. We note that some of the planetary signals could be over-corrected and muted as a result.
Using equation 2, the telluric contamination in the observed spectra can be adjusted so that they appear to have been observed at a constant reference airmass, ref . The adjusted spectra, ref , were calculated for all observations using where obs ( ) is the observed spectrum and is the airmass at the time of the observation . The reference airmass was taken to be the mean airmass of the in-transit sample only. An example of the adjustment to telluric lines is shown in Figure 2. The top panel shows the calculated telluric reference spectrum T( ), and the bottom panel shows the result of the correction with the original data in grey and the corrected data in blue. It is noted that the telluric lines are only adjusted to a level of constant airmass but are not removed completely. Following this correction, the transmission spectrum was computed as described in section 4. Residuals of telluric lines may occasionally be seen in the transmission spectrum despite performing the correction derived from airmass. These lines may be confused for planetary atmospheric absorption, but are actually caused by changes in Earth's atmospheric water content in the air above the telescope throughout the night. Similarly to Wyttenbach et al. (2015), we make a second correction to account for this. For the data on night 3, a linear fit between the transmission spectrum and the T( ) was made. The transmission spectrum was divided by this fit which effectively reduces parts of the spectrum where telluric residuals are correlated with T( ). We split up the full spectrum into smaller wavelength ranges and performed the correction on each section to prevent variations in unrelated parts of the spectrum from affecting each other. Only one iteration of the correction was needed to reduce the residuals to the continuum level. This is further discussed in section 5.1.

Correcting tellurics with molecfit
An ideal method would reduce telluric contamination down to the noise level without changing any of the existing data outside of these regions. The empirical method discussed in the previous section struggles to do this, and gets worse with low signal-to-noise.
Here we consider telluric removal by modelling out Earth's atmospheric contribution. We use molecfit v1.2.0 -an ESO tool specifically designed to correct for telluric contamination in groundbased spectra (Smette et al. 2015;Kausch et al. 2015). Molecfit fits a line-by-line radiative transfer model (LBLRTM) of telluric absorption to the observed spectra, producing a unique model of atmospheric absorption for each observation. This was first performed on HARPS spectra by Allart et al. (2017), and has since been used in other HARPS studies (Cauley et al. 2017;Seidel et al. 2019;Cabot et al. 2020), as well as with other high-resolution spectrographs (Cauley et al. 2017(Cauley et al. , 2019Allart et al. 2018Allart et al. , 2019Nortmann et al. 2018;Salz et al. 2018;Casasayas-Barris et al. 2019, 2020Hoeĳmakers et al. 2019;Alonso-Floriano et al. 2019;Kirk et al. 2020;Chen et al. 2020).
HARPS spectra are given in the Solar System barycentric restframe. We use the Barycentric Earth Radial Velocity (BERV) values to shift each spectrum into the rest-frame of the telescope to ensure correct telluric modelling with molecfit. We then inspect areas with heavy H 2 O and O 2 absorption around 5950, 6300 and 6475 Å and select 15 small regions (no larger than 2 Å) which contain only telluric lines and flat continuum (Figure 3). It is important that no stellar lines are included as they will affect the fit to the atmospheric model. The chosen regions remain constant for the duration of one night of observing, however they must be re-selected for different nights since the location of telluric lines changes with respect to the stellar lines.
We provide molecfit with parameters similar to those discussed by Allart et al. (2017), together with the date, location, and atmospheric conditions (e.g. humidity, pressure, temperature) stored within the HARPS output. The atmospheric model is fitted to the selected regions. With reference to the output parameters, we then run the calctrans tool which fits an atmospheric model to the entire spectrum, resulting in a unique telluric profile for each spectrum with the same resolution. Finally, the spectra are divided by their corresponding model to reduce all telluric lines down to the continuum noise level, as shown in Figure 4.

TRANSMISSION SPECTRA
Here we discuss our calculation of the transmission spectra from observations. This work addresses several common effects that can imprint false signatures onto the planetary transmission spectrum at high-resolution. These effects include telluric contamination, velocity contributions from stellar, systemic and planetary sources, as well as spurious signals due to the Rossiter-McLaughlin (RM) effect and Centre-to-Limb Variation (CLV).

Velocity corrections
Following removal of telluric contamination (sections 3.1 and 3.2), we extract the transmission spectrum while making corrections for the stellar reflex motion, systemic velocity, and planetary radial velocity using a similar technique to Wyttenbach et al. (2015). Individual spectra are identified as ( , in ) for in-transit exposures and ( , out ) for out-of-transit exposures by modelling the orbit using the parameters listed in Table A1. As discussed in section 2, it is important that the observations cover the period shortly before, during, and shortly after the transit to create a good out-of-transit baseline.
The HARPS-measured stellar radial velocities at the time of each exposure are extracted from the Cross Correlation Function (CCF) files. An example of these measurements for observations on night 3 is shown in Figure 5. The dashed line is a fit for both the where * is the stellar radial velocity semi-amplitude, is the phase and sys is the systemic velocity. For the in-transit spectra, there is a deviation from this fit due to the Rossiter-McLaughlin effect (Triaud et al. 2009;Di Gloria et al. 2015). Each spectrum is Doppler shifted to correct for its corresponding measured stellar radial velocity and linearly interpolated back to the uniform 0.01 Å grid. This corrects for the apparent radial velocity change due to the RM effect, however the overall shape of the transmission line profiles will still be affected and could lead to spurious signals -additional corrections are discussed in section 4.2. We simultaneously calculate the planetary radial velocity for all observed phases using where = − * ( * / ). The maximum change in measured stellar radial velocity throughout one night of observing is ∼ 90 ms −1 , corresponding to a ∼ 0.0018 Å wavelength shift. The planetary radial velocity is much larger in comparison, varying between ±16 kms −1 during the transit. As a result, signals in the planetary transmission will be shifted up to ∼ 0.32 Å (32 resolution elements) from the rest wavelength. It is therefore important to correct for this in order to recover the full transmission spectrum.
We propagate the Poisson uncertainties of the raw data and use the inverse variance as weights when combining the spectra. This reduces the contribution of spectra with low signal-to-noise which may have been affected by poor weather conditions or other timedependent systematics. We co-add all out-of-transit exposures to create a master out-of-transit spectrum, denoted out ( ). In-transit spectra contain signals from both the star and planet, whereas the out-of-transit spectra contain just the stellar signal. Classically, the transmission spectrum can be calculated by dividing a master intransit spectrum, in ( ), by out ( ), however this does not account for the planetary radial velocity shift. We therefore compute individual transmission spectra as Each spectrum is then continuum normalised using a third-order polynomial to remove effects from instrumental systematics or weather variations. A Doppler shift using is performed to move each spectrum into the planet's rest-frame, and the final combined transmission spectrum is thus computed: where , is the Doppler corrected and time averaged in-transit spectrum. We remove any remaining broadband continuum variations with a median filter of width 1501 pixels (15.01 Å).

Correcting CLV and RM effects
CLV and RM effects can change the shape of lines in the stellar spectrum during the transit. This induces spurious signals into the transmission spectrum which must be corrected for (Queloz et al. 2000;Triaud 2018). We model the extent of these effects in the resultant transmission spectra in a similar manner as Casasayas-Barris et al. (2019). We generate synthetic spectra of HD 189733 using the software Spectroscopy Made Easy (SME) (Valenti & Piskunov 1996). The synthesis routine takes as input: 1) physical stellar parameters, for which we used those listed in Table A1 (we do not provide a stellar rotation at this point); 2) a list of spectral line positions and strengths, for which we use the VALD3 database (Ryabchikova et al. 2015); and 3) a list of -angles (Mandel & Agol 2002); we choose 21 angles spanning from the limb to the centre of the stellar disk. For each night of observation, we simulate the transit of the planet on an 80 × 80 pixel grid centred on the host star. Each pixel overlapping the stellar disk is assigned a synthetic spectrum, interpolated at the appropriate -angle from the set of calculated SME spectra. The spectrum is also Doppler shifted according to the local radial velocity of the star, per parameters in Table A1. The integrated flux over all pixels provides our simulated out-of-transit spectrum. At each phase of the transit, corresponding to exposure midtimes, we geometrically determine the projected position of the planet on the stellar disk (Cegla et al. 2016), and integrate the stellar flux over all pixels not occulted by the planet. The result is our simulated in-transit spectra. The jointly-modelled CLV and RM effect on the observed transmission spectra is obtained by dividing each simulated in-transit spectrum by the simulated out-of-transit spectrum. We do not account for potential non-LTE effects in the cores of the Na doublet, nor variable planetary radii accounting for absorption by the atmosphere. As shown in the middle panel of Figure 6, the modelled transmission spectrum contains features which arise strictly from CLV and RM effects. However, these features have amplitudes 0.001 relative flux. This is small in comparison to the planetary absorption signal (discussed in the following section), and approximately at the noise-level of the data. Indeed, following the Na doublet detection by Wyttenbach et al. (2015), further studies which account for the RM correction (Louden & Wheatley 2015;Casasayas-Barris et al. 2017) detect the lines at similar strengths. The consistency may be explained by the fact that HD 189733 is a slow rotator, and the occulted, local radial velocities differ significantly from the planet's Doppler motion traced by the atmospheric absorption. Still, these studies find a several km s −1 smaller net blueshift in the Na doublet which underscores the potential impact of CLV and RM residuals, as well as the RM velocimetric effect. We correct for small changes in the shape of the line profiles by dividing by the model, and produce a final, observed transmission spectrum by co-adding the overall nightly CLV and RM corrected spectra.

RESULTS AND DISCUSSION
In this section we show a comparison of the calculated HD 189733 b transmission spectra using the two different telluric corrections. We confirm and discuss the detection of Na in the planetary atmosphere, assess the shape of spectral features and the amount of absorption, and give context for further inferred atmospheric properties. Since we are focusing on the Na doublet, we restrict the wavelength range of the transmission spectra to 5870-5916 Å.

Telluric removal methods
For a clearer visualisation of the spectral features, the transmission spectra for all nights were binned to a 0.2 Å resolution (20 points per bin). We define absorption features as parts of the spectrum which are negative in value. Figure 7 shows the transmission spectrum for night 3 using the airmass correction method. On this night, a secondary correction was performed to reduce the effects of water column variation. The middle panel shows the transmission spectrum without the extra correction -by comparing this with the empirical telluric reference spectrum (top panel), it is clear that there are still residual telluric lines present around 5886.0, 5887.5, 5891.5, 5900.0 and 5901.5 Å. Since these are of comparable strengths to the signals we expect in the planetary transmission spectrum, the validity of any detections is significantly reduced. The bottom panel shows the result after applying the second correction, and the residual tellurics have been reduced without affecting other parts of the spectrum. However, the sodium lines could be slightly over-corrected due to increased noise in the telluric reference spectrum in these regions. This correction was not necessary for nights 1 and 2.
To assess the quality of the telluric reduction methods, we fit Gaussian profiles to the lines in the sodium doublet which provides a good approximation of the line cores (Gandhi & Madhusudhan 2017). We used the LevMarLSQFitter module from astropy,  which performs a Least-Squares fit using a Lavenberg-Marquardt algorithm and accounts for errors in the flux which have been propagated from the photon noise on the raw spectra. We define three properties of the Gaussian fit: the amplitude or line contrast, D; centroid 0 ; and the full width at half maximum, FWHM. Figure 8 shows the night 3 transmission spectrum around the Na doublet for the airmass (left) and molecfit (right) telluric removal methods. By comparison, we see that corrections using airmass gives a transmission spectrum with more variation, particularly in regions where residual telluric lines were removed (see Figure 7). The Gaussian fits to the full resolution data around the Na lines have reduced chi-square values of 2 = 0.54 for the airmass correction method and 2 = 0.50 for the molecfit method.
For each method, we co-added the nightly spectra to ensure that the signal-to-noise ratio is large enough for a well characterised atmospheric sodium detection. Weaker signals would also be more prominent. The final CLV and RM corrected transmission spectra are shown in Figure 9. The Gaussian fits have 2 = 0.58 (airmass) and 2 = 0.57 (molecfit). Given the comparable fits with both approaches, no conclusion can be drawn about which method is better based on this statistic alone. The measured parameters of the Gaussian fits to the the sodium D2 and D1 lines are shown in Table 2. Following reduction using the airmass telluric correction method, we measured line contrasts of −0.55 ± 0.07 % for the D2 line and −0.45 ± 0.07 % for the D1 line. This results in an average depth of −0.50 ± 0.05 %. For the molecfit method, the line contrasts were −0.64 ± 0.07 % (D2) and −0.53 ± 0.07 % (D1), averaging to −0.59 ± 0.05 %. Although the average line contrasts for both methods are not within 1 of each other, the error ranges overlap. From this we note that telluric corrections using the molecfit method resulted in a transmission spectrum with deeper Na features than those for the airmass method. We find a similar difference when measuring binned absorption depths, and further context is provided in section 5.

Binned sodium absorption
HD 189733 b has been studied extensively in the past at different resolutions (Redfield et al. 2008;Jensen et al. 2011;Huitson et al. 2012;Wyttenbach et al. 2015;Khalafinejad et al. 2017). In order to form a comparison to these (and studies of other planets), it is beneficial to calculate the strength of the absorption over varying bin widths centred at the sodium lines (Snellen et al. 2008;Wyttenbach et al. 2015). This is particularly useful if the line itself is too narrow to be resolved, or if there is too much noise to produce an accurate fit.
We define absorption depth (AD) as the difference between the mean flux of the continuum regions and the mean flux within a passband centred on the spectral feature. Red and blue continuum passbands are chosen in regions around the sodium doublet: 5874.89-5886.89 Å for the blue band, and 5898.89-5910.89 Å for the red band. The mean flux in central passbands of widths 0.375, 0.75, 1.5, 3, 6 and 12 Å are measured. Since the sodium feature contains two lines, the total widths of these passbands are divided into two, with one half centred around each line. However, the 12 Å passband is large enough to encapsulate both features, so this is not split and is centred in the middle of the two lines instead. The absorption depth is thus calculated by where is the mean flux of the transmission spectrum in the blue (B), red (R) and central (C) passbands. This quantity was evaluated over all passbands for each nightly and combined transmission spectrum. All averages were weighted using the inverse of the squared uncertainties (variance), which were propagated from the photon noise of the raw data. The results are shown in Table 3 (airmass) and Table 4 (molecfit). The measured absorption depths of the combined data are shallower for the airmass method than the molecfit method. This matches the difference seen in the line contrasts in section 5.1.
Narrower passbands more accurately probe the sodium line cores, so these measurements will most closely match the Gaussian line contrasts. For the molecfit reduction, the absorption depths were −0.471 ± 0.066 % and −0.501 ± 0.046 % for the 0.375 and 0.75 Å passbands respectively. These values are consistent to within 1 of each other, and the error for the 0.75 Å passband overlaps with the error in the average Gaussian line contrast. The same comparison is true for the airmass method.
We compare our results to those of Wyttenbach et al. (2015), who analyse the same HARPS data and compare the effects of planetary radial velocity corrections. In the 1.5 Å passband, they measure an absorption depth of −0.320 ± 0.031 %, which agrees to within 1 with our results of −0.288±0.031 % for the airmass method and −0.350 ± 0.031 % for the molecfit method. We therefore confirm the 10 detection of sodium in the atmosphere of HD 189733 b. Table 3 shows slight variation to the results discussed by Wyttenbach et al. (2015), despite similarities in the telluric removal methods. This is likely due to differences in the reduction pipeline and handling of the data, as well as corrections for the CLV and RM effects which change the shape and depth of the absorption lines. A consistent trend is seen in both studies: the measured depths for night 2 using the airmass reduction are significantly lower than the other two nights. As discussed previously, the observations on this night began when the planet was already in-transit, so a long out-of-transit baseline was not obtained. A clear telluric reference spectrum was only able to be derived when the in-transit spectra were included. Whilst this is better for telluric correction, it could also lead to partial removal of signals in the transmission spectrum. We were unable to detect any significant absorption in the D1 line for this night, which gives rise to the lower overall measurements. The molecfit reduction performs better in this case. As seen in Table 4, the measured depths are much more consistent across all nights for each passband, and there is a significant improvement in the results for night 2. Therefore, molecfit is much better at removing telluric contamination in spectra, regardless of the number of in or out-of-transit observations. It is also noted that the errors across single passbands are comparable for both methods, therefore the signal-to-noise ratio is improved when using molecfit.
As the passbands increase in size, the measured depths for both telluric correction methods decrease by the same proportion. We measure an absorption depth of −0.049 ± 0.007 % in the largest (12 Å) passband using molecfit. This agrees with the space-based detection of −0.051 ± 0.006 % (Huitson et al. 2012), and two other ground-based detections of −0.067 ± 0.020 % (Redfield et al. 2008) and −0.053 ± 0.017 % (Jensen et al. 2011) within the same 12 Å passband. The corresponding measurement for the airmass method was −0.040 ± 0.008 %, which does not agree as significantly. This suggests that molecfit is better at correcting tellurics in the immediate regions around the sodium doublet.

Wind speeds
Sodium absorption features in the lab rest-frame should be located at wavelengths of 5889.951 Å for the D2 line and 5895.924 Å for the D1 line. Despite efforts made to correct for spectral deviations due to systemic and planetary radial velocities, the measured centroids of the Gaussian fits to the full resolution data (Table 2) have a net blueshift from the rest-frame Na doublet positions (although still encapsulated within error). We interpret this as being due to winds within the atmosphere of HD 189733 b, in agreement with other studies Louden & Wheatley 2015;Casasayas-Barris et al. 2017). The average blueshift was −0.025 ± 0.029 Å for the airmass method and −0.035 ± 0.025 Å for the molecfit method. This corresponds to net wind velocities of −1.3 ± 1.5 km s −1 and −1.8 ± 1.2 km s −1 respectively, indicating an eastward atmospheric movement from day-side to night-side.   Table 3 but measuring from the transmission spectrum after removing telluric contamination with molecfit ( ).
Since the telluric corrections are the only parts that vary between our methods, we deduce that the difference between the velocities may be due to inaccuracies in telluric removal -particularly in nights 2 and 3 for the airmass reduction. Nevertheless, both results are lower than the −8 ± 2 km s −1 measurement discussed by Wyttenbach et al. (2015). The discrepancy can be attributed to nuances in the reduction pipeline and radial velocity corrections, and application of models to correct for CLV and RM effects. If the spectra are observed uniformly over the full transit and stacked in the stellar rest frame, the RM effect can be averaged out. As shown in Figure 6, this is not true when the transmission spectra are stacked in the planetary rest-frame -spurious signals are induced (albeit small for the slowly rotating host star) and were removed by modelling these effects.
Our measured wind velocities are consistent with the results of Louden & Wheatley (2015) (−1.9 +0.7 −0.6 km s −1 ) and Casasayas-Barris et al. (2017) (−2 km s −1 ) who analysed the same data. Additionally, Louden & Wheatley (2015) spatially resolved the atmospheric winds and measured a redshift on the leading limb of the planet (2.3 +1.3 −1.5 km s −1 ) and a blueshift on the trailing limb (−5.3 +1.0 −1.4 km s −1 ). This implies a net eastward motion due to a combination of tidally-locked planetary rotation and strong eastward winds. High-resolution observations in the infrared regime have also reported similar day-side to night-side wind speeds: −1.7 +1.1 −1.2 km s −1 (Brogi et al. 2016) and −1.6 +3.2 −2.7 km s −1 (Brogi et al. 2018). Additionally, our measured wind velocities are consistent with predictions from three-dimensional General Circulation Models (GCMs), where the atmospheric flow is dominated by a superrotating eastward equatorial jet and the hottest region of the atmosphere is advected downwind from the substellar point (Showman et al. 2009;Miller-Ricci Kempton & Rauscher 2012;Rauscher & Kempton 2014). For HD 189733 b, models predict that the equatorial jet extends to latitudes of ∼ 60 • . Atmospheric winds flow eastward within this region -from night-to-day at the western terminator (leading limb) and from day-to-night at the eastern terminator (trailing limb). At latitudes above ∼ 60 • , the winds flow preferentially from day-to-night at both the leading and trailing limbs of the terminator. Therefore, models predict a net blueshift due to the winds (Showman et al. 2013).

Extent of sodium in the atmosphere
An exoplanet's transit depth is wavelength-dependent; if there is molecular or atomic absorption, the atmosphere will appear opaque when viewed at wavelengths within this region. The planet will have a larger apparent radius at these wavelengths and block more stellar flux, compared to a transparent atmosphere with no absorption. Therefore, we can use absorption measurements to probe the atmospheric height at which the molecules or atoms are present. We define atmospheric scale height sc as the increase in altitude over which the pressure reduces by a factor of , given by sc = eq . (9) Using a mean molecular weight of = 2.22 u (for Jupiter) and deriving the planet's surface gravity of = 22.7 m s −2 from the parameters in Table A1, the scale height for HD 189733 b is 200.7 km. We refer to the white-light transit depth as the decrease in stellar flux during transit, equal to the ratio of the sky-projected areas of the planet and star: (10) For HD 189733 b, Δ 0 = 0.0228. Atmospheric absorption increases the apparent planetary radius by a height , so a wavelengthdependent transit depth is similarly defined as Since the atmosphere typically extends around 5-10 scale heights (Madhusudhan 2019), << * , so For a particular wavelength, the difference between the white-light and wavelength-dependent transit depths is a measure of the absorption, Using the line contrasts and passband absorption depths as a measurement of , we can therefore calculate the height of an atmospheric detection by rearranging equation 13 for . For the molecfit reduction, the average Na doublet line contrast of −0.59±0.05 % corresponds to an apparent planetary radius of 1. 13 and an atmospheric height of Na = 10200 ± 900 km. This is ∼ 51 times greater than the scale height and suggests that the detection of Na probes the upper atmosphere of HD 189733 b.

SUMMARY AND CONCLUSION
HD 189733 b is one of the most extensively studied exoplanets to date. A number of chemical species have been detected through observations from space and ground-based instruments, as well as measurements of wind speeds, atmospheric circulation, and evidence of Rayleigh scattering and high-altitude hazes. The nature of telluric contamination makes ground-based observations of exoplanet transmission spectra difficult. Several methods have been used in the past to remove excess absorption from Earth's atmosphere, including observing a telluric standard star (Redfield et al. 2008;Jensen et al. 2011Jensen et al. , 2012, computing an empirical telluric spectrum , and using an Earth atmospheric model (Yan et al. 2015;Casasayas-Barris et al. 2017;Allart et al. 2017).
In this study we approached the challenge of removing telluric lines in high-resolution optical spectra by comparing two popular methods using archival HARPS observations of HD 189733 b. Firstly, we derived a telluric reference spectrum from the data by assuming a linear relationship between airmass and telluric line strength. The reference spectrum was used to scale the strength of telluric lines to a constant level as if they had been observed at the same airmass, and are thus removed during the subsequent transmission spectrum calculation. Our second method involved using molecfit to generate a model of molecular H 2 O and O 2 absorption in Earth's atmosphere. It is then straightforward to remove the contamination through division of unique telluric contributions for each observation. The methods can be applied to all regions of the HARPS spectra, however we focused specifically on the quality of the telluric correction around the sodium D-lines and the impact on the measured line properties. Other empirical methods such as SYSREM or principal component analysis (Tamuz et al. 2005;Mazeh et al. 2007;Birkby et al. 2013), which are normally reserved for severe telluric contamination in the near-infrared, are beyond the scope of this study.
Alongside correcting for stellar, systemic and planetary radial velocity effects, we modelled the Centre-to-Limb Variation and Rossiter-McLaughlin effect to correct for spurious signals which are imprinted onto the transmission spectrum. The relative strengths of these signals were small and at the noise level of the spectrum, however it is an important factor to consider if the star is rotating more rapidly. Both methods were performed on the data from each night separately, then combined for an overall transmission spectrum.
Absorption due to atmospheric sodium (Na ) was identified from the doublet lines at ∼ 5890 Å (D2) and ∼ 5896 Å (D1). For the airmass reduction, we measured Gaussian line contrasts of −0.55 ± 0.07 % (D2) and −0.45 ± 0.07 % (D1), and absorption in the 0.75 Å passband of −0.412 ± 0.046 %. For the molecfit reduction, the corresponding values were −0.64 ± 0.07 % (D2 line contrast), −0.53 ± 0.07 % (D1 line contrast), and −0.501 ± 0.046 % (0.75 Å passband absorption). In cases where weather conditions are variable over one night, the model produced with molecfit was more robust at removing telluric contamination, and absorption depth measurements were more consistent across all nights. Additionally, it performs better when the observing window does not allow for a long out-of-transit baseline to be obtained (when building an empirical telluric reference spectrum is difficult). The average line contrast of the Na doublet for the molecfit method was −0.59 ± 0.05 %, corresponding to an atmospheric height of 10200 ± 900 km. Finally, we measured a net atmospheric blueshift of −1.8 ± 1.2 km s −1 arising from a combination of tidally-locked planetary rotation and day-side to night-side winds.
From the assessments of the results, we conclude that correcting telluric contamination with a model of molecular (H 2 O and O 2 ) absorption in Earth's atmosphere using molecfit produces a better quality transmission spectrum than if the tellurics were removed empirically, e.g. using airmass. This is evident by: (a) a detection of atmospheric sodium with greater significance, (b) more consistent measurements of absorption depths in bandpasses across different nights, and (c) the ability to extract a signal when the weather conditions are not ideal. Molecfit allows for targeted removal of specific telluric lines without significantly impacting data outside of these regions. This is a key advantage over data-driven methods such as the airmass correction used within this work, since it is likely that the empirically derived model will more significantly increase the continuum noise. Additionally, corrections can be performed on a single spectrum without relying on a large sample of observations, which is particularly advantageous if only a partial transit is obtained or if variable weather conditions affect a large number of spectra.
Detections at different spectral resolutions and across a number of wavelength regimes are important for thorough atmospheric characterisation. With an ever-growing number of identifications of chemical species within exoplanet atmospheres, we are able to further understand and constrain dynamical and chemical process and make inferences into how hot Jupiters form and evolve. We have already seen how ground-based instruments such as HARPS can be a powerful tool for such studies in the optical regime, and this is set to increase with newly developed high-resolution spectrographs such as ESPRESSO (Pepe et al. 2013), HARPS-3 (Young et al. 2018) and EXPRES (Jurgenson et al. 2016). ESPRESSO covers a wavelength range of 380-780 nm which probes deeper into the red wavelengths, giving opportunity to make detections of the water band at ∼ 7400 Å or the potassium doublet at ∼ 7665-7699 Å. It is therefore critical to be able to tackle the problem of telluric contamination in a robust way across a broad wavelength range to ensure accurate recovery of the planetary transmission spectrum.