Data structures for photoabsorption within the ExoMol project

The ExoMol database currently provides comprehensive line lists for modelling the spectroscopic properties of molecules in hot atmospheres. Extending the spectral range of the data provided to ultraviolet (UV) wavelengths brings into play three processes not currently accounted for in the ExoMol data structure: photodissociation, which is an important chemical process in its own right; the opacity contribution due to continuum absorption; and predissociation, which can lead to significant and observable line broadening effects. Data structures are proposed which will allow these processes to be correctly captured, and which can represent the (strong) temperature-dependent effects predicted for UV molecular photoabsorption in general and photodissociation in particular.

Summary of the current ExoMol data structure. $N_{\rm files}$: total number of possible files; $N_{\rm mol}$: number of molecules in the database; $N_{\rm tot}$: the sum of $N_{\rm iso}$ for the $N_{\rm mol}$ molecules in the database; $N_{\rm iso}$: number of isotopologues considered for the given molecule.
There are $N_{\rm tot}$ sets of .trans files, but for molecules with large numbers of transitions the .trans files are subdivided into wavenumber regions. There are $N_{\rm cross}$ sets of .cross files for each isotopologue. There are $N_{\rm kcoef}$ sets of .kcoef files for each isotopologue.
There are sets of temperature-dependent super-lines. There are sets of VUV cross sections.
Set of opacity files in the format native to specific radiative-transfer programs.
... cm$^{-1}$, due to predissociation are not saturated in the stellar spectra, meaning that it was only by analysing these broadened lines that Pavlenko et al. were able to retrieve abundances of AlH. It would therefore clearly be advantageous to include consideration of predissociation effects in the ExoMol database.
In this research note we propose a generalisation of the current ExoMol format to allow for the various processes discussed above. At the same time we draw a clear distinction between the photoabsorption data needed for spectral and opacity models and the data needed for modelling the chemical consequences of photodissociation.
Figure 1 gives a simplified ExoMol data structure; a complete specification of the file types is given in Table 1. The master file exomol.all (https://www.exomol.com/exomol.all) gives an overview of the entire database and points towards the .def files, which characterise the recommended line list for each isotopologue for which data are available. The .def file specifies what the dataset contains, for example uncertainties or lifetimes, the quantum numbers used in the states file and the file sizes. It also gives a version number in yyyymmdd date format. The core of the database consists of the .states and .trans files, which provide a compact form of the line list data. Table 2 gives the specifications for the mandatory part of the .states file. These specifications include the optional components: uncertainties in the term values, the state lifetime and the Landé $g$-factor. The specification of term value uncertainties was introduced as part of the last data release (Tennyson et al. 2020), and the aim is to make their inclusion compulsory once uncertainties have been added to all the available datasets. The lifetimes column has thus far contained radiative lifetimes computed using the Einstein A coefficients available in the transitions file (Tennyson et al. 2016a); lifetime effects due to other processes, such as predissociation, have so far not been considered. As discussed below, we propose changing this.

THE PRESENT EXOMOL DATA MODEL
After the mandatory fields, the states file contains data on the quantum numbers and other metadata used to specify each state. The number and format of these quantum numbers are specified by the .def file associated with that dataset. Other metadata associated with each level can also be included in this section. Table 3 shows the start of the states file for the recently updated AlHambra line list for $^{27}$Al$^{16}$O (Bowesman et al. 2021). Note that we have taken the opportunity to update our quantum number specifications to conform with the PyValem Python package for parsing, validating and manipulating quantum state labels of atoms, ions and small molecules (Hill & Hanicinec 2022; Hill et al. 2023b). In general, this change only affects electronic state designations, for which X2SIGMA+, A2PI and so forth are updated to X(2SIGMA+), A(2PI), etc. This update adds two characters to the electronic state field but should otherwise be transparent to users; however, it means that all state labels can now be parsed using PyValem, which is important for some uses of the database. Table 4 gives the specification of the simpler but generally much larger .trans file.
Figure 2. ...; transitions to the repulsive electronic state produce continuum spectra (in green). The line arrows denote sharp line transitions; the dot-interrupted arrow goes to states above the dissociation limit ($D_{\rm e}$), which can exhibit predissociative effects. The golden dotted arrow shows photoabsorption, which we represent using bound-continuum cross sections. Photoabsorption above $D_{\rm e}$ can contribute to photodissociation.
Figure 3. ... Yurchenko et al. (2018b) showing the predissociation region. The $v = 0$ vibrational state of the A $^1\Pi$ electronic state lies below the AlH dissociation limit: transitions to this state are sharp as they do not show effects due to predissociation. Conversely, states in $v = 1$ can predissociate by tunneling through the barrier to dissociation and show pronounced effects due to lifetime broadening. Right: The potential energy curves (bound and dissociative) and predissociated states of SH.
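The label update described above (X2SIGMA+ to X(2SIGMA+), A2PI to A(2PI)) can be illustrated with a short sketch. It uses a plain regular expression rather than PyValem itself; the helper name `update_state_label` and the assumed shape of a legacy label (a letter, optional primes, a multiplicity digit, a term name and an optional sign) are illustrative assumptions, not part of the ExoMol specification.

```python
import re

# Hypothetical helper, not part of PyValem: assumes a legacy label looks like
# <letter><optional primes><multiplicity><term name><optional +/->.
_LEGACY_LABEL = re.compile(r"^([A-Za-z]'*)(\d+[A-Za-z]+[+-]?)$")


def update_state_label(label: str) -> str:
    """Convert e.g. 'X2SIGMA+' to 'X(2SIGMA+)'; return other labels unchanged."""
    match = _LEGACY_LABEL.match(label)
    if match is None:
        # Already in the new form, or not a recognised legacy label.
        return label
    prefix, term = match.groups()
    return f"{prefix}({term})"


for old in ("X2SIGMA+", "A2PI", "X(2SIGMA+)"):
    print(old, "->", update_state_label(old))
```

Labels that already carry parentheses are passed through untouched, so the conversion can be applied repeatedly without changing the result.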

PROPOSED UPDATED EXOMOL DATA MODEL
There are three new aspects that need to be included in an updated ExoMol data structure: predissociation, the continuum contribution to the opacity and photodissociation. Figure 2 illustrates the various photoabsorption processes for the case of a diatomic molecule. Figure 3 illustrates the main mechanisms leading to predissociation: for AlH it is caused by tunneling through a small barrier to dissociation, while for SH it is caused by couplings to dissociative states which cross the upper state of the transition. The effect of predissociation can be included by a minor adjustment to the current ExoMol data structure. Predissociation manifests as a shortened lifetime, which leads to enhanced natural line-broadening of any transition to (or from) the state concerned. We therefore propose to generalise the definition of the lifetime, $\tau$, given in the .states file. For most line lists, where predissociation is not important, $\tau$ is defined as the natural lifetime due to radiative decay. In cases where predissociation is considered, $\tau$ will represent the natural lifetime due to both radiative decay and predissociation. For example, the radiative lifetime of = 1 A $^2\Sigma^+$ state of SH was calculated by Gorman et al. (2019) as 5.13 ns, while the predissociative lifetime was measured as 5.45(24) ps ( = 0) (Wheeler et al. 1997). A similar example for AlH is the $J = 23$, $v = 1$, A $^1\Pi$ state of $^{27}$AlH, whose predissociative lifetime was measured by Baltayan & Nedelec (1979) as 4.5 ns, while the radiative lifetime is calculated to be 101.7 ns (Yurchenko et al. 2018b). The reduced lifetime affects the line broadening of the corresponding transitions and is therefore an important factor in retrievals of AlH abundance from stellar spectra, as was recently demonstrated by Pavlenko et al. (2022).

Predissociation
Cases where predissociation effects are included in this lifetime will be marked by a new flag, prediss, included in the .def file; prediss will default to 0 (none) and be set to 1 when the effects are present.
In principle, the natural lifetime provides a contribution to the line profile which sits alongside the temperature-dependent effects of Doppler broadening and pressure-dependent collisional broadening. In practice, the natural lifetime usually makes a minimal contribution to the line profile and is thus neglected. For predissociating states this is no longer the case. However, including the effect of lifetime broadening within a standard Voigt profile is straightforward. Lifetime broadening leads to a Lorentzian line shape, like pressure broadening, where $\gamma_\tau$, the half-width in cm$^{-1}$ due to lifetime broadening, is given by
$$\gamma_\tau = \frac{1}{4\pi c \tau}, \qquad (1)$$
with $\tau$ the lifetime in s and $c$ the speed of light in cm s$^{-1}$. This half-width can simply be added to the pressure-broadening half-width, $\gamma_{\rm p}$, to give the total Lorentzian component of the line profile. Given that a Voigt profile is already being used, this has little computational impact on a calculation, which suggests that routine use of $\gamma_{\rm p} + \gamma_\tau$ for the half-width would avoid the need to worry about whether predissociation needs to be considered or not. To this end, for states with non-negligible predissociative lifetimes, the radiative values $\tau_{\rm rad}$ in the ExoMol .states files will be replaced by $\tau_{\rm prediss}$. In the example of the $J = 23$, $v = 1$, A $^1\Pi$ state of $^{27}$AlH, the ExoMol value $\tau_{\rm rad} = 1.0169 \times 10^{-7}$ s can be replaced by the experimental value $\tau_{\rm prediss} = 4.5 \times 10^{-9}$ s (Baltayan & Nedelec 1979); where no measurement is available, calculated values will be used.
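As a numerical illustration of the lifetime broadening discussed above, the following sketch evaluates the half-width for the two AlH lifetimes quoted in the text. It assumes the standard relation between a lifetime $\tau$ and the Lorentzian half-width at half-maximum in wavenumber units, $\gamma_\tau = 1/(4\pi c \tau)$; the function names are hypothetical and the pressure-broadening half-width is simply passed in by the caller.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def lifetime_half_width(tau_s: float) -> float:
    """Lorentzian half-width (HWHM, cm^-1) for a state lifetime given in seconds.

    Assumes the standard relation gamma = 1 / (4 * pi * c * tau).
    """
    return 1.0 / (4.0 * math.pi * C_CM_PER_S * tau_s)


def total_lorentzian_half_width(gamma_pressure: float, tau_s: float) -> float:
    """Sum of the pressure-broadening and lifetime-broadening half-widths (cm^-1)."""
    return gamma_pressure + lifetime_half_width(tau_s)


# Lifetimes quoted in the text for the AlH A state level.
gamma_rad = lifetime_half_width(1.0169e-7)     # radiative lifetime only
gamma_prediss = lifetime_half_width(4.5e-9)    # measured predissociative lifetime

print(f"radiative:       {gamma_rad:.3e} cm^-1")
print(f"predissociative: {gamma_prediss:.3e} cm^-1")
```

Under this assumption the predissociative lifetime gives a half-width of about 5.9 × 10^-4 cm^-1, roughly 23 times the purely radiative value of about 2.6 × 10^-5 cm^-1.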

Continuum absorption
Both continuum absorption and photodissociation need to be represented as cross sections rather than lines, as they are continuum processes. However, in our proposed data model continuum absorption will be represented as a function of wavenumber (cm$^{-1}$) to retain consistency with the line spectra, while photodissociation cross sections will be stored as a function of wavelength (nm).
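Because the two kinds of cross section use different abscissae, users combining them will need to convert between wavenumber and wavelength. A minimal sketch, assuming vacuum wavelengths and using hypothetical function names:

```python
def wavenumber_to_wavelength_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a vacuum wavelength in nm."""
    return 1.0e7 / wavenumber_cm


def wavelength_nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a vacuum wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


# A wavelength of 400 nm corresponds to 25000 cm^-1.
print(wavelength_nm_to_wavenumber(400.0))
print(wavenumber_to_wavelength_nm(25000.0))
```

Note that a grid which is uniform in nm is not uniform in cm$^{-1}$, so cross sections converted between the two representations generally need to be interpolated onto the target grid.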
When considering photoabsorption by molecules to states lying above the dissociation limit, the spectrum can be thought of as broadly dividing into two classes: line spectra, comprising what look like sharp bound-bound transitions, and absorption directly into the continuum. Predissociation spectra form an intermediate between these two cases and, as discussed above, will be treated as line spectra. The line spectra can be represented using the form of the standard line lists (line positions, Einstein A coefficients, lower/upper state energies and other state descriptions), as captured by the .states and .trans files. However, the bound-continuum photoabsorption is best represented as temperature-dependent photoabsorption cross sections. These continuum photoabsorption cross sections, whose data specification is given below, will be stored as part of the standard line list database as separate files for each isotopologue. The temperature-dependent photoabsorption spectrum is then obtained by adding the appropriate line and continuum cross sections using software such as ExoCross (Yurchenko et al. 2018a) or PyExoCross (Zhang et al. 2023).
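The addition of line and continuum cross sections can be sketched as follows. This is not the ExoCross or PyExoCross implementation; it is a minimal illustration which assumes that both cross sections are at the same temperature, that the continuum is tabulated on its own ascending wavenumber grid, and that linear interpolation onto the line grid is adequate. All names and numbers are hypothetical.

```python
from bisect import bisect_right


def interpolate(x_grid, y_values, x):
    """Linear interpolation on an ascending grid; 0.0 outside the tabulated range."""
    if x < x_grid[0] or x > x_grid[-1]:
        return 0.0
    i = bisect_right(x_grid, x) - 1
    if i == len(x_grid) - 1:
        return y_values[-1]
    x0, x1 = x_grid[i], x_grid[i + 1]
    weight = (x - x0) / (x1 - x0)
    return y_values[i] + weight * (y_values[i + 1] - y_values[i])


def total_cross_section(line_grid, line_xsec, cont_grid, cont_xsec):
    """Add the continuum cross section, interpolated onto the line grid, to the line one."""
    return [
        sigma + interpolate(cont_grid, cont_xsec, nu)
        for nu, sigma in zip(line_grid, line_xsec)
    ]


# Illustrative numbers only (wavenumbers in cm^-1, cross sections in cm^2/molecule).
line_grid = [100.0, 100.5, 101.0]
line_xsec = [1.0e-20, 5.0e-20, 1.0e-20]
cont_grid = [100.0, 101.0]
cont_xsec = [2.0e-21, 4.0e-21]

total = total_cross_section(line_grid, line_xsec, cont_grid, cont_xsec)
print(total)
```

Treating the continuum as zero outside its tabulated range is a design choice of this sketch; a production code might instead raise an error or extrapolate.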
Continuum molecular absorption due to collision induced absorption (CIA) (Richard et al. 2012;Karman et al. 2019) is already routinely considered as part of astronomical models; it has also been recently recommended that absorption due to the so-called 'water continuum' be considered in model atmospheres for water-rich exoplanets (Anisman et al. 2022).
The data model we propose for including continuum absorption due simply to photoabsorption into continuum states is given in Table 5. Again, these cross sections will be temperature dependent but, unlike CIA and the water continuum, it is probably safe to assume that this continuum is not strongly pressure dependent. The presence of a continuum contribution to the photoabsorption will be indicated by a new flag, continuum, included in the .def file; continuum will default to 0 (none) and be set to 1 when the effects are present.
We note that our proposal involves providing photodissociation data on a wavelength grid (in nm), while continuum absorption cross sections will be provided on a wavenumber grid (in cm$^{-1}$). This latter choice ties in closely with the line lists, which already provide transition wavenumbers (in cm$^{-1}$). The data structure of continuum absorption cross sections is presented in Table 5. The file names have the following structure: '<ISO-SLUG>__<DSNAME>__<RANGE>__T<TEMP>K__P<PRESSURE>bar__<STEP>.cont', where ISO-SLUG is the iso-slug molecule name (Tennyson et al. 2020), DSNAME is the name of the line list, RANGE is the wavenumber range in cm$^{-1}$, TEMP is the temperature in K, PRESSURE is the pressure in bar and STEP is the wavenumber step in cm$^{-1}$.
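The naming convention for continuum files can be assembled mechanically from its components. The sketch below is illustrative only: the function name is hypothetical, the numeric fields are passed as strings so that no assumption is made about the number formatting used in the database, and the example values (other than the iso-slug and dataset name, which appear elsewhere in this note) are invented.

```python
def continuum_filename(iso_slug: str, dsname: str, nu_min: str, nu_max: str,
                       temperature: str, pressure: str, step: str) -> str:
    """Build a .cont file name following the pattern
    <ISO-SLUG>__<DSNAME>__<RANGE>__T<TEMP>K__P<PRESSURE>bar__<STEP>.cont
    """
    return (
        f"{iso_slug}__{dsname}__{nu_min}-{nu_max}"
        f"__T{temperature}K__P{pressure}bar__{step}.cont"
    )


# Hypothetical example: range, temperature and step are illustrative values.
name = continuum_filename("1H-19F", "PTY", "0.0", "50000.0", "300", "0", "1.0")
print(name)
```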

Photodissociation
Photodissociation cross sections are separated from the line list and form a distinct section in the ExoMol database. At present this section contains calculated cross sections for HCl and HF (Pezzella et al. 2022) and measured cross sections for CO, H$_2$O, CO$_2$, SO$_2$, NH$_3$, H$_2$CO and C$_2$H$_4$ due to Fateev et al. (2023) and for CO$_2$ due to Venot et al. (2018). The immediate plan is also to add temperature-dependent cross sections due to Qin and co-workers, who have performed photodissociation calculations on MgO, AlH, AlCl, AlF (Qin et al. 2022a) and O$_2$ (Hu et al. 2023), as well as HF and HCl (Qin et al. 2022b). In due course a structure of photodissociation .pdef files will be implemented to aid the navigation of this section of the database.
As the photodissociation cross sections form a distinct part of the ExoMol database, we have added a new photodissociation definition file (.pdef) to the data structure; the proposed format of this file is given in Table 6. This gives the necessary metadata to access and interpret the recommended photodissociation cross sections. The information sections mirror those given in the .def file for the same system. For completeness we have added two more flags to the master file, line and photo, which define whether the line spectra and photodissociation cross sections are present (=1) or not (=0). The default values are line=1 and photo=0, which aligns with the structure of previous master files, which assumed that all data were in the form of a line list.
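The default values of the new flags mean that files written before this update can be read without modification. The sketch below shows one way a reader might apply those defaults; it assumes the flags have already been extracted from the file into a dictionary (the extraction itself is not shown, and the function and dictionary names are hypothetical).

```python
# Defaults stated in the text: prediss and continuum in the .def file,
# line and photo in the master file.
DEF_FLAG_DEFAULTS = {"prediss": 0, "continuum": 0}
MASTER_FLAG_DEFAULTS = {"line": 1, "photo": 0}


def with_defaults(parsed_flags: dict, defaults: dict) -> dict:
    """Return the parsed flags, filling in the default for any flag that is absent."""
    merged = dict(defaults)
    merged.update(parsed_flags)
    return merged


# An older master-file entry with no new flags is treated as a pure line list.
print(with_defaults({}, MASTER_FLAG_DEFAULTS))
# An entry which also provides photodissociation cross sections.
print(with_defaults({"photo": 1}, MASTER_FLAG_DEFAULTS))
# A .def file for a line list whose lifetimes include predissociation.
print(with_defaults({"prediss": 1}, DEF_FLAG_DEFAULTS))
```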
A file structure for photodissociation was already proposed by Pezzella et al. (2022); however, this is updated and extended here to align with the one proposed for VUV spectra in Tennyson et al. (2020). Table 7 gives the formal specification of the file structure. As a file naming convention we adopt the following: '<ISO-SLUG>__<DSNAME>__<RANGE>__T<TEMP>K__P<PRESSURE>bar__<STEP>.photo', where ISO-SLUG is the iso-slug molecule name, DSNAME is the name of the line list, RANGE is the wavelength range in nm, TEMP is the temperature in K, PRESSURE is the pressure in bar and STEP is the wavelength step in nm. For example, the file of photodissociation cross sections for HF has the filename 1H-19F__PTY__90.0-400.1__T200K__P0bar__0.1.nm; see Table 8. Pezzella et al. (2022) found that their cross sections depended strongly on the temperature of the molecule and proposed presenting these data for 34 temperatures between $T = 0$ and $T = 10000$ K. This data model implicitly assumes that the molecule is in local thermodynamic equilibrium (LTE). We discuss the treatment of non-LTE effects and other issues with this data model in the next section. These data are all implicitly at zero pressure, as pressure broadening effects are neglected. Data from other sources will have different temperature and pressure grids.
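A file name of this form can be split back into its metadata with a regular expression. The sketch below is an illustration under stated assumptions: the class and function names are hypothetical, numeric fields are assumed to be plain decimal numbers, and the extension is matched loosely as any run of letters because the example quoted above ends in .nm while the pattern ends in .photo.

```python
import re
from typing import NamedTuple


class CrossSectionFile(NamedTuple):
    iso_slug: str
    dsname: str
    range_min: float   # nm
    range_max: float   # nm
    temperature: float  # K
    pressure: float     # bar
    step: float         # nm
    extension: str


# <ISO-SLUG>__<DSNAME>__<RANGE>__T<TEMP>K__P<PRESSURE>bar__<STEP>.<extension>
_NAME = re.compile(
    r"^(?P<iso>.+?)__(?P<ds>.+?)__(?P<lo>[\d.]+)-(?P<hi>[\d.]+)"
    r"__T(?P<T>[\d.]+)K__P(?P<P>[\d.]+)bar"
    r"__(?P<step>\d+(?:\.\d+)?)\.(?P<ext>[A-Za-z]+)$"
)


def parse_cross_section_filename(filename: str) -> CrossSectionFile:
    """Split a cross section file name into its metadata fields."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognised file name: {filename!r}")
    return CrossSectionFile(
        iso_slug=match["iso"],
        dsname=match["ds"],
        range_min=float(match["lo"]),
        range_max=float(match["hi"]),
        temperature=float(match["T"]),
        pressure=float(match["P"]),
        step=float(match["step"]),
        extension=match["ext"],
    )


parsed = parse_cross_section_filename(
    "1H-19F__PTY__90.0-400.1__T200K__P0bar__0.1.nm"
)
print(parsed)
```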
Experimental cross sections of molecules covering the VUV region have been curated by the MPI-Mainz UV/VIS Spectral Atlas (Keller-Rudek et al. 2013) using a similar wavelength-based (in nm) format. In Fig. 4, we illustrate photodissociation cross sections of HCl from ExoMol (Pezzella et al. 2022) and by Nee et al. (1986) at room temperature, as provided by the MPI-Mainz UV/VIS Atlas.

OMISSIONS FROM THE UPDATED DATA MODEL
The assumption of LTE for a molecule undergoing photodissociation may not be valid in all cases. Our method of calculating these cross sections does indeed involve computing the initial/final-state dependent data which would be required for a non-LTE representation of photodissociation. However, given that even for diatomic molecules a large number of initial states have to be considered, even when considering the vibronic states only, the volume of these data is large. As yet no one has asked us for non-LTE photodissociation cross sections, so at present we do not propose including them in the standard data distribution; if they are required they can be obtained from the present authors. Examples of state-dependent non-LTE cross sections include the continuous opacities of CH and OH provided by Kurucz et al. (1987), as well as the CH data produced by Popa et al. (2022) and used in their non-LTE analysis of CH in metal-poor stellar atmospheres.
Another issue with our data model for photodissociation is that at present it provides no information on dissociation products. By comparison, the Leiden database (Heays et al. 2017), which provides low-temperature photodissociation cross sections for molecules of astrophysical interest, gives the dissociation products at the threshold to photodissociation but does not provide information on other possible photodissociation products which may arise at shorter wavelengths. Although our methodology is capable of providing the partial cross sections (or branching ratios) associated with dissociation to different products, so far our models have not been constructed to produce these data. While the initial step in photodissociation generally involves a dipole-allowed transition to an electronically excited state, the subsequent dissociation step may involve crossings to states which cannot be reached by dipole transitions from the ground state, such as ones with different spin multiplicity. Allowing for these extra states represents a significant complication in the calculation and again, so far, no one has asked for these data. However, should these partial cross sections be needed, it would be relatively simple to extend our proposed data model to accommodate them; we note that the International Atomic Energy Agency's CollisionDB database (Hill 2023; Hill et al. 2023a) already uses PyValem to address this issue in collision cross sections.

CONCLUSION
The present research note sets out how we propose to extend the current ExoMol data model to accommodate photoabsorption processes which occur at shorter wavelengths, where either direct or indirect absorption into the continuum can occur. Broadly, these processes are classified as predissociation, the continuum absorption contribution to the opacity, and photodissociation. While both predissociation and continuum absorption can be accommodated by generalising our current line list data structure, a new branch starting from a photodissociation definition file has been added for the photodissociation cross sections.
Given that the various processes considered are not mutually exclusive (for example, photodissociation also provides a contribution to the opacity), some care is required in defining data structures to facilitate the use of our extended data. We believe our proposals satisfy this requirement, but further expansion will be required to allow for pressure-dependent continuum absorption, photodissociation of molecules not in LTE, or the possibility that photodissociation might result in a variety of different photodissociation products.