cecilia: A Machine Learning-Based Pipeline for Measuring Metal Abundances of Helium-rich Polluted White Dwarfs

Over the past several decades, conventional spectral analysis techniques of polluted white dwarfs have become powerful tools to learn about the geology and chemistry of extrasolar bodies. Despite their proven capabilities and extensive legacy of scientific discoveries, these techniques are however still limited by their manual, time-intensive, and iterative nature. As a result, they are susceptible to human errors and are difficult to scale up to population-wide studies of metal pollution. This paper seeks to address this problem by presenting cecilia, the first Machine Learning (ML)-powered spectral modeling code designed to measure the metal abundances of intermediate-temperature (10,000$\leq T_{\rm eff} \leq$20,000 K), Helium-rich polluted white dwarfs. Trained with more than 22,000 randomly drawn atmosphere models and stellar parameters, our pipeline aims to overcome the limitations of classical methods by replacing the generation of synthetic spectra from computationally expensive codes and uniformly spaced model grids, with a fast, automated, and efficient neural-network-based interpolator. More specifically, cecilia combines state-of-the-art atmosphere models, powerful artificial intelligence tools, and robust statistical techniques to rapidly generate synthetic spectra of polluted white dwarfs in high-dimensional space, and enable accurate ($\lesssim$0.1 dex) and simultaneous measurements of 14 stellar parameters -- including 11 elemental abundances -- from real spectroscopic observations. As massively multiplexed astronomical surveys begin scientific operations, cecilia's performance has the potential to unlock large-scale studies of extrasolar geochemistry and propel the field of white dwarf science into the era of Big Data. In doing so, we aspire to uncover new statistical insights that were previously impractical with traditional white dwarf characterisation techniques.


INTRODUCTION
In recent years, the field of exoplanets has undergone remarkable growth, revolutionising our understanding of the Universe with the discovery of more than 5500 distant worlds. 1 Despite this notable progress, the detection of a potentially habitable Earth-sized planet with an Earth-like interior remains elusive.
In our quest to find exoplanets similar to our own, knowledge of a planet's bulk composition will be crucial to understand its formation and evolution history, geology, climate, and habitability conditions.Traditionally, transit photometry (Charbonneau et al. 2000) and Doppler spectroscopy (Campbell et al. 1988) have been used to measure the radii and masses of exoplanets, respectively.These two parameters, when combined, provide the planetary bulk density, which can then be compared to interior models to infer the planet's most likely composition (Zeng et al. 2016).Nevertheless, this approach is incomplete because it does not account for the specific elemental abundances of the planet.In addition, for planets smaller than   ∼4 R ⊕ , large uncertainties in radius and mass measurements (> 10%) pose challenges in distinguishing between different interior compositions (Zeng & Seager 2008).These problems are further complicated by degeneracies in theoretical mass-radius diagrams, which can yield various combinations of metal, silicate, water/ice, and gas for the same density measurement (Dorn et al. 2015;Rogers & Seager 2010;Seager et al. 2007).
To date, the most direct -and only-way of overcoming these ambiguities and probing the composition of small, terrestrial exoplanets is through spectroscopy of externally polluted White Dwarfs (WDs).These objects are the degenerate stellar cores of low-and intermediate-mass (≤ 10  ⊙ ) main-sequence stars.With a typical radius similar to the size of the Earth ( WD ≈ 1 M ⊕ ), and a mass half of that of the Sun ( WD ≈ 0.6  ⊙ ), they are extremely dense and compact, which causes their atmospheres to have a stratified chemical structure.In other words, light elements (i.e.H and He) are expected to float in the thin upper layers of their atmospheres, while elements heavier than helium (or "metals," e.g.Ca, Mg, Si, Fe) should sink rapidly towards the core with diffusion timescales much shorter than the cooling age of the white dwarf (Paquette et al. 1986;Koester 2009).In reality, however, observations suggest that between 25% and 50% of single white dwarfs are contaminated with heavy elements in their atmospheres (Koester et al. 2014;Zuckerman et al. 2010Zuckerman et al. , 2003)), likely from the recent or ongoing accretion of disrupted exoplanetary debris from their surviving outer planetary systems (Jura 2003;Zuckerman et al. 2003Zuckerman et al. , 2010;;Koester et al. 2014;Jura & Young 2014).
During the past two decades, high-resolution spectroscopy of these "polluted" white dwarfs has offered an excellent opportunity to determine the bulk composition of their accreted material.This technique has enabled the detection of more than 20 heavy elements (e.g.see references in Table 1 of Klein et al. 2021), revealing that most exoplanetary bodies are dry and rocky (Veras 2021), with some waterrich exceptions (e.g.Farihi et al. 2011Farihi et al. , 2013;;Raddi et al. 2015;Xu et al. 2017;Hoskin et al. 2020).Despite the significance of these discoveries, conventional analysis techniques of polluted white dwarfs entail time-consuming, manual, and iterative work, which makes them prone to human errors.As a result, these techniques may lead to wrong or sub-optimal results, potentially biasing our understanding of the geology and chemistry of white dwarf pollutants.With these challenges in mind, it has been difficult to transition from individual to population-wide studies of metal pollution.As we move into an era of massive multi-object spectroscopic surveys, when tens of thousands of white dwarfs will be observed and many of them are expected to show multiple heavy elements, conventional analysis techniques will thus struggle to keep up.
In this paper, we address the limitations of "classical" white dwarf characterisation methods by leveraging the power of Machine Learning (ML).Recently, multiple ML-based tools have been developed to interpolate and efficiently model spectra with the goal of estimating stellar abundances.Notable examples include The Cannon (aimed at deriving stellar properties based on a large training set of reference stars with well-known abundances; Ness et al. 2015), The Payne (designed to estimate 25 elemental abundances of red giants from low-resolution APOGEE spectra; Ting et al. 2019), wdtools (dedicated to the analysis of H-rich white dwarf spectra with no metal pollution; Chandra et al. 2020) and TheHotPayne, which is an adaptation of The Payne to OBA stars (Xiang et al. 2022).These ML pipelines have undoubtedly shown promising results, but they have not been designed to fit the complex and diverse spectra of polluted white dwarfs.This paper seeks to bridge this gap by presenting the first ML-based code capable of achieving this goal.
Inspired by the groundbreaking contributions of astrophysicist Cecilia Payne and building upon the ML framework of the The Payne, we have chosen to name our code "cecilia" -or Clues of Exoplanet Compositions Inferred from poLluted WhItE dwArfs.With cecilia, we combine state-of-the-art atmosphere models with innovative ML methods to improve the performance of conventional white dwarf analysis techniques and ultimately enable large-scale studies of metal pollution.From a design standpoint, cecilia's ML architecture contributes to this goal by adhering to three fundamental principles: automation, efficiency, and flexibility.As an automated pipeline, cecilia solves the dependence of current techniques on human-based, time-intensive, and iterative procedures.In terms of efficiency, our code harnesses the speed of ML tools to rapidly estimate elemental abundances of polluted white dwarfs with a retrieval accuracy similar to that of traditional methods.Lastly, cecilia has a flexible ML architecture that can be re-trained at any time with new, better, and/or larger models.This also makes our pipeline a particularly useful feature extraction tool beyond the fields of white dwarf and exoplanetary science.
In this work, we describe the building blocks of cecilia, its current performance and limitations, and potential opportunities for improvement.The organisation of the paper is as follows.In Section 2, we provide a brief introduction to the the concept of deep neural networks.In Section 3, we present the stellar parameters and model atmospheres used to train, validate, and test our code.Section 4 explains cecilia's ML architecture, our training procedure, and our framework for estimating the elemental abundances of polluted white dwarfs from their spectra.In Section 5, we test and explore cecilia's functionality with synthetic and real spectroscopic observations.Section 6 discusses the strengths and weaknesses of cecilia, possible future work to enhance its retrieval accuracy, and its relevance in the context of massive datasets from wide-field astronomical surveys.Finally, we offer our conclusions in Section 7.

DEEP NEURAL NETWORKS
Since the 1950s (Turing 1950;McCarthy et al. 2006), the field of modern Artificial Intelligence (AI) has sought to create intelligent systems capable of emulating human skills, both in terms of human reasoning and behavior.In recent years, the rapid development of technology has transformed this theoretical aspiration into a tangible reality, causing an important shift from model-to data-driven scientific analysis.The creation of faster and more powerful computers, coupled with the availability of extremely large and complex datasets (or "Big Data"), has only fueled the growth of AI, making neural networks increasingly relevant across a variety of disciplines, including astrophysics (Smith & Geach 2023).
To date, the task of developing computer programs that can learn, extract, and process meaningful insights from observations is typically referred to as Machine Learning (Samuel 1959).At the core of many ML algorithms are multilayered, fully connected, feed-forward neural networks (NN).These networks are learning systems where information flows forward through several layers of artificial nodes called "neurons."A basic NN consists of an input layer, one or more hidden layers, and an output layer.The higher the number of hidden layers (or "depth"), the more complex the ML architecture becomes (LeCun et al. 2015).This is the case of cecilia, which consists of three neural networks with 5-6 layers each.In general, deeper architectures are more capable of recognising more intricate relationships in the data (Goodfellow et al. 2016).
In addition to the choice of ML architecture, the learning capabilities of a NN are also shaped by the way its neurons are interconnected with one another.This connection is regulated by three parameters known as "weights," "biases," and "activation functions."The weights and biases determine the importance of the neuronal connections, while the activation functions employ non-linear operations to transform the input data of a set of neurons into useful output param- eters.In particular, for a generic neuron , the activation function   performs the transformation where  denotes the total number of inputs fed into the neuron,  is a generic input index,  and  are the input and the output data, and  , and   represent the optimised weight and bias of the neuron, respectively (see Figure 1).Typically, the activation function   is chosen by the user experimentally (e.g.see Section 4.1).
In the field of ML, the training of a NN is an iterative procedure designed to fine-tune the weights and biases of the neurons with the goal of minimising the error (also known as "loss" or "cost" function) between the network's final predictions and an assumed ground truth.During this process, the NN is exposed to labeled observations in order to identify hidden patterns in the data and learn to generate predictions that closely resemble the expected outcome for a given set of input parameters.As the training proceeds, the predictive accuracy of the NN is gradually improved with a technique known as "backpropagation."This technique calculates the loss function at each iteration to adjust the model parameters of the network.A standard formulation of the loss function is the quadratic expression where  is the index of a generic instance,2  is the total number of output neurons,  is the index of a generic output neuron, ŷ,  is the prediction of the -th output neuron,  ,  is the known true output, and  denotes the residual error difference between ŷ and .
For a given training dataset, the aggregated loss function of all the instances  can then be expressed as During training, the aggregated loss function is computed at every step in order to optimise the weights and biases of the network.These parameters are updated iteratively through the gradient of the loss function, ∇J ( , ).More specifically, the incremental change of a generic weight in the network, or Δ , , is proportional to the derivative of the total loss function with respect to the current value of the weight, In this expression, the partial derivative J / , dictates how much the network parameters should be changed in order to reach a minimum of the loss function, with the negative sign indicating that the direction of the "descent" should be against the gradient and towards the minimum.The term  is the "learning rate" of the network, i.e. a user-defined tuning parameter which provides stability to the learning process and controls the step size of the gradient descent.
Broadly speaking, the partial derivative J / , in Eq. 4 is proportional to the inputs of the neurons, the derivatives of their activation functions with respect to their weights, and the errors of the network.For the output layer, these factors are decoupled, which makes the computation of the derivative relatively straightforward.However, for the earlier (and deeper) layers, the term J / , depends on the derivative of the loss function for the outer layers, which in turn is proportional to the product of the derivatives of all the activation functions with respect to their weights.This product operation is implemented with the chain rule from the output to the initial layer, i.e. backwards, hence giving rise to the concept of backpropagation.We note that in many ML applications, including cecilia, the exact value of the gradient at each training iteration is not computed with the full training sample, but approximated with subsets (or "batches") of the latter (e.g.Shallue & Vanderburg 2018).

DATA
In this section, we motivate the use of a neural-network spectral interpolator, describe the atmosphere models and synthetic stellar properties used to train, validate, and test cecilia, and discuss the normalisation techniques used to improve the learning and predictive abilites of our code.

Generation of a Training Set: Motivation
Current white dwarf spectral modelling techniques rely on grids of stellar properties (or "labels") with uniform step sizes (e.g.Coutu et al. 2019).Although this approach has worked well over the years, it can be computationally expensive for a large number of free model parameters.For example, using 13 stellar properties -as in this work (see Section 3.1.1below)-and a relatively coarse grid of 10 points per label, conventional interpolators would require at least 10 13 models to explore the parameter space of their study.Assuming, on average, that each model takes about 1 hour to generate on a single CPU core, it would take on on the order of 10 9 CPU-years to compute the whole grid.Therefore, classical methods are prohibitively expensive.
To address this challenge, we haved developed cecilia, a ML generative spectral interpolator capable of sampling from a highdimensional label space based on randomly drawn training parameters, rather than a uniformly spaced grid of model points.With this ML approach, our pipeline only needs about 20,000 synthetic models to predict white dwarf spectra, i.e. about 2 × 10 −9 % of the models that classical methods would typically require for the same task.

Synthetic Stellar Labels
In this work, we have chosen to characterise He-rich polluted white dwarfs with 13 stellar properties, namely: their effective temperature ( eff ) and surface gravity (log g), their hydrogen abundance, and 10 metal abundances for Ca, Mg, Fe, O, Si, Ti, Be, Cr, Mn, Ni (see Table 1). 3In our models, we have also considered 14 additional heavy elements (C, N, Li, Na, Al, P, S, Cl, Ar, K, Sc, V, Co, and Cu).However, none of these metals are fitted during cecilia's optimisation procedure.
To prepare cecilia's training set of synthetic spectra, we generated 25,000 unique combinations of the 13 aforementioned stellar labels (i.e. eff , log g, and 11 elemental abundances).In particular, for  eff , we generated our labels from a random uniform distribution spanning the range [10,000, 20,000] K.At higher effective temperatures, the presence of photospheric metal pollution may not be due to the accretion of exoplanetary material, but to the effects of radiative levitation pressure (Chayer et al. 1995).For cooler white dwarfs, there is less thermal energy to produce atomic transitions in the optical; as a result, there is a limited number of polluted systems exhibiting multiple heavy elements at once.With respect to log g, we followed Chandra et al. (2020) and sampled our values from a random uniform distribution between [7,9] cgs (with  in cgs units of cm•s −2 ).For log 10 (H/He), we also drew our labels from a random uniform distribution between [-7,-3] dex, as in Coutu et al. (2019).
To generate our labels for the other elements, we used the following procedure.First, we drew our calcium abundances from a random uniform distribution between [-12,-7] dex: (5) For the remaining 10 heavy elements (i.e.Mg, Fe, O, Si, Ti, Be, Cr, Mn, and Ni), we generated a random Gaussian distribution centered around   with a standard deviation   , where   = 2 dex and  Z = log 10 (Z/He) ⊙ − log 10 (Ca/He) ⊙ = log 10 (Z/Ca) ⊙ .
In the above expressions,  is the metal under consideration, log 10 (Z/He) ⊙ is the abundance of metal  in the solar photosphere (Asplund et al. 2009), and log 10 (Ca/He) ⊙ is the solar calcium abundance relative to helium (Asplund et al. 2009).We note that our choice of  Z = 2 dex was motivated by two factors: on the one hand,  had to be large enough to allow cecilia to measure the true elemental abundances of a white dwarf, even if the latter deviated from chondritic values; and on the other hand, we also wanted  to be small enough to down-weight unlikely regions of the parameter space in our training set, hence limiting the hypervolume of cecilia's label parameter space.As a final step, we generated the distribution D of abundance ratios log(Z/He) for our training set by adding In Table 1, we list the values of  Z and  Z for Mg, Fe, O, Si, Ti, Be, Cr, Mn, and Ni, rather than the properties of the (non-Gaussian) distributions resulting from evaluating Eq. 8.
Finally, for the remaining 14 heavy elements (C, N, Li, Na, Al, P, S, Cl, Ar, K, Sc, V, Co, and Cu), we chose to scale their abundances such that their abundance ratios relative to calcium corresponded Sampled from a Random Gaussian Distribution (Eq.7) to the values of CI chondrites reported in in Lodders (2003).This type of bodies are primitive carbonaceous meteorites with a pristine composition unaltered by melting and differentiation processes.With the exception of their volatile components, including their lithium and noble gas abundances, their chemical makeup -mostly dominated by O, Mg, Si, and Fe-closely resembles that of the solar photosphere (Palme et al. 2014) as well as that of bulk Earth (e.g.Zuckerman et al. 2007;Xu et al. 2019;Doyle et al. 2023).Therefore, CI chondrites are often used as a proxy for the average bulk composition of non-volatile objects in the Solar System (Lodders 2003;Jura & Young 2014).In the field of polluted white dwarfs, the assumption of chondritic ratios often constitutes a reasonable first-order approximation when there is no a priori knowledge of the stellar photospheric composition (Dufour et al. 2007;Coutu et al. 2019).In the context of this work, the use of chondritic abundances for the 14 "fixed" metals also allowed us to dramatically decrease the volume of the parameter space that cecilia had to learn during training.

Synthetic Atmosphere Models
Next, we used the 25,000 synthetic labels to generate their corresponding atmosphere models.To achieve this, we employed the code described in Dufour et al. (2007); Blouin et al. (2018a,b) and references therein.This code includes all the necessary physical principles to predict the emerging spectra of metal-polluted white dwarfs and it has been used successfully to reproduce ultraviolet (UV) and optical observations across a wide range of effective temperatures and metal abundances (Dufour et al. 2012;Giammichele et al. 2012;Vanderburg et al. 2015;Melis & Dufour 2017;Xu et al. 2017;Blouin et al. 2019a,b;Coutu et al. 2019;Blouin 2020;Kaiser et al. 2021;Klein et al. 2021;Caron et al. 2023;Doyle et al. 2023).More specifically, for each model, the code calculates a 1D thermodynamic structure first, and this stratification is used to generate a synthetic spectrum with no instrumental broadening and sampled with a pixel spacing equal to  ≈ 50, 000 (i.e.55,000 wavelength points between 3,000 Å and 9,000 Å). Figure 2 shows a selection of synthetic spectra for Herich polluted white dwarfs spanning different effective temperatures, surface gravities, and photospheric compositions.
For cecilia, we independently varied 13 parameters for each model (i.e. eff , log g, and 11 elemental abundances; see Table 1), which we randomly selected for each calculation as described in Section 3.1.1.In total, we calculated 25,000 atmosphere models, but 2,176 of them (8.7%) had to be discarded due to computational issues, leaving 22,824 useful synthetic spectra for training, testing, and validation. 4We note that exclusion of these models did not have any significantly impact cecilia's predictive ability; this is due to our code's ML-based spectral interpolation approach, which facilitates the computation of spectral models in regions of the parameter space where there are no observations.We also recall that while the abundances of 10 metals (Ca, Mg, Fe, O, Si, Ti, Be, Cr, Mn, Ni) were varied independently between different synthetic spectra, the remaining 14 heavy elements (C, N, Li, Na, Al, P, S, Cl, Ar, K, Sc, V, Co, and Cu) were also included in our models with their abundance ratios Z/Ca fixed to their nominal CI chondritic abundance from Lodders (2003).

Data Normalisation
As is standard procedure in ML, we then normalised the synthetic spectra and their corresponding stellar labels between 0 and 1 in order to improve cecilia's learning abilities during training.For our synthetic spectra, we experimented with different scaling techniques and ultimately adopted a "pixel-wise" (PW) normalisation approach (also commonly known as min-max scaling in ML).For a given wavelength index   , this approach performs the following linear transformation: first, it determines the minimum and maximum values (or "pixels") of the synthetic flux across all the available spectra; and then, it uses these flux bounds to normalise the pixel at wavelength   to a value between 0 and 1.Therefore, the normalisation is conducted with respect to the flux axis, rather than along the wavelength dimension.
As illustrated in Figure 3, the PW approach may sometimes cause the normalised spectra to appear distorted to the human eye.Nevertheless, it is a powerful scaling technique because it treats each pixel as an independent spectral "feature," thus allowing cecilia to learn both the small-and large-scale variations in the synthetic flux.In other words, pixels associated to large variations become less prominent, while those linked to smaller changes become magnified.In addition to this advantage, the PW normalisation differs from other ML scaling techniques (e.g.log-scaling) because it treats each pixel in the same way, therefore ensuring that cecilia does not have any preference for a specific absorption line.

Design of cecilia's ML Architecture
The central problem in this paper lies in accurately determining the main properties -or stellar labels-of polluted white dwarfs from their spectra.To achieve this goal, we have developed cecilia, an innovative ML pipeline capable of performing two tasks: (i) the generation of polluted white dwarf spectra from a set of stellar labels, i.e. S =  1 (L), where L is the vector space of the white dwarf labels (with dimension ), and  1 is a generic non-linear function; and (ii) the prediction of polluted white dwarf labels from their corresponding spectra, or L =  2 (S), where S is the vector space of  the white dwarf spectrum (with dimension ), and  2 is a function different to  1 .
Developed with the open-source Python tensorflow package (Abadi et al. 2015), cecilia is characterised by three deep neural networks: an Autoencoder, a Fully Connected Neural Network (FCNN1), and a replica of the FCNN1 for fine-tuning purposes (FT FCNN2).The Autoencoder is a ML tool consisting of two symmetrical networks (Figure 4): an Encoder and a Decoder.The Encoder identifies the main attributes of the input data and compresses the latter into a lower-dimensional representation.This operation takes place in the so-called "hidden" or "latent" space (denoted by H ), where H has a dimensionality equal to the number  of hidden/latent features used to encode the data (i.e.H ∈ R  ).The second part of the Autoencoder is the Decoder, which is a network designed to reconstruct the original input data based on the Encoder's latent model while trying to minimising the loss of information caused by the dimensionality reduction.We emphasise that the hidden features of the Autoencoder are fundamental properties of the ML model and are not intrinsically correlated to any stellar parameters.As a result, they have no astrophysical interpretation.In general, the choice of  constitutes a trade-off between the interpretability and the complexity of the ML model, characterised respectively by a low and a high  (Liang et al. 2023).
Following the Autoencoder, cecilia incorporates two deep neural networks: the FCNN1 and the FT FCNN2.As shown in Figures 5 and 6, these networks consist of an input layer, six hidden layers, and an output layer.Unlike the Autoencoder, which is used to perform compression and dimensionality reduction, the FCNN1 and FT FCNN2 architectures are designed to teach cecilia how to interpolate white dwarf spectra based on a set of 13 stellar labels.We provide a more detailed discussion about this subject in Section 4.2.
At the level of individual neurons, the choice of weights, biases, and activation functions can also shape the performance of a NN, as explained in Section 2. In this work, we followed common ML practice and set the weights of cecilia to random values before optimising them during training.More specifically, we used the tensorflow functions he_uniform and glorot_uniform to initialise the weights of all the initial layers and the final output layer, respectively.These two functions draw values from random uniform distributions in the range between [-limit, +limit].If fan in and fan out are, respectively, the number of input and output connections of a neuron, the parameter "limit" in tensorflow is defined as limit = √︁ 6/fan in (he_uniform) √︁ 6/(fan in + fan out ) (glorot_uniform) With respect to cecilia's activation functions, we opted for using a Rectified Linear Unit (ReLU; Jarrett et al. 2009) and a sigmoid function (Goodfellow et al. 2016), as specified in Table 2.These two activation functions regulate the flow of information by introducing non-linearities across the networks, normalising the neurons' output parameters, and enabling the identification of hidden and complex patterns in the training sample.For an input value , the ReLU function returns 0 if  is negative, or  if  is positive, Therefore, its derivative is given by In contrast, the sigmoid activation function () transforms the input value  into a value between 0 and 1 with As a consequence, its derivative approaches zero when  → ±∞.In neural networks with multiple hidden layers, the potential saturation of the sigmoid function towards zero values can lead to the so-called "vanishing gradient problem" (Hanin 2018).This issue arises during backpropagation when the gradient of the loss function -which encapsulates the product of the derivatives of the activation functions with respect to their weights (see Section 2)-becomes extremely small.Due to this "vanishing" effect, the weights and biases of the initial layers may only be marginally updated, which can cause poor and slow model convergence.As evidenced by Eq. 11, ReLU functions are more robust to this problem because they have a more stable derivative (either 0 or 1).As a result, they often yield a better performance during training (Nair & Hinton 2010).In this work, we decided to employ ReLU activation functions for the initial layers of cecilia's networks in order to avoid the vanishing gradient of the loss function.For the final layer of each network, we chose to use a sigmoid function with the goal of generating predicted values between 0 and 1.
In Table 2, we summarise the fundamental properties of cecilia's Autoencoder, FCNN1, and FT FCNN2.These properties were de-termined with the Python tensorboard library, which is a builtin optimisation module of tensorflow designed to execute and compare several configurations of the same ML architecture.In particular, tensorboard performs a short training of each configuration to identify the one with the most optimal performance.During cecilia's development phase, we used tensorboard to optimise the number of layers, the neurons per layer, the batch size (i.e. the number of training instances processed together during a given iteration), and the learning rate of the Autoencoder and the two FCNNs.

Network Training
To train cecilia, we used a partition of MIT's Satori Supercomputer with one NVidia V100 GPU.5 First, we randomly split our collection of synthetic labels and spectra into three datasets for training (70%), testing (20%), and validation (10%) purposes.Next, we used the backpropagation technique to optimise the hyperparameters of cecilia's neural networks.To this end, we implemented the Adaptive Moment Estimation (Adam) optimisation algorithm (Kingma & Ba 2017), which is an extension of the commonly used Stochastic Gradient Descent routine (Kiefer & Wolfowitz 1952).Finally, given that the synthetic spectra described in Section 3.1.2represented a substantial amount of data (a total of about 10 9 flux pixels for 22,842 spectra), we chose to train our pipeline by dividing the synthetic spectra in 29 windows of 200 Å, rather than by considering the full wavelength range at once. 6With this approach, the end-to-end training of cecilia took approximately three weeks.
For each spectral window of 200 Å, cecilia's training consisted of three sequential learning phases: an initial one for the Autoencoder, another one for the FCNN1, and a final one for the FT FCNN2.In each phase, the networks employed the same datasets for training, validation, and testing purposes, with each dataset consisting of unique combinations of the 13 randomly generated labels listed in Table 1 (i.e. eff , log g, and 11 elemental abundances from H to Ni).It is important to clarify that the Autoencoder and the FCNN1 were only used to improve the learning of the fine-tuned neural network.After concluding cecilia's training, our code only uses the FT FCNN2 to generate spectral models of polluted white dwarfs.Below, we summarise the three training phases in more detail.
• Phase 1 (Figure 4): First, we trained the Autoencoder in order to compress and reconstruct the normalised synthetic spectra based on only 100 latent features.At the end of this process, we used the trained Encoder to transform our full database of 22,824 synthetic spectra into their latent space representations.If S and H represent, respectively, the vector spaces of the synthetic spectra and the hidden space (see Section 4.1), we can denote the operations of the trained Autoencoder as • Phase 2 (Figure 5): Next, we trained the first Fully-Connected Neural Network (FCNN1) to compress the normalised stellar labels into their latent space representations.We then appended the trained Decoder to the output layer of the FCNN1 to convert the latent features of the normalised labels into predictions of normalised synthetic spectra.If L is the vector space of the The first component of cecilia's ML architecture is an Autoencoder, which consists of two networks: an Encoder (in grey) to compress input data into a small number of latent, hidden features (in orange); and a Decoder (in green) to reconstruct the encoded data from the latent space.In our work, the input and output data of the Autoencoder are spectral windows of 200 Å from our atmosphere models (in purple and pink, respectively).4 (in green) to predict the spectrum of a polluted white dwarf (in pink) from its corresponding stellar labels (in purple).
T eff log(g) log 10 (H/He) log 10 (Ca/He) log 10 (Be/He) log 10 (Mg/He) log 10 (Fe/He) log 10 (O/He) log 10 (Si/He) log 10 (Ti/He) log 10 (Cr/He) log 10 (Mn/He) log 10 (Ni/He) . The third component of cecilia's ML architecture is a replica of the FCNN1 aimed at fine-tuning the final predictions of our pipeline (FT FCNN2; in blue).Similarly to the FCNN1 Figure 5, this network is connected to the trained Decoder (in green) to predict a metal-polluted spectrum (in pink) from its stellar labels (in purple).stellar labels, the FCNN1 and trained Decoder are thus in charge of perfoming the operation: At this stage, we had two options for training the FCNN1 while keeping the trained Decoder fixed.On the one hand, we could directly determine the errors of the hidden layer with respect to the encoded spectra H .This scenario required prior compression of the spectra with the Encoder and involved backpropagating the error through the FCNN1 to correct the weights and biases of the network.On the other hand, we could calculate the errors of the FCNN1's output layer with respect to the true synthetic spectra S.This alternative option required backpropagating the errors through the fixed Decoder until reaching the hidden space, and then updating the weights and biases of the FCNN1 accordingly.
In other words, the main difference between the two approaches was the location of the network used to calculate the loss function, with the first and second scenarios employing, respectively, the encoded synthetic spectra H and the normalised synthetic spectra S. In this work, we decided to follow the first approach in order to (i) avoid unnecessary backpropagation steps while traversing the Decoder, and (ii) prevent gradient vanishing effects of the loss function, which could result in a slower and poorer performance.
• Phase 3 (Figure 6): Lastly, we sought to improve and fine-tune cecilia's final predictions by training a second Fully-Connected Neural Network (FT FCNN2) connected to the trained Decoder.This architecture replicates the structure of the FCNN1, but it differs from the latter because we allowed the parameters of the Decoder to be trainable.Similarly to Eq. 14, the joint FT FCNN2 and Trainable Decoder architecture performs the operation where the latent space of this third network, represented by the term H * in Eq. 15, is now different than those of the Autoencoder and the FCNN1 (H ).
Upon training cecilia's Autoencoder and FCNNs, we calculated the error between the predictions of each network and their respective true synthetic models.To this end, we adopted the Mean Absolute Error (MAE) metric as our loss function -a popular choice in ML regression problems.In this work, we calculate the MAE of a spectral window as where  and  represent, respectively, the index and the total number of instances -where we recall that an instance is a synthetic spectrum with its corresponding stellar labels-,  and  denote the index and total number of output pixels (i.e. the number of flux points in a given spectral window),  and ŷ correspond to the true and predicted synthetic flux.Figure 7 shows our loss function for each network, with the MAE averaged for the 29 windows in the wavelength range between 3,000 Å and 9,000 Å.From this figure, we find that the training error rates of cecilia's networks decrease with each epoch, as should be the case during training.This behavior demonstrates the ability of cecilia to recognise abstract features in the training data and generalise its acquired knowledge to previously unseen observations.At the same time, Figure 7 shows that none of cecilia's networks are suffering from the dangers of underfitting or overfitting:7 on the one hand, the absence of underfitting can be demonstrated with the convergence of the validation error, which becomes asymptotically horizontal, thus leaving no substantial margin for further generalisation; on the other hand, the problem of overfitting can be discarded because the validation error does not increase over time.In Figure 8, we also present an example of cecilia's predictions for several 200 Å-windows of a synthetic spectrum.
A general note about our training process is that Phases 2 and 3 recycle the trained Decoder to improve the performance of the FCNN1 and the FT FCNN2.After experimenting with less sophisticated ML architectures, we found this approach resulted in more accurate predictions than a simpler NN designed to predict stellar labels directly  from their spectra without any intermediary steps (i.e. =  ()).
The use of a pre-trained model to improve the training of another network is known as "Transfer Learning" and has been very successful in many scientific fields outside computational science (e.g.Tan et al. 2018).

Parameter Estimation
After training, cecilia can use its FT FCNN2 architecture to rapidly (<1 second) generate a full high-resolution synthetic spectrum from 13 stellar labels ( eff , log g, log 10 (H/He), and 10 metal abundances relative to helium).Exploiting the fast and automated ML interpolation capabilities of our pipeline, we have developed a new fitting framework to simultaneously and accurately measure the main physical and chemical properties of He-rich polluted white dwarfs from their spectra.This section explains our methodology for achieving this goal using a step-by-step approach.In our discussion, we will employ the subscripts "synth" and "obs" to represent cecilia's synthetic predictions and a real white dwarf spectrum, respectively.With this choice of notation, we will refer to their corresponding wavelength, flux, and flux error arrays as  X ,  X , and  X,err , where  will indicate the nature of the data (i.e.synthetic "synth," or observed "obs").Finally, we will denote their resolving power as  X , where  X ≡  X /Δ X (unitless).Broadly speaking, our fitting framework can be divided into three main phases (see Figure 9).First, cecilia's trained FT FCNN2 architecture generates a preliminary atmosphere model based on the assumed astrophysical properties of the white dwarf.Second, our pipeline optimises this initial prediction with the non-linear least-squares Levenberg-Marquardt algorithm implemented by the fast Python mpfit 8 library (Moré 1978;Markwardt 2009).To conclude, we perform a global Bayesian fit with the differential evolution Markov Chain Monte Carlo (MCMC) sampler of the Python edmcmc package (Vanderburg 2021;Ter Braak 2006). 9More specifically, our optimisation procedure consists of the following phases: (i) Step 1: Input Spectrum and Stellar Parameters: First, the user uploads the spectrum of their polluted white dwarf ( obs ,  obs ,  obs,err ) into cecilia, ensuring that the effective temperature and surface gravity of the star satisfy the ranges 10,000≤ eff ≤20,000 K and 7≤log g≤9 cgs imposed in Section 3.1.1(see Table 1).In parallel, the user provides a list of 14 white dwarf properties to our pipeline, either complete with initial guesses, or lacking the latter if unknown.These properties consist of the 13 stellar labels used during cecilia's training, 10 as well as an additional Radial Velocity shift (or RV thereafter) to account for the barycentric motion and gravitational redshift of the white dwarf.If necessary, the RV term can also be used to handle potential wavelength calibration issues in the spectrum. (ii) Step 2: Initial Guesses for Stellar Parameters: Next, cecilia 8 https://github.com/segasai/astrolibpy/blob/master/mpfit/mpfit.py 9Differential evolution is a genetic algorithm designed to find an optimal distance and direction for a jumping distribution from one MCMC chain to another. 10A future improvement to cecilia will be to include a trained neural network to estimate the effective temperature and surface gravity of the polluted white dwarf from existing photometric observations (see Section 6).In the meantime, we assume that the user has good external constraints on  eff and log g.
reviews the user's white dwarf properties and performs the following operations based on the availability (or absence) of initial guesses for the 13 stellar labels and the RV term: • Scenario 1: The user has prior knowledge on all stellar parameters: In this optimal situation, cecilia examines the user's initial guesses to confirm that they are within the ML bounds listed in  1.If any of these parameters are unknown, cecilia uses the mean of their ML bounds as initial guesses for its optimisation routine (i.e.Teff =15,000 K, log(g)=8 cgs, log(H/He)=-5, and log(Ca/He)=-9.5).For the remaining white dwarf parameters, cecilia adopts a zero RV shift and assumes chondritic abundances for Mg, Fe, O, Si, Ti, Be, Cr, Mn, Ni based on the estimated log 10 (Ca/He) of the white dwarf and the chondritic ratios of Asplund et al. (2009).
• Scenario 3: The user has no knowledge of the stellar parameters: In this worst-case scenario, cecilia generates preliminary estimates of  eff , log g, log 10 (H/He) and log 10 (Ca/He) by taking the mean of their corresponding ML bounds, as in Scenario 2 above.For the remaining white dwarf properties, it also follows the same procedure described in Scenario 2, that is: it assumes a RV shift of 0 km/s and chondritic abundance ratios for those elements lacking a user-defined initial guess. (iii) Step 3: Definition of cecilia's Spectral Model: After loading the observed spectrum into cecilia together with (adjusted) preliminary estimates of the 14 white dwarf parameters, the user is asked to choose the free and fixed parameters of their cecilia spectral model.If certain parameters are fixed, our code freezes their initial value to the user's guess or, if not provided, to cecilia's estimated value. (iv) Step 4: Adjusting the Format of the Input Data: The last step before initiating our fitting procedure is to adjust the white dwarf spectrum and its stellar labels to a format that cecilia can understand.To this end, our code automatically implements the following data processing techniques: • Normalisation of Initial Guess Labels: First, it normalises the guess stellar labels using the minimum and maximum values of the synthetic labels considered during training (see Section 3.2).Given that cecilia was trained in normalised space, this step is crucial to ensure that our code can effectively process and interpret the user's input data.
• Check of Spectral Units: Next, cecilia verifies that the observed stellar flux is expressed in terms of the calculated flux at the surface of the white dwarf, i.e.   , where [  ]=erg/(s•Hz•cm 2 •sr).Although this check is not of paramount importance due to the slope and offfset correction discussed in step (vi) below,   is the unit of choice of cecilia's atmosphere models (e.g.see Figure 8) and is therefore the preferred notation of our pipeline.In reality, however, real spectra of white dwarfs are typically not expressed in units of   , but in   , where [   ]=erg/(s•cm 2 •Å).As a consequence, our code integrates the possibility of transforming the observed stellar flux from   to   with where  is the speed of light (in [Å/s]),  is the radius of the white dwarf,  is its distance from Earth, and the second term represents the inverse solid angle of the white dwarf (in [1/sr]).In Section 5, we use the Gaia DR3 database (Gaia Collaboration et al. 2016Collaboration et al. , 2023) ) to determine the values of  and .
• Spectrum Cropping: Lastly, our code examines the observed wavelength array to make sure that it falls within cecilia's spectral coverage between 3,000 Å and 9,000 Å.Any regions outside these boundaries are automatically excluded from our fitting analysis (e.g.see Figure 12).

(v)
Step 5: cecilia's High-Resolution Spectral Prediction: Next, our code feeds the normalised stellar labels into cecilia's trained FT FCNN2 architecture to quickly generate a normalised synthetic spectrum for the white dwarf ( synth ,  synth ).At this stage, cecilia's synthetic spectrum shares the resolving power of the atmosphere models described in Section 3.1.2,i.e.  synth ≈50,000.Therefore, assuming that the resolving power of the observed white dwarf spectrum is lower than that of our atmosphere models ( obs ≪ synth ), cecilia's prediction cannot be optimised yet. (vi) Step 6: Processing of cecilia's High-Resolution Spectral Prediction: To enable a meaningful comparison between the output of cecilia's FT FCNN2 architecture and the observed input spectrum, our code performs the following automated and sequential operations: • Denormalisation of cecilia's Synthetic Prediction: First, it denormalises cecilia's prediction with the minimum and maximum flux values of the synthetic models used during training.This transformation brings cecilia's synthetic spectrum to the non-normalised space of the observed white dwarf spectrum.
• Smearing and Resampling: Second, our code smears and reasmples cecilia's denormalised prediction to the resolving power ( obs ) and wavelength grid ( obs ) of the input spectrum, respectively.To achieve this, it downgrades cecilia's synthetic spectrum from  synth to  obs by convolution with a Gaussian smoothing function.Then, it resamples cecilia's prediction by smearing the latter onto the wavelength grid of the observed spectrum using Python's one-dimensional linear interpolator numpy.interp.
• Slope and Offset Correction: Third, we correct for any spectral "jumps" in cecilia's prediction caused by the training of our ML pipeline in independent windows of 200 Å (e.g.see Figure 8).For this task, we start by defining an observed normalisation function N obs from the ratio between cecilia's prediction and the observed flux, where   synth is cecilia's synthetic flux at wavelength , and   obs is the observed stellar flux.Then, we use a weighted linear least-squares algorithm to fit N obs as a function of the observed wavelength array,  obs .From this analysis, we obtain a model normalisation function N model (e.g.see purple line in the second panel of Figure 8), which we then multiply to cecilia's prediction to minimise the jumps between consecutive windows.We note that cecilia can do a slope and offset correction with an n-th degree polynomial as well.
• Radial Velocity Shift: Finally, our code applies a radial velocity shift to cecilia's denormalised, smeared, resampled, and slope-corrected prediction.In particular, if  is the speed of light, and  obs is the assumed RV shift of the white dwarf, the total wavelength shift of the spectral lines can be calculated with From the above, we determine the shifted wavelength array of cecilia's synthetic prediction with Using this new wavelength grid, we then perform a cubic spline interpolation to obtain the shifted flux and flux error arrays (Eq.21).In the absence of flux points within a certain wavelength range, our function simply extrapolates the data.Therefore, for a shifted wavelength array  ′ synth , the final products of our interpolation are: In the remaining discussion, we will denote cecilia's smeared, resampled, slope-corrected, and RV-shifted synthetic wavelength and flux as  synth,corr and  synth,corr , respectively. (vii) Step 7: Initial mpfit  2 Optimisation: After concluding steps (i)-(vi), our code initiates the first phase of our optimisation procedure, namely: a preliminary non-linear least-squares fit implemented with the mpfit package.The goal of mpfit is to quickly (< 2 minutes) identify good values for the model parameters which can then be used as initial guesses for our MCMC.To achieve this, mpfit minimises the chi-squared statistic ( 2 ) of cecilia's corrected prediction, where  points is the total number of points in the observed input spectrum,   synth,corr is cecilia's corrected synthetic flux at wavelength ,   obs is the observed stellar flux, and   obs,err is the uncertainty of the latter.
As illustrated in Figure 9, mpfit computes the  2 metric in Eq. 22 as follows: first, it provides the normalised stellar labels to cecilia, which then invokes its trained FT FCNN2 architecture to rapidly generate a preliminary normalised, high-resolution ( synth ), synthetic spectrum ( synth ,  synth ).Next, cecilia's prediction is adjusted with the techniques described in step (vi), which allows mpfit to compare the observed input spectrum to    Step 2 Step 3 Step 4 Step 4 Step 5 Step 6 Step 7 Step 8 Best-Fit MCMC Model, including WD Labels and Uncertainties Figure 9.A schematic view of our fitting methodology using cecilia and a combination of frequentist and Bayesian statistical techniques.First, the user uploads the spectrum of their polluted white dwarf onto cecilia, together with a list of 14 stellar properties ( eff , log g, 11 elemental abundances, and a RV shift; see step (i) in Section 4.3).If these properties are unknown, cecilia estimates initial guesses as explained in step (ii); afterwards, it verifies that the labels are within cecilia's ML bounds.Next, the user defines the free and fixed parameters of their spectral model (step (iii)).Then, our code adapts the input spectrum and stellar properties to cecilia's format (step (iv)); this includes normalising the guess labels and excluding any regions of the observed spectrum outside of cecilia's coverage between 3,000 Å to 9,000 Å.After this data processing step, our code feeds the normalised labels to the trained FT FCNN2 network, which then takes <1 second to generate a normalised, high-resolution synthetic spectrum (step (v)).This prediction is subsequently denormalised and adjusted to the properties of the input spectrum (step (vi)) before being optimised with two techniques: a non-linear least-squares  2 minimisation algorithm (step (vii)), and a Bayesian MCMC (step (viii)).Using 1 GPU from the MIT Satori Supercomputer, our fitting procedure produces a full spectroscopic solution in less than about 10 hours, including the stellar labels of the polluted white dwarf with their corresponding uncertainties.Source of icons: Flaticon.cecilia's corrected prediction.From this comparison, mpfit calculates the  2 metric with Eq. 22 and repeats the entire optimisation procedure until identifying a set of optimal model parameters.For these best-fit labels, mpfit also provides their covariance matrix, which can then be used to derive initial estimates of their uncertainties.

(viii)
Step 8: Final MCMC Exploration The second phase of our fitting routine employs the best-fit results of mpfit to initialise an MCMC sampler and obtain more robust parameter uncertainties.In contrast to mpfit, which uses frequentist (or "classical") statistics to minimise the  2 value of cecilia's predictions and determine best-fit values with fixed error bars, an MCMC relies on Bayesian inference to maximise the likelihood ℒ of the model parameters and evaluate their full posterior probability distributions.These "posteriors" are proportional to the information contained in the observations as well as to any pre-conceived beliefs or assumptions about the model parameters (or "priors").
Mathematically, the posterior probability () of observing the model parameters () given an observed spectrum (  obs ) is encoded in Bayes' Theorem where ℒ is the likelihood function, and  0 () is the prior probability distribution given an array of model parameters .Assuming that the likelihood of cecilia's model parameters can be modelled with a Gaussian distribution, our MCMC is designed to explore and map the multidimensional likelihood function, given by ln(ℒ) = − 1 2 Above, we adopt the logarithmic scale for numerical and computational efficiency.In particular, the use of the log-likelihood is driven by three main factors: first, it simplifies and speeds up the calculation of derivatives; second, it penalises unlikely fits by assigning them a very high  2 value and thus a very small loglikelihood; and third, it does not entail any loss of information during the fit because both ℒ and ln(ℒ) have a maximum in the same location.
In Eq. 24, the last two terms represent the prior distributions in our cecilia model.The first term ( =1  0,phot j ) assumes that the user has reasonable estimates for the effective temperature and surface gravity of the white dwarf from an external photometric fit.The second term ( =1  0,chondr k ) is optional and can be used to impose chondritic abundance priors on any metal of interest based on the assumed calcium abundance of the star and the chondritic values of Asplund et al. (2009).These physically motivated priors are useful to constrain the abundances of those heavy elements that are barely visible or undetectable in the observed white dwarf spectrum.
After defining the prior distributions of our likelihood function, we execute our MCMC following the same structure presented in step (vii); this time, however, we do not minimise the  2 statistic of cecilia's predictions, but maximise the loglikelihood function ln(L).To run our MCMC, we use an ensemble of  walkers walkers and  draws draws, discarding the first 20% of the draws in a "burn-in" phase to exclude non-stationary values.11With 1 Satori GPU and suitable combinations of these three MCMC hyperparameters, we find that cecilia can produce an optimal spectroscopic model in less than 10 hours, with the time complexity of this process depending primarily on the the computational time associated to the predictions of the FT FCNN2, rather than the quality and/or resolution of the observed input spectrum.In addition to providing a best-fit spectral solution, our MCMC also determines the conditional probabilities of the model parameters, also known as "posterior probability distributions."From the these posteriors, the MCMC computes the median values of each label, together with their 16th, 50th, and 84th percentiles.In this work, we report our MCMC results using the median statistic and its 1 confidence interval.
At the end of the MCMC, we also examine the quality of cecilia's fit with a variety of diagnostic figures and performance metrics.Among them, we generate scatterplot matrices (or "corner plots") to visualise the two-dimensional "marginal" posterior of a given pair of model parameters alongside their one-dimensional marginal histogram projection.For a parameter   , the marginal posterior  (  |  obs ) can be computed with where  iterates over all the parameters.Finally, we test the convergence of the chains using the Gelman-Rubin (GR) potential scale reduction factor R for each model parameter (Gelman & Rubin 1992).This metric compares the within-chain-variance with the scatter of the points between the chains.In this work, we assume convergence of a model parameter if its GR value is lower than 1.003 (Vehtari et al. 2021;Vats & Knudson 2018).

RESULTS
In this section, we investigate the performance of cecilia using two types of observations: noiseless synthetic models, and a real spectrum of a well-characterised polluted white dwarf.By testing cecilia under ideal and realistic noisy conditions, we aim to understand its retrieval accuracy, its usage limits, and its suitability for the study of different white dwarfs and astronomical datasets.

Sensitivity Analysis with Synthetic Observations
One of the main disadvantages of ML techniques is the lack of their explainability, i.e. the difficulty of interpreting their predictions through a human lens.Despite notable progress in recent years to make ML models more transparent and easier to understand (Saranya & Subhashini 2023;Vilone & Longo 2020;Gunning et al. 2019), neural networks are still perceived as "black boxes," which can sometimes hinder our ability to trust their predictions and their decisionmaking process.In the context of cecilia, we have sought to address this problem by evaluating the baseline performance of its trained architecture under well-defined and optimal conditions.To conduct our study, we selected 1000 random synthetic spectra from our test set; in other words, we only considered atmosphere models that cecilia had never seen during training.We then used cecilia's FT FCNN2 architecture and the mpfit library to model Table 3. Abundance accuracy of cecilia as determined from an analysis of 1,000 noiseless synthetic spectra from the test set (see Section 5.1).We also show cecilia's allowed ML bounds as well as the abundance ranges detected in He-rich polluted white dwarfs with effective temperatures between 10,000≤ eff ≤20,000 K (source: default column in the Montreal White Dwarf Database, MWDD; Dufour et al. 2016).The dagger symbol ( †) indicates that cecilia's performance is stable over the entire ML range of the parameter, while the star symbol (★) is used to highlight that cecilia cannot constrain Be to within 0.2 dex for synthetic data, hence suggesting a potentially worse performance for real, noisy observations.References:  (H/He) to log 10 (Ni/He)).To better understand the retrieval accuracy of cecilia under perfect conditions, we also decided to adopt the true stellar labels of each synthetic spectrum as our input guesses for our optimisation routine.We then executed mpfit with 30 iterations, obtaining a best-fit model in about 1 minute per spectrum.For computational efficency purposes, we did not complement our  2 minimisation analysis with a full MCMC, and instead treated the mpfit's results as our final solutions.After performing our fits, we determined the residuals ì  between their true labels and cecilia's predictions.Our results are presented in Figure 10 as a function of photospheric composition, with the colorbar indicating the true effective temperature of the synthetic spectra.Recognising that the visibility of a metal is strongly dependent on  eff , the general trend observed in Figure 10 is that cecilia performs very well for synthetic white dwarfs with high levels of metal pollution.As metal abundances decrease, the accuracy of our code is also lower because spectral signatures become less strong and harder for cecilia to mimic (typically less than roughly 1% with respect to the flux continuum, except for a few outliers).In practice, this problem is mitigated by the fact that subtle, shallow lines are challenging to detect observationally as well.
In Figure 10, we show all the independently varied elements during cecilia's training except for Beryllium, which our code could not reliably constrain.This light metal has a relatively low abundance in the present-day solar photosphere and in meteoritic CI chondrites (Asplund et al. 2009;Lodders 2003), which could also be the case for most polluted white dwarfs.With only two prominent resonance lines of Be in the near ultraviolet at 3,130.42 Å and 3,131.07Å (Kramida et al. 2022), its abundance is challenging to determine unless there is a substantial amount of this element in the photosphere of the white dwarf (e.g.Klein et al. 2021).Therefore, given the difficulty of cecilia to model Be, we recommend fixing its abundance to a chondritic ratio before using cecilia to fit the spectrum of a polluted white dwarf.In the remaining sections, we follow this approach to generate our results.
Using the residuals from Figure 10, we then calculated the systematic errors associated to cecilia's mpfit predictions.To this end, we estimated the retrieval accuracy of our pipeline from the ratio between the Median Absolute Deviation (MAD) of the residuals and the median of a standard Gaussian distribution, where the MAD provides a robust statistical metric against outliers.The green and orange vertical lines in Figure 10 show the lowest possible abundance values that cecilia can retrieve with an average accuracy lower than 0.1 dex and 0.2 dex, respectively.These bounds are summarised in Table 3 and can be used to assess the usefulness and robustness of cecilia for the study of a real polluted white dwarf.
In Figure 11, we also illustrate cecilia's estimated abundances as a function of synthetic effective temperature, with the colorbar representing the residuals of its predictions.From this figure, we find that cecilia's performance is overall quite stable over the range between 10,000 K and 20,000 K.However, its residuals are slightly higher for high-temperature white dwarfs, which is consistent with the fact that hotter sources are more opaque, thus making it more difficult to detect weak signs of metal pollution (e.g.Zuckerman et al. 2010).It is also interesting to highlight that cecilia's predictive power is particularly good for log 10 (H/He) and log 10 (Ca/He), as demonstrated by the low error in their predicted abundances.The level of accuracy for these two stellar labels is possibly due to the depth of their absorption lines compared to those of other metals (more than 1% deep relative to the flux continuum, even for the lowest possible abundances of H and Ca at the maximum and minimum allowed values of  eff and log g).
Figure 10.Residuals of cecilia as a function of true elemental abundance for 1,000 noiseless atmosphere models from our test set (see Section 5.1).To conduct our analysis, we fixed  eff and log g to their true values, and only fitted the 11 abundances considered in this work (i.e. from log 10 (H/He) to log 10 (Ni/He) in Table 1).The vertical lines in green and orange denote, respectively, the boundaries where cecilia achieves a predictive accuracy lower than 0.1 dex and 0.2 dex as calculated by Eq. 26 (see Table 3).The black lines represent the minimum and maximum ML bounds of cecilia (see Table 1), with the grey hatched regions symbolising abundances outside of cecilia's allowed parameter space.This figure does not show any results for  eff and log g, which were fixed parameters during our fits.It also does not include our residuals for Be due to the poor performance of our code with this metal.
Figure 11.cecilia's estimated elemental abundances as a function of synthetic effective temperature.The colorbar shows the residuals of its predictions, which are slightly higher for hotter white dwarfs.Similarly to Figure 10, the black lines denote the minimum and maximum allowed ML bounds of Cecilia (see Table 1), while the grey hatched areas represent photospheric compositions outside of cecilia's range.From this figure, we find that cecilia's predictions are particularly good for log 10 (H/He) and log 10 (Ca/He).

Performance with Real Observations
In this section, we test and validate cecilia against real spectroscopic observations of the He-rich, heavily polluted white dwarf WD 1232+563.This star was discovered by Eisenstein et al. (2006), who compiled a catalog of 19,712 spectroscopically-confirmed white dwarfs from the Sloan Digital Sky Survey (SDSS, York et al. 2000).More recently, it was studied by Coutu et al. (2019) and Xu et al. (2019) at low-and high-resolution: the former conducted a spectroscopic analysis of 1,023 He-atmosphere polluted systems observed by SDSS, while the latter used data from the Keck HIRESb and ESI spectrographs to perform a detailed characterisation of WD 1232+563.From these literature studies, the physical and chemical properties of WD 1232+563 appear to fall well within the allowed bounds of cecilia (see Table 4).Therefore, this white dwarf constitutes an excellent candidate to understand the behavior of our pipeline with real astrophysical observations.To perform our sensitivity analysis, we choose to use cecilia to fit the low-resolution, flux-calibrated spectrum of WD 1232+563 from the SDSS DR18 database (plate = 8232, fiber = 84; see Figure 12). 13iven that SDSS observations have a variable resolving power of =1,500 at 3,800 Å and =2,500 Å at 9,000 Å, we adopt a mean value of  obs ≈2,000 in our fitting procedure.Moreover, to evaluate cecilia's behavior with unprocessed real data, we do not remove any problematic data points from the spectrum, such as bad flux points or instrumental artefacts.
As explained in Section 4.3, the first step in our fitting routine is to load the SDSS spectrum of WD 1232+563 into cecilia, together with reasonable estimates for the white dwarf's effective temperature, surface gravity, elemental abundances (H, Ca, Mg, Fe, O, Si, Ti, Be, Cr, Mn, and Ni), and RV shift.To generate our list of stellar parameters, we adopt the  eff and log g values obtained by Coutu et al. (2019) from existing photometric observations; the calcium photospheric abundance derived by the same authors from the Ca II H&K absorption lines at air wavelengths 3,934 Å and 3,369 Å (Kramida et al. 2022); and the elemental abundances of Xu et al. (2019) for the remaining heavy elements.We also assume a zero RV shift as our initial guess for the RV term in our spectral model.In our analysis, we choose to treat all 14 stellar properties as free model parameters.
To begin with, we execute our mpfit least-squares  2 minimisation routine, restricting the label parameter space to cecilia's ML bounds, requiring a maximum of 50 iterations, and using gradient step sizes14 of 0.01 K for temperature, 0.01 cgs for surface gravity, 0.01 dex for all the elemental abundances, and 0.0001 km/s for the RV shift.With these hyperparameters, mpfit generates an optimal spectroscopic solution with its corresponding 14 stellar labels in less than 1.80 minutes.Next, we run a final MCMC with  walkers =50 walkers,  draws =5,000 draws, and  burn = 0.2 •  draws =1,000 draws.We initialize the walkers in Gaussian balls with standard deviations of 0.01 K for  eff , 0.01 cgs for log g, 0.01 dex for the 11 elemental abundances, and 0.0001 km/s for the RV shift.We then define our likelihood function with Eq. 24, incorporating two types of prior distributions: first, we adopt weak photometric priors of 500 K and 0.1 cgs on  eff and log g based on the uncertainties of these parameters in Xu et al. (2019); and second, we assume chondritic abundance priors for Be, Cr, Mn, and Ni, which are elements that we cannot visually and confidently detect in the SDSS spectrum. 15For these chondritic labels, we impose a prior width of 0.5 for Be, Cr, and Ni, with a stricter width of 0.25 for Mn to ensure chain convergence.
With the exception of Be, Cr, Mn, and Ni, for which we use chondritic abundance ratios as initial guesses, we take the results of mpfit to initialise our MCMC sampler.This approach, combined with the edmcmc configuration mentioned above, generates a final spectral model in about 10 hours with a converged Gelman-Rubin statistic lower than 1.008 for all our model parameters.In Table 5, we also summarise the MCMC best-fit labels of WD 1232+563, excluding the four undetected heavy elements with chondritic priors (i.e.Be, Cr, Mn, and Ni).We also report the systematic ( sys,ML ) and statistical errors ( stat,MCMC ) of each model parameter, obtained respectively from our sensitivity analysis of synthetic spectra (see Section 5.1) and our MCMC fit.The former measure the errors of cecilia's predictions arising from the ML-based spectral interpolation process, while the latter represent the errors of the SDSS data and are hence observational in nature.Assuming Gaussian and uncorrelated errors, we can approximate the total uncertainty of a model parameter ( tot ) by adding these two sources of errors in quadrature (together with any other systematic errors, or  other , coming from the atmosphere models, which we do not consider in this work), Using the above expression, we calculate the total uncertainty of each stellar label and provide it in Table 5.As a conservative measure, we assume that the MCMC has underestimated the statistical error of those abundance parameters with  tot <0.15 dex, where this threshold represents an approximate SDSS noise floor. 16For these abundances (flagged with a star symbol in Table 5), we replace the total uncertainty calculated in this work by  tot =0.15 dex.
In Figure 13, we also show our mpfit and MCMC solutions as well as their corresponding "Observed-Calculated" (O-C) flux residuals.To complement this figure, Figure 14 presents the RVshifted MCMC model for different regions of the SDSS spectrum with clear metallic absorption lines, together with two additional models obtained with the same  eff and log g, but with the abundances of the detected elements changed by 3 stat,MCMC .Finally, Figure 15 presents a corner plot of the posterior probability distributions of the model parameters as a measure of MCMC chain convergence.
Based on Table 5, and taking the properties of WD 1232+563 from Xu et al. (2019) as our assumed ground truth, our results demonstrate that cecilia can accurately retrieve the majority of parameters in our model.This includes  eff and log g -for which we use weak photometric priors-as well as the elemental abundances of 6 heavy elements with visible absorption lines in the SDSS spectrum (Ca, Mg, Fe, O, Si, and Ti; see Figure 14).Among these metals, a few of them (e.g.Si, Ti, O) are weak detections, as evidenced by their small observational signatures and their high statistical uncertainties relative to those of other detected elements.For these elements, we might not have been able to claim a confident detection without knowledge of the higher-resolution and signal-to-noise spectra from Xu et al. (2019).However, given that our current knowledge of this star confirms their existence, it is promising to see that our pipeline can estimate abundance values consistent with those in the literature.In the future, cecilia's detection of weak lines in low-resolution data can be a sign that a white dwarf deserves spectroscopic follow-up at high resolution and high signal-to-noise.
With respect to those elements with no clear spectral signatures (Be, Cr, Mn, and Ni), cecilia can marginalise over the uncertainty in the true values of their abundances thanks to the use of chondritic priors in the likelihood function.These chondritic priors limit the values of the parameters explored in our fit, which in turn facilitates MCMC convergence.We note that the assumption of chondritic priors in our MCMC is effectively equivalent to the classical convention of including undetectable metals in atmosphere models with their abundances fixed to a certain value (e.g.chondritic levels).
Another interesting result from Table 5 is cecilia's retrieved calcium abundance (log 10 (Ca/He)≈-7.44), which is more consistent with that of Coutu et al. (2019) (log 10 (Ca/He) Coutu ≈-7.41) than with the value reported by Xu et al. (2019) (log 10 (Ca/He) Xu ≈-7.69).This discrepancy is understandable given the different optimisation strategies used in each paper.More specifically, Coutu et al. (2019) fitted the same SDSS spectrum of WD 1232+563 considered in this work, focusing exclusively on the Ca II H&K doublet and using a maximum likelihood approach similar to that of our code.In contrast, Xu et al. (2019) implemented a line-by-line  2 minimisation technique of the Ca II H&K doublet and the near-infrared (IR) Ca II triplet in their high-resolution Keck spectrum.For this task, they individually fitted all the Ca lines and weighted them equally to obtain a mean calcium abundance, hence giving more importance to the IR triplet than cecilia does.Indeed, our code fits all the Ca lines simultaneously -in combination with the full spectrum-and subsequently weighs them based on their observed strengths, uncertainties, and signal-to-noise.As Figure 14 shows, the Ca II H&K lines in the SDSS data are significantly stronger than the near-IR Ca Best-fit edmcmc MCMC models using an RV shift of 0 km/s and 94.74 km/s (in green and orange, respectively).Similarly to Panel (a), the inset plot shows the spectral window between 3,800 Å and 4,000 Å, while the bottom panel shows the normalised O-C residuals for each spectroscopic solution.We note that the MCMC value of the RV shift is different from that reported in Table 5 because it also accounts for the barycentric velocity and gravitational redshift correction.In each panel, we also show a grey shaded area bounded by an orange and red dashed line, which correspond, respectively, to a ±3 stat,MCMC variation on the best-fit model (using the same  eff and log g, and only changing the abundances of the detected elements by 3 stat,MCMC ).In black letters, we indicate some of the strongest spectral signatures of hydrogen, helium, and multiple heavy elements.Finally, we represent the start/end of a 200 Å spectral window with a dashed vertical line and a black inverted triangle symbol.[a]: All abundances are given as logarithmic number abundance ratios relative to Helium, i.e. log 10 (n(Z)/n(He)).
[b]: We note that there is at least another source of systematic errors coming from the atmosphere models themselves; these errors should be of the order of 0.10 dex and are not considered in this work.
[c]: We assume a SDSS systematic RV error of 10 km/s based on the estimates provided by e.g.Abazajian et al. (2009).
II triplet at around 8,498 Å, 8,542 Åand 8,662 Å.Therefore, similarly to the study of Coutu et al. (2019), they dominate cecilia's fitting procedure, leading to a best-fit log 10 (Ca/He) abundance somewhat higher than the result of Xu et al. (2019).In addition to the different fitting methodologies employed in each paper, it is also possible that the uncertainties on the SDSS flux, which are higher in the red region of the spectrum (likely due to telluric and/or instrumental effects), may have brought our Ca abundance closer to that of (Coutu et al. 2019) than to the result of (Xu et al. 2019).
Besides providing the main physico-chemical properties of WD 1232+563, cecilia generates the posterior probability distributions of its model parameters and shows their interdependence in the form of a corner plot (see Figure 15).With this Bayesian approach, cecilia differs from conventional white dwarf analysis techniques, which can only use a limited number of points in their interpolation grid to explore the relationship between two model parameters (e.g.Klein et al. 2011;Xu et al. 2019).In general terms, corner plots reveal possible correlations between two labels if their posterior distributions have a slanted and oval-like contour shape.For WD 1232+563, this seems to be the case for multiple stellar properties, such as  eff and log g,  eff and log 10 (Ca/He), log g and log 10 (Mg/He), or log 10 (Ca/He) and log 10 (Mg/He).To qualitatively investigate the presence of these correlations, we computed the Pearson Coefficient (PC) for all our model parameters, where   and   are the posterior distributions of labels  and , and    and    are the arithmetic means of their posteriors.A PC greater or lower than 0 denotes, respectively, a positive or negative correlation, while a PC=±1 indicates that the data can be perfectly modelled by a line.
In Figure 16, we present the Pearson coefficients for cecilia's best-fit MCMC model parameters.Assuming a strong positive correlation if PC≥0.5, our results suggest that certain stellar labels are linearly dependent on one another (e.g. eff and log g, or log 10 (Ca/He) and log 10 (Mg/He)), with an increase in one label resulting in an increase of its corresponding label pair.Based on the corner plot and the Pearson coefficients alone, it is hard to determine whether the observed correlations can be generalised to the entire population of polluted white dwarfs.Although such an investigation is outside the scope of this paper, it is possible that certain stellar properties are linearly dependent on one another due to observational or physical reasons.Observationally, some of the metal correlations may be caused by the presence of blended absorption lines.From a physical perspective, changes in  eff and log g (a proxy for pressure) modify the temperature-pressure conditions at the stellar photosphere, which in turn control the strength and the width of the absorption lines.In addition, different values of  eff and pressure result in different excitation, ionisation, or dissociation states of the atoms in the stellar photosphere (e.g.Saha equation, Boltzmann equation), which could also potentially affect the metal abundances of the white dwarf.

Implications for White Dwarf Science
As demonstrated in Section 5, cecilia is the first pipeline capable of simultaneously estimating 10 elemental abundances with a retrieval accuracy of ≲0.1 dex.This performance compares favorably to that of conventional techniques, which in turn depend on a variety of factors, such as the main properties of the star (e.g. eff , log g, level of metal contamination), the quality of the spectroscopic observations (e.g.resolution, signal-to-noise), or the constitutive physics of their model atmosphere codes.In particular, conventional methods can reach typical uncertainties of about 0.10-0.20 dex for He-rich polluted white dwarfs, both across the UV and the optical range, and irrespective of their underlying atmosphere models (e.g.Doyle et al. 2023;Klein et al. 2021;Izquierdo et al. 2020;Raddi et al. 2015;Wilson et al. 2015;Jura et al. 2012;Zuckerman et al. 2007).Therefore, cecilia's performance is close to -if not comparable to-that of classical approaches, only introducing small uncertainties due to the accumulation of errors in its ML predictions.With a promising performance and an automated architecture that eliminates the need for manual, time-consuming, and iterative work, cecilia stands as a promising tool for the study of polluted white dwafs and the bulk composition of their accreted material.
With the advent of wide-field, multi-object astronomical surveys like SDSS-V (Kollmeier et al. 2017), the Dark Energy Spectroscopic Instrument (DESI; Cooper et al. 2023;DESI Collaboration et al. 2016), or WEAVE (Dalton et al. 2014), the volume of spectroscopic observations is poised to grow exponentially (Smith & Geach 2023).These surveys are expected to acquire a large number of white dwarf spectra, so they represent a unique opportunity to improve our fundamental knowledge of metal pollution.Unfortunately, as of today, the exploitation of these datasets remains a challenge for conventional techniques due to their time-consuming and human-supervised nature.With cecilia, we have developed a practical solution to characterise the spectra of polluted white dwarfs in a fast, automated, and accurate way.For instance, using 1 GPU of the MIT Satori supercomputer and limiting our optimisation procedure to mpfit, we estimate that cecilia would take about 14 and 17 days to process the 40,000 and 50,000 white dwarf spectra in the cumulative DESI and WEAVE databases, respectively.In reality, cecilia's performance should be much faster than two weeks, as only a small fraction of these observations will exhibit signatures of metal pollution.To quickly identify these polluted spectra, cecilia could benefit from other efficient ML-based classificators, such as the deep neural networks of Vincent et al. (2023) or the Random Forest approach of Garcia-Zamora et al. (2023).
Another valuable point of comparison is the SDSS-V survey, which has monitored more 6,000 unique Gaia white dwarfs as of 2021, and will acquire a total of 100,000 spectra of these stars in the coming years (Kollmeier et al. 2017;Chandra et al. 2021).Assuming, unrealistically, that all these spectra will be metal-polluted, they would represent an intractable amount of data for traditional white dwarf characterisation methods.Nonetheless, with mpfit only, it would only take about a month to process all these observations.Therefore, in light of these capabilities, our ML-based pipeline is a powerful tool to harness the information content of massive spectroscopic databases and scale up individual studies of polluted white dwarfs, thus opening the door to a statistically-informed understanding of ancient extrasolar geochemistry.Ultimately, we aspire to use cecilia to bring the field of white dwarf science into the era of big data, allowing scientists to overcome the scalability problem associated to "human-in-the-loop" conventional analysis techniques.
In addition to cecilia's promising retrieval accuracy and speed, our pipeline offers a innovative Bayesisan framework to estimate the elemental abundances of polluted white dwarfs.More specifically, our inference approach differs from that of classical methods because (i) it integrates prior knowledge about the spectra and the main properties of the white dwarf, (ii) it provides robust parameter uncertainties based on their full posterior probability distributions, and (iii) it illustrates possible degeneracies between different model parameters thanks to the use of corner plots.Through these added capabilities, cecilia aims to provide a more detailed and nuanced perspective on the properties of white dwarf pollutants and their parent bodies.

Limitations and Future Work
The sensitivity analyses presented in Section 5 show that cecilia can effectively determine the main properties of He-rich polluted white dwarfs, achieving an accuracy comparable to that of conventional techniques without the need for human supervision.This performance has the potential to unlock population-wide studies of metal pollution, but it has several limitations that deserve further investigation.In this Section, we highlight the main caveats of our pipeline, discuss possible mitigation strategies, and propose new opportunities for future work.

General ML Design Choices
First, cecilia's ML architecture can be improved along three important dimensions: versatility, speed, and accuracy.In terms of versatility, our work has revolved around He-dominated polluted white dwarfs, which -for a given abundance level-exhibit stronger metallic lines than H-rich stars due to their lower photospheric opacity (Dufour et al. 2012;Klein et al. 2021;Saumon et al. 2022).Nevertheless, cecilia's potential extends beyond He-dominated systems, as its ML architecture can always be retrained with the model atmospheres and astrophysical properties of other stellar objects.
With respect to speed, our full optimisation procedure takes between 4 and 10 hours to provide a best-fit spectroscopic solution, depending on the user's choice of MCMC hyperparameters (see Section 4.3).This behavior makes cecilia an efficient pipeline capable of modelling several stars at the same time, while running uninterruptedly on a high-performance supercomputer cluster.The scalability of cecilia represents an improvement over the performance of classical methods, but it can still be optimised further in preparation for large-scale studies of polluted white dwarfs.A possible solution to making cecilia more efficient would be to explore whether simpler and/or faster ML architectures can achieve similar performance.
Lastly, cecilia's retrieval accuracy is better than 0.1 dex for 10 heavy elements, which places our pipeline at the level of many conventional techniques.Nevertheless, the errors accumulated in cecilia's networks introduce a small amount of noise into its final synthetic prediction, which in turn challenges the modeling of very shallow lines and introduces limits below which cecilia performs poorly.As model atmospheres improve with our understanding of white dwarf physics, larger training sets and/or better ML architectures might help improve cecilia's predictive abilities, obtaining a better match between its best-fit solutions and the observed metal abundances of a polluted white dwarf.Regardless of whether we improve cecilia's ML predictions, the errors are small enough that most absorption lines detected in the data would be well modelled by our code.

Training Improvements
As explained in Section 3, the size and quality of a training dataset can highly affect the performance of a neural network.In this work, we have have trained cecilia with more than 22,000 synthetic labels and atmosphere models, assuming that white dwarfs can be characterised with a set of 13 randomly varied independent parameters ( eff , log g, log 10 (H/He), and 10 metal abundances).As we design future iterations of cecilia, a larger and more comprehensive training dataset might help our pipeline to achieve a better level of precision.Another important update of cecilia would involve changing our training strategy to prevent distortions of the lines near the edges of the 200 Å spectral windows.A possible solution to this problem would be to re-train cecilia using a system of overlapping windows.
In relation to the stellar labels used in this work, we have identified three improvement opportunities that can be easily implemented in the future.To begin with, we can extend the abundance ranges listed in Table 1 -upwards and downwards-so that cecilia can gain sensitivity to both stronger and weaker levels of metal pollution.This is particularly relevant for Ca and Be.For instance, the inclusion of higher Ca abundances in our training set would allow us to validate our code against several heavily polluted white dwarfs whose astrophysical properties are currently outside of cecilia's allowed ML bounds (e.g.GD 40, Ton 345, GD 362, although the latter would also require extending cecilia to lower temperatures, as discussed below).Similarly, expanding the abundance ranges of Be to higher values would make our pipeline more sensitive to stronger Be lines.This would facilitate the analysis of several Be-polluted systems such as those described in Klein et al. (2021), which have a log(Be/Ca) ratio (≈-2.50 dex) considerably higher than that of the Sun (≈-4.96dex, Asplund et al. 2009).Second, we can make our pipeline more useful by retraining it with more heavy elements.With this idea in mind, we aim to include at least three new elements: carbon (C), sodium (Na), and aluminium (Al).These species have been detected in the atmospheres of multiple white dwarfs and can provide valuable insights into the volatile (C) and rocky (Na, Al) compositions of white dwarf pollutants (Greenstein 1976;Holberg et al. 1998; see review by Klein et al. 2021).Finally, we can try to improve cecilia's performance by exploring different ways to generate our stellar labels.This is especially important for the 14 metals with fixed chondritic abundances (C, N, Li, Na, Al, P, S, Cl, Ar, K, Sc, V, Co, and Cu), which might have introduced biases in our pipeline and caused cecilia's spectral predictions to favor chondritic values.Although the results presented in Section 5 do not show any evidence of this behavior, it will be important to investigate this potential problem in future versions of cecilia.
Regarding our atmosphere models, there are two possible changes that might make cecilia more useful and versatile.On the one hand, we can retrain our pipeline using a broader wavelength coverage.At present, cecilia can analyze spectroscopic observations in the optical range between 3,000 Å and 9,000 Å, but it is not prepared to model data in the UV (∼900≤  ≤3,000 Å).The UV contains important spectral signatures of volatile species -e.g.oxygen, carbon, nitrogen, sulfur, phosphorous-and is thus a crucial region for understanding the volatile budget of a white dwarf pollutant.From a practical point of view, the integration of the UV into our pipeline will also make cecilia particularly relevant for the study of UV observations from space-based facilities such as the Hubble Space Telescope.
Moreover, we can also retrain cecilia with synthetic spectra covering a broader range of effective temperatures.For example, it could be useful to include models of slightly cooler ( eff <10,000 K) and hotter white dwarfs (20,000≤ eff ≤25,000 K).At temperatures higher than  eff >25,000 K, the origin of metal pollution is unclear due to the outward effects of radiative levitation pressure, which can bring metals to the photosphere from the deep stellar interior (Chayer et al. 1995).For cooler white dwarfs, it is still possible to observe traces of metal pollution in their spectra (e.g.Elms et al. 2022;Blouin 2020;Hollands et al. 2017).While most of these cool systems only show a handful of metallic absorption lines (because of the insufficient thermal energy to populate atomic states responsible for many transitions in the optical), they also allow the detection of elements that cannot be easily observed in warmer objects where they are ionized (e.g.Na, K, Li).

Fitting Analysis
In addition to exploring possible improvements to cecilia's ML architecture and training datasets, we have also identified four opportunities to refine our fitting procedure.First, we can reconfigure our code to predict upper abundance limits when the absorption features of a metal are ambiguous or unobservable.This option is currently unavailable because our pipeline was trained in logarithmic space, which implies that cecilia can never predict a zero elemental abundance (i.e. the complete lack of an element).Introducing the possibility of quantifying upper limits would involve a reformulation of our code, but would constitute a valuable addition to cecilia.
A second enhancement to our pipeline would be to allow for a simultaneous spectroscopic and photometric fit of a polluted white dwarf spectrum.Given that our code does not currently use photometric observations in its optimisation routine, cecilia requires the user to have accurate estimates of the effective temperature and surface gravity of the star.These two parameters are also dependent on the amount of heavy elements in the stellar photosphere, so their prediction becomes an iterative problem.17Moreover, even if these two properties are well-defined, they might have been estimated with a set of atmosphere models different to those employed in this work, which could potentially introduce minor errors in cecilia's predictions.Therefore, to ensure that our parameter estimation routine is entirely self-consistent, it would be beneficial to incorporate a simple neural network capable of estimating values for  eff and log g based on the metal abundances predicted by cecilia at each iteration.
Third, we hope to improve cecilia's MCMC fitting procedure by introducing a jitter term in our log-likelihood function.This parameter would allow us to fit unknown sources of noise (e.g.instrumental noise, stellar activity), and could thus be particularly important to model weak absorption lines in real spectroscopic observations.Finally, given that MCMC pipelines can sometimes be quite sensitive to the initialisation of the model parameters, we hope to make cecilia capable of calculating the RV shift of the white dwarf when no estimates are provided by the user.

CONCLUSIONS
In this work, we have presented cecilia, the first ML pipeline designed to measure the main physical and chemical properties of intermediate-temperature, He-rich polluted white dwarfs from their spectra.In particular, our code exploits the power of neural-networkbased-interpolation to (i) rapidly generate synthetic spectral models of polluted white dwarfs in high-dimensional space, (ii) map the latter to a set of 13 underlying stellar properties ( eff , log g, log 10 (H/He), and 10 metal abundances); and (iii) generate a final spectral prediction through frequentist and Bayesian fitting techniques (see Sections 3 and 4).
With the development of cecilia, we have sought to tackle the dependence of conventional spectral analysis techniques on manual and iterative work, which in turn make them time-intensive, prone to human errors, and impossible to scale up to large samples of polluted white dwarfs.As explained in Section 5, cecilia's architecture speeds up and automates these conventional methods by providing preliminary metal abundances in less than 2 minutes and a final spectroscopic solution in less than 10 hours.After testing the performance of our pipeline with noiseless synthetic data and with real spectroscopic observations of WD 1232+563, we have shown that cecilia can retrieve metal abundances with uncertainties better than 0.1 dex for up to 10 elements, with only beryllium proving hard to constrain.This level of accuracy is similar to that of state-of-the-art techniques, and it can only be improved in the future as we incorporate larger/better training datasets and integrate user feedback.
Acknowledging the main limitations of our pipeline (see Section 6), cecilia has the potential to enable population-wide studies of metal pollution without entailing any manual work.These studies will significantly expand the number of polluted white dwarfs with well-characterised abundances, thus playing an important role in improving our knowledge of the geology and chemistry of extrasolar material.Unveiling the diversity of extrasolar compositions is especially relevant in the context of upcoming wide-field astronomical surveys like DESI, WEAVE, or SDSS-V, which will acquire a large volume of spectroscopic observations in the coming years.While conventional white dwarf characterisation techniques are too slow to process these massive datasets, our ML pipeline offers a fast, automated, and reliable solution to measuring trace elements in the atmospheres of polluted white dwarfs.
Finally, as a feature extraction tool with a versatile ML architecture, cecilia can also be replicated and transferred to other disciplines beyond the realm of white dwarf and of planetary science.As we step into the world of Big Data, where the ability to uncover meaningful insights from observations will increasingly become more important, cecilia exemplifies the immense potential of combining powerful ML tools with state-cutting-edge scientific knowledge and analysis techniques.

DATA AVAILABILITY
The SDSS spectrum of WD 1232+563 can be downloaded from the SDSS DR18 online database.

Figure 1 .
Figure 1.Parameters and operations associated to a single "neuron," which represents the smallest unit of a neural network.

Figure 2 .
Figure2.A selection of 6 atmosphere models for synthetic He-rich polluted white dwarfs with effective temperatures between 10,000 K and 20,000 K, and surface gravities between 7 cgs and 9 cgs.Cool and hot white dwarfs are shown, respectively, in red and blue.

Figure 3 .
Figure 3.Effect of applying the Pixel-Wise (PW) normalisation technique on a synthetic spectrum in the spectral window between 3,800 Å to 4,000 Å.This technique is designed to improve cecilia's learning capabilities by amplifying any small variations in the stellar flux and attenuating the largest fluctuations.Panels (a) and (b) show, respectively, the raw and PW-normalised synthetic flux of the spectrum.The inset plots show the region between 3,330 Å and 3,350 Å.

Figure 7 .
Figure 7. Mean Absolute Error (MAE) statistic (or loss function) for the Autoencoder, FCNN1, and FT FCNN2, averaged for the 29 windows of 200 Å used to train cecilia.The training and validation sets are shown in blue and red.The dotted and dashed lines denote an average MAE of 0.01 and 0.001, respectively.

Figure 8 .
Figure 8. Top Panel: Synthetic spectrum corresponding to a white dwarf with  eff =18,301 K and log g=8.5 cgs.To allow for a closer examination, we only show the wavelength range between 3,800 Å and 5,800 Å, which features 10 windows of 200 Å.Middle Top Panel: Denormalised cecilia FT FCNN2 prediction.Bottom Top Panel: Slope correction with a linear least-squares fit.In each panel, we use dashed vertical lines to indicate the start/end of a 200 Å spectral window.
t e x i t s h a 1 _ b a s e 6 4 = " s M E 5 G t r d j 1 c f e a u A k F V 5 O y p 1 a 7 4 c a F I m 5 9 G H e + j Z m 2 g o o e C P n 5 / n P I y R / F j G r j e R / O 0 v L K 6 t p 6 b i O / u b W 9 s 1 v Y 2 2 9 o m S h M 6 l g y q B u b U 9 6 G 8 P V T + L 9 o l F z / 1 A 1 u y s X K 5 f 0 8 j h w 4 B E f g B P j g D F T A N a i C O s D g D j y A J / D s j J 1 H 5 8 V 5 n b c u O Y s I D 8 C P c t 4 + A a u N k n Y = < / l a t e x i t > R = R obs < l a t e x i t s h a 1 _ b a s e 6 4 = " 5 h 4 s F X O i I r H 7 i 8 / 3 3 + W s 2 U z K W e w = " > A A A B 9 H i c d Z D L S g M x G I U z 9 V b r r e r S T b A A m a o A 0 w S M E D e A L P x p 3 x a L w Y r 4 v R g r F E e A B + y H j 7 B D h B k + U = < / l a t e x i t > [a]: Genest-Beaulieu & Bergeron (2019), [b]: Coutu et al. (2019), [c]: Wolff et al. (2002), [d]: Xu et al. (2013), [e]: Koester et al. (2005), [f ]: Xu et al. (2016), [g]: Fortin-Archambault et al. (2020), [h]: Desharnais et al. (2008

Figure 12 .
Figure12.Low-resolution SDSS spectrum of the He-rich, heavily polluted white dwarf WD 1232+563.The normalised   flux and its corresponding observational error are presented in grey and blue, respectively.The vertical lines denote the spectral windows of 200 Å used to train cecilia (see Section 4.2).The red shaded area covers the wavelength region excluded in our optimisation routine (see Section 5.2).

Figure 13 .
Figure 13.Normalised best-fit models for the SDSS spectrum of WD 1232+563 (in light blue).Panel (a): Best-fit mpfit models using an RV shift of 0 km/s and 94.42 km/s (in red and blue, respectively).The inset plot shows the spectral window between 3,800 Å and 4,000 Å, where there are two important Ca II lines at about 3,934 Å and 3,969 Å.The bottom plot shows the normalised flux residuals (Observed-Calculated; O-C) associated to each prediction.Panel (b):Best-fit edmcmc MCMC models using an RV shift of 0 km/s and 94.74 km/s (in green and orange, respectively).Similarly to Panel (a), the inset plot shows the spectral window between 3,800 Å and 4,000 Å, while the bottom panel shows the normalised O-C residuals for each spectroscopic solution.We note that the MCMC value of the RV shift is different from that reported in Table5because it also accounts for the barycentric velocity and gravitational redshift correction.

Figure 14 .
Figure14.A selection of spectral windows illustrating our best-fit, RV-shifted MCMC model (in green) for the raw SDSS spectrum of WD 1232+563 (in light blue with grey error bars).In each panel, we also show a grey shaded area bounded by an orange and red dashed line, which correspond, respectively, to a ±3 stat,MCMC variation on the best-fit model (using the same  eff and log g, and only changing the abundances of the detected elements by 3 stat,MCMC ).In black letters, we indicate some of the strongest spectral signatures of hydrogen, helium, and multiple heavy elements.Finally, we represent the start/end of a 200 Å spectral window with a dashed vertical line and a black inverted triangle symbol.

Figure 15 .
Figure 15.MCMC corner plot for the SDSS spectrum of WD 1232+563.The off-diagonal plots show the two-dimensional histograms of the posteriors of the model parameters, marginalised over the rest of parameter values.In contrast, the histograms along the diagonal illustrate their one-dimensional marginalised distributions, together with their corresponding 1 confidence intervals.The grey and red lines show the best-fit values of Xu et al. (2019) and Coutu et al.(2019), respectively, as defined in Table5.In this figure, we exclude the 4 heavy elements with chondritic priors (i.e.Be, Cr, Mn, and Ni) and show the RV shift without subtracting the barycentric motion and gravitational redshift of the white dwarf.

Figure 16 .
Figure 16.Pearson Correlation Coefficients (PC) for the stellar properties of WD 1232+563.Some label pairs (e.g. eff and log g,  eff and Ca, H and Mg) exhibit a strong positive correlation (PC≥0.5),as suggested by their slanted, oval-like corner plots in Figure 15.In general, however, most labels have a weak or moderate correlation (PC<0.5).

Table 2 .
Main properties of cecilia's ML architecture.Its three neural networks are trained sequentially, but only the final one (i.e. the FT FCNN2) is used to model the spectra of He-rich polluted white dwarfs.

Table 1 .
Asplund et al. (2009)t within the range given by -500≤RV≤500 km/s, our code automatically sets this parameter to 0 km/s.Alternatively, if any of the abundances are outside of their corresponding ML ranges, cecilia attempts to correct this problem by generating a normal distribution centered on the user's estimate with a scale of 0.1 dex, and subsequently adopting a random value from this distribution as an initial guess.If this modification is still insufficient, cecilia's Gaussian estimate is then substituted by a chondritic abundance ratio based on the assumed log 10 (Ca/He) of the white dwarf and the chondritic values ofAsplund et al. (2009).
• Scenario 2: The user has partial knowledge of some stellar parameters: In this intermediate scenario, our code starts by verifying that the user's initial guesses for  eff , log g, log 10 (H/He) and log 10 (Ca/He) are within cecilia's allowed bounds in Table ).Last MWDD Access: November 2023.
the spectra.For this task, we fixed  eff and log g to their true values,12and only fitted the 11 elemental abundances listed in Table1(i.e. from log 10

Table 5 .
MCMC best-fit stellar parameters for the SDSS spectrum of WD 1232+563, together with their uncertainties and assumed ground truths.The systematic ( sys,ML ) and statistical uncertainties ( stat,MCMC ) measure the errors associated to cecilia's spectral interpolation (see Section 5.1) and to the MCMC fit, respectively.For abundance studies of exoplanetary material, we can approximate the total uncertainty of a model parameter ( tot ) by computing their quadrature sum (see Eq. 27).In this work, if the total error of an abundance is lower than an SDSS noise floor of 0.15 dex, we assume that the MCMC statistical errors are underestimated and impose  tot ≈0.15 dex (see star ★ symbol).The dagger symbol ( †) flags the model parameters dominated by photometric priors ( eff , log g).In this table, we do not present the best-fit abundances of the four heavy elements with chondritic priors (Be, Cr, Mn, and Ni); these metals were included in cecilia's MCMC, but are not detected in the SDSS spectrum.Finally, we do not report the systematic errors for  eff and log g because these two parameters were fixed during our sensitivity study of synthetic spectra (see Section 5.1).References (Ref.): [1] Coutu et al. (2019), [2] Xu et al. (2019).