Estimating the timing of stillbirths in countries worldwide using a Bayesian hierarchical penalized splines regression model

Reducing the global burden of stillbirths is important to improving child and maternal health. Of interest is understanding patterns in the timing of stillbirths -- that is, whether they occur in the intra- or antepartum period -- because stillbirths that occur intrapartum are largely preventable. However, data availability on the timing of stillbirths is highly variable across the world, with low- and middle-income countries generally having few reliable observations. In this paper we develop a Bayesian penalized splines regression framework to estimate the proportion of stillbirths that are intrapartum for all countries worldwide. The model accounts for known relationships with neonatal mortality, pools information across geographic regions, incorporates different errors based on data attributes, and allows for data-driven temporal trends. A weighting procedure is proposed to account for unrepresentative subnational data. Results suggest that the intrapartum proportion is generally decreasing over time, but progress is slower in some regions, particularly Sub-Saharan Africa.


Introduction
Stillbirths represent a large share of the global burden on child and maternal health.Globally, in 2019 there were an estimated 2 million stillbirths (babies born with no sign of life at 28 weeks of pregnancy or later) (Hug et al. 2021).This represents a rate of around 14 stillbirths per 1000 total births, which is of comparable magnitude to the global neonatal mortality rate of around 17 deaths per 1000 live births (Hug et al. 2019).As such, there have been increased calls to monitor stillbirth outcomes in addition to infant and maternal deaths to fully understand the extent of the risks faced by different populations during pregnancy and childbirth (Lawn et al. 2016).The Every Newborn Action Plan (ENAP), which was endorsed by 194 WHO Member States, calls for each country to achieve a rate of 12 stillbirths or fewer per 1000 total births by 2030 (World Health Organization 2014).While progress has been made in reducing stillbirths, substantial inequalities across countries and regions still persist (Hug et al. 2020).
An important part of working towards the goal of ending preventable stillbirths is having reliable estimates of when stillbirths occur, that is, whether a stillbirth occurs before or after the onset of labor (ante-versus intrapartum).Stillbirths that occur intrapartum are likely to be preventable in most situations given adequate access to healthcare (De Bernis et al. 2016), and thus are likely to be responsive to increased resources and health policy changes.The goal of this project is thus to estimate the proportion of stillbirths that are intrapartum for all countries and regions worldwide, over the period 2000-2021.Challenges in estimation exist, particularly in low-and middle-income countries, due to data availability issues.Even when data do exist, the quality of reporting and type of population captured varies substantially by data source type and diagnosis method.An intrapartum stillbirth is defined as a fetal death occurring after the onset of labor and prior to delivery.The gold-standard classification of whether a fetus is alive after the onset of labor is the presence of a fetal heartbeat.However, in many contexts, the presence of a fetal heartbeat may not be documented and so diagnosis of the presence of life occurs through other methods postpartum; for example, fetuses who died antepartum can have skin changes consistent with maceration, tissue injury, meconium staining, and edema (Da Silva et al. 2016).However, these types of diagnosis methods are much more prone to misclassification compared to heartbeat diagnoses.In addition, different countries use different definitions of what constitutes a stillbirth, based on gestational age or birthweight thresholds (or a mixture of both).For international comparability, the WHO recommends using the cut-off of 1000g or more at birth (if available), or after 28 completed weeks of gestation (Hug et al. 2020).But different countries report stillbirths at different definitions, and definitions may vary subnationally by jurisdiction; for example in the United States, there are eight different definitions by combinations of gestational age and weight (Da Silva et al. 2016).In particular, many countries report stillbirths at 22 weeks gestation.An additional issue stems from differences in data collection systems.While most high-income countries collect information on stillbirths through civil registration and vital statistics (CRVS) systems, many other countries rely on other collection systems that are less reliable and cover only a small portion of the population, and some countries have no information available.
As such, a method to estimate the proportion of stillbirths that are intrapartum needs to account for a wide range of data reporting, coverage, and classification issues.However, previous regional estimates of intrapartum stillbirths rely on median proportion-based approaches, making no data adjustments and not accounting for uncertainty in data, estimates, or projections (Hug et al. 2020).
To overcome these issues, we formulate a Bayesian hierarchical penalized splines model that takes into consideration different types of data on stillbirth timing from a wide range of reporting and diagnosis systems.The model captures the underlying relationship between changes in the proportion of stillbirths that are intrapartum and the neonatal mortality rate over time, which allows for estimates and projections of the intrapartum proportion to be made even in contexts where available data are limited.Additionally, the model allows for data-driven trends through the use of a penalized splines regression framework.The model is also informed by high-quality auxiliary data to adjust for different gestational age definitions, and allows for varying amounts of uncertainty around data from different sources.We also propose a post-estimation weighting scheme to account for varying levels of coverage in the observed data.
The remainder of the paper is structured as follows.Firstly, we describe data sources and availability, before introducing the modeling framework and presenting some results and validation.We conclude with a discussion of possible extensions to the model.

Data
Data on the number of stillbirths by timing (intrapartum versus antepartum) can come from a number of different sources.Data from civil registration and vital statistics (CRVS) systems generally has national coverage and are assumed to be relatively high quality compared to other sources.Data on stillbirth timing also come from health management information systems (HMIS) (such as DHIS2), which are commonly used in low-and middle-income countries to record information from a number of health facilities (Moxon et al. 2015).Subnational data are sourced from either health facility data or population-based studies, most notably through the Global Network Study, which collects data on stillbirths and neonatal deaths in multiple communities across several different countries (Frøen et al. 2016).Data were extracted and collated through a number of channels, including web-based searches of national statistics' offices, UNICEF country consultations, and literature searches.
Overall, a least one observation of the number of stillbirths that occurred intrapartum and antepartum was available for 92 countries across the period 2000-2020.Table 1 shows the breakdown of data availability by Sustainable Development Goal (SDG) region, suggesting that the most observations are available in the high-income country group.

Additional data sources
In addition to data on the number of intrapartum and antepartum stillbirths, we use several other data sources and estimates to help inform the model and calculate the eventual proportion of stillbirths that are intrapartum (IPSB) at various regional and global aggregations.
Firstly, we use estimates of the neonatal mortality rate (NMR), which is the number of deaths in the neonatal period (first 28 days of life) per 1000 live births, for every country and year in the estimation period of interest.NMR estimates are produced by the UN Interagency Group on Mortality Estimation (UN IGME) as part of annual SDG reporting efforts.Details of the estimates and estimation can be found in UN IGME (2021).In particular, we use the full posterior samples of NMR for each country-year in order to reflect and propagate the uncertainty in the estimation process.
We also draw upon estimates of the total stillbirth rate (SBR), which is defined as the number of babies born with no sign of life at 28 weeks or more of gestation, per 1,000 total births.These estimates are used to quantify the coverage of data sources within a country, and to weight countrylevel estimates of IPSB to get regional estimates of IPSB.Similarly to the NMR, we use UN IGME estimates (Hug et al. 2020) and use the full posterior samples to better propagate the uncertainty in the SBR estimation process.
As shown in Table 3, data on stillbirths by timing is available at varying gestational age definitions.
In order to inform gestational age adjustments in the estimation process, we use data from Euro-Peristat (Gissler et al. 2010).The data provided contain intrapartum and antepartum stillbirth counts in 2015 for 17 high-income European countries corresponding to both types of gestational age definitions.For confidentiality reasons, country names in these data are not given.Additionally, to help inform the gestational age adjustment for lower income countries, we draw on data from the Global Health Network (Frøen et al. 2016), which gives information on stillbirths by timing for eight low-and middle-income countries1 at both early and late gestational age definitions.
3 Model framework

Overview
The varying type, quality, and coverage of data on stillbirths by timing that are available across different countries suggests the need for an estimation process that accounts and adjusts for various data characteristics.Firstly, there is a large degree of missingness in many countries across the time period of interest , with many countries only having one observation over the whole period, and more than half of the countries having no observations at all.As such, a hierarchical model, which allows for information exchange across countries within regions may be appropriate for this context.The lack of available data also suggests that one or more covariates may be useful to obtain reasonable trends over time.Secondly, the data that do exist come from a wide range of data collection systems, which are likely to have different levels of coverage of the overall population of interest, and also measurement error in the classification of intrapartum versus antepartum timing.
This suggests a need for allowing for different magnitudes of error around observations in a data (or measurement error) model, and also potentially allow for observations to be 'down-weighted' if they come from a data system that only covers a small portion of total stillbirths.Finally, data are reported using different definitions (early or late stillbirths).As we expect early stillbirths to have a higher proportion of fetal deaths occurring in the intrapartum period (Getahun et al. 2007), the estimation process should account for these definitional issues.
To address these issues, we propose a Bayesian hierarchical penalized splines regression model to estimate the IPSB levels in each country and trends over the period 2000-2021.In this section we describe the model for place-specific IPSB, the weighting process to obtain country-level estimates, and how regional estimates are derived.

Model for place-level IPSB
We consider data on intrapartum and antepartum stillbirths as available for a specific 'place.'In many cases, a 'place' refers to data for a whole country (e.g.data from a nationally-representative CRVS system), but in other contexts, a 'place' may refer to a subnational region (such as a state or province) or a single health facility.In our approach, we explicitly model the data at the 'place' level, and then reweight the estimates in a second step to obtain country-level estimates.
For data points i = 1, . . ., N , let y i and z i denote the the number of observed intrapartum and antepartum stillbirths respectively.Then where y i + z i represents the total number of classified stillbirths2 and the parameter φ i represents the proportion of intrapartum stillbirths.The proportion φ i is modeled where µ i describes the "true" inverse-logit transformed proportion for the population, and ε i is an error term.The variance σ 2 ε,s [i] of the error term depends on the type of data system of the observation which are expected to vary in reporting quality.Observations from CRVS systems are expected to be the most reliable, and are assigned σ ε,s[i] = 0. Observations from other types of systems, such as those from subnational health facilities, DHIS/HMIS, or population studies are expected to be less reliable and for these methods the variance terms are estimated from data with half-Normal priors: The parameter µ i represents the mean for the place-level observation i, and is estimated as where β 0 is a global intercept, β r[i] is a region effect for region r, grouped according to the UN Sustainable Development Goals (SDG) regional groupings, and β c[i] is a country effect for country c.
These are modeled as where σ βr and σ βc are variance terms to be estimated.

The parameters β p[i]
represent place-level effects which are specific to a population or subnational region p as specified by descriptions in the data.Sub-populations may vary in size depending on data sources within a country.For instance, one sub-population may be only the births at a specific health facility, while another may be all births at government facilities.We assume where σ βp is a variance term to be estimated.For countries where only one sub-population appears in the data, we set β p = 0 for identifiability with the country effect β c .
We use estimates of the (log) NMR as a covariate in the model.This is motivated by the strong empirical relationship observed between the logit IPSB and log NMR (as shown in Figure 1), and also by the fact that we expect the proportion of stillbirths that are intrapartum to be positively correlated with NMR, as both outcomes are likely to decline in response to improved medical conditions during birth (Joyce et al. 2004).The term β NMR represents the slope of the effect and NMR c[i],t [i] refers to the UN IGME point estimate of the national NMR of country c and time t of the observation.
To allow for data-driven trends in addition to trends based on changes in country-level NMR, we include a place-time specific component η p,t , which is modelled using a first-order penalized splines set up.For each place p, η p,t is defined where k h (t) denotes the h th spline function evaluated at time t and α h,p denotes its coefficient to be estimated.We use cubic B-splines with knots placed at integer year values.First-order differences in the coefficients, denoted ∆ h,p , are penalized to ensure a level of smoothness in the resulting fit: where σ ∆ is an estimated variance term, and the coefficients are constrained to sum to zero are not anonymized, counts that correspond to the 28 week definition are already used to inform the country intrapartum stillbirth proportions.We therefore model these observations to inform only γ as follows.Let ẏc,g and żc,g denote respectively intrapartum and antepartum stillbirth counts for some country c using gestational age definition g.The counts are then modeled ẏc,g |ρ c,g ∼ Binomial( ẏc,g + żc,g , ρ c,g ) ( 14) where ν c,g is given a vague prior and represents the mean under the late gestational age definition.
The difference between the proportions in the early and late definitions is therefore captured by the adjustment factor γ early,m[c] , where m[c] denotes the income group of country c.

Overview
Since observations may only pertain to a specific context or geography (e.g.certain health facilities or subnational region), we do not assume that patterns in the observations generalize to parts of the full national population.We therefore apply a post-estimation weighting step to construct a national estimate as a weighted average of place-level estimates.
Ideally, data pertaining to some place p would cover the entire country, and capture timing information about all known stillbirths.In this case, the country-level estimate of the IPSB, φc,t , would just equal the place-level estimate φp,t .In the case where the the entire population of stillbirths is instead reported across multiple places, we would like to know the true proportion w p of the country's stillbirths belonging to each place to use as weights.In such a scenario a reasonable estimate of the national rate φ c,t would simply be those weights applied to the place-level rates, φc,t = In practice however, the observed sub-populations are not in general exhaustive (i.e.w p < 1), meaning that there are stillbirths that are not captured in any of the data sources for the country.
To account for this remainder, we add an "unobserved" component so that the final estimate is Here, φobs p,t is the estimate specific to place p, that is, using the estimate of its intercept and time trend.To construct φunobs c,t we assume a generic place in country c with unknown intercept and time trend.This assumes that the unobserved stillbirths are centered at the prediction based on its SDG region, NMR level, and estimated country intercept, with additional uncertainty based on the estimated between-sub-population variation and temporal variation.Details on these components are given in the following sections.

Construction of weights
In practice, the true weights w p are unknown.We instead construct an estimate ŵp as the ratio of the number of observed classified stillbirths in place p to the number of total stillbirths expected nationally, based on UN IGME estimates of overall stillbirths.Under this setup, the estimates for countries at the extremes of data quality are similar to those which would arise from a more conventional hierarchical model where data are modeled at the country level.For instance, if a country reports a single data source with full coverage (e.g.high quality CRVS), then there is only one place p (which accounts for all stillbirths in the country), so w p = 1, and the national estimate φc,t is simply the estimate φp,t informed by that data source.On the other hand, if a country has no data, then the entire estimate consists of φunobs c,t , and is informed entirely by the region and NMR.
To construct the weights for the observed place components, first let s i = y i + z i denote the sum of observed stillbirths classified as intrapartum or antepartum in observation n.
denote the estimate of total stillbirths from the UN IGME total stillbirth rate model in the country c and year t corresponding to the observation.To reflect uncertainty in the number of stillbirths, we directly use posterior samples of Si when computing our own posterior samples.
For a place p, we construct its weight ŵp as ratio of the sum of observed classified stillbirths in that sub-population, i:p[i]=p s i , to the sum of estimated stillbirths nationally in the country-years of those observations i:p[i]=p Si : Terms in the denominator are scaled proportionally for partial year observations.For example, if an data point refers only to one third of the year 2016, then we only add S c[i],t[i]=2016 /3 to the denominator for this observation.
For a country c, we sum the weights of its observed sub-populations to give the weight given to the observed population ŵc = p:c[p]=c ŵp .If this quantity exceeds 1, we downscale the weights to add to 1.The weight assigned to the unobserved portion is the remainder 1 − ŵc .
The final estimate of the intrapartum proportion φc,t for a country is a weighting of observed sub-population components φobs p,t and an unobserved component φunobs

Estimate for observed component
The sub-population weights w p are applied to sub-population estimates, which are calculated from the estimated parameters, We directly use posterior draws of the country-year neonatal mortality estimates from UN IGME, denoted ÑMR c,t , to incorporate appropriate uncertainty about the covariate.

Estimate for unobserved component
The "unobserved" component for a country is centered at the estimate given its region and country intercepts and NMR level: where β0 , βr[c] , and βNMR are the usual parameters estimated from data, whereas βpc and ηc,t are new realizations of the sub-population effect and time trend to reflect appropriate uncertainty about the unobserved population.
We assume that the unobserved component consists of a single sub-population.For the place effect The new realization of the time trend, denoted ηc,t , is generated according to the distribution ∆c

Obtaining regional-level estimates
The intrapartum stillbirth proportion for a region is obtained by multiplying the country-level intrapartum stillbirth proportions by total stillbirth counts, then aggregating at the region level and recalculating the proportion.In particular, if φc,t denotes the estimate for country c at time t, and Sc,t denotes the corresponding total stillbirth count, then φc,t Sc,t is our estimate of the number of intrapartum stillbirths.We obtain the intrapartum stillbirth proportion for a region r by calculating φr,t = c:r[c]=r φc,t Sc,t where the numerator represents the sum of intrapartum stillbirths in countries in region r, and the denominator represents all stillbirths in countries in region r.Here we again directly use posterior samples of the total stillbirth counts Sc,t when computing our own posterior samples.

Computation
We obtain samples of posterior distributions of parameters using Hamiltonian Monte Carlo (HMC), a type of Markov Chain Monte Carlo sampling, implemented in Stan using the cmdstanr interface (Gabry & Češnovar 2022).For each of 4 HMC chains, we perform 1000 warmup iterations before taking the next 1000 post-warmup iterations as our posterior samples.We monitor for convergence using standard diagnostic measures.R-hat ( R) values for all parameters are under 1.02 (Stan Development Team 2019), and visual inspection of traceplots do not indicate sampling pathologies.

Results
In this section we illustrate regional estimates of the proportion of stillbirths that are intrapartum (IPSB) over the period of interest.Additionally, we demonstrate the impact of several key components of the modeling process on country-level estimates of IPSB.
Figure 2 shows regional estimates of IPSB, with the shaded areas representing 90% credible intervals.
There is substantial regional variation in the level and trends of IPSB.Australia and New Zealand, and Europe, have low and declining IPSB of around 10%, while regions such as Northern Africa and Sub-Saharan Africa have IPSBs of 40-50% and show little evidence of decline at the regional level over the time period.Eastern Asia shows evidence of the most rapid decline since 2000, falling from around 40% to 12% over the 20-year period.The largest uncertainty in estimates is in the Oceania region, reflecting the lack of available data in this region.In Figure 4, we illustrate the effect on estimates of allowing non-sampling error to vary by data source type for two countries with different types of data available.In the figure, estimates shown in blue in the right-hand column are based on a model with no non-sampling error estimated, while estimates shown in red in the left-hand side are the final estimates from the full model.For Country 'B,' which has data from a national high-quality vital registration system, there is no effect of non-sampling error on the estimates, as this type of error is assumed to be zero for such data sources.In contrast, Country 'C' has data from several subnational sources, both from HMIS and population-based studies.As these subnational sources have additional error that is estimated to be non-zero, the resulting estimates over time are smoother, and in particular are less influenced by the particularly high observations around 2009.
Finally we illustrate the impact of the post-estimation weighting scheme on country-level estimates for three countries with different data availability situations (Figure 5).In the "No weighting" setup on the right, we use a more conventional hierarchical setup in which the β p terms are still estimated when there are multiple data sources, but are not included when presenting the national estimates.
Instead of place-level time trends, a single national time trend is shared across sub-populations.Country 'D' has data that come from a CRVS system and capture all stillbirths in the country.
As such, the estimated weights for that data series are approximately 1 and the post-estimation weighting scheme has no effect on the final estimates.In contrast Countries 'E' and 'F' have one or more subnational data sources, which capture well below the estimated total number of stillbirths.
The final estimates that include the re-weighting step thus indicate a higher level of uncertainty around the IPSB levels.

Validation
We evaluated the model using two out-of-sample prediction exercises.In the first exercise we fit the model using observations before 2017 as the training set (which represent 78% of all observations), leaving out observations from 2017 and later to use as a test set.In the second exercise we perform a 10-fold cross-validation.Results from the validation exercises are given below.We calculate mean absolute error as the average of the absolute differences between posterior predicted medians and observed IPSB proportions in held-out data points, and prediction interval coverage as the proportion of observed IPSB proportions which fall within their respective posterior prediction interval.Table 4 shows validation metrics from the first prediction exercise, and Table 5 shows validation metrics from 10-fold cross validation.Mean absolute error in both validation exercises is under 5%, and overall prediction interval coverage in both cases is close to, but slightly lower than nominal, which suggests a slight underestimation of the variability in the data.
There are some apparent regional differences in the performance of the model.In the Oceania region, there is only one data point in 2018 which also represents the only available observation in that country, resulting in high error in prediction in this region.In the Northern Africa and Western Asia region, prediction coverage is low, possibly due to relatively high between-and within-country variability.Particularly poor performance here in the first validation exercise is partially due to a series of data points for a country in this region which deviate from the NMR trend beyond what is expected from the estimated spline component.Finally, high error in the point estimates for Central and Southern Asia region can be partially explained by a relatively high proportion of low-quality subnational observations which are difficult to predict.The reasonably high prediction interval coverage, however, suggests that variability and uncertainty are well calibrated for observations in rural area, for example.In future work, we plan to investigate the construction of these weights in more depth.For instance, in some cases we may have detailed enough information about the source of a subpopulation data series to infer the types of women giving birth that were captured in the data, particularly if the geographic location of the observations is known and can be matched with census or survey data.In these cases, a post-stratification based approach may be possible, to more adequately control for the non-representativeness of different subpopulation samples.
Currently, estimates of stillbirth by timing and the total stillbirth rate are performed separately, with the total stillbirth rate being estimated using a Bayesian sparse regression model with temporal smoothing (Wang et al. 2022).As detailed above, we combine our estimates of the intrapartum proportion with posterior samples of total stillbirths to obtain regional estimates and associated uncertainty.Future work could investigate the possibility of estimating both quantities in the same modeling framework.As there is generally more data available for total stillbirth rates, combining the estimation of two approaches may help to inform estimates by timing, and would also help to better incorporate uncertainty from different sources (Schumacher et al. 2022).
Stillbirths have traditionally been overlooked by demographers and public health professionals, with more emphasis being placed on improving infant and maternal survival outcomes.But as substantial health inequalities persist across regions of the world, there have been calls to monitor stillbirths, working towards the goal of ending all preventable stillbirths.From a population dynamics perspective, there has been increased work in understanding fetal outcomes in conjunction with maternal and infant outcomes, in order to allow for a more complete assessment of mortality conditions and improvement (Hathi 2022).The estimation work presented here aimed to improve estimates of stillbirths by timing, but by doing so highlighted the vast array of data availability and quality issues in this area.Future resource efforts should not only focus on improving access to essential healthcare during pregnancy and childbirth, but also in helping national stakeholders to improve data systems to collect reliable information in order to help those most in need.

Figure 1 :
Figure 1: Observed proportion of stillbirths that are intrapartum (IPSB) (logit scale) versus log of the neonatal mortality rate.Colors represent different SDG regions.

Figure 2 :
Figure 2: Regional estimates of the proportion of stillbirths that are intrapartum.

Figure 2
Figure 2 shows the impact of the inclusion of the gestational age adjustment in the model.Country 'A' shown only has data available on the timing of stillbirths based on the early (22 weeks gestation)

Figure 3 :
Figure 3: Effect of gestational age adjustment on final estimates for country 'A'.

Figure 4 :
Figure4: Effect of allowing for non-sampling error to vary by source type on final estimates for country 'B' (which has vital registration data) and country 'C' (which has a mix of subnational data available).Points that report on the same place or sub-population are connected by lines.

Figure 5 :
Figure 5: Effect of weighting scheme on final estimates, for countries with range of different data availability situations.

Table 1 :
Data availability by SDG region

)
Finally, γ g,w is an adjustment for the gestational age definition g used in the observation (g = early or g = late).Separate adjustment factors are estimated for two income groups: m = high income countries (HIC) and m = low-middle income countries (LMIC).Since the final desired estimate of the intrapartum stillbirth proportion is with respect to a 28-week gestational age definition of stillbirth, we set γ g=late,m = 0 for observations which use a 28-week definition, or comparable "late" definition of stillbirth, and estimate adjustment factors for γ g=early,m to accommodate observations that use a 22-week definition or comparable "early" definition.The adjustment factors are additionally informed by supplementary datasets for which timing-specific stillbirth counts are available under different gestational age definitions.In particular, the high income group is informed by anonymized country-level data from Euro-Peristat.Country names in these data are not available and therefore cannot be joined with the NMR covariate to inform the region or country-level estimates directly.For the low-income group, we incorporate data from the Global Network Maternal Newborn Health Registry, for which timing-specific stillbirth counts according to 22-week and 28-week definitions are typically available.While these latter observations

Table 4 :
Model evaluation metrics using 2000-2016 data as a training set and data from 2017 onward as a test set.

Table 5 :
Model evaluation metrics from 10-fold cross validation.