-
PDF
- Split View
-
Views
-
Cite
Cite
Emanuele Aliverti, Stefano Mazzuco, Bruno Scarpa, Dynamic Modelling of Mortality Via Mixtures of Skewed Distribution Functions, Journal of the Royal Statistical Society Series A: Statistics in Society, Volume 185, Issue 3, July 2022, Pages 1030–1048, https://doi.org/10.1111/rssa.12808
- Share Icon Share
Abstract
There has been growing interest on forecasting mortality. In this article, we propose a novel dynamic Bayesian approach for modelling and forecasting the age-at-death distribution, focusing on a three-component mixture of a Dirac mass, a Gaussian distribution and a skew-normal distribution. According to the specified model, the age-at-death distribution is characterized via seven parameters corresponding to the main aspects of infant, adult and old-age mortality. The proposed approach focuses on coherent modelling of multiple countries, and following a Bayesian approach to inference we allow to borrow information across populations and to shrink parameters towards a common mean level, implicitly penalizing diverging scenarios. Dynamic modelling across years is induced through an hierarchical dynamic prior distribution that allows to characterize the temporal evolution of each mortality component and to forecast the age-at-death distribution. Empirical results on multiple countries indicate that the proposed approach outperforms popular methods for forecasting mortality, providing interpretable insights on its evolution.
1 Introduction
In the last decades, changes in life expectancy and population growths have stimulated growing interest in sophisticated models for mortality (e.g. Booth, 2020; Zanotto et al., 2020). These approaches characterize the observed age-specific regularities of the mortality function and their trends over time, providing a parsimonious representation of the components of mortality and allowing to provide forecasts. Estimates of past mortality trends provide insights on the global health of a population, describe relevant aspects of its demographic structure and facilitate comparisons across countries. Similarly, probabilistic forecasts are routinely employed by researchers, practitioners and organizations to aid and support welfare planning strategies such as pension funds, insurances and health systems, among many others (e.g. Lutz & Samir, 2010).
Human mortality has been generally characterized by several quantities, with the hazard function, the survival function and the probability density function being the most popular ones (e.g. Klein & Moeschberger, 2006). Although each representation can be directly obtained from the others, their interpretations are complementary and highlight more clearly different aspects of mortality (e.g. Basellini & Camarda, 2019). In this article we will focus on modelling the distribution of the ages at death, generally referred to as ‘age-at-death distribution’ and corresponding to the density function that characterizes mortality (Keyfitz, 2005). In the literature on modelling and forecasting mortality, this quantity has received significantly less attention than other quantities such as mortality rates (e.g. Lee & Carter, 1992; Li & Lee, 2005). However, as outlined in Basellini and Camarda (2020) and Pascariu et al. (2019), analysis of the age-at-death distribution can be incredibly useful, since it provides a direct evaluation of longevity and lifespan variability that cannot be directly inferred from other quantities (see also Canudas-Romo, 2010; Cheung et al., 2005).
To further see that, Figure 1 illustrates the age-specific distribution of deaths from four illustrative countries, over a time period that ranges from 1970 to 2015. These data highlight crucial aspects of the temporal evolution of mortality, such as the global shift of the curve to older ages—due to the increase in life expectancy—and the drop of perinatal and infant mortality. Figure 1 also suggests that such an evolution has characterized countries differently, resulting in faster changes in some countries (e.g. Italy) and slower in others (e.g. Sweden). Furthermore, the age-specific distribution of deaths illustrates that several mortality trends are converging, resulting in similar patterns in terms of modal age at death, infant mortality and compression of old-age mortality.

Age-at-death distribution across Italy, France, Sweden and Norway from 1970 to 2015
The first approaches for modelling the distribution of deaths date back to the 19-th century (e.g. Gompertz, 1825; Makeham, 1860; Pearson, 1897), and have been generalized in the following century (Heligman & Pollard, 1980; Siler, 1983). The main interest of such methods has been on characterizing mortality patterns across ages, focusing on a given country during a specific year. According to these approaches, the mortality curve can be characterized using different parametric specifications, accounting for distinct aspects of the mortality curve; see also Dellaportas et al. (2001) and Mazzuco et al. (2018). One notable limitation of these approaches is the focus on single-country modelling; when interest is on forecasting mortality for multiple countries, independent country-specific analysis is required. This strategy ignores the fact that in most developed nations similar trends are observed, and this abundance of information can be appropriately included in the model to improve the quality of the estimates. Joint modelling of multiple nations, in fact, is expected to improve the estimation of country-specific parameters and temporal trends, allowing to share information across different countries.
More recently, researchers have focused on explicitly modelling the temporal evolution of the age-at-death distribution, forecasting the parameters using time-series specifications (Basellini & Camarda, 2019; Pascariu et al., 2019). These methods borrow ideas from death-rate forecasting, such as the milestone approach introduced by Lee and Carter (1992) and sequent developments (e.g. Hyndman & Ullah, 2007). Potentially, these models can be both used to describe the evolution of mortality in the past years and to provide forecasts, although the focus of applications is more frequently on forecasting only. However, separate forecasts often lead to unreasonable results, with country-specific or sex-specific predictions that diverge substantially across groups. Such divergent forecasts across sub-populations are implausible, since industrialized countries have been showing global convergence trends in terms of mortality levels, and this tendency is expected to endure (Bergeron-Boucher et al., 2017; Oeppen, 2008; Wilson, 2001).
Following the breakthrough approach of Li and Lee (2005), different authors have focused on ‘coherent’ mortality forecasts, defined as predictions that do not diverge across groups. Such a strategy has constantly received attention in the past years, being successfully implemented in several models for death rates; for example, generalizing compositional-data techniques (Bergeron-Boucher et al., 2017) and functional-data models (Hyndman et al., 2013) for mortality. However, a critical issue with these methods concerns with the choice of a population standard to be used as a reference. This choice is substantially arbitrary, and a common consensus on which sub-population should be used is still lacking, since results are often sensitive to different references (Booth, 2020; Kjærgaard et al., 2016).
Motivated by the above discussion, in this article we propose a novel approach for modelling and forecasting the age-at-death distributions of multiple countries. We will focus on the age distribution of deaths, combining the life table quantities with the total number of deaths. This procedure allows to naturally take into account the population size of each country and its age structure, which are important determinants of the age-at-death distribution. The proposed methods can also be applied, without modifications, directly to the life table quantities, where we can assume that data are provided on artificial populations with fixed population size. However, this would discard an important source of uncertainty, which corresponds to the population size of a specific nation. Indeed, imposing a fixed same sample size weights each nation equally, implying that a relatively small country (e.g. Sweden) would have the same influence as a much bigger country (e.g. Germany or the United Kingdom). Therefore, we prefer to use the actual sample size of each county to have a proper quantification of uncertainty and genuinely account for different population structures.
We propose a method based on a mixture distribution function of a Dirac mass, a Gaussian distribution and a skew-normal distribution, in order to characterize infant, adult and old-age mortality with an interpretable set of parameters. Following a Bayesian approach to inference, we specify hierarchical priors for the dynamic parameters and the country-specific coefficients, relying on informative common prior specifications to improve the borrowing of information across nations. Compared to the available techniques, our method allows to model directly the temporal evolution and the country-specific variations across multiple sub-populations, avoiding sequential procedures and providing estimates shrunk towards common levels. Under the proposed methods, coherent predictions are naturally obtained without selecting a reference population or an external standard, but instead estimating these quantities from the available data. We will refer to the proposed dynamic model based on a skewed distribution function as dysm in the sequel.
2 Methods
2.1 Model specification
Our interest is on providing a flexible representation of the age-at-death distribution in multiple countries, dynamically across different years or time periods. We will characterize this quantity through a continuous probability density function (pdf) which parametrizes the probability of dying at age x—based on the individuals current living at age x—in country j during year t via a set of country- and time-specific parameters , with x = 0, …, 110, j = 1, …, p countries and t = 1, …, T years. Although f(x;·) is often specified as a continuous function, in practice we focus on characterizing the age-at-death distribution on a discrete grid, corresponding to the probability of dying at a specific age x = 0, …, 110. Such an aim can be addressed discretizing over a set of disjoint intervals (x − 1/2, x + 1/2] as
where F denotes the cumulative distribution function (cdf) associated with f. This characterization of probabilities via a discretization of an underlying pdf automatically guarantees that the vector with elements as in Equation 1 is a proper probability vector with and , therefore avoiding further restrictions (e.g. Oeppen, 2008); see the left panel of Figure 2 for a graphical representation of the discretization process.

Proposed mixture density function. Panel (a) provides a sketch of the discretization of the pdf into a set of probabilities , with j = 57, …, 60. Panel (b) highlights the three components of the mixture density function
We denote as the number of deaths and as the death probabilities, where elements and denote, respectively, the number of deaths and the probability of dying at age x—based on the current individuals living at age x—in country j during year t. Given the total number of deaths and the probabilities outlined in Equation 1, we model the vector as a multinomial distribution with
A crucial aspect of this approach involves the specification of f(x;·), which induces a model on the vector of probabilities . In practice, different parametric models for f(x;·) are available in the literature, with the Gompertz (1825) and Siler (1983) models being popular options. The age-at-death distribution f(x;·) could be also modelled semi-parametrically, using for example splines or other basis functions (Basellini & Camarda, 2020; Hyndman et al., 2013). However, when interest is explicitly on characterizing relevant aspects of mortality—such as the impact of infant mortality or the shape of old-age mortality—parametric models provide a much clearer interpretation and deeper insights on the structure of the populations under investigation. In addition, mortality can be naturally divided into different components, accounting for deaths at different phases of life (e.g. Lexis, 1879; Pearson, 1897). Therefore, it is desirable to rely on a specification which preserves such a decomposition, for example using mixtures of distributions (Carriere, 1992; Mazzuco et al., 2018).
In this work, we draw inspiration from Mazzuco et al. (2018) and Zanotto et al. (2020) and rely on a mixture of three components to characterize mortality; see also Carriere (1992) for related arguments. Specifically, in order to facilitate model's interpretation, we decompose mortality as the contribution of infant mortality, adult mortality and old-age mortality. Infant mortality is accounted through a single Dirac mass at age 0, while adult mortality is modelled via a Gaussian distribution with mean and standard deviation . Old-age mortality, instead, is characterized using a skew-normal distribution (e.g. Azzalini & Capitanio, 2013) with location parameter , scale parameter and shape parameter . The impact of each component to the overall mortality function is accounted assigning mixture weights such that in order to guarantee that the resulting expression remains a proper pdf. This choice leads to the following characterization of
with , denoting the indicator function and φ(·), Φ(·) denoting the pdf and the cdf of a standard Gaussian, respectively; we omit the parameter from since due to the constraint on the mixture weights; see the right panel of Figure 2 for a graphical representation of the proposed model for the age-at-death distribution.
Equation (3) allows to characterize the age-at-death distribution with flexibility, while also retaining interpretability of the mortality structure. The first component of Equation (3) accounts for infant and perinatal mortality, characterizing deaths in the first year of life. As outlined, for example, in Figure 1, perinatal mortality is notably decreased during recent years, mainly due to technological advances (e.g. Wilson, 2001; Zanotto et al., 2020); the mixture weight characterizes the impact of this component in the overall mortality. The use of a Dirac mass is particularly adequate for developed countries, where infant mortality occurs during the first year of life and can be properly modelled with a single mass, avoiding more elaborate specifications and additional parameters. However, dysm can be naturally adapted to countries with high infant-mortality levels, replacing the Dirac mass with an half-normal distribution (Zanotto et al., 2020) and leveraging on the same model architecture; in Section 3.4, we compare such a specification with the proposed model. The Gaussian component, instead, is assigned to a relative weight and characterizes adult mortality via the location parameter and the scale parameters . In addition, these quantities correspond also to the moments of the Gaussian component, and therefore they can be interpreted directly as the mean and standard deviation of the age at adult mortality.
The skew-normal component, lastly, accounts for the asymmetric shape of old-age mortality, which has been frequently observed in developed countries (e.g. Mazzuco et al., 2018; Pascariu et al., 2019). Since most deaths occur at old ages, this component is expected to be associated with the largest mixture weight . Inference on the structure of old-age mortality rely on such mixture weight and on the skew-normal parameters and , providing insights on the evolution of old-age mortality in terms of location, scale and shape of such a component. In practice, it can be convenient to focus on the mean and standard deviation of old-age mortality, which are more interpretable measures of the shift and compression of the age-at-death distribution. According to the proposed parametrization, these quantities can be conveniently expressed as
with (e.g. Azzalini & Capitanio, 2013, section 2.1.4). Alternatively, it is also possible to parameterize directly the skew-normal distribution in terms of its cumulants (e.g. Azzalini & Capitanio, 2013, section 3.1.4). However, for simplicity in implementation and exposition, we prefer to work with the outlined parameters, and eventually post-process the estimates to focus on the functionals of interest.
2.2 Hierarchical dynamic model
The multinomial model outlined in Expression 2, with probabilities given by Equations (1) and (3), provides a flexible specification for the distribution of the number of deaths occurred in a single country j during a specific year t. In order to account for the temporal variation of the age-at-death distribution, we model explicitly the dynamic evolution of the parameters through an hierarchical dynamic model. This strategy will allow us to highlight what aspects of mortality have changed in the past years, characterizing the rates of such changes and facilitating comparison across different countries. In addition, the inclusion of a dynamic component facilitates the developments of mortality forecasts extrapolating parameters in time, obtaining predictions for the future age-at-death distributions.
We specify the hierarchical model in an unconstrained domain, transforming the parameters of Equation (3). This choice facilitates computation and implementations, allowing to rely on Gaussian dynamic models in the transformed space and avoiding restrictions on the support of the parameters. For this purpose, we introduce the reparametrized vector , with elements corresponding to
In Equation (5), the mixture weights are mapped into leveraging a cumulative logit transformation, while and — corresponding to adult and old-age mortality scale parameters—are transformed into and , respectively, via a logarithmic transformations. Note that each for k = 1, …, 7, and that the vector effectively corresponds to a one-to-one reparametrization of .
We model the dynamic evolution of the elements leveraging a random-walk model with drift. This choice leads to the following specification.
In Equation (6), denotes the initial condition of , and it is assigned to a Gaussian distribution with mean and standard deviation . These values correspond to the initial information for the dynamic hierarchical model, and characterize a probabilistic representation of our beliefs before the data are observed (e.g. West & Harrison, 1997). The drift parameter measures the expected difference across consecutive values of and accounts for possible increasing or decreasing linear trends in the evolution of the parameters. The inclusion of this component is particularly relevant in our problem, where we expected to observe clear and important trends in the evolution of mortality, accounted by the parameters charactering the age-at-death distribution. Conditionally on the value at time (t − 1), each is recursively distributed as a Gaussian distribution with mean , and standard deviation . Overall, we can interpret dysm as a Multinomial state-space model, where Equation (6) controls the evolution of the latent states via random walks with Gaussian innovations and, conditional on the states, the observed vectors of deaths are modelled as independent Multinomial variables with probabilities given by Equations (1) and (3); see Figure 3 for a graphical representation of the proposed hierarchical dynamic model.

Graphical illustration of the dynamic component of dysm. Solid circles, white squares and grey squares denote vectors of dynamic parameters, probabilities characterizing the age-at-death distribution and observed number of deaths respectively
Although this dynamic model might seem oversimplistic, we found that such a specification leads to good results in a large variety of applications involving developed countries; see Section 3 for further details. More elaborate extensions are possible, for example including a moving-average component, increasing the lag of the autoregressive terms or joint modelling multiple components of via multivariate random walks. However, a random walk with drift on the latent states often characterizes well many phenomena without over-parameterizing the dynamic component, and is also general enough to be directly used, without further modifications, in a large variety of settings; see Durbin and Koopman (2012) and West and Harrison (1997) for related arguments and Section 3.4 for a comparison with alternative specifications.
Additionally, we highlight that Equation (6) is valid for any equally spaced time grid, and therefore the proposed approach can be also applied to different time scales. In fact, our description focuses on yearly data for simplicity in exposition, but this requirement is not necessary for developing the proposed model in practice. Several important settings involve analysis of mortality across different time scales, such as monthly data or multi-year periods; for example, when interest is on modelling short-term variations due to extraordinary events (e.g. covid-19 pandemic) or large-scale trends, respectively. The general structure of dysm facilitates its use in these different settings, without modifying the model specification.
2.3 Bayesian inference
As discussed in Section 1, a crucial aspect of mortality forecasting involves the penalization of diverging scenarios via coherent modelling. This can be obtained explicitly including a common trend across countries (e.g. Li, 2013; Li & Lee, 2005), or choosing a reference population (e.g. Basellini & Camarda, 2019; Hyndman et al., 2013); see also Booth (2020) for a recent discussion. However, these choices are arbitrary and significantly affect the final results (e.g. Kjærgaard et al., 2016). Therefore, we follow a different path and induce coherent predictions via a Bayesian approach to inference, regularizing estimates towards a common mean trend. More specifically, we introduce a common prior specification for the country-specific dynamic hierarchical model defined in Equation (6), for each coefficient k = 1, …, 7. Including the prior for , we let
where and denote fixed hyper-parameters; see Figure 4 for an illustration of the hierarchical dependence structure induced by the common prior specification.

Graphical illustration of the hierarchical component of dysm, inducing borrowing of information and coherent predictions across countries, with j = 1,…, p. Top level block denotes fixed hyper-parameters, while bottom blocks denote model's parameters
The specification of a common prior distribution for the country-specific parameters allows to share information across different nations, improving estimates for the age-at-death distributions. In addition, the introduction of a common prior shrinks estimates towards a common mean, providing an implicit regularization on the parameters. Such a shrinkage is what makes dysm a ‘coherent’ model, using a data-driven reference mortality level without the need of a user-defined benchmark.
2.4 Prior elicitation and posterior computation
The large abundance of research on the evolution of mortality, and the clear interpretation of the parameters in Equation 3, facilitate informative elicitation for the hyper-parameters of Equation 7. For example, European adult mortality is expected to lie between 30 and 70 years with high probability (e.g. Lewer et al., 2020); therefore, we can assume for the initial condition of , independently for j = 1, …, p. Similarly, informative specifications for the prior distribution on the drift parameter allow to include information on the direction and intensity of the temporal trends. For instance, we could centre the trend around , resulting in an expected increasing trend for the age at adult mortality of 2 years between consecutive time points. In Section 3.4, we compare such informative prior specification with a non-informative elicitation, confirming the importance of the proposed approach to induce shrinkage across countries and obtain coherent predictions.
There is a large literature on simulation methods for non-Gaussian state-space models; see, for example, Durbin and Koopman (2012 chapter 13.3), Prado and West (2010 chapter 6) or Chopin and Papaspiliopoulos (2020) for recent developments. A major complexity in our methods derive from the computation of the age-at-death distribution, which links to via Equation (1). Therefore, we conduct posterior inference relying on the r package nimble (de Valpine et al., 2020), that allows to specify the main structure of dysm, compile it in c++, and implement an adaptive Metropolis-within-Gibbs mcmc algorithm to simulate from the posterior distribution of the model's parameters. Specifically, the parameters of Equation (7) are updated using Gibbs steps, while the remaining are updated with multivariate Metropolis steps, due to the lack of conjugacy induced by Equation (1). Pseudo-code illustrating the main steps to conduct posterior inference is reported in the Supplementary materials. Forecasts for future time points t > T can be simulated recursively from the posterior predictive distribution, continuing the conditional representation of Equation (6) for each draw of the mcmc output.
In order to improve the computational performance, we included further modifications to efficiently compute the cdf of the mixture density function in Expression (3) using c++. This operation is non-trivial since it requires to compute the cdf of the skew-normal component, which relies on the Owen (1956) T-function; see also Azzalini and Capitanio (2013 Appendix B) for the details and Patefield and Tandy (2000) for practical considerations. Overall, posterior computations in a setting with p = 12 countries and T = 20 time points requires roughly 2 min per 1000 iterations on an 8-core intel I7-7700HQ, 2.8 ghz processor running Linux, which is satisfactory for our purposes.
3 European Countries Mortality
3.1 Data description
The focus of this section is to evaluate the ability of dysm to characterize the age-at-death distributions of different European countries. In Section 3.2, we are interested in comparing its performance with some state-of-the-art methods for mortality forecast, focusing on rolling windows scenarios. In Section 3.3 we use the proposed method to estimate mortality and to forecasts for future years, illustrating in details how dysm can be used to draw thoughtful inferential conclusions by making inference on different quantities. Analysis will focus on the time range 1960–2016, and on joint modelling the following p = 12 countries: Austria (aut), Switzerland (che), England & Wales (gbr), Germany (deu), Denmark (dnk), Finland (fin), France (fra), Italy (ita), Netherlands (nld), Norway (nor), Spain (esp) and Sweden (swe), separately for the male and female populations. These countries were selected according to the European countries analysed in other approaches, such as Li and Lee (2005). Mortality data are retrieved from the Human Mortality Database (2020), and the year- and country-specific number of deaths are obtained with the utilities of the r package demograpy (Hyndman et al., 2019).
3.2 Performance evaluation via rolling window
In order to quantitatively evaluate the performance of dysm in estimating and forecasting mortality, we rely on rolling window scenarios. Specifically, analysis focus on sub-periods of 20 years of data to fit the models, and subsequent periods of 10 years of data to provide forecasts. Performance is evaluated both in terms of in-sample accuracy—comparing the model fit with the 20 observed years—and in terms of forecasts ability—comparing predictions with the out-of-sample block for the subsequent 10 years. Multiple scenarios are obtained rolling the window forward of steps of 1 years, thereby dividing the range 1960–2016 in 30 consecutive scenarios.
We compare dysm with different popular approaches for mortality forecasting: the Lee-Carter model (Lee & Carter, 1992), the coherent Li-Lee model (Li & Lee, 2005), the functional mortality model of Hyndman–Ullah (Hyndman & Ullah, 2007), the compositional Oeppen model (Oeppen, 2008) and the mem5 model (Pascariu et al., 2019). All methods are conveniently available in a common interface within the r package MortalityForecast (Pascariu, 2020). Since these methods focus on modelling single population data, we perform the analysis on each country separately. In order to compare the approaches on the same basis, results are evaluated in terms of the quality of the fitted and predicted age-at-death distributions and death rates. Under dysm, such estimates are obtained post-processing the mcmc output, computing the posterior distribution of Equation 1 via Monte Carlo integration. For the remaining approaches, we follow standard practices and transform predictions into death rates and age-at-death distributions (e.g. Pascariu, 2020).
Posterior inference proceeds separately for the female and male populations, relying on informative prior distributions. We elicited the initial conditions for the hierarchical dynamic component considering the expected level of mortality in the temporal window under investigation, centring old-age mortality around 70 years (), adult mortality around 50 years and setting the remaining values of to 0. We also let for all k to induce sufficient variability in such initial prior guesses and cover the temporal window under investigation. Lastly, we specified a shared standard Gaussian for the drift parameters (setting and ) and a weakly informative Inverse-Gamma on the variances of the dynamic innovations , setting for all k. In each scenario, posterior inference for dysm relies on 100000 iterations collected after a burn-in of 5000, thinning one iteration every 5. Convergence was assessed in terms of mixing, autocorrelation of the chains and Geweke diagnostics, which resulted satisfactory in all the windows considered, with an effective sample size larger than 18000 of 20000 stored iterations for all the parameters.
Performance is evaluated in terms of mean squared error (mse) and mean absolute error (mae) between the fitted and predicted age-at-death distributions or death rates and the ground truth. Since the elements of the age-at-death distribution are generally very small numbers, we report results in relative terms, dividing the performance of each approach by the performance obtained by dysm in each scenario and country. Recalling that both mse and mae are measure of discrepancy, results larger than 1 indicate that predictions under dysm achieve an error which is lower than the competitor, and therefore we can conclude that dysm performs better; the opposite holds for values lower than 1.
Results referring to in-sample evaluation are reported in Table 1, computing the median, first and third quartiles of the relative performances across countries and rolling windows scenarios. Current results suggest that dysm outperforms the competitors both for the male and female populations, with a performance that is better than the other approaches in all settings. These results confirm the ability of dysm to characterize with flexibility age-at-death distributions across multiple countries and time periods, providing estimates that are consistent with the observed data and more accurate than state-of-the art methods. When dysm is used for fitting death rates, a relatively worse performance is observed, particularly in terms of mae. This result can be explained considering that old ages (80–100) are assigned more weight in the computation of death rates, compared to other quantities that characterize mortality (e.g. Keyfitz, 2005). Therefore, dysm performs particularly well in characterizing premature and adult mortality, and this leads to large improvements for modelling the age-at-death distribution, in particular for male populations where the share of adult mortality is relatively large.
In-sample relative performance in estimating the age-at-death distribution (top) and death rates (bottom), comparing predictions of dysm against the competitors. Median across rolling windows and countries. First and third quartiles are reported in squared brackets
. | female . | male . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.41 [1.17, 1.64] | 2.73 [1.91, 3.80] | 1.92 [1.39, 2.40] | 5.37 [2.76, 7.79] |
Lee-Carter | 1.40 [1.17, 1.62] | 2.74 [1.90, 3.79] | 1.93 [1.39, 2.36] | 5.34 [2.77, 7.77] |
Li-Lee | 1.23 [0.94, 1.45] | 2.19 [1.22, 2.97] | 1.75 [1.30, 2.28] | 4.71 [2.53, 6.99] |
mem5 | 1.22 [1.07, 1.31] | 2.01 [1.55, 2.55] | 1.86 [1.56, 2.19] | 4.20 [3.01, 6.61] |
Oeppen | 1.13 [0.92, 1.36] | 1.88 [1.31, 2.66] | 1.61 [1.24, 2.14] | 3.79 [2.24, 6.37] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 0.86 [0.76, 0.95] | 1.02 [0.86, 1.11] | 0.84 [0.79, 0.92] | 1.02 [0.85, 1.19] |
Lee-Carter | 0.91 [0.79, 1.06] | 1.08 [0.88, 1.86] | 0.87 [0.80, 1.00] | 1.07 [0.87, 1.58] |
Li-Lee | 1.02 [0.71, 1.26] | 1.81 [0.71, 5.42] | 0.99 [0.77, 1.17] | 1.90 [0.95, 3.89] |
mem5 | 1.06 [0.97, 1.14] | 0.99 [0.90, 1.09] | 1.01 [0.96, 1.05] | 1.02 [0.95, 1.08] |
Oeppen | 1.27 [1.03, 4.71] | 3.87 [1.10, 6.51] | 1.09 [0.89, 2.98] | 1.32 [0.81, 7.01] |
. | female . | male . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.41 [1.17, 1.64] | 2.73 [1.91, 3.80] | 1.92 [1.39, 2.40] | 5.37 [2.76, 7.79] |
Lee-Carter | 1.40 [1.17, 1.62] | 2.74 [1.90, 3.79] | 1.93 [1.39, 2.36] | 5.34 [2.77, 7.77] |
Li-Lee | 1.23 [0.94, 1.45] | 2.19 [1.22, 2.97] | 1.75 [1.30, 2.28] | 4.71 [2.53, 6.99] |
mem5 | 1.22 [1.07, 1.31] | 2.01 [1.55, 2.55] | 1.86 [1.56, 2.19] | 4.20 [3.01, 6.61] |
Oeppen | 1.13 [0.92, 1.36] | 1.88 [1.31, 2.66] | 1.61 [1.24, 2.14] | 3.79 [2.24, 6.37] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 0.86 [0.76, 0.95] | 1.02 [0.86, 1.11] | 0.84 [0.79, 0.92] | 1.02 [0.85, 1.19] |
Lee-Carter | 0.91 [0.79, 1.06] | 1.08 [0.88, 1.86] | 0.87 [0.80, 1.00] | 1.07 [0.87, 1.58] |
Li-Lee | 1.02 [0.71, 1.26] | 1.81 [0.71, 5.42] | 0.99 [0.77, 1.17] | 1.90 [0.95, 3.89] |
mem5 | 1.06 [0.97, 1.14] | 0.99 [0.90, 1.09] | 1.01 [0.96, 1.05] | 1.02 [0.95, 1.08] |
Oeppen | 1.27 [1.03, 4.71] | 3.87 [1.10, 6.51] | 1.09 [0.89, 2.98] | 1.32 [0.81, 7.01] |
In-sample relative performance in estimating the age-at-death distribution (top) and death rates (bottom), comparing predictions of dysm against the competitors. Median across rolling windows and countries. First and third quartiles are reported in squared brackets
. | female . | male . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.41 [1.17, 1.64] | 2.73 [1.91, 3.80] | 1.92 [1.39, 2.40] | 5.37 [2.76, 7.79] |
Lee-Carter | 1.40 [1.17, 1.62] | 2.74 [1.90, 3.79] | 1.93 [1.39, 2.36] | 5.34 [2.77, 7.77] |
Li-Lee | 1.23 [0.94, 1.45] | 2.19 [1.22, 2.97] | 1.75 [1.30, 2.28] | 4.71 [2.53, 6.99] |
mem5 | 1.22 [1.07, 1.31] | 2.01 [1.55, 2.55] | 1.86 [1.56, 2.19] | 4.20 [3.01, 6.61] |
Oeppen | 1.13 [0.92, 1.36] | 1.88 [1.31, 2.66] | 1.61 [1.24, 2.14] | 3.79 [2.24, 6.37] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 0.86 [0.76, 0.95] | 1.02 [0.86, 1.11] | 0.84 [0.79, 0.92] | 1.02 [0.85, 1.19] |
Lee-Carter | 0.91 [0.79, 1.06] | 1.08 [0.88, 1.86] | 0.87 [0.80, 1.00] | 1.07 [0.87, 1.58] |
Li-Lee | 1.02 [0.71, 1.26] | 1.81 [0.71, 5.42] | 0.99 [0.77, 1.17] | 1.90 [0.95, 3.89] |
mem5 | 1.06 [0.97, 1.14] | 0.99 [0.90, 1.09] | 1.01 [0.96, 1.05] | 1.02 [0.95, 1.08] |
Oeppen | 1.27 [1.03, 4.71] | 3.87 [1.10, 6.51] | 1.09 [0.89, 2.98] | 1.32 [0.81, 7.01] |
. | female . | male . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.41 [1.17, 1.64] | 2.73 [1.91, 3.80] | 1.92 [1.39, 2.40] | 5.37 [2.76, 7.79] |
Lee-Carter | 1.40 [1.17, 1.62] | 2.74 [1.90, 3.79] | 1.93 [1.39, 2.36] | 5.34 [2.77, 7.77] |
Li-Lee | 1.23 [0.94, 1.45] | 2.19 [1.22, 2.97] | 1.75 [1.30, 2.28] | 4.71 [2.53, 6.99] |
mem5 | 1.22 [1.07, 1.31] | 2.01 [1.55, 2.55] | 1.86 [1.56, 2.19] | 4.20 [3.01, 6.61] |
Oeppen | 1.13 [0.92, 1.36] | 1.88 [1.31, 2.66] | 1.61 [1.24, 2.14] | 3.79 [2.24, 6.37] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 0.86 [0.76, 0.95] | 1.02 [0.86, 1.11] | 0.84 [0.79, 0.92] | 1.02 [0.85, 1.19] |
Lee-Carter | 0.91 [0.79, 1.06] | 1.08 [0.88, 1.86] | 0.87 [0.80, 1.00] | 1.07 [0.87, 1.58] |
Li-Lee | 1.02 [0.71, 1.26] | 1.81 [0.71, 5.42] | 0.99 [0.77, 1.17] | 1.90 [0.95, 3.89] |
mem5 | 1.06 [0.97, 1.14] | 0.99 [0.90, 1.09] | 1.01 [0.96, 1.05] | 1.02 [0.95, 1.08] |
Oeppen | 1.27 [1.03, 4.71] | 3.87 [1.10, 6.51] | 1.09 [0.89, 2.98] | 1.32 [0.81, 7.01] |
Out-of-sample performances are instead reported in Table 2. Results provide further evidence in favour of the ability of dysm, which obtains out-of-sample forecasts more accurate than the competitors. This important results suggest that, overall, dysm is the most accurate methods for modelling and forecasting the age-at-death distribution in the 12 European countries under investigation. Performances stratified by countries are reported in the Supplementary materials.
Out-of-sample relative performance in estimating the age-at-death distribution (top) and death rates (bottom), comparing predictions of dysm against the competitors. Median across rolling windows and countries. First and third quartiles are reported in squared brackets
. | females . | males . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.38 [1.14, 1.69] | 1.99 [1.39, 2.88] | 1.91 [1.12, 2.48] | 3.64 [1.42, 6.36] |
Lee-Carter | 1.30 [1.03, 1.62] | 1.75 [1.12, 2.65] | 1.73 [1.06, 2.37] | 3.23 [1.25, 5.54] |
Li-Lee | 1.28 [0.91, 1.56] | 1.66 [0.93, 2.54] | 1.73 [1.06, 2.36] | 3.23 [1.25, 5.53] |
mem5 | 2.44 [2.02, 3.03] | 5.67 [3.98, 8.88] | 2.84 [1.70, 3.88] | 7.92 [3.23, 9.96] |
Oeppen | 1.29 [0.94, 1.82] | 2.30 [1.29, 9.34] | 1.66 [1.06, 2.36] | 3.86 [1.45, 8.11] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.02 [0.88, 1.10] | 1.35 [1.18, 1.70] | 0.96 [0.89, 1.05] | 1.24 [1.05, 1.47] |
Lee-Carter | 1.12 [0.95, 2.66] | 1.73 [1.22, 8.96] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
Li-Lee | 1.12 [0.95, 2.66] | 1.73 [1.22, 7.95] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
mem5 | 1.06 [0.89, 1.19] | 1.03 [0.76, 1.48] | 1.32 [1.03, 1.58] | 1.05 [0.83, 1.41] |
Oeppen | 1.67 [1.24, 4.07] | 3.99 [1.53, 9.44] | 1.29 [1.10, 1.69] | 1.97 [1.36, 6.35] |
. | females . | males . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.38 [1.14, 1.69] | 1.99 [1.39, 2.88] | 1.91 [1.12, 2.48] | 3.64 [1.42, 6.36] |
Lee-Carter | 1.30 [1.03, 1.62] | 1.75 [1.12, 2.65] | 1.73 [1.06, 2.37] | 3.23 [1.25, 5.54] |
Li-Lee | 1.28 [0.91, 1.56] | 1.66 [0.93, 2.54] | 1.73 [1.06, 2.36] | 3.23 [1.25, 5.53] |
mem5 | 2.44 [2.02, 3.03] | 5.67 [3.98, 8.88] | 2.84 [1.70, 3.88] | 7.92 [3.23, 9.96] |
Oeppen | 1.29 [0.94, 1.82] | 2.30 [1.29, 9.34] | 1.66 [1.06, 2.36] | 3.86 [1.45, 8.11] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.02 [0.88, 1.10] | 1.35 [1.18, 1.70] | 0.96 [0.89, 1.05] | 1.24 [1.05, 1.47] |
Lee-Carter | 1.12 [0.95, 2.66] | 1.73 [1.22, 8.96] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
Li-Lee | 1.12 [0.95, 2.66] | 1.73 [1.22, 7.95] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
mem5 | 1.06 [0.89, 1.19] | 1.03 [0.76, 1.48] | 1.32 [1.03, 1.58] | 1.05 [0.83, 1.41] |
Oeppen | 1.67 [1.24, 4.07] | 3.99 [1.53, 9.44] | 1.29 [1.10, 1.69] | 1.97 [1.36, 6.35] |
Out-of-sample relative performance in estimating the age-at-death distribution (top) and death rates (bottom), comparing predictions of dysm against the competitors. Median across rolling windows and countries. First and third quartiles are reported in squared brackets
. | females . | males . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.38 [1.14, 1.69] | 1.99 [1.39, 2.88] | 1.91 [1.12, 2.48] | 3.64 [1.42, 6.36] |
Lee-Carter | 1.30 [1.03, 1.62] | 1.75 [1.12, 2.65] | 1.73 [1.06, 2.37] | 3.23 [1.25, 5.54] |
Li-Lee | 1.28 [0.91, 1.56] | 1.66 [0.93, 2.54] | 1.73 [1.06, 2.36] | 3.23 [1.25, 5.53] |
mem5 | 2.44 [2.02, 3.03] | 5.67 [3.98, 8.88] | 2.84 [1.70, 3.88] | 7.92 [3.23, 9.96] |
Oeppen | 1.29 [0.94, 1.82] | 2.30 [1.29, 9.34] | 1.66 [1.06, 2.36] | 3.86 [1.45, 8.11] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.02 [0.88, 1.10] | 1.35 [1.18, 1.70] | 0.96 [0.89, 1.05] | 1.24 [1.05, 1.47] |
Lee-Carter | 1.12 [0.95, 2.66] | 1.73 [1.22, 8.96] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
Li-Lee | 1.12 [0.95, 2.66] | 1.73 [1.22, 7.95] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
mem5 | 1.06 [0.89, 1.19] | 1.03 [0.76, 1.48] | 1.32 [1.03, 1.58] | 1.05 [0.83, 1.41] |
Oeppen | 1.67 [1.24, 4.07] | 3.99 [1.53, 9.44] | 1.29 [1.10, 1.69] | 1.97 [1.36, 6.35] |
. | females . | males . | ||
---|---|---|---|---|
. | mae . | mse . | mae . | mse . |
Age-at-death distr. | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.38 [1.14, 1.69] | 1.99 [1.39, 2.88] | 1.91 [1.12, 2.48] | 3.64 [1.42, 6.36] |
Lee-Carter | 1.30 [1.03, 1.62] | 1.75 [1.12, 2.65] | 1.73 [1.06, 2.37] | 3.23 [1.25, 5.54] |
Li-Lee | 1.28 [0.91, 1.56] | 1.66 [0.93, 2.54] | 1.73 [1.06, 2.36] | 3.23 [1.25, 5.53] |
mem5 | 2.44 [2.02, 3.03] | 5.67 [3.98, 8.88] | 2.84 [1.70, 3.88] | 7.92 [3.23, 9.96] |
Oeppen | 1.29 [0.94, 1.82] | 2.30 [1.29, 9.34] | 1.66 [1.06, 2.36] | 3.86 [1.45, 8.11] |
Death rates | ||||
dysm | 1.00 | 1.00 | 1.00 | 1.00 |
Hyndman–Ullah | 1.02 [0.88, 1.10] | 1.35 [1.18, 1.70] | 0.96 [0.89, 1.05] | 1.24 [1.05, 1.47] |
Lee-Carter | 1.12 [0.95, 2.66] | 1.73 [1.22, 8.96] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
Li-Lee | 1.12 [0.95, 2.66] | 1.73 [1.22, 7.95] | 1.06 [0.94, 2.08] | 1.51 [1.14, 9.24] |
mem5 | 1.06 [0.89, 1.19] | 1.03 [0.76, 1.48] | 1.32 [1.03, 1.58] | 1.05 [0.83, 1.41] |
Oeppen | 1.67 [1.24, 4.07] | 3.99 [1.53, 9.44] | 1.29 [1.10, 1.69] | 1.97 [1.36, 6.35] |
3.3 Results for European mortality
In Section 3.2, we have demonstrated that dysm outperforms the competitors in terms of the quality of the fitted and predicted age-at-death distributions. In this section we argue that the proposed method has concrete practical benefits in terms of mortality modelling also from an interpretative point of view. Indeed, posterior inference on the parameters characterizing dysm and their functionals provides insights on the structure of mortality and its future evolution, highlighting aspects that have larger impact in such changes and explaining differences across countries.
To illustrate this, we focus on the same European countries as in the previous section, modelling the more recent period 1980–2017 and providing forecasts until 2037. Posterior simulation for this application relies on the same settings as in Section 3.2. Results in terms of mixing, autocorrelations, Geweke diagnostic and effective sample size were satisfactory for all the parameters considered, with an effective sample size larger than 19000 of 20000 iterations for all the parameter. We conduct inference on the set of parameters to further facilitate interpretation. Results are reported in Figure 5, illustrating the posterior medians and 90% credible intervals for the estimated parameters for male and female populations separately.

Posterior medians (solid lines) and 90% credible intervals (dotted lines) for the parameters characterizing dysm and functionals of interest. Black vertical dotted lines indicate the last observed period (2017)
Current empirical findings are consistent with known aspects of the temporal evolution of mortality of the past decades. For example, the impact of infant mortality—measured via the parameters —is notably decreased in all countries in the past years, dropping from values around 0.01 to 0.005. Although both male and female populations share such decreasing trends, we also observe that infant mortality is larger for males than females (e.g. Drevenstedt et al., 2008). Another clear aspect involves the evolution of the mean age of old-age mortality, captured via the skew-normal mean , which has constantly increased in all countries. The overall trend is similar between males and females, with woman reporting older ages on average. Similarly, the decreasing trend of is consistent with the empirical evidence on compression of old-age mortality.
Under dysm, estimates and credible intervals for any complex functional of the model's parameters are easily obtained. More importantly, prediction intervals are determined taking into account the uncertainty of the entire inferential process, which is not the case for other coherent models, where all the uncertainty related to choice of reference is generally not accounted for. Some interesting functionals include death rates and life expectancies, which can be obtained from the age-at-death distribution via deterministic transformations; dysm allows to conduct coherent inference also on this quantities, rigorously characterizing uncertainty post-processing the mcmc output. As an illustrative example, Figures 6 and 7 depict the life expectancy and the death probability, respectively, computed at different ages and considering male and female populations in Italy and Sweden. Results are consistent with the conclusions outlines above, confirming regular patters of mortality such as non-overlapping trends for males and females and an overall improvement at all ages.

Life expectancy at age 0 (at birth), 20, 40, 60, 70 and 80. Italy and Sweden, female and male population. Estimates are obtaining post-processing the mcmc for the age-at-death distribution. Solid and dashed lines denote posterior medians and 90% credible intervals, respectively

Logarithm of death probability at age 50, 60, 70, 80 and 90. Italy and Sweden, female and male population. Estimates are obtaining post-processing the mcmc for the age-at-death distribution. Solid and dashed lines denote posterior medians and 90% credible intervals, respectively
3.4 Alternative parametric specifications
In this section we evaluate the robustness of the parametric assumptions underlying dysm, focusing on the application outlined in Section 3.3. We investigate different specifications for the mixture components characterizing dysm and alternative prior specifications. Specifically, we modify dysm by
- •
modelling old-age mortality with a rescaled Beta distribution, supported in the interval [75–110], instead of a skew-normal distribution;
- •
modelling infant mortality with an half-normal distribution, as in Zanotto et al. (2020), instead of a single Dirac mass at age 0;
- •
removing the Gaussian component accounting for adult mortality, setting ;
- •
characterizing the joint evolution of with a multivariate random walk, instead of independent innovations;
- •
characterizing the evolution of via random walks with Student-t innovations with 5 degrees of freedom, instead of Gaussian distributions;
- •
replacing informative priors distributions in Equation 7 with flat uniform priors.
All modifications are implemented and estimated with nimble, relying on the same settings as in Section 3.3. Model evaluation is performed via marginal likelihood—estimated via the harmonic mean estimator (e.g. Raftery et al., 2007)—and comparing the mse and mae of predictionsobtained via posterior medians. The empirical findings are reported in Table 3, and suggest that the current implementation for dysm is the best performing model in terms of marginal likelihood and other measures. For example, modelling old-age mortality with a scaled Beta or characterizing infant mortality with an half-normal distribution do not improve the fit of the current specification for dysm; similar results are obtained with more complex specifications for the dynamic prior distribution or non-informative elicitation, confirming the importance of the proposed specifications. Interestingly, current results indicate that the specification without adult mortality performs better for female than for male mortality, confirming that the inclusion of a separate component devoted to adult mortality is fundamental for modelling male populations, but also provides concrete benefits for females.
Evaluation of model fit: − log p(D) refer to the logarithm of the marginal likelihood, changed of sign; mse and mae refer to mean squared error and mean absloute error of the posterior medians respectively. Smaller values indicate better fit
. | males . | females . | ||||
---|---|---|---|---|---|---|
. | − log p(D) . | mae . | mse . | − log p(D) . | mae . | mise . |
dysm | 55.65 | 0.10 | 10.19 | 46.2 | 0.08 | 10.21 |
Beta (old-age mortality) | 453.55 | 0.13 | 33.83 | 307.53 | 0.92 | 98.72 |
Half-normal (infant mortality) | 176.03 | 0.15 | 9.79 | 164.07 | 0.95 | 98.72 |
Removing adult mortality | 264.20 | 0.88 | 30.79 | 88.23 | 0.74 | 83.74 |
Multivariate random-walk | 301.52 | 8.78 | 116.76 | 205.52 | 4.75 | 135.77 |
Student-t innovations | 254.07 | 0.90 | 10.26 | 178.23 | 0.91 | 96.24 |
dysm, uniform prior | 359.70 | 2.35 | 175.14 | 466.52 | 2.01 | 121.61 |
. | males . | females . | ||||
---|---|---|---|---|---|---|
. | − log p(D) . | mae . | mse . | − log p(D) . | mae . | mise . |
dysm | 55.65 | 0.10 | 10.19 | 46.2 | 0.08 | 10.21 |
Beta (old-age mortality) | 453.55 | 0.13 | 33.83 | 307.53 | 0.92 | 98.72 |
Half-normal (infant mortality) | 176.03 | 0.15 | 9.79 | 164.07 | 0.95 | 98.72 |
Removing adult mortality | 264.20 | 0.88 | 30.79 | 88.23 | 0.74 | 83.74 |
Multivariate random-walk | 301.52 | 8.78 | 116.76 | 205.52 | 4.75 | 135.77 |
Student-t innovations | 254.07 | 0.90 | 10.26 | 178.23 | 0.91 | 96.24 |
dysm, uniform prior | 359.70 | 2.35 | 175.14 | 466.52 | 2.01 | 121.61 |
Evaluation of model fit: − log p(D) refer to the logarithm of the marginal likelihood, changed of sign; mse and mae refer to mean squared error and mean absloute error of the posterior medians respectively. Smaller values indicate better fit
. | males . | females . | ||||
---|---|---|---|---|---|---|
. | − log p(D) . | mae . | mse . | − log p(D) . | mae . | mise . |
dysm | 55.65 | 0.10 | 10.19 | 46.2 | 0.08 | 10.21 |
Beta (old-age mortality) | 453.55 | 0.13 | 33.83 | 307.53 | 0.92 | 98.72 |
Half-normal (infant mortality) | 176.03 | 0.15 | 9.79 | 164.07 | 0.95 | 98.72 |
Removing adult mortality | 264.20 | 0.88 | 30.79 | 88.23 | 0.74 | 83.74 |
Multivariate random-walk | 301.52 | 8.78 | 116.76 | 205.52 | 4.75 | 135.77 |
Student-t innovations | 254.07 | 0.90 | 10.26 | 178.23 | 0.91 | 96.24 |
dysm, uniform prior | 359.70 | 2.35 | 175.14 | 466.52 | 2.01 | 121.61 |
. | males . | females . | ||||
---|---|---|---|---|---|---|
. | − log p(D) . | mae . | mse . | − log p(D) . | mae . | mise . |
dysm | 55.65 | 0.10 | 10.19 | 46.2 | 0.08 | 10.21 |
Beta (old-age mortality) | 453.55 | 0.13 | 33.83 | 307.53 | 0.92 | 98.72 |
Half-normal (infant mortality) | 176.03 | 0.15 | 9.79 | 164.07 | 0.95 | 98.72 |
Removing adult mortality | 264.20 | 0.88 | 30.79 | 88.23 | 0.74 | 83.74 |
Multivariate random-walk | 301.52 | 8.78 | 116.76 | 205.52 | 4.75 | 135.77 |
Student-t innovations | 254.07 | 0.90 | 10.26 | 178.23 | 0.91 | 96.24 |
dysm, uniform prior | 359.70 | 2.35 | 175.14 | 466.52 | 2.01 | 121.61 |
4 Conclusion
In this article we have introduced a dynamic model based on a skewed distribution function (dysm) for modelling and forecasting the age-at-death distribution of multiple countries across time. The proposed method automatically regularize diverging scenarios, and allows to obtain predictions which are coherent across sub-populations. In addition, this strategy leads to a borrowing of information which improved the quality of the fit and of the forecasts, compared with state-of-the-art methods for mortality forecasting. Another important advantage of the proposed method is that it provides coherent forecasts without the need of choosing a reference population, which is instead naturally derived from data. The choice of reference population, in fact, is still an open issue of coherent forecast models and has received notable attention from scholars in recent years (Booth, 2020; Kjærgaard et al., 2016).
We have focused directly on the age distribution for the number of deaths, considering the total number of deaths as a given quantity and relying on a multinomial likelihood. This approach allows to focus directly on the distributions of deaths across different ages, accounting for the heterogeneity in the overall mortality trends. For future developments, it might be useful to include a further dynamic level in the hierarchy and model the process of death counts . Such an aim can be achieved, for example, using a hierarchical Poisson or Negative-Binomial dynamic model.
Data Availability Statement
The data that support the findings of this study are publicly available at the Human Mortality Database, at https://urldefense.com/v3/__https://www.mortality.org/__;!!N11eV2iwtfs!vLl4T8uK9UTbafiRG3tuesmjmbYYzfLL2h8NR80WddPeKqXfDTXjlO9TZQBmHTBQQTnm-6NXgMNE0hdYmw$. Code implementing the method is available at https://urldefense.com/v3/__https://github.com/emanuelealiverti/dysm__;!!N11eV2iwtfs!vLl4T8uK9UTbafiRG3tuesmjmbYYzfLL2h8NR80WddPeKqXfDTXjlO9TZQBmHTBQQTnm-6NXgMMCJEFYng$
Supporting information
Additional supporting information may be found in the on-line version of the article at the publisher’s website.
Acknowledgements
This work was supported by the grant miur–prin 2017 project 20177BR-JXS. The authors thank David Dunson, Sally Paganin and Tommaso Rigon for their comments and the fruitful discussions.
References