Estimating Country-Specific Incidence Rates of Rare Cancers: Comparative Performance Analysis of Modeling Approaches Using European Cancer Registry Data

Abstract Estimating incidence of rare cancers is challenging for exceptionally rare entities and in small populations. In a previous study, investigators in the Information Network on Rare Cancers (RARECARENet) provided Bayesian estimates of expected numbers of rare cancers and 95% credible intervals for 27 European countries, using data collected by population-based cancer registries. In that study, slightly different results were found by implementing a Poisson model in integrated nested Laplace approximation/WinBUGS platforms. In this study, we assessed the performance of a Poisson modeling approach for estimating rare cancer incidence rates, oscillating around an overall European average and using small-count data in different scenarios/computational platforms. First, we compared the performance of frequentist, empirical Bayes, and Bayesian approaches for providing 95% confidence/credible intervals for the expected rates in each country. Second, we carried out an empirical study using 190 rare cancers to assess different lower/upper bounds of a uniform prior distribution for the standard deviation of the random effects. For obtaining a reliable measure of variability for country-specific incidence rates, our results suggest the suitability of using 1 as the lower bound for that prior distribution and selecting the random-effects model through an averaged indicator derived from 2 Bayesian model selection criteria: the deviance information criterion and the Watanabe-Akaike information criterion.


WEB APPENDIX 2 Simulated Distribution of the Number of Cases for the 4 Entities Considered in Study 1
To evaluate the performance of the different approaches, we simulated data to establish a realistic ground truth for disease risk variation, based on the validated and published data used in RARECARENet project. Four cancer entities were selected from the RARECARENet database and each was fitted to a random-effects Bayesian model.
Using the predictive distribution derived from this model, J=1000 simulations of a set of I=27 predicted values (one for each country) were performed, , where j=1,…,J and i=1,…,I. We considered this realistic since the observed cases for the 27 European countries included in the study were previously validated to derive the most recent indicators: the country-specific number of RCs and their corresponding rates. In addition, in the context of rare cancers, an outcome of interest is to estimate the variability of the case counts.
In order to simulate data sets similar to the original data, we used the estimates of the model parameters as the true values, and then we simulated the model described in  Figure 4). j=1,…,1000) for entity 1 (in red: actual observed cases in the corresponding country, that is ) for each of the 27 countries considered. j=1,…,1000) for entity 2 (in red: actual observed cases in the corresponding country, that is ) for each of the 27 countries considered. j=1,…,1000) for entity 3 (in red: actual observed cases in the corresponding country, that is ) for each of the 27 countries considered. j=1,…,1000) for entity 4 (in red: actual observed cases in the corresponding country, that is ) for each of the 27 countries considered.

(I) Exact confidence intervals for
We first proceed with the simplest approach based on the exact Poisson approximation to the confidence limits since this approach performs well. We used epitools R library (see its description in http://www.epitools.net, last accessed November 2019), which allows for computing these confidence intervals, and we referred to them as 'exact Poisson' in the tables.

(II) Byar's approximation to confidence interval for
Another one of the simplest approaches, based on assuming a non-random-effects model from where and are the respective lower and upper bounds for the confidence limits, and 2 denotes the 100(1 − 2 ⁄ ) percentile of the standard normal distribution.. We used epitools R library, which allows for computing these confidence intervals, and we referred to them as 'Byar Poisson' in the tables.

(III) Empirical Bayes approach by assuming Gamma distribution for
Two random-effects models were assessed using the maximum likelihood approach.
First, we followed the EB proposal by Clayton and Kaldor (3) based on the assumption that follows a common gamma distribution with scale parameter and shape parameter parameterized as Then the distribution of conditional on is also Gamma with expectation ( + )/( + ). Simulating from this distribution can yiedld the required quantiles for the . Moreover, the unconditional expectation of is since follows a negative binomial distribution (3). In this line, maximum likelihood estimates for ( , ) can be obtained through a generalized linear model assuming a negative binomial for as described by McCullagh and Nelder (4). This approach is implemented in the R library SpatialEpi (5) and the model is referred to as EB in the tables.

(IV) Generalized linear mixed model
The performance

Measures for the Assessment of the Models/Methods in the Simulation Study
For each simulated data set, the average width of the confidenceprocedures I,II,IIIor credibleprocedure IVintervals was calculated across the I=27 countries, as well as the , and two matrices, that of the prior precision ( ) and that of the posterior covariance matrix of the Gaussian approximation * ( ) (7,8,11). We kept these estimates over the 1000 simulated data sets. However, a lower value in one of these platforms does not necessarily mean that one platform fits better than the other, as the DIC calculations differ

Computational Issues in the Simulation Study
The INLA approach to approximate the posterior distribution of the hyperparameters can be changed to increase accuracy or reduce computation time (8,11,12). In this line, the uniform prior on σ must be a uniform between 0 and ∞ (for computational reasons), so changing lower bounds for Uniform cannot currently be assessed using INLA (8). . We checked the Gelman-Rubin diagnostic estimate of the potential reduction scale (12)(13)(14) to assess convergence of the Markov chains.

Markov Chain Monte Carlo (MCMC)
Since our aim was to estimate predictive accuracy of M1,..,M8, we used the LPPD (13)(14)(15)(16)(17) and from there, calculated the WAIC (15,16) where p is a Poisson distribution, S is the number of simulation draws and is the s-th simulated value from the posterior distribution of . From (E3), one can derive an estimate of the WAIC (17), a fully Bayesian approach for estimating the out-of-sample expectation by using (E3) and adding a correction for the effective number of parameters to adjust for overfitting, the , leading to two WAIC estimates, WAIC1 and WAIC2 (13)(14)(15)(16)(17). This can be calculated two ways, leading to two WAIC estimates (17) as follows: Although Gelman recommended the use of 2 because its series expansion resembles the series expansion of the leave-one-out cross validation (17), we assessed both 1 and 2 . A note on the model selection comparing WAIC with other indicators as well as it implementation in R can be found in Web Appendices 9 and 10, respectively.

A Note on Model Selection Methods Used in this Paper
Model selection requires a compromise between goodness-of-fit and flexibility, quantifying these two criteria through some metric. The four frequently used methods implementing information criteria are: the Akaike information criterion (AIC) (18), the Bayesian information criterion (BIC) (19), the deviance information criterion (DIC) (20), and the widely applicable information criterion (WAIC) (15,16). DIC and WAIC estimates can be obtained from INLA, but WAIC is not available from a WinBUGS output.. Web Appendix 10 shows its R code implementation when using MCMC samples derived from WinBUGS.

Illustration on the Calculation of Models' Ranking
The Web Figure 5 depicts how to rank models for each entity. Assume that we have 190 entities and 7 models. The indicators are stored in a matrix M of 190 rows (entities) and 8(models) × 5(indicators), giving 40 columns. The procedure is as follows: for each cancer entity the model rank was calculated according to each of the indicators (see Step 2), and this generated the matrix of rankings R. Averaging the rankings of indicators for each model in the matrix R, the 'ranking' for each model and cancer site could be determined.

Web Figure 5.
Steps to rank models for each entity.
In this line, for deriving model rankings for each scenario, cancer entities must be selected according to the scenario's criterion, the IRs. Therefore, one must select the corresponding rows in the matrix R. This is a new matrix for the corresponding scenario, say R-SC. Then, the average of each models' ranking in R-SC must be calculated.
Web Figure 6. Differences in DIC and pD between INLA models M8 (Gamma prior on ) and M9 (Uniform prior on σ) across J=1000 simulated data sets: Panels a) and b) "adenocarcinoma with variants of middle ear", Panels c) and d) "adenocarcinoma with variants of trachea." 21 Web Figure 7. Ranking of models accounting for DIC, WAIC1 and WAIC2, stratified by scenarios as well as overall ranking (non-stratified by scenarios). Overall ranking was calculated by averaging DIC and WAIC rankings into one indicator.