Reflection on modern methods: shared-parameter models for longitudinal studies with missing data

Abstract A primary goal of longitudinal studies is to examine trends over time. Reported results from these studies often depend on strong, unverifiable assumptions about the missing data. Whereas the risk of substantial bias from missing data is widely known, analyses exploring missing-data influences are commonly done either ad hoc or not at all. This article outlines one of the three primary recognized approaches for examining missing-data effects that could be more widely used, i.e. the shared-parameter model (SPM), and explains its purpose, use, limitations and extensions. We additionally provide synthetic data and reproducible research code for running SPMs in SAS, Stata and R programming languages to facilitate their use in practice and for teaching purposes in epidemiology, biostatistics, data science and related fields. Our goals are to increase understanding and use of these methods by providing introductions to the concepts and access to helpful tools.


Introduction
Missing outcome data occur frequently in longitudinal studies 1 and even moderate amounts can strongly bias study results. 2 Whereas significant progress has been made in approaches for handling missing data, [3][4][5] regular implementation has been slow to follow 6 and improper handling of missing data remains one of the biggest concerns in peer-reviewed literature. 7 Missing and 'coarsened' outcome data may arise from a variety of mechanisms, some potentially innocuous such as participants losing interest in the study, others potentially serious such as sicker participants dropping out and some particularly convoluted such as participant death. 8,9 Every statistical analysis makes underlying assumptions about how the missing data may influence the estimated results. Conducting examinations on how the missingness might affect study conclusions is vital but often underperformed, often due to unfamiliarity or a lack of readily available statistical code.
We briefly review longitudinal-study missing-data problems and methods to address them; describe a 'shared-parameter model' (SPM) approach; 10,11 discuss its utility, limitations and extensions; and provide reproducible example data and code for SPM analyses in SAS, Stata and R programming languages.

Motivating example: cognitive decline and dementia in the Atherosclerosis Risk in Communities (ARIC) study
We illustrate our approaches using an examination of MRI-defined brain-atrophy associations with 20-year cognitive decline in the ARIC study. Full details on the ARIC study design are published elsewhere. 12 Briefly, our subset of ARIC participants had brain-atrophy measurements, with atrophy defined as present/absent by brain MRI 13 at visit 3 (1993-1996) alongside a concurrent battery of cognitive instruments (N ¼ 1840; ages 50-73 years, 60% female, 50% Caucasian). Up to four follow-up cognitive assessments occurred over the next 20 years. The longitudinal outcome of interest was a composite cognitive outcome ('z-score') derived from the memory, language and executive-function assessments that were performed at each of the visits. 14 In addition, dementia status was ascertained over time using neuropsychological testing, telephone calls with participants or proxies, hospital dementia code surveillance and death-certificate dementia codes. 15 The majority (61%) of the original N ¼ 1840 did not complete all the follow-up exams, more participants with brain atrophy dropped out (67%) vs participants without atrophy (58%) and those with brain atrophy had higher rates of dementia than those without brain atrophy (Table 1, Figure 1 and Supplementary Tables S1 and S2, available as Supplementary data at IJE online). Given these amounts and patterns of missingness, and clear connections between censoring dementia events and missing cognitive-decline outcome data, standard analysis methods may violate the assumptions that underlie their results.

Missing-data categories and related statistical methods
The taxonomy for missing data has been well detailed, 4,16,17 with fairly accepted modelling approaches for each (see Box 1, where missingness assumptions are depicted as occurring over a continuum). Briefly, suppose the full set of data planned to be collected is Y full ¼ (Y obs , Y miss ), where Y obs is actually observed and Y miss is unfortunately missing, with M denoting missingness indicators that separate Y full into its observed and missing components, and the joint distribution of outcomes and missingness being P(Y full , M). Under missing completely at random (MCAR): P(MjY full ) ¼ P(M) and MAR: P(MjY full ) ¼ P(MjY obs ), the missing outcome values (Y miss ) are assumed to be unrelated to M, whereupon missingness would be ignorable. We include the additional description of 'extended-MAR' (XMAR) [18][19][20] and corresponding partly ignorable missingness assumptions, which are weaker than missing at random (MAR), 21 where it becomes reasonable to assume that part of the missingness may be ignored by incorporating additional information, denoted by g(M, Y miss ); under XMAR: PfMjY full g ¼ PfMjY obs , g(M, Y miss )g. The XMAR designation helps to clarify distinctions between standard mixed models, where missingness is ignorable under MAR, the family of 'conventional' SPMs described here, which are Box 1 Missing-data assumptions and statistical methods • MCAR: missing completely at random; completely ignorable missingness; P(MjY full ) ¼ P(M) • ᭺ Assumption: Missingness is completely random and not related to any observed or unobserved data.
• ᭺ Example: Unobserved cognitive-decline outcomes occur due to the study ending.
• MAR: missing at random; conditionally ignorable missingness; P(MjY full ) ¼ P(MjY obs ) • ᭺ Assumption: Missingness only depends on observed data and is unrelated to any unobserved data.
• ᭺ Example: Unobserved cognitive declines among participants dropping out of a study are expected to be similar to the observed declines among participants remaining in the study. • ᭺ Analyses: Mixed models, 23 multiple imputation (MI), 5 weighting (IPW) 6 and EM-based. 4 • ᭺ Notes: Rich sets of longitudinal outcomes and predictors are often used to support MAR assumptions.
• XMAR: extended missing at random; partly ignorable missingness; PfMjY full g ¼ PfMjY obs , g(M, Y miss )g • ᭺ Assumption: Missingness depends on observed data and a function of the missing values, g(M, Ymiss). 21 • ᭺ Example: Unobserved cognitive declines among participants with a censoring diagnosis of dementia are expected to follow a latent trajectory that is associated with their dementia-censoring time. • ᭺ Analyses: SPM ('conventional-SPM' [18][19][20] extended MI. 24 • ᭺ Notes: Overall missingness is postulated into ignorable and non-ignorable structures, allowing more tractable handling of potentially remaining informative effects. Additional external data sources containing information related to the outcome and the reason(s) for the outcomes being missing can be essential.
• ᭺ Example: Participants drop out of a study because they experience steeper cognitive declines than otherwise similar participants who remain in the study, potentially regardless of dementia ascertainment. • ᭺ Analyses: pattern-mixture models (PMMs), 4,16 selection models (SEMs) 4,16 and generalized shared-parameter models (GSPMs). 19 • ᭺ Notes: MNAR situations can induce severe bias; analyses only succeed in eliminating bias under additional unverifiable assumptions, making sensitivity analyses an important strategy. 25 both are easy to implement. In the ARIC example, though, participants have no further cognitive assessments after dementia is diagnosed. The unobserved cognitive data for the dementia participants is likely far lower than the observed cognitive data for the non-dementia participants who remained in the study, even if these participants had otherwise similar observed characteristics (age, sex, baseline cognition, etc.). Basic MCAR or MAR structures are therefore unlikely, whereby standard GEE and mixed-model analyses could yield biased results. We examine an SPM approach in the next section.

Methods
Three main modelling approaches for informatively missing data have arisen: selection (SEM), pattern-mixture (PMM) and shared-parameter (SPM) models. 16,20,27 SEM and PMM approaches are defined by alternative decompositions of the outcome and missingness processes; both have been detailed extensively, are useful for examining MNAR and have been recommended by statisticians often, but neither is widely used. SPMs can be expressed in either SEM or PMM forms 20 but, in an SPM, the outcome and missingness processes are both conditioned on latent variables, after which forms of independence are often assumed. 18,19 SPMs received less early attention due to both higher computational demands and more uncertainty in influences of the latent assumptions. Clarity in SPM assumptions, generalizations that allow further sensitivity analyses and advances in computation have allowed SPMs to become more attractive. We focus here on SPM methods given their rise in use, flexibility, extendibility and progress in computational approaches.

The SPM
SPMs extend and connect two of the most widely used statistical models: mixed models for longitudinal data and event/survival models for event-time data. Mixed models 23 are commonly used to study trends in repeated outcomes measured over time, such as cognitive decline. As discussed above, standard mixed models inherently make a MAR assumption. Mixed models conceptualize a person as having inherent, underlying, 'latent characteristics' that account for the correlation in their repeated measurements. For example, one participant may exhibit 'worse cognition' (e.g. a lower baseline cognitive score and faster cognitive decline) than another participant, even when all measured predictors (age, gender, education, etc.) are the same. These underlying, 'latent cognition' characteristics are commonly included in the mixed model as person-specific baseline cognitive scores and cognitive declines (random intercepts and slopes).
Event/survival models, 28 such as proportional-hazards models (PHMs), are commonly used to study time-to-event outcomes, such as incident dementia or death. Here, interest is often in factors that predict faster rates of the event occurring ('hazards') and event times are only known for those participants who experienced the event. Participants who have not yet experienced an event are 'censored' at their last observed follow-up time.
SPMs 10,11,18-21,29 are a type of statistical 'joint model', 30,31 where information is shared between two or more analysis 'sub-models' as shown in Box 2. An SPM uses a mixed model for longitudinal outcomes (e.g. cognition over time) and an event/survival model for an important censoring event directly related to the outcome (e.g. dementia). Information is shared between these longitudinal and event sub-models through a set of latent characteristics (e.g. 'random effects'). The event sub-model uses parameters called 'loading factors' to connect censoring events to the longitudinal sub-model outcomes. When the loading parameters are estimated to be non-zero, there is evidence that the longitudinal and event subprocesses are connected, and the SPM (XMAR) model may be preferred over simpler GEE (MCAR) or GLMM (MAR) models. Whereas the conventional SPM in Box 2 assumes the outcomes and missingness are independent after conditioning on a single set of random effects (an XMAR assumption), the family of Generalized-SPM (GSPM) allows for further MNAR specifications with the inclusion of additional sets of random effects. [18][19][20][21] In our brain-atrophy and cognitivedecline example, dementia represents a potentially egregious source of influential missingness. If the dementia loading factor q 0 in Box 2 is estimated as negative in the SPM, this would translate into participants with 'poorer underlying cognition' developing dementia faster, leading to additional missing cognitive data (XMAR at a minimum). It is well known that the observed data cannot be used to distinguish MNAR situations from MCAR/MAR/ XMAR (since the distinction depends inherently on the data that are missing). However, in some cases, the observed data help to examine whether the inclusion of additional covariates supports MAR assumptions over MCAR. Analogously, one can use the observed data and conventional metrics such as the Bayesian Information Criteria (BIC) to examine whether the SPM inclusion of the additional censoring event sub-models supports potential XMAR assumptions over the simple mixed-model MAR assumptions. 32 We demonstrate this in our motivating example. Additional technical details on our GEE, mixed and SPM formulations, as well as further explanations of GSPM extensions and partly ignorable structures, are provided in Supplementary Appendix 1, available as Supplementary data at IJE online.

Results
Participants who completed cognitive assessments were more likely to have higher baseline cognitive function, more education and younger age, and were generally in better health (Table 1). Missingness mechanisms are likely at least MAR (not MCAR), since missingness is related to observed covariates. Importantly, brain atrophy was heavily related to completion status; 42% of participants without atrophy completed cognitive assessments vs 33% completing among those with atrophy (Supplementary  Table S1, available as Supplementary data at IJE online). Figure 1 shows cognitive trajectories from the observed data (panel A) and Kaplan-Meier functions for censoring dementia events (panel B), stratified by brain-atrophy status. Are the cognitive declines different? Whereas the plotted lines in panel A may appear fairly parallel, they are based on the observed cognitive outcome data (MAR assumption). Since participants with atrophy had lower baseline cognitive scores, higher dementia hazards and more missing data, the plotted MAR lines in panel A may be misleading if XMAR or MNAR missingness is at play. Table 2 compares estimates of brain-atrophy associations with 20-year cognitive decline across different missingness assumptions. Comparing participants with brain atrophy vs those without, GEE estimates show an unsupported additional cognitive decline of À0.081 standard deviations (SDs) (95% confidence interval: À0.204, 0.042) p ¼ 0.199 (under exchangeable correlation; similar for independent, AR1 and unstructured). These estimates are only valid under the strongest assumption (MCAR) which, per above, is highly improbable. The standard mixedmodel estimates the additional cognitive decline with brain atrophy to be a bit stronger at À0.111 SD (À0.235, 0.012) p ¼ 0.077. The mixed-model estimates are valid under the MAR assumption that, although there may be differences between the informative dementia-censoring events and cognitive trajectories were supported statistically. Loadingfactor estimates showed that both lower baseline cognition (q 0 ) HR ¼ 2.57 (1.95, 3.39) per SD and steeper cognitive declines (q 1 ) HR ¼ 1.08 (1.07, 1.10) per SD were associated with increases in the hazards of dementia-censoring events. Figure 2 shows the observed and predicted values for eight participants with different atrophy, dementia and • Shared latent effects: • ᭺ Random effects: random intercepts, slopes, etc.
• ᭺ Example: random intercept, b 0i $ N(0,s 2 ) • Principal interpretations: • ᭺ b 3 : difference in cognitive decline between those with brain atrophy compared with those without brain atrophy • ᭺ b 0i : subject-specific (latent) random intercept representing a person's underlying difference in baseline cognition from the baseline of otherwise similar people; often, but not always, assumed to be normally/Gaussian-distributed • ᭺ exp(a 1 ): dementia hazard ratio associated with brain atrophy • ᭺ q 0 : loading factor describing association of subject-specific (latent) baseline cognitive scores with dementia hazards • ᭺ s: heterogeneity (standard deviation) of subject-specific (latent) baseline cognitive scores across individuals • Conventional SPM specification: • ᭺ For subjects i ¼ 1..N, measured at time points j ¼ 1..n i , with predictor sets X ij , Z ij , X* ij , Z* ij , random-effects vector b i , parameter vectors b, a, q, and covariance matrix s: where T Ã i is the true time to a censoring event with potential information about Y ij and C i is an independent stochastic censoring mechanism. Letting h 0 (t) denote the baseline hazard function and X Ã ij , Z Ã ij , being predictor sets potentially different from those in Sub-model 1: missingness status. Panels A and B show non-dementia participants 1 (without atrophy) and 2 (with atrophy); both have complete data and similar predicted values comparing mixed-model results to SPM. Non-dementia participants 3 (without atrophy) and 4 (with), both having missing data, show SPM predictions closer to their observed values than the mixed model. Panels C and D show similarly structured participants with dementia, three of whom (participants 6, 7 and 8) all show closer fits to their observed data with the SPM. The fourth (participant 5) with complete data has similar mixed and SPM predictions. These eight participants are indicative of overall results, where correlations between SPM predictions and observed values were slightly higher with SPM (Pearson's ¼ 0.954) vs mixed models (Pearson's ¼ 0.945), with improvements in all eight atrophy by dementia by completeness groups and the largest improvements being in those with incomplete data (Supplementary Figure S1, available as Supplementary data at IJE online). Here, even when considering observed data metrics (AIC/BIC, loading-factor 95% CIs, observed vs predicted correlations), the XMAR assumptions of the SPM are supported over the MAR assumptions of the standard mixed model.

Simulated data and reproducible code to run SPMs
To facilitate the broader implementation of SPMs, simulated data based on the ARIC cognitive-decline example and reproducible programs are provided in the Supplementary Materials, available as Supplementary data at IJE online, and at github link https://github.com/ MichaelGriswold/SPM. The simulation approach and data are found in Supplementary Appendix 2, available as Supplementary data at IJE online. General 'pseudo-code' and tips to help convergence are provided (Supplementary Appendix 3, available as Supplementary data at IJE online), along with specific code for Stata, SAS and R programming languages (Supplementary Appendices 4-6, available as Supplementary data at IJE online) and corresponding results (Supplementary Tables S5-S7, available as Supplementary data at IJE online). Others have used simulations to prove statistical properties of SPMs; our simulated data are meant only to provide a starting place for code implementation.

Discussion
Missing data in longitudinal studies can significantly bias study results. Many techniques exist to explore the degree of this potential bias, but routine implementation has been slow, despite guidelines recommending inclusion. We have described how an underutilized method, i.e. the SPM, can be used to examine potentially important missing-data  effects, using a real-world example to demonstrate the approach. We have additionally provided reproducible data and code to facilitate and broaden SPM use. In accordance with other recommendations, we suggest the sensitivity examinations to be included routinely at least in Supplementary Materials, available as Supplementary data at IJE online. When sensitivity analyses suggest that conclusions may depend on missing-data assumptions, discussion of how primary results may be affected by important missingness elements is prudent.
Another feature of joint models is their flexibility for extensions. Here, we have focused on concepts to outline a simple, conventional SPM specification (see Supplementary Appendix 1, available as Supplementary data at IJE online, for additional technical details). Extensions are vast and include useful aspects such as: adding multiple longitudinal sub-models (e.g. simultaneously modelling declines in language, memory and executive function, or even multiple individual cognitive instruments, similar to factor analysis or structural equation modelling); adding multiple-event sub-models as competing risks (e.g. times to dementia, nursing-home institutionalization, death); sharing time-dependent predictions from the longitudinal sub-models (vs only random effects); including flexible random-effect distributions (such as mixture models), including additional random effects for MNAR examination, and many other augmentations. The broader class of joint models is an active area of statistical research.
Deciding how to deal with deaths deserves special attention in any longitudinal data analysis. 8,9,16,33 Mixed models implicitly borrow information from the observed data and attribute it to the missing data. This property extends the missing-data assumptions from MCAR to MAR but does not distinguish between missing data for those who remain alive vs for those who have died. Others have described mixed models as conveying inferences on an 'immortal cohort'. We used a simple SPM with a single longitudinal outcome (global cognition) and a single censoring event (dementia) in order to describe and illustrate key SPM aspects. When we extended the SPM to incorporate competing risk sub-models for both dementia and death events, associations between brain atrophy and cognitive decline were even stronger and supported by BIC and loading parameter estimates similar to those described above (see Supplementary Appendix 7, available as Supplementary data at IJE online, for description and Figure 2 Predicted vs observed values for (A) participants without dementia using standard linear mixed-model (LMM) estimates, (B) participants without dementia using shared-parameter model (SPM) estimates, (C) participants with dementia using standard LMM estimates and (D) participants with dementia using SPM estimates. Participants 1 and 5 have complete data and brain atrophy; participants 2 and 6 have complete data and no atrophy; participants 3 and 7 have incomplete data with atrophy; and participants 4 and 8 have incomplete data without atrophy. LMM, linear mixed model; SPM, shared-parameter model; MAR, missing at random; XMAR, extended missing at random. results). Additional GSPM formulations with assumptions that admit 'non-future dependence' 19 could be used to explore death effects further. Accounting for death effects is challenging by any method, particularly when informative missingness is involved, and, although this has been an active area of statistical research, a broader platform of 'mortal-inference' methods that assess sensitivity to full MNAR would be valuable.
We follow trends in stating that the conventional SPM can be conceptualized as an 'extended-MAR' (XMAR) assumption where, given the random effects, missingness is assumed to no longer depend on the unobserved longitudinal measurements or future times (events or censoring). 19 Under this partly ignorable assumption, a conventional SPM can constitute a valuable missingness examination as shown in our example. However, we note that the partly ignorable assumption is still an assumption and, although it is more flexible than MAR, even more insidious MNAR situations not addressed by a conventional SPM can exist, where differences between observed and missing observations are uninformed by the observed data. Such situations can only be addressed in an extended sensitivity analysis, where one elicits a scientifically plausible range for differences in observed vs missing values and then examines how findings vary within this plausible range. 19,25,30 Recent work for SPM approaches is based on 'grounding' the sensitivity analyses on the conventional SPM that we have detailed. 26 Such analyses are more complex than those illustrated in the present work and could indicate additional sensitivities not identified here. However, assuming that additional informative missingness effects would act in the same direction as dementia (such as those shown by the additional death model, Supplementary Appendix 7, available as Supplementary data at IJE online), the stated SPM results here would be conservative if additional effects above and beyond those accounted for in the SPM remain.

Conclusion
Results from standard GEE (MCAR) and mixed models (MAR) can be biased if the missing-data assumptions are incorrect. It is difficult or impossible to tell whether an MNAR situation is correct because the data needed to test the assumption are, by definition, missing. However, just as observed data may support an MAR assumption over an MCAR one, an XMAR assumption may extend the utility of other observed data related to the missing outcomes to show utility over MAR. Recent national recommendations regarding missing-data state: 'Examining sensitivity to the assumptions about the missing data mechanism should be a mandatory component of reporting.' 2 Sensitivity analyses should examine scientifically plausible situations with appropriate approaches rather than using single imputation strategies that are known to bias both estimates and standard errors such as best-case, worst-case or last-observation-carried-forward (LOCF). 2,25 Our suggested approach is to (i) identify scientifically plausible informative censoring events (like dementia for cognitive decline), then (ii) incorporate these events in an SPM alongside a longitudinal mixed model, (iii) compare the SPM (XMAR) results to estimates obtained from separate GEE (MCAR) and mixed models (MAR), and finally (iv) consider additional MNAR structures that may still remain using GSPM, SEM or PMM methods. This approach uses multiple models, which are all now becoming more and more computationally tractable, to provide a better understanding of potential mechanisms and influences of missing data. To place an addendum on the famous quote by George Box: 'All models are wrong, but some models are useful'-and many useful models are illuminating.

Supplementary data
Supplementary data are available at IJE online.

Ethics approval
The institutional review board at each centre approved the study and all participants provided written informed consent at each visit.

Data availability
Study protocols and data for the ARIC and ARIC Brain MRI Study may be obtained by approved persons through written agreements with the ARIC Steering Committee and the research sponsor (National Institutes of Health, National Heart, Lung, and Blood Institute) (e-mail: tmos-ley@umc.edu). Reproducible Research example data sets are available at https://github.com/MichaelGriswold/SPM or via e-mail from Dr Griswold (mgriswold@umc.edu).