Joint Modeling of Individual Trajectories, Within-Individual Variability, and a Later Outcome: Systolic Blood Pressure Through Childhood and Left Ventricular Mass in Early Adulthood

Abstract Within-individual variability of repeatedly measured exposures might predict later outcomes (e.g., blood pressure (BP) variability (BPV) is an independent cardiovascular risk factor above and beyond mean BP). Because 2-stage methods, known to introduce bias, are typically used to investigate such associations, we introduce a joint modeling approach, examining associations of mean BP and BPV across childhood with left ventricular mass (indexed to height; LVMI) in early adulthood with data (collected 1990–2011) from the UK Avon Longitudinal Study of Parents and Children cohort. Using multilevel models, we allowed BPV to vary between individuals (a “random effect”) as well as to depend on covariates (allowing for heteroskedasticity). We further distinguished within-clinic variability (“measurement error”) from visit-to-visit BPV. BPV was predicted to be greater at older ages, at higher body weights, and in female participants and was positively correlated with mean BP. BPV had a weak positive association with LVMI (10% increase in within-individual BP variance was predicted to increase LVMI by 0.21%, 95% credible interval: −0.23, 0.69), but this association became negative (−0.78%, 95% credible interval: −2.54, 0.22) once the effect of mean BP on LVMI was adjusted for. This joint modeling approach offers a flexible method of relating repeatedly measured exposures to later outcomes.


3
Web Table 1: estimates from univariate outcome models, analysing SBP as the outcome, presenting the mean and credible intervals for the posterior parameter estimates of the regression coefficients (except where indicated). a Linear spline terms, corresponding to change per year for age ≤ 12, and for age >12, respectively. b Reference category for mother's highest education qualification: CSE / None. 5 c mean BP denotes the individual-level, between-clinic, random effects for the intercept (at mean age 11.3 years) in the mean function for SBP; BP slope denotes the individual-level, between-clinic, random effects for age in the mean function for SBP; log(BPV) denotes the individual-level, between-clinic, random effects on the log-scale of the within-individual variance. Web Table 2.
As Web Table 2 indicates, when modelling mean clinic BP as the outcome (in two-level models), the estimated associations were substantively similar to those found in the three-level models (analysing the same sample) in the main manuscript, although with some differences. For example, the SD of the log(BPV) random effects is estimated to be smaller in the two-level models, indicating that individuals are estimated to differ to a lesser degree in how erratic their BP measurements are when mean clinic BP is modelled as the outcome. In addition, the fixed intercept in the log(BPV) function is larger in the two-level models, indicating that within-individual variance is estimated to be greater, whilst the estimates of the fixed effects in that function are smaller. 6 Web Web Table 3 As Web Table 3 indicates, a univariate outcome model of SBP fitted to all those for whom at least one mean SBP measurement, from at least one of the six clinics, was available (i.e. regardless of whether the individual later had their LVMI estimated) found associations broadly in keeping with those in the smaller subset modelled in the main manuscript. The SD of the log(BPV) random effects was a little higher in this larger sample, and age and sex were estimated to have smaller associations with log(BPV), and weight and height larger associations, as well.
Web BP slope denotes the individual-level, between-clinic, random effects for age in the mean function for SBP; log(BPV) denotes the individual-level, between-clinic, random effects on the log-scale of the within-individual variance.

12
Web Table 4 As Web Table 4 indicates, when sex is the only covariate (analysing the same sample as models presented in main manuscript), females' BPV was predicted to be -28.31% (-33.09%, -23.21%) lower than that for males.
Web Table 5 Web Table 5: Estimates from the joint models presented in Table 2 in the main manuscript, but with original, non-transformed, regression coefficients presented for all parameter estimates. BP slope denotes the individual-level, between-clinic, random effects for age in the mean function for SBP; log(BPV) denotes the individual-level, between-clinic, random effects on the log-scale of the within-individual variance; i.e. these are the equivalent of 0 , 1 and 2 in equation 2, respectively.
Web Table 6 Web Table 6: Full estimates from the joint models presented in Table 3 in the main manuscript: i.e. estimates from joint models in which mean BP, BP slope, and log(BPV) are alternately included as exposures for log(LVMI). The mean and credible intervals for the posterior parameter estimates of the regression coefficients (except where indicated). BP slope denotes the individual-level, between-clinic, random effects for age in the mean function for SBP; log(BPV) denotes the individual-level,  Table 7 Web Table 7: Estimates from the joint models presented in Table 3

Introduction
In this Web Appendix, a series of models are fitted, of increasing complexity, to simulated data, concluding with a three-level joint model with a repeatedly-measured outcome and an individual-level outcome, and with complex within-individual variability.
The R script below fits models in Stan. It uses 4 chains for each model, and otherwise the default number of total iterations and burnin/warmup iterations (and thinning) are used (see ?stan for details). Dataset sample sizes and the number of chain iterations used can naturally be adjusted to facilitate better estimation, to run these example models more/less quickly, etc.
Stan is called via the R package rstan. To install rstan, please see https://github.com/stan-dev/rstan/wiki/ RStan-Getting-Started. Stan offers an efficient method of exploring complex posterior probabilities through its use of Hamiltonian Monte Carlo (HMC) and no-U-turn samplers (NUTS).
The Stan models use a non-centered parameterisation for the random effects, fitting them as independent standard Normals (N(0, 1)). Cholesky factorisation allows the covariance matrix for the random effects to be removed from the prior and recovered in the transformed parameters block. Since these optimisations change the shape of the posterior the HMC algorithm samples from, they can improve the efficiency with which multilevel models are estimated. Vectorisation (cf. the use of for loops, for example) also contributes to optimising the model.
When the resulting Cholesky factor of the random effect covariance matrix is multiplied by the independent (standard Normal) random effects, z_u, the correlated random effects u are recovered. (z_u is a matrix with n_u rows and J columns, where n_u is the number of random effects allowed to covary, and J denotes the number of individuals).
There are a wide variety of references providing further guidance and information, a small selection of which are listed at the end of this Web Appendix (4-8).

A note on number of repeated measures
With regard to how many repeated measures are needed to estimate these models, then it can be helpful to examine what complexity of model could be fitted if the measures were taken at the same age for all individuals. For example, in the case of three measures per person, each at exactly the same ages, then three means and six variance/covariances could be estimated from the data. Such a dataset would be compatible with a model with individual-level random effects for mean and slope, plus random error, as this model estimates two means (intercept and slope) and four variance/covariances (variance for mean and slope, covariance between them, random error). However, we could not estimate a model including random terms for mean, slope and BPV, plus random error, since this model estimates seven variance/covariances (the variation for each of mean, slope, BPV and random error (four variances), plus the three covariances between mean, slope and within-individual variability). However, we could estimate the more elaborate model in the case of four measures per person (each at exactly the same ages), as ten variance/covariances could be estimated from these data.

Specifying priors
To be adjusted as appropriate.
In this particular example, for the fixed effects in the mean function for y1, we use the equivalent as had the variables been standardised -of Normal(mean(y1), SD = 10) for intercept and Normal(0, SD = 2.5) for covariates (9).
For the coefficients in the function for the within-individual variance, we also use Normal priors. For the location (mean) of the prior for the coefficient of the intercept we use an estimate of the within-individual SD gleaned from a simpler (random slopes) model, with zero for the location (mean) of the prior for the coefficient of the covariate. With regard to the scale of these priors, in this example we have chosen a value which would predict sigma_e of between c. 0.1 to 42 (i.e. very weakly informative, in this context) with +/-2 SDs of a standard Normal predictor.

A note on library(brms)
Note that a 2-level model with complex within-individual variability, including a random effect, can also be fitted via the brms package (10, 11) in R.
For instance, the following example would fit a random intercept and slope (for covariate x1) for the mean of repeatedly-measured outcome y1, as well as allowing the within-individual SD (cf. variance) to depend on the covariate and a random effect. Here the s (although could be any value, as long as same in each set of parentheses) indicates that the random effects are to be modelled as correlated.
Whilst such a model can be succinctly fitted in brms, it is not currently possible to fit joint multilevel models in which the outcomes are at different levels and random effects are fitted as predictors, via the package.

Specifying priors
To be adjusted as appropriate (and again assuming using the occasion-level data simulated earlier).
In this particular example, for the fixed effects in the mean function for y1, we use the equivalent as had the variables been standardised -of Normal(mean(y1), SD = 10) for intercept and Normal(0, SD = 2.5) for covariates (9). We do the same for the coefficients in the equation for the individual-level outcome, y2, as well -we use the random effects from the dataset we simulated to get an idea of their SDs, but can of course use estimates of the SD of the random effects from simpler (e.g. univariate outcome) models fitted to actual data for same purpose.
For the coefficients in the function for the within-individual variance, we also use Normal priors. For the location (mean) of the prior for the coefficient of the intercept we use an estimate of the within-individual SD gleaned from a simpler (random slopes) model, with zero for the location (mean) of the prior for the coefficient of the covariate. With regard to the scale of these priors, in this example we have chosen a value which would predict sigma_e of between c. 0.1 to 42 (i.e. very weakly informative, in this context) with +/-2 SDs of a standard Normal predictor.

Introducing an additional, lower level
Equation Simulating data

Specifying priors
To be adjusted as appropriate.
In this particular example, for the fixed effects in the mean function for y1, we use the equivalent as had the variables been standardised -of Normal(mean(y1), SD = 10) for intercept and Normal(0, SD = 2.5) for covariates (9). We do the same for the coefficients in the equation for the individual-level outcome, y2, as well -we use the random effects from the dataset we simulated to get an idea of their SDs, but can of course use random effects from simpler models fitted to actual data for same purpose.
For the coefficients in the function for the within-individual variance, we also use Normal priors. In this example with simulated data we use the known residual SD for the location (mean) of the prior for the coefficient of the intercept, but with real data an estimate of the residual SD could be taken from a simpler model (e.g. random slope), for example, for the same purpose. Otherwise, zero is used for the location (mean) of the prior for the coefficient of the covariate. With regard to the scale of these priors, in this example we have chosen a value which would predict sigma_e of between c. 0.1 to 45 (i.e. very weakly informative, in this context) with +/-2 SDs of a standard Normal predictor.

launch_shinystan(stan_fit_shiny)
This Web Appendix provides further estimation details for models presented in main manuscript.

Chain lengths and convergence
Each model was fitted using four chains. Convergence to the target distribution was judged via visual diagnostics, the presence (and number and type) of any divergent transitions, and the value of split-̂ (with ≈ 1 suggesting convergence) (12).
We ran each chain for 15,000 iterations, including 5,000 warm-up. The chains were run for such a length that the effective sample size for each parameter of interest indicated a reasonable number of independent draws (≥400) from the posterior distribution. Note that the length of chain is very generous, but was set for the one or two models with relatively high autocorrelation for certain parameters.

Priors
For the fixed effects we used Normal priors, whilst for the SDs for the individual-level random effects, and for the occasion-level SD in models in which this was assumed constant, we used half-Cauchy priors (13,14), and for the correlation matrices for the random effects at the individual-level we used an LKJcorr prior (4,15). We used values for these priors designed to be weakly-informative (6): further details follow.
For the fixed effects in the mean function for the repeatedly-measured outcome, and for the mean function for the individual-level outcome, we used the equivalenthad the variables been standardisedof Normal( , σ = 10) for the intercept and Normal(0, σ = 2.5) for the other predictors, re-scaling as appropriate (9).
Half-Cauchy( 0 = 0, γ = 10) priors were used for the SDs for the individual-level random effects, and for the occasion-level SD in models in which this was assumed constant, whilst a prior of LKJcorr(2) was used for the correlation matrices for the random effects at the individual-level (4,(13)(14)(15).
For the fixed effects in the within-individual variability function we used Normal priors. For the intercept, the mean of the prior was the estimated within-individual variance (log-transformed) from a simpler (random slope) model, whilst the priors for the covariates had a mean of zero. For the SD of these priors, we used the equivalenthad the variables been standardised -of σ = 2.25 for both the intercept and covariates. Since a change of +/-2SDs (around the mean for the prior of the intercept) would have predicted a within-individual SD of between 0.6 and 51.3 (i.e. 4.6% of distribution of prior would predict values above/below this range), this was judged very weakly informative.