Grace Y. Yi, A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates, Biostatistics, Volume 9, Issue 3, July 2008, Pages 501–512, https://doi.org/10.1093/biostatistics/kxm054
Abstract
Longitudinal data often contain missing observations and error-prone covariates. Extensive attention has been directed to analysis methods to adjust for the bias induced by missing observations. There is relatively little work on investigating the effects of covariate measurement error on estimation of the response parameters, especially on simultaneously accounting for the biases induced by both missing values and mismeasured covariates. It is not clear what the impact of ignoring measurement error is when analyzing longitudinal data with both missing observations and error-prone covariates. In this article, we study the effects of covariate measurement error on estimation of the response parameters for longitudinal studies. We develop an inference method that adjusts for the biases induced by measurement error as well as by missingness. The proposed method does not require the full specification of the distribution of the response vector but only requires modeling its mean and variance structures. Furthermore, the proposed method employs the so-called functional modeling strategy to handle the covariate process, with the distribution of covariates left unspecified. These features, plus the simplicity of implementation, make the proposed method very attractive. In this paper, we establish the asymptotic properties for the resulting estimators. With the proposed method, we conduct sensitivity analyses on a cohort data set arising from the Framingham Heart Study. Simulation studies are carried out to evaluate the impact of ignoring covariate measurement error and to assess the performance of the proposed method.
1. INTRODUCTION
Longitudinal studies are commonly conducted in the health sciences, biochemistry, and epidemiology. Although longitudinal studies are designed to collect data on every individual at each assessment, missing observations often arise for various reasons. There has been increasing interest in valid inference methods for longitudinal data with missing values. Yet, there is relatively little work on investigating the effects of covariate measurement error on estimation of the response parameters, especially on simultaneously accounting for the biases induced by both missing values and mismeasured covariates. Measurement error in covariates is, however, a typical feature of longitudinal data. Sometimes, covariates of interest may be difficult to observe precisely due to physical location or cost. Sometimes, it is impossible to measure covariates accurately due to the nature of the covariates. In other situations, a covariate may represent an average of a certain quantity over time (e.g. cholesterol level (CHOL)) and any practical way of measuring such a quantity necessarily features measurement error.
It has been recognized and well documented that, in other contexts, ignoring covariate measurement error may lead to severely biased results. For example, Fuller (1987) pointed out that the slope in a simple linear regression model may be attenuated if covariate measurement error is ignored. For survival data analysis, Prentice (1982), Li and Lin (2003), Yi and He (2006), and Yi and Lawless (2007), among others, investigated measurement error effects and developed inference methods to correct for the bias resulting from measurement error in covariates. For an overview of measurement error problems, see Carroll and others (2006).
In this paper, we investigate the impact of covariate measurement error on longitudinal data analysis. This work is motivated by the need for methods that simultaneously address both missingness and measurement error, which are often present in longitudinal data. For example, a data set arising from the Framingham Heart Study contains error-prone covariates and a portion of subjects who drop out of the study during the follow-up period. An objective of this study is to understand how obesity is associated with covariates such as age, blood pressure, and CHOL. It is well known that individual measurements of blood pressure and CHOL involve substantial measurement error. Here, the true measurements of these covariates are defined as their long-term average values. The measurement at a specific time point would fluctuate with time, seasonal variation, and other confounding factors. The features of measurement error in covariates and dropout present a challenge to the existing inference methods. In this paper, we develop an inference method for analyzing longitudinal data that have both dropout and error-contaminated covariates. We utilize marginal methods to model the response process. A functional method for the measurement error process is employed. Such a method is appealing because it does not require the specification of the covariate distribution.
The remainder is organized as follows. Notation and model setup are introduced in Section 2. In Section 3, we discuss a simulation–extrapolation (SIMEX) method to account for both dropout and covariate measurement error. A data set arising from the Framingham Heart Study is analyzed with the proposed method and the results are reported in Section 4. In Section 5, we conduct simulation studies to assess the performance of the proposed method as well as the impact of ignoring measurement error in covariates. General discussion is included in Section 6.
2. NOTATION AND MODEL SETUP
2.1 Response process
Longitudinal data analysis may typically be conducted based on marginal, random-effects, and transitional models (Diggle and others, 2002). In this paper, we focus on marginal analysis with the primary interest centered on the marginal mean parameters. Let Yij be the response variable for subject i at time point j, xij be the covariate vector subject to error, and zij be the vector of error-free covariates, i=1,2,…,n and j=1,2,…,m. Denote Yi=(Yi1,Yi2,…,Yim)′, xi=(xi1′,xi2′,…,xim′)′, and zi=(zi1′,zi2′,…,zim′)′. Let μij=E(Yij|xi,zi) and vij=var(Yij|xi,zi) be the conditional expectation and variance of Yij, respectively, given the covariates xi and zi.

Here, we assume that the dependence of mean μij on the subject-level covariates xi and zi is completely reflected by the time-specific covariates xij and zij, that is, E(Yij|xi,zi)=E(Yij|xij,zij). This assumption has been widely adopted in modeling longitudinal data, see Diggle and Kenward (1994), Robins and others (1995), Cook and others (2004), and Yi and Thompson (2005), for example. This assumption was noted in Pepe and Anderson (1994) and was justified from the viewpoint of formulating unbiased estimating functions. Model (2.1) may consist of baseline covariates such as gender, age, and treatment status or time-varying covariates. With an exogenous covariate process (i.e. a time-varying covariate that is not predicted by past outcomes), properly including current or lagged values of the covariates may meet this assumption (e.g. Miglioretti and Heagerty, 2004). Both cross-sectional and longitudinal effects of time-varying covariates may be featured in model (2.1). See Diggle and others (2002, Chapter 12) for more detailed discussion.
2.2 Missing data process
Let Rij be 1 if Yij is observed and 0 otherwise. Let Ri=(Ri1,Ri2,…,Rim)′ be the vector of (non)missing data indicators, i=1,2,…,n. Dropouts or monotone missing data patterns are considered here. That is, Rij=0 implies Rij′=0 for all j′>j. Without loss of generality, assume that Ri1=1 for every subject i. According to the dependence of the missing data process on the response process, missing data mechanisms may be classified as missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR) (e.g. Kenward, 1998).
In this paper, we assume an MAR mechanism for the dropout process. That is, given the covariates, the conditional distribution f(ri|xi,zi,yi) depends on the observed response components yiobs and the covariates only. Let λij=P(Rij=1|Ri,j−1=1,xi,zi,yi) and πij=P(Rij=1|xi,zi,yi). Note that $\pi_{ij}=\prod_{t=2}^{j}\lambda_{it}$. Let Hijy={yi1,…,yi,j−1} denote the response history up to (but not including) time point j.
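The relationship between the conditional probabilities λij and the marginal probabilities πij, together with the monotone dropout pattern, can be sketched numerically. This is a minimal illustration only; the function names `observation_probabilities` and `is_monotone` and the array layout (one row per subject, one column per time point) are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def observation_probabilities(lam):
    """Return pi[i, j] = product over t = 2..j of lam[i, t] (1-based time index).

    lam is an (n, m) array; its first column is ignored because R_i1 = 1
    is assumed for every subject, so pi[i, 0] = 1.
    """
    lam = np.asarray(lam, dtype=float).copy()
    lam[:, 0] = 1.0
    return np.cumprod(lam, axis=1)

def is_monotone(R):
    """Check that R_ij = 0 implies R_ij' = 0 for all later time points j'."""
    R = np.asarray(R)
    # Indicators may only stay equal or drop from 1 to 0 along each row.
    return bool(np.all(np.diff(R, axis=1) <= 0))

lam = np.array([[0.0, 0.9, 0.8]])
pi = observation_probabilities(lam)   # cumulative products 1, 0.9, 0.9 * 0.8
```

The cumulative product makes explicit why πij can only decrease in j, which is what drives the inverse probability weights upward at later visits.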

Let Mi be the random dropout time for subject i and mi be a realization, i=1,2,…,n. Define $L_i(\alpha)=(1-\lambda_{im_i})\prod_{t=2}^{m_i-1}\lambda_{it}$, where λit is determined by model (2.2). Let $S_i(\alpha)=\partial\log L_i(\alpha)/\partial\alpha$ be the vector of score functions contributed from subject i. Denote θ=(α′,β′)′ and q=dim(θ).
2.3 Measurement error process

It is known that nonidentifiability is often a problem if model (2.3) is employed. For identifiability of model parameters, one needs a validation data set consisting of {Yij,xij,Wij,zij} or repeated measurements Wij to estimate the parameters associated with Σe. If neither validation data nor repeated measurements Wij are available, then one may conduct sensitivity analyses based on background information about the measurement process to assess the impact of different degrees of measurement error on estimation of β (Yi and Lawless, 2007). In this paper, error distribution parameters are assumed known.
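To make the measurement error setting concrete, the following sketch generates surrogate measurements under a classical additive error model, Wij = xij + eij with eij ∼ N(0, Σe), which is the form implied by the simulation step in Section 3 and the data generation in Section 5. The parametrization of Σe through (σ1, σ2, ρ) for two error-prone covariates is an assumption based on how those symbols are used in Sections 4 and 5; the function names are hypothetical.

```python
import numpy as np

def error_covariance(sigma1, sigma2, rho):
    """Assumed 2x2 form of Sigma_e: marginal SDs sigma1, sigma2, correlation rho."""
    off = rho * sigma1 * sigma2
    return np.array([[sigma1 ** 2, off], [off, sigma2 ** 2]])

def add_measurement_error(x, Sigma_e, rng):
    """Return W = x + e, with e drawn independently from N(0, Sigma_e).

    x has shape (..., p); one error vector is drawn for each leading index,
    for example for each (subject, time point) pair.
    """
    x = np.asarray(x, dtype=float)
    p = Sigma_e.shape[0]
    e = rng.multivariate_normal(np.zeros(p), Sigma_e, size=x.shape[:-1])
    return x + e

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3, 2))          # 5 subjects, 3 time points, 2 covariates
W = add_measurement_error(x, error_covariance(0.5, 1.0, 0.5), rng)
```

In a sensitivity analysis of the kind described above, the same code would simply be rerun over a grid of (σ1, σ2, ρ) values treated as known.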
3. INFERENCE PROCEDURES
3.1 Weighted estimation functions
The inverse probability weighted generalized estimating equation (IPWGEE) method is often employed to account for the bias induced by the incompleteness of data (Robins and others, 1995) when primary interest lies in the marginal mean parameters β in model (2.1). For i=1,2,…,n, let $D_i=\partial\mu_i'/\partial\beta$ be the matrix of the derivatives of the mean vector μi with respect to β and $\Delta_i=\mathrm{diag}(I(R_{ij}=1)/\pi_{ij},\ j=1,2,\ldots,m)$ be the weight matrix accommodating missingness, where I(·) is the indicator function. Let $V_i=A_i^{1/2}C_iA_i^{1/2}$ be the covariance matrix of Yi, where Ai=diag(vij,j=1,2,…,m) and Ci=[ρi;jk] is the correlation matrix with ρi;jk being the correlation coefficient of response components Yij and Yik, for j≠k, and ρi;jj=1. For i=1,2,…,n, define $U_i(\theta)=D_iV_i^{-1}\Delta_i(Y_i-\mu_i)$ and $H_i(\theta)=(U_i'(\theta),S_i'(\alpha))'$.
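Let $H(\theta)=\sum_{i=1}^{n}H_i(\theta)$. In the absence of measurement error, an estimator $\hat\theta$ of θ can be obtained by solving
$$H(\theta)=0. \qquad (3.1)$$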
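The weighted contribution Ui(θ) of a single subject can be sketched as follows. This is a minimal numerical illustration, not the paper's implementation: the function name `ipw_gee_contribution` is hypothetical, D is taken to have one row per component of β and one column per time point (matching Di=∂μi′/∂β), and unobserved responses are allowed to be stored as NaN.

```python
import numpy as np

def ipw_gee_contribution(D, V, R, pi, y, mu):
    """Compute U_i = D_i V_i^{-1} Delta_i (Y_i - mu_i), Delta_i = diag(R_ij / pi_ij).

    D: (p, m) derivative matrix, V: (m, m) working covariance,
    R: (m,) observation indicators, pi: (m,) observation probabilities,
    y: (m,) responses (entries with R = 0 may be NaN), mu: (m,) means.
    """
    R = np.asarray(R, dtype=float)
    pi = np.asarray(pi, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Residuals at unobserved time points get weight 0, so set them to 0
    # explicitly to keep NaN placeholders out of the linear algebra.
    resid = np.where(R == 1, y - mu, 0.0)
    weighted = (R / pi) * resid
    return np.asarray(D, dtype=float) @ np.linalg.solve(V, weighted)

U = ipw_gee_contribution(
    D=np.eye(2), V=np.eye(2),
    R=[1, 0], pi=[1.0, 0.5],
    y=[2.0, np.nan], mu=[1.0, 1.0],
)
```

Summing such contributions over subjects and stacking them with the dropout score Si(α) gives the estimating function whose root is sought.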
3.2 SIMEX approach
When measurement error is present in covariates xij, H(θ) is no longer unbiased if xij is replaced with its observed measurement Wij. A proper adjustment is needed to account for the bias induced by using Wij. In the sequel, we describe the SIMEX method for the adjustment. Let B be a given positive integer and Λ={λ1,λ2,…,λM} be a sequence of nonnegative numbers taken from [0,λM] with λ1=0.
1. Simulation step
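For i=1,2,…,n and j=1,2,…,m, generate $e_{ijb}\sim N(0,\Sigma_e)$ for b=1,2,…,B. Given λ ∈ Λ, set $W_{ij}(b,\lambda)=W_{ij}+\sqrt{\lambda}\,e_{ijb}$.
2. Estimation step
Obtain an estimate $\hat\theta(b,\lambda)$ by solving (3.1) with xij replaced by Wij(b,λ). This step can be quickly implemented by applying the SAS GENMOD procedure to the data set {Yi,Wi(b,λ),zi:i=1,2,…,n}. Let $\hat\Omega(b,\lambda)$ denote the model-based covariance matrix for $\hat\theta(b,\lambda)$. Denote by $\hat\Omega_r(b,\lambda)$ the rth diagonal element of $\hat\Omega(b,\lambda)$ and by $\hat\theta_r(b,\lambda)$ the rth component of $\hat\theta(b,\lambda)$, r=1,2,…,q. Define
$$\hat\theta_r(\lambda)=B^{-1}\sum_{b=1}^{B}\hat\theta_r(b,\lambda),\qquad \hat\Omega_r(\lambda)=B^{-1}\sum_{b=1}^{B}\hat\Omega_r(b,\lambda),$$
$$\hat S_r(\lambda)=(B-1)^{-1}\sum_{b=1}^{B}\{\hat\theta_r(b,\lambda)-\hat\theta_r(\lambda)\}^2,\qquad \hat\tau_r(\lambda)=\hat\Omega_r(\lambda)-\hat S_r(\lambda).$$
3. Extrapolation step
For r=1,2,…,q, fit a regression model to each of the sequences $\{(\lambda,\hat\theta_r(\lambda)):\lambda\in\Lambda\}$ and $\{(\lambda,\hat\tau_r(\lambda)):\lambda\in\Lambda\}$, respectively, and extrapolate it to λ=−1, with $\hat\theta_r$ and $\hat\tau_r$ denoting the corresponding predicted values. Then, $\hat\theta=(\hat\theta_1,\hat\theta_2,\ldots,\hat\theta_q)'$ is the SIMEX estimator of θ and $\sqrt{\hat\tau_r}$ is the associated standard error for the estimator $\hat\theta_r$ (r=1,2,…,q).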
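The three steps above can be sketched generically in Python. This is an illustrative outline under stated assumptions, not the SAS-based implementation described in the text: the function `simex`, its arguments, and the user-supplied `fit` callback (which must return a vector of estimates and a vector of model-based variances for a given perturbed covariate matrix) are all hypothetical. A polynomial extrapolant is used, and negative extrapolated variances are clipped at zero, which is a choice made here for the sketch.

```python
import numpy as np

def simex(fit, W, Sigma_e, lambdas, B, rng, degree=2):
    """Simulation-extrapolation with polynomial extrapolation to lambda = -1.

    fit(W_star) -> (theta_hat, var_hat), two 1-D arrays of length q.
    W: (N, p) observed error-prone covariates; Sigma_e: (p, p), assumed known.
    B must be at least 2 so that the between-replicate variance is defined.
    """
    W = np.asarray(W, dtype=float)
    Sigma_e = np.asarray(Sigma_e, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    n_obs, p = W.shape
    theta_bar, tau = [], []
    for lam in lambdas:
        est, var = [], []
        for _ in range(B):
            # Simulation step: add extra error with covariance lam * Sigma_e.
            e = rng.multivariate_normal(np.zeros(p), Sigma_e, size=n_obs)
            # Estimation step: refit with the perturbed covariates.
            theta_b, var_b = fit(W + np.sqrt(lam) * e)
            est.append(theta_b)
            var.append(var_b)
        est, var = np.asarray(est), np.asarray(var)
        theta_bar.append(est.mean(axis=0))
        # Averaged model-based variance minus between-replicate sample variance.
        tau.append(var.mean(axis=0) - est.var(axis=0, ddof=1))
    theta_bar, tau = np.asarray(theta_bar), np.asarray(tau)

    def extrapolate(values):
        # Extrapolation step: one polynomial fit per parameter, evaluated at -1.
        return np.array([
            np.polyval(np.polyfit(lambdas, values[:, r], degree), -1.0)
            for r in range(values.shape[1])
        ])

    theta_simex = extrapolate(theta_bar)
    tau_simex = extrapolate(tau)
    return theta_simex, np.sqrt(np.clip(tau_simex, 0.0, None))
```

A call such as `simex(fit, W, Sigma_e, np.linspace(0, 2, 9), B=200, rng=np.random.default_rng(0))` would mirror a grid of M = 9 values with a quadratic extrapolant; the upper end of the grid (2 here) is an assumption, since the text does not state it.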
The SIMEX approach is a simulation-based method that was proposed by Cook and Stefanski (1994) for parametric measurement error models. Its idea can be intuitively illustrated with simple linear regression. Suppose that the regression model is given by Y=β0+βxx+ϵ, where ϵ has mean 0. If x is replaced with its observed measurement W, modeled by W=x+e with e having mean 0 and variance σ2, then the resulting least squares estimator $\hat\beta_x$ for βx converges in probability to $\beta_x^*=\{\sigma_x^2/(\sigma_x^2+\sigma^2)\}\beta_x$ (Fuller, 1987). Here $\sigma_x^2$ is the variance of x. Intuitively, if x is replaced with $W+\sqrt{\lambda}\,\sigma e_b$, where eb is generated from N(0,1), then the resulting estimator $\hat\beta_x(b,\lambda)$ converges in probability to $\beta_x^*(b,\lambda)=[\sigma_x^2/\{\sigma_x^2+(1+\lambda)\sigma^2\}]\beta_x$. If λ=0, $\hat\beta_x(b,0)$ is just the naive estimator $\hat\beta_x$. However, if λ=−1 then the limit $\beta_x^*(b,-1)$ is identical to the true parameter βx.
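The attenuation formula in this illustration can be evaluated directly to see why extrapolating to λ = −1 removes the bias, and how a quadratic extrapolant only approximates the exact one. The numerical values (βx = 1, σx² = 1, σ² = 0.25) and the grid of λ values are arbitrary choices for this sketch.

```python
import numpy as np

def limit_slope(beta_x, var_x, var_e, lam):
    """Probability limit of the slope when the error variance is (1 + lam) * var_e."""
    return var_x / (var_x + (1.0 + lam) * var_e) * beta_x

lambdas = np.linspace(0.0, 2.0, 9)
limits = limit_slope(1.0, 1.0, 0.25, lambdas)

naive_limit = limits[0]                               # lam = 0: attenuated slope
exact_limit = limit_slope(1.0, 1.0, 0.25, -1.0)       # lam = -1: error variance vanishes
# Quadratic fit to the limits over the grid, extrapolated to lam = -1.
quadratic_limit = np.polyval(np.polyfit(lambdas, limits, 2), -1.0)
```

Here the quadratic extrapolant lands between the naive limit and the true slope: it removes most, but not all, of the attenuation, which is consistent with the role the exact extrapolation function plays in the asymptotic result discussed next.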
For univariate parametric models, Carroll and others (1996) established the asymptotic normality for the SIMEX estimator. However, their results cannot directly apply here because the current development involves multiple response outcomes along with an additional process concerning the missing data indicators. If the exact extrapolation function is used in Step 3 above, we may establish the following asymptotic distribution for the SIMEX estimator $\hat\theta$. The proof is outlined in the Appendix.
4. AN EXAMPLE
As an illustration, we apply the proposed method to analyze the cohort 2 subset of the GAW13 (Genetic Analysis Workshops) data arising from the Framingham Heart Study. The data set consists of the measurements for 1672 patients from a series of exams with 5 assessments designed for each individual. Measurements such as height, weight, age, systolic blood pressure (SBP), and CHOL are collected at each assessment. About 24% of the patients dropped out of the study.

σ1 and σ2 are specified as 0, 0.5, and 1.0 to feature scenarios with different degrees of measurement error in SBP and CHOL. Distinct values for ρ are considered to represent different strengths of correlation. The missing data process is characterized by the logistic regression model (4.1).
Three analyses are conducted here. Analysis 1 ignores measurement error in SBP and CHOL with Xi naively replaced by Wi when using (3.1), Analysis 2 accounts for measurement error in the response model but not in the missing data model, while Analysis 3 addresses measurement error in both the response and the missing data models. In implementing the SIMEX method, we choose B = 200, M = 9, and a quadratic regression for each extrapolation step.
The analyses show that only α4 in model (4.1) is statistically significant under various situations considered for error model (2.3). Other coefficients such as α1, α2, and α3 are all not statistically significant. The results suggest that the dropout rate increases as the subjects become older. Dropout probability does not depend on the previous obesity status, SBP, or CHOL.
We conduct the analyses for ρ = 0 and ρ = 0.5. Table 1 reports the results for the case with ρ = 0. It is not surprising that the 3 analyses give rise to very similar results when there is no measurement error present in SBP and CHOL. When measurement error does exist, it can be seen that the estimates and associated standard errors may be considerably impacted by different degrees of measurement error in SBP or CHOL. If there is no error in SBP (i.e. σ1 = 0), neither CHOL nor AGE is statistically significant, whereas SBP has a significant positive effect no matter what degree of measurement error is involved in CHOL.
Table 1. Sensitivity analyses of the data from the Framingham Heart Study
| σ1 | σ2 | Analysis | βx1 | | | βx2 | | | βz | | |
| | | | Estimate | SE | p-value | Estimate | SE | p-value | Estimate | SE | p-value |
| 0.00 | 0.00 | 1 | 2.9465 | 0.3103 | < 0.0001 | 0.0904 | 0.0852 | 0.2886 | −0.0067 | 0.0057 | 0.2427 |
| | | 2 | 2.9465 | 0.3119 | < 0.0001 | 0.0904 | 0.0854 | 0.2897 | −0.0067 | 0.0057 | 0.2450 |
| | | 3 | 2.9465 | 0.3103 | < 0.0001 | 0.0904 | 0.0852 | 0.2886 | −0.0067 | 0.0057 | 0.2427 |
| 0.00 | 0.50 | 1 | 2.9827 | 0.3085 | < 0.0001 | 0.0419 | 0.0721 | 0.5614 | −0.0060 | 0.0057 | 0.2937 |
| | | 2 | 2.9736 | 0.3119 | < 0.0001 | 0.0541 | 0.0871 | 0.5341 | −0.0061 | 0.0057 | 0.2820 |
| | | 3 | 2.9737 | 0.3102 | < 0.0001 | 0.0541 | 0.0868 | 0.5334 | −0.0061 | 0.0057 | 0.2792 |
| 0.00 | 1.00 | 1 | 3.0069 | 0.3068 | < 0.0001 | 0.0072 | 0.0503 | 0.8859 | −0.0055 | 0.0057 | 0.3372 |
| | | 2 | 3.0016 | 0.3100 | < 0.0001 | 0.0140 | 0.0706 | 0.8434 | −0.0056 | 0.0057 | 0.3285 |
| | | 3 | 3.0017 | 0.3083 | < 0.0001 | 0.0140 | 0.0704 | 0.8426 | −0.0056 | 0.0057 | 0.3262 |
| 0.50 | 0.00 | 1 | 0.2828 | 0.0897 | 0.0016 | 0.1751 | 0.0797 | 0.0280 | 0.0121 | 0.0053 | 0.0232 |
| | | 2 | 0.5050 | 0.1346 | 0.0002 | 0.1654 | 0.0802 | 0.0391 | 0.0106 | 0.0053 | 0.0455 |
| | | 3 | 0.5051 | 0.1343 | 0.0002 | 0.1654 | 0.0799 | 0.0385 | 0.0106 | 0.0052 | 0.0441 |
| 0.50 | 0.50 | 1 | 0.2316 | 0.0968 | 0.0167 | 0.0797 | 0.0728 | 0.2737 | 0.0144 | 0.0053 | 0.0063 |
| | | 2 | 0.5182 | 0.1335 | 0.0001 | 0.1277 | 0.0820 | 0.1194 | 0.0112 | 0.0053 | 0.0337 |
| | | 3 | 0.5183 | 0.1332 | < 0.0001 | 0.1276 | 0.0817 | 0.1181 | 0.0112 | 0.0052 | 0.0324 |
| 0.50 | 1.00 | 1 | 0.2599 | 0.1018 | 0.0107 | 0.0088 | 0.0538 | 0.8701 | 0.0157 | 0.0053 | 0.0030 |
| | | 2 | 0.5331 | 0.1333 | < 0.0001 | 0.0703 | 0.0676 | 0.2988 | 0.0123 | 0.0052 | 0.0188 |
| | | 3 | 0.5331 | 0.1330 | < 0.0001 | 0.0703 | 0.0674 | 0.2966 | 0.0123 | 0.0052 | 0.0180 |
| 1.00 | 0.00 | 1 | 0.0412 | 0.0454 | 0.3648 | 0.1852 | 0.0794 | 0.0196 | 0.0137 | 0.0054 | 0.0107 |
| | | 2 | 0.0801 | 0.0693 | 0.2477 | 0.1830 | 0.0797 | 0.0216 | 0.0135 | 0.0054 | 0.0121 |
| | | 3 | 0.0802 | 0.0692 | 0.2464 | 0.1831 | 0.0793 | 0.0210 | 0.0135 | 0.0053 | 0.0116 |
| 1.00 | 0.50 | 1 | 0.0073 | 0.0488 | 0.8809 | 0.1074 | 0.0719 | 0.1351 | 0.0156 | 0.0053 | 0.0035 |
| | | 2 | 0.0858 | 0.0688 | 0.2128 | 0.1433 | 0.0817 | 0.0794 | 0.0142 | 0.0054 | 0.0080 |
| | | 3 | 0.0858 | 0.0686 | 0.2114 | 0.1432 | 0.0813 | 0.0780 | 0.0142 | 0.0053 | 0.0076 |
| 1.00 | 1.00 | 1 | 0.0112 | 0.0517 | 0.8285 | 0.0414 | 0.0535 | 0.4388 | 0.0169 | 0.0053 | 0.0015 |
| | | 2 | 0.0917 | 0.0688 | 0.1828 | 0.0811 | 0.0675 | 0.2296 | 0.0154 | 0.0053 | 0.0036 |
| | | 3 | 0.0917 | 0.0686 | 0.1814 | 0.0811 | 0.0671 | 0.2271 | 0.0154 | 0.0053 | 0.0034 |
If there is moderate error in SBP (i.e. σ1 = 0.5), the 3 analyses still suggest that SBP has a significant positive effect on obesity. In contrast to the case with no error in SBP, AGE is found to be statistically significant by the 3 analyses, and the evidence tends to become stronger as error in CHOL becomes more substantial. However, the conclusion for CHOL depends on whether or not there is error in CHOL. If there is no error in CHOL, there is moderate evidence to support that CHOL has a positive effect on obesity; otherwise, CHOL is not statistically significant.
When measurement error in SBP becomes more severe (i.e. σ1 = 1.0), the effect of SBP is no longer significant in any of the 3 analyses. Again, AGE would have a positive effect, and the evidence tends to become stronger as error in CHOL increases. CHOL tends to be statistically significant if error in CHOL is absent or moderate; if the error in CHOL becomes larger, there is no evidence to support the effect of CHOL.
To save space, we do not display the results for ρ = 0.5 but just comment on the findings here. It seems that moderate correlation ρ tends to decrease the estimates for the effects of both SBP and CHOL but to increase associated standard errors, hence leading to increasing p-values. However, the impact of correlation ρ on AGE effect is different. Moderate correlation ρ tends to increase the estimates of AGE effect while maintaining very stable standard errors, thus the resulting p-values become smaller.
5. SIMULATION STUDIES
In this section, we conduct simulation studies to investigate the impact of ignoring measurement error on estimation and to compare the performance of the 3 analyses discussed in Section 4. The same configurations as those in Section 4 are used when implementing the SIMEX method.

with μxr = 0.5 and σxr = 1.0 (r = 1, 2). Set βx1 = log(1.5), βx2 = log(1.5), and βz = log(0.75). The surrogate value Wij = (Wij1, Wij2)′ is generated from the normal distribution N(xij, Σe) with
Various configurations are considered to feature distinct scenarios of measurement error in covariate xij. Specifically, we consider σ1, σ2 = 0.15, 0.50, and 0.75 to feature minor, moderate, and severe marginal measurement errors. ρx and ρ are specified as 0.5 to represent the cases with moderate correlations. The missing data indicator is generated from model (4.1), where we set α0 = α1 = 0.5, α2 = α3 = 0.1, and αz = 0.2. Table 2 reports the difference between the average of the estimates and the true value (Bias), the empirical standard error (SE), and the coverage rate (CR, in percent) for 95% confidence intervals. If measurement error is minor, for instance, when both σ1 and σ2 are 0.15, even Analysis 1 may give rise to reasonable results with fairly small finite-sample biases and CRs that are close to the nominal level of 95%. The 3 analyses provide fairly comparable results.
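The three summary measures reported for the simulations can be computed from replicate estimates as sketched below. This is an assumption-laden illustration: the function name `summarize` is hypothetical, the empirical SE is taken as the sample standard deviation of the estimates with divisor one less than the number of replicates, and the 95% intervals are taken to be Wald intervals (estimate ± 1.96 × standard error); the text does not spell out these details.

```python
import numpy as np

def summarize(estimates, std_errors, truth, z=1.96):
    """Return (bias, empirical SE, coverage rate in percent) over replicates."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - truth
    empirical_se = estimates.std(ddof=1)
    # A replicate covers the truth when it lies inside estimate +/- z * SE.
    covered = np.abs(estimates - truth) <= z * std_errors
    return bias, empirical_se, 100.0 * covered.mean()

bias, emp_se, cr = summarize(
    estimates=[0.9, 1.1, 1.0, 1.4],
    std_errors=[0.1, 0.1, 0.1, 0.1],
    truth=1.0,
)
```

In this toy input only the last replicate misses the true value, so three of the four intervals cover it.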
Table 2. Simulation results
| σ1 | σ2 | Analysis | βx1 | | | βx2 | | | βz | | |
| | | | Bias | SE | CR (%) | Bias | SE | CR (%) | Bias | SE | CR (%) |
| 0.15 | 0.15 | 1 | −0.0175 | 0.1323 | 95.5 | 0.0000 | 0.1241 | 95.5 | 0.0044 | 0.2315 | 94.0 |
| | | 2 | −0.0073 | 0.1357 | 96.0 | 0.0094 | 0.1277 | 97.5 | 0.0038 | 0.2320 | 94.5 |
| | | 3 | −0.0073 | 0.1358 | 95.5 | 0.0094 | 0.1278 | 97.0 | 0.0038 | 0.2321 | 94.0 |
| 0.15 | 0.50 | 1 | 0.0223 | 0.1303 | 94.5 | −0.1030 | 0.1098 | 87.5 | 0.0068 | 0.2314 | 94.0 |
| | | 2 | 0.0012 | 0.1366 | 95.0 | −0.0135 | 0.1389 | 96.0 | 0.0050 | 0.2341 | 94.5 |
| | | 3 | 0.0011 | 0.1367 | 94.5 | −0.0135 | 0.1393 | 95.5 | 0.0053 | 0.2344 | 94.5 |
| 0.15 | 0.75 | 1 | 0.0579 | 0.1282 | 91.0 | −0.1839 | 0.0957 | 54.5 | 0.0080 | 0.2305 | 94.0 |
| | | 2 | 0.0253 | 0.1365 | 94.0 | −0.0728 | 0.1376 | 91.5 | 0.0061 | 0.2343 | 94.0 |
| | | 3 | 0.0252 | 0.1365 | 94.0 | −0.0727 | 0.1381 | 90.5 | 0.0065 | 0.2347 | 94.0 |
| 0.50 | 0.15 | 1 | −0.1199 | 0.1175 | 79.0 | 0.0389 | 0.1233 | 97.5 | 0.0093 | 0.2316 | 94.5 |
| | | 2 | −0.0327 | 0.1472 | 95.5 | 0.0179 | 0.1301 | 97.0 | 0.0077 | 0.2338 | 94.5 |
| | | 3 | −0.0326 | 0.1475 | 95.0 | 0.0178 | 0.1307 | 97.0 | 0.0079 | 0.2342 | 94.5 |
| 0.50 | 0.50 | 1 | −0.0970 | 0.1180 | 83.0 | −0.0798 | 0.1113 | 89.5 | 0.0129 | 0.2314 | 94.0 |
| | | 2 | −0.0259 | 0.1458 | 95.5 | −0.0068 | 0.1386 | 96.0 | 0.0094 | 0.2364 | 94.5 |
| | | 3 | −0.0258 | 0.1463 | 95.0 | −0.0067 | 0.1394 | 95.0 | 0.0098 | 0.2370 | 94.5 |
| 0.50 | 0.75 | 1 | −0.0641 | 0.1173 | 92.5 | −0.1754 | 0.0982 | 58.5 | 0.0148 | 0.2303 | 93.5 |
| | | 2 | −0.0066 | 0.1451 | 96.5 | −0.0683 | 0.1387 | 91.0 | 0.0109 | 0.2369 | 94.5 |
| | | 3 | −0.0067 | 0.1456 | 96.0 | −0.0681 | 0.1395 | 90.0 | 0.0114 | 0.2375 | 94.5 |
| 0.75 | 0.15 | 1 | −0.1976 | 0.1028 | 48.5 | 0.0730 | 0.1219 | 93.0 | 0.0118 | 0.2314 | 94.5 |
| | | 2 | −0.0908 | 0.1458 | 84.5 | 0.0411 | 0.1312 | 96.5 | 0.0107 | 0.2343 | 94.5 |
| | | 3 | −0.0906 | 0.1461 | 84.5 | 0.0410 | 0.1320 | 96.5 | 0.0111 | 0.2348 | 94.5 |
| 0.75 | 0.50 | 1 | −0.1899 | 0.1041 | 54.5 | −0.0478 | 0.1109 | 93.5 | 0.0161 | 0.2311 | 94.0 |
| | | 2 | −0.0865 | 0.1454 | 86.0 | 0.0117 | 0.1386 | 98.0 | 0.0127 | 0.2370 | 94.5 |
| | | 3 | −0.0864 | 0.1459 | 85.5 | 0.0118 | 0.1396 | 96.5 | 0.0133 | 0.2377 | 94.5 |
| 0.75 | 0.75 | 1 | −0.1637 | 0.1041 | 63.0 | −0.1498 | 0.0985 | 68.5 | 0.0184 | 0.2300 | 93.5 |
| | | 2 | −0.0698 | 0.1451 | 89.0 | −0.0521 | 0.1389 | 92.5 | 0.0144 | 0.2376 | 94.0 |
| | | 3 | −0.0697 | 0.1456 | 88.5 | −0.0518 | 0.1399 | 92.5 | 0.0152 | 0.2384 | 93.5 |
Simulation results
When there is moderate or substantial measurement error in the covariates xij, the performance of Analysis 1 deteriorates markedly in estimating the effects of the error-prone covariates. Analysis 1 may lead to considerably biased estimates of βx1 and βx2; see, for example, the entries with σ1 = 0.75 and σ2 = 0.15 in Table 2, where the CR of the 95% confidence intervals for βx1 is as low as 48.5%. By accounting for measurement error in the response model, both Analyses 2 and 3 markedly improve the performance, yielding much smaller biases and much higher CRs for the 95% confidence intervals. Analysis 2 gives results very comparable to those produced by Analysis 3, though Analysis 2 seems to yield slightly larger finite-sample biases. The simulation study considered here suggests that the impact of ignoring measurement error in modeling the missing data process is not as pronounced as that of ignoring it in modeling the response process.
In terms of estimation of βz, Analysis 1 produces larger biases than Analyses 2 and 3 do, though the magnitude is not as striking as that for the estimates of βx. Among the 3 analyses, Analysis 1 provides the smallest standard errors and Analysis 3 yields the largest, but the differences between Analyses 2 and 3 are small. The CRs for the 95% confidence intervals obtained from the 3 analyses agree reasonably well with the nominal value.
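The three summaries reported for each parameter (Bias, SE, and CR) can be computed from simulation replicates along the following lines. This is a minimal sketch, not the author's code: the function name `summarize_replicates` is hypothetical, SE is assumed here to be the empirical standard deviation of the estimates across replicates, and the confidence intervals are assumed to be Wald intervals of the form estimate ± 1.96 × standard error.

```python
import numpy as np


def summarize_replicates(estimates, std_errors, true_value, z=1.96):
    """Summarize simulation replicates for one scalar parameter.

    estimates  : point estimates, one per simulated data set
    std_errors : model-based standard errors, one per simulated data set
    true_value : the parameter value used to generate the data
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)

    # Finite-sample bias: average estimate minus the truth.
    bias = estimates.mean() - true_value

    # Assumption: SE is taken as the empirical standard deviation.
    empirical_se = estimates.std(ddof=1)

    # Coverage rate (in percent) of the Wald-type confidence intervals.
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    covered = (lower <= true_value) & (true_value <= upper)
    coverage = 100.0 * covered.mean()

    return {"bias": bias, "se": empirical_se, "cr": coverage}
```

A low CR, as seen for Analysis 1, typically reflects intervals centered on a biased estimate rather than intervals that are too narrow.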
In summary, ignoring measurement error may lead to substantially biased results, so properly addressing covariate measurement error in estimation procedures is necessary. The proposed method (i.e. Analysis 3) performs reasonably well under various configurations. Its performance may become less satisfactory when measurement error becomes substantial. Even then, the proposed SIMEX method substantially improves on the naive analysis (i.e. Analysis 1).
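The contrast between a naive analysis and a SIMEX correction can be illustrated in a much simpler setting than the one studied here. The sketch below is an assumption-laden toy, not the proposed method: it replaces the weighted estimating equations with ordinary least squares on a single error-prone covariate, has no missing data, assumes the error standard deviation `sigma` is known, and uses a quadratic extrapolant. The function name `simex_slope` and all parameter names are hypothetical.

```python
import numpy as np


def simex_slope(w, y, sigma, lambdas=(0.5, 1.0, 1.5, 2.0), B=200, seed=0):
    """Return (naive_slope, simex_slope) for y regressed on w = x + error."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)

    def ols_slope(x):
        # Least-squares slope of y on x.
        return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

    lam_grid = [0.0]
    slopes = [ols_slope(w)]  # lambda = 0 is the naive estimate

    # Simulation step: add extra error with variance lambda * sigma^2,
    # re-estimate, and average over B simulated data sets.
    for lam in lambdas:
        sims = [
            ols_slope(w + np.sqrt(lam) * sigma * rng.standard_normal(w.size))
            for _ in range(B)
        ]
        lam_grid.append(lam)
        slopes.append(np.mean(sims))

    # Extrapolation step: fit a quadratic in lambda and evaluate at -1.
    coef = np.polyfit(lam_grid, slopes, deg=2)
    return slopes[0], float(np.polyval(coef, -1.0))
```

In this toy setting the naive slope is attenuated toward zero, and the quadratic extrapolation to λ = −1 recovers most, though generally not all, of the attenuation, which is in line with the residual bias seen when measurement error is large.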
6. DISCUSSION
In this paper, we propose a simulation-based marginal method to analyze longitudinal data with both missing observations and error-contaminated covariates. This work is of particular interest because missingness and measurement error in covariates arise commonly in longitudinal studies, and to date, there is little work that addresses both features (Liu and Wu, 2007). Yi (2005) discussed inference approaches to handle continuous or count data arising from longitudinal studies, but those methods cannot be applied to binary responses due to the nature of the logistic regression. The proposed method, however, can handle binary responses in addition to continuous responses or count data. Moreover, in contrast to the models of Yi (2005), where only precisely observed covariates may enter model (2.2), the proposed method allows the missing data process to depend on error-prone covariates. The proposed method is simple but flexible. It can be implemented in a straightforward manner by slightly modifying standard statistical software such as PROC GENMOD in SAS. The proposed method does not require the complete specification of the full distribution of the response process but only requires the specification of the structures of marginal means and variances. Also, the method does not require modeling the underlying covariate process, which is desirable for many practical problems.
The proposed method may be applied to clustered or correlated data as well. In some situations, interest may also lie in the strength of association among response components within clusters. Following the lines of Yi and Cook (2002), we may construct a second set of estimating equations for the association parameters. In that formulation, proper adjustments should be introduced to account for the biases induced by both missing observations and measurement error in covariates.
In this paper, we focus the discussion on the IPWGEE method, for which a MAR missing data mechanism is assumed. One may, however, employ other modeling frameworks, such as random-effects models, to accommodate NMAR mechanisms as well. Without considering missing observations, Wang and others (1998) studied random-effects models to account for measurement error in covariates. It would be interesting to develop methods that simultaneously adjust for the biases resulting from missingness and measurement error in this context.
When modeling the missing data process, we consider the case in which the true but error-prone covariates Xi enter the model to govern the missingness probability. In some instances, it may be more practical to let dropout depend on the observed covariates Wi. In this case, the proposed method can be applied with a minor modification. See Carroll and others (2006, Chapter 2, Section 11.8) for general discussion on the issue of building a model by conditioning on the true underlying covariates or the observed data.
As seen in Section 4, there is no additional information, such as repeated measurements of SBP and CHOL, available to estimate the variance parameters σ1 and σ2. We therefore undertake sensitivity analyses by specifying a sequence of values of σ1 and σ2 to assess the impact of measurement error on estimation of the response parameters β. Sometimes, additional information on the measurement error process exists and the associated parameters may be estimated. In these circumstances, we need to accommodate the additional variation induced by estimating the error parameters. With replicate measurements Wi available, for example, we may modify the proposed method by adapting the arguments in Devanarayan and Stefanski (2002) to accommodate measurement error models with unknown variance parameters.
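When replicate measurements are available, a simple starting point is the pooled within-subject variance. The sketch below assumes a classical additive error model in which each replicate equals the true covariate plus independent error with common variance, and a balanced design with the same number of replicates per subject. It is not the adaptation of Devanarayan and Stefanski (2002) mentioned above, and the function name `error_sd_from_replicates` is hypothetical.

```python
import numpy as np


def error_sd_from_replicates(W):
    """Estimate the measurement error standard deviation from replicates.

    W is an (n_subjects, n_replicates) array, assumed to follow
    W[i, r] = X[i] + e[i, r] with independent errors of common variance.
    """
    W = np.asarray(W, dtype=float)
    if W.ndim != 2 or W.shape[1] < 2:
        raise ValueError("need at least two replicates per subject")

    # Within-subject sample variances remove the unknown true covariate X[i].
    within_var = W.var(axis=1, ddof=1)

    # With a balanced design, the pooled variance is their simple average.
    return float(np.sqrt(within_var.mean()))
```

Plugging such an estimate into a correction method in place of a known value introduces extra variability, which is the issue the paragraph above raises.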
FUNDING
Natural Sciences and Engineering Research Council of Canada.
APPENDIX
Adapting the arguments in Carroll and others (1996), we outline the proof of the Theorem as follows. Let Ui(θ; b, λ), Si(α; b, λ), and Hi(θ; b, λ) be Ui(θ), Si(α), and Hi(θ), respectively, with xij replaced by Wij(b, λ). By standard estimating equation theory, under some regularity conditions,
θ̂(b, λ) →p θ(λ) as B → ∞, where θ(λ) is the solution of E[Hi(θ; 1, λ)] = 0.
, where γ is a vector of parameters of dimension d, say. Fit
to
. Define
. Let
be the d × qM matrix and
be a d × d matrix. Then, by the similar argument to that in Carroll and others (1996), we obtain, as n → ∞, 
. Letting λ = −1 leads to the SIMEX estimator
. Therefore, the asymptotic distribution of the SIMEX estimator is 
The author acknowledges referees' helpful comments. The author thanks Boston University and the National Heart, Lung, and Blood Institute (NHLBI) for providing the data set from the Framingham Heart Study (No. N01-HC-25195) in the illustration. The Framingham Heart Study is conducted and supported by the NHLBI in collaboration with Boston University. This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or NHLBI. Conflict of Interest: None declared.






