Abstract

Background Mendelian randomization (MR) studies assess the causality of associations between exposures and disease outcomes using data on genetic determinants of the exposure. In this work, we explore the effect of exposure and outcome measurement error in MR studies.

Methods For continuous traits, we describe measurement error in terms of a theoretical regression of the measured variable on the true variable. We quantify error in terms of the slope (calibration) and the R2 values (discrimination or classical measurement error). We simulated cohort data sets under realistic parameters and used two-stage least squares regression to assess the effect of measurement error for continuous exposures and outcomes on bias, precision and power. For simulations of binary outcomes, we varied sensitivity and specificity.

Results Discrimination error in continuous exposures and outcomes did not bias the MR estimate, and only outcome discrimination error substantially reduced power. Calibration error biased the MR estimate when the exposure and the outcome measures were not calibrated in a similar fashion, but power was not affected. For binary outcomes, exposure calibration error introduced substantial bias (with negligible impact on power), but exposure discrimination error did not. Reduced outcome specificity and, to a lesser degree, reduced sensitivity biased MR estimates towards the null.

Conclusions Understanding the potential effects of measurement error is an important consideration when interpreting estimates from MR analyses. Based on these results, future MR studies should consider methods for accounting for such error and minimizing its impact on inferences derived from MR analyses.

Introduction

Mendelian randomization (MR) is an approach for determining whether there is a causal relationship between an exposure and a correlated disease outcome using data on a genetic determinant(s) of the exposure.1,2 Exposures with genetic determinants are typically biomarkers, such as circulating molecules or physical traits, and are often subject to confounding in their associations with health outcomes. Causal effects can be estimated in the MR setting when a genetic determinant is used as an instrumental variable (IV) and is analysed jointly with exposure and outcome data to derive an estimate of the exposure’s effect on the outcome. MR estimates are not as precise as simple association-based measures of effect, but, in theory, they represent the causal component of an observed association, not the components due to confounding or reverse causation.

IVs are required to be (i) associated with the exposure, (ii) independent of the outcome conditional on the exposure and confounders (measured or unmeasured) and (iii) independent of all unmeasured confounders of the exposure–outcome association.1,2 Genetic factors are attractive as IVs because they are randomly assigned at conception and should not be affected by any potential confounders other than ancestry, which can be measured and adjusted for using additional genetic data.3,4 Owing to the correlations among neighbouring genetic variants (i.e. linkage disequilibrium), multiple variants may be acceptable proxies for some unmeasured causal variant, although the mechanism by which the causal variant influences the exposure is often poorly understood. Most genetic factors can be measured with high accuracy using modern genotyping technologies and careful quality control (QC) procedures.5,6

In contrast, non-genetic exposures and outcomes are often measured with substantial error. In ordinary least squares (OLS) regression models, the presence of non-differential classical measurement error in the exposure (i.e. errors that are randomly distributed around a true value and unrelated to the outcome) will bias exposure–outcome associations towards the null (i.e. regression dilution or attenuation).7,8 However, non-differential classical errors in continuous outcome measures do not systematically bias association estimates but increase their standard error. Systematic (non-random) measurement error in an exposure or an outcome, error whose value depends on some other feature(s) of the data, can lead to bias.9 Differential measurement errors, where errors in exposure depend on outcome status (or vice versa), can lead to substantial and often unpredictable biases.10

To date, no studies have examined the effects of measurement error on MR estimates. Such studies are needed because MR is becoming a common approach in the epidemiological literature,11 in large part because recent genome-wide association studies have identified single nucleotide polymorphisms (SNPs) associated with a wide array of biomarkers with relevance for disease traits, such as body mass index,12 lipid-related traits13 and C-reactive protein14. In this work, we explore the effects of various types of non-differential measurement error on bias, precision and power in MR studies of continuous exposures in the cohort setting. We do so analytically and using simulated data sets generated using plausible parameters for epidemiological data.

Methods

Theoretical framework for measurement error

Measurement error for a continuous exposure (or outcome) can be described in terms of a theoretical regression of the error-prone measure (X*) on the true exposure (X)15–17 (Figure 1). The regression R2 represents discrimination (classical measurement error) or the degree to which individuals with higher measured values tend to have higher true values. The regression slope represents calibration or the sensitivity of the measured value to variation in the true value. The intercept represents bias, the degree to which, on average, the true value is over- or underestimated. By varying these regression characteristics in simulated data, we can systematically assess the effects of measurement error on the MR estimate. Examples of these types of non-differential error are shown in Figure 2.
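The three quantities in this framework can be read directly off a fitted regression of the measured value on the true value. The following is a minimal sketch, assuming a hypothetical error structure (intercept 0.5, slope 0.8, noise standard deviation 0.6) that is not taken from the study; the function name is illustrative.

```python
import numpy as np

def describe_measurement_error(x_true, x_measured):
    """Regress the measured value on the true value and summarize the fit.

    Returns (intercept, slope, r_squared), i.e. 'bias', 'calibration' and
    'discrimination' in the terminology of the text.
    """
    slope, intercept = np.polyfit(x_true, x_measured, deg=1)
    r_squared = np.corrcoef(x_true, x_measured)[0, 1] ** 2
    return intercept, slope, r_squared

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)  # true exposure X
# Hypothetical error structure (an assumption for illustration only).
x_star = 0.5 + 0.8 * x + rng.normal(scale=0.6, size=x.size)

intercept, slope, r2 = describe_measurement_error(x, x_star)
print(f"bias (intercept) = {intercept:.2f}, "
      f"calibration (slope) = {slope:.2f}, "
      f"discrimination (R2) = {r2:.2f}")
```

In practice the true value is rarely observed, so this kind of regression is only possible in validation substudies or, as here, in simulated data.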

Figure 1

A framework for quantifying non-differential measurement error. In the context of theoretical regression of the measured exposure (X*) on the true exposure (X), measurement error can be described according to the discrimination, calibration and bias. These features are represented by the R2, slope and intercept of the theoretical regression, respectively

Figure 2

Examples of three types of non-differential exposure measurement error. Distributions for both measured (X*) and true (X) exposure values are shown

Effect of measurement error on bias in MR studies of continuous outcomes

Let G be a genetic risk measure (a genotype or a risk score), let X be a continuous exposure affected by G and let Y be a continuous outcome affected by X. If the three requirements for MR are met, the standard MR Wald estimator is

$$\beta_{MR} = \frac{\beta_{gy}}{\beta_{gx}} \tag{1}$$

where βgy is the coefficient for the regression of Y on G, and βgx is the coefficient for the regression of X on G. Assuming βgy and βgx are generated with proper control for population stratification,3,4 these coefficients can be interpreted as effect estimates for G (on X and Y). The MR Wald estimator is equivalent to that obtained from a two-stage least squares (2SLS) regression,1 and these are the most common analysis techniques for MR studies of continuous exposures and outcomes.

Treating the coefficients in Equation 1 as true effects rather than estimates and assuming the effects of G on X and X on Y are linear, we can decompose βgy into βgxβxy (a standard decomposition for path analysis and, more generally, linear structural equation models18):

$$\beta_{MR} = \frac{\beta_{gy}}{\beta_{gx}} = \frac{\beta_{gx}\,\beta_{xy}}{\beta_{gx}} = \beta_{xy} \tag{2}$$

Acknowledging that G, X and Y are measured with error, we can include measurement error in the MR causal diagram (Figure 3). In this framework, G, X and Y exert some effect on their measured values (G*, X* and Y*). Note that in the presence of measurement error, there is no path from G to Y that passes through X* and avoids X, which maintains MR assumption (ii) for the true exposure. Assuming linear effects for the causal model in Figure 3, the MR estimator using the measured rather than the true variables as in Figure 3 is given by the following equation:  

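$$\beta^{*}_{MR} = \frac{\beta_{gy^*}}{\beta_{gx^*}} = \frac{\beta_{gx}\,\beta_{xy}\,\beta_{yy^*}}{\beta_{gx}\,\beta_{xx^*}} \tag{3}$$

where the effects of X on X* and Y on Y* are represented by βxx* and βyy*. This equation can be simplified to obtain an equation that relates the calibration of X* (βxx*) and Y* (βyy*) to the MR estimate:

$$\beta^{*}_{MR} = \beta_{xy}\,\frac{\beta_{yy^*}}{\beta_{xx^*}} \tag{4}$$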

Thus, if X* and Y* are not perfectly calibrated (i.e. βxx* ≠ 1 or βyy* ≠ 1), then the MR estimate will be biased, unless X* and Y* are mis-calibrated in an identical fashion (βxx* = βyy*).
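As a worked illustration, assume the MR estimate based on measured variables equals the true effect multiplied by the ratio of the two calibration slopes, which is the form implied by the relative bias reported in the simulation results. The symbol for the measured-variable estimate is notation introduced here, and the numerical values are chosen to match settings used in the simulations.

```latex
\beta^{*}_{MR} = \beta_{xy}\,\frac{\beta_{yy^*}}{\beta_{xx^*}}

% Exposure over-calibrated, outcome perfectly calibrated: bias towards the null
\beta_{xy} = 0.30,\quad \beta_{xx^*} = 1.50,\quad \beta_{yy^*} = 1.00
\;\Rightarrow\; \beta^{*}_{MR} = 0.30 \times \frac{1.00}{1.50} = 0.20

% Identical mis-calibration of exposure and outcome: no bias
\beta_{xy} = 0.30,\quad \beta_{xx^*} = \beta_{yy^*} = 1.50
\;\Rightarrow\; \beta^{*}_{MR} = 0.30 \times \frac{1.50}{1.50} = 0.30
```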

Figure 3

A framework for measurement error in Mendelian randomization studies. In epidemiological studies, the measured exposure (X*) and outcome (Y*) of interest are typically not perfectly correlated with the true X and Y of interest. The regression coefficients (βxx* and βyy*) correspond to the calibration term introduced in Figure 1

Effect of measurement error on precision in MR studies of continuous outcomes

The standard error of the MR estimate19 is  

(5)
formula
where ε represents the variation in Y not explained by X, σg and σε represent the standard deviations of G and ε, respectively, and σg,x represents the covariance of G and X. This is equivalent to19:  
(6)
formula

In the context of measurement error, σε2 is equivalent to the Var(Y*) minus the variance in Y* that is explained by X* [i.e. Var(βx*y*X*) under the simplifying assumption cov(X, ε) = 0, which may not hold in the MR context]. Expanding this expression based on Figure 3, we obtain the following equation:  

(7)
formula

From this equation, it is clear that increases in Var(Y*) (e.g. discrimination error) will increase the standard error of the MR estimate, if all other parameters remain constant. Describing the effects of other types of measurement error is not straightforward because the measurement error may affect multiple parameters in Equation 7.
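The dependence of the standard error on the variance of Y* can be checked numerically. The sketch below uses the standard large-sample expression for a single-instrument IV estimate (residual standard deviation times the standard deviation of G, divided by the square root of n times the covariance of G and X). This is an assumed form that may be written differently from the equations above, and the data-generating model in the comments is likewise an assumption chosen to resemble the simulations described later.

```python
import math

def asymptotic_iv_se(sd_resid, sd_g, cov_gx, n):
    """Large-sample SE of a single-instrument IV (Wald) estimate."""
    return sd_resid * sd_g / (math.sqrt(n) * abs(cov_gx))

n = 5000
b_ux = b_uy = 0.5
r2_gx = 0.05
# Assumed model: X = b_gx*G + b_ux*U + e_x and Y = b_xy*X + b_uy*U + e_y,
# with G, U, e_x, e_y independent standard normal and b_xy = 0.
b_gx = math.sqrt(r2_gx * (1 + b_ux**2) / (1 - r2_gx))  # gives R2(X on G) = 0.05
var_y_resid = b_uy**2 + 1.0                            # Var(Y - b_xy*X)

# Perfectly measured outcome; cov(G, X) = b_gx because Var(G) = 1.
se_perfect = asymptotic_iv_se(math.sqrt(var_y_resid), 1.0, b_gx, n)

# Discrimination error in Y*: independent noise added until R2(Y*, Y) = 0.25.
var_y = var_y_resid                                    # Var(Y) when b_xy = 0
var_noise = var_y * (1 - 0.25) / 0.25
se_noisy = asymptotic_iv_se(math.sqrt(var_y_resid + var_noise), 1.0, b_gx, n)

print(f"SE, perfect Y: {se_perfect:.3f}; SE, R2yy* = 0.25: {se_noisy:.3f}")
```

Under these assumptions the added outcome noise doubles the residual standard deviation and therefore doubles the standard error, while the estimate itself is unchanged.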

Measurement error when outcomes are binary

Several authors have described the bias present in MR studies of binary outcomes,2,20,21 including Palmer et al.,20 who derived equations for the MR estimate. Analytically, integrating measurement error into these equations is not straightforward. However, this bias may be small and inconsequential in certain study settings, such as epidemiological studies of rare disease outcomes.22 In this work, we assess the effects of measurement error in MR studies of binary outcomes using simulated data and show that many of the lessons learned from continuous outcomes also apply to MR studies of binary outcomes.

Simulation 1: the effect of discrimination error on bias and power in MR studies

We evaluated the effect of discrimination error on bias, precision and power in MR studies of a continuous exposure and outcome using simulated cohort data sets. For each simulated scenario, we generated 10 000 data sets consisting of 5000 observations and five variables: a genetic susceptibility score (G), a true exposure value (X) influenced by G, an error-prone measured value of X (X*), a true value of an outcome (Y) influenced by X and an error-prone measured Y value (Y*). We introduced an unmeasured confounding variable U that affects both X and Y. G and U were generated as random numbers drawn from a standard normal distribution. X was also generated as a random number from a standard normal distribution, but with linear effects exerted by G and U:

$$X = \beta_{gx}G + \beta_{ux}U + \varepsilon_x, \qquad \varepsilon_x \sim N(0, 1) \tag{8}$$
βgx was chosen to produce an R2 of 0.05 for the regression of X on G using the following equation:  
(9)
formula

Y was modelled as a random number from a standard normal distribution plus a linear effect of X and U:

$$Y = \beta_{xy}X + \beta_{uy}U + \varepsilon_y, \qquad \varepsilon_y \sim N(0, 1) \tag{10}$$

βxy was set to 0.0, 0.1, 0.2 or 0.3. The values of βux and βuy were both set to 0.5.
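The data-generating steps for the true variables can be sketched as follows. This is an illustrative reconstruction rather than the study's code: the function name is hypothetical, and the closed form used for βgx is an assumption derived from requiring that G explain 5% of the variance of X when G, U and the error terms all have unit variance.

```python
import numpy as np

def simulate_true_data(n, b_xy, b_ux=0.5, b_uy=0.5, r2_gx=0.05, rng=None):
    """Draw G, U, X and Y for one data set (no measurement error yet)."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumed closed form: with Var(G) = Var(U) = Var(e_x) = 1,
    # R2 = b_gx^2 / (b_gx^2 + b_ux^2 + 1), solved for b_gx.
    b_gx = np.sqrt(r2_gx * (1 + b_ux**2) / (1 - r2_gx))
    g = rng.normal(size=n)                          # genetic score
    u = rng.normal(size=n)                          # unmeasured confounder
    x = b_gx * g + b_ux * u + rng.normal(size=n)    # true exposure
    y = b_xy * x + b_uy * u + rng.normal(size=n)    # true outcome
    return g, u, x, y

# A large n is used here only to make the check on R2 stable.
g, u, x, y = simulate_true_data(n=200_000, b_xy=0.2,
                                rng=np.random.default_rng(1))
r2_first_stage = np.corrcoef(g, x)[0, 1] ** 2
print(f"R2 of X on G: {r2_first_stage:.3f}")        # should be close to 0.05
```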

Error-prone measures of X and Y (discrimination error only) were generated by adding normally distributed error components to X and Y. For example, X* was generated as follows:  

(11)
formula

δx* was chosen to produce a specific R2 value (1.00, 0.75, 0.50 or 0.25) for the regression of X* on X, using the following equation:  

(12)
formula

Y* was generated in a fashion similar to X*, where lower R2 values represent increasing discrimination error. Each simulation differed only in the amount of discrimination error in X* (R2xx*) and Y* (R2yy*) and the true effect of X on Y (βxy).
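One way to add discrimination error with a chosen R2 is to scale independent normal noise so that the true variable accounts for the target share of the measured variable's variance. The sketch below assumes this construction, which may differ in form from the equations used in the study; the function name is hypothetical.

```python
import numpy as np

def add_discrimination_error(values, r2_target, rng):
    """Return values plus independent normal noise sized to hit r2_target."""
    if not 0 < r2_target <= 1:
        raise ValueError("r2_target must be in (0, 1]")
    # R2 = Var(V) / (Var(V) + noise_var)  =>  noise_var = Var(V) * (1 - R2) / R2
    noise_sd = np.std(values) * np.sqrt((1 - r2_target) / r2_target)
    return values + rng.normal(scale=noise_sd, size=values.shape)

rng = np.random.default_rng(2)
x = rng.normal(size=200_000)            # stand-in for the true exposure
achieved = {}
slopes = {}
for target in (1.00, 0.75, 0.50, 0.25):
    x_star = add_discrimination_error(x, target, rng)
    achieved[target] = np.corrcoef(x, x_star)[0, 1] ** 2
    slopes[target] = np.polyfit(x, x_star, 1)[0]   # calibration slope
    print(f"target R2 = {target:.2f}, achieved = {achieved[target]:.2f}, "
          f"slope = {slopes[target]:.2f}")
```

Because the noise is independent of the true value, the calibration slope stays near 1 at every R2, so discrimination error is varied without introducing calibration error.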

For each simulation, 2SLS regression was performed on each of the 10 000 simulated data sets using Stata’s ivregress command. This procedure can be viewed as two regressions, although Stata uses a one-step procedure as described in Baum.23 Stage 1 of 2SLS is a regression of X* on the IV (G). Stage 2 is a regression of Y* on the fitted X* values from stage 1. MR estimates and standard errors were obtained. Power was defined as the proportion of the 10 000 data sets in which a statistically significant positive effect of X on Y was detected (two-sided P < 0.05).
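The analysis step can be approximated outside Stata. The sketch below is not the ivregress implementation: it computes the single-instrument estimate as a ratio of covariances, uses the usual large-sample IV standard error with an n − 2 divisor (an assumption), and estimates power from 400 rather than 10 000 data sets to keep the run short. The data-generating constants repeat the assumptions stated for the simulation design.

```python
import numpy as np

def wald_iv(g, x, y):
    """Single-instrument IV estimate and its large-sample standard error."""
    gc, xc, yc = g - g.mean(), x - x.mean(), y - y.mean()
    s_gx = gc @ xc
    beta = (gc @ yc) / s_gx
    resid = yc - beta * xc          # residual uses observed x, not fitted x
    sigma2 = resid @ resid / (len(g) - 2)
    se = np.sqrt(sigma2 * (gc @ gc)) / abs(s_gx)
    return beta, se

def simulate_power(r2_yy, b_xy=0.2, n=5000, n_sets=400, seed=0):
    """Mean IV estimate and power for a given outcome discrimination R2."""
    rng = np.random.default_rng(seed)
    b_ux = b_uy = 0.5
    b_gx = np.sqrt(0.05 * (1 + b_ux**2) / 0.95)   # assumed: R2(X on G) = 0.05
    estimates, hits = [], 0
    for _ in range(n_sets):
        g, u = rng.normal(size=n), rng.normal(size=n)
        x = b_gx * g + b_ux * u + rng.normal(size=n)
        y = b_xy * x + b_uy * u + rng.normal(size=n)
        noise_sd = np.std(y) * np.sqrt((1 - r2_yy) / r2_yy)
        y_star = y + rng.normal(scale=noise_sd, size=n)
        beta, se = wald_iv(g, x, y_star)
        estimates.append(beta)
        hits += int(beta / se > 1.96)   # significant positive effect
    return float(np.mean(estimates)), hits / n_sets

mean_perfect, power_perfect = simulate_power(r2_yy=1.0)
mean_noisy, power_noisy = simulate_power(r2_yy=0.5)
print(f"R2yy* = 1.0: mean estimate {mean_perfect:.2f}, power {power_perfect:.2f}")
print(f"R2yy* = 0.5: mean estimate {mean_noisy:.2f}, power {power_noisy:.2f}")
```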

These simulations were repeated in the absence of confounding (with different U variables affecting X and Y), yielding similar results and conclusions. It has previously been shown that X–Y confounding does not affect bias or power for 2SLS when IVs are strong. IV strength is measured by the F statistic from the first-stage regression of X on G.24 IVs with F > 10 are typically considered to be strong. The mean first-stage F values for all scenarios considered in this work were > 50 and thus free of appreciable weak-IV biases.
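The expected first-stage F statistic follows from the first-stage R2 and the sample size. The sketch below uses the standard relation between F and R2 for a regression with k instruments; it also assumes that classical error in X* is independent of G, in which case the R2 of X* on G is the product of the R2 of X on G and R2xx*.

```python
def first_stage_f(r2, n, k=1):
    """F statistic for a first-stage regression with k instruments."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

n = 5000
f_true_exposure = first_stage_f(0.05, n)            # X measured without error
f_noisy_exposure = first_stage_f(0.05 * 0.25, n)    # R2xx* = 0.25
print(f"{f_true_exposure:.0f} {f_noisy_exposure:.0f}")  # prints "263 63"
```

Both values are well above the conventional threshold of 10, in line with the statement that the scenarios considered are free of appreciable weak-IV bias.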

Simulation 2: the effect of calibration error on bias and power in MR studies

Similar simulations consisting of 10 000 data sets were carried out to evaluate the effect of calibration error on bias, precision and power. G, X, U and Y were generated in an identical fashion to simulation 1. However, X* and Y* were generated with calibration error rather than discrimination error. For example, X* was generated as  

$$X^{*} = \beta_{xx^*}X \tag{13}$$

Calibration error was introduced by setting βxx* equal to 1.50, 1.25, 1.00, 0.75 or 0.50, with 1.00 representing perfect calibration. Y* was generated in a similar fashion, varying βyy*. 2SLS was used to analyse all simulated data sets.
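A single large simulated data set is enough to see how calibration error acts on the ratio estimate. The sketch below assumes calibration error is a pure rescaling of the true variable with no added noise, uses a covariance-ratio (Wald) estimate in place of 2SLS, and uses a much larger n than the study so that sampling noise is small; the constants repeat the assumptions stated for simulation 1.

```python
import numpy as np

def wald_ratio(g, x, y):
    """Ratio of the G-Y covariance to the G-X covariance."""
    return np.cov(g, y)[0, 1] / np.cov(g, x)[0, 1]

rng = np.random.default_rng(3)
n, b_xy, b_ux, b_uy = 1_000_000, 0.3, 0.5, 0.5
b_gx = np.sqrt(0.05 * (1 + b_ux**2) / 0.95)   # assumed: R2(X on G) = 0.05
g, u = rng.normal(size=n), rng.normal(size=n)
x = b_gx * g + b_ux * u + rng.normal(size=n)
y = b_xy * x + b_uy * u + rng.normal(size=n)

results = {}
for b_xx, b_yy in [(1.0, 1.0), (1.5, 1.0), (0.75, 1.0), (1.0, 1.5), (1.5, 1.5)]:
    x_star, y_star = b_xx * x, b_yy * y       # calibration error only
    results[(b_xx, b_yy)] = wald_ratio(g, x_star, y_star)
    expected = b_xy * b_yy / b_xx
    print(f"bxx* = {b_xx:.2f}, byy* = {b_yy:.2f}: "
          f"estimate {results[(b_xx, b_yy)]:.3f}, expected {expected:.3f}")
```

The last pair illustrates that equal mis-calibration of exposure and outcome leaves the estimate at the true effect.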

Simulation 3: exposure discrimination error in MR studies of binary outcomes

To examine the effect of exposure measurement error in MR studies of binary outcomes, data on G, X and U were generated as in simulations 1–2, but Y was generated as a binary outcome using a logistic model for 5000 individuals in 10 000 data sets.

The logistic model for Y also includes effects for both X and U:  

$$\operatorname{logit}\left[P(Y = 1)\right] = \beta_{0} + \beta_{xy}X + \beta_{uy}U \tag{14}$$

βxy was chosen to produce specific odds ratios for the true effect of X on Y (odds ratio = 1.0, 1.25, 1.5, 1.75 and 2.0), and β0 was chosen to produce an average population risk of 0.10 (this was varied in supplementary analyses). βux and βuy were set to 0.5. For X*, we introduced discrimination error (R2xx* = 1.0, 0.75, 0.50 or 0.25) as described in previous simulations. Each simulated data set of 5000 observations was analysed using a two-stage regression: linear regression of X* on G, followed by a logistic regression of Y on the predicted X* value from the stage-1 regression. In the second stage, standard errors were obtained using the ‘robust’ option in Stata.
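The two-stage linear-logistic analysis can be sketched as follows. This is an illustration under stated assumptions, not the study's Stata code: the logistic fit is a plain Newton-Raphson routine, β0 is found by bisection, robust standard errors are not computed, and a single large data set (n = 200 000 rather than 5000) is used so that the point estimate is stable. All function names are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson fit of logit P(y = 1) = a + b*x; returns (a, b)."""
    design = np.column_stack([np.ones_like(x), x])
    coef = np.zeros(2)
    for _ in range(n_iter):
        p = expit(design @ coef)
        grad = design.T @ (y - p)
        hess = design.T @ (design * (p * (1 - p))[:, None])
        coef = coef + np.linalg.solve(hess, grad)
    return coef

def solve_intercept(lin_pred, target_risk):
    """Bisection for the intercept that gives the target mean risk."""
    lo, hi = -10.0, 10.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if expit(mid + lin_pred).mean() < target_risk:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

rng = np.random.default_rng(5)
n, true_or, r2_xx = 200_000, 1.5, 0.5
b_ux = b_uy = 0.5
b_gx = np.sqrt(0.05 * (1 + b_ux**2) / 0.95)      # assumed: R2(X on G) = 0.05
g, u = rng.normal(size=n), rng.normal(size=n)
x = b_gx * g + b_ux * u + rng.normal(size=n)
x_star = x + rng.normal(scale=np.std(x) * np.sqrt((1 - r2_xx) / r2_xx), size=n)

lin_pred = np.log(true_or) * x + b_uy * u
b0 = solve_intercept(lin_pred, target_risk=0.10)
y = (rng.random(n) < expit(b0 + lin_pred)).astype(float)

# Stage 1: linear regression of X* on G; stage 2: logistic regression of Y
# on the stage-1 fitted values.
slope, intercept = np.polyfit(g, x_star, 1)
fitted = intercept + slope * g
or_est = float(np.exp(fit_logistic(fitted, y)[1]))
print(f"outcome risk {y.mean():.3f}, two-stage OR estimate {or_est:.2f}")
```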

Simulation 4: exposure calibration error in MR studies of binary outcomes

Data on G, X, U and Y were generated, and analyses were conducted in an identical fashion to simulation 3. However, for X*, we introduced calibration error (βxx* = 1.5, 1.25, 1.0, 0.75 or 0.5) as described in previous simulations.

Additional simulations

We conducted additional simulations investigating measurement error in binary outcomes in the MR setting, by varying the sensitivity and specificity of the outcomes measure. We also investigated the effect of varying the population risk for the outcome in the context of reduced sensitivity and specificity. Details on these simulations can be found in the supplementary material.

Results

Simulation 1: discrimination error affects power but does not introduce bias (Table 1)

For all scenarios evaluated, the mean MR effect estimates were equal to the true effect. However, discrimination error in Y* resulted in substantial increases in the mean standard error of the MR estimates and corresponding decreases in power for all scenarios in which βxy did not equal zero. These increases became more pronounced as R2yy* decreased and the true effect size (βxy) increased. Discrimination error for X* had very minor effects on the standard error and power as compared with Y*. When discrimination errors in X* and Y* were examined jointly, their effects on bias, precision and power were similar to their effects when examined independently (Supplementary Table S1, available as Supplementary data at IJE online).

Table 1

The effect of discrimination error in a measured continuous exposure (X*) and continuous outcome (Y*) on bias, precision and power in Mendelian randomization studies

| Exposure (R2xx*) | Outcome (R2yy*) | βxy = 0.0: mean MR estimate | βxy = 0.0: mean standard error | βxy = 0.0: power | βxy = 0.1: mean MR estimate | βxy = 0.1: mean standard error | βxy = 0.1: power | βxy = 0.2: mean MR estimate | βxy = 0.2: mean standard error | βxy = 0.2: power | βxy = 0.3: mean MR estimate | βxy = 0.3: mean standard error | βxy = 0.3: power |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Discrimination error for X only | | | | | | | | | | | | | |
| 1.00 | 1.00 | 0.00 | 0.062 | 0.03 | 0.10 | 0.062 | 0.37 | 0.20 | 0.062 | 0.88 | 0.30 | 0.062 | 0.99 |
| 0.75 | 1.00 | 0.00 | 0.062 | 0.03 | 0.10 | 0.062 | 0.37 | 0.20 | 0.063 | 0.88 | 0.30 | 0.063 | 0.99 |
| 0.50 | 1.00 | 0.00 | 0.062 | 0.02 | 0.10 | 0.063 | 0.37 | 0.20 | 0.064 | 0.88 | 0.30 | 0.065 | 0.99 |
| 0.25 | 1.00 | 0.00 | 0.063 | 0.02 | 0.10 | 0.064 | 0.36 | 0.20 | 0.067 | 0.87 | 0.30 | 0.072 | 0.99 |
| Discrimination error for Y only | | | | | | | | | | | | | |
| 1.00 | 1.00 | 0.00 | 0.062 | 0.03 | 0.10 | 0.062 | 0.37 | 0.20 | 0.062 | 0.88 | 0.30 | 0.062 | 0.99 |
| 1.00 | 0.75 | 0.00 | 0.072 | 0.03 | 0.10 | 0.072 | 0.29 | 0.20 | 0.073 | 0.77 | 0.30 | 0.073 | 0.97 |
| 1.00 | 0.50 | 0.00 | 0.088 | 0.03 | 0.10 | 0.089 | 0.21 | 0.20 | 0.090 | 0.60 | 0.30 | 0.092 | 0.89 |
| 1.00 | 0.25 | 0.00 | 0.124 | 0.02 | 0.10 | 0.126 | 0.12 | 0.20 | 0.130 | 0.35 | 0.30 | 0.134 | 0.60 |

Estimates were derived using a two-stage least squares regression on 10 000 simulated data sets. Simulated data sets consisted of 5000 samples. In these simulations, G explains 5% of the variation in X (R2 = 0.05).

Simulation 2: calibration error can bias the MR estimate but does not affect power (Table 2)

When calibration error for X* and Y* was equal (βxx* = βyy*), the MR estimate was unbiased (shown in bold). However, when X and Y were measured with different amounts of calibration error and the true effect of X on Y (βxy) was not zero, MR estimates were biased. Specifically, when βxx* > βyy*, bias was towards the null, and when βxx* < βyy*, bias was away from the null. Absolute bias increased with βxy, but the relative bias was constant (βyy*/βxx*). When βxy = 0, calibration error did not introduce bias. Standard errors decreased when bias towards the null was present and increased when bias away from the null was present, resulting in power estimates and type-I error rates that were unaffected by calibration error.

Table 2

Effect of calibration error in the measured exposure (X*) and outcome (Y*) on bias, precision and power in Mendelian randomization studies

| Exposure (βxx*) | Outcome (βyy*) | βxy = 0.00: mean MR estimate | βxy = 0.00: mean standard error | βxy = 0.00: power | βxy = 0.10: mean MR estimate | βxy = 0.10: mean standard error | βxy = 0.10: power | βxy = 0.20: mean MR estimate | βxy = 0.20: mean standard error | βxy = 0.20: power | βxy = 0.30: mean MR estimate | βxy = 0.30: mean standard error | βxy = 0.30: power |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Calibration error for X only | | | | | | | | | | | | | |
| 1.50 | 1.00 | 0.00 | 0.041 | 0.03 | 0.08 | 0.050 | 0.38 | 0.13 | 0.041 | 0.88 | 0.20 | 0.041 | 1.00 |
| 1.25 | 1.00 | 0.00 | 0.050 | 0.03 | 0.10 | 0.062 | 0.38 | 0.16 | 0.050 | 0.88 | 0.24 | 0.050 | 0.99 |
| 1.00 | 1.00 | 0.00 | 0.062 | 0.03 | 0.13 | 0.083 | 0.37 | 0.20 | 0.062 | 0.89 | 0.30 | 0.062 | 1.00 |
| 0.75 | 1.00 | 0.00 | 0.083 | 0.03 | 0.20 | 0.124 | 0.37 | 0.27 | 0.083 | 0.88 | 0.40 | 0.083 | 0.99 |
| 0.50 | 1.00 | 0.00 | 0.124 | 0.02 | 0.07 | 0.041 | 0.37 | 0.50 | 0.124 | 0.89 | 1.00 | 0.124 | 1.00 |
| Calibration error for Y only | | | | | | | | | | | | | |
| 1.00 | 1.50 | 0.00 | 0.093 | 0.03 | 0.15 | 0.093 | 0.38 | 0.30 | 0.093 | 0.88 | 0.45 | 0.093 | 0.99 |
| 1.00 | 1.25 | 0.00 | 0.078 | 0.02 | 0.12 | 0.078 | 0.37 | 0.25 | 0.077 | 0.89 | 0.37 | 0.078 | 0.99 |
| 1.00 | 1.00 | 0.00 | 0.062 | 0.03 | 0.10 | 0.062 | 0.38 | 0.20 | 0.062 | 0.89 | 0.30 | 0.062 | 1.00 |
| 1.00 | 0.75 | 0.00 | 0.047 | 0.03 | 0.07 | 0.047 | 0.37 | 0.15 | 0.047 | 0.88 | 0.22 | 0.047 | 1.00 |
| 1.00 | 0.50 | 0.00 | 0.031 | 0.03 | 0.05 | 0.031 | 0.38 | 0.10 | 0.031 | 0.88 | 0.15 | 0.031 | 0.99 |

Estimates were derived using two-stage least squares regression on 10 000 simulated data sets. Simulated data sets consisted of 5000 samples. In these simulations, the IV (G) explains 5% of the variation in X (R2 = 0.05).

When calibration errors in X* and Y* were examined jointly, their effects on bias, precision and power were similar to their effects when examined independently (Supplementary Table S2, available as Supplementary data at IJE online). When calibration and discrimination error were examined jointly, their effects on bias and power were similar to their effects when examined independently (Supplementary Table S3, available as Supplementary data at IJE online).

Simulations 3 and 4: exposure mis-calibration introduces bias in studies of binary outcomes

As previously reported,20 our results show that MR studies of continuous exposures and binary outcomes using a two-stage linear-logistic approach produce biased effect estimates. Under the realistic scenarios examined, discrimination error in X* introduced no additional bias into the MR estimate and resulted in slight increases in the width of the confidence interval with no detectable effect on power (Table 3). Calibration error in X* introduced substantial additional bias into the MR estimate when a true effect was present (odds ratio ≠ 1), with bias away from the null when βxx* < 1 and bias towards the null when βxx* > 1. Calibration error in X* had no clear effect on power (Table 4).

Table 3

The effect of discrimination error in the measured exposure (X*) on bias, precision and power in Mendelian randomization studies of dichotomous outcomes

| Discrimination error for X (R2xx*) | True effect of X on Y (ORxy) | MR estimate: OR | MR estimate: 95% CI | MR estimate: power |
|---|---|---|---|---|
| 1.00 | 1.00 | 1.00 | 0.68–1.47 | 0.02 |
| | 1.25 | 1.24 | 0.85–1.81 | 0.20 |
| | 1.50 | 1.46 | 1.01–2.11 | 0.52 |
| | 1.75 | 1.66 | 1.16–2.37 | 0.79 |
| | 2.00 | 1.83 | 1.29–2.60 | 0.92 |
| 0.75 | 1.00 | 1.00 | 0.68–1.48 | 0.02 |
| | 1.25 | 1.24 | 0.85–1.81 | 0.20 |
| | 1.50 | 1.46 | 1.01–2.12 | 0.52 |
| | 1.75 | 1.66 | 1.16–2.37 | 0.79 |
| | 2.00 | 1.83 | 1.29–2.61 | 0.93 |
| 0.50 | 1.00 | 1.00 | 0.68–1.48 | 0.03 |
| | 1.25 | 1.24 | 0.84–1.81 | 0.19 |
| | 1.50 | 1.46 | 1.01–2.12 | 0.52 |
| | 1.75 | 1.66 | 1.16–2.38 | 0.79 |
| | 2.00 | 1.83 | 1.29–2.61 | 0.92 |
| 0.25 | 1.00 | 1.00 | 0.67–1.48 | 0.03 |
| | 1.25 | 1.24 | 0.84–1.82 | 0.19 |
| | 1.50 | 1.46 | 1.01–2.13 | 0.52 |
| | 1.75 | 1.67 | 1.16–2.41 | 0.79 |
| | 2.00 | 1.84 | 1.29–2.63 | 0.92 |

Estimates were derived using a two-stage regression (i.e. a linear regression of X* on G, followed by a logistic regression of Y on the predicted X value from the first regression) on 10 000 simulated data sets. Simulated data sets consisted of 5000 samples. In these simulations, the IV (G) explains 5% of the variation in X (R2 = 0.05). ORs and CIs are derived from the means of the beta and standard errors derived from the simulations.

OR, odds ratio; CI, confidence interval.
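The conversion from a mean log-odds coefficient and mean standard error to the reported OR and 95% CI can be sketched as follows. The input values are hypothetical: they were back-calculated here from one row of Table 3 (OR 1.46, 95% CI 1.01–2.11) and are not reported in the study.

```python
import math

def or_with_ci(mean_beta, mean_se, z=1.96):
    """Exponentiate a log-odds estimate and its Wald interval limits."""
    points = (mean_beta, mean_beta - z * mean_se, mean_beta + z * mean_se)
    return tuple(round(math.exp(p), 2) for p in points)

# Hypothetical mean beta and mean standard error across simulated data sets.
odds_ratio, ci_lower, ci_upper = or_with_ci(mean_beta=0.378, mean_se=0.188)
print(odds_ratio, ci_lower, ci_upper)   # prints "1.46 1.01 2.11"
```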

Table 4

The effect of calibration error in the measured exposure (X*) on bias, precision and power in Mendelian randomization studies of dichotomous outcomes

| Calibration error for X (βxx*) | True effect of X on Y (ORxy) | MR estimate: OR | MR estimate: 95% CI | MR estimate: power |
|---|---|---|---|---|
| 1.50 | 1.00 | 1.00 | 0.79–1.26 | 0.02 |
| | 1.25 | 1.15 | 0.92–1.44 | 0.23 |
| | 1.50 | 1.29 | 1.03–1.60 | 0.61 |
| | 1.75 | 1.40 | 1.13–1.74 | 0.87 |
| | 2.00 | 1.49 | 1.21–1.84 | 0.97 |
| 1.25 | 1.00 | 1.00 | 0.76–1.32 | 0.03 |
| | 1.25 | 1.18 | 0.90–1.55 | 0.24 |
| | 1.50 | 1.35 | 1.04–1.76 | 0.61 |
| | 1.75 | 1.50 | 1.16–1.93 | 0.87 |
| | 2.00 | 1.62 | 1.26–2.08 | 0.97 |
| 1.00 | 1.00 | 1.00 | 0.70–1.41 | 0.03 |
| | 1.25 | 1.24 | 0.88–1.74 | 0.23 |
| | 1.50 | 1.45 | 1.04–2.02 | 0.61 |
| | 1.75 | 1.66 | 1.20–2.28 | 0.87 |
| | 2.00 | 1.83 | 1.34–2.50 | 0.97 |
| 0.75 | 1.00 | 1.00 | 0.63–1.59 | 0.02 |
| | 1.25 | 1.33 | 0.84–2.09 | 0.23 |
| | 1.50 | 1.65 | 1.06–2.57 | 0.62 |
| | 1.75 | 1.95 | 1.27–3.00 | 0.87 |
| | 2.00 | 2.23 | 1.47–3.38 | 0.97 |
| 0.50 | 1.00 | 1.00 | 0.50–2.00 | 0.03 |
| | 1.25 | 1.54 | 0.78–3.04 | 0.24 |
| | 1.50 | 2.13 | 1.10–4.12 | 0.61 |
| | 1.75 | 2.72 | 1.43–5.18 | 0.87 |
| | 2.00 | 3.33 | 1.78–6.23 | 0.97 |

Estimates were derived using a two-stage regression (i.e. a linear regression of X* on G, followed by a logistic regression of Y on the predicted X value from the first regression) on 10 000 simulated data sets. Simulated data sets consisted of 5000 samples. In these simulations, G explains 5% of the variation in X (R2 = 0.05). ORs and CIs are derived from the means of the beta and standard errors derived from the simulations.

OR, odds ratio; CI, confidence interval.

Additional simulations

Additional simulations with a binary outcome showed that the bias of the two-stage estimator decreased with rarer outcomes and that imperfect specificity generated more bias than imperfect sensitivity. Further details are available in the supplementary materials.

Discussion

To our knowledge, this is the first article to systematically consider the effect of measurement error on IV estimates in the MR setting. We have examined two types of measurement error that are common in epidemiological research: discrimination error and calibration error. The third characteristic of our theoretical framework for assessing measurement error is ‘bias’ (Figure 1) in the intercept term in the regression calibration setting, which is not expected to have any effect on bias or power in MR studies, as it does not affect the non-intercept regression coefficients. Our results confirm this expectation (not reported).

Using simulated data, we observed that MR estimates for continuous outcomes are not biased by exposure discrimination error (in contrast to OLS regression) or by outcome discrimination error (similar to OLS), consistent with expectations from equations derived in this work. This is expected, as 2SLS is known to be a consistent estimator of the coefficient for the true exposure,25 not being susceptible to OLS ‘regression dilution bias’ when exposures are measured with classical (discrimination) error. This consistency in the presence of classical measurement error has been mentioned as a key advantage of MR that should motivate its use in epidemiology.1,11,26,27
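The contrast between OLS and the IV estimate under exposure discrimination error can be illustrated with one large simulated data set. The sketch below deliberately omits the confounder U so that the only source of OLS bias is regression dilution. This is a simplification of the study's design, and the IV estimate is computed as a covariance ratio rather than by 2SLS.

```python
import numpy as np

rng = np.random.default_rng(4)
n, b_xy, r2_xx = 500_000, 0.3, 0.5
b_gx = np.sqrt(0.05 / 0.95)          # R2(X on G) = 0.05 when Var(e_x) = 1
g = rng.normal(size=n)
x = b_gx * g + rng.normal(size=n)    # true exposure, no confounding
y = b_xy * x + rng.normal(size=n)    # true outcome
noise_sd = np.std(x) * np.sqrt((1 - r2_xx) / r2_xx)
x_star = x + rng.normal(scale=noise_sd, size=n)   # classical error, R2xx* = 0.5

# OLS of Y on X*: attenuated by the factor Var(X) / Var(X*) = R2xx*.
ols_slope = np.cov(x_star, y)[0, 1] / np.var(x_star, ddof=1)
# IV (Wald) ratio: the noise in X* is independent of G, so it drops out.
iv_ratio = np.cov(g, y)[0, 1] / np.cov(g, x_star)[0, 1]

print(f"true effect {b_xy:.2f}, OLS on X* {ols_slope:.2f}, IV {iv_ratio:.2f}")
```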

Our simulations also show that increasing discrimination error in the measured exposure (X*) has little impact on precision and power, another important advantage of MR. On the other hand, increasing discrimination error in the measured outcome (Y*) will increase standard errors and decrease power (similar to OLS regression), and these effects increase as the true effect of X on Y increases. This result is consistent with analytic expectations based on the equation for the standard error of the MR estimate provided by Martens et al. (Equations 5–7),19 where increases in the variance of Y* have a clear effect on the variance of the MR estimator, but increases in the variance of X* have a less straightforward effect owing to accompanying changes in βx*x and βxx*. For continuous outcomes, our analytic conclusions and our simulation-based conclusions show that calibration errors in X and Y can bias the MR estimate, and this bias can be understood in terms of ‘differential calibration error’ between the measured exposure (X*) and the measured outcome (Y*). In other words, if X* and Y* have similar calibration error, then bias will be small, but if X* and Y* are calibrated in different ways with respect to X and Y, then bias may be substantial. Differential calibration error affects the standard error of the MR estimate in such a way that power is always equal to the scenario where calibration is perfect (similar to OLS regression).10

We have also demonstrated the unique aspects of dealing with measurement error in MR studies of binary outcomes (in the cohort setting). Using a two-stage linear-logistic regression approach, MR estimates are typically somewhat biased, with the confounding structure influencing the magnitude of the bias.20 Several authors have described this bias,2,20,28 which arises because a non-linear relationship between X and Y depends on the unknown distribution of U, whereas a linear relationship does not. However, based on the realistic scenarios examined here, this bias will typically be mild, and power will not be substantially affected. Similar to continuous outcomes, exposure discrimination error results in slight decreases in power and precision, whereas exposure calibration error introduces additional bias. Suboptimal specificity for the measured outcome results in much larger biases towards the null and reductions in power than suboptimal sensitivity because reduced specificity results in misclassification of a larger number of individuals.29 The bias observed in our analyses of binary outcomes decreased as the prevalence of the outcome decreased, consistent with observations made by Bowden and Vansteelandt22 in the context of structural mean models.
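The asymmetry between specificity and sensitivity follows from simple counts when the outcome is uncommon. The sketch below uses the cohort size and population risk from the main simulations (5000 and 0.10); the 90% sensitivity and specificity values are illustrative assumptions, since the values used in the supplementary simulations are not given here.

```python
def expected_misclassified(n, prevalence, sensitivity, specificity):
    """Expected numbers of false negatives and false positives."""
    cases = n * prevalence
    noncases = n - cases
    false_negatives = cases * (1 - sensitivity)
    false_positives = noncases * (1 - specificity)
    return false_negatives, false_positives

fn_only, _ = expected_misclassified(5000, 0.10, sensitivity=0.90, specificity=1.00)
_, fp_only = expected_misclassified(5000, 0.10, sensitivity=1.00, specificity=0.90)
print(f"90% sensitivity: about {fn_only:.0f} cases missed")
print(f"90% specificity: about {fp_only:.0f} non-cases labelled as cases")
```

With 500 true cases and 4500 non-cases, the same 10% error rate misclassifies about 50 individuals when applied to sensitivity and about 450 when applied to specificity.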

Consideration of measurement error issues may be important for the interpretation of estimates from MR studies of error-prone exposures. For instance, self-reported body mass index (BMI), commonly used as an exposure in MR studies,30,31 is known to be under-reported as BMI increases, resulting in a BMI measure that is not perfectly calibrated to its true value (βxx* < 1). Thus, if self-reported BMI is used as an exposure (or an outcome) in the MR setting, the causal effect may be underestimated (or overestimated, respectively). If BMI is considered a proxy for exposures such as peripheral or central adiposity or percent body fat, the measurement error structure is even more complicated, an issue that warrants further study. Similarly, self-reported hours of sleep are not well calibrated, with over-reporting for individuals who sleep less.15

Error in molecular biomarker measurements can occur for several reasons. For example, biomarkers can vary by time of day, month, season or in response to acute events or preclinical disease.32 Hence, the timing of sample collection may lead to discrepancies between measured values and longer-term average values. Furthermore, measurements made outside the relevant aetiological time window may not accurately reflect the relevant historical exposure value.33 A wide array of laboratory factors can affect biomarkers in stored samples, including freeze–thaw cycling and storage conditions, contamination (e.g. trace elements) and anticoagulants or stabilizing agents.32,34 Additional variation in measured values could be introduced by temporal variation in the laboratory environment or inherent limitations of the measurement method.35 Unfortunately, the measurement error structure for any biomarker or outcome may not always be measurable owing to a lack of gold-standard measures. However, there are various strategies for reducing the influence of measurement error in biomarker studies, including careful sample handling, inclusion of QC samples to assess and account for errors35 and taking multiple measurements over a period of time.36,37

We did not explore the effects of more complex types of measurement error in this work, such as limit of detection errors, where low values cannot be detected. Schisterman and Little38 have described appropriate strategies for dealing with such data. We also did not consider ‘differential’ measurement error (i.e. errors whose values depend on other features of the data, such as covariates or outcomes). Although we do not consider these more complex forms of measurement error, their consequences can be conceptualized to some extent through their effects on the parameters described in this work. For example, limit of detection errors, if not properly accounted for, would be likely to result in calibration and bias errors owing to a mass of X* values at zero.

For this analysis, we have used a continuous variable as an IV, representing a genetic risk score for X. Such a score may be unrealistic for exposures with few genetic determinants or undesirable because of potential violations of the assumptions required for IV analyses. However, gene-based IVs can be modelled in multiple ways (e.g. a single or multiple allele count variables, dummy variables for specific genotypes/haplotypes), and it has previously been shown that the key factor influencing power is the R2 of the first-stage regression, regardless of what type of IV is used.24 Thus, our findings for a given first-stage R2 will apply approximately to any type of instrument, be it continuous, discrete or a set of multiple instruments. Our analyses were performed using an R2 of 0.05 for the effect of the IV on the exposure. This value is becoming realistic for many disease-related biomarkers of interest, including various lipid-related traits13 and inflammatory biomarkers,14,39,40 assuming multi-SNP IVs are used, but this value remains unrealistically high for other biomarkers. MR studies of exposures that have genetic determinants with weaker effects are more likely to require unrealistic sample sizes. Our simulations were conducted under the assumption that IVs were valid (i.e. all the IV assumptions are met). First-stage F values for all scenarios considered in this work were > 50, so the estimates were free of detectable weak-IV biases. Hence, our results apply to strong IVs only. One key difference between weak-IV biases and the biases owing to discrimination and calibration discussed in this work is that these measurement error biases do not inflate the type-I error rate, whereas weak-IV biases can increase this rate. Genetic variants are typically measured with little error, assuming modern genotyping technologies and adequate QC measures are used.5,6 Thus, we did not devote substantial attention to genotyping error in this article.
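The link between the first-stage R2 and the first-stage F statistic can be made concrete with the standard formula F = (R2/k)/((1 − R2)/(n − k − 1)) for k instruments and n individuals. The sketch below is illustrative: the function name and the sample sizes are hypothetical and are not the values used in our simulations.

```python
def first_stage_f(r2, n, k=1):
    """First-stage F statistic for k instruments and n individuals."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Hypothetical sample sizes, each with a first-stage R2 of 0.05.
for n in (500, 1_000, 5_000):
    print(n, round(first_stage_f(0.05, n), 1))
```

With a single continuous instrument and an R2 of 0.05, a sample of about 1000 individuals is already enough for F to exceed 50 under this formula, whereas splitting the same R2 across many instruments (larger k) lowers F proportionally.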

Data for our analyses were generated according to the two-stage models that were used to analyse the data, assuming no interactions. In scenarios where violations of these assumptions occur, effect estimates will likely be biased.41,42 Other models are available for binary outcomes, such as probit structural equation models and generalized method of moments estimators; however, we chose the two-stage linear-logistic approach (a first-stage linear regression followed by a second-stage logistic regression) because it is most familiar to epidemiologists. MR studies of binary outcomes are prone to biases, which are difficult to completely account for using standard statistical methods; however, such bias can be reduced by including the residuals from the first-stage linear regression in the second-stage logistic regression, assuming the confounding variable is normally distributed and not a modifier of the effect of X on Y.20,43 This ‘residual inclusion’ method has also been shown to reduce (but not eliminate) 2SLS bias when effects are non-linear,44 although the interpretation of the resulting odds ratio parameter is not straightforward. The causal inference literature provides additional methods for handling such settings,45–47 but these impose an assumption of homogeneity (i.e. constant causal effect across units in the population) or give only local causal effects (i.e. effects only for the sub-population for which the instrument changes the exposure). The 2SLS MR estimate has been called the ‘linear IV average effect estimator’ and corresponds to the population, individual or local average causal effect, depending on what model assumptions are made.41 The two-stage linear-logistic model, also called the ‘Wald odds ratio’, corresponds to the causal odds ratio or the local causal odds ratio.41
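The difference between the standard two-stage linear-logistic estimator and the residual inclusion variant lies only in the second-stage design matrix. The following is a minimal sketch under assumed, simulated data; it is not the analysis code used for this article. The helper names (`ols`, `logistic_fit`, `two_stage_estimates`), the data-generating values and the hand-written Newton-Raphson logistic fit are all illustrative assumptions.

```python
import numpy as np


def ols(design, response):
    """Least-squares coefficients."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def logistic_fit(design, response, n_iter=25):
    """Logistic regression coefficients by Newton-Raphson iterations."""
    coef = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(design @ coef)))
        gradient = design.T @ (response - p)
        hessian = design.T @ (design * (p * (1.0 - p))[:, None])
        coef = coef + np.linalg.solve(hessian, gradient)
    return coef


def two_stage_estimates(g, x, y):
    """Log odds ratios for X from two-stage and residual inclusion fits."""
    ones = np.ones_like(g)
    first_design = np.column_stack([ones, g])
    x_hat = first_design @ ols(first_design, x)   # first-stage fitted values
    resid = x - x_hat                             # first-stage residuals
    # Standard approach: second stage uses the fitted exposure only.
    two_stage = logistic_fit(np.column_stack([ones, x_hat]), y)[1]
    # Residual inclusion: observed exposure plus first-stage residuals.
    resid_incl = logistic_fit(np.column_stack([ones, x, resid]), y)[1]
    return two_stage, resid_incl


# Simulated cohort (all values hypothetical).
rng = np.random.default_rng(1)
n = 50_000
g = rng.normal(size=n)
u = rng.normal(size=n)                            # unmeasured confounder
x = np.sqrt(0.10 / 0.95) * g + u + rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-2.0 + 0.5 * x + 0.5 * u))))
est_two_stage, est_resid_incl = two_stage_estimates(g, x, y)
print(f"two-stage: {est_two_stage:.3f}, residual inclusion: {est_resid_incl:.3f}")
```

In this sketch the residual term acts as a proxy for the unmeasured confounder, which is why the approach relies on the distributional assumptions noted above.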

Although this work applies to cohort studies, case–control studies are a common setting for analysis of binary outcomes. MR analyses of case–control data are ideally conducted using methods that integrate information on the prevalence or incidence of the outcome or the distribution of the IV in the population.22,48 Alternatively, a rare disease assumption can be used (when appropriate) to obtain approximate estimates.22 Also, only continuous exposures are considered in this work. Binary exposures will likely be less common in MR applications; however, additional research is needed to evaluate analysis methods and potential biases associated with these scenarios.49 Future studies should explore the effects of measurement error in these settings.

In conclusion, measurement error in both the exposure and the outcome can affect both bias and precision in MR studies. Understanding the potential impact of such errors will help researchers interpret estimates derived from MR analyses. Sensitivity analyses and QC procedures can be used to explore the degree to which the results of MR studies may be affected by measurement error. In future work, we will consider methods for quantifying and accounting for measurement error in MR analyses.

Supplementary Data

Supplementary Data are available at IJE online.

Funding

This work was supported by the Department of Defense [W81XWH-10-1-0499 to B.P.].

Conflict of interest: None declared.

KEY MESSAGES

  • Classical measurement error (i.e. discrimination error) in continuous exposures and outcomes will not bias MR estimates (under traditional assumptions); precision and power will be reduced in the presence of outcome error but essentially unaffected by exposure error.

  • Calibration error in exposure and outcome measures can bias MR estimates if a true effect exists, but power will not be affected.

  • For binary outcomes, the biased two-stage linear-logistic estimator is additionally biased by calibration error, but not discrimination error, in the measured exposure, and the magnitude and direction of this bias depend on the nature of the mis-calibration.

References

1. Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med, 2008, vol. 27 (pg. 1133-63)
2. Didelez V, Sheehan N. Mendelian randomization as an instrumental variable approach to causal inference. Stat Methods Med Res, 2007, vol. 16 (pg. 309-30)
3. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet, 2006, vol. 38 (pg. 904-09)
4. Tian C, Gregersen PK, Seldin MF. Accounting for ancestry: population substructure and genome-wide association studies. Hum Mol Genet, 2008, vol. 17 (pg. R143-50)
5. Kim KK, Won HH, Cho SS, et al. Comparison of identical single nucleotide polymorphisms genotyped by the GeneChip Targeted Genotyping 25 K, Affymetrix 500 K and Illumina 550 K platforms. Genomics, 2009, vol. 94 (pg. 89-93)
6. Turner S, Armstrong LL, Bradford Y, et al. Quality control procedures for genome-wide association studies. Curr Protoc Hum Genet, 2011, Chapter 1:Unit1.19
7. Spearman C. The proof and measurement of association between two things. Am J Psychol, 1904, vol. 15 (pg. 72-101)
8. Davey Smith G, Phillips AN. Inflation in epidemiology: “the proof and measurement of association between two things” revisited. BMJ, 1996, vol. 312 (pg. 1659-61)
9. White E. Design and interpretation of studies of differential exposure measurement error. Am J Epidemiol, 2003, vol. 157 (pg. 380-87)
10. Thomas D, Stram D, Dwyer J. Exposure measurement error: influence on exposure-disease relationships and methods of correction. Annu Rev Public Health, 1993, vol. 14 (pg. 69-93)
11. Bochud M, Rousson V. Usefulness of Mendelian randomization in observational epidemiology. Int J Environ Res Public Health, 2010, vol. 7 (pg. 711-28)
12. Hofker M, Wijmenga C. A supersized list of obesity genes. Nat Genet, 2009, vol. 41 (pg. 139-40)
13. Teslovich TM, Musunuru K, Smith AV, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature, 2010, vol. 466 (pg. 707-13)
14. Dehghan A, Dupuis J, Barbalic M, et al. Meta-analysis of genome-wide association studies in >80 000 subjects identifies multiple loci for C-reactive protein levels. Circulation, 2011, vol. 123 (pg. 731-38)
15. Lauderdale DS, Knutson KL, Yan LL, Liu K, Rathouz PJ. Self-reported and measured sleep duration: how similar are they? Epidemiology, 2008, vol. 19 (pg. 838-45)
16. Spiegelman D, McDermott A, Rosner B. Regression calibration method for correcting measurement-error bias in nutritional epidemiology. Am J Clin Nutr, 1997, vol. 65 Suppl 4 (pg. 1179S-86S)
17. Carroll RJ, Ruppert D, Stefanski LA. Measurement Error in Nonlinear Models. London: Chapman & Hall, 1995
18. Bollen K. Structural Equations with Latent Variables. New York: Wiley, 1989
19. Martens EP, Pestman WR, de Boer A, Belitser SV, Klungel OH. Instrumental variables: application and limitations. Epidemiology, 2006, vol. 17 (pg. 260-67)
20. Palmer TM, Thompson JR, Tobin MD, Sheehan NA, Burton PR. Adjusting for bias and unmeasured confounding in Mendelian randomization studies with binary responses. Int J Epidemiol, 2008, vol. 37 (pg. 1161-68)
21. Rassen JA, Schneeweiss S, Glynn RJ, Mittleman MA, Brookhart MA. Instrumental variable analysis for estimation of treatment effects with dichotomous outcomes. Am J Epidemiol, 2009, vol. 169 (pg. 273-84)
22. Bowden J, Vansteelandt S. Mendelian randomization analysis of case-control data using structural mean models. Stat Med, 2011, vol. 30 (pg. 678-94)
23. Baum CF, Schaffer ME, Stillman S. Instrumental variables and GMM: estimation and testing. Stata J, 2003, vol. 3 (pg. 1-31)
24. Pierce BL, Ahsan H, Vanderweele TJ. Power and instrument strength requirements for Mendelian randomization studies using multiple genetic variants. Int J Epidemiol, 2011, vol. 40 (pg. 740-52)
25. Hausman J. Mismeasured variables in econometric analysis: problems from the right and problems from the left. J Econ Perspect, 2001, vol. 15 (pg. 57-67)
26. Davey Smith G, Ebrahim S. Mendelian randomization: prospects, potentials, and limitations. Int J Epidemiol, 2004, vol. 33 (pg. 30-42)
27. Davey Smith G, Timpson N, Ebrahim S. Strengthening causal inference in cardiovascular epidemiology through Mendelian randomization. Ann Med, 2008, vol. 40 (pg. 524-41)
28. Burgess S, Thompson SG. Bias in causal estimates from Mendelian randomization studies with weak instruments. Stat Med, 2011, vol. 30 (pg. 1312-23)
29. Brenner H, Savitz DA. The effects of sensitivity and specificity of case selection on validity, sample size, precision, and power in hospital-based case-control studies. Am J Epidemiol, 1990, vol. 132 (pg. 181-92)
30. Rothman KJ. BMI-related errors in the measurement of obesity. Int J Obes (Lond), 2008, vol. 32 Suppl 3 (pg. S56-59)
31. Nawaz H, Chan W, Abdulrahman M, Larson D, Katz DL. Self-reported weight and height: implications for obesity research. Am J Prev Med, 2001, vol. 20 (pg. 294-98)
32. Holland NT, Smith MT, Eskenazi B, Bastaki M. Biological sample collection and processing for molecular epidemiological studies. Mutat Res, 2003, vol. 543 (pg. 217-34)
33. White E, Armstrong BK, Saracci R. Principles of Exposure Measurement in Epidemiology. 2nd edn. New York: Oxford University Press, 2008
34. Willett W. Nutritional Epidemiology. 2nd edn. New York: Oxford University Press, 1998
35. Tworoger SS, Hankinson SE. Use of biomarkers in epidemiologic studies: minimizing the influence of measurement error in the study design and analysis. Cancer Causes Control, 2006, vol. 17 (pg. 889-99)
36. Missmer SA, Spiegelman D, Bertone-Johnson ER, Barbieri RL, Pollak MN, Hankinson SE. Reproducibility of plasma steroid hormones, prolactin, and insulin-like growth factor levels among premenopausal women over a 2- to 3-year period. Cancer Epidemiol Biomarkers Prev, 2006, vol. 15 (pg. 972-78)
37. Michaud DS, Manson JE, Spiegelman D, et al. Reproducibility of plasma and urinary sex hormone levels in premenopausal women over a one-year period. Cancer Epidemiol Biomarkers Prev, 1999, vol. 8 (pg. 1059-64)
38. Schisterman EF, Little RJ. Opening the black box of biomarker measurement error. Epidemiology, 2010, vol. 21 Suppl 4 (pg. S1-S3)
39. Pare G, Ridker PM, Rose L, et al. Genome-wide association analysis of soluble ICAM-1 concentration reveals novel associations at the NFKBIK, PNPLA3, RELA, and SH2B3 loci. PLoS Genet, 2011, vol. 7 pg. e1001374
40. Qi L, Cornelis MC, Kraft P, et al. Genetic variants in ABO blood group region, plasma soluble E-selectin levels and risk of type 2 diabetes. Hum Mol Genet, 2010, vol. 19 (pg. 1856-62)
41. Didelez V, Meng S, Sheehan NA. Assumptions of IV methods for observational epidemiology. Stat Sci, 2010, vol. 25 (pg. 22-40)
42. Sheehan NA, Didelez V. Commentary: can ‘many weak’ instruments ever be ‘strong’? Int J Epidemiol, 2011, vol. 40 (pg. 752-54)
43. Cai B, Small DS, Have TR. Two-stage instrumental variable methods for estimating the causal odds ratio: analysis of bias. Stat Med, 2011, vol. 30 (pg. 1809-24)
44. Terza JV, Basu A, Rathouz PJ. Two-stage residual inclusion estimation: addressing endogeneity in health econometric modeling. J Health Econ, 2008, vol. 27 (pg. 531-43)
45. Robins JM, Rotnitzky A. Estimation of treatment effects in randomised trials with non-compliance and a dichotomous outcome using structural mean models. Biometrika, 2004, vol. 91 (pg. 763-83)
46. van der Laan MJ, Hubbard A, Jewell NP. Estimation of treatment effects in randomized trials with noncompliance and a dichotomous outcome. J R Stat Soc Series B Stat Methodol, 2007, vol. 69 (pg. 442-82)
47. Vansteelandt S, Goetghebeur E. Causal inference with generalized structural mean models. J R Stat Soc Series B Stat Methodol, 2003, vol. 65 (pg. 817-35)
48. Shinohara RT, Frangakis CE, Platz E, Tsilidis K. Estimating effects by combining instrumental variables with case-control designs: the role of principal stratification. Johns Hopkins University, Department of Biostatistics Working Papers 2008 (Working Paper 198). http://www.bepress.com/jhubiostat/paper198 (5 September 2012, date last accessed)
49. Cai B. Causal inference with two-stage logistic regression—accuracy, precisions, and application. Publicly Accessible Penn Dissertations (Paper 225), 2010. http://repository.upenn.edu/edissertations/255 (5 September 2012, date last accessed)