## Abstract

The authors describe a statistical method of combining self-reports and biomarkers that, with adequate control for confounding, will provide nearly unbiased estimates of diet-disease associations and a valid test of the null hypothesis of no association. The method is based on regression calibration. In cases in which the diet-disease association is mediated by the biomarker, the association needs to be estimated as the total dietary effect in a mediation model. However, the hypothesis of no association is best tested through a marginal model that includes as the exposure the regression calibration-estimated intake but not the biomarker. The authors illustrate the method with data from the Carotenoids and Age-Related Eye Disease Study (2001–2004) and show that inclusion of the biomarker in the regression calibration-estimated intake increases the statistical power. This development sheds light on previous analyses of diet-disease associations reported in the literature.

Dietary measurement error causes serious challenges to the detection of associations between diet and disease in epidemiologic studies. Estimated relative risks are attenuated and statistical power is reduced (1). Moreover, increasing the sample size provides only a partial remedy, as the attenuated estimates of relative risk are often as low as 1.10–1.25 and, even when statistically significant, might be indistinguishable from the effects of unknown confounders (2). Methods to reduce error in dietary measurements are therefore of primary importance.

To that aim, in previous work (3, 4), we proposed using Howe’s method or principal components for combining dietary reports with dietary biomarkers. In computer simulations and in a real example from the Carotenoids and Age-Related Eye Disease Study (CAREDS), we demonstrated substantial increases in statistical power with this method. However, the method is limited in 3 ways. First, the relative risks derived cannot be translated directly into relative risks per added unit of intake. Second, the method does not always increase power but rather sometimes decreases it. Third, the method does not derive from the usual statistical framework for modeling measurement error.

In the present article, we propose an alternative method for combining dietary self-reports and biomarkers derived from regression calibration (5), a well-known statistical approach to solving measurement error problems. Through theory and computer simulations, we show that, given adequate control for confounding, this approach gives nearly unbiased relative risk estimates and provides a significance test with power equal to or greater than that of tests based on dietary self-report alone. We apply the method to study the negative association of lutein plus zeaxanthin intake with nuclear cataracts that we reported in a previous article (4).

We also compare our method with a closely related proposal of Prentice et al. (6) and clarify methodological issues related to including a biomarker in the regression calibration equation when that same biomarker is involved in the disease model.

## MATERIALS AND METHODS

### Study population

CAREDS is an ancillary study of the Women’s Health Initiative (WHI) Observational Study (7, 8). The CAREDS population included women enrolled at 3 sites: University of Wisconsin (Madison, Wisconsin), University of Iowa (Iowa City, Iowa), and Kaiser Center for Health Research (Portland, Oregon). Of the 3,143 eligible women, 2,005 agreed to participate. Full details of the study design have been published previously (9). All procedures conformed to the Declaration of Helsinki and were approved by institutional review boards at each institution.

### Assessments

Dietary intake was assessed at the WHI Observational Study baseline (1994–1998) by using the WHI semiquantitative food frequency questionnaire (FFQ), which had been pretested (10). Serum samples were collected after subjects had fasted for 10 or more hours at the WHI baseline examinations (1994–1998) and were analyzed for lutein and zeaxanthin (sum of their trans isomers) (9). Serum lutein and zeaxanthin measurements were available for 1,787 women. These women comprised the data set for this study. Participants underwent lens photography and eye examinations during the CAREDS baseline study visits between 2001 and 2004 (9) and completed a questionnaire that included questions about time of cataract surgery in each eye, physician-diagnosed history of cataracts, and personal characteristics. The primary outcome was nuclear cataract, defined as a nuclear sclerosis severity score of 4 or greater in the worst eye or a history of cataract extraction in either eye.

### Confounders

Potential confounders used in the CAREDS analyses relating nuclear cataracts to lutein/zeaxanthin were age, smoking, iris color, body mass index (BMI, measured as weight in kilograms divided by height in meters squared), multivitamin use, physical activity level, hormone replacement therapy, and pulse pressure (9). In the present analysis, we adjust for the 2 strongest confounders, age and smoking. We considered adding BMI to the analysis, but this did not materially change the results.

### Statistical methods

#### Measurement error model.

We assume there are both a single usual dietary intake of interest and a single biomarker measure related to the intake. The true values of these are XI and XM , respectively (I for intake and M for marker). Each is measured with error, the measurements being denoted by WI and WM , respectively. In our example, these measurements are log FFQ-reported lutein plus zeaxanthin intake and log serum level of lutein plus zeaxanthin.

To deal with the effects of the measurement error, we have to know the measurement error model. We assume it takes the following form:

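$$
\begin{aligned}
W_I &= \alpha_{10} + \alpha_{11} X_I + \alpha_{12} X_M + e_1,\\
W_M &= \alpha_{20} + \alpha_{21} X_I + \alpha_{22} X_M + e_2.
\end{aligned} \tag{1}
$$
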
This model is quite general, allowing the intake measurement error to depend on true intake (α11 ≠ 1) and even on true biomarker level (α12 ≠ 0); similarly, the biomarker measurement error may depend on true biomarker level (α22 ≠ 1) and even on true intake level (α21 ≠ 0). The random error (e1 and e2) terms (with zero means) are assumed to be independent of each other, true intake, true biomarker level, and disease outcome, so that the measurement error is nondifferential. The random variables (XI, XM, WI, and WM) are assumed to have a multivariate normal distribution with a known covariance matrix. In practice, these variances and covariances, as well as the α-parameters in model 1, can be estimated by using data from feeding studies in conjunction with the data on WI and WM from the main study. An example of the derivations of these values from the literature is given for lutein plus zeaxanthin in the Appendix to Freedman et al. (3).

In our CAREDS example, the measurement error model was simpler, with α12 = α21 = 0, α20 = 0, and α22 = 1. Each assumption seemed appropriate in this context. Setting α12 = 0 corresponded to assuming that error in dietary reporting of lutein/zeaxanthin intake was unrelated to the serum level, and setting α21 = 0 corresponded to assuming that error in measured serum level was unrelated to the intake level. Setting α20 = 0 and α22 = 1 corresponded to assuming that the measured serum level was an unbiased measure of true serum level.

#### Disease model.

Denote the disease outcome variable by Y. We assume that disease is related to the exposure of interest XI through a generalized linear regression model:

$$h[E(Y \mid X_I, X_M, Z)] = \beta_0 + \beta_1 X_I + \beta_2 X_M + \beta_Z Z, \tag{2}$$

where E denotes expectation, h denotes the link function (e.g., logistic for binary variables or the identity for a continuous variable), and Z represents one or more confounders, measured exactly.

We specifically include the biomarker variable XM in disease model 2 because we assume that the biomarker mediates, at least partially, the effect of dietary intake on disease. See Figure 1 for the causal path diagram. This assumption often seems biologically plausible; for example, if the biomarker is a serum level of the nutrient of interest, then the effect of the nutrient intake on disease will likely be at least partially mediated through the biomarker. We assume that, if the biomarker level is influenced by other factors associated with disease, those factors are included in the covariates Z (Figure 1); in other words, we assume that we can control for confounding.

Figure 1.

Causal pathway diagram describing the relations among dietary intake, biomarker level, confounders, and disease.


Under this causal model (Figure 1), the quantity of most interest is the coefficient for the total association of dietary intake with disease outcome, given by $\beta_1^*$ in the following model:

$$h[E(Y \mid X_I, Z)] = \beta_0^* + \beta_1^* X_I + \beta_Z^* Z. \tag{3}$$

It can be shown that both model 3 and the following equation for $\beta_1^*$ hold exactly when the disease model is a linear regression and approximately when the disease model is nonlinear, such as for logistic and Cox regression:

$$\beta_1^* = \beta_1 + \gamma \beta_2, \tag{4}$$

where γ is the coefficient of XI in the linear regression of XM on XI and Z.

In the absence of measurement error, we could estimate $β1*$ simply by omitting XM from model 2 and estimating the coefficient for XI in model 3. However, as detailed in the next subsection, if we utilize this same strategy using measurement error adjustment, then we can get a biased estimate of $β1*$.
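As a small worked illustration of equation 4, the sketch below computes the total association β1* = β1 + γβ2 for the 4 simulation scenarios. The values of γ, β1, and β2 are those listed in Table 1; the function and variable names are hypothetical and chosen only for this example.

```python
# Worked arithmetic for equation 4: beta1* = beta1 + gamma * beta2.
# GAMMA and the (beta1, beta2) pairs are the Table 1 simulation settings.
GAMMA = 0.60

SCENARIOS = {
    "No mediation": (-1.00, 0.00),
    "Full mediation": (0.00, -1.20),
    "Partial mediation": (-0.48, -0.63),
    "No effect": (0.00, 0.00),
}


def total_effect(beta1, beta2, gamma):
    """Direct effect plus the part of the effect mediated by the biomarker."""
    return beta1 + gamma * beta2


for name, (beta1, beta2) in SCENARIOS.items():
    print(f"{name}: beta1* = {total_effect(beta1, beta2, GAMMA):.2f}")
```

The results agree with the "True β1*" column reported for the simulations (−1.00, −0.72, −0.86, and 0.00).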

#### Estimating the total association of dietary intake with disease.

Regression calibration (RC) (5) is now widely used to adjust estimates of regression coefficients for measurement error in explanatory variables. Suppose we are interested in estimating $β1*$ in model 3. The central idea is to use as the explanatory variable in the regression not WI but rather the expectation of the true value XI conditional on its measurement WI, as well as the covariates Z, written as E(XI|WI, Z). This is essentially a prediction of the unknown XI using its mismeasured value together with the confounders. As long as the measurement error is nondifferential, the method yields regression estimates that in large samples are unbiased (for linear disease models) or nearly unbiased (for nonlinear disease models and moderate-sized relative risks) (5). However, the statistical power for detecting an association is not increased and can actually decrease slightly (5).

Kipnis et al. (11) demonstrated that RC can be extended by including other variables in the “prediction” of XI. Such extra variables increase the precision with which XI can be predicted and thereby also increase the precision with which $β1*$ is estimated. One caveat, however, is that the extra predictors should provide no further information about the disease outcome over and above that provided by XI itself and the confounders Z. Without this condition, the method yields a biased estimate of $β1*$, and the only remedy is to include those extra predictors in the disease model as well.

In the present article, we explore the use of the biomarker WM as an extra predictor of XI. We compare several methods of estimating $β1*$:

1. Unadjusted for measurement error, using the estimated coefficient for WI in the model $h[E(Y)] = \beta_{0U} + \beta_{1U} W_I + \beta_{ZU} Z$;

2. Usual RC, using the estimated coefficient for E(XI|WI, Z) in the model $h[E(Y)] = \beta_{0R} + \beta_{1R} E(X_I \mid W_I, Z) + \beta_{ZR} Z$;

3. Enhanced RC, with the biomarker used for prediction, using the estimated coefficient for E(XI|WI, WM, Z) in the model $h[E(Y)] = \beta_{0E} + \beta_{1E} E(X_I \mid W_I, W_M, Z) + \beta_{ZE} Z$ (the method used by Prentice et al. (6) in an example where the dietary intake was total energy and the “biomarker” was BMI); and

4. A newly proposed method, in which enhanced RC is used to estimate β1 and β2 of model 2 through the model $h[E(Y)] = \beta_{0N} + \beta_{1N} E(X_I \mid W_I, W_M, Z) + \beta_{2N} E(X_M \mid W_I, W_M, Z) + \beta_{ZN} Z$. $\beta_1^*$ is then estimated by $\hat{\beta}_{1N} + \gamma \hat{\beta}_{2N}$, where the hat denotes the estimated value. Note that in the regression model, we use $E(X_M \mid W_I, W_M, Z)$ in place of WM to account for any measurement error in the biomarker.

In the Web Appendix, Part A (available at http://aje.oxfordjournals.org/), we present the approximate expected values of these 4 estimators. This enables us to predict the following regarding the bias in estimating $β1*$.

1. The unadjusted estimate, $\hat{\beta}_{1U}$, is nearly unbiased only if there is no measurement error in dietary intake.

2. The usual RC estimate, $\hat{\beta}_{1R}$, is nearly unbiased if either β2 is zero (i.e., if there is no mediation by the biomarker) or α12 is zero (i.e., if the biomarker provides no information about reported dietary intake over and above that provided by the true intake). The latter scenario is sometimes plausible but not in the example of energy intake and BMI (6).

3. The enhanced RC estimate, $\hat{\beta}_{1E}$, is nearly unbiased if β2 is zero (i.e., if there is no mediation by the biomarker) but not in any other plausible scenarios.

4. The newly proposed estimate is nearly unbiased under the more general conditions of the measurement error model 1 and the disease model 2.

In the Results, we present the results of computer simulations that verify these predictions.
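The behavior of the 4 estimators can be sketched in a few lines for the special case of a linear disease model, where the theory is exact. This is a minimal illustration under stated assumptions, not the simulation code used in the article: the outcome is continuous with a hypothetical unit residual variance, there are no confounders Z, all means are set to zero, and the calibration equations are fitted by least squares on the simulated true values (a shortcut that is possible only in a simulation). The measurement error parameters follow Table 1, and the helper names are hypothetical.

```python
import numpy as np

GAMMA = 0.60  # slope of true biomarker level on true intake (Table 1)


def slope_coefs(y, *columns):
    """Least-squares slopes (intercept dropped) of y on the given columns."""
    design = np.column_stack([np.ones(len(y)), *columns])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1:]


def predict(target, *columns):
    """Fitted values from a least-squares regression of target on columns."""
    design = np.column_stack([np.ones(len(target)), *columns])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ coefs


def four_estimates(beta1, beta2, n=200_000, seed=0):
    rng = np.random.default_rng(seed)
    # True intake and biomarker level, with variances 0.25 and 0.19 (Table 1).
    x_i = rng.normal(0.0, np.sqrt(0.25), n)
    x_m = GAMMA * x_i + rng.normal(0.0, np.sqrt(0.19 - GAMMA**2 * 0.25), n)
    # Measurement error model 1 with alpha12 = alpha21 = 0 and alpha22 = 1.
    w_i = 0.71 * x_i + rng.normal(0.0, np.sqrt(0.36), n)
    w_m = x_m + rng.normal(0.0, np.sqrt(0.05), n)
    # Linear version of disease model 2; the unit residual variance is hypothetical.
    y = beta1 * x_i + beta2 * x_m + rng.normal(0.0, 1.0, n)

    # Calibration predictions, fitted on the simulated truth as a shortcut.
    pred_usual = predict(x_i, w_i)         # E(XI | WI)
    pred_enhanced = predict(x_i, w_i, w_m)  # E(XI | WI, WM)
    pred_marker = predict(x_m, w_i, w_m)    # E(XM | WI, WM)

    b1n, b2n = slope_coefs(y, pred_enhanced, pred_marker)
    return {
        "unadjusted": float(slope_coefs(y, w_i)[0]),
        "usual_rc": float(slope_coefs(y, pred_usual)[0]),
        "enhanced_rc": float(slope_coefs(y, pred_enhanced)[0]),
        "proposed": float(b1n + GAMMA * b2n),
    }


# Full mediation scenario: beta1 = 0, beta2 = -1.20, so the true beta1* is -0.72.
full_mediation = four_estimates(beta1=0.0, beta2=-1.20)
for method, value in full_mediation.items():
    print(f"{method:12s} {value: .3f}")
```

With a large simulated sample, the unadjusted estimate should be attenuated, the usual RC and newly proposed estimates should lie close to −0.72, and the enhanced RC estimate should be inflated away from the null, in line with the predictions listed above.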

To apply these methods to the CAREDS example, we must develop the prediction models, that is, the quantities E(XI|WI, Z), E(XI|WI, WM, Z), and E(XM|WI, WM, Z) involved in usual RC, enhanced RC, and the newly proposed method. See the Web Appendix, Part B.

#### Testing the null hypothesis of no diet-disease association.

Each of the 4 estimation methods described above can also be used to test the hypothesis of no diet-disease association. In each case, the test is obtained by comparing the ratio of the estimate to its standard error with the standard normal distribution. The estimate’s standard error may be computed by using bootstrap methods or, when the measurement error model parameters are assumed known, from the usual model-based estimates.
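The test described above can be sketched as follows. The function name and the numerical inputs are hypothetical; the only ingredients taken from the text are the ratio of the estimate to its standard error and the comparison with the standard normal distribution (two-sided here, which is an assumption).

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided standard normal P value."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical inputs: a log odds ratio of -0.50 with standard error 0.16.
z, p_value = wald_test(-0.50, 0.16)
print(f"z = {z:.2f}, two-sided P = {p_value:.4f}")
```

When the standard error comes from a bootstrap rather than from model-based formulas, the same function applies with the bootstrap standard error passed as `std_error`.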

Two questions arise: 1) Which of the 4 tests are valid? and 2) Which among the valid tests is the most statistically powerful? Answering these questions requires careful definition of the null hypothesis. Specifically, we mean that not only is $\beta_1^*$ zero, but also that its 2 components β1 and β2 in model 2 are both zero. It is theoretically possible that $\beta_1^*$ equals zero even when these 2 components are not zero, namely, when β1 = −γβ2. However, it would be highly unusual for the direct effect of diet on disease (the part not mediated through the biomarker) to be in the opposite direction of the indirect effect (the part mediated) with the 2 effects in precisely the appropriate ratio to cancel each other. Thus, we concentrate on the more plausible hypothesis that β1 = β2 = 0.

The tests derived from all 4 estimators are nearly valid tests (in the same sense that the RC estimator is nearly unbiased) of the above-mentioned null hypothesis. This happens because, under this hypothesis, the expected value of all 4 estimators is nearly zero. Therefore, although 3 of the estimators can be biased, they are all nearly unbiased when the dietary association is zero.

Furthermore, theory predicts that the power of enhanced RC will be larger than that of usual RC, which is expected to have power similar to that of the unadjusted method (see Web material to Kipnis et al. (11)). The ratio of the required sample size using enhanced RC to that using usual RC is given as

We show in our example and in simulations that, among the 4 tests, enhanced RC has the highest statistical power.

#### Computer simulations.

The simulation was designed to mimic data from CAREDS. Parameters for the measurement error and disease models, derived from the literature (3), are shown in Table 1. The different combinations of (β1, β2) values were designed to represent 4 scenarios for the effect of intake on disease: 1) not mediated by the biomarker, 2) fully mediated by the biomarker, 3) partially mediated by the biomarker, and 4) zero. The RC models were assumed known and were calculated from the measurement error parameters.

Table 1.

Parameters for Simulation Study of Both Dietary and Serum Lutein/Zeaxanthin Based on Models 1 and 2,ᵃ Carotenoids in Age-Related Eye Disease Study, 2001–2004

| Set | α11 | α12 | α21 | α22 | Variance of e1 | Variance of e2 | Variance of XI | Variance of XM | γ | β1 | β2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No mediation | 0.71 | 0.00 | 0.00 | 1.00 | 0.36 | 0.05 | 0.25 | 0.19 | 0.60 | −1.00 | 0.00 |
| Full mediation | 0.71 | 0.00 | 0.00 | 1.00 | 0.36 | 0.05 | 0.25 | 0.19 | 0.60 | 0.00 | −1.20 |
| Partial mediation | 0.71 | 0.00 | 0.00 | 1.00 | 0.36 | 0.05 | 0.25 | 0.19 | 0.60 | −0.48 | −0.63 |
| No effect | 0.71 | 0.00 | 0.00 | 1.00 | 0.36 | 0.05 | 0.25 | 0.19 | 0.60 | 0.00 | 0.00 |

ᵃ Measurement error model (model 1): WI = α10 + α11XI + α12XM + e1, WM = α20 + α21XI + α22XM + e2. Disease model (model 2): h[E(Y|XI, XM, Z)] = β0 + β1XI + β2XM + βZ Z, where XI and XM are the true values of intake and biomarker, respectively; WI and WM are their measured values; Y is the disease indicator; the e’s are random errors; and γ is the coefficient of XI in the linear regression of XM on XI and Z.

For each simulation, a study with 500 individuals and measurements (Y, WI, WM) was generated from the models. Then, the 4 estimation methods were applied to the data, leading to 4 estimates of $β1*$. This was repeated 1,000 times, and the means and standard deviations of each of the 4 estimators over the 1,000 repetitions were calculated. To estimate statistical power, Wald chi-squared statistics were computed, and the proportion of simulations in which the statistic exceeded the 95th percentile was calculated.
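The power calculation described above can be sketched as follows. The function name and the example inputs are hypothetical; the critical value 3.841 is the 95th percentile of the chi-squared distribution with 1 df. The Monte Carlo standard error is an addition for illustration and is not reported in the article.

```python
import numpy as np

CHI2_1DF_95 = 3.841  # 95th percentile of the chi-squared distribution with 1 df


def empirical_power(estimates, std_errors):
    """Share of simulated studies whose Wald chi-squared statistic is significant."""
    wald = (np.asarray(estimates) / np.asarray(std_errors)) ** 2
    power = float(np.mean(wald > CHI2_1DF_95))
    # Binomial Monte Carlo standard error of the estimated power.
    mc_se = float(np.sqrt(power * (1.0 - power) / wald.size))
    return power, mc_se


# Hypothetical draws under the null: estimates centered at 0 with SD equal to the SE.
rng = np.random.default_rng(1)
null_estimates = rng.normal(0.0, 0.26, 20_000)
power, mc_se = empirical_power(null_estimates, np.full(20_000, 0.26))
print(f"rejection rate = {power:.3f} (Monte Carlo SE {mc_se:.3f})")
```

With 1,000 repetitions and a power near 0.75, the same binomial formula gives a Monte Carlo standard error of about 0.014, which is a useful yardstick when comparing simulated powers across methods.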

## RESULTS

### CAREDS example

#### Regression calibration equations.

Equations for predicting true intake (and true biomarker level) are required to implement usual RC, enhanced RC, and the newly proposed method. For usual RC, the quantity E(XI|WI, Z) is required, where WI is the log FFQ report of lutein/zeaxanthin intake and Z represents the covariates age (y) and smoking (0 = nonsmoker, 1 = former smoker, 2 = current smoker). We obtained the equation:

For enhanced RC, the biomarker level was included in the prediction, and we obtained the equation:
Note that the coefficient for WM, the log measured serum level, is relatively large, showing its major role in the prediction of true intake.

For the newly proposed method, we also needed a prediction equation for true serum lutein/zeaxanthin, obtained as follows:

Note that the coefficient of WI, the log FFQ-reported intake, is small, showing its minor role in predicting serum level.

#### Estimates of risk and significance tests.

Estimates of the odds ratio of nuclear cataract associated with a doubling of lutein/zeaxanthin intake derived from the 4 methods are shown in Table 2. The unadjusted estimate of 0.89 was closer to the null value of 1.0 than were the estimates of the other methods, and it just achieved statistical significance (P = 0.038). Usual RC yielded a stronger (negative) association (odds ratio = 0.72) but the same level of significance (P = 0.038). Enhanced RC estimated an even stronger association (odds ratio = 0.70) that was highly significant (P = 0.002). The newly proposed method estimated a weaker association than did the enhanced RC method (odds ratio = 0.74), with a level of significance similar to that of usual RC (P = 0.046).

Table 2.

Logistic Regression Analyses Relating Nuclear Cataracts to Dietary Lutein/Zeaxanthin Intake in the Carotenoids in Age-Related Eye Disease Study, 2001–2004

| Estimation Method | Coefficient (SE) | Odds Ratioᵃ | Z Value | Sample-Size Ratioᵇ |
|---|---|---|---|---|
| Unadjusted | −0.165 (0.080) | 0.89 | −2.07 | 1.0 |
| Regression calibration | −0.464 (0.225) | 0.72 | −2.07 | 1.0 |
| Enhanced regression calibration | −0.506 (0.161) | 0.70 | −3.15 | 0.43 |
| Newly proposed | −0.436 (0.218) | 0.74 | −2.00 | 1.07 |

Abbreviation: SE, standard error.

ᵃ Odds ratio of disease for a doubling of the dietary intake of lutein/zeaxanthin.

ᵇ The ratio of sample sizes required to give the same statistical power as that based on the unadjusted analysis.
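The derived columns of Table 2 can be cross-checked from the coefficients and Z values. The sketch below rests on 2 assumptions that the table does not state explicitly: that intake is on the natural log scale, so a doubling adds ln 2 to the exposure, and that the sample-size ratio is the squared ratio of the unadjusted Z value to the Z value of the method in question. The function names are hypothetical.

```python
import math

# (coefficient, Z value) pairs copied from Table 2.
TABLE2 = {
    "Unadjusted": (-0.165, -2.07),
    "Regression calibration": (-0.464, -2.07),
    "Enhanced regression calibration": (-0.506, -3.15),
    "Newly proposed": (-0.436, -2.00),
}


def odds_ratio_per_doubling(coefficient):
    """Odds ratio for a doubling of intake, given a coefficient per unit of ln(intake)."""
    return math.exp(coefficient * math.log(2.0))


def sample_size_ratio(z_reference, z_method):
    """Sample size needed by a method relative to the reference, for equal power."""
    return (z_reference / z_method) ** 2


z_unadjusted = TABLE2["Unadjusted"][1]
for method, (coefficient, z_value) in TABLE2.items():
    print(
        f"{method}: OR = {odds_ratio_per_doubling(coefficient):.2f}, "
        f"sample-size ratio = {sample_size_ratio(z_unadjusted, z_value):.2f}"
    )
```

Under these assumptions the computed values reproduce the odds ratios (0.89, 0.72, 0.70, 0.74) and the sample-size ratios (1.0, 1.0, 0.43, 1.07) shown in the table.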

Which of the 4 methods should one choose? For estimation, the newly proposed method is the only one that is nearly unbiased under general models 1 and 2, but in our particular case in which α12 = 0, the usual RC method is also nearly unbiased. Therefore, one may choose between them, and, because its standard error is smaller, the newly proposed method would be preferable. With regard to significance testing, all 4 tests are nearly valid, and the natural choice is the enhanced RC method because it is the most powerful.

#### Sample size savings.

We calculated the predicted ratio of sample size required using enhanced RC to that required using usual RC,

$\frac{0.0683}{0.1250} = 0.55.$
This value was calculated without reference to the outcome variable. However, CAREDS data allowed us to calculate the sample size savings in relation to testing the association with nuclear cataracts. The sample size ratio for the enhanced RC method versus the usual RC method was estimated as 0.43 (Table 2), which was not dissimilar to the predicted value of 0.55.
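One way to see where a predicted ratio of this size comes from is sketched below. It assumes that the predicted sample-size ratio is the variance of the usual RC prediction of true intake divided by the variance of the enhanced RC prediction; this reading is an assumption, not a formula quoted from the article. The sketch uses the Table 1 parameters, ignores the confounders Z, and takes Cov(XI, XM) = γ Var(XI), so its result is expected to be close to, but not identical with, the CAREDS value of 0.55.

```python
import numpy as np

# Table 1 parameters (alpha12 = alpha21 = 0, alpha22 = 1); confounders Z are ignored.
var_xi, var_xm, gamma = 0.25, 0.19, 0.60
alpha11, var_e1, var_e2 = 0.71, 0.36, 0.05
cov_xi_xm = gamma * var_xi  # assumes gamma is the slope of XM on XI without Z

# Covariance matrix of the measurements (WI, WM).
cov_w = np.array([
    [alpha11**2 * var_xi + var_e1, alpha11 * cov_xi_xm],
    [alpha11 * cov_xi_xm, var_xm + var_e2],
])
# Covariances of true intake XI with (WI, WM).
cov_xi_w = np.array([alpha11 * var_xi, cov_xi_xm])


def prediction_variance(indices):
    """Variance of the best linear predictor of XI from the chosen measurements."""
    idx = list(indices)
    c = cov_xi_w[idx]
    s = cov_w[np.ix_(idx, idx)]
    return float(c @ np.linalg.solve(s, c))


var_usual = prediction_variance([0])        # predictor uses WI only
var_enhanced = prediction_variance([0, 1])  # predictor uses WI and WM
ratio = var_usual / var_enhanced
print(f"{var_usual:.4f} / {var_enhanced:.4f} = {ratio:.2f}")
```

Under these simplifications the ratio is approximately 0.53, which is of the same order as the value of 0.55 obtained with the CAREDS prediction equations.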

### Computer simulations

Results regarding bias in the estimated risk parameter resembled those predicted by theory (Table 3). The unadjusted estimate was attenuated, except under zero dietary effect. The usual RC method gave nearly unbiased estimates in all scenarios because we set the measurement error parameter α12 at zero. The enhanced RC method gave nearly unbiased estimates under no mediation of the dietary effect through the biomarker and also under zero dietary effect, but when mediation occurred, the estimate was biased and inflated away from the null. The newly proposed method gave nearly unbiased estimates in all scenarios.

Table 3.

Simulated Means and Standard Deviations of Estimates of the Marginal Effect $\beta_1^*$ of Dietary and Serum Lutein/Zeaxanthin on Eye Disease for 4 Different Methods of Estimationᵃ

| Set | True $\beta_1^*$ | Unadjusted, Mean (SE) | Unadjusted, SD | Regression Calibration, Mean (SE) | Regression Calibration, SD | Enhanced Regression Calibration, Mean (SE) | Enhanced Regression Calibration, SD | Newly Proposed, Mean (SE) | Newly Proposed, SD |
|---|---|---|---|---|---|---|---|---|---|
| No mediation | −1.00 | −0.349 (0.004) | 0.135 | −0.951 (0.008) | 0.259 | −0.989 (0.006) | 0.193 | −0.966 (0.008) | 0.261 |
| Full mediation | −0.72 | −0.257 (0.004) | 0.132 | −0.688 (0.008) | 0.254 | −1.215 (0.006) | 0.197 | −0.716 (0.008) | 0.260 |
| Partial mediation | −0.86 | −0.303 (0.004) | 0.131 | −0.826 (0.008) | 0.258 | −1.121 (0.006) | 0.201 | −0.850 (0.008) | 0.264 |
| No effect | 0.00 | −0.006 (0.004) | 0.128 | −0.017 (0.011) | 0.351 | −0.003 (0.008) | 0.260 | −0.017 (0.011) | 0.352 |

Abbreviations: SD, standard deviation; SE, standard error.

ᵃ There were 1,000 simulated studies per set with 500 individuals per study.

The precisions of the estimates differed markedly. The standard deviations of the usual RC and newly proposed estimates were similar and larger than those of enhanced RC estimates. Each of the 4 significance tests yielded approximately 5% of significant results under the null hypothesis (Table 4). The tests based on the unadjusted method and RC method were identical and had lower statistical power than did the test based on the enhanced RC method (Table 4). The test based on the newly proposed method had slightly higher power than did the RC method.

Table 4.

Simulated Statistical Powers of Significance Testsᵃ of the Marginal Effect $\beta_1^*$ of Dietary and Serum Lutein/Zeaxanthin on Eye Disease for 4 Different Methods of Testingᵇ

| Set | True $\beta_1^*$ | Unadjusted | Regression Calibration | Enhanced Regression Calibration | Newly Proposed |
|---|---|---|---|---|---|
| No mediation | −1.00 | 0.746 | 0.746 | 0.951 | 0.748 |
| Full mediation | −0.72 | 0.496 | 0.496 | 0.994 | 0.520 |
| Partial mediation | −0.86 | 0.633 | 0.633 | 0.986 | 0.649 |
| No effect | 0.00 | 0.045 | 0.045 | 0.045 | 0.044 |

ᵃ All tests are based on a Wald chi-squared test statistic with 1 df, assuming that the parameters of the measurement error model are known a priori. Unadjusted test: regress disease on WI; test that $\beta_1^* = 0$. Regression calibration test: regress disease on E(XI|WI); test that $\beta_1^* = 0$ (equivalent to test 1). Enhanced regression calibration test: regress disease on E(XI|WI, WM); test that $\beta_1^* = 0$. Newly proposed test: regress disease on E(XI|WI, WM) and E(XM|WI, WM); test that $\beta_1^* = \beta_1 + \gamma \beta_2 = 0$ (assuming γ is known).

ᵇ There were 1,000 simulated studies per scenario with 500 individuals per study.

## DISCUSSION

We have described an RC-based method of combining self-reports and biomarkers that, under reasonable assumptions, provides 1) a nearly valid significance test of the diet-disease association with increased power and 2) nearly unbiased estimates of relative risks or odds ratios for the association.

The method relies on prior knowledge or estimation of the statistical model describing the measurement error in self-reports of the dietary intake and in biomarker levels related to dietary intake. For the few existing recovery biomarkers (the doubly labeled water technique (12) for measuring energy intake and 24-hour urinary nitrogen (13) and potassium (14) for measuring protein and potassium intake), this method could be easily applied given the biomarkers’ known quantitative relation to intake (13), although the cost or effort to perform these tests in very large numbers may be prohibitive. For the newly developed predictive biomarker for sugars (15), estimation of the measurement error parameters has been recently described (16). In our example, we extracted such prior knowledge from the literature on carotenoid feeding studies, validation studies of dietary reporting of carotenoids, and cohort studies that investigated carotenoid-disease associations (3). For other concentration biomarkers, such as other serum carotenoids or vitamin C or adipose tissue fatty acids, a similar exercise using previous feeding studies could be attempted; otherwise, new feeding studies will be needed to develop the RC equations lying at the heart of the method. One such feeding study is now being conducted (17). The method is not applicable to foods or food patterns that have no known specific biomarkers.

When the parameters of the measurement error model are estimated from a feeding study, the limited size of the study often limits the precision of the estimates. This uncertainty transfers to the risk estimates obtained from the RC adjustment. One should then use a method to adjust the standard error of the risk estimate to include this extra uncertainty, such as the bootstrap or stacking equations (see Appendix B.3 of Carroll et al. (5)).

The method we propose is linked to a previous proposal to use principal components or Howe’s method to combine different error-prone measures of dietary intake. Our results, which showed that enhanced RC can yield reductions in sample size of approximately 50%, are similar to the savings found using Howe’s method applied to the same data set (4). However, this similarity is serendipitous. It happened that in this data set, Howe’s method yielded a dietary index close to the RC prediction of dietary intake based on self-reported intake and serum level, and consequently provided a nearly optimal analysis. This will not always happen, and neither principal components nor Howe’s method is guaranteed to increase statistical power. In many cases, both methods can actually decrease it. In contrast, enhanced RC is expected always to increase power provided the biomarker improves prediction of dietary intake.

Although the enhanced RC method yields a valid and more powerful significance test of the diet-disease association, it does not in general provide an unbiased estimate of the risk parameter. Whenever there is mediation of the dietary effect through the biomarker, which is often expected, the estimate becomes inflated. We have provided a new method that yields an unbiased estimate, albeit with lower precision than that provided by the enhanced RC method. When the biomarker is uncorrelated with dietary reporting error, the usual RC estimate (based on self-report alone) will also be unbiased. We recommend reporting one of these unbiased estimates together with the result of the significance test based on enhanced RC.

Prentice et al. (6) presented an analysis of the association between energy intake and several invasive cancers based on a calibration equation for energy intake that includes BMI. The authors discussed whether BMI was a confounder or a mediator of the diet-disease association. Assuming it was a mediator, they estimated hazard ratios by using a method that corresponded to enhanced RC. According to the results of the present study, the significance tests of the diet-disease association were valid, but the estimated hazard ratios were biased. If BMI were a confounder, then the risk quantity of interest would no longer be $β1*$ but β1 in model 2, the confounder-adjusted dietary risk parameter.

Possible confounding of the diet-disease association through the biomarker is the most serious obstacle to using our approach. If there are risk factors for disease that also affect the biomarker, then introducing the biomarker into the prediction of dietary intake while not controlling for these risk factors in the disease model will lead to biased estimation with unknown direction of the bias. Biomarkers known to bear a strong relation to dietary intake, such as recovery (18) and predictive (15) biomarkers, will be largely immune from such concern, but concentration biomarkers are affected by complex metabolic pathways in their regulation and will always be subject to concerns about confounding. If a strong risk factor for the disease is known to affect the biomarker, that risk factor must be included in the disease risk model so as to avoid ascribing its effect to nutritional components. In the CAREDS example, smoking was included in the model because it is associated with nuclear cataract and might also lead to depletion of lutein and zeaxanthin in blood, as it is a source of free radicals and oxidative stress (19).

Another challenge is the need to specify a measurement error model, such as model 1. Naturally, such a model could be incomplete and might omit influential explanatory variables. For a discussion of this challenge and related cost issues, see the Web Appendix, Part C.

In summary, a major obstacle facing the field of nutritional epidemiology is the loss of statistical power for detecting diet-disease associations that results from measurement error. With careful use, the methods described in the present article could yield more powerful tests of such associations together with reliable risk estimates. Use of these methods in other branches of epidemiology is discussed briefly in the Web Appendix, Part C.

### Abbreviations

- BMI: body mass index
- CAREDS: Carotenoids and Age-Related Eye Disease Study
- FFQ: food frequency questionnaire
- RC: regression calibration
- WHI: Women’s Health Initiative

Author affiliations: Biostatistics Unit, Gertner Institute for Epidemiology and Health Policy Research, Tel Hashomer, Israel (Laurence S. Freedman); Division of Cancer Prevention, National Cancer Institute, Bethesda, Maryland (Douglas Midthune, Victor Kipnis); Department of Statistics, Texas A&M University, College Station, Texas (Raymond J. Carroll); Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland (Nataša Tasevska, Nancy Potischman); Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland (Arthur Schatzkin); Department of Ophthalmology and Visual Sciences, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, Wisconsin (Julie Mares); and Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington (Lesley Tinker).

This work was supported by the National Institutes of Health (under contract HHSN261200633000 to L. S. F.), the National Cancer Institute (grant R37-CA057030 to R. J. C.), the National Eye Institute (grants EY013018 and EY016886), the National Heart, Lung, and Blood Institute (for support of the Women’s Health Initiative), and Research to Prevent Blindness.

Women’s Health Initiative Investigators—Program Office (National Heart, Lung, and Blood Institute, Bethesda, Maryland): Elizabeth Nabel, Jacques Rossouw, Shari Ludlam, Joan McGowan, Leslie Ford, and Nancy Geller. Clinical Coordinating Centers—Fred Hutchinson Cancer Research Center, Seattle, Washington: Ross Prentice, Garnet Anderson, Andrea LaCroix, Charles L. Kooperberg, Ruth E. Patterson, and Anne McTiernan; Medical Research Labs, Highland Heights, Kentucky: Evan Stein; and University of California at San Francisco, San Francisco, California: Steven Cummings. Clinical Centers—Albert Einstein College of Medicine, Bronx, New York: Sylvia Wassertheil-Smoller; Baylor College of Medicine, Houston, Texas: Aleksandar Rajkovic; Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts: JoAnn E. Manson; Brown University, Providence, Rhode Island: Charles B. Eaton; Emory University, Atlanta, Georgia: Lawrence Phillips; Fred Hutchinson Cancer Research Center, Seattle, Washington: Shirley Beresford; George Washington University Medical Center, Washington, DC: Lisa Martin; Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California: Rowan Chlebowski; Kaiser Permanente Center for Health Research, Portland, Oregon: Yvonne Michael; Kaiser Permanente Division of Research, Oakland, California: Bette Caan; Medical College of Wisconsin, Milwaukee, Wisconsin: Jane Morley Kotchen; MedStar Research Institute/Howard University, Washington, DC: Barbara V. Howard; Northwestern University, Evanston, Illinois: Linda Van Horn; Rush Medical Center, Chicago, Illinois: Henry Black; Stanford Prevention Research Center, Stanford, California: Marcia L. Stefanick; State University of New York at Stony Brook, Stony Brook, New York: Dorothy Lane; Ohio State University, Columbus, Ohio: Rebecca Jackson; University of Alabama at Birmingham, Birmingham, Alabama: Cora E. Lewis; University of Arizona, Phoenix, Arizona: Cynthia A. Thomson; University at Buffalo, Buffalo, New York: Jean Wactawski-Wende; University of California at Davis, Sacramento, California: John Robbins; University of California at Irvine, California: F. Allan Hubbell; University of California at Los Angeles, Los Angeles, California: Lauren Nathan; University of California at San Diego, La Jolla/Chula Vista, California: Robert D. Langer; University of Cincinnati, Cincinnati, Ohio: Margery Gass; University of Florida, Gainesville/Jacksonville, Florida: Marian Limacher; University of Hawaii, Honolulu, Hawaii: J. David Curb; University of Iowa, Iowa City, Iowa: Robert Wallace; University of Massachusetts/Fallon Clinic, Worcester, Massachusetts: Judith Ockene; University of Medicine and Dentistry of New Jersey, Newark, New Jersey: Norman Lasser; University of Miami, Miami, Florida: Mary Jo O’Sullivan; University of Minnesota, Minneapolis, Minnesota: Karen Margolis; University of Nevada, Reno, Nevada: Robert Brunner; University of North Carolina, Chapel Hill, North Carolina: Gerardo Heiss; University of Pittsburgh, Pittsburgh, Pennsylvania: Lewis Kuller; University of Tennessee Health Science Center, Memphis, Tennessee: Karen C. Johnson; University of Texas Health Science Center, San Antonio, Texas: Robert Brzyski; University of Wisconsin, Madison, Wisconsin: Gloria E. Sarto; Wake Forest University School of Medicine, Winston-Salem, North Carolina: Mara Vitolins; and Wayne State University School of Medicine/Hutzel Hospital, Detroit, Michigan: Michael Simon. Women’s Health Initiative Memory Study (Wake Forest University School of Medicine, Winston-Salem, North Carolina): Sally Shumaker.

Conflict of interest: none declared.

## References

1. Freudenheim JL, Marshall JR. The problem of profound mismeasurement and the power of epidemiological studies of diet and cancer. Nutr Cancer. 1988;11(4):243-250.
2. Thiébaut AC, Kipnis V, Chang SC, et al. Dietary fat and postmenopausal invasive breast cancer in the National Institutes of Health-AARP Diet and Health Study cohort. J Natl Cancer Inst. 2007;99(6):451-462.
3. Freedman LS, Kipnis V, Schatzkin A, et al. Can we use biomarkers in combination with self-reports to strengthen the analysis of nutritional epidemiologic studies? Epidemiol Perspect Innov. 2010;7(1):2. (doi:10.1186/1742-5573-7-2)
4. Freedman LS, Tasevska N, Kipnis V, et al. Gains in statistical power from using a dietary biomarker in combination with self-reported intake to strengthen the analysis of a diet-disease association: an example from CAREDS. Am J Epidemiol. 2010;172(7):836-842.
5. Carroll RJ, Ruppert D, Stefanski LA, et al. Measurement Error in Nonlinear Models: A Modern Perspective. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC Publishers; 2006.
6. Prentice RL, Shaw PA, Bingham SA, et al. Biomarker-calibrated energy and protein consumption and increased cancer risk among postmenopausal women. Am J Epidemiol. 2009;169(8):977-989.
7. Langer RD, White E, Lewis CE, et al. The Women’s Health Initiative Observational Study: baseline characteristics of participants and reliability of baseline measures. Ann Epidemiol. 2003;13(suppl 9):S107-S121.
8. The Women’s Health Initiative Study Group. Design of the Women’s Health Initiative clinical trial and observational study. Control Clin Trials. 1998;19(1):61-109.
9. Moeller SM, Voland R, Tinker L, et al.; CAREDS Study Group; Women’s Health Initiative. Associations between age-related nuclear cataract and lutein and zeaxanthin in the diet and serum in the Carotenoids in the Age-Related Eye Disease Study (CAREDS), an ancillary study of the Women’s Health Initiative. Arch Ophthalmol. 2008;126(3):354-364.
10. Patterson RE, Kristal AR, Tinker LF, et al. Measurement characteristics of the Women’s Health Initiative food frequency questionnaire. Ann Epidemiol. 1999;9(3):178-187.
11. Kipnis V, Midthune D, Buckman DW, et al. Modeling data with excess zeros and measurement error: application to evaluating relationships between episodically consumed foods and health outcomes. Biometrics. 2009;65(4):1003-1010.
12. Schoeller DA. Measurement of energy expenditure in free-living humans by using doubly labeled water. J Nutr. 1988;118(11):1278-1289.
13. Bingham SA, Cummings JH. Urine nitrogen as an independent validatory measure of dietary intake: a study of nitrogen balance in individuals consuming their normal diet. Am J Clin Nutr. 1985;42(6):1276-1289.
14. Tasevska N, Runswick SA, Bingham SA. Urinary potassium is as reliable as urinary nitrogen for use as a recovery biomarker in dietary studies of free living individuals. J Nutr. 2006;136(5):1334-1340.
15. Tasevska N, Runswick SA, McTaggart A, et al. Urinary sucrose and fructose as biomarkers for sugar consumption. Cancer Epidemiol Biomarkers Prev. 2005;14(5):1287-1294.
16. Tasevska N, Midthune D, Potischman N, et al. Use of the predictive sugars biomarker to evaluate self-reported total sugars intake in the Observing Protein and Energy Nutrition (OPEN) Study. Cancer Epidemiol Biomarkers Prev. 2011;20(3):490-500.
17. Prentice RL, Huang Y, Tinker LF, et al. Statistical aspects of the use of biomarkers in nutritional epidemiology research. Stat Biosci. 2009;1(1):112-123.
18. Kaaks R, Ferrari P, Ciampi A, et al. Uses and limitations of statistical accounting for random error correlations, in the validation of dietary questionnaire assessments. Public Health Nutr. 2002;5(6A):969-976.
19. Handelman GJ, Packer L, Cross CE. Destruction of tocopherols, carotenoids, and retinol in human plasma by cigarette smoke. Am J Clin Nutr. 1996;63(4):559-565.

Deceased.