Abstract

Associations between modifiable exposures and disease seen in observational epidemiology are sometimes confounded and thus misleading, despite our best efforts to improve the design and analysis of studies. Mendelian randomization—the random assortment of genes from parents to offspring that occurs during gamete formation and conception—provides one method for assessing the causal nature of some environmental exposures. The association between a disease and a polymorphism that mimics the biological link between a proposed exposure and disease is not generally susceptible to the reverse causation or confounding that may distort interpretations of conventional observational studies. Several examples where the phenotypic effects of polymorphisms are well documented provide encouraging evidence of the explanatory power of Mendelian randomization and are described. The limitations of the approach include confounding by polymorphisms in linkage disequilibrium with the polymorphism under study, that polymorphisms may have several phenotypic effects associated with disease, the lack of suitable polymorphisms for studying modifiable exposures of interest, and canalization—the buffering of the effects of genetic variation during development. Nevertheless, Mendelian randomization provides new opportunities to test causality and demonstrates how investment in the human genome project may contribute to understanding and preventing the adverse effects on human health of modifiable exposures.

Genetic epidemiology, the theme of this issue of the International Journal of Epidemiology, is seen by many to be the only future for epidemiology, perhaps reflecting a growing awareness of the limitations of observational epidemiology1 (Box 1). Genetic epidemiology is concerned with understanding heritable aspects of disease risk, individual susceptibility to disease, and ultimately with contributing to a comprehensive molecular understanding of pathogenesis. The massive investment and expansion of human genetics, if it is to return value for the common good, must be integrated into public health functions. The human genome epidemiology network (HuGE Net; http://www.cdc.gov/genetics/huge.htm) has been established to promote the use of genetic knowledge, in terms of genetic tests and services, for disease prevention and health promotion.2,3 A broad taxonomy of human genome studies of public health relevance has been developed4 (Box 2). In this issue of the IJE, we publish a paper by Miguel Porta,5 who highlights the need for a more rational approach to genetic testing, given the likely low penetrance of many genes associated with cancers,6 likening the role of the genome to a jazz score that is interpreted and developed through experience and context, and is seldom predictable. Such insights may well temper enthusiasm for genetic testing in populations. However, in parallel to the approaches advocated by HuGE, genetic epidemiology can lead to a more robust understanding of environmental determinants of disease (e.g. dietary factors, occupational exposures, and health-related behaviours) relevant to whole populations (and not simply to genetically susceptible sub-populations).7–10 This approach has recently been referred to as ‘Mendelian randomization’.11–15 Here we begin by briefly reviewing reasons for current concerns about aetiological findings generated by conventional observational epidemiology and then we outline the potential contribution (and limitations) of Mendelian randomization.

Box 1
Nature announces a genetic epidemiology initiative alongside a cartoon illustrating the demise of classical epidemiology.

‘Epidemiology set to get fast-track treatment’

‘A consortium of leading European research centres and pharmaceutical companies will this week announce a plan to transform epidemiology by combining it with the new techniques of high-throughput biology. They plan to create a new field of study—genomic epidemiology—by using screening technologies derived from the human genome project … We think it is important to expand classical epidemiology and genetic epidemiology to take it to this high-throughput mode, says Esper Boel, vice-president of biotechnology research at Novo Nordisk. We want to use post-genomic technologies to create a new clinical science, to turn functional genomics into real clinical chemistry.’

From: Butler D. Epidemiology set to get fast-track treatment. Nature 2001;414:139. Reprinted with permission.

Box 2
A taxonomy of genetic studies of public health relevance

Surveillance

  • Population frequency of gene variants predisposing to specific diseases

  • Population frequency of morbidity and mortality from such diseases

  • Population frequency and effects of environmental factors known to interact with gene variants

  • Economic costs of genetic components of diseases

  • Coverage, access, and uptake of genetic tests and services

Aetiology

  • Magnitude of disease risk associated with gene variants in different populations

  • Contribution of gene variants to the overall level of disease in different populations

  • Magnitude of disease risk associated with gene–gene and gene–environment interactions in different populations

Health services research

  • Clinical validity and utility of genetic tests in different populations

  • Determinants and impact of using genetic tests and services in different populations

Adapted from Khoury MJ, Burke W, Thomson EJ (eds). Genetics and Public Health in the 21st Century. Oxford: Oxford University Press, 2000.

Observational epidemiology: yet more residually confounded associations of no causal significance?

Over the last decade several severe indictments of epidemiology have appeared, with the major thrust being that spurious, non-replicable, and non-causal findings are produced and sometimes widely disseminated.16–20 The most salient examples come from situations in which observational epidemiological studies have highlighted an apparently substantial causal association that has later failed to be confirmed in large-scale randomized controlled trials (RCTs). An important example of this is the contradictory set of findings regarding the association between the antioxidant vitamin β-carotene and smoking-related cancers. Early enthusiasm that risk of smoking-related cancers might be reduced by increased dietary β-carotene21,22 was muted by RCT evidence ruling out any meaningful reduction in lung cancer among those receiving β-carotene treatment.23–25

This is not an isolated example. The association between vitamin E and coronary heart disease (CHD) found in observational studies was not supported by the results of RCTs.26 Furthermore, a strong observational inverse association between plasma vitamin C levels and CHD mortality27 was rendered implausible by a subsequent large RCT of a vitamin supplement that raised plasma vitamin C levels substantially, but left 5-year CHD mortality unchanged. In this case the range of plasma vitamin C levels in the observational study and the change introduced by supplementation were similar, yet the outcomes of observation and experiment were very different (Figure 1).28 The results from robust experiment and fallible observation were clearly not compatible. However, the domains in which such comparisons can be made are limited, as it may not be feasible to use RCTs to evaluate some exposures, such as physical activity and complex aspects of nutrition.

Indiscriminate epidemiological data-dredging may be responsible for some spurious epidemiological findings, but this is unlikely to be the main contributor.15 By far the most likely cause is confounding—where one factor that is not itself causally related to disease is associated with a range of other factors that do change disease risk. Associations reported in observational studies but not confirmed in RCTs tend to be between exposures and diseases that are related to socioeconomic position, behavioural factors, and health service utilization. For example, people with high antioxidant vitamin intakes and plasma levels tend to differ in a range of other characteristics that are themselves related to disease risk. In this case, biologically plausible hypotheses are no safeguard against spurious association,29 and standard statistical techniques used to ‘control’ for confounding are fallible, given the limited range of confounders measured in many studies, and the inevitable and substantial degree of measurement error in assessing both the exposure and the potential confounders.30,31

A variety of approaches can be adopted to reduce the risk of being misled by observational studies. Better control for confounding requires a range of modifications in study design and analysis. As confounding structures can differ between study populations, replication of findings in a range of databases gives some limited protection against being deceived by confounding. Specificity of disease-exposure associations may also be helpful, as most diseases have only a finite (and probably limited) number of final causes.32 When exposures are related to a wide variety of outcomes it is likely that confounding by socially patterned behavioural and environmental factors is at play. In such cases it is instructive to investigate associations between the exposure and outcomes to which it cannot plausibly be causally linked, such as associations of antioxidant vitamin levels with injury mortality, or hormone replacement therapy use with accidental death.33

Other approaches include improving study design by measuring confounders better and thus allowing for a greater degree of statistical control. This may require carrying out more measurements on a smaller number of study participants,31 as Andrew Phillips discusses in this issue of the IJE.34 Sensitivity analyses can be carried out to model the degree to which measurement error in confounders could have left residual confounding,35,36 and should be a necessary part of the statistical reporting of study results. Finally, the findings from observational studies of individuals should be related to the differences in disease risk observed between populations, and within populations over time, as only those exposures which fit coherently into this scheme are likely to be important determinants of disease.37

An additional way of increasing the robustness of findings of observational studies is to utilize certain aspects of associations with genetic disease risk. In the remainder of this paper we will discuss ways in which genetic epidemiology can contribute to understanding cause and effect in health sciences. We will start with an illustration from the field of evaluation of health care, which is of interest both because of the ingenuity displayed in its formulation and because it was apparently the first use of the term ‘Mendelian randomization’ in the medical literature. The main part of the article will then cover the use of Mendelian randomization, in the manner this term has been adopted by some previous authors to indicate the use of genetic associations to elucidate modifiable environmental contributions to disease causation.11,14

The concept of Mendelian randomization

The term ‘Mendelian randomization’ was apparently first used to describe an inspired method for evaluating the effectiveness of allogenic sibling bone marrow transplantation in the treatment of acute myeloid leukaemia (AML),38 through comparing outcomes in patients with and without human leukocyte antigen (HLA)-compatible siblings. Several studies have now been carried out with this design,39–43 as we describe in Box 3. It is not, however, essential to understand the particulars of this design to interpret the use of Mendelian randomization within aetiological epidemiology.

Box 3

Randomized controlled trials (RCTs) appear to be the only way to ensure that the comparison of transplanted patients and non-transplanted patients is unbiased, since disease stage, general fitness, and overall clinical evaluation are likely to influence treatment allocation.38 It is unlikely that adequate RCTs comparing allogenic bone marrow transplantation with no transplantation (i.e. chemotherapy alone) will ever be carried out. Comparisons could, however, be made between patients with an HLA-compatible sibling donor (and therefore capable of receiving a matched sib allogenic bone marrow transplant) and patients without such a donor (and therefore incapable of receiving a matched sib allogenic bone marrow transplant). The presence or absence of an HLA-compatible sibling donor is determined by random assortment of genes at the time of gamete formation and conception and thus produces, effectively, a randomized comparison. Furthermore, belonging to the group with an HLA-compatible donor or the group without such a donor will not be related to potential confounding factors such as disease stage at presentation, general fitness, or selection effects by the treating physician. It is necessary to compare all the patients with an HLA-compatible donor with all the patients without such a donor, independently of whether or not the patients with a donor receive a transplant, in a form of intention-to-treat analysis (Figure below). Several studies have now been carried out with this design,39–41 and it has also been applied to acute lymphoblastic leukaemia.42,43 In some of these studies the basic principle that there is no confounding in the potential donor versus no potential donor comparisons has been empirically established,40,43 and it can also be seen that there are differences in prognostic factors between groups defined by the treatments received,41,43 which would confound a conventional observational analysis comparing treatment modalities.
The basic design would be improved by matching on the number of siblings a patient has, since patients with more siblings will have a greater chance of having an HLA-compatible donor and it is conceivable that number of siblings could itself be related to disease progression and survival rates. One study has moved in this direction by performing an analysis among patients with at least one sibling,41 although exact matching on number of siblings would be the most robust approach.
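The dependence on sibship size can be made concrete with a short calculation. The sketch below is an illustration only and is not taken from the studies cited: it assumes that each full sibling independently has a 1 in 4 chance of being HLA-identical to the patient (both parental haplotypes shared, ignoring recombination within the HLA region), and the function name is hypothetical.

```python
def prob_compatible_sibling(n_siblings: int, p_match: float = 0.25) -> float:
    """Probability that at least one of n full siblings is HLA-identical.

    Assumption (not from the source text): each sibling matches
    independently with probability p_match, taken as 0.25.
    """
    if n_siblings < 0:
        raise ValueError("n_siblings must be non-negative")
    # Complement of "no sibling matches".
    return 1.0 - (1.0 - p_match) ** n_siblings


# The chance of having a potential donor rises steadily with sibship size,
# which is why number of siblings is a candidate matching variable.
for n in range(0, 6):
    print(n, round(prob_compatible_sibling(n), 3))
```

Under these assumptions a patient with one sibling has a 25% chance of a potential donor and a patient with three siblings has a chance of about 58%, so the donor and no-donor groups will differ in sibship size unless the comparison is matched or stratified on it.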

 

The approach to Mendelian randomization that we focus on in this paper utilizes what is sometimes called Mendel’s second law within an epidemiological setting. This law—the law of random assortment—states, in Mendel’s words, that

‘the behaviour of each pair of differentiating characteristics in hybrid union is independent of the other differences between the two original plants, and, further, the hybrid produces just so many kinds of egg and pollen cells as there are possible constant combination forms’.44

Put simply, this suggests that the inheritance of one trait is independent of the inheritance of other traits.

Conventional genetic epidemiology investigates the association between genetic and phenotype variation within a population, to elucidate the genetic basis of the phenotype or to characterize gene function. In such studies genetic variation is assessed using markers, often single nucleotide polymorphisms (SNPs), and the informative markers are those that show sufficient variation within a population and are of high enough prevalence to allow meaningful comparisons to be made. But it is also possible to exploit the random assignment of genes as a means of reducing confounding in examining exposure–disease associations: this is Mendelian randomization in the epidemiological context.

The use of the terms loci, alleles, genes, genotype, and polymorphisms has evolved since Mendel’s use of ‘differentiating characteristics’, and conventions in usage differ between human and animal geneticists, which adds to confusion.45,46 For clarity, these terms are defined in Box 4 and Figure 2. Briefly, the genotype of an individual refers to the two alleles inherited at a specific locus: if the alleles are the same, the genotype is homozygous; if different, heterozygous. A polymorphism is the existence of two or more variants (e.g. SNPs) at a locus.45–47 The basic idea is that, if such polymorphisms produce phenotypic differences that mirror the biological effects of modifiable environmental exposures which in turn alter disease risk, the different polymorphisms should themselves be related to disease risk to the extent predicted by their influence on the phenotype. Common polymorphisms that have a well-characterized biological function can therefore be utilized to study the effect of a suspected exposure on disease risk. One key point is that the distribution of such polymorphisms is largely unrelated to the sorts of confounders (socioeconomic or behavioural) that were identified above as having distorted interpretations of findings from observational epidemiological studies.

Box 4
A glossary of genetic terms

  • alleles are the variant forms detectable at a locus

  • canalization is the process by which potentially disruptive influences on normal development from genetic (and environmental) variations are damped or buffered by compensatory developmental processes

  • a gene comprises a DNA sequence, including introns, exons, and regulatory regions, related to transcription of a given RNA

  • genotype of an individual refers to the two alleles inherited at a specific locus—if the alleles are the same, the genotype is homozygous, if different, heterozygous

  • a haplotype is the set of alleles present at a series of linked loci on a chromosome; a person has two haplotypes for any such series of loci, one inherited maternally and the other paternally

  • linkage disequilibrium is the association between alleles at different loci within the population. Linkage disequilibrium can exist because alleles are physically close together and tend to be co-inherited, or because they occur together for reasons of population origin in subsections of an overall population and therefore demonstrate a statistical association within the overall population

  • a locus is the position in a DNA sequence and may be used to refer to a single nucleotide polymorphism (SNP), or to larger regions of DNA sequence

  • a marker is a segment of DNA with an identifiable physical location on a chromosome, whose inheritance can be followed and can be assayed in genetic association studies. In such studies, markers are of interest if they are linked to polymorphisms with functional significance. A marker can be a gene, an SNP or a section of DNA with no known function

  • a mutation is a permanent structural alteration in DNA or the process by which a DNA sequence is altered

  • pleiotropy is the potential for polymorphisms to have more than one specific phenotypic effect

  • polymorphism is the existence of two or more variants at a locus. Conventionally, the prevalence in the population should be above 1% to be referred to as a polymorphism; if prevalence is below this, variants are referred to as mutations

  • population stratification is an example of confounding in which the co-existence of different disease rates and allele frequencies within population sub-sections leads to an association between the two at a whole population level

  • single nucleotide polymorphisms (SNPs) are positions along a chromosome where the genetic code varies between individuals by a single base pair (pronounced ‘snips’)

Two types of polymorphism with a functional consequence can be distinguished. First, they may have a regulatory influence that modifies the level of the product coded for by a gene. In the examples given below the β-fibrinogen polymorphism we discuss is of this type, with its influence being on plasma fibrinogen level. Second, polymorphisms that influence the structure (and function) of gene products can also be studied. Many of the metabolic polymorphisms discussed below are of this kind. In these cases interpretation is somewhat more complex than when the factor of interest is, for example, the plasma level of the gene product.

The basis of Mendelian randomization is most clearly seen in parent–offspring designs that study the way phenotype and alleles co-segregate during transmission from parents to offspring.48,49 In matings in which at least one parent is heterozygous at a polymorphic locus, the frequency with which one of the two alleles from a heterozygous parent is transmitted to an offspring with a particular disease or phenotypic characteristic can be evaluated. If there is no association between allelic form and the disease or phenotypic characteristic, each of the two alleles from the heterozygous parent has a 50% probability of being transmitted to the offspring. A shift from this 50/50 ratio indicates an association between disease or phenotypic characteristic and the alleles at this locus (Figure 3). This study design is closely analogous to that of RCTs, as by Mendelian principles there should be an equal probability of either allele being randomly transmitted to the offspring. Such studies may be difficult to carry out, however, both because of problems in obtaining data from parents and offspring (particularly when parents may be dead) and because they generally have lower statistical power than case-control studies carried out within whole populations, rather than within families.50 Of course populations share much common ancestry and the genetic make-up of individuals can be traced back through the random segregation of alleles during a sequence of matings, but associating genetic markers with disease risk or phenotype within such populations is not as well protected against potential distorting factors as are parent–offspring comparisons. Thus the Mendelian randomization in genetic association studies is approximate, rather than absolute.
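The test of a shift from the 50/50 transmission ratio can be sketched in a few lines. The example below is a minimal illustration, assuming the data have already been reduced to two counts among affected offspring of heterozygous parents: `b` transmissions of one allele and `c` transmissions of the other. The function names and the counts used are hypothetical; the chi-square form shown is the McNemar-type statistic commonly used for transmission tests of this kind.

```python
from math import comb


def tdt_statistic(b: int, c: int) -> float:
    """McNemar-type chi-square statistic (1 df): (b - c)^2 / (b + c)."""
    if b + c == 0:
        raise ValueError("at least one informative transmission is required")
    return (b - c) ** 2 / (b + c)


def exact_binomial_p(b: int, c: int) -> float:
    """Two-sided exact p-value for the observed split under a 50/50 ratio.

    Because the null distribution is symmetric, the two-sided p-value is
    twice the upper-tail probability of the larger count, capped at 1.
    """
    n = b + c
    k = max(b, c)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hypothetical counts: allele A transmitted 60 times, allele a 40 times.
print(tdt_statistic(60, 40))
print(exact_binomial_p(60, 40))
```

Only transmissions from heterozygous parents enter the counts, since a homozygous parent transmits the same allele regardless and carries no information about the ratio.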

Mendelian randomization—applications in observational epidemiological studies

Martijn Katan was an early exponent of what has since become termed Mendelian randomization.7 He was concerned with observational studies suggesting that low serum cholesterol levels were associated with an increased risk of cancer.51 This association might be explained by the early stages of cancer resulting in lower cholesterol levels—reverse causality—or by confounding factors (such as cigarette smoking) related both to future cancer risk and to lower circulating cholesterol.52 Katan pointed out that polymorphic forms of the apolipoprotein E (APOE) gene were related to different levels of serum cholesterol. If low circulating cholesterol levels were indeed a causal risk factor for cancer, then individuals with the genotype associated with low cholesterol would be expected to have higher cancer risk. If, however, reverse causation or confounding generated the association between low cholesterol and cancer, then no association would be expected between APOE genotype and cancer. Individuals with lower cholesterol because of their genotype, rather than because clinically unrecognized cancers had lowered their cholesterol, would not have a higher risk of cancer; nor would there be substantial confounding between genotype-associated differences in cholesterol and lifestyle or socioeconomic factors. While Katan did not have any data on this, he advocated it as a study design. To our knowledge this intriguing suggestion has not been systematically investigated with respect to the important question Katan posed, although sporadic reports relating APOE to risk of specific cancers have appeared.53–55
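Katan's reasoning can be illustrated with a small simulation. The sketch below is purely hypothetical: the allele frequency, effect sizes, and function names are assumptions chosen for illustration and do not come from any study of APOE, cholesterol, or cancer. An unmeasured confounder influences both a continuous exposure and an outcome, while a genotype influences only the exposure. The observational slope is then distorted by the confounder, whereas the ratio of the genotype–outcome to the genotype–exposure covariance (a simple instrumental-variable estimate) tracks the true causal effect.

```python
import random


def simulate(n=20000, causal_effect=0.0, seed=1):
    """Simulate genotype, exposure, and outcome with an unmeasured confounder."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        # Allele count (0, 1, or 2) at an assumed allele frequency of 0.3.
        gi = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        c = rng.gauss(0, 1)                        # unmeasured confounder
        xi = 0.5 * gi + c + rng.gauss(0, 1)        # exposure
        yi = causal_effect * xi + c + rng.gauss(0, 1)  # outcome
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def observational_slope(x, y):
    """Regression slope of outcome on exposure, ignoring the confounder."""
    return cov(x, y) / cov(x, x)


def genotype_based_slope(g, x, y):
    """Ratio estimate: genotype-outcome covariance over genotype-exposure covariance."""
    return cov(g, y) / cov(g, x)


g, x, y = simulate(causal_effect=0.0)
print("observational:", round(observational_slope(x, y), 2))
print("genotype-based:", round(genotype_based_slope(g, x, y), 2))
```

With no true causal effect, the observational slope in this set-up is clearly non-zero while the genotype-based estimate is close to zero, which is the pattern Katan predicted if the cholesterol–cancer association were due to confounding or reverse causation.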

The easiest way to understand how epidemiological studies can utilize Mendelian randomization is to consider particular examples of how the principles can be applied to practical issues in aetiological epidemiology. We discuss several such examples, before concluding the paper by considering the limitations of Mendelian randomization.

Examples of Mendelian randomization: triangulation of genotype and phenotype associations with disease risk

Folate, homocysteine, and coronary heart disease

The association of the amino acid homocysteine with CHD has generated much interest. Observational studies have consistently demonstrated that higher plasma homocysteine level is associated with an increased CHD risk.56 This in itself may not be of interest to environmentally minded epidemiologists, but RCTs have shown that a moderate increase in folate consumption can substantially decrease homocysteine levels.57 Therefore if the association between homocysteine and CHD is causal the population intervention of increasing folate intake could lead to worthwhile decreases in CHD risk. However, homocysteine–CHD associations may be confounded in a variety of ways—in some studies homocysteine levels are higher in smokers or people from less favourable socioeconomic backgrounds, and existing atherosclerosis could itself increase homocysteine levels, which would automatically lead to a positive association between homocysteine and subsequent CHD.58,59 The phenomenon of reverse causality could explain the different estimates of the strength of the homocysteine–CHD association derived from cohort studies and case-control studies; with stronger associations being seen in case-control studies which are more prone to being biased by existing disease influencing the exposure measure.60 Despite promising results from preliminary small RCTs of folate supplementation which have examined proxy endpoints,61–63 current evidence is largely based on observational epidemiological studies that suffer from the limitations discussed above.

In the absence of a definitive folate trial, how can we obtain more robust evidence of whether the links between folate, homocysteine, and CHD are causal? Does raised homocysteine cause CHD and is folate supplementation a good candidate for an intervention that will reduce the risk of CHD? Here functional genomics and genetic epidemiology can come together in an example of Mendelian randomization.13 The metabolic pathways involving folate and homocysteine are reasonably well understood (Figure 4). A functional polymorphism of the gene encoding the enzyme methylene tetrahydrofolate reductase (MTHFR), the thermolabile variant MTHFR 677C→T, which involves a C-to-T substitution at base 677 of the gene, results in reduced enzyme activity. The enzyme is involved in the conversion of 5,10-methylene tetrahydrofolate (derived from dietary folate) to 5-methyl tetrahydrofolate, which is in turn involved in the conversion of homocysteine to methionine, and people with this polymorphism have higher levels of homocysteine. This genetic variant therefore mimics low dietary folate intake, which is known to result in higher homocysteine levels. Individuals who are homozygous for the thermolabile variant (i.e. both alleles have the 677C→T substitution; denoted by TT, indicating possession of two T alleles at the relevant locus) have homocysteine levels 2.6 μmol/l higher on average than individuals homozygous for the more common C allele (CC).64 Thus the individuals with TT genotype are exposed to higher levels of homocysteine and, if homocysteine is a causal factor, they should be at higher risk of CHD than CC individuals. Since this exposure comparison is based upon genotype, with essentially random assortment of alleles at the time of gamete production and fertilization as indicated by Mendel’s Second Law, there is little possibility of confounding. 
Individuals with higher homocysteine because they have TT genotype should be no more likely to be smokers, of lower social class and with more homocysteine-unrelated pre-existing disease than individuals with CC genotype. New data giving an empirical demonstration of this lack of association between genotype and confounding factors, but of strong associations between the same confounders and plasma homocysteine, are presented in Table 1.

Two systematic reviews recently appeared in the same issue of the Journal of the American Medical Association: one an individual-participant based meta-analysis of studies of homocysteine and CHD;65 the other a meta-analysis of studies that have related MTHFR genotype to CHD risk.66 In the meta-analysis of the observational studies, the association between homocysteine and CHD was such that 2.6 μmol/l higher homocysteine was associated with a relative risk (RR) of CHD of 1.13 (95% CI: 1.08–1.19). But it should be remembered that this association between circulating levels of homocysteine and CHD could be confounded, or due to reverse causation. This is not the case with the elevated homocysteine generated among individuals with TT compared with CC MTHFR genotype. In the meta-analysis of the studies relating MTHFR genotype to CHD risk, TT individuals experienced a risk of 1.16 (95% CI: 1.05–1.28) compared with CC individuals. This RR is related to a 2.6 μmol/l higher homocysteine level, since this is the difference in homocysteine between genotypes. This is the same difference in homocysteine as we have used for the presentation of the results from the meta-analysis of studies relating measured homocysteine levels to CHD risk. Thus, the two RR estimates—from measuring plasma homocysteine in observational studies and relating this to CHD risk, and from assaying MTHFR genotype and using the association between this and both homocysteine level and CHD to predict the strength of the homocysteine–CHD association—are similar. While an association found in observational studies measuring homocysteine and CHD risk might be confounded by behavioural and socioeconomic factors, or influenced by reverse causation, associations between homocysteine and CHD risk estimated via the MTHFR genotype studies are not. 
Indeed, health behaviours cannot determine MTHFR genotype, although it is possible that the thermolabile MTHFR variant might be associated with polymorphic variants of loci determining health behaviours, such as smoking, or influencing physiological factors such as blood pressure, thereby resulting in confounding. However, some of the studies included in the meta-analysis measured an extensive array of potential confounders and found no association with MTHFR genotype, as we also demonstrate in Table 1. In line with this, adjustment for these potential confounders did not attenuate the genotype–CHD associations, suggesting that the observations are trustworthy.
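The similarity of the two estimates can be checked with a rough calculation. The sketch below is an illustration rather than an analysis from either meta-analysis: it recovers approximate standard errors from the reported 95% confidence intervals, treats the two estimates as independent (an assumption, since the meta-analyses may share some studies), and compares them on the log scale. The function names are hypothetical.

```python
from math import erf, log, sqrt


def log_rr_and_se(rr, lower, upper):
    """Log relative risk and a standard error recovered from a 95% CI."""
    return log(rr), (log(upper) - log(lower)) / (2 * 1.96)


def compare_estimates(a, b):
    """z statistic and two-sided p-value for the difference of two log RRs.

    a and b are (rr, lower, upper) tuples, both expressed per the same
    difference in exposure (here 2.6 umol/l of homocysteine).
    """
    log_a, se_a = log_rr_and_se(*a)
    log_b, se_b = log_rr_and_se(*b)
    z = (log_b - log_a) / sqrt(se_a ** 2 + se_b ** 2)
    # Standard normal CDF via the error function.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p


measured_homocysteine = (1.13, 1.08, 1.19)  # RR per 2.6 umol/l, observational
mthfr_tt_vs_cc = (1.16, 1.05, 1.28)         # RR for TT versus CC genotype

z, p = compare_estimates(measured_homocysteine, mthfr_tt_vs_cc)
print(round(z, 2), round(p, 2))
```

Under these assumptions the difference between the two log relative risks is well within what chance alone would produce, which is consistent with the description of the estimates as similar.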

The observational studies measuring plasma homocysteine level and relating this to CHD risk should produce effect estimates that are lower than the true association between usual level of homocysteine and CHD, because there will be measurement error in the single measures of homocysteine: either laboratory error, or (with the same effect on the strength of association between measured homocysteine and CHD), because within any individual their homocysteine levels change over time and a single measure is not a precise indicator of usual level of homocysteine.67 Studies in which homocysteine was measured on repeat occasions were used to quantify the degree of measurement imprecision, and this was in turn used to correct the strength of association between homocysteine and CHD in the meta-analysis of observational studies. The adjusted RR relating ‘usual’ homocysteine level to CHD was 1.17 (95% CI: 1.10–1.25). This is even closer to the effect estimate from the MTHFR studies, which is to be expected because in the MTHFR studies genotype will be related to usual level of homocysteine rather than to a single level at a particular time. Thus the findings of genetic association studies can give evidence of associations neither confounded by the usual lifestyle and socioeconomic factors, nor diluted by measurement error. Such evidence gives a truer picture of the aetiological importance of the exposure and the potential health gains through interventions that modify its level.
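The correction for measurement imprecision described here is usually expressed as dividing the log relative risk by a reliability (regression dilution) ratio. The sketch below is a hedged illustration: the reliability ratio used by the meta-analysis is not stated in the text, so the value here is back-calculated from the two rounded relative risks (1.13 and 1.17) and should be read as an implied figure, not a reported one. The function names are hypothetical.

```python
from math import exp, log


def correct_for_dilution(rr_measured: float, reliability: float) -> float:
    """Relative risk for 'usual' exposure: exp(log(RR_measured) / reliability)."""
    if not 0 < reliability <= 1:
        raise ValueError("reliability must lie in (0, 1]")
    return exp(log(rr_measured) / reliability)


def implied_reliability(rr_measured: float, rr_usual: float) -> float:
    """Reliability ratio implied by a measured and a corrected relative risk."""
    return log(rr_measured) / log(rr_usual)


# Back-calculated from the rounded RRs quoted in the text (an assumption).
lam = implied_reliability(1.13, 1.17)
print(round(lam, 2))
print(round(correct_for_dilution(1.13, lam), 2))
```

A reliability ratio below 1 means that a single measurement understates the association with usual homocysteine level, so the corrected relative risk is always further from 1 than the uncorrected one.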

Mendelian randomization allows genetic epidemiology to make direct contributions to the understanding of environmental determinants of disease beyond the essentially nominal incantation of the term ‘environment’ (as in ‘gene–environment interaction’). However, when investigating effects of functional polymorphisms it is possible that effects will differ by environmental exposure, and there is some evidence that the association between MTHFR genotype and CHD may differ according to folate status, as may the association between genotype and homocysteine level.66 However, the important message is that the association of the MTHFR genotype and CHD risk does not indicate that genetic screening is merited—the RR associated with MTHFR genotype is small and such screening would be a very inefficient way of detecting a group at high risk of CHD. The triangulation of the associations between genotype, homocysteine, and CHD risk provides robust evidence of a general causal effect of homocysteine on CHD risk, and therefore of a protective effect of folate, that would be experienced by the whole population, independent of their genotype.

Maternal folate and neural tube defects

A second example relates to the same polymorphism, but with a different disease outcome and mechanism. It is now widely accepted that neural tube defects (NTDs) can in part be prevented by periconceptual maternal folate supplementation.68 Randomized controlled trials of folate supplementation have provided the key evidence in this regard.69,70 But could we have reached the same conclusion before the RCTs were carried out, if we had access to evidence from genetic association studies? In a meta-analysis of studies carried out to investigate the MTHFR 677C→T polymorphism in newborns with NTDs compared with controls, there was an increased risk in TT versus CC newborns, with a RR of 1.75 (95% CI: 1.41–2.18).71 Studies have also looked at the association between this MTHFR variant in parents and the risk of NTD in their offspring. Mothers who have the TT genotype have an increased risk of 2.04 (95% CI: 1.49–2.81) of having an offspring with an NTD compared with mothers who have the CC genotype.68 For TT fathers, the equivalent RR is 1.18 (95% CI: 0.65–2.12).68

This pattern of associations suggests that it is the intra-uterine environment—influenced by maternal TT genotype—rather than the genotype of offspring that is related to disease risk (see Figure 5). This is consistent with the hypothesis that maternal folate intake is the exposure of importance. There is some evidence that maternal homocysteine level may be the key mediating variable,72 although other data suggest that alternative pathways link folate to NTD risk in offspring.73 However, the elevated homocysteine levels among people who possess the TT genotype provide a biological marker that can be translated into an equivalent difference in folate status. Thus MTHFR TT mothers have on average 2.6 μmol/l higher plasma homocysteine64 and also a RR of 2.04 of having an offspring with an NTD. A 2.6-μmol/l lower plasma homocysteine therefore would predict a halving in the risk of having an offspring with an NTD (RR = 1/2.04, i.e. 0.49). Folate supplementation reduces homocysteine by 25% in Western populations.58 The degree to which homocysteine is lowered depends upon pre-treatment blood homocysteine concentrations (greater lowering at higher concentrations) and pre-treatment folate levels (greater lowering at lower folate levels). Since mothers who have babies with NTDs have higher homocysteine and lower folate levels than controls74 (around 15 μmol/l before pregnancy) additional lowering would be seen—reductions of 5 μmol/l or 33% may be expected.58 It should be noted that pregnancy leads to reduced homocysteine75 and since peri-conceptual folate is believed to be the important factor, pre-pregnancy measures should be evaluated. Given the strength of association between maternal MTHFR genotype and offspring NTD risk (RR = 0.49 for a genotypic difference in homocysteine level of 2.6 μmol/l) a reduction of homocysteine by 5 μmol/l would be predicted to lead to a relative risk for an offspring being born with an NTD of 0.25 (95% CI: 0.14–0.46).
The observed effect of folate supplementation on NTD risk in the MRC trial was an RR in the folate supplemented group of 0.28 (95% CI: 0.12–0.71).76 A similar strength of association was found in a study relating plasma homocysteine levels among women to the risk of having had a child with an NTD.77
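The extrapolation from a 2.6 μmol/l to a 5 μmol/l difference in homocysteine can be checked with a short calculation. The sketch below assumes that the log relative risk scales linearly with the homocysteine difference; this is our assumption, since the text does not state the method used, and the function name `scale_rr` is purely illustrative.

```python
import math

def scale_rr(rr, observed_diff, target_diff):
    """Rescale a relative risk from one exposure difference to another,
    assuming log(RR) is linear in the exposure difference."""
    return math.exp(math.log(rr) * target_diff / observed_diff)

# Maternal TT versus CC genotype: RR 2.04 (95% CI 1.49-2.81) for an NTD in
# offspring, corresponding to 2.6 umol/l higher homocysteine (from the text).
# Inverting gives the RR for 2.6 umol/l LOWER homocysteine; inversion swaps
# the confidence limits.
rr_lower = [1 / 2.04, 1 / 2.81, 1 / 1.49]  # estimate, lower limit, upper limit

# Predicted RR for a 5 umol/l reduction in homocysteine.
predicted = [round(scale_rr(r, 2.6, 5.0), 2) for r in rr_lower]
print(predicted)
```

Under this assumption the three values round to the 0.25 (0.14–0.46) quoted above, which suggests that a log-linear scaling of this kind underlies the published prediction.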

In this case the findings from observational studies, genetic association studies, and an RCT are closely similar. Had the technology been available, the genetic association studies, with the particular influence of maternal versus paternal genotype on NTD risk, would have provided strong evidence of the beneficial effect of folate supplementation before the results of any RCT had been completed. Certainly, the genetic association studies would have provided better evidence than that given by conventional epidemiological studies that had to cope with the problems of accurately assessing diet and also with the considerable confounding of maternal folate intake with a wide variety of lifestyle and socioeconomic factors that may also influence NTD risk. As with MTHFR and CHD the association of genotype with NTD risk does not suggest that genetic screening is indicated—rather it demonstrates that an environmental intervention may benefit the whole population, independently of the genotype of individuals receiving the intervention.

Methylene tetrahydrofolate reductase and cancers

Before leaving MTHFR it is worth considering studies that have related genotype to cancer risk, since they are represented78 and discussed79 in this issue of the IJE, and provide further illustrations of the potentials of the Mendelian randomization approach. However, they also show how the interpretation of such findings is not always obvious, and may provide scope for creatively fitting hypotheses to the data. There are two ways in which MTHFR polymorphisms may influence cancer risk. First, referring to Figure 4, it can be seen that MTHFR catalyses the conversion of 5,10-methylene tetrahydrofolate to 5-methyl tetrahydrofolate. The substrate is involved in the conversion of deoxyuridylate monophosphate (dUMP) to deoxythymidylate monophosphate (dTMP), and low levels of 5,10-methylene tetrahydrofolate would lead to an increase in the dUMP to dTMP ratio. With a high dUMP/dTMP ratio there is increased incorporation of uracil into DNA, in place of thymine, and this is associated with increased point mutations and DNA/chromosome breakage.80 This would be expected to increase cancer risk. The less-active form of MTHFR—the thermolabile variant—will, all other factors being equal, lead to accumulation of 5,10-methylene tetrahydrofolate and thus a lower dUMP/dTMP ratio, and presumably a lower risk of cancer. This is what has been found with respect to colon cancer81,82 and acute lymphocytic leukaemia in both children83 and adults.84

When the less-active form of MTHFR was found to be associated with an increased risk of CHD,66 this was taken to indicate that higher dietary folate should be protective against CHD. Does this mean that finding the less-active form of MTHFR to be protective against cancer indicates that lower folate intake would protect against cancer? Consideration of the metabolic pathways presented in Figure 4 would not support this interpretation. Higher dietary folate intake would lead to higher 5,10-methylene tetrahydrofolate levels and thus a lower dUMP/dTMP ratio, which would in turn lead to less incorporation of uracil into DNA and fewer mutations or less DNA/chromosome breakage. Thus depending upon the component of the metabolic pathway which influences disease—homocysteine in the case of CHD, and dUMP/dTMP ratio in the case of cancer—the interpretation relevant to environmental influences from observed genotype–disease associations will differ.

There is a second way in which the MTHFR polymorphism may influence cancer. From Figure 4 it can be seen that s-adenosylmethionine (SAM) levels are also influenced by folate intake. Higher folate intake leads to higher levels of SAM, which is the common methyl donor necessary for the maintenance of the methylation patterns in DNA that influence DNA conformation and gene expression.85 The thermolabile MTHFR variant leads to lower SAM levels and the altered methylation patterns consequent on lower SAM levels would be expected to increase the risk of some cancers.85 The exact interpretation of the MTHFR–cancer associations is, therefore, not straightforward, although they suggest that dietary folate may be protective.

Alcohol and coronary heart disease

The possible protective effect of moderate alcohol consumption on CHD risk remains controversial.86–88 Are non-drinkers at a higher risk of CHD because health problems (perhaps induced by previous alcohol abuse) dissuade them from drinking?89 As well as this form of reverse causation, confounding could play a role, with non-drinkers being more likely to display an adverse profile of socioeconomic or other behavioural risk factors for CHD90 (moderate drinking may be a sign of moderation in all things). Alternatively, alcohol may have a direct biological effect that lessens the risk of CHD—for example by increasing the levels of protective high density lipoprotein (HDL) cholesterol.91 There is unlikely to be an RCT of alcohol intake that will be able to test the hypothesis of a causal protective effect on CHD events.

Functional polymorphisms of genes related to alcohol metabolism can be utilized to investigate this association, as the distribution of confounders should be little different between groups defined by genotype. Alcohol dehydrogenase (ADH) oxidizes alcohol to acetaldehyde, which is in turn oxidized by aldehyde dehydrogenase (ALDH) to acetate.92 One of the ADH isoenzymes, ADH3, has two polymorphic forms which produce two different polypeptide enzyme subunits; ADH3*1 produces γ1 and ADH3*2 produces γ2.93 Allele frequencies in European origin populations are roughly 60% γ1 and 40% γ294,95 and there are differences in the maximal velocity of alcohol oxidation, with γ1γ1 individuals showing a greater than twofold higher rate than γ2γ2 individuals.89 Thus if there is a biologically protective effect of alcohol on CHD risk then the slow oxidizers may be expected to have a lower risk of disease, since any alcohol they drink may be less rapidly cleared from the system. In fact, this is what a case-control study found, with the risk ratios, compared with the homozygous fast oxidizers (γ1γ1), being 0.90 in the heterozygote γ1γ2 and 0.72 in the homozygous γ2γ2.92,95 Adjustment for confounding factors had little influence on this gradient, indicating that potential confounding factors did not differ greatly by genotype. 
This was also true for alcohol intake, which might have been expected to show an association with genotype, but the lack of a strong association with variants at this polymorphic site is in agreement with other studies.91 This lack of association with alcohol intake is in distinction to the effect of the variants of ALDH2 that are associated with slow acetaldehyde oxidation, facial flushing and hangovers in response to alcohol consumption, and thus related to reduced alcohol consumption and protection against alcoholism.96–98 This latter example demonstrates that polymorphisms can be associated with behavioural factors, and thus these behaviours do need to be assessed in these studies.

In the above case-control study95 statistical power with respect to CHD was weak, but there was more power to analyse the association between genotype and HDL cholesterol levels. Randomized controlled trials have demonstrated that alcohol intake increases HDL cholesterol in a dose–response fashion99,100 and therefore it would be expected that γ1γ1 fast alcohol oxidizers would have lower HDL cholesterol levels than γ2γ2 slow oxidizers (who in a simplistic way could be seen as having a higher level of alcohol exposure at a given intake). This is what was found, and furthermore the effect was confined to people with more than minimal alcohol intakes, again as expected, since without alcohol, the genotype would not be expected to have a biological influence on HDL cholesterol95 (Figure 6). The magnitude of association between genotype and HDL cholesterol levels can be related to the increase in HDL cholesterol seen in RCTs, and in approximate terms each γ2 allele is the equivalent to 18 g per day alcohol intake (which equals about one pint of beer). Thus, the biological range of effect of these variants is equivalent to a moderate increase in alcohol intake.

More data are required on ADH3, HDL cholesterol, and CHD risk, but the current evidence provides support for direct biological protection of moderate alcohol consumption against CHD. As in the MTHFR and NTD or CHD examples, these findings do not mean that only people with a particular genotype will benefit from an environmental (in these cases dietary) factor, rather the whole population—whatever their genotype—would benefit. The method provides strong evidence—more robust than from conventional observational epidemiological studies—of environmental manipulations that could benefit population health.

Further examples of Mendelian randomization: genotype as an indicator of exposure characteristics and action

Organophosphates and ill-health in farmers

Agricultural workers who have been exposed to sheep dips containing organophosphates attribute a variety of symptoms of poor health to this exposure,101 but there have been claims that such attribution is false and may reflect secondary gain from compensation or paid early retirement on health grounds. Thus it is difficult to obtain reliable evidence in this area, and RCTs are not feasible. People who become cases in studies of health-related outcomes of organophosphate exposure generally know that the exposure is hypothesized to cause health problems, and it is thus difficult, if not impossible, to conduct unbiased case-control studies.

An enzyme that deactivates a potentially toxic component found in many sheep dips—paraoxonase—has isoforms with different biological activity. If the component of sheep dip that is detoxified by this enzyme does cause symptoms of ill-health, then among people exposed to sheep dip a higher proportion of those reporting symptoms would be expected to be poor detoxifiers. A study designed along these lines found that the genetic variant associated with lower detoxification was related to reporting symptoms of poor health among people exposed to sheep dip.101 Since it is unlikely that genotype is related to potential confounding factors, to the tendency to report symptoms differentially, or to a desire for compensation or early retirement, these findings provide evidence that there is a causal effect of the sheep dip exposure on health outcomes. As in the earlier examples there is an important caveat—these findings do not support genetic screening and selective employment based on genotype, as some people will suffer toxicity from sheep dip regardless of their paraoxonase status. The findings simply assist in assigning a causal interpretation to the association between sheep dip exposure and symptoms of ill-health.

Metabolic polymorphisms and cancer

Associations between various metabolic polymorphisms and cancer risk have been interpreted as providing evidence for particular environmental determinants of cancer.102,103 Examples include an acetylation polymorphism (NAT2) and risk of various forms of cancer. It appears that slow acetylators are at increased risk of bladder cancer, perhaps particularly so if they are exposed to aromatic amines, and here acetylation deactivates the carcinogen. Conversely there is some evidence that rapid acetylators are at increased risk of colon cancer, the interpretation being that acetylation activates heterocyclic amines that are found in burnt meat. In this latter case, the association may provide better evidence on the potential colon cancer risk of burnt meat than do studies attempting to quantify this difficult to measure environmental exposure.104,105 However, the studies of NAT2 and colon cancer have produced variable findings, so the basis of this interpretation is not particularly robust.81 This illustrates that Mendelian randomization is as prone to problems from the non-replication of findings from genetic association studies as are other areas of genetic epidemiology.106

Mendelian randomization: what does it mean when there is disagreement between conventional observational and genetic studies?

Fibrinogen and coronary heart disease: proving a negative?

The status of plasma fibrinogen as a cardiovascular risk factor remains controversial.107–109 In prospective observational studies and case-control studies it is certainly the case that fibrinogen level is predictive of CHD risk, with the latest meta-analysis reporting a RR of 1.8 (95% CI: 1.6–2.0) for the top to the bottom tertile of the fibrinogen distribution.110 However, existing atherosclerosis increases fibrinogen, generating reverse causation between disease and the apparent risk factor, and also there is substantial confounding, with higher fibrinogen levels being seen in a wide variety of population sub-groups known to have increased CHD risk, for example cigarette smokers, people from less-favourable socioeconomic backgrounds, non-drinkers, and people who engage in less leisure time activity.111 While RCTs of drugs that reduce blood clotting tendency have demonstrated reduced CHD risk,112,113 these do not do so purely by reducing fibrinogen level. Indeed the class of drugs which do reduce fibrinogen—the fibrates, which also have a relatively weak cholesterol-lowering effect—have not been associated with reduced CHD and peripheral vascular disease risk.114,115 Thus it is unclear whether fibrinogen is a causal factor for CHD, or merely a bystander, which serves as a marker of both disease state and other causal factors.

Several authors explicitly suggested that polymorphisms related to differences in fibrinogen levels could be utilized in a ‘Mendelian randomization’ fashion to examine whether fibrinogen is an aetiological factor with respect to CHD.107–109 However, these studies were too small to provide firm evidence on this point. Recently a large case-control study has examined this issue. A polymorphism in the β-fibrinogen gene was associated with fibrinogen levels among the controls11 with fibrinogen levels of 3.10 g/l in G/G individuals, 3.22 g/l in A/G individuals, and 3.36 g/l in A/A individuals. For each A allele there was an increase of 0.12 g/l in fibrinogen. In this case-control study, fibrinogen was related to CHD risk in the usual fashion, with 0.12 g/l higher fibrinogen being associated with a RR of CHD of 1.20 (95% CI: 1.13–1.26). Note that 0.12 g/l higher fibrinogen is also the per allele difference in fibrinogen seen according to genotype. It would therefore be predicted that there should be a per allele effect on CHD, with a RR of approximately 1.20. However, when genotype was related to CHD risk, essentially no relationship was seen, with a per allele RR of 1.03 (95% CI: 0.96–1.10).

In this case-control study, therefore, individuals with a genotype that would have subjected them to long-term elevated fibrinogen levels did not experience any increased risk of CHD. This suggests that circulating fibrinogen may not be a causal factor with respect to CHD, despite being associated with CHD risk in observational studies. Confounding may explain why conventional observational epidemiological studies consistently find a positive association between fibrinogen and risk of CHD: in one study107 plasma fibrinogen was strongly associated with the usual confounding factors, but the genotype associated with higher plasma fibrinogen was not related to these confounding factors. These data also illustrate the basic principle of Mendelian randomization, that genotype–disease associations can provide an unconfounded test of the association between a particular phenotype and disease.

Apolipoprotein E and coronary heart disease: getting the wrong answer?

Lowering circulating cholesterol levels pharmacologically reduces the risk of CHD by a substantial degree.116 Indeed, the effects seen in cholesterol lowering trials are as would be predicted from the increased CHD risk among individuals who are heterozygous for the familial hypercholesterolaemia mutation,117 which provided a form of Mendelian randomization of cholesterol before the definitive RCTs appeared (Box 5). The relatively extreme nature of this mutation—pushing cholesterol levels outside their usual range—meant that it was difficult to extrapolate to people in the normal range of circulating cholesterol levels within a population, however.

Familial hypercholesterolaemia is a Mendelian dominant condition in which many rare mutations of the low density lipoprotein receptor gene, with an overall prevalence of around 0.2%, lead to very high circulating cholesterol levels. In a UK study the average total cholesterol level at registration was around 9.0 mmol/l amongst people heterozygous for this condition.117 The relative risk (RR) of CHD mortality amongst these people was around 3.9 compared with the general population of the UK, for whom the average total cholesterol levels were around 6.0 mmol/l. Among people without coronary heart disease, reducing total cholesterol levels with statin drugs by around 1 to 1.5 mmol/l reduces CHD mortality by around 25% over 5 years, with the magnitude of the mortality reduction increasing over time from randomization.170–172 Assuming a linear relationship between blood cholesterol and CHD risk and given the difference in cholesterol of 3.0 mmol/l between people with familial hypercholesterolaemia and the general population, the randomized controlled trial evidence on lowering total cholesterol and reducing CHD mortality would predict a relative risk for CHD of around 2, as opposed to 3.9, for people with familial hypercholesterolaemia. However the trials also demonstrate that the magnitude of the reduction in CHD mortality increases over time, as would be expected for a disease like CHD where elevated levels of cholesterol over time influence the development of atherosclerosis. For people with familial hypercholesterolaemia their circulating total cholesterol levels will have been high throughout their lives and therefore would be expected to generate a greater risk than would be predicted for the results of lowering cholesterol levels for only 5 years.
Mendelian randomization is one method for assessing the effects of long-term differences in exposures on disease risk, free from the diluting problems of both measurement error and of only having short-term assessment of risk factor levels. This approach may provide an indication that cholesterol-lowering efforts should be lifelong rather than limited to the period for which RCT evidence with respect to CHD outcomes is available.
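The trial-based prediction of a relative risk of around 2 for familial hypercholesterolaemia can be illustrated numerically. The sketch below makes two assumptions of our own: it takes the midpoint (1.25 mmol/l) of the quoted 1 to 1.5 mmol/l cholesterol reduction, and it treats the relationship as linear on the log relative risk scale. All variable names are illustrative.

```python
import math

# Figures quoted in the text; the midpoint of the 1-1.5 mmol/l range is our choice.
trial_rr = 0.75                 # about 25% lower CHD mortality over 5 years
trial_cholesterol_drop = 1.25   # mmol/l, midpoint of 1 to 1.5
fh_cholesterol_excess = 3.0     # mmol/l: 9.0 (heterozygotes) versus 6.0 (UK population)
observed_fh_rr = 3.9            # observed RR of CHD mortality in heterozygotes

# Log relative risk per 1 mmol/l HIGHER cholesterol implied by the trials,
# then extrapolated to the 3.0 mmol/l lifelong excess.
log_rr_per_mmol = -math.log(trial_rr) / trial_cholesterol_drop
predicted_fh_rr = math.exp(log_rr_per_mmol * fh_cholesterol_excess)

print(round(predicted_fh_rr, 1), observed_fh_rr)
```

With these inputs the extrapolated relative risk is close to 2, well short of the observed 3.9. The gap is the quantity that the text attributes to lifelong rather than 5-year exposure.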

A functional polymorphism—that of the apolipoprotein E gene (APOE)—generates differences in circulating cholesterol levels that are much smaller than seen with familial hypercholesterolaemia, and do not generally displace people from the population distribution of cholesterol.118 The gene is polymorphic with three common alleles, ε2, ε3 and ε4, which produce three isoforms of the protein product, E2, E3 and E4. The commonest allele is ε3 (with allele frequencies ranging from around two-thirds to around 85%), while ε2 is the rarest (allele frequencies ranging from 3 to 13%). Thus some combinations are rare, but ε2/ε3, ε3/ε3, and ε3/ε4 genotypes constitute well over 80% of individuals within a population. Genotype is related to both low density lipoprotein (LDL) cholesterol (which is positively associated with CHD risk) and HDL cholesterol,119 which is protective against CHD. In the same large case-control study discussed above in relation to fibrinogen-related polymorphisms, the APOE genotype was studied.120 High density lipoprotein cholesterol is mainly carried with apolipoprotein A-I (Apo A-I), and LDL cholesterol with apolipoprotein B (Apo B). The findings of the study are presented in Table 2. APOE genotype was associated with Apo A-I and Apo B, and also with myocardial infarction (MI) risk. Measures of Apo A-I and Apo B were also associated with MI risk in the case-control study, and with each 0.022 g/l lower Apo A-I and 0.077 g/l higher Apo B the RR of MI was 1.54 (95% CI: 1.43–1.66). Note that this effect is for the equivalent differences in Apo A-I and Apo B that are seen per allele (Table 2), yet the effect on MI is considerably larger than the per allele influence of genotype on MI for which RR = 1.11 (95% CI: 1.06–1.17); test between relative risks P < 0.0001.
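The comparison of the two relative risks can be approximated from the published confidence intervals alone. The following is a minimal sketch, not the test used in the original study: it recovers standard errors from the 95% limits and treats the two estimates as independent, which is an assumption given that both come from the same case-control study. The helper name `log_rr_with_se` is hypothetical.

```python
import math

def log_rr_with_se(rr, lower, upper):
    """Return log(RR) and its standard error, recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return math.log(rr), se

# RR of MI for the per-allele-equivalent difference in Apo A-I and Apo B,
# and the per-allele RR of MI for APOE genotype (both from the text).
log_pheno, se_pheno = log_rr_with_se(1.54, 1.43, 1.66)
log_geno, se_geno = log_rr_with_se(1.11, 1.06, 1.17)

# z statistic for the difference between the two log relative risks,
# assuming the estimates are independent.
z = (log_pheno - log_geno) / math.sqrt(se_pheno**2 + se_geno**2)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, P < 0.0001: {p_two_sided < 0.0001}")
```

Even under this simplified approach the difference is far beyond what chance would be expected to produce, in line with the reported P < 0.0001.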

While total cholesterol and HDL cholesterol levels according to genotype were not reported in this study, they are known to be strongly associated with APOE genotype.119,121 Why, then, does the per allele influence on apolipoproteins predict a much greater difference in MI risk than is shown by the direct association between genotype and MI risk? There are various possible answers. First, it could be that the direct apolipoprotein–MI risk association is distorted by reverse causation (disease leading to changes in apolipoproteins), or by confounding, leading to a greater effect estimate than the true causal estimate. There is some suggestion that this may be the case in that the observational association between apolipoproteins and MI risk in this case-control study is greater than that predicted from a large RCT122 of the effect of simvastatin on apolipoproteins and MI risk. However, the trial only influenced apolipoprotein levels for 5 years, and the effect of lifetime differences in apolipoprotein levels, as generated by APOE polymorphisms, would be expected to be greater. Second, the APOE polymorphisms relate to more than differences in apolipoproteins and circulating total and HDL cholesterol.123,124 The ε2 variant is associated with less-efficient transfer of very low density lipoproteins and chylomicrons from the blood to the liver, greater postprandial lipaemia, and an increased risk of type III hyperlipoproteinaemia. These differences go alongside the lower LDL and higher HDL cholesterol levels and may counter the predicted beneficial effect of these on CHD risk. Thus, a meta-analysis of observational studies relating APOE to CHD risk found no reduced risk amongst carriers of ε2,125 in line with the small difference in risk seen in the case-control study discussed above.
Thus it is important to appreciate that other effects of a polymorphism than the one under investigation may influence the association with disease risk, thereby not allowing a direct comparison of intermediate phenotype and genotype associations with disease.

Potentials and limitations of Mendelian randomization

We have illustrated the potential of Mendelian randomization with a series of quantitative and non-quantitative examples. We showed quantitative triangulation (or non-triangulation) of environmental exposure–allele–intermediate phenotype–disease pathways with the examples of MTHFR and CHD, and MTHFR and NTDs; non-quantitative indications of the influence of particular environmental exposures with examples involving MTHFR and cancer; metabolic polymorphisms and cancer and organophosphates and sheep dip syndrome; and quantitative exclusion of the role of a potential intermediate phenotype in the lack of an association between a β-fibrinogen polymorphism and CHD.

The future potential of Mendelian randomization will depend upon the elucidation of functional polymorphisms that mirror environmental exposures of interest. Progress in this area is fast—and can conveniently be monitored through use of the Online version of Mendelian Inheritance in Man,126 the Human Gene Mutation Database127 and the Human Genome Variation Database.128

Consider one of our examples of the apparent failure of observational epidemiology at the beginning of this paper: vitamin C intake and CHD risk (Figure 1). Could this have been studied utilizing the principles of Mendelian randomization? Certainly, polymorphisms exist that are related to lower circulating vitamin C levels—for example, the haptoglobin polymorphism129,130—but in this case the effect on vitamin C is at some distance from the polymorphic protein and, as in the apolipoprotein E example, the other phenotypic differences could have an influence on CHD risk that would distort examination of the influence of vitamin C levels through relating genotype to disease. Even so, investigating this polymorphism could provide some further evidence on vitamin C and CHD, but the studies so far relating the haptoglobin polymorphism to CHD risk have been of too limited size to be informative.130–132 Where there exists a range of polymorphic loci that influence an intermediate phenotype—as they do for vitamin C levels126—then similar quantitative findings for the influence of the potential intermediate phenotype (vitamin C levels) on CHD risk from relating the different loci to disease outcomes would provide greater confidence in interpreting the associations. Confounding by other effects of the different polymorphic loci influencing vitamin C levels would be unlikely to act in the same direction and have the same distorting effect on the intermediate phenotype-disease association.

There are, however, a number of important limitations to Mendelian randomization that need to be considered before a balanced view of the potential value of the approach can be reached. These can be considered under various headings.

Failure to establish reliable genotype–intermediate phenotype or genotype–disease associations

If the associations between genotype and a potential intermediate phenotype, or between genotype and disease outcome, are not reliably estimated, then interpreting these associations in terms of their implications for potential environmental causes of disease will clearly be inappropriate. This is not an issue particular to Mendelian randomization, rather the non-replicable nature of perhaps most apparent findings in genetic association studies is a serious limitation to the whole enterprise. In Table 3 we summarize possible reasons for the non-replication of findings.106,133 Population stratification—i.e. confounding of genotype–disease associations by factors related to subpopulation group membership within the overall population in a study—is unlikely to be a major problem in most situations.134–136 Genotyping errors can of course lead to failures of replication of genotype–disease associations. Where intermediate phenotypes can be measured, as is the case of MTHFR or β-fibrinogen, a demonstration of the expected relationship between genotype and intermediate phenotype in such studies indicates that genotyping errors are not to blame. For example, in the large β-fibrinogen case-control study11 the report of a lack of association between genotype and CHD risk could be claimed to reflect genotyping errors (and thus not to be taken to militate against a causal role for fibrinogen). However, since within this study the investigators also demonstrated that their genotyping data did predict plasma fibrinogen levels to the same degree as in other studies, this interpretation is not tenable.

Regarding failure to replicate in genetic epidemiology, true variation between studies is clearly possible—for example, people heterozygous for familial hypercholesterolaemia only seem to experience increased mortality in populations with high dietary fat intakes and the presence of other CHD risk factors.137,138 Nevertheless, the major factor for non-replication is probably inadequate statistical power (generally reflecting limited sample size), coupled with publication bias.106

Interestingly, Gregor Mendel appreciated the need for adequate sample size when he carried out his experiments on pea crosses, stating that ‘with a small number of plants … very considerable fluctuations may occur’ and that the ‘true ratio of the numbers can only be ascertained by an average deduced from the sum of as many single values as possible; the greater the number the more are merely chance effects eliminated’.44 It has been suggested that Mendel adopted the strategy of fabricating some (or even all) of his data to solve the problem, although there are several possible reasons for his ‘too good to be true’ findings (Box 6).

Mendel’s ‘Experiments in Plant Hybridisation’44 formed the basis of quantitative genetics on their re-discovery in the early part of the 20th century. The detailed results of the experiments, animated in an accessible way on the website http://www.mendel-museum.org, were, however, possibly too good to be true, as has long been discussed.173,174 Various possibilities have been given as to why the ratios reported by Mendel are closer to the theoretical ratios than would be expected by chance. The range of explanations runs from the suggestion that Mendel did not actually carry out the experiments and simply made up the results; that Mendel falsified some, but not all, of his data; that an assistant of Mendel who was collecting and tabulating the results knew of Mendel’s expectations and manipulated the data to fit these; or that Mendel simply had good luck.175 A further possibility is that the data were being constantly updated—not all the products of his cross-breeding experiment were assessed and tabulated—and that Mendel stopped classifying data when chance fluctuation had led his results to be very close to expectation. This is similar to potential problems in RCTs whereby trialists look at updated outcomes and trials are stopped at a time when the results are, by chance, particularly favourable.

In the case of quantitative approaches to Mendelian randomization, sample size calculations need to consider the magnitude of the predicted effect of the intermediate phenotype on disease outcome. This often leads to very large studies being required—for example, in the case of MTHFR variants and CHD risk the magnitude of the MTHFR variant-homocysteine and homocysteine-CHD associations discussed earlier would mean that around 9500 CHD cases and 9500 controls are required to establish the predicted effect, with a power of 80% at the P = 0.05 level.66
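To show why small predicted effects translate into very large studies, the following sketch applies the standard normal-approximation formula for comparing two proportions in an unmatched case-control study with equal numbers of cases and controls. It is not the calculation used in the cited reference: the function name `cases_needed`, the 10% genotype frequency and the odds ratios in the loop are hypothetical values chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cases_needed(odds_ratio, control_prevalence, alpha=0.05, power=0.80):
    """Approximate number of cases (with an equal number of controls).

    Compares the frequency of the risk genotype in controls
    (control_prevalence) with the frequency implied in cases by odds_ratio,
    using a two-sided test at level alpha.
    """
    p0 = control_prevalence
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))  # genotype frequency in cases
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


# Hypothetical inputs: a risk genotype carried by 10% of controls.
for predicted_or in (2.0, 1.5, 1.2, 1.1):
    print(predicted_or, cases_needed(predicted_or, control_prevalence=0.10))
```

The required number of cases rises steeply as the predicted odds ratio approaches 1, which is the regime in which genotype effects predicted from intermediate phenotypes typically fall.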

Failure to recognize the sample sizes required to detect plausible or predicted effects of genotype on disease can lead to studies being uninformative. For example, in a report of a case-control study provocatively entitled ‘Elevated plasma fibrinogen. Cause or consequence of cardiovascular disease?’,108 the RR of CHD for a 1 g/l rise in fibrinogen level was 1.45 (95% CI: 1.12–1.88), while the association between genotype and CHD risk was essentially null (RR = 1.08, 95% CI: 0.71–1.65 for GA and AA genotypes compared with GG genotype). As with the large case-control study of this issue discussed above,11 the authors interpreted these results as indicating that fibrinogen was not a cause of CHD. However, given the strength of the association between genotype and fibrinogen, with GA plus AA individuals having 0.17 g/l higher fibrinogen than GG individuals, the predicted risk according to genotype, given the observational association between fibrinogen and CHD, would be 1.07 (95% CI: 1.01–1.11). This is clearly not different from the estimated RR; indeed, the point estimates are remarkably close, although there is a very wide confidence interval around the RR for genotype. Thus the authors’ claim that their study suggests that fibrinogen is not causally related to the risk of CHD is not supported by evidence from their own study. The much larger case-control study discussed above11 was required to demonstrate this.
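The predicted genotype effect in this example can be checked with a few lines of arithmetic. The sketch below assumes a log-linear relation between fibrinogen and CHD risk and takes the quoted RR of 1.45 to apply per 1 g/l of fibrinogen; the function name `predicted_genotype_rr` is hypothetical. Only the point estimate is computed, since the text does not say how the published confidence interval was derived.

```python
from math import exp, log


def predicted_genotype_rr(rr_per_unit, genotype_difference):
    """Scale a per-unit relative risk to the exposure difference between genotypes.

    Assumes a log-linear relation between the intermediate phenotype and risk,
    so that RR_predicted = rr_per_unit ** genotype_difference.
    """
    return exp(genotype_difference * log(rr_per_unit))


# Figures quoted in the text: RR 1.45 per unit of fibrinogen, and a
# 0.17 g/l difference between GA/AA and GG individuals.
predicted = predicted_genotype_rr(1.45, 0.17)
observed_ci = (0.71, 1.65)  # 95% CI around the observed genotype RR of 1.08

print(f"predicted RR for genotype: {predicted:.2f}")
print("predicted RR lies within the observed 95% CI:",
      observed_ci[0] <= predicted <= observed_ci[1])
```

The predicted value rounds to 1.07, matching the figure quoted above, and lies well inside the wide interval around the observed genotype RR.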

The small genotype-associated RRs predicted by knowledge of intermediate phenotype in the case of MTHFR and β-fibrinogen mean that very large studies are required; in other cases it may be that even smaller RRs would be expected. If polymorphisms at more than one locus influence an intermediate phenotype then it may be possible to explore combinations of polymorphisms at different loci that produce differences in intermediate phenotype that are substantial enough to generate detectable effects on disease outcome. If the loci are not in linkage disequilibrium and thus segregate independently this could be termed ‘factorial Mendelian randomization’, with interest being in the groups in which the combination of polymorphisms produces the most extreme difference in intermediate phenotype. Alternatively, haplotypes that produce the most extreme phenotypic differences could be studied. It should be remembered that with variants at multiple loci contributing to phenotypic differences there is greater likelihood of confounding or pleiotropic effects, as we discuss below.

The problems in establishing reliable genotype–disease associations are, of course, a general issue in genetic epidemiology. Recently, Tabor and colleagues have emphasized the advantages of candidate-gene approaches in which plausible links between the functional effects of candidate polymorphisms and disease outcomes exist.139 Such studies are less likely to produce false-positive findings than are investigations relating non-functional genetic variants to disease risk. Since the relative frequency of different forms of polymorphisms decreases with increasing potential functional effect,140 rigid adoption of this approach would limit the number of associations that are statistically examined, and thus reduce the proportion of false-positive reports.106 Mendelian randomization clearly depends upon studying genetic variants that have a defined biological effect, and therefore the relevant studies fit within this model. Tabor and colleagues extend their reasoning on candidate-gene association studies to suggest that researchers should carry out initial sequencing work on the functional regions of a gene to identify new SNP, then determine the population frequency of these SNP and their functional relevance, before performing the epidemiological analyses. The need for epidemiologists to work closely and collaboratively with laboratory scientists to take forward Mendelian randomization is made clear in this exposition.

Confounding of genotype-intermediate phenotype–disease associations

The power of Mendelian randomization lies in its ability to avoid the often substantial confounding seen in conventional observational epidemiology. However, confounding can occur in several ways that need to be considered. First, it is possible that the locus under study is in linkage disequilibrium (i.e. is associated) with another polymorphic locus, with the effect of the polymorphism under investigation being confounded by the influence of the other polymorphism. It may seem unlikely, given the relatively short distances over which linkage disequilibrium is seen in the human genome, that a polymorphism influencing, say, CHD risk would be associated with another polymorphism influencing CHD risk (and thus producing confounding). There are, nevertheless, cases of different genes influencing the same metabolic pathway being in physical proximity. For example, different polymorphisms influencing alcohol metabolism appear to be in linkage disequilibrium.141 Furthermore, given the pleiotropic effect of perhaps most genes and the multiple polymorphisms that can exist within a single gene, such confounding may not be uncommon. However, in cases where the intermediate phenotype is measured, such as the β-fibrinogen example, this explanation is not applicable, as the effect of any linked polymorphism on fibrinogen levels would be directly observed.

Confounding by behavioural factors is possible if the loci under study influence the behaviour either directly (through modifying response to, and thus consumption of, tobacco or alcohol, for example) or indirectly (through effects on schooling, learning and social trajectory, for example). The identified genetic contributions to behaviours such as smoking and drinking alcohol render such confounding possible either through a polymorphism under study being in linkage disequilibrium with a polymorphism influencing these behaviours, or through an influence on these behaviours being a pleiotropic effect of the polymorphism under study.

As well as contributing to potential confounding, such influences also provide a potential way of studying the effects of health-related behaviours. For example, the ALDH2 genotype97,98 has a strong influence on alcohol consumption, through its effect on flushing and hangovers following drinking. Alcohol consumption would, then, confound associations between ALDH2 and disease outcomes (or, alternatively, alcohol consumption could be seen to be on the pathway between ALDH2 and outcome). These associations provide a further investigative opportunity: since groups defined by this genotype have large differences in alcohol consumption patterns, the health effects of alcohol can be examined by studying the health differences between these groups.

The influence of genetic variants may not have straightforward interpretations with respect to the direct causal factor, however. For example, the ALDH2 variant associated with slow oxidation of acetaldehyde, facial flushing, headaches, and much lower alcohol consumption is also associated with lower weight and adiposity in Japanese men (but not women).98 This is presumably because alcohol constitutes a substantial proportion of total calorie intake among men, thus the virtual non-drinkers who are homozygous for slow acetaldehyde oxidation have lower calorie intake. When relating ALDH2 to disease outcomes the associations could reflect either differences in alcohol consumption or differences in obesity, both of which could have causal effects on disease outcomes such as CHD or cancer. In such situations Mendelian randomization is clearly limited in determining the proximal causal factor, although it may be possible to look in groups (such as Japanese women) for whom genotype is less strongly or not associated with adiposity to untangle these effects. It is clear, however, that in order to be able to interpret the findings of these studies potential intermediate or confounding factors need to be measured to elucidate the understanding of aetiological pathways. Therefore, where possible, when studying disease associations with a particular polymorphism a wide range of phenotypic and genotypic potential confounders should be assessed to explore the possible contribution of confounding. In the case of MTHFR and β-fibrinogen we show above that confounding by conventional risk factors of genotype–disease associations is non-existent or negligible compared with the magnitude of confounding of associations between plasma homocysteine or fibrinogen and disease. 
Thus while the full potential of studies following the principles of Mendelian randomization is not yet well delineated, it is likely that they will suffer from substantially less confounding than do conventional observational epidemiological studies of environmental exposures and disease.
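The contrast between a confounded exposure–disease association and a largely unconfounded genotype-based comparison can be made concrete with a simulation. The sketch below is illustrative only: every numerical value is hypothetical, the outcome is constructed so that the intermediate phenotype has no causal effect on it, and the ratio of covariances used for the genotype-based estimate is a standard instrumental-variable device that is assumed here rather than taken from the text.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n=20000, seed=7):
    """Return (observational slope, genotype-based ratio estimate)."""
    rng = random.Random(seed)
    genotype, exposure, outcome = [], [], []
    for _ in range(n):
        g = (rng.random() < 0.3) + (rng.random() < 0.3)  # 0, 1 or 2 risk alleles
        u = rng.gauss(0, 1)                # unmeasured confounder, unrelated to g
        x = 0.5 * g + u + rng.gauss(0, 1)  # intermediate phenotype
        y = u + rng.gauss(0, 1)            # outcome: depends on u only, not on x
        genotype.append(g)
        exposure.append(x)
        outcome.append(y)
    observational = covariance(exposure, outcome) / covariance(exposure, exposure)
    genetic = covariance(genotype, outcome) / covariance(genotype, exposure)
    return observational, genetic


observational, genetic = simulate()
print(f"slope of outcome on exposure (confounded): {observational:.2f}")
print(f"genotype-based estimate:                   {genetic:.2f}")
```

In this constructed setting the observational slope is clearly positive even though the exposure has no effect, whereas the genotype-based estimate stays close to zero because genotype is allocated independently of the confounder.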

Pleiotropy and the multi-function of genes

Mendelian randomization is most useful when it can be used to relate a single intermediate phenotype to a disease outcome. However, polymorphisms may (and probably will) influence more than one intermediate phenotype, as we saw with the MTHFR, APOE, and ALDH2 examples. This can be the case either through multiple effects mediated by their immediate protein coding or gene expression, or (probably less importantly) through alternative splicing, where one polymorphic region contributes to alternative forms of more than one protein.142 The most robust interpretations will be possible when the functional polymorphism appears to directly influence the level of the intermediate phenotype of interest (as in the β-fibrinogen example), but such examples are probably going to be less common in Mendelian randomization than cases where the polymorphism can influence several systems, with different potential interpretations of how the effect on outcome is generated (as we discussed with respect to the MTHFR and APOE examples earlier).

Canalization and developmental stability

The previous problems for Mendelian randomization can be examined through measuring potential confounding factors, although in the case of pleiotropy this may be difficult as the other systems that are being influenced may not be well characterized. A greater problem arises from the developmental compensation that may occur through a polymorphic genotype being expressed during fetal development, and thus influencing development in such a way as to buffer against the effect of the polymorphism. Such compensatory processes have been discussed since C.H. Waddington introduced the notion of canalization in the 1940s.143 Canalization refers to the buffering of the effects of either environmental or genetic forces attempting to perturb development, and Waddington’s ideas have been well developed both empirically and theoretically.144–150 Such buffering can be achieved either through genetic redundancy (more than one gene having the same or similar function) or through alternative metabolic routes, where the complexity of metabolic pathways allows recruitment of different pathways to reach the same phenotypic endpoint. In effect a functional polymorphism expressed during fetal development or post-natal growth may influence the expression of a wide range of other genes, leading to changes that may compensate for the influence of the polymorphism. Put crudely, if a person has developed and grown from the intra-uterine period onwards within an environment in which one factor is perturbed (e.g. there is elevated fibrinogen due to genotype) then they may be rendered resistant to the influence of lifelong elevated circulating fibrinogen, through permanent changes in tissue structure and function that counterbalance its effects. 
In intervention trials—for example, RCTs of folate supplementation—the intervention is generally randomized to participants during their middle-age; similarly in observational studies of this issue, folate intake or plasma homocysteine levels are ascertained during adulthood. In Mendelian randomization, on the other hand, randomization occurs before birth. This leads to important caveats when attempting to relate the findings of conventional observational epidemiological studies to the findings of studies carried out within the Mendelian randomization paradigm.

The most dramatic demonstrations of developmental compensation come from knockout studies, where a functioning gene is essentially removed from an organism. The overall phenotypic effects of such knockouts have often been much lower than knowledge of the function of the genes would predict, even in the absence of other genes carrying out the same function as the knockout gene.151–154 For example, pharmacological inhibition demonstrates that myoglobin is essential to maintain energy balance and contractile function in the myocardium of mice, yet disrupting the myoglobin gene resulted in mice devoid of myoglobin with no disruption of cardiac function.155 A second example relates to the substantial experimental and epidemiological evidence that prostaglandins formed via cyclooxygenase 1 (COX-1) pathways maintain stomach function and that drugs (such as NSAIDs) that inhibit COX-1 produce stomach ulceration.156,157 A ‘Mendelian randomization’ test of this observation would be to relate genetic inhibition of COX-1 to stomach ulceration, with the prediction being that an association would be found. The most dramatic genetic inhibition of COX-1 is its absence, in a knockout preparation. Yet mice with such a knockout do not develop ulcers.158,159 The interpretation of findings from knockout studies is complex,152–154 however, as is illustrated in the latter case by the fact that recent studies have suggested that COX-1 inhibition without COX-2 inhibition only produces ulcers in the presence of an environmental challenge.160,161

In the field of animal genetic engineering studies—such as knockout preparations or transgenic animals manipulated so as to over-express foreign DNA—the interpretive problem created by developmental compensation is well recognized.151–154 Conditional preparations—in which the level of transgene expression can be induced or suppressed through the application of external agents—are now being utilized to investigate the influence of such altered gene expression after the developmental stages during which compensation can occur.152 Thus, further evidence on the issue of genetic buffering should emerge to inform interpretations of both animal and human studies.

Most examples of developmental compensation relate to dramatic genetic or environmental insults, thus it is unclear whether the generally small phenotypic differences induced by common functional polymorphisms will be sufficient to induce compensatory responses. The fact that the large gene–environment interactions that have been observed often relate to novel exposures that have not been present during the evolution of a species (e.g. drug interactions)162 may indicate that homogenization of response to exposures that are widely experienced (as would be the case with the products of functional polymorphisms or common mutations) has occurred; canalizing mechanisms may be particularly relevant in these cases. Only further work on the basic mechanisms of developmental stability and how this relates to relatively small exposure differences during development will allow these considerations to be taken forward. This leaves Mendelian randomization in the somewhat unsatisfactory position of facing a potential problem that cannot currently be adequately assessed.

Lack of suitable polymorphisms for studying modifiable exposure of interest

An obvious limitation of Mendelian randomization is that it can only examine areas for which there are functional polymorphisms (or markers linked to such functional polymorphisms) that are relevant to the modifiable exposure of interest. In the context of genetic association studies more generally it has been pointed out that in many cases even if a locus is involved in a disease-related metabolic process there may be no suitable marker or functional polymorphism to allow study of this process.163 For example, in the vitamin C and CHD association referred to earlier, SLC23A1—a gene encoding for the vitamin C transporter SVCT1, that is particularly responsible for vitamin C transport by intestinal cells—would be an attractive candidate for Mendelian randomization studies. However, a search for variants failed to find any common SNP that could be used in such a way.164 Clearly what is possible will depend on further empirical evidence regarding the density of markers and functional polymorphisms within the human genome.

Gene–environment interaction and Mendelian randomization

Mendelian randomization is one way in which genetic epidemiology can inform understanding about environmental determinants of disease. A more conventional approach has been to study interactions between environmental exposures and genotype.165,166 Several issues arise in this regard.

The most reliable findings in genetic association studies relate to the main effects of polymorphisms on disease risk.12 The power to detect meaningful gene–environment interaction is low,162 with the result being that there are a large number of reports of spurious gene–environment interactions in the medical literature.106 Mendelian randomization is most powerful when studying modifiable exposures that are measured poorly and/or considerably confounded, such as dietary factors. Given measurement error—particularly if this is differential with respect to other factors influencing disease risk—interactions are both difficult to detect and often misleading when, apparently, they are found. Thus for essentially universal exposures (such as blood folate levels or dietary folate intake) detecting the main effect of a genotype which mimics the influence of this exposure (such as MTHFR) is more reliable than attempting to detect the particular influence of genotype in essentially arbitrarily defined population subgroups, where there is considerable scope for creative thinking with regard to the hypothesis under test (for example, plausible reasons could be given for assuming that the association between MTHFR and CHD should be seen amongst either people with low folate intake, or people with high folate intake). In the case of NAT2 polymorphisms and colon cancer claims have been made that, despite the lack of an overall effect in many studies, rapid acetylators are at increased risk if they consume red meat,167 processed meat,168 fried meat169 and various other meat products. Clearly a large number of sub-group analyses could be carried out in this case, and the lack of a main effect, while differing sub-groups apparently show an effect within different studies, does not provide very robust evidence of a real biological interaction. 
Since a large proportion of the population within the countries in which these studies have been carried out consume the meat products that have, variously, been shown to interact with NAT2 genotype, an overall effect should be seen in these populations.
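The concern about multiple sub-group analyses can be quantified in a simplified way. The sketch below assumes that the sub-group tests are independent, that no true interaction exists, and that each test is judged at the conventional 5% level; these assumptions, and the function name `chance_of_spurious_finding`, are illustrative and do not come from the studies cited.

```python
def chance_of_spurious_finding(n_subgroups, alpha=0.05):
    """Probability that at least one of n independent null tests is 'significant'."""
    return 1 - (1 - alpha) ** n_subgroups


# Number of sub-group comparisons examined (hypothetical values).
for k in (1, 5, 10, 20):
    print(k, round(chance_of_spurious_finding(k), 2))
```

Even a modest number of sub-group comparisons makes at least one nominally significant interaction likely under the null, which is consistent with different sub-groups appearing to show an effect in different studies.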

The situation is perhaps different with exposures which differ qualitatively rather than quantitatively between individuals. For example, it is sensible to restrict examination of the influence of polymorphisms relating to organophosphate deactivation to those who are occupationally exposed, as in the sheep dip example we discussed earlier.101 Similarly, examining the effect of a polymorphism relating to alcohol metabolism can sensibly be carried out amongst people who drink some alcohol.95

Conclusions

Mendelian randomization provides a promising means of examining the effects of modifiable exposures on disease risk. In Box 7 we summarise the key issues regarding this methodology. Interestingly Mendelian randomization within epidemiology reflects similar thinking among transgenic animal researchers. Williams and Wagner consider that ‘A properly designed transgenic experiment can be a thing of exquisite beauty in that the results support absolutely unambiguous conclusions regarding the function of a given gene or protein within the authentic biological context of an intact animal. A transgenic experiment may provide the most rigorous test possible of a mechanistic hypothesis that was generated by previous observational studies. A successful transgenic experiment can cut through layers of uncertainty that cloud the interpretation of the results produced by other experimental designs.’154 The problems of interpreting some aspects of transgenic animal studies may also apply to Mendelian randomization within genetic epidemiology, however, and linked progress across the fields of genomics, animal experimentation and epidemiology will better define the scope of Mendelian randomization in future. For the present, however, it is probably fair to say that the method offers a more robust approach to understanding the effect of some modifiable exposures on health outcomes than does much conventional observational epidemiology. Where possible randomized controlled trials remain the final arbiter of the effects of interventions intended to influence health, however.

Box 7
Summary and key messages

Genetic association studies with functional polymorphisms can provide powerful evidence on mechanisms of disease and potential interventions

Such studies are considerably less prone to confounding than conventional risk-factor epidemiology

Liability to bias—in particular publication bias—is high; perhaps higher than in conventional risk factor epidemiology

Effect sizes are likely to be small; sample sizes need to be very large

The value of such studies is higher the better the functional consequences of the polymorphisms are characterized

Polymorphism–disease associations with relatively small phenotypic influence provide more information of mechanistic/intervention import than catastrophic rare mutations

Pleiotropy and linkage disequilibrium can produce confounding, thus a full investigation of potential confounding should be carried out in such studies

Suitable polymorphisms to study particular exposures may not be available

Morphogenic stability/developmental adaptation/canalization create important caveats to the interpretation of such studies; this has been under-appreciated in some presentations of Mendelian randomization

Table 1

Comparison of potential confounding factors associated with Methylene tetrahydrofolate reductase (MTHFR) polymorphisms and plasma homocysteine. The former is not confounded, but the latter is

                                      MTHFR genotype
Risk factors                          TT (N = 349)   CT (N = 340)   CC (N = 76)   P-value
Smoking (current) % (95% CI)          17 (13–21)     23 (18–27)     19 (11–31)    0.9
Systolic BP mean (SD) mmHg            147 (24)       146 (24)       150 (23)      0.2
Total cholesterol mean (SD) mmol/l    6.3 (1.1)      6.2 (1.1)      6.3 (1.3)     0.7
Manual social class % (95% CI)        52 (47–58)     58 (52–64)     53 (40–65)    0.9
                                          Plasma homocysteine tertiles (μmol/l)
Risk factors                              4.5–9.9 (N = 268)   10.0–12.4 (N = 252)   12.5–74.4 (N = 257)   P-value
Smoking (current) % (95% CI)              16 (12–21)          14 (10–20)            30 (24–36)            <0.001
Systolic blood pressure mean (SD) mmHg    142 (22)            148 (25)              150 (25)              <0.001
Total cholesterol mean (SD) mmol/l        6.3 (1.1)           6.3 (1.2)             6.2 (1.1)             0.5
Manual social class % (95% CI)            48 (42–55)          55 (48–61)            63 (56–69)            0.001
Source: British Regional Heart Study: Dewsbury and Maidstone data.
Table 2

APOE, apolipoproteins, and myocardial infarction (MI) risk120

Genotype     Apo A-1       Apo B         MI risk
ε3/2         1.24 g/l      0.90 g/l      1.0 (0.89–1.13)
ε3/3         1.22 g/l      1.03 g/l      1.18 (1.12–1.24)
ε3/4         1.20 g/l      1.07 g/l      1.37 (1.26–1.48)
Per allele   –0.022 g/l    +0.077 g/l    1.11 (1.06–1.17)
The relative risks for MI by the three genotypes are floating absolute risks, and therefore 95% CI are given around all categories, including the baseline referent category.
Table 3

Reasons for inconsistent genotype–phenotype associations

True variation
  Variation of allelic association between subpopulations: (1) disease causing allele in linkage disequilibrium with different marker alleles in different populations; or (2) different variants within the same gene contribute to disease risk in different populations
  Effect modification by other genetic or environmental factors that vary between populations
Spurious variation
  Genotyping errors
  Misclassification of phenotype
  Confounding by population structure
  Lack of power
  Chance
  Publication bias
Adapted from refs 106, 133
Figure 1

Estimates of the effects of an increase of 15.7 μmol/l plasma vitamin C on coronary heart disease 5-year mortality estimated from observational epidemiological EPIC study27 and randomized controlled Heart Protection Study.28 (EPIC m = men, age-adjusted; EPIC m* = men, adjusted for systolic blood pressure, cholesterol, body mass index, smoking, diabetes, and vitamin supplement use; EPIC w = women, age-adjusted; EPIC w* = women, adjusted for systolic blood pressure, cholesterol, body mass index, smoking, diabetes, and vitamin supplement use)

Figure 2

Genes, alleles, genotypes

Figure 3

Mendelian randomization in parent–offspring design

Offspring should have an equal chance of receiving either of the alleles that the parents have at any particular locus

Figure 4

Pathways of homocysteine metabolism. Homocysteine is re-methylated to form methionine. Methylene tetrahydrofolate reductase (MTHFR) participates in this conversion of homocysteine to methionine, through influencing levels of 5-methyl tetrahydrofolate. Homocysteine is also metabolized by the transsulfuration pathway to cysteine, which is vitamin B6 dependent

Figure 5

Inheritance of MTHFR polymorphism, homocysteine and neural tube defects

Figure 6

Associations between polymorphisms of the alcohol dehydrogenase (ADH) enzyme (γ1γ1 are fast alcohol metabolizers and γ2γ2 are slow metabolizers) and high density lipoprotein levels in men (A) and women (B), stratified by alcohol consumption.95 Reprinted with permission

* 30th Thomas Francis Jr Memorial Lecture, to be delivered by George Davey Smith at the University of Michigan, School of Public Health, 6 March 2003.

We thank Richard Gray for information about the first usage of the term ‘Mendelian randomization’, Martijn Katan for discussing his early conceptualization of these issues, Sheila Bird, who suggested the term ‘Factorial Mendelian Randomization’, and the following who commented on earlier drafts of this paper: Helen Colhoun, Ian Day, David Gunnell, Andrew Hattersley, Nancy Krieger, Debbie Lawlor, David Leon, John Lynch, Peter McCarron, Paul McKeigue, Tony McMichael, Andy Ness, Neil Pearce, Rodolfo Saracci, Jonathan Sterne, Ezra Susser, Martin Tobin and Jan Vandenbrooke.

References

1
Butler D. Epidemiology set to get fast-track treatment.
Nature
 
2001
;
414
:
139
.
2
Khoury MJ, Dorman JS. The human genome epidemiology network (HuGE Net)
Am J Epidemiol
 
1998
;
148
:
1
–3.
3
Khoury MJ. Human Genome Epidemiology (HuGE): translating advances in human genetics into population-based data for medicine and public health.
Genet Med
 
1999
;
1
:
71
–73.
4
Khoury MJ, Burke W, Thomson EJ (eds). Genetics and Public Health in the 21st Century. Oxford: Oxford University Press, 2000, pp. 10–11.
5
Porta M. The genome sequence is a jazz score.
Int J Epidemiol
 
2003
;
32
:
29
–31.
6
Vineis P, Schulte P, McMichael AJ. Misconceptions about the use of genetic tests in populations.
Lancet
  ;
357
:
709
– 12.
7 Katan MB. Apolipoprotein E isoforms, serum cholesterol, and cancer. Lancet 1986;i:507–08.
8 Ames BN. Cancer prevention and diet: Help from single nucleotide polymorphisms. PNAS 1999;96:12216–18.
9 Rothman N, Wacholder S, Caporaso NE, Garcia-Closas M, Buetow K, Fraumeni JF. The use of common genetic polymorphisms to enhance the epidemiologic study of environmental carcinogens. Biochimica Biophysica Acta 2001;1471:C1–C10.
10 Brennan P. Gene environment interaction and aetiology of cancer: what does it mean and how can we measure it? Carcinogenesis 2002;23:381–87.
11 Youngman LD, Keavney BD, Palmer A et al. Plasma fibrinogen and fibrinogen genotypes in 4685 cases of myocardial infarction and in 6002 controls: test of causality by ‘Mendelian randomization’. Circulation 2000;102(Suppl.II):31–32.
12 Clayton D, McKeigue PM. Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet 2001;358:1356–60.
13 Fallon UB, Ben-Shlomo Y, Davey Smith G. Homocysteine and coronary heart disease. Heart Online 14 March 2001 (http://heart.bmjjournals.com/cgi/eletters/85/2/153)
14 Keavney B. Genetic epidemiological studies of coronary heart disease. Int J Epidemiol 2002;31:730–36.
15 Davey Smith G, Ebrahim S. Data dredging, bias and confounding. BMJ 2002;325:1437–38.
16 Feinstein AR. Scientific standards in epidemiologic studies of the menace of daily life. Science 1988;242:1257–63.
17 Taubes G. Epidemiology faces its limits. Science 1995;269:164–69.
18 Le Fanu J. The Rise and Fall of Modern Medicine. New York: Little Brown, 1999.
19 Skrabanek P. False Premises, False Promises. Whithorn: Tarragon Press, 2000.
20 Davey Smith G. Reflections on the limitations to epidemiology. J Clin Epidemiol 2001;54:325–31.
21 Peto R, Doll R, Buckley JD, Sporn MB. Can dietary beta-carotene materially reduce human cancer rates? Nature 1981;290:201–08.
22 Willett WC. Vitamin A and lung cancer. Nutr Rev 1990;48:201–11.
23 Alpha-Tocopherol, Beta Carotene Cancer Prevention Study Group. The effect of vitamin E and beta carotene on the incidence of lung cancer and other cancers in male smokers. N Engl J Med 1994;330:1029–35.
24 Omenn GS, Goodman GE, Thornquist MD et al. Effects of a combination of beta carotene and vitamin A on lung cancer and cardiovascular disease. N Engl J Med 1996;334:1150–55.
25 Hennekens CH, Buring JE, Manson JE et al. Lack of effect of long-term supplementation with beta carotene on the incidence of malignant neoplasms and cardiovascular disease. N Engl J Med 1996;334:1145–49.
26 Hooper L, Ness AR, Davey Smith G. Antioxidant strategy for cardiovascular diseases. Lancet 2001;357:1705–06.
27 Khaw K-T, Bingham S, Welch A et al. Relation between plasma ascorbic acid and mortality in men and women in EPIC-Norfolk prospective study: a prospective population study. Lancet 2001;357:657–63.
28 Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of antioxidant vitamin supplementation in 20 536 high-risk individuals: a randomised placebo-controlled trial. Lancet 2002;360:23–33.
29 Davey Smith G, Phillips AN. Confounding in epidemiological studies: why ‘independent’ effects may not be all they seem. BMJ 1992;305:757–59.
30 Phillips A, Davey Smith G. How independent are ‘independent’ effects? Relative risk estimation when correlated exposures are measured imprecisely. J Clin Epidemiol 1991;44:1223–31.
31 Phillips AN, Davey Smith G. The design of prospective epidemiological studies: more subjects or better measurements? J Clin Epidemiol 1993;46:1203–11.
32 Weiss NS. Can the ‘specificity’ of an association be rehabilitated as a basis for supporting a causal hypothesis? Epidemiology 2002;13:6–8.
33 Petitti DB, Perlman JA, Sidney S. Postmenopausal estrogen use and heart disease. N Engl J Med 1986;315:131–32.
34 Phillips A. Balancing quantity and quality when designing epidemiological studies. Int J Epidemiol 2003;32:58–59.
35 Phillips AN, Davey Smith G. Cigarette smoking as a potential cause of cervical cancer: has confounding been controlled? Int J Epidemiol 1994;23:42–49.
36 Greenland S. Basic methods for sensitivity analysis. Int J Epidemiol 1996;25:1107–16.
37 Davey Smith G. The uses of ‘The Uses of Epidemiology’. Int J Epidemiol 2001;30:1146–55.
38 Gray R, Wheatley K. How to avoid bias when comparing bone marrow transplantation with chemotherapy. Bone Marrow Transplant 1991;7(Suppl.3):9–12.
39 Ljungman P, De Witte T, Verdonck L et al. Bone marrow transplantation for acute myeloblastic leukaemia: an EBMT Leukaemia Working Party prospective analysis from HLA-typing. Br J Haematol 1993;84:61–66.
40 Keating S, de Witte T, Suciu S et al. The influence of HLA-matched sibling donor availability on treatment outcome for patients with AML: an analysis of the AML 8A study of the EORTC Leukaemia Cooperative Group and GIMENA. Br J Haematol 1998;102:1344–53.
41 Burnett AK, Wheatley K, Goldstone AH et al. The value of allogenic bone marrow transplant in patients with acute myeloid leukaemia at differing risk of relapse: results of the UK MRC AML 10 trial. Br J Haematol 2002;118:385–400.
42 Bleakley M, Shaw PJ, Nielsen JM. Allogenic bone marrow transplantation for childhood relapsed acute lymphoblastic leukemia: comparison of outcome in patients with and without a matched family donor. Bone Marrow Transplant 2002;31:1–7.
43 Harrison G, Richards S, Lawson S et al. on behalf of the MRC Childhood Leukaemia Working Party. Comparison of allogeneic transplant versus chemotherapy for relapsed childhood acute lymphoblastic leukaemia in the MRC UKALL R1 trial. Ann Oncol 2000;11:999–1006.
44 Mendel G. Experiments in Plant Hybridization (1865). http://www.mendelweb.org/archive/Mendel.Experiments.txt (accessed 12 Dec 2002).
45 Elston R, Olson J, Palmer L (eds). Biostatistical Genetics and Genetic Epidemiology. Chichester: John Wiley & Sons, 2002, p. 285.
46 Terwilliger JD, Goring HHH. Gene mapping in the 20th and 21st centuries: statistical methods, data analysis and experimental design. Hum Biol 2000;72:63–132.
47 Talking Glossary: National Human Genome Research Institute. www.genome.gov/glossary.cfm (accessed 12 Dec 2002).
48 Spielman RS, McGinnis RE, Ewens WJ. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 1993;52:506–16.
49 Ahsan H, Hodge SE, Heiman GA, Begg MD, Susser ES. Relative risk for genetic associations: the case-parent triad as a variant of case-cohort design. Int J Epidemiol 2002;31:669–78.
50 Risch NJ. Searching for genetic determinants in the new millennium. Nature 2000;405:847–56.
51 Garcia-Palmiere MR, Sorlie PD, Costas R, Havlik RJ. An apparent inverse relationship between serum cholesterol and cancer mortality in Puerto Rico. Am J Epidemiol 1981;114:29–40.
52 McMichael AJ, Jensen OM, Parkin DM, Zaridze DG. Dietary and endogenous cholesterol and human cancer. Epidemiol Rev 1984;6:192–216.
53 Wessel N, Liestol K, Maehlen J, Brorson SH. The apolipoprotein E epsilon4 allele is no risk factor for prostate cancer in the Norwegian population. Br J Cancer 2001;85:1418.
54 Niemi M, Kervinen K, Kiviniemi H et al. Apolipoprotein E phenotype, cholesterol and breast and prostate cancer. J Epidemiol Community Health 2000;54:938–39.
55 Moysich KB, Freudenheim JL, Baker JA et al. Apolipoprotein E genetic polymorphism, serum lipoproteins, and breast cancer risk. Mol Carcinog 2000;27:2–9.
56 Ford ES, Smith SJ, Stroup DF, Steinberg KK, Mueller PW, Thacker SB. Homocyst(e)ine and cardiovascular disease: a systematic review of the evidence with special emphasis on case-control studies and nested case-control studies. Int J Epidemiol 2002;31:59–70.
57 Homocysteine Lowering Trialists’ Collaboration. Lowering blood homocysteine with folic acid based supplements: meta-analysis of randomized controlled trials. BMJ 1998;316:894–98.
58 Brattström L, Wilcken DEL. Homocysteine and cardiovascular disease: cause or effect? Am J Clin Nutr 2000;72:315–23.
59 Ueland PM, Refsum H, Beresford SAA, Vollset SE. The controversy over homocysteine and cardiovascular risk. Am J Clin Nutr 2000;72:324–32.
60 Clarke R. An updated review of the published studies of homocysteine and cardiovascular disease. Int J Epidemiol 2002;31:70–71.
61 Schnyder G, Roffi M, Flammer Y, Pin R, Hess OM. Effect of homocysteine-lowering therapy with folic acid, vitamin B12 and vitamin B6 on clinical outcome after percutaneous coronary intervention. The Swiss Heart Study: a randomized controlled trial. JAMA 2002;288:973–79.
62 Schnyder G, Roffi M, Flammer Y et al. Decreased rate of coronary restenosis with lowering of plasma homocysteine levels. N Engl J Med 2001;345:1593–600.
63 Vermeulen E, Stehouwer C, Twisk J et al. Effect of homocysteine lowering treatment with folic acid plus vitamin B6 on progression of subclinical atherosclerosis: a randomized placebo controlled trial. Lancet 2000;355:517–22.
64 Brattström L, Wilcken D, Öhrvik J, Brudin L. Common methylenetetrahydrofolate reductase gene mutation leads to hyperhomocysteinemia but not to vascular disease: the result of a meta-analysis. Circulation 1998;98:2520–26.
65 The Homocysteine Studies Collaboration. Homocysteine and risk of ischemic heart disease and stroke. A meta-analysis. JAMA 2002;288:2015–22.
66 Klerk M, Verhoef P, Clarke R et al. MTHFR 677C→T polymorphism and risk of coronary heart disease. A meta-analysis. JAMA 2002;288:2023–31.
67 Clarke R, Lewington S, Donald A et al. Underestimation of the importance of homocysteine as a risk factor for cardiovascular disease in epidemiological studies. J Cardiovasc Risk 2001;8:363–69.
68 Scholl TO, Johnson WG. Folic acid: influence on the outcome of pregnancy. Am J Clin Nutr 2000;71(Suppl.):1295S–303S.
69 MRC Vitamin Study Research Group. Prevention of neural tube defects: Results of the Medical Research Council vitamin study. Lancet 1991;338:131–37.
70 Czeizel AE, Dudás I. Prevention of the first occurrence of neural-tube defects by periconceptional vitamin supplementation. N Engl J Med 1992;327:1832–35.
71 Botto LD, Yang Q. 5,10-Methylenetetrahydrofolate reductase gene variants and congenital anomalies: a HuGE Review. Am J Epidemiol 2000;151:862–77.
72 Steegers-Theunissen RPM, Boers GHJ, Trijbels FJM et al. Maternal hyperhomocysteinemia: a risk factor for neural tube defects? Metabolism 1994;43:1475–80.
73 Rosenquist TH, Finnell RH. Genes, folate and homocysteine in embryonic development. Proc Nutr Soc 2001;60:53–61.
74 Mills JL, McPartlin MJ, Kirke PN et al. Homocysteine metabolism in pregnancies complicated by neural-tube defects. Lancet 1995;345:149–51.
75 Andersson A, Hultberg B, Brattstrom L, Isaksson A. Decreased serum homocysteine in pregnancy. Eur J Clin Chem Clin Biochem 1992;30:377–79.
76 Anonymous. Prevention of neural tube defects: results of the Medical Research Council Vitamin Study. MRC Vitamin Study Research Group. Lancet 1991;338:131–37.
77 Vollset SE, Refsum H, Irgens LM, Emblem BM, Tverdal A, Gjessing HK. Plasma total homocysteine, pregnancy complications, and adverse pregnancy outcomes: the Hordaland Homocysteine Study. Am J Clin Nutr 2000;71:962–68.
78 Marugame T, Tsuji E, Kiyohara C et al. Relation of plasma folate and methylenetetrahydrofolate reductase C677T polymorphism to colorectal cancer. Int J Epidemiol 2003;32:64–66.
79 Fallon UB. Colon cancer, folate and genetic status. Int J Epidemiol 2003;32:67–70.
80 Fenech M. The role of folic acid and Vitamin B12 in genomic stability of human cells. Mutat Res 2001;475:57–67.
81 Houlston RS, Tomlinson IPM. Polymorphisms and colorectal tumor risk. Gastroenterology 2001;121:282–301.
82 Le Marchand L, Donlon T, Hankin JH, Kolonel LN, Wilkens LR, Seifried A. B-vitamin intake, metabolic genes, and colorectal cancer risk (United States). Cancer Causes Control 2002;13:239–48.
83 Franco RF, Simões BP, Tone LG, Gabellini SM, Zago MA, Falcão. The methylenetetrahydrofolate reductase C677T gene polymorphism decreases the risk of childhood acute lymphocytic leukaemia. Br J Haematol 2001;115:616–18.
84 Skibola CF, Smith MT, Kane E et al. Polymorphisms in the methylenetetrahydrofolate reductase gene are associated with susceptibility to acute leukemia in adults. Proc Natl Acad Sci USA 1999;96:12810–15.
85 Zingg JM, Jones PA. Genetic and epigenetic aspects of DNA methylation on genome expression, evolution, mutation and carcinogenesis. Carcinogenesis 1997;18:869–82.
86 Marmot M. Commentary: Reflections on alcohol and coronary heart disease. Int J Epidemiol 2001;30:729–34.
87 Bovet P, Paccaud F. Commentary: Alcohol, coronary heart disease and public health: which evidence-based policy? Int J Epidemiol 2001;30:734–37.
88 Klatsky AL. Commentary: Could abstinence from alcohol be hazardous to your health? Int J Epidemiol 2001;30:739–42.
89 Shaper AG. Editorial: alcohol, the heart, and health. Am J Public Health 1993;83:799–801.
90 Hart CL, Davey Smith G, Hole DJ, Hawthorne VM. Alcohol consumption and mortality from all causes, coronary heart disease, and stroke: results from a prospective cohort study of Scottish men with 21 years of follow up. BMJ 1999;318:1725–29.
91 Rimm E. Commentary: Alcohol and coronary heart disease—laying the foundation for future work. Int J Epidemiol 2001;30:738–39.
92 Eriksson CJP, Fukunaga T, Sarkola T et al. Functional relevance of human ADH polymorphism. Alcohol Clin Exp Res 2001;25:157S–63S.
93 Agarwal DP. Genetic polymorphisms of alcohol metabolizing enzymes. Pathol Biol 2001;49:703–09.
94 Whitfield JB, Nightingale BN, Bucholz KK, Madden PAF, Heath AC, Martin NG. ADH Genotypes and alcohol use and dependence in Europeans. Alcohol Clin Exp Res 1998;22:1463–69.
95 Hines LM, Stampfer MJ, Ma J et al. Genetic variation in alcohol dehydrogenase and the beneficial effect of moderate alcohol consumption on myocardial infarction. N Engl J Med 2001;344:549–55.
96 Tadel M, Goldman D. Pharmacogenetics of alcohol response and alcoholism: the interplay of genes and environmental factors in thresholds for alcoholism. Drug Metabolism and Disposition 2001;29:489–94.
97 Sun F, Tsuritani I, Yamada Y. Contribution of genetic polymorphisms in ethanol-metabolizing enzymes to problem drinking behavior in middle-aged Japanese men. Behav Genet 2002;32:229–36.
98 Nakamura Y, Amamoto K, Tamaki S et al. Genetic variation in aldehyde dehydrogenase 2 and the effect of alcohol consumption on cholesterol levels. Atherosclerosis 2002;164:171–77.
99 Haskell WL, Camargo C, Williams PT et al. The effect of cessation and resumption of moderate alcohol intake on serum high-density-lipoprotein subfractions. N Engl J Med 1984;310:805–10.
100 Burr ML, Fehily AM, Butland BK et al. Alcohol and high-density-lipoprotein cholesterol: a randomized controlled trial. Br J Nutr 1986;56:81–86.
101 Cherry N, Mackness M, Durrington P et al. Paraoxonase (PON1) polymorphisms in farmers attributing ill health to sheep dip. Lancet 2002;359:763–64.
102 D’Errico A, Taioli E, Chen X, Vineis P. Genetic metabolic polymorphisms and the risk of cancer: a review of the literature. Biomarkers 1996;1:149–73.
103 Hein DW, Doll MA, Fretland AJ et al. Molecular genetics and epidemiology of the NAT1 and NAT2 acetylation polymorphisms. Cancer Epidemiol Biomark Prev 2000;9:29–42.
104 Vineis P, McMichael AJ. Interplay between heterocyclic amines in cooked meat and metabolic phenotype in the aetiology of colon cancer. Cancer Causes Control 1996;7:479–86.
105 Roberts-Thomson I, Ryan PR, Khoo K et al. Diet, acetylator phenotype and risk of colorectal neoplasia. Lancet 1996;347:1372–74.
106 Colhoun H, McKeigue PM, Davey Smith G. Problems of reporting genetic associations with complex outcomes. Lancet 2003, in press.
107 Tybjaerg-Hansen A, Agerholm-Larsen B, Humphries SE, Abildgaard S, Schnohr P, Nordestgaard BG. A common mutation (G_455A) in the β-Fibrinogen promoter is an independent predictor of plasma fibrinogen, but not of ischaemic heart disease. J Clin Invest 1999:3034–39.
108 Van der Bom JG, De Maat MPM, Bots ML et al. Elevated plasma fibrinogen. Cause or consequence of cardiovascular disease? Arterioscler Thromb Vasc Biol 1998;18:621–25.
109 Doggen CJM, Bertina RM, Manger Cats V, Rosendaal FR. Fibrinogen polymorphisms are not associated with the risk of myocardial infarction. Br J Haematol 2000;110:935–38.
110 Danesh J, Collins R, Appleby P, Peto R. Association of fibrinogen, C-reactive protein, albumin, or leukocyte count with coronary heart disease. JAMA 1998;279:1477–82.
111 Brunner E, Davey Smith G, Marmot M, Canner R, Beksinska M, O’Brien J. Childhood social circumstances and psychosocial and behavioural factors as determinants of plasma fibrinogen. Lancet 1996;347:1008–13.
112 Antithrombotic Trialists’ Collaboration. Collaborative meta-analysis of randomised trials of antiplatelet therapy for prevention of death, myocardial infarction, and stroke in high risk patients. BMJ 2002;324:71–86.
113 Anand SS, Yusuf S. Oral anticoagulant therapy in patients with coronary artery disease: a meta-analysis. JAMA 1999;282:2058–67.
114 The Coronary Drug Project Research Group: Clofibrate and niacin in coronary heart disease. The coronary drug project research group. JAMA 1975;231:360–80.
115 Meade T, Zuhrie R, Cook C, Cooper J. Bezafibrate in men with lower extremity arterial disease: randomised controlled trial. BMJ 2002;325:1139.
116 Ebrahim S, Davey Smith G, McCabe C et al. Cholesterol and coronary heart disease: screening and treatment. Qual Health Care 1998;7:232–39.
117 Scientific Steering Committee on behalf of the Simon Broome Register Group. Risk of fatal coronary heart disease in familial hypercholesterolaemia. BMJ 1991;303:893–96.
118 Hallman DM, Boerwinkle E, Saha N et al. The apolipoprotein E polymorphism: a comparison of allele frequencies and effects in nine populations. Am J Hum Genet 1991;49:338–49.
119 Frikke-Schmidt R, Nordestgaard BG, Agerholm-Larsen B, Schnohr P, Tybjaerg-Hansen A. Context-dependent and invariant associations between lipids, lipoproteins, and apolipoproteins and apolipoprotein E genotype. J Lipid Res 2000;41:1812–22.
120 Keavney BD, Youngman LD, Palmer A et al. Large-scale test of hypothesized associations between polymorphisms of lipid-related genes and myocardial infarction in about 5000 cases and 6000 controls. Circulation 2000;102(Suppl.II):852.
121 Davignon J, Gregg RE, Sing CF. Apolipoprotein E polymorphism and atherosclerosis. Arteriosclerosis 1988;8:1–21.
122 Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of cholesterol lowering with simvastatin in 20 536 high-risk individuals: a randomised placebo-controlled trial. Lancet 2002;360:7–22.
123 Smith J. Apolipoproteins and aging: emerging mechanisms. Ageing Research Reviews 2002;1:345–65.
124 Eichner JE, Dunn ST, Perveen G, Thompson DM, Stewart KE, Stroehla BC. Apolipoprotein E polymorphism and cardiovascular disease: a HuGE Review. Am J Epidemiol 2002;155:487–95.
125 Wilson PWF, Schaefer EJ, Larson MG, Ordovas JM. Apolipoprotein E alleles and risk of coronary disease. Arterioscl Thromb Vasc Biol 1996;16:1250–55.
126 http://www.ncbi.nlm.nih.gov/Omim (accessed 12 Dec 2002).
128 http://hgvbase.cgb.ki.se/about.htm (accessed 12 Dec 2002).
129 Langlois MR, Delanghe JR, De Buyzere ML, Bernard DR, Ouyang J. Effect of haptoglobin on the metabolism of vitamin C. Am J Clin Nutr 1997;66:606–10.
130 Delanghe J, Langlois M, Duprez D, De Buyzere M, Clement D. Haptoglobin polymorphism and peripheral arterial occlusive disease. Atherosclerosis 1999;145:287–92.
131 Delanghe J, Cambier B, Langlois M et al. Haptoglobin polymorphism, a genetic risk factor in coronary artery bypass surgery. Atherosclerosis 1997;132:215–19.
132 De Bacquer D, De Backer G, Langlois M, Delanghe J, Kesteloot H, Kornitzer M. Haptoglobin polymorphism as a risk factor for coronary heart disease mortality. Atherosclerosis 2001;157:161–66.
133 Cardon LR, Bell JI. Association study designs for complex diseases. Nature Rev: Genetics 2001;2:91–99.
134 Wacholder S, Rothman N, Caporaso N. Population stratification in epidemiologic studies of common genetic variants and cancer: quantification of bias. J Natl Cancer Inst 2000;92:1151–58.
135 Wacholder S, Rothman N, Caporaso N. Counterpoint: Bias from population stratification is not a major threat to the validity of conclusions from epidemiological studies of common polymorphisms and cancer. Cancer Epidemiol Biomark Prev 2002;11:513–520.
136 Cardon LR, Palmer LJ. Wagging the dog? Population stratification and spurious allelic association. Lancet 2003;361:598–604.
137 Sijbrands EJG, Westengorp RGJ, Defesche JC, De Meier PHEM, Smelt AHM, Kastelein JJP. Mortality over two centuries in large pedigrees with familial hypercholesterolaemia: family tree mortality study. BMJ 2001;322:1019–23.
138 Pimstone SN, Sun X-M, Du Souich C, Frohlich JJ, Hayden MR, Soutar AK. Phenotypic variation in heterozygous familial hypercholesterolemia. Arterioscler Thromb Vasc Biol 1998;18:309–15.
139 Tabor HK, Risch NJ, Myers RM. Candidate-gene approaches for studying complex genetic traits: practical considerations. Nat Rev Genetics 2002;3:391–97.
140 Stephens JC, Schneider JA, Tanguay DA et al. Haplotype variation and linkage disequilibrium in 313 human genes. Science 2001;293:489–93.
141 Osier MV, Pakstis AJ, Soodyall H et al. A global perspective on genetic variation at the ADH genes reveals unusual patterns of linkage disequilibrium and diversity. Am J Hum Genet 2002;71:84–99.
142 Glebart WM. Databases in genomic research. Science 1998;282:659–61.
143 Waddington CH. Canalization of development and the inheritance of acquired characteristics. Nature 1942;150:563–65.
144 Wilkins AS. Canalization: a molecular genetic perspective. BioEssays 1997;19:257–62.
145 Rutherford SL. From genotype to phenotype: buffering mechanisms and the storage of genetic information. BioEssays 2000;22:1095–105.
146 Gibson G, Wagner G. Canalization in evolutionary genetics: a stabilizing theory? BioEssays 2000;22:372–80.
147 Hartman JL, Garvik B, Hartwell L. Principles for the buffering of genetic variation. Science 2001;291:1001–04.
148 Debat V, David P. Mapping phenotypes: canalization, plasticity and developmental stability. Trends in Ecology and Evolution 2001;16:555–61.
149 Kitami T, Nadeau JH. Biochemical networking contributes more to genetic buffering in human and mouse metabolic pathways than does gene duplication. Nature Genet 2002;32:191–94.
150 Gu Z, Steinmetz LM, Gu X, Scharfe C, Davis RW, Li W-H. Role of duplicate genes in genetic robustness against null mutations. Nature 2003;421:63–66.
151 Morange M. The Misunderstood Gene. Cambridge: Harvard University Press, 2001.
152 Bolon B, Galbreath E. Use of genetically engineered mice in drug discovery and development: wielding Occam’s razor to prune the product portfolio. Int J Toxicol 2002;21:55–64.
153 Gerlai R. Gene targeting: technical confounds and potential solutions in behavioural and brain research. Behavioural Brain Research 2001;125:13–21.
154 Williams RS, Wagner PD. Transgenic animals in integrative biology: approaches and interpretations of outcome. J Appl Physiol 2000;88:1119–26.
155 Garry DJ, Ordway GA, Lorenz JN, Radford ER, Chin RW, Grange R et al. Mice without myoglobin. Nature 1998;395:905–08.
156 Gabriel SE, Jaakimainen L, Bombardier C. Risk for serious gastro-intestinal complications related to use of nonsteroidal anti-inflammatory drugs: a meta-analysis. Ann Intern Med 1991;115:787–96.
157 Warner TD, Giuliano F, Vojnovic I, Bukasa A, Mitchell JA, Vane JR. Nonsteroid drug selectivities for cyclo-oxygenase-1 rather than cyclo-oxygenase-2 are associated with human gastrointestinal toxicity: A full in vitro analysis. Proc Natl Acad Sci USA 1999;96:7563–68.
158 Langenbach R, Morham SG, Tiano HF et al. Prostaglandin synthase gene disruption in mice reduces arachidonic acid-induced inflammation and indomethacin-induced gastric ulceration. Cell 1995;83:483–92.
159 De Witt D, Smith WL. Yes, but do they still get headaches? Cell 1995;83:345–48.
160 Peskar BM. Role of cyclooxygenase isoforms in gastric mucosal defense. Journal of Physiology—Paris 2001;95:3–9.
161 Gretzer B, Maricic N, Respondek M, Schuligoi R, Peskar BM. Effects of specific inhibition of cyclo-oxygenase-1 and cyclo-oxygenase-2 in the rat stomach with normal mucosa and after acid challenge. British Journal of Pharmacology 2001;132:1565–73.
162 Wright AF, Carothers AD, Campbell H. Gene-environment interactions—the BioBank UK study. Pharmacogenomics Journal 2002;2:75–82.
163 Weiss K, Terwilliger J. How many diseases does it take to map a gene with SNPs? Nat Genet 2000;26:151–57.
164 Erichsen HC, Eck P, Levine M, Chanock S. Characterization of the genomic structure of the human vitamin C transporter SVCT1 (SLC23A2). J Nutr 2001;131:2623–27.
165 Perera FP. Environment and cancer: who are susceptible? Science 1997;278:1068–73.
166 Mucci LA, Wedren S, Tamimi RM, Trichopoulos D, Adami HO. The role of gene-environment interaction in the aetiology of human cancer: examples from cancers of the large bowel, lung and breast. Journal of Internal Medicine 2001;249:477–93.
167 Chen J, Stampfer MJ, Hough HL et al. A prospective study of N-acetyltransferase genotype, red meat intake, and risk of colorectal cancer. Cancer Res 1998;58:3307–11.
168 Kampman E, Slattery ML, Bigler J et al. Meat consumption, genetic susceptibility, and colon cancer risk: a United States multi-center case-control study. Cancer Epidemiol Biomarkers Prev 1999;8:15–24.
169 Welfare MR, Cooper J, Bassendine MF, Daly AK. Relationship between acetylator status, smoking, and diet and colorectal cancer risk in the north-east of England. Carcinogenesis 1997;18:1351–54.
170 Randomized trial of cholesterol lowering in 4444 patients with coronary heart disease: the Scandinavian Simvastatin Survival Study (4S). Lancet 1994;344:1383–89.
171 Heart Protection Study Collaborative Group. MRC/BHF Heart Protection Study of cholesterol lowering with simvastatin in 20 536 high-risk individuals. Lancet 2002;360:7–22.
172 Shepherd J, Cobbe SM, Ford I et al. for the West of Scotland Coronary Prevention Study Group. Prevention of coronary heart disease with pravastatin in men with hypercholesterolemia. N Engl J Med 1995;333:1301–07.
173 Bateson W. Mendel’s Principles of Heredity. Cambridge: Cambridge University Press, 1909.
174 Fisher RA. Has Mendel’s work been rediscovered? Annals of Science 1936;1:115–37.
175 Sapp J. The nine lives of Gregor Mendel. In: Le Grand HE (ed.) Environmental Enquiries. London: Kluwer Academic Publishers, 1990, pp. 137–66.