The classical twin study, wherein the similarity of monozygotic (MZ) twins is compared with that of dizygotic (DZ) twins to estimate the heritability of a trait (proportion of total variance due to genetic factors), is one of the most widely used and powerful designs available to the (genetic) epidemiologist. Its origin is frequently attributed to the Victorian polymath Francis Galton and to his 1875 article in particular.1 However, although the article shows Galton in all his dazzling eclecticism and grappling with nebulous concepts of genetic and environmental influence, a close reading shows that he had not grasped the distinction between MZ and DZ twins that gives the method its incisive power. In addition, even if he had, it would be another 43 years before R.A. Fisher2 developed the quantitative genetic theory that would enable numerical estimates of heritability.

Galton’s predicament in almost seeing the whole story but being thwarted by one critical piece of missing theory is reminiscent of his cousin Charles Darwin’s in Origin of Species.3 Darwin realized that a mechanism of inheritance was critical to his theory of natural selection and put forward his own theory of blending inheritance, which was fundamentally flawed because it halved genetic variation in every generation. The correct theory of particulate inheritance (which maintained variation from one generation to the next) was not published by Gregor Mendel until 1865, but its significance was not appreciated until 35 years later. Analogously, Galton fumbled towards the fundamental distinction of DZ (multiple ovulation) and MZ (zygotic fission) twins—‘The reader will understand that the word “twins” is a vague expression which covers two very dissimilar events; the one corresponding to the progeny of animals that have usually more than one young at birth, and the other corresponding to those double-yolked eggs that are due to two germinal spots in a single ovum.’ But just as we think he has ‘got it’, he veers away with: ‘Twins may be divided into three groups, so distinct that there are not many intermediate instances; namely, strongly alike, moderately alike, and extremely dissimilar.’

In retrospect, we see that Galton too was hampered by ignorance of Mendel’s unheralded article, and by the lack of its later elaboration by Fisher, which came some 7 years after Galton’s death in 1911. Fisher developed quantitative genetic theory to show that sibling pairs, of which DZ twins are a special case, share on average half their genes. Recently, genome-wide genotyping of more than 11 000 sibling and DZ pairs has confirmed the accuracy of this theory and also the subsequent prediction that the standard deviation of gene sharing amongst siblings is about 4%, so that 2.5% of sibling pairs share >58% of genes identical by descent and another 2.5% share <42%.4 We can see then that even for traits with strong genetic determination, Galton’s ‘extremely dissimilar’ class comprises mainly the DZ twins sharing the fewest genes, whereas his ‘strongly alike’ class probably comprises a few DZ pairs sharing more genes than average plus his suspected ‘double-yolked’ twins, which we now know to be MZ. His ‘moderately alike’ class would be mainly DZ twins with a sprinkling of MZ twins in which accidents of gestation and birth or infectious disease have caused great discordance.
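The 58% and 42% cut-offs follow from treating the proportion of the genome shared identical by descent as approximately normal, with the mean of 0.5 and standard deviation of about 0.04 quoted above. The following is a minimal Python sketch of that arithmetic. The normal approximation and the helper name `central_interval` are assumptions made for illustration, not details taken from the cited study.

```python
from statistics import NormalDist

# Assumption: IBD sharing among sibling pairs is roughly normal,
# with the mean (0.5) and SD (about 0.04) quoted in the text.
ibd_sharing = NormalDist(mu=0.5, sigma=0.04)


def central_interval(dist, tail=0.025):
    """Return the bounds that cut off `tail` probability on each side."""
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


lower, upper = central_interval(ibd_sharing)
print(f"2.5% of pairs share less than {lower:.2f}")
print(f"2.5% of pairs share more than {upper:.2f}")
```

The bounds are simply 0.5 minus and plus about 1.96 standard deviations, which round to 0.42 and 0.58.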

Therefore, if Galton did not invent the twin method as it is now used, who did? The answer is not clear. The question was explored by Rende et al.,5 who concluded that it had jointly been discovered in the USA and Europe in 1924. However, more recent delving into the literature by Mayo6 and Teo and Ball7 concludes that its origins were earlier, primarily in the work of Weinberg,8 who clearly recognized that there were two types of twins and correctly postulated their aetiology, and of Poll,9 who recognized that the differences within sets of MZ twins and triplets must be due to environmental differences uncontaminated by genetic factors. Teo and Ball also draw attention to the unpleasant fact that German interest in twins was to a considerable extent driven by those who would later enthusiastically embrace ‘race hygiene’, culminating in the ghastly experiments by Mengele on twins in Auschwitz. This perversion of science is a legacy that those of us who spend our careers studying twins must both acknowledge and live with. Galton, who was a founder of the eugenics movement, is also sometimes pronounced guilty by retrospective association with the Nazi perversions, which occurred a generation after his death. The reader can judge whether this historicism is fair. What is indisputable is his amazing prescience in ‘The history of twins’, in which he anticipates many of the issues, and findings, tackled by complex trait geneticists for the next century in a wide range of behaviours and in diseases as diverse as asthma and schizophrenia (which Galton calls monomania).

As the distinction between MZ and DZ twins had not yet been made, what Galton could not articulate was the central plank of modern twin studies, known as the equal environments assumption (EEA), under which it is assumed that the same range of environments is acting equally on MZ and DZ twin pairs to produce similarities and differences within pairs. Because this assumption is so critical to the unbiased estimation of heritability, it has been both asserted and denied vehemently and, more usefully, subjected to a great deal of empirical investigation, some of it ingenious, including making use of twins whose zygosity has been mistaken.10–12 What seems clear from all this work is that to some extent MZ twins do experience more similar treatment than DZ twins, but that this is because their parents and other individuals respond to their greater genetic similarity rather than to some arbitrary notion that they should be treated more alike. Even in these cases, there is precious little evidence that any difference in treatment is related to differences in the phenotype of interest. All in all, after two generations of close scrutiny, the EEA seems to hold up remarkably well.
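Why the EEA matters for an unbiased estimate can be seen from the simplest form of the MZ-DZ comparison, usually attributed to Falconer, which this commentary does not spell out. Assuming standardized additive genetic variance `a2` and shared environmental variance `c2`, the expected correlations are r_MZ = a2 + c2 and r_DZ = a2/2 + c2, so twice their difference recovers `a2`. The sketch below uses made-up values and a hypothetical `extra_mz_env` term to show how trait-relevant environmental similarity specific to MZ pairs, if it existed, would inflate the estimate.

```python
def expected_correlations(a2, c2, extra_mz_env=0.0):
    """Expected twin correlations under a simple additive (ACE-style) model.

    a2: proportion of variance due to additive genetic factors
    c2: proportion due to environment shared equally by both twin types
    extra_mz_env: hypothetical trait-relevant environment shared only
                  by MZ pairs (zero when the EEA holds)
    """
    r_mz = a2 + c2 + extra_mz_env
    r_dz = 0.5 * a2 + c2
    return r_mz, r_dz


def falconer_h2(r_mz, r_dz):
    """Falconer's estimate: twice the MZ-DZ difference in correlation."""
    return 2.0 * (r_mz - r_dz)


# Illustrative values only, not estimates from any study.
true_a2, c2 = 0.6, 0.1
h2_eea_holds = falconer_h2(*expected_correlations(true_a2, c2))
h2_eea_violated = falconer_h2(
    *expected_correlations(true_a2, c2, extra_mz_env=0.05)
)
print(h2_eea_holds, h2_eea_violated)
```

With these values the estimate equals the true 0.6 when the assumption holds and rises by twice the MZ-only term, to about 0.7, when it does not.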

Doubts about the EEA have traditionally been the main reason for questioning the validity of estimates of heritability from twin studies. These doubts have recently been fuelled by estimates from genome-wide association studies (GWAS) of the total genetic variance in linkage disequilibrium with all the single nucleotide polymorphisms (SNPs) on commercial GWAS arrays. For a wide variety of traits and diseases, this SNP-associated variance is about half the heritability estimated from twin studies, and this deficit has been called the ‘missing heritability’.13,14 However, for height at least, when allowance is made for imperfect association between SNP markers and causal variants, and for the fact that commercial arrays contain only common, not rare, SNPs, the adjusted SNP-associated variance approaches the heritability estimated from twin studies.15 Moreover, making use only of the same densely genotyped 11 000 sibling pairs discussed earlier and regressing their genome-wide identity-by-descent on their difference in height, one arrives at an estimate of heritability of 86%, almost identical to that obtained from comparing the similarity of MZ and DZ twins.4 We tentatively conclude therefore that the deficit in current SNP-based estimates arises not because heritability is missing, but merely because it is hiding, awaiting more powerful molecular techniques and larger samples to reveal it.
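The sibling-pair approach needs no MZ-DZ comparison, because pairs that happen to share more of their genome should, on average, be more alike. One common way to set up such a regression takes the squared within-pair difference as the outcome and identity-by-descent (IBD) sharing as the predictor. This Haseman-Elston-style setup is assumed here for illustration and is not a description of the cited analysis. The sketch below runs it on simulated data, with unit trait variance, no shared environment and an arbitrary true heritability of 0.8. All function names are hypothetical.

```python
import random


def simulate_sib_pairs(n_pairs, h2, seed=1):
    """Simulate (ibd, squared_difference) for sibling pairs.

    Assumptions: trait variance is 1, no shared environment, and
    genome-wide IBD sharing is normal with mean 0.5 and SD 0.04.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_pairs):
        ibd = rng.gauss(0.5, 0.04)
        r = ibd * h2  # expected sib correlation given IBD sharing
        y1 = rng.gauss(0.0, 1.0)
        y2 = r * y1 + (1.0 - r * r) ** 0.5 * rng.gauss(0.0, 1.0)
        data.append((ibd, (y1 - y2) ** 2))
    return data


def ols_slope(pairs):
    """Ordinary least-squares slope of y on x for (x, y) tuples."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


# E[(y1 - y2)^2 | ibd] = 2 - 2 * h2 * ibd, so h2 = -slope / 2.
pairs = simulate_sib_pairs(n_pairs=200_000, h2=0.8)
h2_hat = -ols_slope(pairs) / 2.0
print(f"estimated h2: {h2_hat:.2f}")
```

Because IBD sharing varies so little between sibling pairs, the slope is estimated imprecisely. The sketch therefore simulates far more pairs than the roughly 11 000 mentioned above, and this simplified regression would be noticeably noisier at that sample size.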

What we can be sure of is that were Galton alive today, he would be amazed and delighted at the profusion of twin studies making use of modern molecular techniques—expression and methylation arrays, proteomics and whole-genome and methylation-sensitive sequencing—to elucidate the exact environmental and genetic causes of phenotypic variation and disease predisposition (for a review see van Dongen et al.16). In ‘The history of twins’1, Galton spends a lot of time exploring cases of dramatic twin discordance (we infer that he is talking about MZs). Modern studies of discordant MZ twins have already revealed striking differences in methylation patterns in certain genes. In some instances, it can be inferred that the gene is causal,17 but in others, it is not clear whether the epigenetic discordance is cause or effect of the disease state.18 New longitudinal studies are just beginning in which twins are traced from their first visualization in utero and tissues are sampled at regular intervals,19,20 and in coming decades, these will yield answers to the questions that Galton formulated, which have challenged five generations of researchers since.

Conflict of interest: None declared.

References

1. Galton F. The history of twins, as a criterion of the relative powers of nature and nurture. J Anthropol Inst 1875;5:391–406. (Reprinted in Int J Epidemiol 2012;41:905–11)
2. Fisher RA. The correlation between relatives on the supposition of Mendelian inheritance. Trans R Soc Edinb 1918;52:399–433.
3. Darwin C. On the Origin of Species. London, UK: John Murray, 1859.
4. Visscher PM, Macgregor S, Benyamin B, et al. Genome partitioning of genetic variation for height from 11,214 sibling pairs. Am J Hum Genet 2007;81:1104–10.
5. Rende RD, Plomin R, Vandenberg SG. Who discovered the twin method? Behav Genet 1990;20:277–85.
6. Mayo O. Early research on human genetics using the twin method: Who really invented the method? Twin Res Hum Genet 2009;12:237–45.
7. Teo T, Ball LC. Twin research, revisionism and metahistory. Hist Hum Sci 2009;22:1–23.
8. Weinberg W. Beiträge zur Physiologie und Pathologie der Mehrlingsgeburten beim Menschen. Pflügers Archiv für Gesamte Physiologie 1901;88:346–430.
9. Poll H. Über Zwillingsforschung als Hilfsmittel menschlicher Erbkunde. Zeitschrift für Ethnologie 1914;46:87–105.
10. Heath AC, Neale MC, Hewitt JK, Eaves LJ, Fulker DW. Testing structural equation models for twin data using LISREL. Behav Genet 1989;19:9–35.
11. Mitchell KS, Mazzeo SE, Bulik CM, Aggen SH, Kendler KS, Neale MC. An investigation of a measure of twins' equal environments. Twin Res Hum Genet 2007;10:840–47.
12. Martin N, Boomsma D, Machin G. A twin-pronged attack on complex traits. Nat Genet 1997;17:387–92.
13. Lee SH, Wray NR, Goddard ME, Visscher PM. Estimating missing heritability for disease from genome-wide association studies. Am J Hum Genet 2011;88:294–305.
14. Manolio TA, Collins FS, Cox NJ, et al. Finding the missing heritability of complex diseases. Nature 2009;461:747–53.
15. Yang J, Benyamin B, McEvoy BP, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet 2010;42:565–69.
16. van Dongen J, Slagboom EP, Draisma HHM, Martin NG, Boomsma DI. The continuing value of twin studies in the omics era. Nat Rev Genet (in press).
17. Oates NA, Van Vliet J, Duffy DL, et al. Increased DNA methylation at the AXIN1 gene in a monozygotic twin from a pair discordant for a caudal duplication anomaly. Am J Hum Genet 2006;79:155–62.
18. Javierre BM, Fernandez AF, Richter J, et al. Changes in the pattern of DNA methylation associate with twin discordance in systemic lupus erythematosus. Genome Res 2010;20:170–79.
19. Llewellyn CH, van Jaarsveld CHM, Johnson L, Carnell S, Wardle J. Nature and nurture in infant appetite: Analysis of the Gemini twin birth cohort. Am J Clin Nutr 2012;91:1172–79.
20. Saffery R, Morley R, Carlin JB, et al. Cohort profile: The peri/post-natal epigenetic twins study. Int J Epidemiol 2012;41:55–61.