An outline account is given of the work of nine major figures working mostly in the earlier two-thirds of the 20th century. Some comments are included about their personal characteristics.
This essay is far from a systematic history of statistics or even of statistical theory over a defined period. Rather, it is a personal comment on a number of major figures concerning their scientific contributions, and also recalling them as individuals with their inevitable foibles. These latter aspects, about which I have written quite frankly, are of course subject to appreciable errors of observation, especially so for recalled conversation reported here in direct speech.
The account focuses primarily on nine individuals. Two, R. A. Fisher (1890–1962) and H. Jeffreys (1891–1989), I heard lecture on various occasions but knew only slightly at a personal level. The others, M. S. Bartlett (1910–2002), J. Neyman (1894–1981), E. S. Pearson (1895–1980), F. Yates (1902–1994), L. J. Savage (1917–1971), H. E. Daniels (1912–2000) and J. W. Tukey (1915–2000), I met relatively frequently over a period of time. Inevitably the paper is rather Anglocentric; indeed, it is to some extent based on a triangle with one vertex in Gower Street, London, at University College, another vertex in what was then called the village of Harpenden, about 30 miles north-northwest of London, the home of Rothamsted Experimental Station, and the third vertex in Cambridge, a town about 60 miles north-northeast of London.
2. R. A. Fisher: a key figure
R. A. Fisher was in a real sense the driving figure behind a high proportion of developments in statistics during the earlier two-thirds of the 20th century. He read mathematics at Cambridge in the period just before the influence of G. H. Hardy succeeded in dragging British analysis, and the theory of functions, into line with continental European thinking. Thus, while Fisher was a very powerful mathematician, there is, so far as I know, no trace of a regularity condition in any of his papers. In an introduction to Fisher’s work in Sankhyā in 1938, P. C. Mahalanobis wrote: ‘The explicit statement of a rigorous argument interested him, but only on the important condition that such explicit demonstration of rigour was needed. Mechanical drill in the technique of rigorous statement was abhorrent to him, partly for its pedantry, and partly as an inhibition to the active use of the mind. He felt it was more important to think actively, even at the expense of occasional errors from which an alert intelligence would soon recover, than to proceed with perfect safety at a snail’s pace along well-known paths with the aid of the most perfectly designed mechanical crutches’ (Mahalanobis, 1938). This could be Fisher himself speaking!
At that earlier time, the term combination of observations was used particularly by physicists and astronomers to describe a mixture of numerical analysis and statistics, the normal distribution of errors, the notion of probable or standard errors, and the idea of least-squares fitting. Fisher took such a course from the astronomer Airy, and while Fisher’s initial interests were in a different direction, the course was presumably a strong initial influence on him.
Karl Pearson, in his period of intense activity up to 1914, had made extensive use of the correlation coefficient but had been unable to get beyond finding a standard error for the estimate. Fisher found the exact normal-theory distribution of the estimate. Difficulties arose when Fisher submitted the work to Biometrika, marking the start of a long period of bad relations between the two men, which persisted even beyond K. Pearson’s death; Fisher later dismissed him in print as a ‘peevish and intolerant old man’.
Fisher was first appointed to Rothamsted to analyse the long series of crop yields that had been collected on the experimental farm there. His interests rapidly expanded both into theoretical issues and, strongly related to those, into developments of new methods, set out in the revolutionary Statistical Methods for Research Workers (Fisher, 1925b), as well as, at about the same time, fundamental work on experimental design, somewhat expanded later in The Design of Experiments (Fisher, 1935b). Towards the end of this period, he had also established his reputation as a geneticist with his mathematical formulation of the theory of natural selection, The Genetical Theory of Natural Selection.
The emphasis in Fisher’s work was on concepts and also on obtaining solutions to difficult problems in usable form. In particular, he used geometrical arguments to powerful effect. With the exception of a major paper ‘On the mathematical foundations of theoretical statistics’ (Fisher, 1922) and parts of the book Statistical Methods and Scientific Inference (Fisher, 1956), his general theoretical developments tend to be in a series of isolated components. For instance, the very short 1930 paper, tantalizingly entitled ‘Inverse probability’, essentially introduced confidence distributions for a single parameter, in effect nested sets of what were later called confidence intervals. This paper very probably led Fisher on to what was almost certainly a major error, the idea that these distributions could for several parameters be manipulated like probability distributions, leading to the notion of fiducial distributions.
Fisher is, of course, remembered as much for his work in genetics as in statistics, and his final formal position was as Professor of Genetics at Cambridge University. Several very prominent figures in the field of genetics began their careers under Fisher’s guidance and encouragement.
Fisher wrote beautiful if rather old-fashioned English. Someone once commented that some of his sentences needed reading very slowly three times, paying careful attention to the presence and absence of commas, and that if then the meaning was still unclear it must be because Fisher did not want it to be clear.
3. R. A. Fisher: some personal aspects
In my limited personal observation, Fisher was unusual in seeming not to like being asked questions, perhaps particularly on issues where he knew he might be on shaky ground.
When Harold Hotelling came from the U.S.A. to Rothamsted in around 1930, there being a suggestion that they should collaborate on a theoretical book centring on recent developments, Fisher was very welcoming. After Hotelling had settled in Harpenden and turned up for his first day of work, he asked Fisher for time for discussion to clarify some points in Fisher’s recently published paper on the exact distribution of the squared multiple correlation coefficient, a pinnacle in the use of geometrical arguments to solve complex distributional problems; but Fisher’s reply was ‘No’. Hotelling later said that it took him almost a year to understand Fisher’s derivation. There are a number of possible explanations of this incident, but no suggestion whatever that personal animosity was involved.
Other later incidents are harsher. Just before David Finney was leaving Oxford for Aberdeen in 1954, he invited Fisher to give what I think was the last talk in his seminar series. It was known that Fisher was working on a book setting out his views on statistical inference, and Dennis Lindley, who unusually for the time had a car, very kindly offered to make what was then the long drive between Cambridge and Oxford, 80 miles each way in the days when, for the U.K. at least, motorways had not yet been invented. He took me and two doctoral students, Ewan Page and Wally Smith, to hear what Fisher had to say; it would, apparently, not have been feasible to ask Fisher to speak at the Statistical Laboratory in Cambridge. The four of us sat silently in the back while Fisher was very rude to a young questioner about what was then called Fieller’s problem and dealt sarcastically with a question from J. M. Hammersley about whether fiducial probability satisfied Kolmogorov’s axioms. More than 50 years after that event, David Finney spoke to me about the occasion, saying that a group of people had come over from Cambridge especially to disrupt the seminar. I reassured David that we had been totally quiet throughout and that the disruptions, or what were really enquiring questions, originated locally.
On another occasion, when a young student asked an innocent question about the inversion involved in fiducial probability, Fisher was dismissively insulting.
Fisher could, however, be extremely generous to colleagues in providing important ideas and encouraging them to publish under their own names. Although it is hard to be sure, I think that one rather extreme example was the Wishart distribution, the elegant derivation of which has all the signs of Fisher’s style. Wishart, it is said, accepted without consulting Fisher an offer to move from being Fisher’s deputy at Rothamsted to a position as Reader in the School of Agriculture at Cambridge. This unleashed a stream of hostility from Fisher that continued for the next 25 years and extended to activities in Cambridge associated with Wishart, who, it must be stressed, had a very beneficial influence on statistics at Cambridge both before and especially after the Second World War. He kept a reasonable balance between theory and application in the Statistical Laboratory.
4. Harold Jeffreys
Harold Jeffreys was known to Cambridge undergraduates in the early 1940s as the man who discovered, or indeed some said invented, the Earth. This was on the basis of his 1924 book The Earth (Jeffreys, 1924), which had revolutionized or even given birth to modern geophysics. While Jeffreys had from the start of his career an interest in the role of uncertainty in drawing conclusions, it seems possible that it was the challenging problems of interpreting observational data in geophysics that encouraged Jeffreys into developing his interests in the nature of statistical inference, leading eventually to Theory of Probability (Jeffreys, 1939); there were revised editions of the book in 1948 and 1961, and it is still in print.
By probability Jeffreys meant objective degree of belief; he reserved the word chance for notions based on physical frequencies of occurrence. Jeffreys was thus continuing in the spirit of Laplace, and his development in particular provided a rigorous formal basis for improper, flat, priors and suggested the appropriate such prior for representing initial ignorance in various special problems. His book went on to develop posterior distributions for many interesting special problems, not numerically very different from confidence intervals, and also significance tests that required much more information than is used in a Fisherian-type simple significance test and which thus yielded, given the quite strong injection of further information, stronger answers. Although not noted at the time, there may, however, be difficulties if the dimension of the parameter space is large. Jeffreys’s notation strongly suggested links to Riemannian geometry with a metric defined by information, an idea that was not developed in any detail until much later.
In the earlier discussion of these issues, differences between Jeffreys and Fisher were apparent, but relations between the two remained reasonably cordial; Jeffreys described himself at one point as Fisher’s most disobedient student.
Jeffreys’s statistical work, highly influential though it was, formed a quite small proportion of his whole research and was, I think, dismissed to a certain extent by at least some of his mathematical physicist colleagues. He published on many other topics, perhaps most notably the very wide-ranging Methods of Mathematical Physics written with his wife Bertha Swirles Jeffreys, herself a mathematical physicist (Jeffreys & Jeffreys, 1946). This book combined mathematical rigour with an insistence on respecting physical motivation. The notion of Lebesgue integration was therefore rejected as being too remote from the motivating notion of area under a curve, and Laplace transforms as a device for solving ordinary differential equations with constant coefficients were also rejected because they involved the unnecessary assumption that the equations hold for all nonnegative $$ t $$.
As a student my first lectures on statistics were given by Jeffreys. As a wartime realization of the importance of statistics, he had been asked to give a short, virtually compulsory course to mathematics undergraduates. By general agreement Jeffreys was a very poor lecturer and manifestly did not at all enjoy the experience. He spoke quietly with a Northumbrian accent and his writing on the blackboard was not ideal. An exception was a masterly relatively informal lecture he once gave after dinner to a student society. It was on Markov chains, based I think on a draft of a section of the Jeffreys and Jeffreys book. I may be naïve in my recollection, but I think in those days the importance and interest of the subject matter and the style of presentation were considered two different aspects of a course of lectures, and that it was really the former that mattered to students.
Jeffreys was a fellow of the same Cambridge college as P. A. M. Dirac, one of the pioneers of quantum mechanics. After a formal college dinner Dirac said to one of their colleagues: ‘I sat next to Jeffreys at dinner and he didn’t say a word all evening.’ About ten minutes later Jeffreys went up to the same man and said: ‘I sat next to Dirac at dinner and he didn’t say a word all evening.’ The story is entirely believable.
Jeffreys lived to nearly 100 years. In his early to mid-90s he still cycled around Cambridge, having a series of minor and not-quite-so-minor accidents. In the end Bertha took the tyres off his bike.
5. M. S. Bartlett: a wide-ranging pioneer
Maurice Bartlett, when either a final-year undergraduate or a postgraduate student, went to Wishart’s lectures on statistics. As Wishart put it: ‘When I discussed my distribution I suggested it might be derived by characteristic functions. The next day a student, Bartlett, handed me the proof. I made sure he was an author of the resulting publication.’
Bartlett was at first sight an austere figure, not a natural communicator. I recall inviting him to give a seminar when I was at Birkbeck College, London. ‘As you will know from page so-and-so of my book’, he began, or at least that is my recollection. I had had the foresight to study the page in question and so understood at least a little something of the first ten minutes of the talk, but after that joined what I think was the rest of the audience. Yet I am quite sure this was not deliberate one-upmanship. Bartlett was a kind, thoughtful and caring man.
He had been penalized for daring to question some of Fisher’s views on inference in the 1930s. Bartlett introduced a number of key ideas, such as modifying the likelihood before applying maximum likelihood techniques, as well as wide-ranging special results and major contributions to time series analysis and multivariate analysis. His interest in applications was extensive and deep, from the foundations of quantum mechanics to, especially later, mathematical biology, and he wrote an important paper in the philosophical literature on the nature of probability. Some time in the late 1940s, he started to write with J. E. Moyal an account of stochastic processes, then in a phase of especially rapid development. Moyal was to deal with, I think, the connections with theoretical physics but never produced his part, and eventually Bartlett published his own account which had, I believe, been finished appreciably earlier. It is surely one of the almost-forgotten masterpieces of our field (Bartlett, 1955).
Bartlett, with H. E. Daniels, regarded the study of special stochastic models motivated by applications as an intrinsic part of statistics.
He had a friend in the late 1930s who married and emigrated to Canada. About 25 or more years later she was widowed and returned to the U.K., and Maurice, who had remained single, married her. They were a delightful couple, Sheila cheerful and outgoing and Maurice with his dignified but obviously very happy near-silence.
6. Frank Yates
Frank Yates’s contributions were deep and wide ranging but more directly focused on specific applications. He too read mathematics at Cambridge and, after a short period, worked for what was in those days called the Colonial Office organizing a land survey of the Gold Coast, now Ghana. During this period he developed a deep mastery of least squares, interestingly making no use of matrix algebra. When he returned to the U.K. he was appointed to be Fisher’s deputy at Rothamsted, replacing Wishart. There were three phases to Yates’s long career there. In the first he contributed major developments to the theory of experimental design, to factorial experiments and to various ramifications of incomplete block designs. The second phase, motivated I think by the perceived need for surveys of agricultural production as the Second World War became imminent, was characterized by a deep interest in sampling theory and methods. After the end of the war he was very early to see the potential of computers for statistical work and was the driving force behind important developments emanating from Rothamsted.
After his retirement he was appointed a Visiting Professor in the Department of Mathematics at Imperial College. He never gave a lecture there, I hope by mutual agreement, but was unfailingly helpful, bringing a distinctive viewpoint to colleagues and doctoral students over numerous matters.
7. Fisher and Yates
Fisher and Yates were a formidable pair, greatly admired for their massive contributions, with Yates in no sense subservient to Fisher. They took little part in general activities of the Royal Statistical Society, apart from both serving as President for a period. They did, however, take a deep interest in the British Region of the Biometric Society. In the late 1940s to 1950s the Region met three times a year between 2:00 and 5:30 pm to hear two papers, each of one hour’s duration plus discussion. Shortly after 2:15 pm Fisher and Yates would enter, having had a pub lunch, and would walk slowly to two seats in the front row; Fisher had poor eyesight and Yates was quite deaf. The poor speaker was often unclear whether to stop, review the first portion of the lecture, or what. Yates’s deafness meant he spoke rather loudly and sometimes private comments about the talk from Yates to Fisher, which were by no means necessarily favourable, echoed around the room. Those were hard times.
In 1951 Yates published an assessment of the book Statistical Methods for Research Workers, celebrating the 25th anniversary of its publication (Yates, 1951). He lauded its high originality and strong impact but was critical of what he regarded as Fisher’s overemphasis on significance tests at the expense of estimation, a theme with a very contemporary resonance. The preface to the 1938 book of tables edited by Fisher and Yates contains much new material.
8. Jerzy Neyman
Jerzy Neyman initially worked in his native Poland. In 1923 he published a very original paper setting out a simple model of unit-treatment additivity in experimental design, in an agricultural context specifically. This was not translated from Polish into English until nearly 70 years later (Neyman, 1990). It has the distinction of being quite highly cited nowadays but presumably had little or no short-term impact. At the time Neyman was unaware of randomization as a technique of design and its consequences for analysis. Later, in 1935, he took the theme up in more detail with the consequences described below. In the mid to late 1920s Neyman visited London, I think quite regularly, to work with Egon Pearson to develop their theory of testing hypotheses. The initial object of this work was to clarify some of the ideas in Fisher’s Statistical Methods for Research Workers, and for some years relations with Fisher were entirely reasonable; however, they gradually deteriorated and collapsed completely after the 1935 paper discussed next. Neyman made central contributions to the theory of sampling.
In 1935 Neyman, in a paper written jointly with some Polish colleagues and presented to a section of the Royal Statistical Society (Neyman, 1935), considered the randomization-based analysis of the Latin square under a very general formulation of his original model that did not require unit-treatment additivity. His conclusion that error estimation in the Latin square was biased incensed Fisher, as is clear even from the formal report of the discussion, and this was the breaking point in relations between the two men. There followed in Biometrika a wide-ranging discussion of randomization but not a resolution of this particular issue. The topic was taken up in the 1950s by Kempthorne, who had left Rothamsted for Ames, Iowa, and Neyman’s conclusion was formally confirmed (Kempthorne, 1955). This led to further discussion after which, so far as can be seen, it was agreed that algebraically Neyman was right but he was addressing the wrong question, and that Fisher was right in claiming relevance of the standard Latin square analysis. Interestingly, essentially the same issue has recurred in some recent discussions and the mistakes of the past repeated. The issue is in part that main effects are rarely very relevant in the presence of serious interaction.
Neyman was, through the Berkeley Symposia in particular, deeply interested in applications on a wide front, bringing in leading workers from a range of fields for intensive discussion. I recall with great pleasure attending two of the symposia. One of Neyman’s characteristics was an insistence on mathematical rigour and precision of statement at all stages, in marked contrast with Fisher’s attitude noted above. A possibly oversimplified view is that rigour of this sort may sometimes be appropriate for general theory but can be a source of unnecessary effort and delay when it comes to dealing in a novel way with specific applied problems. I recall in my own case, after giving a paper to one of the symposia, Neyman saying something like ‘quite interesting but there must be some regularity conditions somewhere’. This was true enough but perhaps not the prime aspect of the work. On another occasion, at a seminar in the department I had used a Dirac delta function. ‘We don’t do that sort of mathematics here,’ said Neyman rather severely. I rather think that a year or two later it became respectable after a visit from an expert in generalized functions. As others have described, mention of Fisher was virtually forbidden.
In 1961 Neyman wrote two closely related papers in Japanese journals, titled ‘The 25th anniversary of my quarrel with R. A. Fisher’ and ‘Silver jubilee of my dispute with Fisher’ (Neyman, 1961). The common theme is that he recognized Fisher’s extreme ability at distributional calculations but considered him weak on concepts. This may seem misjudged in that the list of fruitful concepts introduced by Fisher is formidable: sufficiency, large-sample statistical information, conditional inference, analysis of variance, factorial experiments, randomization, discriminant analysis, and so on! Indeed, later Neyman (1967) wrote a careful and measured assessment describing Fisher as ‘a great scholar’, singling out not only Fisher’s immense skills at distributional calculations but also what Neyman called the theory of experimentation, that is, the notions around randomization and other techniques of experimental design.
Sadly, Neyman’s Berkeley Symposia ended after a political disagreement with some of his colleagues.
Constance Reid, in her biography of Neyman (Reid, 1982), has given a fascinating account of his life and work. There were considerable tensions within the department at Berkeley, starting in the early 1950s, which it would be out of place to discuss here, and Neyman undoubtedly evoked strong feelings in some other senior U.S. statisticians outside Berkeley.
A small incident may illustrate the situation. I recall that a party of about a hundred of us were at dinner in a restaurant, probably connected with one of the symposia. Mr Neyman, as he preferred to be called, welcomed the guests in a warm little speech and then announced that the menu was either steak or fish; vegetarians were rare in those days. ‘Who wants steak?’ Not a single hand went up. I expressed surprise to my neighbour that no one at all wanted steak. My neighbour replied, ‘but it’s Friday and Mr Neyman would not be pleased.’
In 1981 Neyman was due to speak in Poland at the European Meeting of Statisticians held in Wrocław. He died a month or so before the meeting and instead of his paper a memorial session was held. Two eminent speakers gave eloquent but somewhat impersonal eulogies of Neyman. The third speaker was F. N. David, who had known Neyman since working with K. Pearson in the 1930s; she had emigrated to California in around 1960. She said, in what I found a very moving tribute: ‘I knew him over 50 years. He could be quite impossible and we quarrelled strongly every six months or so. But I loved him.’
9. Egon Pearson
E. S. Pearson, son of Karl Pearson, did major work in the mid-1920s on robustness, although he did not use the word. He was, I think, concerned that some of the methods in Statistical Methods for Research Workers depended too strongly on normality assumptions. Fisher circumvented these concerns by an appeal to randomization arguments, backed up by a single numerical example! Most notably, E. S. Pearson began a highly fruitful period of collaboration with Neyman which continued over the next ten years or so. Some time before Neyman left for Berkeley in 1938, relations cooled; I do not know why.
The joint work of Neyman and Pearson forms the centrepiece of many textbook accounts of statistical theory, so detailed references are not given here. A flavour of the superficially relaxed pace of the times can be gained from their first paper (Neyman & Pearson, 1928).
In about 1930 E. S. Pearson spent some time at Bell Labs and became much interested in industrial quality control. Upon returning to London he played a major role in forming the Industrial, Agricultural and Research Section of the Royal Statistical Society, the focus for dragging that society into the 20th century. He was active in the British Standards Institution and prepared the very detailed British Standard 600. All or nearly all copies were destroyed in one of the early air raids of 1940, and it was rapidly replaced by BS600R, a much shorter and more practically oriented version compiled by two industrial workers, Dudding and Jennett. This was very influential during the Second World War, when the great majority of U.K. industry was in effect under government control and quality control standards could be rigorously enforced.
E. S. Pearson was a kind and dignified figure, saddened and perhaps bruised, I think, by the quarrels that swirled around his father, Fisher and Neyman. Biometrika had remained very firmly in K. Pearson’s control and rather little of permanent interest had been published in it after the First World War. On K. Pearson’s death, E. S. Pearson published a long appreciation of his father and then totally changed the emphasis of the journal, making it a prime place for publication of contemporary research, which it has remained.
10. Jimmie Savage
L. J. (Jimmie) Savage came from a quite different background, having had mathematical training to a high level and been much influenced by Wald’s emphasis on decision theory; Wald himself had been killed in an aircraft accident in India. Earlier Savage had served as a research assistant to von Neumann. Savage visited Cambridge in 1953, bringing with him the typescript of his book The Foundations of Statistics (Savage, 1954). In this book the emphasis was on personalistic degree of belief, developing and much extending earlier discussions by F. P. Ramsey, B. de Finetti and I. J. Good. The draft was read by Frank Anscombe, Dennis Lindley and me. Dennis, who had become very interested in Jeffreys’s approach to statistics, found Savage’s personalistic treatment very appealing and became an exceptionally eloquent and original developer and advocate of the theory. I recall finding the book fascinating but ultimately unconvincing, at least as a basis for the type of applied statistical work in which I had been involved, a judgement that I think was correct. The approach placed prime emphasis on the internal consistency of an individual, You, not on connection with the real world or on communication of conclusions to others.
Savage initially took the view that the personalistic view generated interesting ideas well worth exploring, as was surely true, but this developed fairly rapidly into a more exclusive approach in which words such as honesty and dishonesty figured prominently. This caused Savage considerable difficulties at the University of Chicago, where he was originally based. Savage was a powerful lecturer, eloquent, swift and formidable in private and public discussion. He had extremely poor eyesight and, while this may seem fanciful, the combination of intellectual strength and physical limitation used to bring the parallel of Beethoven to my mind.
At a meeting at Birkbeck College, London, in 1960 there was a wide-ranging discussion of the issues involved in formulating a theory of statistical inference, centring on an account by Savage of his views. After the meeting I recall Savage saying to me: ‘The motto of the Bayesian approach should be the song “Anything you can do, I can do better”.’ I did not reply.
11. Some main themes: a broad assessment
With one omission, Fisher, Jeffreys, Yates, Bartlett, Neyman, Pearson and Savage represent the main themes of statistical theory over the period under discussion. The omitted figure is Abraham Wald. He came from an econometric background and, influenced perhaps by von Neumann and Morgenstern’s Theory of Games and Economic Behavior (von Neumann & Morgenstern, 1944), sought to cast the whole of statistical theory in decision-theoretic terms. Despite the importance of specific decision-making problems, such as health screening and sampling inspection, most statistical problems, even if they have some decision-making element, do not fit easily into that formulation.
While of course many others contributed, to an appreciable extent current ideas on statistical inference stem from the work of the seven men mentioned above. Many of the issues that distinguish their approaches retain current relevance.
One initial base for comparing their contributions is their formal attitude towards the meaning of probability. At first sight there are broadly three such views: a stable frequency in a repetitive system, an objective degree of belief, and a personalistic degree of belief. Neyman took the first view and Jeffreys a mixture of the first two, while Savage strongly emphasized the third and Bartlett took an eclectic view.
The third, personalistic, approach is not an empirically based theory of how individual scientists behave in their beliefs about their subject field, a fascinating topic for empirical enquiry; rather, it is a specification of how beliefs are to be organized by an individual in order to be coherent, that is, internally consistent. Despite the undoubted interest of this approach, it seems relatively remote from the objectives of much statistical work because it is not sufficiently firmly anchored in the real world. Important initial formulations were due to the philosopher F. P. Ramsey in 1926, B. de Finetti later and, most importantly from a statistical perspective, I. J. Good, who strongly related his discussion to empirical statistical issues. As noted above, Jimmie Savage became a very influential figure in this setting.
The essence of Neyman’s, and to a lesser extent perhaps Egon Pearson’s, approach was to use the notion of long-run behaviour of procedures to characterize the uncertainty of conclusions. This follows the broad principle that measuring devices are calibrated by their performance when used. In his formal writing, albeit with less emphasis in his applied work, Neyman seemed not to accept the notion that the formal conclusion of a procedure could be applied to a specific case. The broad approach is often described as frequentist.
Fisher is often thought of as frequentist in his thinking, but this is rather misleading. He strongly emphasized that when probability was used to describe what underlay a set of data, he did not have in mind probability as a limiting frequency over a large number of repetitions. Rather, by probability Fisher meant a proportion in a hypothetical infinite population, the data being regarded as a random sample from that hypothetical population. This in particular allowed the associated methods to be applied to situations, such as studies of literary authorship, in which direct replication of the data was inconceivable.
Fisher’s work had many strands, focused on the revolutionary ideas in Statistical Methods for Research Workers and the associated concepts on experimental design. One such strand had a very high level of originality in providing solutions to challenging specific problems and, in particular, in resolving underlying issues in distributional theory. Fisher’s broad approach, emphasizing the variety of depths of formulation that may be appropriate, is best seen from his last book, Statistical Methods and Scientific Inference (Fisher, 1956), although it is not really a systematic development from scratch. Key sources include the wide-ranging 1922 paper, the first part of which stresses concepts such as likelihood and sufficiency, a development in 1925 introducing ancillary statistics (Fisher, 1925a), a 1930 paper on interval estimation, further studies of the likelihood in 1934, and a discussion of the $2\times 2$ contingency table in 1935 that has led to seemingly never-ending controversy (Fisher, 1935a).
Jeffreys, as already noted, specifically distinguished chance from probability. The estimation procedures he developed typically differ appreciably from those derived from other approaches only when the dimension of the parameter space is high. For significance tests and tests of hypotheses, the approaches of Fisher, Jeffreys, and Neyman and Pearson correspond, however, to considerably different formulations, and failure to distinguish between these has been responsible for apparently endless confusion.
In essence, for Fisher only the null hypothesis need be explicitly defined, and if the underlying distribution is discrete, the probabilities under the null hypothesis order the possible sample points according to their consistency with the null hypothesis. Arguably, while in discrete cases this often works well enough, it does seem to be a conceptual error, as illustrated for example by Spearman’s rank correlation coefficient: there the ordering in terms of null probability is highly irregular. Some external notion of ranking the sample points in order of consistency with the null hypothesis is required. Neyman and Pearson initially described their work as essentially a clarification of Fisher’s ideas, but in principle it was different in at least two respects: first, Neyman and Pearson required an explicit formulation of an alternative hypothesis; second, they required specification of the so-called Type I error rate, the probability of rejecting the null hypothesis when true. Jeffreys in a sense took this a step further by requiring prior probabilities for the various possibilities. Fisher and Jeffreys were focused on assessing evidence, whereas, formally at least, Neyman and Pearson took a more decision-based approach, in their theoretical work if not necessarily in applications. The explicit theory of the Neyman–Pearson formulation was encapsulated later in two masterly books by Erich Lehmann. Interestingly, in his final book, Fisher, Neyman, and the Creation of Classical Statistics, Lehmann (2011) recorded his regret that the personal tension mentioned above had essentially cut him off from Fisher’s work, and indeed Savage made a similar point.
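The irregularity mentioned for Spearman’s coefficient is easily exhibited by direct enumeration. The following sketch, not part of the original essay, tabulates the exact null distribution of Spearman’s statistic $S$, the sum of squared rank differences, for $n = 4$ observations: under the null hypothesis every permutation of the ranks is equally likely, and the resulting probabilities are not monotone in $S$, so ordering sample points by null probability alone conflicts with ordering by strength of association.

```python
from itertools import permutations
from collections import Counter
from math import factorial

# Exact null distribution of Spearman's S = sum of squared rank
# differences for n = 4: under the null hypothesis all 4! = 24
# permutations of the ranks are equally likely.
n = 4
ranks = range(1, n + 1)
counts = Counter(
    sum((i - p) ** 2 for i, p in zip(ranks, perm))
    for perm in permutations(ranks)
)

# The counts for S = 0, 2, 4 are 1, 3, 1: the null probability of
# S = 4 is smaller than that of S = 2, so ordering sample points by
# null probability would call S = 4 'more extreme' than S = 2, even
# though S = 2 corresponds to the stronger rank correlation.
total = factorial(n)
for s in sorted(counts):
    print(s, counts[s], counts[s] / total)
```

The same enumeration for larger $n$ shows the irregularity persisting, which is the sense in which some external ordering of the sample points, rather than the null probabilities themselves, is needed.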
Although Yates wrote one essentially critical paper on fiducial probability, he focused his work more on applications. For example, he developed a very systematic account of the design of experiments with many major innovations, in a sense stemming from Fisher’s paper of 1926 (Fisher, 1926).
12. Henry Daniels: another major figure
The statisticians discussed above all, in their different ways, had a major interest in and focus on the broad principles of statistical inference, and this makes connected comment on their impact easier. Another important figure, whose interests were much less focused on formal inference, was Henry Daniels. He read mathematics at Edinburgh and was, I think, influenced by Sir Edmund Whittaker, a major figure with wide-ranging interests. Towards the end of his time as an undergraduate, Henry Daniels published a short paper on a pure mathematical topic. After a period at Cambridge, he moved in 1936 to the Wool Industries Research Association in Leeds. This type of organization, common in those days, was supported partly by the government and partly by a compulsory levy on the industries. It undertook a mixture of specific applied investigations for industry and fundamental research. Henry developed a dual role there. He was a very skilled and imaginative experimental physicist, and ran what was called the fibre measurement lab, where he developed definitive and delicate methods of sampling and measuring such properties as fibre diameter and fibre length. He was, I believe, the moving force behind other important and definitive measurement techniques developed there. At the same time Henry persuaded his research colleagues, who largely consisted of physicists and chemists, that careful use of statistical methods of design and analysis was crucial in dealing with such highly variable material as wool.
His first words to me when I went to work under him in 1946 were to the effect that there was this topic called stochastic processes and it was essential to get to grips with it. To help with this he gave me four papers to read, interestingly not by the great Soviet and French pioneers, but three from the physics literature and one on stationary processes in the Bell System Technical Journal. In addition to his scientific breadth, Henry was a very powerful mathematician, strong especially in applying and extending techniques of asymptotic analysis to problems of applied probability. Some of his early work on strength problems of textile systems later turned out to be fundamental to the study of fibre composite materials. He had an astonishing ability for lucid description of complex problems and their mathematical formulation, followed by superficially simple and elegant resolution of seemingly insoluble issues. I know from personal experience that the simplicity did not transfer when one tried to do something similar.
While the formal theory of statistical inference was not Henry’s primary interest, he lectured on that theme with clarity, elegance and originality, introducing for example the concept of asymptotic sufficiency. He was pressed by a number of colleagues to write the lecture notes up as a book but declined, saying modestly that all he was doing was to make Maurice Bartlett’s ideas a bit more comprehensible.
At a conference in 2000 in Wales, Henry talked enthusiastically at dinner about the results he was to present the next day, but he was taken ill during the night and did not recover consciousness.
I have not attempted to describe his work in any depth, because it lies largely outside the main theme of this essay. Nevertheless, Henry Daniels’s work was of the very highest calibre, and his impact on the subject through students and colleagues was, I believe, very broad and deep.
13. John Tukey
Of all the statisticians discussed in this essay, John Tukey is the most difficult to write about, partly but not only because he and his wife, Elizabeth, showed my wife and me great kindness, starting from when we first disembarked from the ship bringing us to the U.S.A. in 1955. He and S. S. Wilks were professors of statistics at Princeton, and John also played a major role in various U.S. government activities and held a key position at Bell Labs, then a wonderful institution, only later to be massacred at the behest of economic dogma. His scientific knowledge and interests seemed all-embracing.
At the time one of his concerns was fiducial probability, and when he visited the U.K. briefly he went to see Fisher to get clarification. I recall asking him soon after his return to Princeton how he had fared: ‘The old boy blew a fuse and threw me out,’ John said. Incidentally, John was a powerfully built man. I suspect that this experience persuaded him that the foundational aspects of statistical inference were best put to one side, and although his interests remained astonishingly broad he soon shifted his primary concerns to exploratory data analysis. This was aimed at redressing the overemphasis, particularly but not only in the U.S.A., on the formal aspects of statistical inference at the expense of empirical studies. The work was, I think, a little hindered by John’s fondness for inventing new words, sometimes very successfully, as with ‘bits’.
His lectures could be disconcerting. A long time could go by with nothing very tangible being said, and then there would be golden nuggets easily missed.
Some 20 or more years later, he unexpectedly came to see me at home in London one Sunday. What was I working on? I told him; it was something that he was highly unlikely to have known about. After a few moments of thought he made ten suggestions. Six I had already considered, two would clearly not work, and the other two were strong ideas which had not occurred to me even after long thought on the topic. This small incident illustrates one of his many strengths: the ability to comment searchingly and swiftly on a very wide range of issues.
14. A less Anglocentric perspective
Of course, the issues involved in the work discussed above are wholly international, while this essay is primarily U.K.-based. Two very influential and very different figures from outside the U.K. were Harald Cramér (1893–1989) in Sweden and Anders Hald (1913–2007) in Denmark.
Cramér had done major work in actuarial mathematics but was known primarily as a probabilist. In the mid-1940s he published, first in Sweden and then in the U.S.A., a very influential exposition of modern probability theory, leading on to a careful account of the developments of Fisher and of Neyman and Pearson. He was a dignified figure of authority and a fine lecturer. He became head of the entire Swedish university system.
Hald had industrial experience and his special interests were in industrial sampling and the history of statistics. He built up the very strong tradition of statistical teaching and research in Denmark and was extremely influential in promoting the subject throughout Europe. He too wrote an important book, with an emphasis on engineering applications and giving a lucid combination of basic theory and accounts of analysis of variance and other statistical methods. This was at a time when the very small number of statistics books published were mostly concerned with explaining in nonmathematical terms the methods emanating from Statistical Methods for Research Workers.
During this period strong work was also produced in the communist countries, particularly in the Soviet Union, developing there from the long history of Russian work in probability theory. Kolmogorov was a towering figure with very wide interests, including specific applications and concern with the more philosophical side of probability. Henry Daniels, Joe Gani and I met him once in a private apartment in Budapest, but communication was difficult; he spoke no English and our command of French and German, let alone Russian, was poor to nonexistent, so we were unable to persuade him to visit the U.K., despite giving the obvious reassurances. Boris Gnedenko had limited English too, but this did not prevent him making a forceful contribution in mathematical and general terms when he came on a short visit to Newcastle and London. One of his many earlier contributions was a definitive discussion of the limiting distribution of extreme values, a topic which Fisher had earlier resolved by his characteristically informal arguments.
In the German Democratic Republic there was distinctive work done both in probability theory and in statistics, and in Hungary the very strong mathematical tradition flourished, with Alfréd Rényi being a prominent contributor to topics ranging from order statistics to the axiomatic basis of probability, connecting with Jeffreys’s use of improper priors. One Polish emphasis, perhaps reflecting the residual impact of Neyman, was on agricultural statistics. Throughout the Cold War many, whatever their personal political opinions, thought it very important both in general terms and scientifically to have good relations with fellow scientists in the Eastern Bloc and Soviet Union. In the U.S.A. Neyman was, I believe, very insistent on having Soviet scientists attend the Berkeley Symposia.
I shall resist the temptation to list the names of the many others who made major contributions in the period described. All those mentioned made massive contributions to our field, and the comments above are far from a complete assessment of their work. Because of the informal nature of the present paper, it seems out of place to give full references for all the issues discussed; this would in any case be a formidably long list. Hald (2006) provides a thorough account of the history up to 1935. Lehmann (2011) gives much careful comparison of the work of Fisher and Neyman and provides comprehensive bibliographies, and Box (1978) and Reid (1982) contain biographies of these two titans. Thorough accounts of the lives of eight of the key figures discussed here are available in Biographical Memoirs of Fellows of the Royal Society.
I thank Valerie Isham and Anthony Davison for encouragement and advice, Clare Kavanagh of Nuffield College Library for her resourceful help, and the referees for constructive suggestions.