The first edition of this book was favourably reviewed in 2011, including in Significance and in the Journal of the Royal Statistical Society, Series A, where it was commended for its

‘comprehensive discussion about methods for data representation and graphical display’.

The new edition's cover quotes another review:

‘It is clearly written in plain language and the inclusion of R code is particularly useful’.

If those were so, something has gone dramatically wrong in preparing this new edition.

One cause may have been

‘the use of ggplot2 in addition to the base graphics and lattice packages of the first edition’,

compounded with a

‘minor glitch discovered in release 3.4.1 of R when new figures for the second edition were being drafted’.

These would not, however, excuse a dense and unintelligible rush through some theories of ‘the grammar of graphics’ and software design considerations leading to the unhelpful position

‘The choice between the base and lattice packages will be decided by ease of use. There will be periodic examples [through the book] where similar graphs will be produced by ggplot2. Readers can decide whether the ggplot2 code is to be preferred’.

Nor do the changes explain some slipshod and perverse terminology. Summarizing a data set through standard summary statistics is referred to as ‘data reduction, also known as data compression’ (page 9). Never in my hearing, and the latter term has a quite different resonance within computer science.

‘The primary purpose is descriptive for graphical displays that depict distributions of the population [my italics] of values for a single variable’.

(page 34). Is that really so? I think that most graphs display a sample.

The aspect ratio (simple ratio height/width) is bizarrely confused with the golden ratio (a constant), leading to an equation showing the golden ratio as its inverse followed by the advice,

‘The goal is to achieve a Golden Ratio equal to one’.

(page 24). This is compounded by

‘It is awkward to deal with a height dimension multiplied by an irrational number’

(well, we work with finite computers) and

‘One simplification would be to prepare a statistical graph that is 50% wider than tall. [aspect ratio 2/3] …. It is left to the reader to decide whether this simplification is as aesthetically pleasing in comparison with the Golden Section [sic]’.
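To make the distinction concrete, here is a minimal sketch in Python. It is not taken from the book, and the function and variable names are illustrative assumptions. The aspect ratio is a property of a particular graph, whereas the golden ratio is a fixed constant.

```python
import math

# The golden ratio is a fixed constant, (1 + sqrt(5)) / 2, roughly 1.618.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def aspect_ratio(height, width):
    """Aspect ratio as defined in this review: height divided by width."""
    return height / width


# A graph that is 50% wider than tall (the book's suggested simplification).
simplified = aspect_ratio(height=2.0, width=3.0)

# A graph whose width is the golden ratio times its height.
golden = aspect_ratio(height=1.0, width=GOLDEN_RATIO)

print(f"golden ratio        = {GOLDEN_RATIO:.3f}")
print(f"its inverse         = {1 / GOLDEN_RATIO:.3f}")
print(f"golden aspect ratio = {golden:.3f}")
print(f"2/3 aspect ratio    = {simplified:.3f}")
```

On this reckoning the inverse of the golden ratio is about 0.618 and the 50%-wider graph has an aspect ratio of about 0.667. Neither the constant nor its inverse can be made ‘equal to one’; only the aspect ratio of a given graph can be chosen.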

The evaluation of statistical graph design is proposed as ‘technical merit and artistic impression’ (page 9), an attitude that I utterly reject as arbitrary and subjective. Although the Golden Ratio (sic) is suggested as a graphical statistic in Chapter One, it seems to have been ignored throughout the rest of the book. All the figures are reproduced at full page width, generally occupying more than half a page in height. Fig. 2.1, for example, introducing the dot chart, is set up in a window of aspect ratio 1 but with half the width used for (oversized) category labels. As a result, the plotting area has an aspect ratio of 1.7 and the graphic occupies about 125 cm² to display 14 values. Pairs of graphs meant for comparison differ in overall size and in fount size (pages 94 and 95). Scatter plots and scatter plot matrices are drawn with each pane either square (page 366) or rectangular (pages 367, 404 and 486), regardless of content or meaning.

Axis labels are software defaults, arbitrarily vertical or horizontal according to the package; ‘labels’ does not even feature in the index. Axis titles are even more rudimentary; many axes are simply ‘Percent’ (of what?). Outliers receive one brief mention as a feature of boxplots. There seem to be no suggestions to add explanatory text or to manipulate features within graphs.

What seems lacking in all the examples is any idea of progression; in my school, you should plot a graph, learn about the data from it and improve the graph to emphasize the lesson learned. All that comes out of the examples is ‘this is one version; this is another; you choose’. Many are Aunt Sallies: a pie chart is drawn with 14 segments, whereas the clear usual advice is to use a maximum of seven in this form. (Keen says six; I have also seen five; it is a judgement.) Chernoff faces I thought long dead.

Perhaps the book can be commended as a source of code for its examples, but it disappoints even here:

‘The value [xaxs=] “r” is rarely, if ever, used when plotting graphical displays for this textbook as the results can be less than satisfactory’.

(page 38): in what way? Why is it there? Code for each graph is presented with no comments, no indentations for function calls split across lines and no thought for the semantics at line ends. If you read code as a computer does, you will be fine.
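For readers wondering what the setting does: as I understand R's documentation for par(), xaxs = "r" extends the data range by 4% at each end before tick positions are chosen, whereas xaxs = "i" uses the data range as it stands. A rough sketch of that arithmetic in Python follows. The function name is hypothetical, the code is not from the book, and R's subsequent choice of ‘pretty’ tick labels is omitted.

```python
def axis_limits(data_min, data_max, style="r"):
    """Approximate the axis limits implied by R's xaxs styles.

    style="r" (regular): pad the data range by 4% at each end.
    style="i" (internal): use the data range unchanged.
    This is an illustrative sketch, not R's actual implementation.
    """
    if style == "i":
        return (data_min, data_max)
    if style == "r":
        # 4% of the data range is added below the minimum
        # and above the maximum.
        padding = 0.04 * (data_max - data_min)
        return (data_min - padding, data_max + padding)
    raise ValueError(f"unknown axis style: {style!r}")


# Compare the two styles for data running from 0 to 100.
print(axis_limits(0.0, 100.0, style="r"))
print(axis_limits(0.0, 100.0, style="i"))
```

With style "r", data running from 0 to 100 get limits of roughly -4 and 104, so points at the extremes are not drawn on the frame of the plot; with "i" the limits are exactly 0 and 100.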

Finally, I should mention the other additions that increase this book by 20% over the first edition. Each chapter has a section of ‘Learning Outcomes’. Keen helpfully comments in the Preface that for

‘many professions, reporting course learning outcomes is a requirement of the process of review’.

So it might help your continuing professional development for the Royal Statistical Society. An appendix on vision contains irrelevant material on history, physics and anatomy before a description of colour vision variations and a taste of perceptual psychology. A second appendix describes in detail alternative colour-generating mechanisms and, in the last sentence of text, offers firm practical advice: having seen your graph on the computer screen,

‘when saving a color graphic to be printed afterwards on paper, pass the argument phrase colormodel = “cmyk” so that the image is converted … when saved’.

What is the learning outcome from this book? It has taught me a lesson.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)