Graphical data analysis uses graphics to reveal and display information in a data set. Graphical displays are useful for exploring structures in data, detecting outliers, identifying trends, spotting patterns, complementing statistical modelling and many other purposes. For readers who have not come across R before, it is a free software environment for statistical computing and graphics, and its graphics capabilities are among its greatest strengths. The book under review does an excellent job of discussing and showing how typical graphical data analysis tasks can be done with R.

Chapter 1 sets the scene, introducing the role and place of graphical data analysis in exploring data to discover information. Chapter 2 reviews the literature on graphics for data analysis and statistics, Web sites on data visualization and data sets that are used in the book. Chapter 3 looks at ways of displaying single continuous variables, mostly through histograms and boxplots. Chapter 4 looks at visualizing single categorical variables (nominal, ordinal or discrete), mostly by means of bar charts. Chapter 5 examines how a pair of continuous variables are related. It is all about scatter plots, smoothing them and grouping them into matrices (‘sploms’), where each variable is plotted against all the others. Chapter 6 considers visualization of multivariate continuous data, featuring parallel co-ordinate plots. These plots, in which all axes (one for each variable) are parallel to each other (as opposed to the two perpendicular axes in scatter plots), are discussed in the context of clustering, time series and indices data. Scaling, α-blending and other strategies that help to reveal outliers are also discussed. Chapter 7 presents ways of displaying combinations of categorical variables by using various types of mosaic plots, fluctuation diagrams, double-decker plots and other subgroup displays.
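To give a flavour of the displays described for Chapters 3, 5 and 6, here is a minimal sketch using ggplot2 and GGally. It is not taken from the book: the built-in iris data, the object names and the parameter values (bin width, transparency levels) are assumptions chosen only for illustration.

```r
library(ggplot2)
library(GGally)

# Histogram of a single continuous variable
p_hist <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.25)

# Scatter plot of a pair of continuous variables with a loess smoother;
# alpha = 0.5 makes overlapping points semi-transparent
p_scatter <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# Parallel co-ordinate plot of the four measurements, one axis per variable,
# with alpha-blended lines coloured by species
p_pcp <- ggparcoord(iris, columns = 1:4, groupColumn = "Species",
                    alphaLines = 0.3)

print(p_hist)
print(p_scatter)
print(p_pcp)
```

Lowering alphaLines lets dense bundles of lines show through as darker bands, so that unusual cases stand apart from the bulk of the data.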

Chapter 8 explores techniques and approaches that can be used for an initial overview of a data set. Here scatter plot matrices, heat maps, parallel co-ordinate plots, glyphs, mosaic plots, trellis graphics and other graphical displays considered earlier in the book are applied to various data sets. Chapter 9 discusses data quality issues, focusing on visualizing missing values and outliers, both univariate and multivariate. Chapter 10 shows how to make visual comparisons of common quantities (e.g. percentages) at different levels of categorical variables (e.g. male and female). Among the functions that are used in this chapter are coefplot::coefplot() and extracat::facetshade(). Chapter 11 discusses time series plots. Graphic displays for single and multiple, regular and irregular time series are discussed in detail. Chapter 12 applies ensembles of graphics (groups of static displays for presenting several aspects of a data set simultaneously) to various data sets. It also contains several case-studies to help readers to test their graphical skills. Chapter 13 contains notes on graphics systems in R, general graphical conventions in statistics and coding tips for R graphics. Chapter 14 concludes the book by summarizing strengths and weaknesses of graphical data analysis.

The book relies mainly on the package ggplot2, with the R packages lattice, gridExtra, GGally, extracat and vcd also used extensively throughout. Many more packages are used as the source of data for examples or exercises. The R code in the book was run using the package versions that were available at the end of 2014. Code chunks affected by subsequent changes in packages are said to be updated accordingly on the book’s Web site at http://www.gradaanwr.net. The package that accompanies the book, named GDAdata, is available from the Comprehensive R Archive Network.
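Readers who want to follow along could first check which of these packages are present on their system. The sketch below is an assumption about a convenient workflow, not something prescribed by the book; the variable names are hypothetical, and some of the packages may no longer be available from the Comprehensive R Archive Network in their 2014 form, so the install step is left commented out.

```r
# Packages named in the review
pkgs <- c("ggplot2", "lattice", "gridExtra", "GGally",
          "extracat", "vcd", "GDAdata")

# TRUE for each package that can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
print(missing_pkgs)

# Uncomment to attempt installation of whatever is missing
# install.packages(missing_pkgs)

# If the companion data package is present, list the data sets it provides
if (installed[["GDAdata"]]) {
  print(data(package = "GDAdata")$results[, "Item"])
}
```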

Overall, the book is a very good introduction to the practical side of graphical data analysis using R. The presentation of R code and graphics output is excellent, with colours used when required. The book appears to be free of typographical and other errors, and its index is useful. Also, the book is well written and neatly structured. I enjoyed reading the book and can recommend it to anyone who wants to learn more about their data through graphics using R. It will also be a valuable asset for a library and as part of an undergraduate course in applied statistics.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)