Sebastian Dietz, R Visualizations – Derive Meaning from Data, Journal of the Royal Statistical Society Series A: Statistics in Society, Volume 184, Issue 1, January 2021, Pages 401–402, https://doi.org/10.1111/rssa.12640
‘R is one of the most popular scripting languages in the field of Data Science. It provides a set of inbuilt functions and libraries to visualize data and statistical models’ for better clarity. The approach is more technical than that of commercial tools and languages such as SAS or Tableau, as there is no drag-and-drop editor. Gerbing's book is a deep dive into solutions for visual representation that does not require prior R-specific knowledge. Although it has a clear technical focus, it follows a didactic approach to data analysis, starting with definitions of continuous and categorical variables, continuing with bar charts and boxplots, and moving on to more complex topics such as time series decomposition. Topics are introduced in clear language that makes little use of mathematical terms and formulas. Fundamental statistical terms are explained in side notes. Examples are provided as R code, and the resulting graphics are shown as well. The book can be seen as an update and extension of an earlier one by the same author (Gerbing, 2014).
The text mostly uses the R package ggplot2, one of the most popular packages, which includes many built-in features and allows users to generate graphics without much programming knowledge. It is compared with lessR, a similar and widely used package built and maintained by the author. The comparison of code and results, with the differences highlighted, runs through all chapters. The different software design approaches are explained at the beginning; most notably, ggplot2 is more flexible in coding, whereas lessR also includes statistical functions, for example a full regression analysis. If a visualization problem is not covered by either of the two packages, solutions from other packages are explained, e.g. the package vcd for generating mosaic plots. Finally, embedding graphics in the Shiny extension (a graphical interface that is currently popular in the R scene) is discussed, as is the creation of interactive graphics.
It is not only a textbook for undergraduate students of statistics, data or computer science and other quantitative subjects; it may also be useful for professionals working on specific tasks of data representation. Its clear structure, not overly technical language and hands-on examples make it easy to read. On a few topics, the discussion of ggplot2 and lessR becomes academic, for example the discussion of the visual advantages of a doughnut chart over a pie chart. Another notable aspect is that this is currently one of the very few textbooks that present an alternative to the package ggplot2.