Book Review – Lattice: Multivariate Data Visualization with R by Deepayan Sarkar (Springer 2008)

July 19th, 2009

This book by Deepayan Sarkar, the author of the lattice package for R, introduces this implementation of the Trellis graphics system and follows up with a large range of examples of frequently used graphics. The book is divided into three parts: it starts with the basics, moves on to taking greater control of the graphics system, and finishes with a brief discussion of extending the lattice package.

Overall the book is easy to read, with many examples covering different types of display. The author has a website where the figures can be browsed, which is a good online complement to the book. Many of the examples are based on multi-panel displays, one of the strengths of Trellis graphics, which allow multivariate data to be investigated with lower dimensional plots.

Part I starts with an overview of using lattice to produce graphical displays of some of the data sets frequently used for teaching R. These examples show how patterns in multivariate data can be visualised on a static display. A couple of the examples are based on stacked bar charts, which make it harder to compare groups than a dot plot does and are characteristic of the chart junk produced by various other software systems. The next chapter looks at summarising univariate data and provides reasonable coverage of the standard graph types, including box and whisker plots, density plots, histograms and quantile-quantile plots.

Chapter 4 considers tabular data and the types of graph that are alternatives to histograms or bar charts. The dot plot is a display introduced by Cleveland that is more effective than a bar chart for comparing data, and this chapter does a good job of demonstrating its benefits.

Chapter 5 concentrates on scatter plots and shows different ways of visualising groups within a data set. This can be done by grouping the data using different symbols or colours within a panel, or by conditioning the data on one or more of the variables. Scatter plot matrices are introduced at the end of the chapter as a good way to explore multivariate data by looking at pairwise plots of the variables, and lastly parallel coordinate plots are introduced. These are not particularly useful displays, as the message shown to the viewer depends on the order of the variables in the plot.
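To make the distinction between grouping and conditioning concrete, here is a minimal sketch. It is not taken from the book: the choice of the built-in mtcars data set and the variable names p_grouped and p_conditioned are my own assumptions for illustration.

```r
library(lattice)

# Grouping: a single panel, with the number of cylinders
# distinguished by plotting symbol and colour.
p_grouped <- xyplot(mpg ~ wt, data = mtcars,
                    groups = cyl,
                    auto.key = list(columns = 3))

# Conditioning: one panel per number of cylinders,
# all panels sharing the same axes.
p_conditioned <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
                        layout = c(3, 1))

print(p_grouped)
print(p_conditioned)
```

The only difference between the two calls is whether cyl appears in the groups argument or after the | in the formula, which is much of what makes switching between the two views cheap.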

Part I concludes with a chapter on three-dimensional displays, which are always challenging to produce on a two-dimensional computer display or printed page. Some examples of the different types of surface are shown, both as three-dimensional plots and as contour plots. There are then some examples illustrating trellis displays with a surface in each panel, which tie in nicely with the trellis philosophy of visualising high-dimensional data.

Part II provides details of how the user can adjust some of the lattice parameters to customise the graphical display.

Chapter 7 provides a good overview of the graphical parameters that can be set in a trellis theme and which types of display make use of the different types of parameter. These include the types of plot symbol, size of the symbol, colour etc. Chapter 8 continues the look at the visual display covering first axis labels in detail followed by aspect ratios. Chapter 9 considers various types of labels that can be applied to a graph, such as titles or axis labels. There is also a good section on creating legends for trellis displays.
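As a rough illustration of the kind of settings these chapters deal with, the sketch below overrides the plot symbol for a single plot and sets a title and axis labels. It is my own example rather than one from the book, and the data set and the particular values are arbitrary assumptions.

```r
library(lattice)

# Override the theme's plot symbol settings for this plot only,
# and add a title and axis labels.
p_styled <- xyplot(mpg ~ wt, data = mtcars,
                   par.settings = list(
                     plot.symbol = list(pch = 16, col = "darkblue", cex = 0.8)
                   ),
                   main = "Fuel consumption against weight",
                   xlab = "Weight (1000 lbs)",
                   ylab = "Miles per gallon")

print(p_styled)
```

Settings passed through par.settings apply to that plot alone; trellis.par.set() changes them for the current graphics device instead.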

Chapter 10 covers miscellaneous topics associated with converting data into a format suitable for lattice graphics, an area that is often neglected and a good addition to the book. The use of shingles to create categories from a continuous variable is also covered; this is equivalent to taking slices through a response surface and gives a useful view of the data. Chapter 11 is a short discussion about manipulating a lattice object: the plotting functions return an object, which is drawn automatically only if it is not assigned to a variable. Chapter 12 is a brief look at interacting with trellis displays, in particular identifying points in the display.
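The shingle idea and the behaviour of the plot object are easier to see in a few lines of code. This is a sketch under my own assumptions (the built-in quakes data, four depth bands, and the names depth_bands, p and p_titled), not an example reproduced from the book.

```r
library(lattice)

# Cut the continuous depth variable into 4 intervals that are
# allowed to overlap slightly; the result is a shingle.
depth_bands <- equal.count(quakes$depth, number = 4, overlap = 0.1)

# Assigning the result stores the plot object; nothing is drawn yet.
p <- xyplot(lat ~ long | depth_bands, data = quakes)

class(p)   # "trellis"

# Drawing happens when the object is printed.
print(p)

# A stored object can be modified without rebuilding the call.
p_titled <- update(p, main = "Earthquake locations by depth band")
print(p_titled)
```

Each panel then shows the earthquake locations for one band of depth, which is the "slice" view described above.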

Part III takes a look at extending the trellis graphics setup. Chapters 13 and 14 provide a good introduction to writing custom panel functions to customise the type of display, and to the functions provided with the lattice package for building up a new type of display.
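For readers who have not met panel functions before, a minimal sketch of one follows. The function name panel_points_with_fit is hypothetical and the example is mine rather than the book's; the building blocks it calls (panel.grid, panel.xyplot and panel.lmline) are supplied by lattice.

```r
library(lattice)

# Hypothetical custom panel function: reference grid, the points,
# and a least-squares line fitted separately within each panel.
panel_points_with_fit <- function(x, y, ...) {
  panel.grid(h = -1, v = -1)         # grid lines aligned with the axis ticks
  panel.xyplot(x, y, ...)            # the default scatter plot
  panel.lmline(x, y, col = "black")  # per-panel regression line
}

p_fit <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
                panel = panel_points_with_fit)
print(p_fit)
```

The panel function is called once per panel with only that panel's data, so the fitted line differs from panel to panel.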

Overall Comment: This is a useful book for learning the lattice graphics package by example, with some discussion of the various components that make up the system. For a user who only wants to dabble with these graphics the book is probably too extensive, as there are other books and freely available documents that introduce lattice graphics.
