The bottom line: I found the book quite interesting to read, although the material could probably have fit into three to five blog posts (yes, that’s how we measure document length today). About a third of the book reviews the history of statistical plots. While it is fascinating that William Playfair was already producing quite modern-looking plots in the 18th century, I’m not sure this is the best way to approach such a field.
Another third is spent on common mistakes and lies in statistical plots, including all kinds of exaggerations, visual noise (cross-hatching, moiré effects and friends), and downright stupid plots, for example “executive summary”-style plots consisting of only three bars (one of which is the sum of the other two).
The strongest part of the book, IMHO, is the final third, which develops a number of design principles. Tufte’s main point is to devote as large a share of the “ink” (read “toner” or “pixels”) as possible to showing data, cutting back gridlines, axes, and other non-data elements as far as possible. Tufte is also an advocate of data-rich plots, arguing that our visual system is quite capable of dealing with high information densities.
Like most machine learners, I’ve done most of my plots with MATLAB and, more recently, matplotlib, and I’m sort of used to the style they provide. Tufte’s approach is somewhat different, and cleaner, which is a nice change. JavaScript plotting libraries like protovis or D3.js follow Tufte’s aesthetics more closely.
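To make the difference concrete, here is a minimal sketch of how one might nudge a default matplotlib line plot toward Tufte’s style: drop the box around the plot, switch off gridlines, and let the two remaining axis lines span only the range of the data. This is my own reading of the principles, not a recipe from the book; the function name `tufte_style_line_plot` and the sample numbers are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt


def tufte_style_line_plot(x, y):
    """Draw a line plot with most non-data ink removed (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(5, 3))
    ax.plot(x, y, color="black", linewidth=1)

    # The top and right sides of the box carry no information, so hide them.
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)

    # No gridlines; the data line itself should dominate.
    ax.grid(False)

    # Let the remaining axis lines cover only the range of the data,
    # so the axes themselves show the minimum and maximum.
    ax.spines["left"].set_bounds(min(y), max(y))
    ax.spines["bottom"].set_bounds(min(x), max(x))

    # Small outward ticks instead of the default ones.
    ax.tick_params(direction="out", length=3)
    return fig, ax


# Made-up sample data, purely for illustration.
x = [1, 2, 3, 4, 5, 6]
y = [2.0, 3.5, 3.0, 5.0, 4.5, 6.0]
fig, ax = tufte_style_line_plot(x, y)
```

Calling `fig.savefig(...)` on the result writes the plot to a file. Whether the outcome is actually more readable than the default depends on the data, so treat this as a starting point rather than a rule.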
What I particularly liked about his approach was the idea that visualizations can really help us understand data by making use of our visual system. As he says, “Above all else show the data”, meaning that you shouldn’t hesitate to put as much data as possible before your eyes (within reason) so that you can really start exploring the structure in your data visually.
Posted by Mikio L. Braun at 2011-08-15 17:21:00 +0000