The Grammar of Graphics, developed by Leland Wilkinson, is a set of grammatical rules for creating perceivable graphs
Rather than thinking about a limited set of graphs, think about graphical forms
Charts are instances of much more general objects
An abstraction that makes thinking about, reasoning about, and communicating graphics easier
Grammar of Graphics
Different types of graphs may appear completely distinct, but in actuality share many common elements.
By making different visual choices, you can use graphs to highlight different aspects of the same data.
For example, here are three ways of displaying the same data:
Grammar of Graphics
Statistical graphic specifications are expressed in six statements:
DATA: a set of data operations that create variables from datasets
TRANS: variable transformations
SCALE: scale transformations
COORD: a coordinate system
ELEMENT: graphs (e.g., points) and their aesthetic attributes (e.g., color)
GUIDE: one or more guides (axes, legends, etc.)
Limitations
The Grammar of Graphics…
tells us what words make up our graphical “sentences,” but offers no advice on how to write well
is not about good taste, practice, or graphic design
is useful, but is not all-encompassing
does not include interactive graphics
does not include a few interesting and useful charts
ggplot2
A layered grammar of graphics
A layered grammar vs The Grammar of Graphics
ggplot2 is based on the more general concept of the Grammar of Graphics
The components are independent, meaning that we can generally change a single component in isolation
What is a graphic?
ggplot2 uses the idea that you can build every graph with graphical components from three sources:
the data, represented by geoms
the scales and coordinate system
the plot annotations
to display values, map variables in the data to visual properties of the geom (aesthetics) like size, color, and x and y locations
ggplot2: A layered grammar
The layered grammar defines the components of a plot as:
a default data set and set of mappings from variables to aesthetics
one or more layers, each layer having one geometric object, one statistical transformation, one position adjustment, and optionally, one data set and set of aesthetic mappings
one scale for each aesthetic mapping used
a coordinate system
the facet specification
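The component list above can be sketched as a small data structure. This is plain Python for illustration, not ggplot2's API: the class names `Plot` and `Layer`, the `resolve` and `aesthetics_used` helpers, and the two data rows are all hypothetical. The sketch also assumes that a layer's own mappings are merged over the plot defaults.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Layer:
    geom: str                       # one geometric object, e.g. "point"
    stat: str = "identity"          # one statistical transformation
    position: str = "identity"      # one position adjustment
    data: Optional[list] = None     # optional layer-specific data set
    mapping: Optional[dict] = None  # optional layer-specific aesthetic mappings


@dataclass
class Plot:
    data: list                                  # default data set
    mapping: dict                               # default aesthetic mappings
    layers: list = field(default_factory=list)  # one or more layers
    scales: dict = field(default_factory=dict)  # one scale per aesthetic used
    coord: str = "cartesian"                    # coordinate system
    facet: Optional[str] = None                 # facet specification

    def resolve(self, layer):
        """Return the (data, mapping) a layer uses after applying plot defaults."""
        data = layer.data if layer.data is not None else self.data
        mapping = {**self.mapping, **(layer.mapping or {})}
        return data, mapping

    def aesthetics_used(self):
        """Collect every aesthetic that appears in any mapping; each needs a scale."""
        used = set(self.mapping)
        for layer in self.layers:
            used |= set(layer.mapping or {})
        return used


# Two made-up rows, only to have something to point at.
rows = [{"carat": 0.5, "price": 1000, "cut": "Ideal"},
        {"carat": 1.0, "price": 4000, "cut": "Fair"}]

plot = Plot(
    data=rows,
    mapping={"x": "carat", "y": "price"},
    layers=[Layer(geom="point", mapping={"color": "cut"}),
            Layer(geom="smooth", stat="smooth")],
)

points, smooth = plot.layers
print(plot.resolve(points)[1])         # {'x': 'carat', 'y': 'price', 'color': 'cut'}
print(plot.resolve(smooth)[1])         # {'x': 'carat', 'y': 'price'}
print(sorted(plot.aesthetics_used()))  # ['color', 'x', 'y']
```

Both layers draw on the same default data; only the point layer adds a color mapping, so the plot as a whole needs scales for x, y, and color.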
What is a Layer?
it determines the physical representation of the data
a plot may have multiple layers
usually all the layers on a plot have something in common, typically that they are different views of the same data
a layer is composed of four parts:
data and aesthetic mapping
a statistical transformation (stat)
a geometric object (geom)
a position adjustment
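One way to picture how the four parts cooperate is as a pipeline: the mapped data goes through the stat, then the position adjustment, and the geom draws the result. The sketch below is plain Python under that assumption, not ggplot2 code. The function names (`stat_count`, `position_stack`, `geom_rect`), the column names, and the three data rows are hypothetical.

```python
from collections import Counter


def stat_count(rows, mapping):
    """Stat: count rows for each combination of the mapped variables."""
    aesthetics = sorted(mapping)  # e.g. ["fill", "x"]
    counts = Counter(tuple(row[mapping[a]] for a in aesthetics) for row in rows)
    return [dict(zip(aesthetics, combo), count=n) for combo, n in counts.items()]


def position_stack(records):
    """Position adjustment: stack records that share an x value on top of each other."""
    tops = {}  # current height of the stack at each x
    stacked = []
    for rec in records:
        ymin = tops.get(rec["x"], 0)
        ymax = ymin + rec["count"]
        tops[rec["x"]] = ymax
        stacked.append({**rec, "ymin": ymin, "ymax": ymax})
    return stacked


def geom_rect(records):
    """Geom: describe one rectangle per record (a stand-in for actual drawing)."""
    return [f"rect x={r['x']} y=[{r['ymin']}, {r['ymax']}] fill={r['fill']}"
            for r in records]


# Made-up rows: one group "a" holding three observations.
rows = [{"group": "a", "cut": "Fair"},
        {"group": "a", "cut": "Ideal"},
        {"group": "a", "cut": "Ideal"}]
mapping = {"x": "group", "fill": "cut"}  # data and aesthetic mapping

stacked = position_stack(stat_count(rows, mapping))
for line in geom_rect(stacked):
    print(line)
# rect x=a y=[0, 1] fill=Fair
# rect x=a y=[1, 3] fill=Ideal
```

Swapping any one function (a different stat, a different position adjustment, a different geom) leaves the other three parts untouched.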
ggplot2: Specifications
A plot consists of several mostly independent specifications:
aesthetics - links between data variables and graphical features (position, color, shape, size)
layers - geometric elements (points, lines, rectangles, text, …)
transformations - specify a functional link between the data and the displayed information (identity, count, bins, density, regression). Transformations act on the variables.
scales - map values in data space to values in aesthetic space. Scales change the coordinate space of an aesthetic, but don’t change the underlying value (so the change is at the visual level, not the mathematical level).
coordinate system - e.g. polar or Cartesian
faceting - facets allow you to split plots by other variables to produce many sub-plots.
theme - formatting items, such as background color, fonts, margins…
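The scales item in the list above can be made concrete with a minimal sketch: a scale is a function from data space to aesthetic space, and the data value itself is left alone. This is plain Python, not ggplot2's implementation. The helper names `continuous_scale` and `discrete_scale`, the domain and range, and the color names are all assumptions chosen for illustration.

```python
def continuous_scale(domain, output_range):
    """Build a linear scale from a data domain to an aesthetic range."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


def discrete_scale(levels, palette):
    """Build a lookup scale from category levels to aesthetic values."""
    lookup = dict(zip(levels, palette))

    def scale(value):
        return lookup[value]

    return scale


# Data values 0..10 are drawn as point sizes 1..6.
size = continuous_scale(domain=(0, 10), output_range=(1, 6))
# Two category levels are drawn in two colors.
color = discrete_scale(levels=["Fair", "Ideal"], palette=["red", "blue"])

value = 5
print(size(value))     # 3.5
print(value)           # 5 (the underlying data value is unchanged)
print(color("Ideal"))  # blue
```

Changing the output range or the palette changes how the plot looks without touching the data, which is the sense in which the change is visual rather than mathematical.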
ggplot2: A layered grammar
data: diamonds
layer:
  - aes: x = cut, y = count, fill = cut
  - geom: bar
coordinates: Cartesian

data: diamonds
layer:
  - aes: x = 1, y = count, fill = cut
  - geom: fill-bar
coordinates: Cartesian

data: diamonds
layer:
  - aes: x = 1, y = count, fill = cut
  - geom: fill-bar
coordinates: Polar
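Writing the three specifications above as plain Python dictionaries makes it easy to see which components change from one plot to the next. This is a sketch, not ggplot2 syntax: the helper names `changed_components` and `to_polar` are hypothetical, the segment fractions are made up, and the polar step assumes the stacked extent is the quantity mapped to angle.

```python
bar = {
    "data": "diamonds",
    "aes": {"x": "cut", "y": "count", "fill": "cut"},
    "geom": "bar",
    "coordinates": "Cartesian",
}
fill_bar = {
    "data": "diamonds",
    "aes": {"x": 1, "y": "count", "fill": "cut"},
    "geom": "fill-bar",
    "coordinates": "Cartesian",
}
# The third specification is the second with only the coordinate system swapped.
pie = {**fill_bar, "coordinates": "Polar"}


def changed_components(a, b):
    """Names of the top-level components whose values differ between two specs."""
    return sorted(key for key in a.keys() | b.keys() if a.get(key) != b.get(key))


def to_polar(segments):
    """Map stacked (ymin, ymax) fractions to (start, end) angles in degrees."""
    return [(360 * ymin, 360 * ymax) for ymin, ymax in segments]


print(changed_components(bar, fill_bar))  # ['aes', 'geom']
print(changed_components(fill_bar, pie))  # ['coordinates']

# Two made-up stacked segments covering 25% and 75% of a filled bar.
print(to_polar([(0.0, 0.25), (0.25, 1.0)]))  # [(0.0, 90.0), (90.0, 360.0)]
```

Under these assumptions the step from the filled bar to the pie touches only the coordinate system, which is the kind of isolated change the layered grammar is meant to allow.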