3.6 Histograms and Density Plots (ggplot2)

Much like with boxplots, the default settings of ggplot2 are quite a bit nicer for both histograms and density plots.

hist_ggplot <- ggplot(diamonds, aes(x = price))

g_his <- hist_ggplot +
  geom_histogram()

print(g_his)

Figure 3.22: A basic histogram produced with ggplot2.
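By default, geom_histogram() sorts the data into 30 bins and prints a message suggesting that you pick a better value. As a minimal sketch, the bin width can be set explicitly in the units of the x-axis. The width of 500 and the object name g_his_bw are arbitrary choices made for illustration only.

```r
library(ggplot2)

# binwidth is given in data units, here US dollars of diamond price;
# 500 is an arbitrary value chosen for illustration only
g_his_bw <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 500)

print(g_his_bw)
```

Alternatively, the bins argument sets the number of bins instead of their width.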

One really nice thing about ggplot2 density plots is how easy it is to fill the area under the curve, which makes the shape of the distribution much easier to read.

dens_ggplot <- ggplot(diamonds, aes(x = price))

g_den <- dens_ggplot +
  geom_density(fill = "black", alpha = 0.5)

print(g_den)

Figure 3.23: A basic density plot produced with ggplot2.
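The semi-transparent fill becomes even more useful when several densities share one panel. The following sketch, which is an illustration and not one of the numbered figures, maps the fill colour to the cut variable so that the overlapping curves remain visible. The object name g_den_cut is our own choice.

```r
library(ggplot2)

# mapping fill to cut draws one density curve per cut category;
# alpha = 0.5 keeps overlapping areas visible
g_den_cut <- ggplot(diamonds, aes(x = price, fill = cut)) +
  geom_density(alpha = 0.5)

print(g_den_cut)
```

Because fill is now mapped inside aes() instead of being set to a constant, ggplot2 also adds a legend for cut automatically.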

Just as before, we run into the rather peculiar way ggplot2 handles adjusting certain default settings to suit our needs (or tastes). If we want to show percentages instead of counts in the histograms, we again need to use the strange ..something.. syntax. Note that ..ncount.. rescales the counts so that the tallest bar in each panel equals 1, which means the percentages on the y-axis are relative to the tallest bar and are not shares of all observations.

The following code chunk also shows how to condition on two variables at once in ggplot2. This is done with

facet_grid(g ~ f)

where, again, g and f are the two variables used for conditioning: g defines the rows and f the columns of the panel grid.

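g_his <- hist_ggplot +
  # ..ncount.. rescales the counts so that the tallest bar in each panel is 1
  geom_histogram(aes(y = ..ncount..)) +
  # percent_format() comes from the scales package
  scale_y_continuous(labels = scales::percent_format()) +
  facet_grid(color ~ cut) +
  ylab("Percent")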

print(g_his)

Figure 3.24: A faceted ggplot2 histogram with percentages on the y-axis.
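To see what the computed variables actually contain, the values behind a plot can be inspected with layer_data(). The sketch below is a hedged illustration on the unfaceted histogram. It assumes ggplot2 3.3.0 or later, where after_stat() is available as the successor to the ..something.. notation, and the object names are our own. It compares ncount with a true share of all observations, count / sum(count).

```r
library(ggplot2)

# counts rescaled so that the tallest bar equals 1
p_ncount <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(aes(y = after_stat(ncount)), bins = 30)

# counts divided by the total number of observations
p_prop <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(aes(y = after_stat(count / sum(count))), bins = 30)

# extract the data that ggplot2 computed for the first layer of each plot
d_ncount <- layer_data(p_ncount)
d_prop <- layer_data(p_prop)

max(d_ncount$y)  # the tallest bar has height 1
sum(d_prop$y)    # the bar heights add up to 1
```

Note that in a faceted plot sum(count) would, as far as we can tell, run over all panels together, so per-panel shares need a little more work than shown here.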

As in our lattice approach, we are going to rotate the x-axis labels by 45 degrees.

dens_ggplot <- ggplot(diamonds, aes(x = price))

g_den <- dens_ggplot +
  geom_density(fill = "black", alpha = 0.5) +
  facet_grid(color ~ cut) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(g_den)

Figure 3.25: A faceted ggplot2 density plot conditioned according to two variables.

Okay, another thing we might want to show is a certain estimated value (like the mean of our sample) including error bars.