- Creating graphs
- Manipulating legends
- Manipulating strings
- Manipulating colours
- Saving plots
- Titles, captions and cross-references
- Time series graphs
- Publication quality graphics
- Maps
ggplot2 is awesome. The following notes are not a comprehensive guide to using it, but instead are reminders on how to do certain tricky things for future reference. The official user guides are here and the ggplot2 cookbook is also excellent.
Creating graphs
The basics
Creating a graph is a little different in ggplot compared to anything else I’ve tried. One first sets the data and variables to be used in a graph, then adds the plotting method, the ‘geoms’ separately. A basic plot would look something like this
library(ggplot2)
data(mtcars)
p <- ggplot(data = mtcars, aes(x = wt, y = mpg, colour = cyl)) +
geom_point()
p
Grouping
If one has collapsed data and one wants to plot separate lines by a second factor, then need to set group in the plot. To set multiple factors as a grouping, then use group = interaction(factor1, factor2).
p <- ggplot(df, aes(x = x_var, y = y_var, group = group_var)) +
geom_line()
or
p <- ggplot(df, aes(x = x_var, y = y_var,
group = interaction(g_var_1, g_var_2))) +
geom_line()
Rotating axis labels
p + theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
hjust
sets the alignment of the text left (hjust=0
) or right (hjust=1
) and vjust
the vertical justification where vjust = 0
means that the text is justified to the bottom of the text, and vjust = 1
from the top.
Setting the aspect ratio
p + theme(aspect.ratio=1)
Set axes to zero
ggplot axes don’t tend to originate at zero.
p + expand_limits(x = 0, y = 0)
Display panelled graphs
with facet_wrap()
or facet_grid()
p + facet_wrap( ~ cyl)
Manipulating legends
Position at bottom of graph
p + theme(legend.position = "bottom")
Postion in graph area
In the top right corner, orientated by the top right corner of the legend box.
p + theme(legend.position = c(1,1), legend.justification = c(1,1))
Set number of rows in a legend
p + guides(col = guide_legend(nrow = 2))
Wrapping legend labels
This won’t show anything different for this example, but if long strings were used as labels, then these would be wrapped after 13 characters rather than allowed to over plot other points.
p <- ggplot(data = mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
geom_point()
p + scale_color_discrete(labels = function(x) strwrap(x, width = 13))
Manipulating strings
Italics in labels
Typically species names need to be italicised, this can be a serious nuisance. Italics can be controlled by the theme()
function.
theme( axis.test.x = element_text(face = 'italic'))
strip.text.x
controls the horizontal labels of facets.
Legend text can be controlled as below, assuming that multiple lines have been coloured by a third factor.
scale_colour_discrete("Organism", breaks = c("One", "Two", "Three"),
labels = expression(italic("One"), italic("Two"), italic("Three")))
Partial italics can be introduced with expression and paste from plotmath:
ggtitle(expression(paste("Resistance to selected antibiotics in ", italic("E. coli"), " by local area team" )))
Adding newlines is harder as plotmath does not allow newline characters. There is a workaround:
ggtitle(
expression(
atop(paste("Resistance to selected antibiotics in ", italic("E. coli")),
" by local area team, England 2013")
))
Set the order of labels in a legend
p + guides(colour = guide_legend(reverse = TRUE))
Other special characters on graphs
Say you want to add special characters, like equality symbols.
scale_x_discrete("Age group", labels = c("<1 month", "1-11 months", "1-4 years", "5-9 years",
"10-14 years", "15-44 years", "45-64 years", "65-74 years",
expression(phantom(x) >=75), "Unknown"))
?phantom
provides a list of other special characters.
I recently wanted to include regression coefficients and p values on a facet_wrap()
panelled plot of linear regressions.
That was an enormous pain in the backside.
I found the solution here, but couldn’t get commas to work.
I did find that the solution in the link resulted in over-plotting all the coefficients.
My solution was to join df2 into the main plotting data frame.
I’ll try to add an example later.
It is also possible to use unicode characters to produce special characters.
labels = c("\u2264 15", "15-20", "20-24","25-30")
<= is \u2264
and >= is \u2265
Labels on axes
To format 10000 as 10 000.
french = function(x) format(x, big.mark = " ")
ggplot()...
scale_y_continuous("n cases", labels = french)
Manipulating colours
Setting manual colours
I think the package viridis
provides really nice colour schemes.
Others, however, disagree and one may need to set these manually.
There’s a nice SO post here on using Rcolorbrewer
package to create colour schemes.
# Create a colour palette
my_palette = c(brewer.pal(5, "Blues")[c(1,3,4,5)]
# Examine the palette
grid::grid.raster(my_palette, int=F)
# Get the hex values for each of the colours
my_palette
[1] "#EFF3FF" "#C6DBEF" "#9ECAE1" "#6BAED6" "#3182BD" "#08519C" # not correct, but copy and paste from another job
# Put these into a scale_colour_manual() function
p + scale_colour_manual("scale name", values = c("#EFF3FF", "#C6DBEF", "#9ECAE1", "#6BAED6", "#3182BD", "#08519C"))
Saving plots
svg(filename = "plot.svg")
p
dev.off()
Also, have yet to try this, but looks useful. ggplot-powerpoint-wall-head-solution.
Important note re saving plots: devSVG()
does not handle <
characters. Base function svg()
does.
Or also:
ggsave(filename = "filename.extension", plot, width = 7, height = 4, dpi = 300)
It’s also worth noting that dingbats = FALSE
may reduce file size and prevent editors from getting upset.
Titles, captions and cross-references
ggplot2
ggplot supports titles, subtitles and footnotes (called captions).
Title defaults to centre justification, subtitle to left and caption to right.
This can be overridden with theme()
as below.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
labs(title = "My title",
subtitle = "My subtitle",
caption = "This is a caption") +
theme(
plot.title = element_text(hjust = -0.0),
plot.caption = element_text(hjust = -0.0)
)
p
Markdown/knitr
captioner package
Time series graphs
Axes can be set to date format with scale_x_date()
and the defaults tend to look pretty good.
last_month <- Sys.Date() - 0:730
df <- data.frame(
date = last_month,
price = runif(731)
)
base <- ggplot(df, aes(date, price)) +
geom_line() +
scale_x_date(date_minor_breaks = "1 month", date_breaks = "1 year",
date_labels = c("%Y"))
base
However, one may wish to mimic the Excel approach to have two-level x-axes. from SO post here
set.seed(1)
df=data.frame(year=rep(2009:2013,each=4),
quarter=rep(c("Q1","Q2","Q3","Q4"),5),
sales=40:59+rnorm(20,sd=5))
g1 <- ggplot(data = df, aes(x = interaction(year, quarter,
lex.order = TRUE),
y = sales, group = 1)) +
geom_line(colour = "blue") +
coord_cartesian(ylim = c(35, 65), expand = FALSE) +
annotate(geom = "text", x = seq_len(nrow(df)), y = 34,
label = df$quarter, size = 4) +
annotate(geom = "text", x = 2.5 + 4 * (0:4), y = 32,
label = unique(df$year), size = 6) +
# theme_bw()
# +
theme(plot.margin = unit(c(1, 1, 4, 1), "lines"),
axis.title.x = element_blank(),
axis.text.x = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank()
)
# remove clipping of x axis labels
g2 <- ggplot_gtable(ggplot_build(g1))
g2$layout$clip[g2$layout$name == "panel"] <- "off"
grid::grid.draw(g2)
It’s worth noting that the above works well for an Rscript, but I’ve been advised that to get it to work in a markdown doc, the final line grid:grid.draw(g2)
should be replaced with plot(g2)
.
Publication quality graphics
Possibly useful for powerpoint
options(bitmapType="cairo")
ggsave("test_ggsave_300.png", width=14.11, height=14.11, unit="cm", dpi=300)
from https://mcfromnz.wordpress.com/2013/09/03/ggplot-powerpoint-wall-head-solution/#comment-487