ggplot2 loops and delayed evaluation in aes()

This note solves a problem faced by Josiah Parry (relevant gist and Bluesky post). Generating a series of plots with ggplot2 in a for-loop can lead to unexpected results, as in the following example:

library(ggplot2)

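# penguins ships with R >= 4.5.0 (datasets package);
# columns 3 to 6 are the numeric measurements
m <- as.matrix(penguins[, 3:6])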

all_plots <- list()
for (i in 1:ncol(m)) {
  gg <- ggplot(penguins, aes(bill_len, m[, i])) + 
    geom_point(na.rm = TRUE) +
    labs(title = sprintf("Column %i", i))
  all_plots[[i]] <- gg
}
patchwork::wrap_plots(all_plots)

All panels show the same data because the loop variable `i` is evaluated too late.

Every subplot shows the same data, which is wrong. The problem is that the i in aes(bill_len, m[, i]) is not evaluated until its value is needed, and that does not happen until the plots are drawn after the loop, when i has its final value. So the data for m[, 4] is shown in every panel.
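The same delayed lookup can be reproduced without any plotting. The sketch below assumes that aes() captures its arguments as rlang quosures (an expression plus the environment in which to evaluate it) and imitates that with quo(). The toy matrix and the names captured and late_values are made up for illustration.

```r
library(rlang)

# Toy 2 x 4 matrix standing in for the penguin measurements (an assumption)
m <- matrix(1:8, nrow = 2)

captured <- list()
for (i in 1:ncol(m)) {
  # quo() records the expression m[1, i] and its environment
  # without evaluating anything yet
  captured[[i]] <- quo(m[1, i])
}

# i is looked up only now, after the loop has finished and left i at 4
late_values <- vapply(captured, eval_tidy, integer(1))
late_values
```

All four results should equal m[1, 4], which mirrors what happens to the four panels.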

all_plots[[1]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, i]`
all_plots[[2]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, i]`
all_plots[[3]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, i]`
all_plots[[4]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, i]`

What we need to do is force evaluation early, so that each plot's mapping no longer depends on whatever value i has after the loop.

(Note also that we could avoid the whole problem by using lapply(), where each function call gets its own i, but pretend that our hands are tied here.)
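For completeness, here is a minimal sketch of the lapply() route. It is not part of the original problem, just an illustration. Each call to the anonymous function has its own environment with its own i, so the mapping captured by aes() keeps pointing at the right column.

```r
library(ggplot2)

m <- as.matrix(penguins[, 3:6])

# One function call per column; i is local to each call
all_plots <- lapply(seq_len(ncol(m)), function(i) {
  ggplot(penguins, aes(bill_len, m[, i])) +
    geom_point(na.rm = TRUE) +
    labs(title = sprintf("Column %i", i))
})
```

The resulting list can be passed to patchwork::wrap_plots() just like the one built by the for-loop.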

One option is to force evaluation of i with !! so the column index is inserted into the expression immediately.

all_plots <- list()
for (i in 1:ncol(m)) {
  gg <- ggplot(penguins, aes(bill_len, m[, !! i])) + 
    geom_point(na.rm = TRUE) +
    labs(title = sprintf("Column %i", i))
  all_plots[[i]] <- gg
}
patchwork::wrap_plots(all_plots)

Each panel now uses the correct column after forcing evaluation with `!!`.

The y value still comes from m[, ], but the column index is now fixed.

all_plots[[1]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, 1L]`
all_plots[[2]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, 2L]`
all_plots[[3]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, 3L]`
all_plots[[4]][["layers"]][[1]][["computed_mapping"]]
#> Aesthetic mapping: 
#> * `x` -> `bill_len`
#> * `y` -> `m[, 4L]`
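To see what !! does outside of ggplot2, the sketch below uses rlang's quo() as a stand-in for aes() (an assumption about how aes() captures its arguments). The toy matrix and the names captured, exprs and early_values are illustrative only.

```r
library(rlang)

# Toy 2 x 4 matrix standing in for the penguin measurements (an assumption)
m <- matrix(1:8, nrow = 2)

captured <- list()
for (i in 1:ncol(m)) {
  # !! evaluates i right away and inlines its value into the stored expression
  captured[[i]] <- quo(m[1, !!i])
}

# The stored expressions now contain literal indices instead of the symbol i
exprs <- vapply(captured, function(q) deparse(quo_get_expr(q)), character(1))
exprs

# Evaluating after the loop no longer depends on the current value of i
early_values <- vapply(captured, eval_tidy, integer(1))
early_values
```

Each stored expression carries its own index, so each one should evaluate to a different column of the first row.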

We could also assemble and store the plot immediately:

all_plots <- list()
for (i in 1:ncol(m)) {
  gg <- ggplot(penguins, aes(bill_len, m[, i])) + 
    geom_point(na.rm = TRUE) +
    labs(title = sprintf("Column %i", i))
  all_plots[[i]] <- patchwork::wrap_ggplot_grob(ggplotGrob(gg))
}
patchwork::wrap_plots(all_plots)

Forcing early evaluation by converting plots to grobs also yields correct panels.

ggplotGrob() builds the plot right away and converts it into a table of graphical objects (“grobs”), ready to be drawn by whatever comes next in the plotting pipeline. (I never have to work past this point.) patchwork can handle these tables, but wrap_ggplot_grob() does some more work on grob tables that come from ggplot2.
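As a small check of that description, the sketch below builds a single plot, converts it while i is 1, and then changes i. The names gg and grob are illustrative. The grob table was computed at conversion time, whereas the ggplot object still looks up i whenever it is built.

```r
library(ggplot2)

m <- as.matrix(penguins[, 3:6])

i <- 1
gg <- ggplot(penguins, aes(bill_len, m[, i])) +
  geom_point(na.rm = TRUE)

# The plot is built here, while i is still 1
grob <- ggplotGrob(gg)

# Changing i afterwards cannot alter the finished grob table,
# but gg would now use column 4 if it were printed
i <- 4

inherits(grob, "gtable")
```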

This is a subtle problem. I asked ChatGPT to proofread this note and it offered a nonsolution:

Screenshot of ChatGPT suggesting to use `y <- m[, i]` on each loop iteration.
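Here is a sketch of why that suggestion does not help, assuming the plot then maps y with aes(bill_len, y). The symbol y is captured just as lazily as m[, i] was, and y is overwritten on every iteration, so every plot ends up reading the last column.

```r
library(ggplot2)

m <- as.matrix(penguins[, 3:6])

all_plots <- list()
for (i in 1:ncol(m)) {
  y <- m[, i]  # overwritten on every iteration
  all_plots[[i]] <- ggplot(penguins, aes(bill_len, y)) +
    geom_point(na.rm = TRUE)
}

# y is resolved only when a plot is built, after the loop has finished
y_first <- layer_data(all_plots[[1]])$y
y_last <- layer_data(all_plots[[4]])$y
identical(y_first, y_last)
```

The first and last plots should report identical y values, which is the original bug under a different name.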
