---
title: "Changes to 'grid' units"
author: "Paul Murrell, Thomas Lin Pedersen"
date: 2020-04-13
categories: ["Internals"]
tags: ["grid", "units"]
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(collapse = TRUE)
```
```{r eval=FALSE, include=FALSE, results="hide"}
## Run two containers, one with R, one with RD
system(paste0("docker run -t -d --rm ",
              "--name grid-new-units-r-3.6.3 ",
              "-v ", getwd(), ":/home/work ",
              "-w /home/work ",
              "pmur002/grid-new-units-r-3.6.3"))
system(paste0("docker run -t -d --rm ",
              "--name grid-new-units-r-devel ",
              "-v ", getwd(), ":/home/work ",
              "-w /home/work ",
              "pmur002/grid-new-units-r-devel"))
```

One of the main downsides of the 'grid' graphics package is that
it is slow, and that makes some important packages that depend on 'grid', 
like 'ggplot2',
slow as well.  For example, the scatterplots shown below are roughly 
equivalent, but one is drawn 
using 'graphics' and the other using 'ggplot2'.

```{bash eval=FALSE, echo=FALSE, results="hide"}
docker exec grid-new-units-r-3.6.3 Rscript grid-new-units-files/scatterplots.R
```

![](/Blog/public/post/grid-new-units-files/scatterplots.png)

The 'ggplot2' version takes more than 4 times as long to draw.

*(The benchmarks in this post were produced using the 'bench' package
within a Docker container
based on [`rocker/r-devel`](https://hub.docker.com/r/rocker/r-devel/), 
but with R-devel (r77995) built with 
`--enable-memory-profiling` and a bunch of R packages installed;
the Docker images, `pmur002/grid-new-units-r-3.6.3` and
`pmur002/grid-new-units-r-devel`, are available from DockerHub.)*

![](/Blog/public/post/grid-new-units-files/scatterplots-timing-3.6.3.png)

Thomas Lin Pedersen identified that, in
typical 'ggplot2' usage, a significant amount of time
was being spent creating and manipulating 'grid' "unit" objects,
and this has led to a change in
the internal implementation of "units" in the 'grid' graphics package 
for R version 4.0.0.

On one hand, this is not news, because the public behaviour of 'grid' units
has not changed at all.   However, there are two important 
consequences of this change:  one for users and one for developers.

For users, the reason for making the change was speed;  with the new 'grid'
units, certain operations go a lot faster (an order of magnitude or more
in many cases).  For example, the following code that just
creates unit objects is up to ten
times faster with the new unit implementation.

```{r echo=FALSE, comment=NA}
cat(readLines("grid-new-units-files/units-src.R"), sep="\n")
```

```{bash eval=FALSE, echo=FALSE, results="hide"}
docker exec grid-new-units-r-3.6.3 Rscript grid-new-units-files/units.R
docker exec grid-new-units-r-devel RDscript grid-new-units-files/units.R
```

```{r echo=FALSE, results="hide", message=FALSE}
library(bench)
library(ggplot2)
## Load timing info generated by containers
unitTiming <- rbind(readRDS("grid-new-units-files/units-timing-3.6.3.rds"),
                    readRDS("grid-new-units-files/units-timing-4.0.0.rds"))
png("grid-new-units-files/units-timing.png", height=400)
autoplot(bench::as_bench_mark(unitTiming)) + facet_wrap("version", dir="v")
dev.off()
```

![](/Blog/public/post/grid-new-units-files/units-timing.png)
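
The timing script behind this figure is not reproduced in this post.  As a
rough sketch of how a similar comparison could be run locally, the code below
times repeated unit creation with base R's `system.time()` rather than the
'bench' package.  The helper `time_unit_creation()`, the repetition count, and
the choice of units are illustrative assumptions, not taken from the original
scripts.

```r
library(grid)

## Hypothetical helper (not part of the original benchmark scripts):
## time 'n' repetitions of creating a small vector of simple units
time_unit_creation <- function(n = 10000) {
    system.time(
        for (i in seq_len(n)) unit(1:10, "mm")
    )[["elapsed"]]
}

elapsed <- time_unit_creation()
## Report the R version alongside the timing so that runs can be compared
cat("R", as.character(getRversion()), ":", elapsed, "seconds\n")
```

Running the same script under R 3.6.3 and under R 4.0.0 (or later) gives two
numbers to compare; the size of the difference will depend on the machine and
on the kind of units being created.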

Although manipulating units is only a fraction of what packages like 'ggplot2' 
do, the impact of the unit speed-up
is sufficient to be noticeable in the production of 'ggplot2' plots.
The plots below include examples of both simple and complex 'ggplot2' plots.

```{bash eval=FALSE, echo=FALSE, results="hide"}
docker exec grid-new-units-r-3.6.3 Rscript grid-new-units-files/ggplot2.R
docker exec grid-new-units-r-devel RDscript grid-new-units-files/ggplot2.R
```

```{r echo=FALSE, results="hide"}
library(bench)
library(ggplot2)
## Load timing info generated by containers
unitTiming <- rbind(readRDS("grid-new-units-files/ggplot2-timing-3.6.3.rds"),
                    readRDS("grid-new-units-files/ggplot2-timing-4.0.0.rds"))
png("grid-new-units-files/ggplot2-timing.png", height=400)
autoplot(unitTiming) + facet_wrap("version", dir="v")
dev.off()
```

![](/Blog/public/post/grid-new-units-files/ggplot2.png)

The following timings show that the new unit implementation in 'grid'
can translate to a 10%-20% speed-up in 'ggplot2' plots.

![](/Blog/public/post/grid-new-units-files/ggplot2-timing.png)

```{r eval=FALSE, echo=FALSE}
## A run across all 'ggplot2' examples.
pdf()
system.time(for (i in ls("package:ggplot2")) 
                example(i, ask=FALSE, character.only=TRUE, echo=TRUE))
dev.off()
```

```{r eval=FALSE, echo=FALSE}
## R 3.6.3
   user  system elapsed 
486.792   0.584 487.366 
```

```{r eval=FALSE, echo=FALSE}
## R-devel
   user  system elapsed 
407.388   0.828 408.206 
```

For developers, the changes to 'grid' units should have no impact,
but they 
can be disastrous if a
package has been peeking and poking at the internal implementation
of 'grid' units.

We believe that we have identified most of these cases and that
most of those have now been fixed.  In case some problems have not
yet come to light, the following known problems and solutions may
be helpful:

* it is possible for a package to contain a saved R object that contains
old-style 'grid' units.  There are protections in the new 'grid' 
implementation to upgrade such objects to new-style units or, at worst,
generate an error.  Recreating the saved R object should hopefully
resolve any issues.

* several packages were extracting attributes from "unit" objects, e.g.,
the `"mm"` from `unit(1, "mm")`;  there is a new `grid::unitType()` 
function that may help packages to avoid accessing 'grid' unit internals in the
future.
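
As a minimal sketch of the second point, the code below asks for the type of
a unit through `grid::unitType()` instead of reading an attribute.  The
wrapper `unit_type()` is hypothetical, and its fallback branch assumes that
old-style units stored the unit name in a `"unit"` attribute; that is exactly
the sort of internal detail that is best confined to one place in a package.

```r
library(grid)

## Hypothetical wrapper for a package that must work both before and
## after R 4.0.0
unit_type <- function(x) {
    if (getRversion() >= "4.0.0") {
        ## Supported accessor, new in R 4.0.0
        grid::unitType(x)
    } else {
        ## Assumption: old-style units kept the unit name in a
        ## "unit" attribute
        attr(x, "unit")
    }
}

u <- unit(1, "mm")
unit_type(u)
```

A package that only needs to support R 4.0.0 or later can drop the wrapper and
call `unitType()` directly.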

There was also a small lie earlier:  the public behaviour of 'grid' units
has actually 
changed a little because the printing of some units is now different.
For example, the following code and output show that arithmetic
on units produces a different printed result.

```{bash eval=FALSE, echo=FALSE, results="hide"}
docker exec grid-new-units-r-3.6.3 R CMD BATCH --quiet --no-timing grid-new-units-files/print.R grid-new-units-files/print-log-3.6.3.Rout
docker exec grid-new-units-r-devel RD CMD BATCH --quiet --no-timing grid-new-units-files/print.R grid-new-units-files/print-log-4.0.0.Rout
```

```{r echo=FALSE, comment=NA}
cat(readLines("grid-new-units-files/print-log-3.6.3.Rout"), sep="\n")
```

```{r echo=FALSE, comment=NA}
cat(readLines("grid-new-units-files/print-log-4.0.0.Rout"), sep="\n")
```
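
One practical consequence is that code or tests that match on the printed
form of a unit can break across R versions.  The sketch below, which assumes
R 4.0.0 or later for `grid::unitType()`, checks the structure of a compound
unit instead of its text; the variable names are illustrative only.

```r
library(grid)

## A compound unit produced by arithmetic on two different unit types
total <- unit(1, "npc") + unit(2, "mm")

## Fragile: the text depends on how this version of R formats units
printed <- as.character(total)

## Sturdier (R >= 4.0.0): ask for the type of the unit instead
is_sum <- identical(unitType(total), "sum")
```

Here `is_sum` records that `total` is a sum of units without relying on how
the sum happens to be formatted.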

The original design and implementation of new units was contributed
by Thomas Lin Pedersen.  Paul Murrell contributed minor fixes
and features and led the testing, diagnosis, and remedying
of problems in packages.  Paul Murrell's contribution was partially 
supported by a donation from RStudio to The University of Auckland 
Foundation.
Both authors would like to acknowledge the patience and support
of the CRAN team and the cooperation of the authors of the 
numerous packages that were affected by these changes.

```{bash eval=FALSE, include=FALSE}
## Eval this to clean up containers
## Shut down two containers
docker kill grid-new-units-r-3.6.3
docker kill grid-new-units-r-devel
```