--- title: "Enhancements to HTML Documentation" author: "Deepayan Sarkar and Kurt Hornik" date: 2022-04-08 categories: ["Internals", "User-visible Behavior"] tags: ["documentation", "package checking"] ---
The upcoming release of R (version 4.2.0) features several enhancements to the HTML help system.
The most noticeable features are that LaTeX-like mathematical equations in help pages are now typeset using either KaTeX or MathJax, and usage and example code are highlighted using Prism. Additionally, the output of examples and demos can now be shown within the browser if the knitr package is installed. This is especially useful if the examples produce graphical output. Apart from these, several less noticeable changes have been made to improve useablity.
The goal of this post is to introduce these enhancements, and request testing of these features during the run-up to the release of R 4.2.0 so that as many bugs can be taken care of before release as possible.
The R documentation format supports LaTeX-like mathematics through the
\eqn{}
and \deqn{}
commands. Historically, these have been
rendered properly when converted to PDF (via LaTeX), but not when
converted to other formats. However, support for rendering mathematics
in HTML has matured over time, and in particular the
MathJax and KaTeX
Javascript libraries are stable and popular solutions. In fact,
several R packages provide indirect support for mathematics in R
documentation, notably
mathjaxr and
katex. Naturally, both of
these require special markup.
From R 4.2.0, \eqn{}
and \deqn{}
commands can be rendered as
mathematics in HTML output, using either KaTeX or MathJax. The details
can be controlled using the help.htmlmath
option. For dynamic help,
the default is equivalent to setting
options(help.htmlmath = "katex")
which uses a local copy of KaTeX that ships with R. The
help.htmlmath
option can also be set to "mathjax"
; in this case,
if the mathjaxr package
is installed then its local copy of MathJax is used, otherwise an
online CDN is used. If the option is set to any other non-NULL value,
the displayed pages will fall back to the basic substitutions that
were used previously.
Static HTML output, if enabled, will always use KaTeX via a CDN (as relative paths to a local copy cannot be reliably computed).
R now ships with Javascript and CSS files downloaded from
https://prismjs.com/ to provide syntax highlighting of R code in the
Usage and Examples sections in R help pages. Such code are now marked
with a <code class='language-R'>
tag, which are processed and
rendered using the Prism Javascript and CSS.
Syntax highlighting is currently disabled for static HTML output, again due to the difficulty in computing relative paths, and the lack of a CDN.
R provides specific facilities to run two types of example code
included in package documentation, namely examples and demos, via the
example()
and demo()
functions respectively. The example code is
shown as part of the corresponding help page, and demos can be
accessed via dynamic help through the package’s index page. Versions
of R prior to 4.2.0 provided no option to run the examples via the
dynamic help system. Demos could be run by clicking a link, but the
result was to run the demo in the console.
Displaying the result of running examples or demos within the help system has certain obvious benefits, especially if the output contains graphics. While this is a fairly non-trivial task, the powerful knitr package makes it quite simple.
If the knitr package is installed, R 4.2.0 will include links in help
pages that allow its examples to be run, which when clicked will run
the examples as a .Rhtml
document and display the output as an HTML
page. Such pages can be also be created directly by accessing links of
the form
http://127.0.0.1:<PORT>/library/<PKG>/Example/<TOPIC>
Previously available demo links of the form
http://127.0.0.1:<PORT>/library/<PKG>/Demo/<TOPIC>
will similarly display output in the browser instead of running the demo in the console.
Both the example()
and demo()
functions have a new type
argument,
which can be set to "html"
to show output in a browser instead of
the console using links of this form.
One caveat that may be considered either a feature or a bug depending on your perspective: The code to create the HTML output using knitr sets some but not all knitr options. This means that the output may be affected by settings previously modified by the user. For example, if one sets
knitr::opts_chunk$set(dev = "svg")
or
library(svglite)
knitr::opts_chunk$set(dev = "svglite")
then the embedded images will be SVG instead of PNG (which in most
cases will be an improvement). Similarly, loading packages that define
additional methods for
knit_print()
may change the output from how it would have appeared in the console.
Help pages are now HTML5, the current HTML standard, which in particular helps facilitating some of the enhancements described previously (and in the future may be used for additional enhancements).
Considerable effort was put into ensuring creating valid HTML5. The old validation toolchain could not handle HTML5, so a new one was created based on HTML Tidy and integrated into the tools package.
Validation not only identified HTML generation issues in R, but also Rd
markup problems in add-on packages, often from outputting raw HTML. So
a new check for the validity of the package HTML help pages was added
which is turned on for the CRAN submission checks, and can generally be
activated by setting env var _R_CHECK_RD_VALIDATE_RD2HTML_
to
something true. (This needs HTML tidy available on the system path for
executables. The checks are currently not performed for Rd pages
generated by roxygen2, which still needs updating for the HTML changes.)
These checks report validation problems for the generated HTML, and
relating these to the Rd sources is not always immediate. In such
cases, what seems to work best is calling help(<TOPIC>, help_type = "html")
on a TOPIC from the offending Rd file, and then use the browser
to view the HTML source. This allows to identify the context, and in
fact, often already highlights invalid content.
R help pages now also add the viewport meta tag inside <head>
, which
improves rendering and zooming on mobile devices.
The R.css
stylesheet that is used to style R help pages features
some simple improvements. More usefully, dynamic help now serves the
copy of R.css
located in $R_HOME/doc/html/R.css
rather than the
copy created for every package during installation (the latter is
still necessary for static HTML). This means that any local changes
made to $R_HOME/doc/html/R.css
after R is installed will affect all
help pages subsequently displayed through the dynamic help system. For
example, one might include
@import url("https://cdn.jsdelivr.net/npm/bootstrap@4.1.3/dist/css/bootstrap.min.css");
at the top of R.css
to include the Bootstrap
4
CSS. Note that this does not affect RStudio’s built-in browser, which
uses its own styling.
Many of these enhancements can be disabled by setting the environment
variable _R_HELP_ENABLE_ENHANCED_HTML_
to a false value.