Starting with R 3.6.0 the library() and require() functions allow more control over handling search path conflicts when packages are attached. The policy is controlled by the new conflicts.policy option. This post provides some background and details on this new feature.
Background
When loading a package and attaching it to the search path, conflicts can occur between objects defined in the new package and ones already provided by other packages on the search path. Sometimes these conflicts are intentional on the part of a package author; for example, dplyr provides its own versions of several functions from base and stats:
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
In other cases the conflict is not intended and can cause problems. Students in my classes often load the MASS package after dplyr and then get
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
If select is used at this point, the one from MASS will be used; if the intent was to use the one from dplyr, then this results in error messages that are difficult to understand.
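The masking itself can be reproduced without installing either package. The following is a minimal sketch that uses attach() on two lists to stand in for the two packages; the entry names sim_dplyr and sim_MASS and the toy select() functions are made up for illustration and are not part of any real package.

```r
# Simulate two attached "packages" that both define select().
# sim_dplyr and sim_MASS are hypothetical stand-ins, not real packages.
attach(list(select = function(...) "dplyr-like select"),
       name = "sim_dplyr", warn.conflicts = FALSE)
attach(list(select = function(...) "MASS-like select"),
       name = "sim_MASS", warn.conflicts = FALSE)

# find() lists the search path entries that define the name, in lookup
# order; the most recently attached entry comes first and is the one used.
where <- find("select")
result <- select()

print(where)
print(result)

# Clean up the simulated entries.
detach("sim_MASS", character.only = TRUE)
detach("sim_dplyr", character.only = TRUE)
```

In a real session, find("select") gives the same kind of answer for the actual packages, which is a quick way to check which definition a bare select() call will reach.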
By default, library() and require() do show messages about the conflicts, but they are easy to miss. And they will not be visible at all in an Rmarkdown document if the calls appear in code chunks that do not show their results at all or use the message = FALSE chunk option.
The new mechanism is intended to provide more control over how conflicts are handled, in particular allowing for stricter checks if desired. Different levels of strictness may be appropriate in different contexts. Some contexts, ordered by the level of strictness that might make sense:
- Interactive work in the console.
- Interactive work in a notebook.
- Code in an Rmarkdown or Sweave document.
- Scripts.
- Packages.
Conflicts in packages are already handled by the NAMESPACE mechanism. The others could use some help. Scripts in particular might benefit from being able to specify exactly which objects should be made available from the packages used, much like specifying explicit imports in a package NAMESPACE file.
Another useful distinction is between anticipated and unanticipated conflicts:
Anticipated conflicts occur when a single package is attached that in turn might cause other packages to be attached. The package author will see the messages, and should address any that are not intended. These conflicts should usually not require user intervention.
Unanticipated conflicts arise when a user requests that two packages be attached that may not have been designed to be used together. Conflicts in these cases will typically require the user to make an appropriate choice.
In the initial examples, the conflicts caused by loading dplyr would be anticipated conflicts, but the one caused by then loading MASS would be an unanticipated conflict. For interactive work, as well as code in Rmarkdown documents, it would be useful to be able to use a setting that allows anticipated conflicts, but signals an error for unanticipated conflicts that are not explicitly resolved.
Specifying a Policy
A number of aspects of the conflict handling process can be customized via the conflicts.policy option. The value of this option can be a string naming a standard policy, or a list with named elements specifying policy details. Supported named policies are "strict" and "depends.ok". Policy elements that are supported:
- error: If TRUE, then conflicts produce errors.
- generics.ok: If set, this determines whether masking a function with an S4 generic version is considered a conflict. The default is FALSE for strict checking and TRUE otherwise.
- can.mask: A character vector of names of packages that are allowed to be masked without producing an error. Specifying the base packages can reduce the number of explicit masking approvals needed.
- depends.ok: If TRUE, then allow all conflicts produced within a single package load.
- warn: Sets the default for the warn.conflicts argument to library and require.
A specification that may work for most users who want protection against unanticipated conflicts:
options(conflicts.policy =
            list(error = TRUE,
                 generics.ok = TRUE,
                 can.mask = c("base", "methods", "utils",
                              "grDevices", "graphics",
                              "stats"),
                 depends.ok = TRUE))
This specification assumes that package authors know what they are doing, and all conflicts from loading a package by itself are intentional and OK. With this specification all CRAN and BIOC packages should individually load without error. A shortcut provided for specifying this policy is the name "depends.ok":
options(conflicts.policy = "depends.ok")
A strict policy specification would be
options(conflicts.policy = list(error = TRUE, warn = FALSE))
This can also be specified as the named policy "strict":
options(conflicts.policy = "strict")
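Because the policy is an ordinary option, it can be set at the top of a script or in the setup chunk of an Rmarkdown document and restored afterwards. The sketch below assumes nothing beyond base R's options() and getOption(); the variable names old and current are only illustrative.

```r
# Set a named policy, keeping the previous value so it can be restored.
old <- options(conflicts.policy = "depends.ok")
current <- getOption("conflicts.policy")
print(current)

# ... library() calls that should run under this policy go here ...

# Restore whatever was in effect before (NULL if no policy had been set).
options(old)
```

Saving and restoring the old value keeps the stricter checking local to one script rather than changing behavior for the rest of the session.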
Resolving Conflicts
With a "strict" policy, attempting to load a conflicting package produces an error:
options(conflicts.policy = "strict")
library(dplyr)
## Error: Conflicts attaching package ‘dplyr’:
##
## The following objects are masked from ‘package:stats’:
##
## filter, lag
##
## The following objects are masked from ‘package:base’:
##
## intersect, setdiff, setequal, union
The error signaled with strict checking is of class packageConflictsError with fields package and conflicts.
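Since the error is a classed condition, it can be caught and inspected programmatically. The following is a rough sketch rather than a recommended pattern: it simulates an existing definition by attaching a list under the made-up name sim_pkg, and it uses the splines package only because it ships with R and is not attached by default. It catches a generic error and prints the class vector and the conflicts field instead of assuming a particular class name.

```r
# Simulate an already-attached package that defines bs().
# sim_pkg is a hypothetical name used only for this illustration.
attach(list(bs = function(...) NULL), name = "sim_pkg",
       warn.conflicts = FALSE)

old <- options(conflicts.policy = "strict")
err <- tryCatch({
    library(splines)            # splines also defines bs()
    NULL                        # reached only if no error was signaled
}, error = function(e) e)       # return the condition instead of stopping
options(old)

if (!is.null(err)) {
    print(class(err))           # condition classes of the signaled error
    print(err$conflicts)        # conflicting names, by search path entry
}

# Clean up the simulated entry.
detach("sim_pkg", character.only = TRUE)
```

A handler like this could, for example, turn the conflicts field into a suggested mask.ok or exclude argument.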
Explicitly specifying names that are allowed to mask variables already on the search path avoids the error:
library(dplyr,
        mask.ok = c("filter", "lag",
                    "intersect", "setdiff", "setequal",
                    "union"))
With the "depends.ok" policy, calling library(dplyr) will succeed and produce the usual message about conflicts.
With both "strict" and "depends.ok" policies, loading MASS after dplyr will produce an error:
library(MASS)
## Error: Conflicts attaching package ‘MASS’:
##
## The following object is masked from ‘package:dplyr’:
##
## select
One way to eliminate this error is to exclude select when attaching the MASS exports:
library(MASS, exclude = "select")
The select function from MASS can still be used as MASS::select.
To make it easier to use a strict policy for commonly used packages, and for implicitly attached packages, conflictRules provides a way to specify default mask.ok and exclude values for packages, e.g. with
conflictRules("dplyr",
              mask.ok = c("filter", "lag",
                          "intersect", "setdiff",
                          "setequal", "union"))
conflictRules("MASS", exclude = "select")
To allow masking only variables in specific packages:
library(dplyr, mask.ok = list(stats = c("filter", "lag"),
                              base = c("intersect", "setdiff",
                                       "setequal", "union")))
To allow masking all variables in specific packages:
library(dplyr, mask.ok = list(base = TRUE, stats = TRUE))
library() has some heuristics to avoid warning about S4 overrides. These heuristics mask the overrides in Matrix, but not those in BiocGenerics. These heuristics are disabled by default when strict conflict checking is in force, so they have to be handled explicitly in some way.
This specification would allow Matrix to be loaded with strict checking:
conflictRules("Matrix",
              mask.ok = list(stats = TRUE,
                             graphics = "image",
                             utils = c("head", "tail"),
                             base = TRUE))
Similarly, this will work for BiocGenerics:
conflictRules("BiocGenerics",
              mask.ok = list(base = TRUE,
                             stats = TRUE,
                             parallel = TRUE,
                             graphics = c("image", "boxplot"),
                             utils = "relist"))
In the future it might be worth considering whether this information could be encoded in a package’s DESCRIPTION file.
Additional Strictness
The arguments include.only and attach.required have been added to library() and require() to provide additional control in a script using a "strict" policy:
- If include.only is supplied as a character vector, then only variables named there are included in the attached frame.
- Packages specified in the Depends field of the DESCRIPTION file are only attached when attach.required is TRUE. The default value of attach.required is TRUE if include.only is missing, and FALSE if include.only is supplied.
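As a small illustration of include.only, the sketch below attaches a single function from the splines package. The choice of splines and of ns() is arbitrary, made only because the package ships with R; the variable names are illustrative, and the example assumes splines is not already attached in the session.

```r
# Attach only ns() from splines; other exports stay off the search path.
library(splines, include.only = "ns")

# Names made visible on the search path by this call.
attached <- ls("package:splines")
print(attached)

# Other exports remain reachable with an explicit :: prefix.
still_reachable <- is.function(splines::bs)
print(still_reachable)
```

This is close in spirit to listing explicit imports in a package NAMESPACE file: the script states exactly which names it takes from each package.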
Other Approaches
Other approaches to handling conflicts are provided by packages conflicted and import; there are also several packages by the name modules.
The conflicted package arranges to signal an error when evaluating a global variable that is defined by multiple packages on the search path, unless an explicit resolution of the conflict has been specified.
Being able to ask for stricter handling of conflicts is definitely a good idea, but it seems it would more naturally belong in base R than in a package. There are several downsides to the package approach, at least as currently implemented in conflicted, including being fairly heavy-weight, confusing find, and not handling packages that have functions that do things like
foo <- function(x) { require(bar); ... }
Not that this is a good idea, but it is out there.
The approach taken in conflicted of signaling an error only when a symbol is used might be described as a dynamic approach. This dynamic approach could be implemented more efficiently in base R by adjusting the global cache mechanism. The mechanism introduced in R 3.6.0 in contrast insists that conflicts be resolved at the point where the library() or require() call occurs. This might be called a static or declarative approach.
To think about these options it is useful to go back to the range of
activities that might need increasing levels of strictness in conflict
handling listed at the beginning of this post. For Rmarkdown
documents and scripts the static approach seems clearly better, as
long as a good set of tools is available to express conflict
resolution concisely. For notebooks I think the static approach is
probably better as well. For interactive use it may be a matter of
personal preference. Once I have decided I want help with conflicts,
my preference would be to be forced to resolve them on package load
rather than to have my work flow interrupted at random points to go
and sort things out.