Dear Emacs, please make this -*-Text-*- mode!

        **************************************************
        *                                                *
        *                2.6 SERIES NEWS                 *
        *                                                *
        **************************************************

                CHANGES IN R VERSION 2.6.2

NEW FEATURES

    o colnames(DF) is now also fast for large dataframes DF with automatic row.names. Note that the correct usage is names(DF). (PR#10470)

    o tools::texi2dvi() works around the failure of 'texi2dvi --quiet' to be quiet in texinfo 4.11.

    o On Linux, parallel 32/64-bit installations are supported using multilib.

BUG FIXES

    o A compilation problem on one system where glob was not found has been corrected. (PR#10468)

    o The "profile.nls" method of plot() was losing the x axis labels.

    o array() computed the total number of entries in the array before coercing the dimensions to integer. (Reported by Allen McIntosh.)

    o persp() misreported errors in the y parameter. (Reported by Allen McIntosh.)

    o source("clipboard", echo=TRUE) and file("clipboard", open="rt") gave spurious errors. (Reported by Fernando Saldanha.)

    o attributes<-() stripped any existing attributes before checking that all elements of the right-hand side had names.

    o rbinom(n, size, *) gave NaN when 'size > .Machine$integer.max'.

    o print.summary.lm() is now consistent in the capitalization of "R-squared".

    o confint() misreported on some rank-deficient lm() models. (PR#10496) This could also occur in the default method.

    o \code{\var{}} was not rendered correctly to latex in Rd files for non-alphabetic arguments.

    o In 2.6.1, curve(*, add=TRUE) used a wrong default 'xlim' when x coordinates were logged.

    o The Java-based search in help.start() now only requires a JVM >= 1.4 (2.6.1 accidentally required >= 1.5).

    o The default method for range() was omitting 'na.rm' for non-numeric objects such as those of class "Date". (PR#10508)

    o cut(x, breaks=) misbehaved on a constant vector of negative values.

    o bxp(), the plotting engine of boxplot(), no longer plots staple ticks multiple times.
      (PR#10499)

    o The automatic detection of the domain for message translation was not working correctly for messages in message(), warning() and stop() in packages other than 'base'.

    o The profile.nls() function misbehaved when encountering non-convergence of the "port" algorithm.

    o Under certain rare circumstances in R 2.6.x, log(), round() and trunc() could alter their arguments in the caller. This involved passing of empty '...' arguments, and was spotted when using apply(x, 2, log).

    o par() no longer warns unnecessarily when asked to set new=FALSE on an unused graphics device.

    o plot.formula() was not passing on '...' when used with a one-sided formula. plot.formula() was not accepting expressions for annotations passed to title(). (PR#10525)

    o pchisq(x, df=0, ncp=L) now returns the correct limit exp(-L/2) for x=0 and is no longer returning NaN for x > 0, L < 80. (PR#10551)

    o Non-ASCII characters were only working correctly in Hershey fonts if these were specified by the 'vfont' argument to text() and not if specified as a font family.

    o There were several errors in Hershey$allowed, but the help page listed the allowed combinations correctly.

    o text() no longer attempts to use 'vfont' with an expression for 'labels' (it was documented not to work).

    o fisher.test(simulate.p.value = TRUE) gave incorrect answers in some extremely degenerate problems. (PR#10558)

    o src/extra/pcre has been updated to PCRE 7.5 (bugfix release).

    o capture.output() completes an incomplete final line of output when file = NULL. (PR#10534)

    o capture.output() now returns invisibly if output is written to a file/connection.

    o format.AsIs() did not remove the "AsIs" class and so could go into an infinite loop.

    o summary.mlm() lost the names of the coefficients when there was only one.

    o Rdconv was not marking examples files with an encoding if this was known from the package's DESCRIPTION file.

    o readChar() from a raw vector was reading a number of bytes, not characters.
    o slotNames() was erroneously treating classes that extend "character" as strings.

    o R no longer ignores SIGPIPE signals even in processes launched by system(). Instead PR#1959 is handled by a simple error handler which will give an error message in circumstances where none was given before.

    o The AIC() S4 generic in package stats4 no longer disables dispatch of S3 methods for AIC().

    o The conflicts check in library() excluded all S4 generics, even where they were unrelated to the function masked. It is now more selective (although still too generous to S4 generics).

    o proc.time() was missing a protect and could misbehave if provoked by gctorture(). (PR#10600)

    o The cut() and hist() methods for dates and datetimes are now more accurate for intervals of "months" and "years", thanks to Marc Schwarz.

    o url()/download.file() could segfault if the HTTP interaction involved a redirect to an address starting with '/' on the same server.

    o Memory allocations used in format() and in an internal utility function could be off by one byte. (PR#10635)

    o isoreg(x, y) no longer segfaults when y has NAs.

    o split(x, g) always returns a list as documented. (It used to return NULL for a zero-length 'x'.)

    o tapply(x, g, ...) misbehaved if the args were of zero length. (PR#10644)

    o hist.POSIXt(*, xaxt = "n") no longer suppresses the y-axis.

    o strptime() crashed under certain locales on Mac OS X.

    o gregexpr() no longer segfaults when "" is given as the search pattern. Thanks to Herve Pages for the bug report.

    o matplot(x, *) with default 'pch' did not plot columns from column number 37 on (because the default pch was NA for those). (PR#10676)

    o print.htest() lost output when used within sink(file, split=TRUE).

    o Setting par(col.main=) also set par("col") to the same colour.

    o Anonymous fifos were broken (again).

                CHANGES IN R VERSION 2.6.1

NEW FEATURES

    o The "data.frame" and "factor" methods for [[ now support the 'exact' argument introduced in 2.6.0.
    o plot.lm() gains a new argument 'cex.caption' to allow the size of the captions to be controlled.

    o A series of changes make the CHARSXP cache introduced in 2.6.0 faster (and in some cases many times faster) in sessions with a large number (e.g. a million) of unique character strings, and also if there are many empty strings.

    o embedFonts(), bitmap() and dev2bitmap() explicitly turn off auto-rotation in Ghostscript when generating PDF.

    o The canonical architecture is no longer checked when loading packages using a non-empty sub-architecture, since it is possible to (e.g.) build packages for i386-pc-linux-gnu on both that architecture and on x86_64-unknown-linux-gnu.

    o Deparsing will (if option warnIncomplete is set) warn on strings longer than the parser limit (8192 bytes).

    o url() now uses the UserAgent header in http transactions in the same way as download.file() (making use of option "HTTPUserAgent").

BUG FIXES

    o iconv() is again able to translate character strings with embedded nuls (such as those in UCS-2).

    o new.packages() and update.packages() failed when called on an empty library, since old.packages() threw an error. old.packages() now returns NULL (as documented) in that case.

    o Builds on Mac OS X 10.4 or higher now allocate enough space in the binary headers to relocate dependent libraries into the framework.

    o R CMD build now computes the exclusion list on the copy it makes: this avoids problems if the original sources contain symbolic links (which are resolved in the copy). Thanks to Michael Lawrence for diagnosis and patch.

    o object.size() had slightly too low a size for objects of type "S4".

    o symbol() in plotmath expressions was only accepting valid character strings, which made it impossible to specify symbols such as aleph (obtained by symbol("\300")) in a UTF-8 locale.

    o An event handling issue caused autorepeat functions to misbehave with tcltk (notably scrollbars).
    o plot(sin, -5, 5) gives ylab 'sin(x)' again, where it resulted in 'x(x)' in 2.6.0. Further, plot(sin) again plots from [0,1] also in cases where a previously used coordinate system differs.

    o curve() with unspecified 'from', 'to' and 'xlim' now reuses the previous x limits, and not slightly larger ones.

    o It was intended that R code filenames in packages should start with an ASCII letter or digits (and R CMD INSTALL uses that), but the test used in R CMD build ([A-Za-z0-9]) was locale-specific (and excluded t to y in Estonian, for example). (PR#10351)

    o 'R CMD build' could misbehave when faced with files with CRLF line endings *and* no line ending on the final line of the file, removing the last byte of the file.

    o DF[i, j] failed in 2.6.0 if j was a logical vector selecting a single column.

    o Unix x11() would fail if a valid 'display' was specified but DISPLAY was unset. (PR#10379)

    o postscript() was not always ignoring .Postscript.Options in the workspace (where it should not have occurred).

    o help.search() would give an error if it found a badly installed package, even if 'package' was not specified.

    o tclServiceMode() (package tcltk) now works under Unix-alikes. (Although documented, it used only to work under Windows.)

    o As Mac OS X 10.5.x comes with incompatible /bin/sh shell, we force SHELL=/bin/bash (which is ok) in that case. [Only for 2.6.x: another solution is used in 2.7.0.]

    o Deliberately using malformed source attributes no longer causes deparsing/printing of functions to crash R. (PR#10437)

    o R CMD check and R CMD INSTALL now work with (some) directory names containing spaces.

    o choose(n, k) gave incorrect values for negative n and small k.

    o plot.ts(x,y) could use wrong default labels; fixed thanks to Antonio, Fabio di Narzo.

    o reshape() got column names out of sync with contents in some cases; found by Antonio, Fabio Di Narzo.

    o ar(x) for short 'x' (i.e.
      length <= 10) could fail because the default 'order.max' was >= length(x) which is non-sensical.

    o Keyboard events in getGraphicsEvent() could cause stack imbalance errors. (PR#10453)

                CHANGES IN R VERSION 2.6.0

SIGNIFICANT USER-VISIBLE CHANGES

    o integrate(), nlm(), nlminb(), optim(), optimize() and uniroot() now have '...' much earlier in their argument list. This reduces the chances of unintentional partial matching but means that the later arguments must be named in full.

    o The default type for nchar() is now "chars". This is almost always what was intended, and differs from the previous default only for non-ASCII strings in a MBCS locale. There is a new argument 'allowNA', and the default behaviour is now to throw an error on an invalid multibyte string if type="chars" or type="width".

    o Connections will be closed if there is no R object referring to them. A warning is issued if this is done, either at garbage collection or if all the connection slots are in use.

NEW FEATURES

    o abs(), sign(), sqrt(), floor(), ceiling(), exp() and the gamma, trig and hyperbolic trig functions now only accept one argument even when dispatching to a Math group method (which may accept more than one argument for other group members).

    o abbreviate() gains a 'method' argument with a new option "both.sides" which can make shorter abbreviations.

    o aggregate.data.frame() no longer changes the group variables into factors, and leaves alone the levels of those which are factors. (Inter alia grants the wish of PR#9666.)

    o The default 'max.names' in all.names() and all.vars() is now -1 which means unlimited. This fixes PR#9873.

    o as.vector() and the default methods of as.character(), as.complex(), as.double(), as.expression(), as.integer(), as.logical() and as.raw() no longer duplicate in most cases where the object is unchanged. (Beware: some code has been written that invalidly assumes that they do duplicate, often when using .C/.Fortran(DUP=FALSE).)
    o as.complex(), as.double(), as.integer(), as.logical() and as.raw() are now primitive and internally generic for efficiency. They no longer dispatch on S3 methods for as.vector() (which was never documented). as.real() and as.numeric() remain as alternative names for as.double(). expm1(), log(), log1p(), log2(), log10(), gamma(), lgamma(), digamma() and trigamma() are now primitive. (Note that logb() is not.) The Math2 and Summary groups (round, signif, all, any, max, min, sum, prod, range) are now primitive. See under METHODS PACKAGE below for some consequences for S4 methods.

    o apropos() now sorts by name and not by position on the search path.

    o attr() gains an 'exact = TRUE' argument to disable partial matching.

    o bxp() now allows 'xlim' to be specified. (PR#9754)

    o C(f, SAS) now works in the same way as C(f, treatment), etc.

    o chol() is now generic.

    o dev2bitmap() has a new option to go via PDF and so allow semi-transparent colours to be used.

    o dev.interactive() regards devices with the displaylist enabled as interactive, and packages can register the names of their devices as interactive via deviceIsInteractive().

    o download.packages() and available.packages() (and functions which use them) now support in 'repos' or 'contriburl' either file: plus a general path (including drives on a UNC path on Windows) or a file:/// URL in the same way as url().

    o dQuote() and sQuote() are more flexible, with rendering controlled by the new option 'useFancyQuotes'. This includes the ability to have TeX-style rendering and directional quotes (the so-called 'smart quotes') on Windows. The default is to use directional quotes in UTF-8 locales (as before) and in the Rgui console on Windows (new).

    o duplicated() and unique() and their methods in base gain an additional argument 'fromLast'.

    o fifo() no longer has a default 'description' argument. fifo("") is now implemented, and works in the same way as file("").
    o file.edit() and file.show() now tilde-expand file paths on all interfaces (they used to on some and not others).

    o The find() argument is now named 'numeric' and not 'numeric.': the latter was needed to avoid warnings about name clashes many years ago, but partial matching was used.

    o stats:::.getXlevels() confines attention to factors since some users expected R to treat unclass(<factor>) as a numeric vector.

    o grep(), strsplit() and friends now warn if incompatible sets of options are used, instead of silently using the documented priority.

    o gsub()/sub() with perl = TRUE now preserves attributes from the argument x on the result.

    o is.finite() and is.infinite() are now S3 and S4 generic.

    o jpeg(), png(), bmp() (Windows), dev2bitmap() and bitmap() have a new argument 'units' to specify the units of 'width' and 'height'.

    o levels() is now generic (levels<- has been for a long time).

    o Loading serialized raw objects with load() is now considerably faster.

    o New primitive nzchar() as a faster alternative to nchar(x) > 0 (and avoids having to convert to wide chars in a MBCS locale and hence consider validity).

    o The way old.packages() and hence update.packages() handle packages with different versions in multiple package repositories has been changed. Previously the first package encountered was selected; now the one with the highest version number is.

    o optim(method = "L-BFGS-B") now accepts zero-length parameters, like the other methods. Also, method = "SANN" no longer attempts to optimize in this case.

    o New options 'showWarnCalls' and 'showErrorCalls' to give a concise traceback on warnings and errors. showErrorCalls=TRUE is the default for non-interactive sessions. Option 'showNCalls' controls how abbreviated the call sequence is.

    o New options 'warnPartialMatchDollar', 'warnPartialMatchArgs' and 'warnPartialMatchAttr' to help detect the unintended use of partial matching in $, argument matching and attr() respectively.
    o A device named as a character string in options(device =) is now looked for in the grDevices name space if it is not visible from the global environment.

    o pmatch(x, y, duplicates.ok = TRUE) now uses hashing and so is much faster for large x and y when most matches are exact.

    o qr() is now generic.

    o It is now a warning to have a non-integer object for .Random.seed: this indicates a user had been playing with it, and it has always been documented that users should only save and restore it.

    o New higher-order functions Reduce(), Filter() and Map().

    o [g]regexpr() gain an 'ignore.case' argument for consistency with grep(). (This does change the positional matching of arguments, but no instances of positional matching beyond the second were found.)

    o relist() utility, an S3 generic with several methods, providing an 'inverse' for unlist(); thanks to a code proposal from Andrew Clausen.

    o require() now returns invisibly.

    o The interface to reshape() has been revised, allowing some simplified forms that did not work before, and somewhat improved error handling. A new argument 'sep' has been introduced to replace simple usages of 'split' (the old features are retained).

    o rmultinom() uses a high-precision accumulator where available, and so is more likely to give the same result on different platforms (although it is still possible to get different results, and the result may differ from previous versions of R).

    o row() and col() now work on matrix-like objects such as data frames, not just matrices.

    o Rprof() allows smaller values of 'interval' on machines that support it: for example modern Linux systems support interval = 0.001.

    o sample() now requires its first argument 'x' to be numeric (in the sense of is.numeric()) as well as of length 1 and >= 1 before it is regarded as shorthand for 1:x.

    o sessionInfo() now provides details about package name spaces that are loaded but not attached.
      The output of sessionInfo has been improved to make it easier to read when it is inadvertently wrapped after being pasted into an email message.

    o setRepositories() has a new argument 'ind' to allow selections to be made programmatically.

    o sprintf() no longer has an output string length limit.

    o storage.mode<- is now primitive, and hence makes fewer copies of an object (none if the mode is unchanged). It is a little less general than mode<-, which remains available. (See also the entry under DEFUNCT below.)

    o sweep() gains an argument 'check.margin = TRUE' which warns about mismatched dimensions.

    o The mathematical annotation facility (plotmath) now recognises a symbol() function which forces the font to be a symbol font. This allows access to all characters in the Adobe Symbol encoding within plotmath expressions.

    o For OSes that cannot unset environment variables, Sys.unsetenv() sets the value to "", with a warning.

    o New function Sys.which(), an interface to 'which' on Unix-alikes and an emulation on Windows.

    o On Unix-alikes, system(, intern = TRUE) reports on very long lines that may be truncated, giving the line number of the content being read.

    o termplot() has a default for 'ask' that uses dev.interactive(). It allows 'ylim' to be set, or computed to cover all the plots to be made (the new default) or computed for each plot (the previous default).

    o uniroot(f, *) is slightly faster for non-trivial f() because it computes f(lower) and f(upper) only once, and it has new optional arguments 'f.lower' and 'f.upper' by which the caller can pass these.

    o unlink() is now internal, using common POSIX code on all platforms.

    o unsplit() now works with lists of dataframes.

    o The vcov() methods for classes "gls" and "nlme" have migrated to package 'nlme'.

    o vignette() has a new argument 'all' to choose between showing vignettes in attached packages or in all installed packages.
    o New function within(), which is like with(), except that it returns modified versions of lists and data frames.

    o X11(), postscript() (and hence bitmap()), xfig(), jpeg(), png() and the Windows devices win.print(), win.metafile() and bmp() now warn (once at first use) if semi-transparent colours are used (rather than silently treating them as fully transparent).

    o New function xspline() to provide base graphics support of X-splines (cf grid.xspline).

    o New function xyTable() does the 2D gridding "computations" used by sunflowerplot().

    o Rd conversion to HTML and CHM now makes use of classes, which are set in the stylesheets. Editing R.css will change the styles used for \env, \option, \pkg etc. (CHM styles are set at compilation time.)

    o The documented arguments of '%*%' have been changed to be x and y, to match S and the implicit S4 generic.

    o If members of the Ops group (the arithmetic, logical and comparison operators) and '%*%' are called as functions, e.g. '>'(x, y), positional matching is always used. (It used to be the case that positional matching was used for the default methods, but names would be matched for S3 and S4 methods and in the case of '!' the argument name differed between S3 and S4 methods.)

    o Imports environments of name spaces are named (as "imports:foo"), and so are known e.g. to environmentName().

    o Package 'stats4' uses lazy-loading not SaveImage (which is now deprecated).

    o Installing help for a package now parses the .Rd file only once, rather than once for each type.

    o PCRE has been updated to version 7.2.

    o bzip2 has been updated to version 1.0.4.

    o gettext has been updated to version 0.16.1.

    o There is now a global CHARSXP cache, R_StringHash. CHARSXPs are no longer duplicated and must not be modified in place. Developers should strive to only use mkChar (and mkString) for creating new CHARSXPs and avoid use of allocString.
      A new macro, CallocCharBuf, can be used to obtain a temporary char buffer for manipulating character data. This patch was written by Seth Falcon.

    o The internal equivalents of as.complex, as.double, as.integer and as.logical used to handle length=1 arguments now accept character strings (rather than report that this is 'unimplemented').

    o Lazy-loading a package is now substantially more efficient (in memory saved and load time).

    o Various performance improvements lead to a 45% reduction in the startup time without 'methods' (and one-sixth with - 'methods' now takes 75% of the startup time of a default session).

    o The [[ subsetting operator now has an argument 'exact' that allows programmers to disable partial matching (which will in due course become the default). The default value is exact=NA which causes a warning to be issued when partial matching occurs. When exact = TRUE, no partial matching will be performed. When exact = FALSE, partial matching can occur and no warning will be issued. This patch was written by Seth Falcon.

    o Many of the C-level warning / error messages (e.g. from subscripting) have been re-worked to give more detailed information on either the location or the cause of the problem.

    o The S3 and S4 Math groups have been harmonized. Functions log1p(), expm1(), log10() and log2() are members of the S3 group, and sign(), log1p(), expm1(), log2(), cummax(), cummin(), digamma(), trigamma() and trunc() are members of the S4 group. gammaCody() is no longer in the S3 Math group. They are now all primitive.

    o The initialization of the random-number stream makes use of the sub-second part of the current time where available. Initialization of the 1997 Knuth TAOCP generator is now done in R code, avoiding some C code whose licence status has been questioned.

    o The reporting of syntax errors has been made more user-friendly.
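A few of the 2.6.0 features listed above can be tried in a short session. The following is only an illustrative sketch: the object names (total, evens, sums, df, lst and so on) are made up for the example, and the results noted in the comments are those of a current R release, which may differ in detail from 2.6.0 itself.

```r
## Higher-order functions Reduce(), Filter() and Map()
total <- Reduce(`+`, 1:5)                        # running sum folded to 15
evens <- Filter(function(v) v %% 2 == 0, 1:10)   # keeps 2 4 6 8 10
sums  <- Map(function(a, b) a + b, 1:3, 4:6)     # a list: 5, 7, 9

## nzchar() as an alternative to nchar(x) > 0
nonempty <- nzchar(c("a", "", "bc"))             # TRUE FALSE TRUE

## within() returns a modified copy of the data frame
df  <- data.frame(x = 1:3)
df2 <- within(df, y <- x * 2)

## The 'exact' argument of [[ controls partial matching
lst     <- list(alpha = 1)
partial <- lst[["al", exact = FALSE]]            # partial match allowed
strict  <- lst[["al", exact = TRUE]]             # no match, so NULL

## 'fromLast' scans for duplicates from the end of the vector
dups <- duplicated(c(1, 2, 1), fromLast = TRUE)  # TRUE FALSE FALSE

## Ask for a warning when $ relies on partial matching
old <- options(warnPartialMatchDollar = TRUE)
x   <- list(value = 1)
msg <- tryCatch(x$val, warning = function(w) conditionMessage(w))
options(old)                                     # restore previous setting
```

Restoring the saved options at the end keeps the warning setting from leaking into the rest of the session.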
METHODS PACKAGE

    o Packages using 'methods' have to have been installed in R 2.4.0 or later (when various internal representations were changed).

    o Internally generic primitives no longer dispatch S4 methods on S3 objects.

    o load() and restoring a workspace attempt to detect and warn on the loading of pre-2.4.0 S4 objects.

    o Making functions primitive changes the semantics of S4 dispatch: these no longer dispatch on classes based on types but do dispatch whenever the function in the base name space is called. This applies to as.complex(), as.integer(), as.logical(), as.numeric(), as.raw(), expm1(), log(), log1p(), log2(), log10(), gamma(), lgamma(), digamma() and trigamma(), as well as the Math2 and Summary groups. Because all members of the group generics are now primitive, they are all S4 generic and setting an S4 group generic does at last apply to all members and not just those already made S4 generic. as.double() and as.real() are identical to as.numeric(), and now remain so even if S4 methods are set on any of them. Since 'as.numeric' is the traditional name used in S4, currently methods must be exported from a NAMESPACE for 'as.numeric' only.

    o The S4 generic for '!' has been changed to have signature (x) (was (e1)) to match the documentation and the S3 generic. setMethod() will fix up methods defined for (e1), with a warning.

    o The "structure" S4 class now has methods that implement the concept of structures as described in the Blue Book--that element-by-element functions and operators leave structure intact unless they change the length. The informal behavior of R for vectors with attributes was inconsistent.

    o The implicitGeneric() function and relatives have been added to specify how a function in a package should look when methods are defined for it. This will be used to ensure that generic versions of functions in R core are consistent. See ?implicitGeneric.
    o Error messages generated by some of the functions in the methods package provide the name of the generic to provide more contextual information.

    o It is now possible to use setGeneric(useAsDefault = FALSE) to define a new generic with the name of a primitive function (but having no connection with the primitive).

    o showMethods() has a "smart" default for 'inherited' such that showMethods(, incl = TRUE) becomes a useful short cut.

DEPRECATED & DEFUNCT

    o $ on an atomic vector now gives a warning that it is 'invalid'. It remains deprecated, but may be removed in R >= 2.7.0.

    o storage.mode(x) <- "real" and storage.mode(x) <- "single" are defunct: use instead storage.mode(x) <- "double" and mode(x) <- "single".

    o In package installation, SaveImage: yes is deprecated in favour of LazyLoad: yes.

    o seemsS4Object (methods package) is deprecated in favour of isS4().

    o It is planned that [[exact=TRUE]] will become the default in R 2.7.0.

UTILITIES

    o checkS3methods() (invoked by R CMD check) now checks the arguments of methods for primitive members of the S3 group generics.

    o R CMD check now does a recursive copy on the 'tests' directory.

    o R CMD check now warns on non-ASCII .Rd files without an \encoding field, rather than just on ones that are definitely not from an ISO-8859 encoding. This agrees with the long-standing stipulation in 'Writing R Extensions', and catches some packages with UTF-8 man pages.

    o R CMD check now warns on DESCRIPTION files with a non-portable Encoding field, or with non-ASCII data and no Encoding field.

    o R CMD check now loads all the 'Suggests' and 'Enhances' dependencies to reduce warnings about non-visible objects, and also emulates standard functions (such as shell()) on alternative R platforms.

    o R CMD check now (by default) attempts to latex the vignettes rather than just weave and tangle them: this will give a NOTE if there are latex errors.
    o R CMD check computations no longer ignore Rd \usage entries for functions for extracting or replacing parts of an object, so S3 methods should use the appropriate \method{} markup.

    o R CMD check now checks for CR (as well as CRLF) line endings in C/C++/Fortran source files, and for non-LF line endings in Makefile[.in] and Makevars[.in] in the package 'src' directory. R CMD build will correct non-LF line endings in source files and in the make files mentioned.

    o Rdconv now warns about unmatched braces rather than silently omitting sections containing them. (Suggestion by Bill Dunlap, PR#9649) Rdconv now renders (rather than ignores) \var{} inside \code{} markup in latex conversion. R CMD Rdconv gains a --encoding argument to set the default encoding for conversions.

    o The list of CRAN mirrors now has a new (manually maintained) column "OK" which flags mirrors that seem to be OK, only those are used by chooseCRANmirror(). The now exported function getCRANmirrors() can be used to get all known mirrors or only the ones that are OK.

    o R CMD SHLIB gains arguments --clean and --preclean to clean up intermediate files after and before building.

    o R CMD config now knows about FC and FCFLAGS (used for F9x compilation).

    o R CMD Rdconv now does a better job of rendering quotes in titles in HTML, and \sQuote and \dQuote into text on Windows.

C-LEVEL FACILITIES

    o New utility function alloc3DArray similar to allocMatrix.

    o The entry point R_seemsS4Object in Rinternals.h has not been needed since R 2.4.0 and has been removed. Use IS_S4_OBJECT instead.

    o Applications embedding R can use R_getEmbeddingDllInfo() to obtain DllInfo for registering symbols present in the application itself.

    o The instructions for making and using standalone libRmath have been moved to the R Installation and Administration manual.

    o CHAR() now returns (const char *) since CHARSXPs should no longer be modified in place. This change allows compilers to warn or error about improper modification.
      Thanks to Herve Pages for the suggestion.

    o acopy_string is a (provisional) new helper function that copies character data and returns a pointer to memory allocated using R_alloc. This can be used to create a copy of a string stored in a CHARSXP before passing the data on to a function that modifies its arguments.

    o asLogical, asInteger, asReal and asComplex now accept STRSXP and CHARSXP arguments, and asChar accepts CHARSXP.

    o New entry point R_GE_str2col listed in R_ext/GraphicsEngine.h for external graphics device developers.

    o doKeybd and doMouseevent are now exported in GraphicsDevice.h.

    o R_alloc now has first argument of type 'size_t' to support 64-bit platforms (e.g. Win64) with a 32-bit 'long' type.

    o The type of the last two arguments of getMatrixDimnames (non-API but mentioned in R-exts.texi and in Rinternals.h) has been changed to 'const char **' (from char **).

    o R_FINITE now always resolves to the function call R_finite in packages (rather than sometimes substituting isfinite). This avoids some issues where R headers are called from C++ code using features tested on the C compiler.

    o The advice to include R headers from C++ inside extern "C" {} has been changed. It is nowadays better *not* to wrap the headers, as they include other headers which on some OSes should not be wrapped.

    o Rinternals.h no longer includes a substantial set of C headers. All but ctype.h and errno.h are included by R.h which is supposed to be used before Rinternals.h.

    o Including C system headers can be avoided by defining NO_C_HEADERS before including R headers. This is intended to be used from C++ code, and you will need to include C++ equivalents such as <cmath> before the R headers.

INSTALLATION

    o The 'test-Lapack' test is now part of 'make check'.

    o The 'stat' system call is now required, along with 'opendir' (which had long been used but not tested for). ('make check' would have failed in earlier versions without these calls.)
    o 'evince' is now considered as a possible PDF viewer.

    o 'make install-strip' now also strips the DLLs in the standard packages.

    o Perl 5.8.0 (released in July 2002) or later is now required. (R 2.4.0 and later have in fact required 5.6.1 or later.)

    o The C function 'finite' is no longer used: we expect a C99 compiler which will have 'isfinite'. (If that is missing, we test separately for NaN, Inf and -Inf.)

    o A script/executable 'texi2dvi' is now required on Unix-alikes: it is part of the texinfo distribution.

    o Files texinfo.tex and txi-en.tex are no longer supplied in doc/manual (as the latest versions have an incompatible licence). You will need to ensure that your texinfo and/or TeX installations supply them.

    o wcstod is now required for MBCS support.

    o There are some experimental provisions for building on Cygwin.

PACKAGE INSTALLATION

    o The encoding declared in the DESCRIPTION file is now used as the default encoding for .Rd files.

    o A standard for specifying package license information in the DESCRIPTION License field was introduced, see 'Writing R Extensions'. In addition, files LICENSE or LICENCE in a package top-level source directory are now installed (so putting copies into the 'inst' subdirectory is no longer necessary).

    o install.packages() on a Unix-alike now updates doc/html/packages.html only if packages are installed to .Library (by that exact name).

    o R CMD INSTALL --clean now runs SHLIB --clean to do the clean up (unless there is a src/Makefile), and this will remove $(OBJECTS) (which might have been redefined in Makevars). R CMD INSTALL --preclean cleans up the sources after a previous installation (as if that had used --clean) before attempting to install. R CMD INSTALL will now run R CMD SHLIB in the 'src' directory if src/Makevars is present, even if there are no source files with known extensions.
o If there is a file src/Makefile, src/Makevars is now ignored (it could be included by src/Makefile if desired), and it is preceded by etc/Makeconf rather than share/make/shlib.mk. Thus the makefiles read are R_HOME/etc/Makeconf, src/Makefile in the package and then any personal Makevars files. o R CMD SHLIB used to support the use of 'OBJS' in Makevars, but this was changed to 'OBJECTS' in 2001. The undocumented alternative of 'OBJS' has finally been removed. o R CMD check no longer issues a warning about no data sets being present if a lazyload db is found (as determined by the presence of Rdata.rdb, Rdata.rds, and Rdata.rdx in the 'data' subdirectory). BUG FIXES o charmatch() and pmatch() used to accept non-integer values for 'nomatch' even though the return value was documented to be integer. Now 'nomatch' is coerced to integer (rather than the result being coerced to the type of 'nomatch'). o match.call() no longer 'works' outside a function unless 'definition' is supplied. (Under some circumstances it used to 'work', matching itself.) o The formula methods of boxplot, cdplot, pairs and spineplot now attach 'stats' so that model.frame() is visible where they evaluate it. o Date-time objects are no longer regarded as numeric by is.numeric(). o methods("Math") did not work if 'methods' was not attached. o readChar() read an extra empty item (or more than one) beyond the end of the source; in some conditions it would terminate early when reading an item of length 0. o Added a promise evaluation stack so interrupted promise evaluations can be restarted. o R.version[1:10] now prints nicely. o In the methods package, prototypes are now inherited for the .Data "slot"; i.e., for classes that contain one of the basic data types. o [[i, j]] now works if 'i' is character. o write.dcf() no longer writes NA fields (PR#9796), and works correctly on empty descriptions.
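A minimal sketch of two of the bug fixes listed above (the 'nomatch' coercion in pmatch()/charmatch() and the is.numeric() change for date-time objects), assuming a current R session. The variable names are illustrative only.

```r
# 'nomatch' is supplied as a double here; it is coerced to integer,
# so the result keeps its documented integer type.
p <- pmatch("zz", c("alpha", "beta"), nomatch = 0)
ch <- charmatch("zz", c("alpha", "beta"), nomatch = 0)
p_type <- typeof(p)
ch_type <- typeof(ch)

# Date-time objects are not regarded as numeric.
datetime_is_numeric <- is.numeric(Sys.time())

print(c(pmatch = p_type, charmatch = ch_type))
print(datetime_is_numeric)
```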
o pbeta(x, log.p = TRUE) now has improved accuracy in many cases, and so have functions depending on it such as pt(), pf() and pbinom(). o mle() had problems with the L-BFGS-B in the no-parameter case and consequently also when profiling 1-parameter models (fix thanks to Ben Bolker). o Two bugs fixed in methods that involve the "..." argument in the generic function: previously failed to catch methods that just dropped the "..."; and use of callGeneric() with no arguments failed in some circumstances when "..." was a formal argument. o sequence() now behaves more reasonably, although not back-compatibly for zero or negative input. o nls() now allows more peculiar but reasonable ways of being called, e.g., with data=list() or a model without variables. o match.arg() was not behaving as documented when several.ok=TRUE (PR#9859), gave spurious warnings when 'arg' had the wrong length and was incorrectly documented (exact matches are returned even when there is more than one partial match). o The data.frame method for split<-() was broken. o The test for -D__NO_MATH_INLINES was badly broken and returned true on all non-glibc platforms and false on all glibc ones (whether they were broken or not). o LF was missing after the last prompt when --quiet was used without --slave. Use --slave when no final LF is desired. o Fixed bug in initialisation code in 'grid' package for determining the boundaries of shapes. Problem reported by Hadley Wickham; symptom was error message: "Polygon edge not found". o str() is no longer slow for large POSIXct objects. Its output is also slightly more compact for such objects; implementation via new optional argument 'give.head'. o strsplit(*, fixed=TRUE), potentially iconv() and internal string formatting are now faster for large strings, thanks to report PR#9902 by John Brzustowski. o de.restore() gave a spurious warning for matrices. (Ben Bolker) o plot(fn, xlim=c(a,b)) would not set "from" and "to" properly when plotting a function.
The argument lists to curve() and plot.function() have been modified slightly as part of the fix. o julian() was documented to work with POSIXt origins, but did not work with POSIXlt ones. (PR#9908) o Dataset HairEyeColor has been corrected to agree with Friendly (2000): the change involves the breakdown of the Brown hair / Brown eye cell by Sex, and only totals over Sex are given in the original source. o Trailing spaces are now consistently stripped from \alias{} entries in .Rd files, and this is now documented. (PR#9915) o .find.packages(), packageDescription() and sessionInfo() assumed that attached environments named "package:foo" were package environments, although misguided users could use such a name in attach(). o spline() and splinefun() with method = "periodic" could return incorrect results when length(x) was 2 or 3. o getS3method() could fail if the method name contained a regexp metacharacter such as "+". o help() now uses the name and not the value of the vector unless it has length exactly one, so e.g. help(letters) now gives help on 'letters'. (Related to PR#9927) o Ranges in chartr() now work better in CJK locales, thanks to Ei-ji Nakama. ************************************************** * * * 2.5 SERIES NEWS * * * ************************************************** CHANGES IN R VERSION 2.5.1 patched INSTALLATION o doc/manual now includes the texinfo support file epsf.tex which basic TeX installations often omit. BUG FIXES o Attempting to do in-memory serialization of an object requiring more than 1Gb might have failed. o Using formals<- on a function whose body was NULL worked incorrectly. (PR#9758) o logb() is now strictly a wrapper for log(), so if S4 methods are set on log(), logb() will also dispatch on them. o conflicts(where=) did not work correctly. (PR#9760) o log(x, base) was intended to handle complex 'base' even for real 'x', but there was a typo in the code to do so. o Syntax errors would sometimes misreport the error context. 
o qt(p, df=1) is now also correct for very small p. (PR#9804) qt(p, df=2) ditto; also is more accurate for 0 < |p - 0.5| << 1. qt(*, log.p=TRUE) now is finite and monotone (again!) where possible. o Several functions including those making use of printCoefmat(), layout() and sortedXyData() now work correctly with non-default settings of options("OutDec"). o S4 method dispatch for group generics (and %*%) failed to pass arguments to methods as promises and so in some circumstances methods could change their arguments. (Seen for the 'Math' group with package Matrix.) o The print() method for "ts" now handles quarterly and monthly series which do not start at the beginning of a quarter or month respectively. o Deserializing raw objects saved using save(..., ascii=TRUE) now works correctly. o ISOLatin7 encoding for postscript/PDF has been corrected. (PR#9845) o rbind(x,y) and cbind(x,y) did not dispatch properly when x and y had multiple S3-style classes. o The workaround for seeking on > 2Gb files did not work correctly on Unix-alike 32-bit systems. (PR#9883) o We had identical(NaN, NA_real_) != identical(NA_real_, NaN), spotted by Petr Savicky. CHANGES IN R VERSION 2.5.1 NEW FEATURES o density(1:20, bw = "SJ") now works as bw.SJ() now tries a larger search interval than the default (lower, upper) if it does not find a solution within the latter. o The output of library() (no arguments) is now sorted by library trees in the order of .libPaths() and not alphabetically. o R_LIBS_USER and R_LIBS_SITE feature possible expansion of specifiers for R version specific information as part of the startup process. o C-level warning calls now print a more informative context, as C-level errors have for a while. o There is a new option "rl_word_breaks" to control the way the input line is tokenized in the readline-based terminal interface for object- and file-name completion. This allows it to be tuned for people who use their space bar vs those who do not. 
The default now allows filename-completion with +-* in the filenames. o If the srcfile argument to parse() is not NULL, it will be added to the result as a "srcfile" attribute. o It is no longer possible to interrupt lazy-loading (which was only at all likely when lazy-loading environments), which would leave the object being loaded in an unusable state. This is a temporary measure: error-recovery when evaluating promises will be tackled more comprehensively in 2.6.0. INSTALLATION o 'make check' will work with --without-iconv, to accommodate building on AIX where the system iconv conflicts with libiconv and is not compatible with R's requirements. o There is support for 'DESTDIR': see the R-admin manual. o The texinfo manuals are now converted to HTML with a style sheet: in recent versions of makeinfo the markup such as @file was being lost in the HTML rendering. o The use of inlining has been tweaked to avoid warnings from gcc >= 4.2.0 when compiling in C99 mode (which is the default from configure). BUG FIXES o as.dendrogram() failed on objects of class "dendrogram". o plot(type ="s") (or "S") with many (hundreds of thousands) of points could overflow the stack. (PR#9629) o Coercing an S4 classed object to "matrix" (or other basic class) failed to unset the S4 bit. o The 'useS4' argument of print.default() had been broken by an unrelated change prior to 2.4.1. This allowed print() and show() to bounce badly constructed S4 objects between themselves indefinitely. o Prediction of the seasonal component in HoltWinters() was one step out at one point in the calculations. decompose() incorrectly computed the 'random' component for a multiplicative fit. o Wildcards work again in unlink() on Unix-alikes (they did not in 2.5.0). o When qr() used pivoting, the coefficient names in qr.coef() were not pivoted to match. (PR#9623) o UseMethod() could crash R if the first argument was not a character string. 
o R and Rscript on Unix-alikes were not accepting spaces in -e arguments (even if quoted). o Hexadecimal integer constants (e.g. 0x10L) were not being parsed correctly on platforms where the C function atof did not accept hexadecimal prefixes (as required by C99, but not implemented in MinGW as used by R on Windows). (PR#9648) o libRlapack.dylib on Mac OS X had no version information and sometimes an invalid identification name. o Rd conversion of \usage treated '\\' as a single backslash in all but latex: it now acts consistently with the other verbatim-like environments (it was never 'verbatim' despite the documentation). \code{\.} is now rendered as '\.' in all formats, as documented (it was not the case for latex conversion). codoc() (and checkDocStyle() and checkDocUsage()) now apply the same transformations to \usage as Rd conversion does, so {, % and \\ in strings in usages will now be related correctly to the help files. o rbind() failed if the only data frame had 0 rows. (PR#9657) o [i, j] could sometimes select the wrong column when j is numeric if there are duplicate column names. o sample(x, size, replace=TRUE, prob) had a memory leak if 10000 < size <= 100000. o x <- cbind(1:2); rownames(x) <- factor(c("A",NA)) no longer segfaults. o R CMD BATCH no longer assumes Sys.unsetenv() is supported (it is not on older Solaris systems). o median() returned a logical result when it was 'NA': it now returns an NA of appropriate type (e.g. integer or double). o grep(fixed = TRUE, perl = TRUE) ignored 'fixed', although it was documented to ignore 'perl'. Same for [g]regexpr and [g]sub. o getNamespaceExports("base") works again. o runmed(c(), 1) no longer segfaults. o qr.coef(QR, b) failed for an LAPACK-produced QR if b was integer or for an over-determined system. qr.solve() for an under-determined system produces a solution with 0 and not NA for columns which are unused. o segments() was not handling full transparency correctly in PDF. (PR#9694) Nor was arrows().
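The median() and hexadecimal-constant fixes listed above can be checked with a short sketch like the following, assuming a current R session. The variable names are illustrative only.

```r
# A hexadecimal integer constant: 0x10 is sixteen, and the L suffix
# makes it an integer rather than a double.
hex_const <- 0x10L

# median() of data containing NA returns an NA of the input's type,
# not a logical NA.
m_int <- median(c(1L, NA))
m_dbl <- median(c(1.5, NA))

print(hex_const)
print(c(integer_input = typeof(m_int), double_input = typeof(m_dbl)))
```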
o callGeneric() inside a method with extra arguments {and hence currently defined via .local()} now works. o [g]sub(fixed=TRUE, useBytes=FALSE) could substitute in the wrong place in an MBCS locale. gregexpr() could give incorrect answers in MBCS locales for perl = TRUE or fixed = TRUE (unless useBytes = TRUE). o The legacy quartz() device no longer crashes in locator() if the user attempts to close the window. o "CGGStackRestore: gstack underflow" warning is no longer shown in legacy quartz() device. o formatC() now limits 'digits' to 50 to avoid problems in C-level sprintf in some OSes. o seq.int(x, y, by=z) gave 'x' (and not an error) if 0 > (y-x)/z > -1. o promptClass() now lists methods, including those for generics in other attached packages. o Connection-related functions such as readBin() no longer crash when supplied with a non-connection object. o as.character.srcref() didn't handle bad srcref objects cleanly. o predict.nls() no longer requires 'newdata' to contain exactly the variable names needed to fit the model: variables used on the LHS only are no longer required and further variables are allowed. o plot.hclust() had an 'out by one' error, and ignored the last object when computing the window region (and could overrun arrays). o deriv() was creating results with double (and not integer) dims. o The unserialize code (e.g. as called by load()) looked for a function findPackageEnv() to set a saved package environment. This was missing, but is now supplied. o [cr]bind could segfault when creating a list matrix result. (Reported by Martin Morgan.) o besselI(x, nu, exp=TRUE) and besselY(x, nu) could give wrong answers for nu < 0. (Reported by Hiroyuki Kawakatsu.) o [g]sub could confuse a trailing byte '\' for a backreference in MBCSs where '\' can occur as a trailing byte (not UTF-8 nor EUC-JP, but SJIS and the CJK character sets used on Windows).
(PR#9751) CHANGES IN R VERSION 2.5.0 USER-VISIBLE CHANGES o apropos(x) and find(x) now both only work for character 'x', and hence drop all non-standard evaluation behaviour. o Data frames can have 'automatic' row names which are not converted to dimnames by as.matrix(). (Consequently, e.g., t(.) for such data frames has NULL column names.) This change leads to memory reductions in several places, but can break code which assumes character dimnames for data frames derived from matrices. No existing R object is regarded as having 'automatic' row names, and it may be beneficial to recreate such objects via read.table() or data.frame(). o Using $ on an atomic vector now raises a warning, as does use on an S4 class for which a method has not been defined. o The Unix-alike readline terminal interface now does command-completion for R objects, incorporating the functionality formerly in package 'rcompletion' by Deepayan Sarkar. This can be disabled by setting the environment variable R_COMPLETION=FALSE when starting R (e.g. in ~/.Renviron). (Note that when this is enabled, filename completion no longer works for file paths containing R operators such as '+' and '-'.) NEW FEATURES o abbreviate() no longer has an 8191 byte limit on the size of strings it can handle. o abs(x) now returns integer for integer or logical arguments. o apropos() has a new argument 'ignore.case' which defaults to TRUE, potentially matching more than previously, thanks to a suggestion by Seth Falcon. o args(), str() and print() now give the argument lists of primitive functions. o as.matrix() gains the '...' argument that several packages have assumed it always had (and S-PLUS has). o Manipulation of integers as roman numerals via as.roman() in package utils. o attr() no longer treats name = NA_character_ as meaning name = "NA". o binom.test() now allows a 'fuzz' for calculated integer values in its x and n arguments. 
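As an illustration of two of the new features listed above (integer results from abs() and roman numerals via as.roman()), here is a minimal sketch assuming a current R session with package utils available. The variable names are illustrative only.

```r
# abs() keeps integer type for integer input and returns integer
# for logical input.
abs_int_type <- typeof(abs(-3L))
abs_lgl_type <- typeof(abs(TRUE))

# as.roman() (package utils) converts between integers and roman numerals.
roman_text <- as.character(utils::as.roman(14))
roman_value <- as.integer(utils::as.roman("XIV"))

print(c(abs_int_type, abs_lgl_type))
print(roman_text)
print(roman_value)
```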
o boxplot(*, notch = TRUE) now warns when notches are outside hinges; related to PR#7690. o New function callCC() providing a downward-only version of Scheme's call with current continuation. o capabilities() now has a "profmem" entry indicating whether R has been compiled with memory profiling. o colnames<-() and rownames<-() now handle data frames explicitly, so calling colnames<- on a data frame no longer alters the representation of the row names. o commandArgs() has a new 'trailingOnly' argument to be used in conjunction with --args. o contour() now passes graphical parameters in '...' to axis() and box(). o New data set 'crimtab' on Student(1908)'s 3000 criminals. o cut.default() has a new argument 'ordered_result'. o .deparseOpts() has two new options: "keepNA" to ensure that different types (logical, integer, double, character and complex) of NAs are distinguished, and "S_compatible" to suppress the use of R-specific features such as 123L and to deparse integer values of a double vector with a trailing decimal point. The 'keepInteger' option now uses the suffix 'L' rather than as.integer() where possible (unless all entries are NA or "S_compatible" is also set). Other deparse options can now be added to "all" (which has not for some time actually switched on all options). Integer sequences m:n are now deparsed in that form. o deparse() and dput() now include "keepInteger" and "keepNA" in their defaults for the 'control' argument. o detach() now takes another argument, unload, which indicates whether or not to unload the package and then only cleans up the S4 methods if the package successfully unloads. o There are new constants NA_integer_, NA_real_, NA_complex_ and NA_character_ to denote NAs of those types, and they will be used in deparsing in place of as.integer(NA) etc unless .deparseOpts() includes "S_compatible". o dev.print() now recognizes 'screen devices' as all those with an enabled display list, rather than a hard-coded set. 
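The typed NA constants, the deparse() defaults and callCC() described above can be explored with a sketch along these lines, assuming a current R session. The variable names are illustrative, and the deparsed strings shown by print() may differ in detail between R versions.

```r
# The typed NA constants carry their own storage type.
na_types <- c(typeof(NA_integer_), typeof(NA_real_),
              typeof(NA_complex_), typeof(NA_character_))

# deparse() keeps integers (L suffix), integer sequences (m:n) and
# typed NAs by default.
dep_int <- deparse(c(1L, 5L))
dep_seq <- deparse(1:3)
dep_na <- deparse(NA_character_)

# callCC() passes an exit function 'k'; calling it returns its
# argument from callCC() immediately, skipping the rest of the body.
cc_result <- callCC(function(k) {
  k("early")
  "late"
})

print(na_types)
print(c(dep_int, dep_seq, dep_na))
print(cc_result)
```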
o Objects of class "difftime" are now handled more flexibly. The units of such objects can now be accessed via a units() function, which also has a replacement form, and there are conversion methods to and from numeric, which also allow the specification of units. Objects of this class can also be stored in data frames now. A format() method has been added, and the print method was revised. o New function environmentName() to give the print name of environments such as "namespace:base". This is now used by str(). o New function env.profile() provides R level access to summary statistics on environments. In a related patch, new.env() now allows the user to specify an initial size for a hashed environment. o file() can read the X11 clipboard selection as "X11_clipboard" on suitable X11-using systems. o file("stdin") is now recognized, and refers to the process's 'stdin' file stream whereas stdin() refers to the console. These may differ, for example for a GUI console, an embedded application of R or if --file= has been used. o file_test() is now also available in package utils. (It is now private in package tools.) o file.show() gains an 'encoding' argument. o New functions formatUL() and formatOL() in package utils for formatting unordered (itemize) and ordered (enumerate) lists. o The statistics reported when gcinfo(TRUE) are now of the amounts used (in Mb) and not of the amounts free (which are not really relevant when there are no hard limits, only gc trigger points). o New function get_all_vars() to retrieve all the (untransformed) variables that the default method of model.frame() would use to create the model frame. o interaction() has a new argument 'lex.order'. o initialize() (in methods) now tries to be smarter about updating the new instance in place, thereby reducing copying. o install.packages(dependencies = NA) is a new default, which is to install essential dependencies when installing from repositories to a single library. 
As a result of this change, update.packages() will install any new dependencies of the packages it is updating (alongside the package in the same library tree). If 'lib' is not specified or is specified of length one and the chosen location is not a writable directory, install.packages() offers to create a personal library directory for you if one does not already exist, and to install there. o is.atomic, is.call, is.character, is.complex, is.double (== is.real), is.environment, is.expression, is.function, is.integer, is.list, is.logical, is.null, is.object, is.pairlist, is.recursive, is.single and is.symbol (== is.name) are no longer internally S3 generic, nor can S4 methods be written for them. The "factor" methods of is.integer and is.numeric have been replaced by internal code. o Added is.raw() for completeness. o l10n_info() also reports if the current locale is Latin-1. o levels<-(), names() and names<-() now dispatch internally for efficiency and so no longer have S3 default methods. o .libPaths() now does both tilde and glob expansion. o Functions lm(), glm(), loess(), xtabs() and the default method of model.frame() coerce their 'formula' argument (if supplied) to a formula. o max(), min() and range() now work with character vectors. o message() has a new argument 'appendLF' to handle messages with and without newlines. There is a new message class packageStartupMessage() that can be suppressed separately. o A new function, method.skeleton() writes a skeleton version of a call to setMethod() to a file, with correct arguments and format, given the name of the function and the method signature. o mode<- and storage.mode<- do slightly less copying. o nls.control(* , printEval = FALSE, warnOnly = FALSE) are two new options to help better analyze (non-)convergence of nls(), thanks to Kate Mullen. nls() and summary(nls()) now contain more information and also print information about convergence.
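A minimal sketch of the character-vector support in range() and of the message() changes listed above, assuming a current R session. The helper catch_msg() and the other names are hypothetical and exist only for this illustration; the ordering of character strings depends on the collation locale, so plain lowercase ASCII words are used.

```r
# range() (and max(), min()) work with character vectors.
fruit_range <- range(c("pear", "apple", "fig"))

# Hypothetical helper: capture the text of a message condition.
catch_msg <- function(expr) {
  tryCatch(expr, message = function(m) conditionMessage(m))
}
with_lf <- catch_msg(message("done"))
without_lf <- catch_msg(message("done", appendLF = FALSE))

# Package startup messages can be suppressed separately from other
# messages; nothing reaches the outer handler here.
leaked <- tryCatch({
  suppressPackageStartupMessages(packageStartupMessage("hello"))
  FALSE
}, message = function(m) TRUE)

print(fruit_range)
print(c(with_lf, without_lf))
print(leaked)
```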
o options(device = ) now accepts a function object as well as the name of a function. o pdf() supports new values for 'paper' of "US" (same as "letter"), "a4r" and "USr" (the latter two meaning rotated to landscape). postscript() also accepts paper = "US". o persp() now respects the graphical pars 'cex.axis', 'cex.lab', 'font.axis' and 'font.lab'. o New faster internal functions pmax.int() and pmin.int() for inputs which are atomic vectors without classes (called by pmax/pmin where applicable). pmin/pmax are now more likely to work with classed objects: they work with POSIXlt datetimes, for example. o postscript() now by default writes grey colors (including black and white) via 'setgray', which gives more widely acceptable output. There are options to write pure RGB, CMYK or gray via the new argument 'colormodel'. o rbind.data.frame() now ignores all zero-row inputs, as well as zero-column inputs (which it used to do, undocumented). This is because read.table() can create zero-row data frames with NULL columns, and those cannot be extended. o readChar() and writeChar() can now work with a raw vector. o read.table(), write.table() and allies have been moved to package utils. o rgb() now accepts the red, green and blue components in a single matrix or data frame. o New utility function RShowDoc() in package 'utils' to find and display manuals and other documentation files. o New .row_names_info() utility function finds the number of rows efficiently for data frames; consequently, dim.data.frame() has become very fast for large data frames with 'automatic' row names. o RSiteSearch() now also allows searching postings of the 'R-devel' mailing list. o screeplot() is now (S3) generic with a default method, thanks to a patch from Gavin Simpson. o Experimental 'verbose' argument for selectMethod(). Might be replaced later by a better interface for method selection inspection.
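The matrix form of rgb() and the pmax.int()/pmin.int() functions mentioned above can be tried with a sketch like this one, assuming a current R session. The matrix 'components' and the other names are illustrative only.

```r
# One row per colour; columns are the red, green and blue components.
components <- cbind(red = c(255, 0), green = c(0, 255), blue = c(0, 0))
colours <- rgb(components, maxColorValue = 255)

# pmax.int()/pmin.int() take the element-wise maximum/minimum of
# atomic vectors without classes.
upper <- pmax.int(c(1, 5), c(3, 2))
lower <- pmin.int(c(1, 5), c(3, 2))

print(colours)
print(upper)
print(lower)
```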
o Added links to source files to the parsing routines, so that source() can now echo the original source and comments (rather than deparsing). This affects example() and Sweave() as well. o stack() and unstack() have been moved to package utils. o strptime() now sets the "tzone" attribute on the result if tz != "". o str.default() typically prints fewer entries of logical vectors. o The RweaveLatex driver for Sweave() now supports two new options: expand=FALSE, to show chunk references in the output, and concordance=TRUE, to output the concordance between input and output lines. o system() now takes the same set of arguments on all platforms, with those which are not applicable being ignored with a warning. Unix-alikes gain 'input' and 'wait', and Windows gains 'ignore.stderr'. o system.time() and proc.time() now return an object of class "proc_time" with a print() method that returns a POSIX-like format with names. o Sys.getenv() has a new argument 'unset' to allow unset and set to "" to be distinguished (if the OS does). The results of Sys.getenv() are now sorted (by name). o New function Sys.glob(), a wrapper for the POSIX.2 function glob(3) to do wildcard expansion (on systems which have it, plus an emulation on Windows). o Sys.setenv() is a new (and preferred) synonym for Sys.putenv(). The internal C code uses the POSIX-preferred 'setenv' rather than 'putenv' where the former is available. o New function Sys.unsetenv() to remove environment variables (on systems where unsetenv is implemented or putenv can remove variables, such as on Windows). o text(), mtext(), strheight(), strwidth(), legend(), axis(), title(), pie(), grid.text() and textGrob() all attempt to coerce non-language annotation objects (in the sense of is.object) to character vectors. This is principally intended to cover factors and POSIXt and Date objects, and is done via the new utility function as.graphicsAnnot() in package grDevices. 
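A minimal sketch of the environment-variable functions and Sys.glob() described above, assuming a current R session on a system where unsetting variables is supported. The variable name R_NEWS_DEMO_VAR and the directory news_glob_demo are hypothetical and chosen only for this illustration.

```r
var <- "R_NEWS_DEMO_VAR"   # hypothetical variable name

# Start from a known state, then query with an explicit 'unset' value.
Sys.unsetenv(var)
before <- Sys.getenv(var, unset = NA)

# Sys.setenv() is the preferred way to set a variable.
Sys.setenv(R_NEWS_DEMO_VAR = "demo")
during <- Sys.getenv(var, unset = NA)

# Sys.unsetenv() removes it again.
Sys.unsetenv(var)
after <- Sys.getenv(var, unset = NA)

# Sys.glob() does wildcard expansion on file paths.
demo_dir <- file.path(tempdir(), "news_glob_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("a.txt", "b.txt", "c.csv")))
txt_files <- basename(Sys.glob(file.path(demo_dir, "*.txt")))

print(c(before = before, during = during, after = after))
print(txt_files)
```

Passing unset = NA is what lets a caller tell a variable that is not set apart from one that is set to the empty string, on systems where the OS makes that distinction.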
o tcltk::tk_select.list() now chooses the width to fit the widest item. o {re,un}tracemem() are now primitives for efficiency and so migrate from 'utils' to 'base'. o union(), intersect(), setdiff() and setequal() now coerce their arguments to be vectors (and they were documented only to apply to vectors). o uniroot() now works if the zero occurs at one of the ends of the interval (suggestion of Tamas Papp). o There is a new function View() for viewing matrix-like objects in a spreadsheet, which can be left up whilst R is running. o New function withVisible() allows R level access to the visibility flag. o zip.file.extract() has been moved to package utils. o A few more cases of subassignment work, e.g. [] <- and [] <- , with suitable coercion of the LHS. o There is a warning if \ is used unnecessarily in a string when being parsed, e.g. "\." where probably "\\." was intended. ("\." is valid, but the same as ".".) Thanks to Bill Dunlap for the suggestion. o Introduced the suffix L for integer literals to create integer rather than numeric values, e.g. 100L, 0x10L, 1e2L. o Set the parser to give verbose error messages in case of syntax errors. o The class "LinearMethodsList" has been extended and will be used to create list versions of methods, derived from the methods tables (environments). The older recursive "MethodsList" class will be deprecated (by the release of 2.5.0 if possible). o There are more flexible ways to specify the default library search path. In addition to R_LIBS and .Library, there are .Library.site (defaults to R_HOME/site-library) and R_LIBS_USER (defaults to a platform- and version-specific directory in ~/R). See ?.libPaths for details. o LAPACK has been updated to version 3.1.0. This should cause only small changes to the output, but do remember that the sign of eigenvectors (and principal components) is indeterminate. o PCRE has been updated to version 7.0.
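The L suffix, withVisible() and the uniroot() end-point change listed above can be illustrated with a short sketch, assuming a current R session. The variable names are illustrative only.

```r
# The L suffix creates integer rather than double values.
literal_types <- c(typeof(100L), typeof(100), typeof(1e2L))

# withVisible() returns the value together with the visibility flag.
shown <- withVisible(1 + 1)
hidden <- withVisible(invisible(2))

# uniroot() accepts a zero located exactly at an end of the interval.
root <- uniroot(function(x) x, interval = c(0, 1))$root

print(literal_types)
print(c(shown = shown$visible, hidden = hidden$visible))
print(root)
```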
o Several functions handle row names more efficiently: - read.table() and read.DIF() make use of integer row names where appropriate, and avoid at least one copy in assigning them. - data.frame() and the standard as.data.frame() methods avoid generating long dummy row names and then discarding them. - expand.grid() and merge() generate compact 'automatic' row names. - data.matrix() and as.matrix.data.frame() have a new argument 'rownames.force' that by default drops 'automatic' row names. o [i, j] is substantially more memory-efficient when only a small part of the data frame is selected, especially when (part of) a single column is selected. o Command-line R (and Rterm.exe under Windows) accepts the options '-f filename', '--file=filename' and '-e expression' to follow other script interpreters. These imply --no-save unless --save is specified. o Invalid bytes in character strings in an MBCS now deparse/print in the form "\xc1" rather than "", which means they can be parsed/scanned. o Printing functions (without source attributes) and expressions now preserves integers (using the L suffix) and NAs (using NA_real_ etc where necessary). o The 'internal' objects .helpForCall, .tryHelp and topicName are no longer exported from 'utils'. o The internal regex code has been upgraded to glibc 2.5 (from 2.3.6). o Text help now attempts to display files which have an \encoding section in the specified encoding via file.show(). o R now attempts to keep track of character strings which are known to be in Latin-1 or UTF-8 and print or plot them appropriately in other locales. This is primarily intended to make it possible to use data in Western European languages in both Latin-1 and UTF-8 locales. Currently scan(), read.table(), readLines(), parse() and source() allow encodings to be declared, and console input in suitable locales is also recognized. New function Encoding() can read or set the declared encodings for a character vector. 
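To show how Encoding() reads and sets the declared encoding of a character vector, here is a minimal sketch assuming a current R session. The string and variable names are illustrative only; the byte \xE7 is the Latin-1 code for "c" with a cedilla.

```r
# A string containing one Latin-1 byte; no encoding is declared yet.
x <- "fa\xE7ile"
initial <- Encoding(x)

# Declare the bytes to be Latin-1. This marks the string and does not
# change its bytes.
Encoding(x) <- "latin1"
declared <- Encoding(x)

# Re-encode to UTF-8; the result is marked accordingly.
y <- iconv(x, from = "latin1", to = "UTF-8")
converted <- Encoding(y)
n_chars <- nchar(y)

print(c(initial = initial, declared = declared, converted = converted))
print(n_chars)
```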
o There have been numerous performance improvements to the data editor on both Windows and X11. In particular, resizing the window works much better on X11. o Packages graphics and grid no longer require grDevices, as they might be used only with third-party devices. DEPRECATED & DEFUNCT o symbol.C() and symbol.For() are defunct, and have been replaced by wrappers that give a warning. o Calling a builtin function with an empty argument is now always an error. o The autoloading of ts() is defunct. o The undocumented reserved word GLOBAL.ENV has been removed. (It was yet another way to get the value of the symbol .GlobalEnv.) o The deprecated behaviour of structure() in adding a class when specifying with "tsp" or "levels" attributes is now defunct. o unix() is now finally defunct, having been deprecated for at least seven years. o Sys.putenv() is now deprecated in favour of Sys.setenv(), following the POSIX recommendation. o Building R with --without-iconv is deprecated. o Using $ on an atomic vector is deprecated (it was previously valid and documented to return NULL). o The use of storage.mode<- for other than standard types (and in particular for value "single") is deprecated: use mode<- instead. INSTALLATION o A suitable iconv (e.g. from glibc or GNU libiconv) is required. For 2.5.x only you can build R without it by configuring using --without-iconv. o There is support again for building on AIX (tested on 5.2 and 5.3) thanks to Ei-ji Nakama. o Autoconf 2.60 or later is used to create 'configure'. This makes a number of small changes, and incorporates the changes to the detection of a C99-compliant C compiler backported for 2.4.1. o Detection of a Java development environment was added such that packages don't need to provide their own Java detection. Newly added make variables are JAVAC, JAVAH, JAR and JAVA_CPPFLAGS. R CMD javareconf was updated to look for the corresponding Java tools as well. 
In addition, Java detection honors user-supplied environment variables JAVA_CPPFLAGS, JAVA_LIBS and JAVA_LD_LIBRARY_PATH. o Added workaround for reported non-POSIX sh on OSF1. (PR#9375) o 'make install-strip' now works, stripping the executables and also the shared libraries and modules on platforms where 'libtool' knows how to do so. o Building R as a shared library and standalone nmath now installs pkg-config files 'libR.pc' and 'libRmath.pc' respectively. o Added test for insufficiently complete implementation of sigaction. C-LEVEL FACILITIES o Functions str2type, type2char and type2str are now available in Rinternals.h. o Added support for Objective C in R and packages (if available). o R_ParseVector() has a new 4th argument 'SEXP srcfile' allowing source references to be attached to the returned expression list. o Added ptr_R_WriteConsoleEx callback which allows consoles to distinguish between regular output and errors/warnings. To ensure backward compatibility it is only used if ptr_R_WriteConsole is set to NULL. UTILITIES o Additional Sweave() internal functions are exported to help writing new drivers, and RweaveLatexRuncode() is now created using a helper function (all from a patch submitted by Seth Falcon). o The following additional flags are accessible from R CMD config: OBJC, OBJCFLAGS, JAR, JAVA, JAVAC, JAVAH, JAVA_HOME, JAVA_LIBS and JAVA_CPPFLAGS. o R CMD build now takes the package name from the DESCRIPTION file and not from the directory. (PR#9266) o checkS3methods() (and hence R CMD check) now checks agreement with primitive internal generics, and checks for additional arguments in methods where the generic does not have a '...' argument. codoc() now knows the argument lists of primitive functions. o R CMD INSTALL and R CMD REMOVE now use as the default library (if -l is not specified) the first library that would be used if R were run in the current environment (and they run R to find it). 
o There is a new front-end Rscript which can be used for #! scripts and similar tasks. See help("Rscript") and 'An Introduction to R' for further details. o R CMD BATCH (not Windows) no longer prepends 'invisible(options(echo = TRUE))' to the input script. This was the default unless --slave is specified and the latter is no longer overridden. On all OSes it makes use of the -f argument to R, so file("stdin") can be used from BATCH scripts. On all OSes it reports proc.time() at the end of the script unless q() is called with options to inhibit this. o R CMD INSTALL now prepends the installation directory (if specified) to the library search path. o Package installation now re-encodes R files and the NAMESPACE file if the DESCRIPTION file specifies an encoding, and sets the encoding used for reading files in preparing for LazyData. This will help if a package needs to be used in (say) both latin1 and UTF-8 locales on different systems. o R CMD check now reports on non-ASCII strings in datasets. (These are a portability issue, which can be alleviated by marking their encoding: see 'Writing R Extensions'.) o Rdiff now converts CRLF endings in the target file, and converts UTF-8 single quotes in either to ASCII quotes. o New recommended package 'codetools' by Luke Tierney provides code-analysis tools. This can optionally be used by 'R CMD check' to detect problems, especially symbols which are not visible. o R CMD config now knows about LIBnn . o New recommended package 'rcompgen' by Deepayan Sarkar provides support for command-line completion under the Unix terminal interface (provided readline is enabled) and the Windows Rgui and Rterm front ends. BUG FIXES o gc() can now report quantities of 'Vcells' in excess of 16Gb on 64-bit systems (rather than reporting NA). o Assigning class "factor" to an object now requires it has integer (and not say double) codes. o structure() ensures that objects with added class "factor" have integer codes. 
o The "formula" and "outer" attributes of datasets 'ChickWeight', 'CO2', 'DNase', 'Indometh', 'Loblolly', 'Orange' and 'Theoph' now have an empty environment and not the environment used to dump the datasets in the package. o Dataset 'Seatbelts' now correctly has class c("mts", "ts"). o str() now labels classes on data frames more coherently. o Several 'special' primitives and .Internals could return invisibly if the evaluation of an argument led to the visibility flag being turned off. These included as.character(), as.vector(), call(), dim(), dimnames(), lapply(), rep(), seq() and seq_along(). Others (e.g. dput() and print.default()) could return visibly when this was not intended. o Several primitives such as dim() were not checking the number of arguments supplied before method dispatch. o Tracing of primitive functions has been corrected. It should now be the case that tracing either works or is not allowed for all primitive functions. (Problems remain if you make a primitive into a generic when it is being traced. To be fixed later.) o max.col() now omits infinite values in determining the relative tolerance. o R CMD Sweave and R CMD Stangle now respond to --help and --version like other utilities. o .libPaths() adds only existing directories (as it was documented to, but could add non-directories). o setIs() and setClassUnion() failed to find some existing subclasses and produced spurious warnings, now fixed. o data.frame() ignored 'row.names' for 0-column data frames, and no longer treats an explicit row.names=NULL differently from the default value. o identical() looked at the internal structure of the 'row.names' attribute, and not the value visible at R level. o abline(reg) now also correctly works with intercept-only lm models, and abline() warns more when it's called illogically. o warning() was truncating messages at getOption("warning.length") - 1 (not as documented), with no indication. It now appends '[... truncated]'. 
o Stangle/Sweave were throwing spurious warnings if options 'result' or 'strip.white' were unset. o all.equal() was ignoring 'check.attributes' for list and expression targets, and checking only attributes on raw vectors. Logical vectors were being compared as if they were numeric, (with a mean difference being quoted). o Calculating the number of significant digits in a number was itself subject to rounding errors for digits >= 16. The calculation has been changed to err on the side of slightly too few significant digits (but still at least 15) rather than far too many. (An example is print(1.001, digits=16).) o unlink() on Unix-alikes failed for paths containing spaces. o substr() and friends treated NA 'start' or 'stop' incorrectly. o merge(x, y, all.y = TRUE) would sometimes incorrectly return logical columns for columns only in y when there were no common rows. o read.table(fn, col.names=) on an empty file returned NULL columns, rather than logical(0) columns (which is what results from reading a file with just a header). o grid.[xy]axis(label=logical(0)) failed. o expression() was unnecessarily duplicating arguments. o as.expression() returned a single-element expression vector, which was not compatible with S: it now copies lists element-by-element. o supsmu(periodic = TRUE) could segfault. (PR#9502, detection and patch by Bill Dunlap.) o pmax/pmin called with only logical arguments did not coerce to numeric, although they were documented to do so (as max/min do). o methods() did not know that cbind() and rbind() are internally generic. o dim(x) <- NULL removed the names of x, but this was always undocumented. It is not clear that it is desirable but it is S-compatible and relied on, so is now documented. o which(x, arr.ind = TRUE) did not return a matrix (as documented) if 'x' was an array of length 0. o C-level duplicate() truncated CHARSXPs with embedded nuls. 
o Partial matching of attributes was not working as documented in some cases if there were more than two partial matches or if "names" was involved. o data(package=character(0)) was not looking in ./data as documented. o summary.mlm() failed if some response names were "" (as can easily happen if cbind() is used). o The postscript() and pdf() drivers shared an encoding list but used slightly different formats. This caused problems if both were used with the same non-default encoding in the same session. (PR#9517) o The data editor was not allowing Inf, NA and NaN to be entered in numerical columns. It was intended to differentiate between empty cells and NAs, but did not do so: it now does so for strings. o supsmu() could segfault if all cases had non-finite values. (PR#9519) o plnorm(x, lower.tail=FALSE) was returning the wrong tail for x <= 0. (PR#9520) o which.min() would not report a minimum of +Inf, and analogously for which.max(). (PR#9522) o 'R CMD check' could fail with an unhelpful error when checking Rd files for errors if there was only one file and that had a serious error. (PR#9459) o try() has been reimplemented using tryCatch() to solve two problems with the original implementation: (i) try() would run non-NULL options("error") expressions for errors within a try, and (ii) try() would catch user interrupts. o str(obj) could fail when obj contained a dendrogram. o Using [, ] <- NULL failed (PR#9565) o choose(n, k) could return non-integer values for integer n and small k on some platforms. o nclass.scott(x) and nclass.FD(x) no longer return NaN when var(x) or IQR(x) (respectively) is zero. hist() now allows breaks = 1 (which the above patch will return), but not breaks = Inf (which gave an obscure error). o strptime("%j") now also works for the first days of Feb-Dec. (PR#9577) o write.table() now recovers better if 'file' is an unopened connection. (It used to open it for both the column names and the data.) 
o Fixed bug in mosaicplot(sort=) introduced by undocumented change in R 2.4.1 (changeset r39655). o contr.treatment(n=0) failed with a spurious error message. (It remains an error.) o as.numeric() was incorrectly documented: it is identical to as.double. o jitter(rep(-1, 3)) gave NaNs. (PR#9580) o max.col() was not random for a row of zeroes. (PR#9542) o ansari.test(conf.int=TRUE, exact=FALSE) failed. o trace() now works on S3 registered methods, by modifying the version in the S3 methods table. o rep(length=1, each=0) segfaulted. o postscript() could overflow a buffer if used with a long 'command' argument. o The internal computations to copy complete attribute lists did not copy the flag marking S4 objects, so the copies no longer behaved like S4 objects. o The C code of nlminb() was altering a variable without duplicating it. (This did not affect nlminb() but would have if the code was called from a different wrapper.) o smooth(kind = "3RS3R") (the current default) used .C(DUP = FALSE) but altered its input argument. (This was masked by duplication in as.double.) o The signature for the predefined S4 method for as.character() was missing '...' . o readBin() could read beyond the end of the vector when size-changing was involved. o The C entry point PrintValue (designed to emulate auto-printing) would not find show() for use on S4 objects, and did not have the same search path (for show(), print() and print() methods) as auto-printing. Also, auto-printing and print() of S4 objects would fail to find 'show' if the methods name space was loaded but the package was not attached (or otherwise not in the search path). o print() (and auto-printing) now recognize S4 objects even when 'methods' is not loaded, and print a short summary rather than dump the internal structure. o Sweave and Stangle had problems due to partial matching of code chunk names when run with split=TRUE. 
o install.packages() on a source package now ensures that R CMD INSTALL sees the same library search path as install.packages() did when computing dependencies. o density() now ensures its 'y' values are non-negative. (PR#8876) o is.finite() and is.infinite() (and many other primitives) are not internally generic and so do not support S4 methods, which can no longer be set. (PR#7951) o nls(algorithm = "port") now accepts a list 'start' argument, as for the other methods (and as documented). o Standard errors from the "ar" method of predict() could be wrong for the last p predictions for models near non-stationarity. (PR#9614) ************************************************** * * * 2.4 SERIES NEWS * * * ************************************************** CHANGES IN R VERSION 2.4.1 patched NEW FEATURES o The Simplified Chinese translations have been completed. BUG FIXES o co.intervals() sometimes failed to cover the largest value. o tempfile() is now random across sessions as well as within a session. (On some systems it would give the same hex suffix at the start of each session.) o Added infinite recursion test to internal function isMissing. (PR#9426) o The "Date" and "POSIXt" methods for cut() were not choosing the first day of the year for breaks = "years". (In part, PR#9433.) o R is now able to deparse/print invalid multibyte strings in MBCS locales (such as UTF-8) using hex escapes. This means that e.g. demo(Hershey) works in all such locales. o optimize() could give incorrect answers in some rare problems with exact symmetry about the midpoint of the interval supplied. (PR#9438) o The residuals from an lm() fit with no coefficients but an offset were incorrect. o oneway.test() was expecting a literal formula and did not accept a variable containing a formula. o The legacy Quartz device (used by console R) displayed its window outside the screen estate in some dual-head setups. Now it will be always displayed in the center of the main screen. 
o read.ftable() was not functional on non-seekable connections such as URLs. o Some large memory allocations could cause segfaults or crashes (e.g. followup to PR#9557). o Sweave() would drop characters from the end of chunk names ending in "R". (PR#9567) o library(), i.e. its internal checkConflicts(), now (again) prints "The following object(s) are masked .." only once per masked package. o methods:::cbind(x) {one argument} now works, calling cbind2(x) when 'x' is an S4 object. CHANGES IN R VERSION 2.4.1 INSTALLATION o The extraction of info from Subversion for an SVN checkout now also works for svn >= 1.4.0. However, on Windows the 'Last Changed Date' will be in the local timezone, and not in GMT as previously. o configure uses code borrowed from autoconf 2.60 to try harder to ensure that a C99-compliant compiler is used. (It does so by appending to CC.) This avoids problems with systems such as FC5 which override CFLAGS and thereby lose flags such as -std=gnu99. NEW FEATURES o rainbow(), heat.colors(), terrain.colors(), topo.colors() and cm.colors() all gain an 'alpha' argument to be passed to hsv(). o dput() will give an incorrect representation of the row names of a data frame with integer row names. This is now corrected when the object is recreated. C-LEVEL FACILITIES o Using STRICT_R_HEADERS applies to more reported clashes with Windows headers, including Calloc and Realloc. These and Free need to be prefixed by R_ when STRICT_R_HEADERS is defined. DEPRECATED & DEFUNCT o The previously undocumented behaviour of structure() in adding a class when specifying "tsp" or "levels" attributes is now deprecated (with a warning). BUG FIXES o Fixed warning() to use .dfltWarn instead of .dfltStop for default handling (PR#9274). o R would slow down when the product of the length of a vector and the length of a character vector used to subset it exceeded 2^31. (PR#9280) o merge() now allows zero-row data frames.
o add1.lm() had been broken by other changes for weighted fits. o axis.POSIXct() would sometimes give the wrong labels. o Help for a method call would fail. (PR#9291) o gzfile() returned an object of class "file" not "gzfile". (PR#9271) o load()ing from a connection had a logic bug in when it closed the connection. (PR#9271) o The lowess() algorithm is unstable if the MAD of the residuals becomes (effectively) zero: R now terminates the iterations at that point. (This may result in quite different answers.) The 'delta' argument was incorrectly documented. (PR#9264) o abbreviate() would only work for strings of up to 8191 bytes, but this was not checked. Now longer strings are errors. o Drawing X11 rotated text was buggy for VERY small (negative) angle of rotation. Reported by Ben Bolker. (PR#9301) o The X11 data editor would crash in an MBCS locale if R was compiled with FC's CFLAGS that add buffer overflow and stack-smashing detection. o rect() was not accepting border=NA in some cases involving cross-hatching. o Fixes to S4 group generics to ensure that the correct number of active arguments are in the signature of the group and all members. Also a fix to keep the 'groupMembers' slot up to date. o S4 group generic "Logic" (with '&', '|', but not '!') has been created, following the green book (apart from '!'). o removeClass() now takes care to remove any subclass references to the deleted class. o mle() (in stats4) might not have worked as intended when the order of parameters in 'start' differed from that in the log-likelihood. (PR#9313) o dotchart() now properly restores par() settings after itself. o system() on Mac OS X was blocking arbitrary signals during the call although only SIGPROF was meant to be blocked. o methods cached via callNextMethod() and (sometimes) as() were being cached as directly specified although in fact they were inherited. Caused problems in later search for inherited methods. 
o str() works properly for method definitions and other S4-classed function objects. o JAVA_LIBS are now set correctly on MacOS X. o Fix null-termination issue suspected of causing crash with Fedora Extra RPMS (PR#9339, Justin Harrington, analysis and fix from Bill Dunlap). o Name spaces restored via a saved session silently failed to cache their methods because the methods package was not yet attached. Fixed by attaching methods before restoring data. o rbind()ing a list to a data frame generated invalid row names, which were an error in 2.4.0. (PR#9346) o boxplot.stats(x) now returns the correct minimum instead of an error for x <- c(1,Inf,Inf,Inf), and hence boxplot(x) "works". o promptClass() now uses \linkS4class{} instead of \link{-class}. o gc() no longer reports nonsense values for the number of used Vcells if the true value exceeds 2^31 (and hence over 16Gb of heap is in use): it now reports NA. (PR#9345) o rapply() now detects more user errors in supplying arguments. (PR#9349) o boxplot() was ignoring argument 'boxfill'. (PR#9352) o plot.lm(which = 6, id.n = 0) did not work. (PR#9333) o .deparseOpts("delayPromises") was not matching the C code, returning 64 rather than 32. o bxp() could use partial matching on 'pars' when finding defaults for some of its parameters, e.g. a setting of 'cex.axis' in 'pars' or inline was used to set a default for 'outcex'. o acf() now allows lag.max = 0 except when type="partial", and forces the lag 0 autocorrelation to 1. (PR#9360) o hist(*, include.lowest=., right=., plot=FALSE) does not warn anymore, (PR#9356) and more. o Some bugs in caching superclass/subclass relations and in removing those relations on detach and on removeClass() have been fixed. o readBin() could return one too many strings if 'n' was an over-estimate. (PR#9361) o A request for an opaque colour in the pdf() device after a translucent one did not set the transparency back to opaque in 2.4.0.
Semi-transparent background colours were not being plotted on the pdf() device. o plot.lm(which=5) in the case of constant leverage re-ordered the factor levels but not the residuals, so the labelling by factor level was often incorrect. o packBits() was not accepting a logical argument. (PR#9374) o make install was omitting doc/FAQ and doc/RESOURCES. o A two-sample t.test(x, y, var.equal=TRUE) did not allow one of the groups to be of size one. o The "ts" method for print() failed on some corrupted objects of class "ts", e.g. those without a "tsp" attribute. o structure() reordered the "class" value given if there was a "tsp" value specified. o pairs() now does pass appropriate parts of '...' to the 'diag.panel' argument. (PR#9384) o plot.lm() was using an incorrect estimate of dispersion for some GLMs (including family=binomial and family=poisson). (PR#9316) o Subsetting operators were setting R_Visible too early, so assignments in arguments could make the result invisible. (PR#9263) o The tk-GUI was displaying a warning due to an extra comma in the list of manuals (PR#9396) o packageDescription() now gives an explicit error on a corrupt DESCRIPTION file. o There was a scoping issue with tcltk callbacks given as unevaluated expressions. This has only been partially fixed, a complete fix probably requires redesign. o trace() had its return value documented incorrectly and was sometimes visible when it should not have been. o pchisq() would sometimes use the wrong tail when calculating non-central probabilities with lower.tail = FALSE. (PR#9406) o rm() could remove the wrong objects when passed an expression. (PR#9399) Now only names are allowed in the '...' argument, and the incorrect documentation of what happened with character objects is corrected. o url() was not supporting 'encoding' except on file:// URLs. 
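Two of the fixes above can be checked directly from the R prompt. The following is a minimal sketch, assuming a version of R that includes these fixes; the data values are invented for illustration and are not taken from the original bug reports.

```r
# Sketch only: made-up inputs exercising two of the 2.4.1 bug fixes.

# packBits() accepts a logical argument (PR#9374).
# The input length must be a multiple of 8; the least-significant bit comes first.
bits <- packBits(c(TRUE, rep(FALSE, 7)))

# A pooled-variance two-sample t test where the second group has one observation.
res <- t.test(c(4.1, 5.0, 5.6, 6.2), 7.3, var.equal = TRUE)
df  <- unname(res$parameter)   # degrees of freedom: n_x + n_y - 2
```

With var.equal = TRUE the variance estimate comes only from the group that has more than one observation, which is why a group of size one is acceptable here but not in the Welch (unequal variance) test.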
CHANGES IN R VERSION 2.4.0 USER-VISIBLE CHANGES o The startup message now prints first the version string and then the copyright notice (to be more similar to R --version). o save() by default evaluates promise objects. The old behaviour (to save the promise and its evaluation environment) can be obtained by setting the new argument 'eval.promises' to FALSE. (Note that this does not apply to promises embedded in objects, only to top-level objects.) o The functions read.csv(), read.csv2(), read.delim(), read.delim2() now default their 'comment.char' argument to "". (These functions are designed to read files produced by other software, which might use the # character inside fields, but are unlikely to use it for comments.) o The bindings in the base environment/name space (currently the same thing) are now locked. This means that the values of base functions cannot be changed except via assignInNamespace() and similar tricks. o [[ on a factor now returns a one-element factor (and not an integer), as.list() on a factor returns a list of one-element factors (and not of character vectors), and unlist() on a list of factors returns a factor (and not an integer vector). These changes may affect the results of sapply() and lapply() applied to factors. o mauchly.test() now returns the W statistic (for comparability with SAS and SPSS), rather than the z (which was accidentally not named in the output) o sort(x, decreasing = FALSE, ...) is now a generic function. This means that 'partial' is no longer the second argument, and calls which used positional matching may be incorrect: we try to detect them. o See the section on 'Changes to S4 methods': all packages depending on 'methods' need to be re-installed. NEW FEATURES o agrep(), grep(), strwrap(), strtrim(), substr() and related functions now coerce arguments which should be character via as.character() rather than internally (so method dispatch takes place, e.g. for factors). 
chartr(), casefold(), tolower() and toupper() now coerce their main argument if necessary to a character vector via as.character(). Functions which work element-by-element on character vectors to give a character result now preserve attributes including names, dims and dimnames (as suggested by the Blue Book p. 144). Such functions include casefold(), chartr(), gsub(), strtrim(), sub(), substr(), tolower() and toupper(). (Note that coercion of a non-character argument may lose the attributes.) agrep(value = TRUE) preserves names for compatibility with grep(). nchar() has always preserved dims/dimnames (undocumented before) and now also preserves names. o .Deprecated and .Defunct take a new parameter, msg, that allows for the specification of the message printed and facilitates deprecation of calling sequences etc. o .Fortran() will map 'name' to lower case, and will work with 'name' containing underscores. o The default is now .saveRDS(compress = TRUE) o The :: operator now also works for packages without name spaces that are on the search path. o [[ on a list does not duplicate the extracted element unless necessary. (It did not duplicate in other cases, e.g. a pairlist.) o argsAnywhere() works like args() on non-exported functions. o as.data.frame() gains a '...' argument. o Added an as.data.frame() method for class "ftable". o as.list() is now handled by internal code and no longer loses attributes such as names. as.list() no longer duplicates (unnecessarily). o as.POSIX[cl]t can now convert character strings containing fractional seconds. o attach() can now attach a copy of an environment. o available.packages() and installed.packages() gain a 'fields' argument thanks to Seth Falcon. o axis.POSIXct() uses a different algorithm for ranges of 2 to 50 days that will mark days at midnight in the current timezone (even if the graph crosses a DST change).
o body<-() and formals<-() default to envir = environment(fun), that is they do not by default change the environment. (Previously they changed it to parent.frame().) o New function combn(x, m, ..) for computing on all combinations of size 'm' (for small 'm' !). o The cumxxx() functions now handle logical/integer arguments separately from numeric ones, and so return an integer result where appropriate. o data.frame() has a new argument 'stringsAsFactors'. This and the default for read.table(as.is=) are set from the new global option 'stringsAsFactors' via the utility function default.stringsAsFactors(). o dev.interactive() now has an optional argument 'orNone'. o df() now has a noncentrality argument 'ncp', based on a contribution by Peter Ruckdeschel. o example() gains an argument 'ask' which defaults to "TRUE when sensible", but the default can be overridden by setting option 'example.ask'. o expand.grid() now has an argument 'KEEP.OUT.ATTRS' which can suppress (the potentially expensive) "out.attrs" attribute. It no longer returns an extraneous 'colnames' attribute. o The subset and subassign methods for factors now handle factor matrices, and dim() can be set on a factor. o There is now a format() method for class "ftable". o head(x, n) and tail(x, n) now also work for negative arguments, thanks to Vincent Goulet. o head.matrix() and tail.matrix() are no longer hidden, to be used for building head() and tail() methods for other classes. o If help() finds multiple help files for a given topic, a menu of titles is used to allow interactive choice. o help.search() now rebuilds the database if 'package' specifies a package not in the saved database. o hist(*, plot = FALSE) now warns about unused arguments. o history() gains a 'pattern' argument as suggested by Romain Francois. o integer(0) now prints as that rather than "numeric(0)" (it always deparsed as "integer(0)").
o interaction(..., drop=TRUE) now gives the same result as interaction(...)[,drop=TRUE] (it used to sometimes give a different order for the levels). o lag.plot() produces a conventional plot (not setting mfrow) if only one plot is to be produced. o lapply() does much less copying. Vector X are handled without duplication, and other types are coerced via as.list(). (As a result, package 'boot' runs its examples 4% faster.) lapply() now coerces to a list (rather than traverse the pairlist from the beginning for each item). o legend() has new parameters 'box.lwd' and 'box.lty'. o lines() gains a simple method for isoreg() results. o load() no longer coerces pairlists to lists (which was undocumented, but has been happening since 1998). o make.link() now returns an object of class "link-glm". The GLM families accept an object of this class for their 'link' argument, which allows user-specified link functions. Also, quasi() allows user-specified variance functions. o mapply() uses names more analogously to lapply(), e.g.. o matplot() now accepts a 'bg' argument similarly to plot.default() etc. o median() is now generic, and its default method uses mean() rather than sum() and so is more widely applicable (e.g. to dates). o Dummy functions memory.size() and memory.limit() are available on Unix-alikes, for people who have not noticed that documentation is Windows-specific. o merge() works more efficiently when there are relatively few matches between the data frames (for example, for 1-1 matching). The order of the result is changed for 'sort = FALSE'. o merge() now inserts row names as a character column and not a factor: this makes the default sort order more comprehensible. o Raw, complex and character vectors are now allowed in model frames (there was a previously undocumented restriction to logical, integer and numeric types.). Character vectors in a formula passed to model.matrix() are converted to factors and coded accordingly. 
o modifyList() utility, typically for housekeeping nested lists. o x <- 1:20; y <- rnorm(x); nls(y ~ A*exp(-x^2/sig)) no longer returns an unhelpful error message. In this and similar cases, it now tries a wild guess for starting values. o Ops.difftime() now handles unary minus and plus. o Ops.Date() and Ops.POSIXt() now allow character arguments (which are coerced to the appropriate class before comparison, for Ops.POSIXt() using the current time zone). o There is a new option(max.contour.segments = 25000) which can be raised to allow extremely complex contour lines in contour() and contourLines(). (PR#9205) o options(max.print = N) where N defaults to 99999 now cuts printing of large objects after about N entries. print(x, ..., max = N) does the same for the default method and those building on print.default(). options("menu.graphics") controls if graphical menus should be used when available. options("par.ask.default") allows the default for par("ask") to be set for a newly-opened device. (Defaults to FALSE, the previous behaviour.) The way option("papersize") is set has been changed. On platforms which support the LC_PAPER locale category, the setting is taken first from the R_PAPERSIZE environment variable at run time, then from the LC_PAPER category ("letter" for _US and _CA locales and "a4" otherwise). On other platforms (including Windows and older Unixen), the choice is unchanged. o package.skeleton() gains arguments 'namespace' and 'code_files'. o par(ask=TRUE) now only applies to interactive R sessions. o parse() now returns up to 'n' expressions, rather than fill the expressions vector with NULL. (This is now compatible with S.) o The 'version' argument for pdf() is now increased automatically (with a warning) if features which need a higher level are used. o pie() now allows expressions for 'labels', and empty slices. 
o There is a new '%.%' operator for mathematical annotations (plotmath) which draws a centred multiplication dot (a \cdot in LaTeX), thanks to Uwe Ligges. o predict.lm() gains a 'pred.var' argument. (Wishlist PR#8877.) o print.summary.{aov,glm,lm,nls} and print.{aov,glm} make use of naprint() to report when na.action altered the model frame. o print.table(T, zero.print=ch) now also replaces 0 by ch when T is non-integer with integer entries. o Recursive rapply() which is similar to lapply but used recursively and can restrict the classes of elements to which it is applied. o r2dtable() has been moved to package 'stats'. o New function read.DIF() to read Data Interchange Format files, and (on Windows) this format from the clipboard. o New experimental function readNEWS() to read R's own "NEWS" file and similarly formatted ones. o readLines() has a new argument 'warn' to suppress warnings: the default behaviour is still to warn. o reg.finalizer() has a new argument 'onexit' to parallel the C-level equivalent R_RegisterFinalizerEx. o rep() is now a primitive function and under some conditions very much faster: rep.int() is still a little faster (but does less). (Because it is primitive there are minor changes to the call semantics: see the help page.) o The 'row.names' of a data frame may be stored internally as an integer or character vector. This can result in considerably more compact storage (and more logical row names from rbind) when the row.names are 1:nrow(x). However, such data frames are not compatible with earlier versions of R: this can be ensured by supplying a character vector as 'row.names'. row.names() will always return a character vector, but direct access to the attribute may not. The internal storage of row.names = 1:n just records 'n', for efficiency with very long vectors. The "row.names" attribute must be a character or integer vector, and this is now enforced by the C code. 
o The "data.frame" and "matrix" methods for rowsum() gain an 'na.rm' argument. o Experimental support for memory-use profiling via Rprof(), summaryRprof(), Rprofmem() and tracemem(). o save.image() [also called by sys.save.image() and hence from q()] now defaults to saving compressed binary images. To revert to the previous behaviour set option "save.image.defaults": see ?save.image. o There is a new primitive seq.int() which is slightly more restricted than seq() but often very much faster, and new primitives seq_along() and seq_len() which are faster still. o serialize(connection = NULL) now returns a raw vector (and not a character string). unserialize() accepts both old and new formats (and has since 2.3.0). o setwd() now returns the previously current directory (invisibly). o The function sort() is now sort.int(), with a new generic function sort() which behaves in the same way (except for the order of its argument list) for objects without a class, and relies on the '[' method for objects with a class (unless a specific method has been written, as it has for class "POSIXlt"). o sort.list() now implements complex vectors (PR#9039), and how complex numbers are sorted is now documented. o spline() and splinefun() now follow approx[fun] to have an argument 'ties = mean' which makes them applicable also when 'x' has duplicated values. o str(x) does not print the S3 "class" attribute when it is the same as 'mode' (which is printed anyway, possibly abbreviated) and it puts it beside mode for atomic objects such as S3 class "table". o str() now outputs 'data.frame' instead of `data.frame'; this may affect some strict (Package) tests. o str() now takes also its defaults for 'vec.len' and 'digits.d' from options('str') which can be set by the new strOptions(). o symnum() has a new argument 'numeric.x' particularly useful for handling 0/1 data. o Sys.getlocale() and Sys.setlocale() support LC_MESSAGES, LC_PAPER and LC_MEASUREMENT if the platform does. 
o Sweave has new options 'pdf.encoding' and 'pdf.version' for its Rweave driver. o The character vector used by an output textConnection() has a locked binding whilst the connection is open. There is a new function textConnectionValue() to retrieve the value of an output textConnection(). o traceback() gains a 'max.lines' argument. .Traceback is no longer stored in the workspace. o warning(immediate. = TRUE) now applies to getOption("warn") < 0 and not just == 0. o warnings() is now an accessor function for 'last.warning' (which is no longer stored in the workspace) with a print() method. o The internal internet download functions have some new features from libxml 2.6.26. o There is an option "HTTPUserAgent" to set the User Agent in R download requests etc. Patch from S. Falcon. o PCRE has been updated to version 6.7. o The C function substituteList now has tail recursion expanded out, so C stack overflow is less likely. (PR#8141, fix by Kevin Hendricks) o The (somewhat soft) 1023/4 byte limit on command lines is now documented in 'An Introduction to R'. o The maximum number of open connections has been increased from 50 to 128. o There is a new manual 'R Internals' on R internal structures plus the former appendices of 'Writing R Extensions'. o The autoloads introduced at the package re-organization have been almost completely removed: the one that remains is for ts(). o The setting of the various Java configuration variables has been improved to refer to JAVA_HOME, and they are now documented in the R-admin manual. o It is (again) possible to calculate prediction intervals from "lm" objects for the original data frame, now with a warning that the intervals refer to future observations. Weighted intervals have also been implemented, with user-specifiable weights. Warnings are given in cases where the default behaviour might differ from user expectations. See ?predict.lm for details.
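A few of the new features listed above (seq_len() and seq_along(), rapply(), serialize(connection = NULL) and the return value of setwd()) can be tried with a short sketch such as the following. It is illustrative only and assumes a current R session; the object names are made up for the example and do not come from the NEWS entries.

```r
# Illustrative sketch only; object names here are hypothetical.

# seq_len() and seq_along() give an empty sequence for zero length,
# whereas 1:n counts down from 1 to 0 when n is 0.
n <- 0L
empty_seq <- seq_len(n)                    # integer(0)
colon_seq <- 1:n                           # c(1L, 0L)
along_seq <- seq_along(c("a", "b", "c"))   # 1 2 3

# rapply() applies a function recursively, here only to numeric leaves;
# how = "replace" leaves elements of other classes untouched.
nested <- list(a = 1:3, b = list(c = "text", d = c(2.5, 3.5)))
doubled <- rapply(nested, function(v) v * 2,
                  classes = c("integer", "numeric"), how = "replace")

# serialize(connection = NULL) returns a raw vector that unserialize() reads back.
payload <- serialize(nested, connection = NULL)
restored <- unserialize(payload)

# setwd() returns the previous working directory invisibly, so it can be restored.
old_dir <- setwd(tempdir())
setwd(old_dir)
```

Using seq_len(n) instead of 1:n in a for() loop means the loop body is skipped when n is zero.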
CHANGES TO S4 METHODS o The default prototype object for S4 classes will have its own internal type in 2.4.0, as opposed to being an empty list (the cause of several errors in the code up to 2.3.1). Note that old binary objects, including class definitions, will be inconsistent with the type, and should be recreated. o S4 method dispatch has been completely revised to use cached generic functions and to search for the best match among inherited methods. See ?Methods and http://developer.r-project.org/howMethodsWork.pdf o Objects created from an S4 class are now marked by an internal flag, tested by isS4() in R and by macro IS_S4_OBJECT() in C. This is an efficient and reliable test, and should replace all earlier heuristic tests. o Some changes have been made to automatic printing of S4 objects, to make this correspond to a call to show(), as per 'Programming with Data'. o S4 generic and class definitions are now cached when the related package is loaded. This should improve efficiency and also avoid anomalous situations in which a class or generic cannot be found. o trace() now creates a new S4 class for the traced object if required. This allows tracing of user-defined subclasses of "function". DEPRECATED & DEFUNCT o The re-named tcltk functions tkcmd, tkfile.tail, tkfile.dir, tkopen, tkclose, tkputs, tkread are now defunct. o Argument 'col' of bxp() has been removed: use 'boxfill'. o Use of NULL as an environment is now an error. o postscriptFont() is defunct: use Type1Font(). o La.chol() and La.chol2inv() are defunct (they were the same as the default options of chol() and chol2inv). o La.svd(method = "dgesvd") is defunct. o Files install.R and R_PROFILE.R in packages are now ignored (with a warning). o The following deprecated command-line options to INSTALL have been removed (use the fields in the DESCRIPTION file instead): -s --save --no-save --lazy --no-lazy --lazy-data --no-lazy-data o Graphical parameter 'tmag' is obsolete. 
o mauchley.test() (package 'stats') is now defunct. o symbol.C() and symbol.For() are deprecated. They are required in S for use with is.loaded(), but are not so required in R. o load()ing an object saved in one of the formats used prior to R 1.4.0 is deprecated. Such objects should be re-saved in the current format. o save(version = 1) is now deprecated. C-LEVEL FACILITIES o The convenience function ScalarLogical now coerces all non-zero non-NA values to TRUE. o The vector accessor functions such as INTEGER, REAL and SET_VECTOR_ELT now check that they are called on the correct SEXPTYPE (or at least on a compatible one). See `Writing R Extensions' for the details and for a stricter test regime. o It is no longer possible to pass list variables to .C(DUP = FALSE): it would have given rise to obscure garbage collection errors. o allocString is now a macro, so packages using it will need to be reinstalled. o R_ParseVector was returning with object(s) protected in the parser if the status was PARSE_INCOMPLETE or PARSE_ERROR. o There is a new function Rf_endEmbeddedR to properly terminate a session started by Rf_initEmbeddedR, and both are now available on Windows as well as on Unix-alikes. These and related functions are declared in a new header . If R_TempDir is set when embedded R is initialized it is assumed to point to a valid session temporary directory: see `Writing R Extensions'. o There is a new interface allowing one package to make C routines available to C code in other packages. The interface consists of the routines R_RegisterCCallable and R_GetCCallable. These functions are declared in . This interface is experimental and subject to change. In addition, a package can arrange to make use of header files in another (already installed) package via the 'LinkingTo' field in the DESCRIPTION file: see 'Writing R Extensions'. UTILITIES o R CMD SHLIB now handles (as linker commands) -L*, -l* and *.a. 
o R CMD check now: - warns if there are non-ASCII characters in the R code (as these will likely be syntax errors in some locale). - tests Rd cross-references by default, and tests for (syntactically) valid CITATION metadata. - tests that the package can be loaded, and that the package and name space (if there is one) can each be loaded in startup code (before the standard packages are loaded). - tests for empty 'exec' or 'inst' directories. - checks if $(FLIBS) is used when $(BLAS_LIBS) is. - checks that all packages (except non-S4-using standard packages) used in ::, :::, library() and require() calls are declared in the DESCRIPTION file, and 'methods' is declared if S4 classes or methods are set. - throws an error if the standard packages 'methods' and 'stats4' are imported from in the NAMESPACE file and not declared in the DESCRIPTION file. o The test script produced by massage-Examples.pl no longer creates objects in the base environment. o New utilities R CMD Stangle and R CMD Sweave for extracting S/R code from and processing Sweave documentation, respectively. o The DESCRIPTION file of packages may contain an 'Enhances:' field. o An R CMD javareconf script has been added to allow Java configuration to be updated even after R has been installed. INSTALLATION o The C function realpath (used by normalizePath()) is hidden on some systems and we try harder to find it. o There is a new option --enable-BLAS-shlib, which compiles the BLAS into a dynamic library -lRblas and links against that. For the pros and cons see the R-admin manual. The defaults are now --without-blas (so you have explicitly to ask for an external BLAS), and --enable-BLAS-shlib unless a usable external BLAS is found or on AIX or on MacOS X 10.2 and earlier. o MacOS X did not like having LSAME in both BLAS and LAPACK libraries, so it is no longer part of the R-internal LAPACK. We now require an external BLAS to provide LSAME: it seems that nowadays all do.
o The configure test for 'whether mixed C/Fortran code can be run' has been improved as on one system that test passed but the Fortran run-time library was broken. o A precious configure variable DEFS can be set to pass defines (e.g. -DUSE_TYPE_CHECKING_STRICT) to C code when compiling R. o There is now a test for visible __libc_stack_end on Linux systems (since it is not visible on some recent glibc's built from the sources). o MacOS X 10.4 and higher now use two-level namespaces, single module in a shared library and allow undefined symbols to be resolved at run-time. This implies that common symbols are now allowed in package libraries. --enable-BLAS-shlib is supported for internal BLAS, external BLAS framework and external static BLAS. An external dynamic library BLAS is NOT supported. (But it can be easily used by replacing internal BLAS library file later.) MacOS X < 10.4 does not support --enable-BLAS-shlib. o Dynamic libraries and modules use a flat namespace on MacOS X 10.4 and higher if either Xcode tools don't support dynamic lookup (Xcode < 2.3) or the FORCE_FLAT_NAMESPACE environment variable is set. (The latter was introduced temporarily for testing purposes and may go away anytime.) o configure now defaults to 'run-time linking' on AIX (and AIX < 4.2 is no longer allowed), using -bexpall rather than export/import files. If this works, it allows R to be built in the same way as other Unix-alikes, including with R as a shared library and with a shared BLAS. o The "mac.binary" package type now defaults to universal binary. If a repository supports architecture-specific Mac binaries, they can be requested by using "mac.binary.xxx" in contrib.url(), where xxx is the desired architecture. o There is a new configure option --enable-memory-profiling to enable memory profiling with tracemem, Rprof, Rprofmem. BUG FIXES o The name of a Fortran symbol reported to be missing by .Fortran() is now the actual name. 
(What was reported to be an 'entry point' was missing the common leading underscore.) o print() on a MBCS character string now works properly a character at a time rather than a byte at a time. (This does not affect MBCSs like UTF-8 and the Windows DBCSes which have non-ASCII lead bytes and always worked correctly.) o glm() now recalculates the null deviance whenever there is an offset (even if it is exactly zero to avoid a discontinuity in that case, since the calculations with and without offset are done by different algorithms). o Amongst families, quasi() accepted an expression for link and no other did. Now all accept an expression which evaluates to a one-element character vector (although e.g. 'logit' is taken as a name and not an expression). o trace() now accepts arguments where= and signature= for the old-style trace (no tracer or exit, edit==FALSE) and just prints a message on entry. Also the undocumented feature of where=function now works for generic functions as well. o callNextMethod() failed for recursive use when the methods had nonstandard argument lists. Now enforces the semantic rule that the inheritance is fixed when the method containing the callNextMethod() is installed. See Details in the documentation. o UseMethod() looked for the defining environment of 'generic' as if it were the current function, although some functions are generic for methods of a different generic. Lookup for S3 methods is confined to functions: previously a non-function 'fun.class' could have masked a function of the same name. o Line types (lty) specified as hex strings were documented not to allow zero, but some devices accepted zero and handled it in a device-dependent way. Now it is an error on all devices. (PR#8914) o Subassignment for a time series can no longer extend the series: it used to attempt to but failed to adjust the tsp attributes. Now window() must be used.
o Function AIC() in package 'stats4' was not dispatching correctly on S4 classes via logLik() because of name space issues. o Subsetting LANGSXPs could break the call-by-value illusion. (PR#7924) (patch from Kevin Hendricks). o parse() with n > 1 gave a syntax error if fewer than n statements were available. o parse() with n > 1 gave strange results on some syntax errors. (PR#8815) o lag.plot() now respects graphical parameters for the axes. o Using a wrong link in family() now gives more consistent error messages. o sort.list(method="radix") works on factors again. o object.size() is more accurate for vector objects (it takes into account the smaller header and also the fixed sizes used in the node classes for small vector objects). o addmargins(T, ...) now returns a "table" when 'T' is a "table", as its help page has always suggested. o remove() now explicitly precludes removing variables from baseenv() and throws an error (this was previously ignored). o Saving the workspace at the end of a session now works as has long been intended, that is it is saved only if something has been added/deleted/changed during the current session. o The search for bindings in <<-, ->> and assign(inherits=TRUE) was omitting the base package, although this was not documented. Now the base package is included (but most bindings there are locked). o dweibull(0, shape) was NaN not Inf for shape < 1. Also, the help for dgamma and dweibull gave support as x > 0, but returned non-zero values for x = 0. (PR#9080) o Subsetting arrays no longer preserves attributes (it was removed for matrices in 1998). o The "factor" method of as.character() no longer maps level "NA" to "" (a legacy of before there were NA character strings). o terms(keep.order=TRUE) was not returning a valid "order" attribute. o The DLL registration code was not freeing .External symbols. o The internet download routines expected URLs of less than 4096 bytes, but did not check. 
Now this is checked, and http:// URLs are allowed to be up to 40960 bytes. o parse(n=-1) threw a stack-imbalance error, and parse(n=3) did not cope correctly with EOF during input. o Zero-column data frames had no names (rather than character(0)). o by() and acf() could get confused when they used very long expressions as names. o residuals(, type="working") was NA for cases with zero weight (whereas they are well-defined even though the case was not used during the fitting) and the actual value is now returned. This allows residuals to be computed from fits with 'y = FALSE'. The residuals in a fitted "glm" object are computed more accurately: the previous formula was subject to cancellation. o loess() now checks the validity of its 'control' argument. o rownames(<0-row matrix>, do.NULL=FALSE) was wrong. (PR#9136) o apply() now works as documented when applied over 2 or more margins with one of zero extent. (It used to drop dimensions.) o head() and tail() now also work row-wise for "table" and "ftable" objects. o NextMethod() could throw an error/crash if called from a method that was called directly rather than from a generic (so .Method was unset). o order(x, na.last = NA) failed for a zero-length x. o grep(pat, x, value = TRUE, perl = L) preserved names for L == TRUE && !is.na(pat) but not otherwise. Now it always does. o [rc]bind() now find registered methods and not just visible ones. o Printing a factor no longer ignores attributes such as names and dim/dimnames. o Command-line arguments after --encoding were ignored. o The check for impossible confidence levels was off by one in wilcox.test (PR#8557) o [[ on an environment could create aliases. (PR#8457) o pt() with a very small (or zero) non-centrality parameter could give an unduly stringent warning about 'full precision was not achieved'. (PR#9171) o writeChar() could segfault if 'nchars' was given silly values. o qt() and rt() did not work for vector 'ncp', and qt() did not work for negative 'ncp'. 
o ns() failed to work correctly when 'x' was of length one. o identical() ignored tags on pairlists (including names of attributes) and required an identical ordering for attribute values in their pairlists. Now names are compared on pairlists, and attribute sets are treated as unordered. o If there were unused arguments supplied to a closure, only the first non-empty one was reported, despite the message. Unmatched empty arguments (such as f(1,,) for a function of one argument) were ignored. They are now an error. o Calling a builtin with empty arguments used to silently remove them (and this was undocumented). Now this is an error unless builtin is c() or list() or there are only trailing empty arguments, when it is a warning (for the time being: this will be made an error in R 2.5.0). o install.packages() ignored 'configure.args' if the vector was unnamed. o biplot() now works if there are missing values in the data. o biplot() now passes par() values to all four axes (not just those on sides 1 and 2). o [.acf now handles an empty first index. o Deparsing uses backticks more consistently to quote non-syntactic names. o Assigning to the symbol in a for() loop with a list/expression/pairlist index could alter the index. Now the loop variable is explicitly read-only. (PR#9216) o Using old.packages() (and hence update.packages()) on an empty (or non-existent) library failed with an obscure message. o plot.xy() could segfault if supplied with an invalid 'col' argument. (PR#9221) o menu() with graphics=TRUE attempted to use Tcl/Tk on unix even if DISPLAY was not set (in which case Tk is not available and so the attempt is bound to fail). o The print() method for 'dist' objects prints a matrix even for n = 2. o The cumxxx functions were missing some PROTECTs and so could segfault on long vectors (especially with names or where coercion to numeric occurred).
o The X11() device no longer produces (apparently spurious) 'BadWindow (invalid Window parameter)' warnings when run from Rcmdr. o legend() assumed that widths and heights of strings were positive, which they need not be in user coordinates with reversed axes. (In part, PR#9236) o The plot() methods for "profile.nls" objects could get confused if 'which' had been used in the profile() call. (PR#9231) o boxplot() did not pass named arguments (except graphics parameters) to bxp() as documented. (PR#9183) o Only genuinely empty statements act as 'return' in the browser, not say those starting with a comment char. (PR#9063) o summary.mlm() incorrectly used accessor functions to fake an "lm" object. (PR#9191) o prettyNum() was not preserving attributes, despite being explicitly documented to. (PR#8695) o It was previously undocumented what happened if a graphical parameter was passed in both '...' and 'pars' to boxplot() and bxp(), and they behaved differently. Now those passed in '...' have precedence in both cases. o A failed subassignment could leave behind an object '*tmp*'. The fix also sometimes gives better error messages. o Using SIGUSR1 on Unix now always terminates a session, and no longer is caught by browser contexts and restarts (such as try()). o In the 'graphics' package, in-line 'font=5' was being ignored (report by Tom Cook). o nls() looked for non-parameter arguments in a function call in the wrong scope (from the body of nls). o Printing of complex numbers could misbehave when one of the parts was large (so scientific notation was used) and the other was so much smaller that it had no significant digits and should have been printed as zero (e.g. 1e80+3e44i). o Using install.packages with type="mac.binary" and target path starting with ~ failed with a cryptic message while unpacking. o getwd() now works correctly when the working directory is unavailable (e.g. unreadable).
o The alternative hypothesis in wilcox.test() was labelled by an unexplained quantity 'mu' which is now spelled out. The alternative hypothesis in ks.test() is clearer both in the documentation and in the result. (PR#5360) ************************************************** * * * 2.3 SERIES NEWS * * * ************************************************** CHANGES IN R VERSION 2.3.1 patched BUG FIXES o help.search() incorrectly documented where the data base was saved and what was in it. o str(, strict.width=.) produced an extraneous first line. o Improve bessel[IJKY]() out-of-range checks; for instance, besselI(0, nu) no longer warns. o cbind() on some non-vector objects (e.g. names) segfaulted. o xy.coords(numeric(0)) gave a misleading error message. o INSTALL has been changed to avoid a few problems with the base (not XPG4) Solaris sed. o There were problems in src/main/printutils.c on some platforms that did not have va_copy. o [pq]unif allowed infinite ranges but handled them inconsistently. (PR#8958) [pq]unif claimed to allow min == max but did not do so. o Subassignment within raw arrays of 3 or more dimensions was not implemented. o environment<-() did not even check if duplication was required. o rt() was wrong for non-central t-distributions. qt() was wrong for non-central distributions far enough in the left tail (reported by Long Qu). (PR#9050) o R CMD Rdconv --type=Ssgm no longer drops a \keyword{} if it is the only one (one letter typo patch by Bill Dunlap). (PR#9051) o poly(x, y, .., raw = TRUE) now does follow 'raw'. o plot(0/ -1:1, type = "s") now works. (PR#9046) o write.table(row.names=FALSE) does now quote column names. (PR#9044) o rf() was wrong for non-central F (PR#9055). qf() was wrong for non-central F with denominator df > 1e8. o The C function dummy_vfprint had a potential memory leak. This showed up with some connections when used with output of more than 100,000 bytes in a single call. 
o vignette() did not work if more than one vignette with the same name was installed (PR#9069). o mget() did not check that 'ifnotfound' was a list, and worked incorrectly if it was a list of length > 1. Now 'ifnotfound' is (if possible) coerced to a list. o The internal check for conformance of time series assumed that the 'tsp' attribute was integer, whereas it was enforced to be numeric by the attribute-setting code. o Sweave() did not treat \begin{document} in comments correctly (PR#9073). o diff(x) gave wrong result when x was "POSIXlt" object. o title() could crash R when passed a long non-character vector. (PR#9115) o Functions rgb(), hsv() and hcl() lost names unless 'alpha' was specified. (In part, PR#9118) o The residuals() method for "glm" failed unless 'y = TRUE' for the fit: it now works for "working" and "partial" residuals and gives a useful error message for other types. (PR#9124) o R CMD Rdconv did not close the .Rd file, which matters on Windows. (PR#9126) o anova.mlm() could fail if applied to a single model. o stopifnot() now gives a better message for a *long* expression. CHANGES IN R VERSION 2.3.1 NEW FEATURES o In lm() and glm(), offsets are allowed to be length 1 (and if so are replicated to the number of cases). o The \uxxxx notation for Unicode characters in input strings can now be used on any platform which supports MBCS, even if the current locale is not MBCS (provided that the Unicode character is valid in the current character set). o The quasibinomial() family now allows the "cauchit" link. (PR#8851) o edit.data.frame() no longer (silently) coerces character columns to factor. C-LEVEL FACILITIES o The variables controlling stack checking are made available via Rinterface.h to front-ends embedding R: see 'Writing R Extensions' o R_SignalHandlers (defined in Rinterface.h) can be set to 0 to suppress the R signal handlers in front-ends embedding R. 
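Two of the 2.3.1 additions listed under NEW FEATURES above, the \uxxxx notation in input strings and the "cauchit" link for quasibinomial(), can be checked with a sketch like this one. It assumes a current R session, and the variable names are hypothetical.

```r
# Illustrative sketch only; variable names are hypothetical.

# "\u00e9" denotes LATIN SMALL LETTER E WITH ACUTE (Unicode code point U+00E9).
e_acute <- "\u00e9"
code_point <- utf8ToInt(e_acute)      # 233, i.e. hex e9

# quasibinomial() accepts the "cauchit" link, whose link function is the
# Cauchy quantile function.
fam <- quasibinomial(link = "cauchit")
eta <- fam$linkfun(0.75)              # qcauchy(0.75) = tan(pi / 4) = 1
mu  <- fam$linkinv(eta)               # back-transforms to 0.75
```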
INSTALLATION CHANGES o There have been a number of changes to help installation on platforms that no one had beta-tested. - Changes related to older header files, e.g. on Redhat 8.0/9. - Problems with 'make install' on older (<3) versions of bash on Solaris and elsewhere. - AIX 5.2/gcc issues with needing -lm when making modules X11 and vfonts. - Some versions of Solaris and AIX had an fcntl.h that redefined 'open' to be 'open64' and thereby broke compilation of src/main/connections.c and elsewhere. o 'make uninstall' works better on a build using a named subarchitecture. BUG FIXES o min(), max(), sum() and prod() gave nonsensical answers with an empty list or raw argument. o sum() on a data frame did not allow multiple arguments. (PR#8385) o charmatch() and pmatch() did not specify they applied only to character vectors. Now they do, and attempt to coerce 'x' and 'target' to character before attempting matching. o The Summary() methods for data.frame, Date, POSIXct, POSIXlt and difftime all required an argument which can match 'x', although the generics did not. o regexpr() now accepts 0-length 'text' inputs. o help.search() no longer errors out on a wrongly installed package (with no "hsearch.rds" file). o The LaTeX version of the package reference manual was omitting some topics, and was not sorting the foo-package topic first. o Serializing (e.g. via save()) is better protected against C stack overflow, which will now abort the conversion but no longer crashes the R process on some platforms. o rbind()ing dataframes with a single row could lead to a corrupt data frame (a problem with the fix to PR#8506). o plot(lm(y ~ 1)) now works also for 'which = 5'. o dbeta(0, 1, a, 0) now correctly gives 'a' (limit) instead of 0, and dbeta(0, a, b, ncp) now returns Inf instead of NaN. o demo(Hershey) was failing on the Cyrillic octal codes in locales (e.g. UTF-8) in which these are invalid. 
o mean() on an integer (or logical) vector was treating NAs as actual values (unless na.rm = TRUE). o mean() on a complex vector was calculated incorrectly in code to improve precision (PR#8842, John Peters). o Graphical parameters bg, cex, col, lty, and lwd were being checked as being of length one even by functions such as title() that ignored them. (Functions such as lines() and points() allow them to be of length > 1, so they might be passed through ... to other high-level graphical functions which then used to reject them.) o str() now is fast again for large character vectors. o edit() would default the environment of a function to .BaseEnv, instead of to .GlobalEnv. o lm() and glm() coerce their 'weights' and 'offset' values to vector to avoid problems with specifying them as 1D or n x 1 arrays. o image() with one or both axes on log scales would give a spurious warning; contour() would give an error. o legend() with log axes would place the title in the wrong place. o edit.data.frame() was not returning factors edited with factor.mode="numeric" to factors. o edit.matrix() tried to set rownames and colnames from the original matrix even if the sizes had been altered, and ignored changes made to the column names. 'edit.row.names' has a more sensible default (if the rownames are non-NULL). o bindingIsLocked() was returning invalid values of a logical vector on some platforms. o merge.data.frame() did not make the column names unique (by appending elements of 'suffixes') when performing a Cartesian product. (PR#8676) o rbind.data.frame() matches up the names of columns (which was undocumented), but failed to do so when checking if it was dealing with a factor column. (PR#8868) If rbind() was used on data frames with duplicated names it produced a corrupt data frame. o dt(x, df, ncp= not.0) no longer gives erratic values for |x| < ~1e-12. (PR#8874) o \code{\linkS4class{.}} now works. 
o ccf() aligns time series by ts.intersect() rather than ts.union() and so is less likely to need a non-default na.action. (PR#8893) o optim(method="CG") could return a value that did not correspond to $par for very badly behaved functions on which the second phase of the line search failed. (PR#8786) o print.ts() could fail on a corrupt time series: it now warns and does the best it can. CHANGES IN R VERSION 2.3.0 USER-VISIBLE CHANGES o In the grid package there are new 'arrow' arguments to grid.line.to(), grid.lines(), and grid.segments() (grid.arrows() has been deprecated). The new 'arrow' arguments have been added BEFORE the 'name', 'gp' and 'vp' arguments so existing code that specifies any of these arguments *by position* (not by name) will fail. o all.equal() is more stringent, see the PR#8191 bug fix below. o The data frame argument to transform() is no longer called 'x', but '_data'. Since this is an invalid name, it is less likely to clash with names given to transformed variables. (People were getting into trouble with transform(data, x=y+z).) NEW FEATURES o arima.sim() has a new argument 'start.innov' for compatibility with S-PLUS. (If not supplied, the output is unchanged from previous versions in R.) o arrows() has been changed to be more similar to segments(): for example col=NA omits the arrow rather than as previously (undocumented) using par("col"). o as.list() now accepts symbols (as given by as.symbol() aka as.name()). o atan2() now allows one complex and one numeric argument. o The 'masked' warnings given by attach() and library() now only warn for functions masking functions or non-functions masking non-functions. o New function Axis(), a generic version of axis(), with Date and POSIX[cl]t methods. This is used by most of the standard plotting functions (boxplot, contour, coplot, filled.contour, pairs, plot.default, rug, stripchart) which will thus label x or y axes appropriately. 
o pbeta() now uses TOMS708 in all cases and so is more accurate in some (e.g. when lower.tail = FALSE and when one of the shape parameters is very small). o [qr]beta(), [qr]f() and [qr]t() now have a non-centrality parameter. o [rc]bind and some more cases of subassignment are implemented for raw matrices. (PR8529 and 8530) o The number of lines of deparsed calls printed by browser() and traceback() can be limited by the option "deparse.max.lines". (Wish of PR#8638.) o New canCoerce() utility function in "methods" package. o [pq]chisq() are considerably more accurate for moderate (up to 80) values of ncp, and lower.tail = FALSE is fully supported in that region. (They are somewhat slower than before.) o chol(pivot = TRUE) now gives a warning if used on a (numerically) non-positive-definite matrix. o chooseCRANmirror() consults the CRAN master (if accessible) to find an up-to-date list of mirrors. o cov.wt() is more efficient for 'cor = TRUE' and has a new 'method' argument which allows 'Maximum Likelihood'. o do.call() gains an 'envir' argument. o eigen() applied to an asymmetric real matrix now uses a tolerance to decide if the result is complex (rather than expecting the imaginary parts of the eigenvalues to be exactly zero). o New function embedFonts() for embedding fonts in PDF or PostScript graphics files. o fisher.test() now uses p-values computed via hypergeometric distributions for all 2 by 2 tables. This might be slightly slower for a few cases, but works much better for tables with some large counts. There is a new option to simulate the p-value for larger than 2 x 2 tables. o for() now supports raw vectors as the set of indices. o getNativeSymbolInfo() is vectorized for the 'name' argument. It returns a named list of NativeSymbolInfo objects, but is backward compatible by default when called with a character vector of length 1, returning the NativeSymbolInfo object. 
o help.search() no longer attempts to handle packages installed prior to R 2.0.0, and reports the current path to the package (rather than where it was originally installed: this information is not shown by the print() method). o Added "hexmode" to parallel "octmode". o install.packages() now does tilde expansion on file paths supplied as 'pkgs'. o install.packages() has additional arguments 'configure.args' and 'clean' which allow the caller to provide additional arguments to the underlying R CMD INSTALL shell command when installing source packages on a Unix-alike. o is.loaded() has a new argument 'type' to confine the search to symbols for .C, .Fortran, .Call or .External: by default it looks for a symbol which will match any of them. It is now internal and not primitive, so argument matching works in the usual way. o The symmetry test for matrices used in eigen() has been ``exported'' as the 'matrix' method of a new S3-generic isSymmetric(). o .leap.seconds and the internal adjustment code now know about the 23rd leap second on 2005-12-31: the internal code uses a run-time test to see if the OS does. o The 'col' argument of legend() now defaults to par("col") (which defaults to "black", the previous default), so that the lines/symbols are shown in the legend in the colour that is used on the plot. o log2() and log10() call C functions of the same name if available, and will then be more likely to be precise to machine accuracy. o new.packages() gains a ... argument to pass e.g. 'destdir' to install.packages(). (Wish of PR#8239.) o nls() now supports 'weights'. o The vector passed as the first argument of the 'fn' and 'gr' arguments of optim() has the names (if any) given to argument 'par'. o options(expressions) is temporarily increased by 500 during error-handling. This enables e.g. traceback() to work when the error is reaching the limit on the nesting of expressions.
o page() accepts general R objects, not just names (and previously undocumented) character strings. This allows the object to be specified as a call, for example. More options are allowed in its '...' argument. o pairs() allows a wider class of inputs, including data frames with date and date-time columns. o par() and the in-line use of graphical parameters produce more informative error messages, distinguishing between non-existent pars and inappropriate use of valid pars. Graphical parameters 'family', 'lend', 'ljoin' and 'lmitre' can now be set in-line. There is no longer a warning if non-settable pars are used in-line, but there is an appropriate warning if unknown pars are passed. The length limit for the 'family' parameter has been increased to 200 bytes, to allow for the names of some CID-keyed fonts in multi-byte locales. o The pdf() device now allows 'family' to be specified in the same generality as postscript(). o The pdf() device writes /FontDescriptor entries for all fonts except the base 14, and does not write font entries for unused fonts. o Plotmath allows 'vartheta', 'varphi' and 'varsigma' (or 'stigma') as synonyms for 'theta1', 'phi1' and 'sigma1', and the help page has a note for TeX users. o plot.xy() now takes its default arguments from the corresponding par() settings, so points(type="l") and lines(type="p") behave in the same way (and more obviously, also for type="b"). o poly() has a new argument 'raw', mainly for pedagogical purposes. o The class "POSIXlt" now supports fractional seconds (as "POSIXct" has always done). The printing of fractional seconds is controlled by the new option "digits.secs", and by default is off. o postscript() supports family = "ComputerModernItalic" for Computer Modern with italic (rather than slanted) faces. o The postscript()/pdf() font metrics for the 14 standard fonts (only, not the rest of the common 35) have been updated to versions from late 1999 which cover more glyphs. 
There are also a few differences in the metrics and hence the output might be slightly different in some cases. o The way families can be specified for postscript() and pdf() has been expanded to include CID-keyed fonts, with new functions Type1Font() and CIDFont() to set up such font families. o prettyNum() has new arguments 'preserve.width' and 'zero.print'. When the former is not "none", as in calls from format() and formatC(), the resulting strings are kept at the desired width when possible even after the addition of 'big.mark' or 'small.mark'. o proc.time() and system.time() now record times to 1ms accuracy where available (most Unix-like systems). o The initialization methods for the quasi() family have been changed to depend on the variance function, and in particular to work better for the "mu(1-mu)" variance function. (PR#8486) o read.table() gains a 'flush' argument passed to scan(). o require() now takes a 'lib.loc' argument. o The second argument 'size' to sample() is required to have length 1, so that errors when supplying arguments are more easily detected. o The default is now compress = !ascii in save() (but not save.image). o scan() and write.table() now have some interruptibility, which may be useful when processing very large files. o A new heuristic test, seemsS4Object() is supplied, along with a similar C-level test, R_seemsS4Object(object). The test detects probable S4 objects by their class's attribute. See the help page. o S3 classes can now be made non-virtual S4 classes by supplying a prototype object in the arguments to setOldClass(). o splinefun() returns a function that now also has a 'deriv' argument and can provide up to the 3rd derivative of the interpolating spline, thanks to Berwin Turlach. o stopifnot(A) now gives a better error message when A has NAs, and uses "not all TRUE" when A has length >= 2. o str()'s default method has a new argument 'strict.width' which can be used to produce strict 'width' conforming output.
A new options(str = list(strict.width = *)) setting allows this to be controlled for a whole session. o summary.nls() has a new argument 'correlation' that defaults to FALSE (like summary.lm). o Sys.sleep() has sub-millisecond resolution on Unix-alikes with gettimeofday(). o Sys.time() now has sub-millisecond accuracy on systems supporting the POSIX call gettimeofday, and clock-tick accuracy on Windows. o The new function timestamp() adds a time stamp to the saved command history on consoles which support it. o New function tcrossprod() for efficiently computing x %*% t(x) and x %*% t(y). o The suffix used by tempfile() is now in hex on all platforms and guaranteed to be at least 6 hex digits (usually 8). o trace() now works more consistently and more like its documentation, in particular the assertions about old tracing being removed for new. For debugging purposes (of R) a mechanism for debugging the trace computations themselves was added. See trace.R. o The implementation of trace() has been made more general by calling a function to do the trace interaction, and recover() now detects trace calls to trim the irrelevant code underneath. o unserialize() can now also read a byte stream from a raw vector. o The useDynLib() directive in the NAMESPACE file now accepts the names of the native/foreign symbols that are to be resolved in the DLL for use in .C/.Call/.Fortran/.External calls. These can be used as regular R variables instead of the (routine name, PACKAGE) pairs currently recommended. Alternative names can be given for the R variables mapping to these symbols. The native routine registration information can also be used directly via useDynLib(name, .registration = TRUE). See the 'Writing R Extensions' manual for more details. checkFF() (package 'tools') has been updated accordingly. o validObject() has an option complete=TRUE that recursively checks the objects in the slots. Not used when new(...) checks validity.
o New Vectorize() function, a wrapper for mapply(). o write.ftable() has gained an argument 'append = FALSE' (thanks to Stephen Weigand). o On Unix-alikes, X11() now has arguments to request the initial position of the window, and 'gamma' defaults to the value of getOption("gamma"). These changes are consistent with the windows() device. o X11() and the Unix-alike data entry window can have properties (including geometry) set by X resources: see their help files. o xy.coords() & xyz.coords() now have NULL defaults for their 'y' or 'y' and 'z' arguments. This is more consistent with their earlier documentation, and may be convenient for using them. o Non-syntactic names of list elements are now printed quoted by backticks rather than double quotes. o There is some basic checking for imminent C stack overflow (when the evaluation depth and the user interrupts are checked). On systems with suitable OS support (not Windows), segfaults from C stack overflow are caught and treated as an R error. New function Cstack_info() reports on stack size and usage. options(expressions) reverts to the default of 5000 now stack checking is in place. o Package tcltk does not try to initialize Tk on Unix-alikes unless a DISPLAY variable is present. This allows packages dependent on tcltk to be installed without access to an X server. o The code used to guess timezone offsets where not supplied by the OS uses a different algorithm that is more likely to guess the summer-time transitions correctly. o Package tools contains translation tables 'Adobe_glyphs' and 'charset_to_Unicode'. o Changed the environment tree to be rooted in an empty environment, available as emptyenv(). baseenv() has been modified to return an environment with emptyenv() as parent, rather than NULL. o gettext has been updated to 0.14.5. o PCRE has been updated to version 6.4. o The method $.DLLInfo resolves the specified symbol in the DLL, returning a NativeSymbolInfo object.
Use [[ to access the actual values in the DLLInfo object. o On systems with either vasprintf or both va_copy and a vsnprintf which reports the size of buffer required, connections such as gzfile() and bzfile() can now write arbitrarily long lines, not just 100000 chars. o The R session temporary directory is now set in C code using the same algorithm whether or not the shell front-end is used and on all platforms. This looks at environment variables TMPDIR, TMP and TEMP in turn, and checks if they point to a writable directory. o Some of the classical tests put unnecessary restrictions on the LHS in the formula interface (e.g., t.test(x+y ~ g) was not allowed). o On suitably equipped Unix-alike systems, segfaults, illegal operations and bus errors are caught and there is a simple error-handler which gives the user some choice as to what to do in interactive use. [Experimental.] On Windows access violations and illegal instructions are caught with a simple error handler. [Experimental.] o Tracebacks now include calls such as .C/.Fortran/.Call, which will help if errors occur in R code evaluated by compiled code and in tracebacks presented by the segfault etc handlers. o Treatment of signature objects and method definition objects has been modified to give cleaner printing and more consistency in the treatment of signatures. A sometimes useful utility, methodSignatureMatrix(), is now exported. o Printing the results of codoc() from package tools now helpfully summarizes the found code/documentation mismatches. o R refrains from printing a final EOL upon exiting the main loop if the quiet flag is on and if the save action is known (e.g. this is true for --slave). DEPRECATED & DEFUNCT o The deprecated and undocumented use of atan() with two arguments has been removed: instead use atan2(). o write.table0() is defunct in favour of write.table(). o format.char() is defunct in favour of format.default(). 
o Support for the long-deprecated (and no longer documented) arguments --min-vsize --min-nsize --max-vsize --max-nsize --vsize --nsize of R CMD BATCH has been removed. o The 'debian' subdirectory has been removed from the sources. o The 'vfont' argument of axis() and mtext() has been removed: use par(family=) instead. o The unused graphical parameter "type" has been removed: it invited confusion with the 'type' argument to default methods of plot(), points() and lines(). o nlsMethod() and profiler() are no longer exported from the stats name space (and nlsMethod.plinear() is no longer registered as a method, as nlsMethod() was not generic). o The re-named tcltk functions tkcmd, tkfile.tail, tkfile.dir, tkopen, tkclose, tkputs, tkread are now formally deprecated. o Argument 'col' of bxp() is now formally deprecated. o Use of NULL as an environment is deprecated and gives a warning. o postscriptFont() is deprecated in favour of Type1Font() (which is just a change of name). o La.chol() and La.chol2inv() are deprecated (they have since R 1.7.0 been the same as the default options of chol() and chol2inv). o La.svd(method = "dgesvd") is deprecated. o The use of install.R and R_PROFILE.R files in packages is deprecated: use the DESCRIPTION file instead to arrange to save an image or to load dependent packages. The following command-line options to INSTALL are deprecated (use the fields in the DESCRIPTION file instead): -s --save --no-save --lazy --no-lazy --lazy-data --no-lazy-data o Graphical parameter 'tmag' (which is long unused) is deprecated. INTERNATIONALIZATION A set of patches supplied by Ei-ji Nakama has been incorporated. o New postscript encodings for CP1253, CP1257 and Greek (ISO 8859-7). o Support for East Asian CID-keyed fonts in pdf() and postscript(). Although these usually contain Latin characters no accurate AFMs are available and so CID-keyed fonts are intended only for use with CJK characters. 
o Wide-character width functions wc[s]width are provided that overcome problems found with OS-supplied ones (and those previously used by R on Windows). This means that double-width CJK characters are now supported on all platforms. It seems that the width of some characters (and not just CJK characters) depends on which CJK locale's fonts are in use and also on the OS. Revised wide-character classification functions are provided for use on Windows, AIX and MacOS X to replace deficient OS-supplied ones. o There is support for MBCS charsets in the pictex() graphics device, and rotated (by 90 degrees) text may work better. o The \u (and \U except on Windows) notation for characters which is supported by the parser in all MBCS charsets is now always interpreted as a Unicode point, even on platforms which do not encode wchar_t in Unicode. These are now a syntax error in single-byte locales. o The default encoding for postscript() and pdf() is chosen to be suitable for the current locale, if that is a single-byte locale which is supported. This covers European (including Greek) and Cyrillic languages. In UTF-8 locales, a suitable single-byte encoding is chosen for postscript() and pdf(), and text translated to it. o xfig() gains an 'encoding' argument. o There are some message translations into Spanish. INSTALLATION CHANGES o The encoding files for pdf()/postscript() have been moved to directory 'enc' in package 'grDevices'. o Support for MBCS is only enabled if iconv is found and it supports enough conversions. (libiconv does.) o In an MBCS locale, make check now translates the graphics examples from Latin-1. This ensures that they will work correctly in UTF-8: it is possible that in other MBCS locales they will now fail (rather than work completely incorrectly). o There is a new test, 'test-Docs', which as part of 'make check-devel' tests the code in the documentation. Currently it runs doc/manual/R-{exts,intro}.R and the compiled code in R-exts.c. 
o The workaround to allow an external LAPACK-containing BLAS such as libsunperf to be used with the internal LAPACK has been removed. If you have such a library you may now need to use --with-lapack. It is no longer possible to use some older versions of libsunperf, e.g. Forte 7 on 64-bit builds. o A substitute for mkdtemp is provided, so it is now always used for R_TempDir. o Most of the functions checked for by 'configure' also have declarations checked for in the appropriate header. o The top-level documentation files AUTHORS COPYING.LIB COPYRIGHTS FAQ RESOURCES THANKS have been moved to doc, and COPYING and NEWS are installed there. The file Y2K has been removed from the distribution. o The extension .lo is no longer used in building R (only in the optional build of libRmath.so): this allows a considerable simplification of the Makefiles. o Direct support for f2c has been removed: it can still be used via a script which makes it look like a Fortran compiler. (src/scripts/f77_f2c is an example of such a script.) o There is a new flag SAFE_FFLAGS which is used for the compilation of dlamc.f. It is set by configure for known problem cases (recent g77 and gfortran), but can be overridden by the user. o The standard autoconf macros for large-file support are now used, and these are enabled unless --disable-largefile is specified. This replaces --enable-linux-lfs (and is now selected by default). o Visibility attributes are used where supported (gcc4/gfortran on some platforms, also gcc3/g77 on FC3 and partially elsewhere). The main benefit should be faster loading (and perhaps better optimized code) in some of the dynamic shared objects (e.g. libR.so and stats.so). o The *PICFLAGS are taken to be -fpic rather than -fPIC where possible. This will make no difference on most platforms: -fPIC is needed on Sparc (and still used there), but -fpic should give slightly better performance on PowerPC (although -fPIC is used on PPC64 as it is needed to build libR.so there). 
o More use is made of inlining for small utility functions such as isReal. Because this can only be done portably with C99 constructs (and we know of no actual implementation), this is only done for the GNU C compiler. o There is an experimental feature to allow shared installations of sub-architectures. See the R-admin manual. o All platforms now use R's internal implementation of strptime, which allows fractional seconds. (The major platforms were already using it.) o The dlcompat work-around for old Mac OS X systems (<= 10.2) has been removed. External dlcompat must be installed if needed. UTILITIES o R CMD check now uses an install log by default. o R CMD check works for packages whose package name is different from the directory name in which it is located. o R CMD INSTALL now uses more randomness in the temporary directory name even on systems without mktemp -d. o R CMD f77 has been removed now f2c is no longer supported. o The version string shown in the startup message and by "R --version", and that stored in variable R.version.string are now in exactly the same format. o The base name of a help file needs to be valid as part of a file:// URL, so R CMD check now checks the names are ASCII and do not contain % . o R CMD check now warns about unknown sections in Rd files, and invalid names for help, demo and R files, as well as unlikely file names in the 'src' directory. The latter is controlled by option --check-subdirs and by default is done if checking a tarball without a configure script. R CMD build excludes invalid files in the 'man', 'R' and 'demo' subdirectories. o \usepackage[noae]{Sweave} in the header of an Sweave file suppresses auto-usage of the ae package ("almost European" fonts) and T1 input encoding. DOCUMENTATION o Rd format now allows \var{} markup inside \code{} and \examples{}. o Markup such as --, ---, < and > is handled better when converting .Rd files to [C]HTML.
o There is new markup \link[=dest]{name} to generate a link to topic 'dest' which is shown as 'name', and \linkS4class{abc} which expands to \link[=abc-class]{abc}, for cross-referencing the recommended form of documentation for S4 classes. PACKAGE INSTALLATION o There is now some support for Fortran 90/95 code in packages: see `Writing R Extensions'. o Installation of man sources and demos is now done by R code. The restrictions on the names of help files, R files and of demos are now enforced (see `Writing R Extensions'). o Packages which contain compiled code can now have more than one dot in their name even on Windows. o The Meta/hsearch.rds database saved now contains LibPath="". This information is now always recreated when help.search() is run, but the field is retained for back-compatibility. o update.packages() now has a '...' argument to be passed to install.packages(), including the formerly separate arguments 'destdir' and 'installWithVers'. o Make macros AR and RANLIB are now declared in etc/Makeconf for use by packages which wish to make static libraries. C-LEVEL FACILITIES o qgamma and rgamma in Rmath.h now check for non-positive arguments. o The BLAS which ships with R now contains the complete set of double-complex BLAS routines, rather than just those used in R. has been corrected to add the missing double-precision BLAS functions drotmg and drotm, and to exclude lsame (which is a Lapack auxiliary function and is now declared in ). It also includes the double complex routines added for this release of R provided Fortran doublecomplex is usable on the platform. o and now declare all the entry points as 'extern'. o The flag SAFE_FFLAGS is made available to packages via etc/Makeconf and R CMD config. It can be used where optimization needs to be defeated, e.g. in LAPACK setup. o getNativeSymbolInfo has a withRegistrationInfo argument which causes the address field to be a reference to the registration information if it is available for that symbol. 
If the registration information is not available, the address is a reference to the native symbol. The default is FALSE which is backward compatible, returning just the address of the symbol and ignoring registration information. o errorcall and warningcall are now declared in (they might be needed in front-ends). o R_FlushConsole and R_ProcessEvents are now declared in . o The R_Sock* functions supporting socket connections are no longer declared in R-ftp-http.h as they are not loaded into R itself, and are now hidden in the module's DLL on suitable systems. BUG FIXES o Quoted arguments to the R script after --args are now passed quoted to the R executable and so will be shown as expected by commandArgs(). (They were previously split at whitespace even inside quotes on Unix-alikes but not on Windows.) o axis() now supports pars 'xaxp'/'yaxp' as inline arguments. o sort() now does not return inappropriate attributes such as "dim" and "tsp": it only returns names. sort(x, partial=) no longer returns unsorted names, and drops names (since it is supplied for efficiency). o Use of non-central F in pf() gives accurate values for larger ncp. o R CMD build --binary does a better job of cleaning up after failure to re-make vignettes. o reg-test-1.R tested system(intern=TRUE) which depends on popen and so is not supported on all platforms. o Changed apparent mis-spelling of "Gibraltar" in dataset 'eurodist'. o sysconf() is now used to find the number of clock ticks/second: under some circumstances glibc reported CLK_TCK = 60 when the true value was 100. o identical() was not allowing for embedded nuls in character strings. (NB: the comparison operators including == do not, and never will.) o The profile() and profiler() methods for "nls" objects now support algorithm = "plinear" and algorithm = "port". o The signal handlers for signals USR1 and USR2 were not restored if the signal arrived when interrupts were suspended.
o Certain combinations of S4 inheritance could cause inherited methods to override some directly specified methods. o Some cases of named signatures in calls to setMethod() caused errors. o all.equal() is now more consistent and "picky" about mismatching attributes, in particular names(); this is a part of the propositions by Andy Piskorkski (PR#8191). o load() when applied to a connection leaves it open/not as it found it, and checks explicitly for having a binary readable connection. o The p-values given by stat.anova() (called from several anova() methods) are now NA (rather than spurious) if non-nested models give rise to changes in deviance with a different sign from changes in degrees of freedom. o Built-ins were reported as the relevant call in C-level error()s iff R profiling was in progress. Now they are never reported. o Too-long signatures (with no names) were not being caught in setMethod(). o Slot names in prototype() are being more thoroughly checked. o signif() is more likely to follow the 'round to even' rule for exactly representable numbers, e.g. signif(0.25, 1). (Related to PR#8452.) o nls() now works correctly with some low-dimensional fits, e.g. with one or zero non-linear parameters. o glm() could give an inappropriate error message if all possible coefficients were invalid (e.g. a log-linear binomial model with no intercept and a not all positive predictor). o solve() gives clearer error messages for some incorrect usages. (PR#8494 and similar) o The gaussian() family was missing the 'valideta' component (which could be needed for the "inverse" link function). The starting values supplied by the gaussian family could be invalid for the "log" and "inverse" link functions. This is now reported. o data.matrix() did not work correctly on zero-row data frames. (PR#8496 and other problems.) o The DSC comments in the files from postscript(onefile=FALSE) now label all files as having page 1 of 1, as some other software seems to expect that. 
o The axis labels chosen for logarithmic axis are now less likely to be linear and inappropriate (when the range is more than 10 and less than 100). (PR#1235) o Staircase lines (types "s" and "S") are now drawn continuously rather than a point at a time and so line types, mitring and so on work. (PR#2630) o Calling par(mfg) before doing any plotting resulted in NewPage never being called on the device, which in turn resulted in incorrect output for postscript() and pdf() devices. (Reported by Marc Schwartz in discussion of the non-bug PR#7820.) o terms.formula needed to add parentheses to formulae with terms containing '|'. (PR#8462) o pbirthday() and qbirthday() now also work for very improbable events {those you are typically *not* interested in}. o Only source help files starting with an upper- or lower-case letter or digit and extension .Rd or .rd are documented to be processed. This is more liberal in that starting with a digit is now also allowed, but the rule is now enforced. o nls(algorithm="port") was always taking positive numeric differences and so could exceed the upper bounds. o methods:::.asEnvironmentPackage() was not allowing for versioned installs. o .find.package() now reports which package(s) it cannot find in the case it stops with an error. o The standard Unix-alike version of file.show() gives an informative message if it cannot open a file rather than the (possibly incorrect) 'NO FILE'. o window() did not allow non-overlapping ranges with extend = TRUE. (PR#8545) o pbinom(size = 0) now returns correct values (not NaN). (PR#8560) o [dp]binom(x, *) for x < 0 now always returns 0. (PR#8700) analogous change in pgeom(), pnbinom() and ppois(). o [dqpr]geom and [dpqr]nbinom() now all consistently accept prob = 1 but not prob = 0. qgeom(prob=1) now gives the correct values (not -1). o INSTALL on Unix-alikes was not loading dependent packages when preparing for lazy-loading. o qcauchy(1) now gives +Inf instead of just a very large number.
o df(0, f1, *) now properly returns Inf, 1, or 0 for f1 < , = , or > 2. o qbinom(), qnbinom() and qpois() now use a better search and normally reach the answer very quickly when it is large (instead of being slow or infinite-looping). o pt(x, df) lost accuracy in the far tails (when |x| > 1e154) for small df (like df = 0.001 for which such extremes are not unlikely). o dbeta(x, a, b) underflowed internally and incorrectly gave 0 for very small x and a. o None of the warnings about convergence failures or loss of precision in nmath (distribution and special functions) were being reported to the R user. o dnt was missing from standalone nmath (under Unix-alikes). o split() now accepts factors with numeric (but not storage mode integer) codes. o The utilities such as 'check' now report active version numbers again, as SVN 'last changed revision' numbers. o addmargins() did not accept a name for 'FUN', only an expression. o '+' for POSIXt objects now takes the tzone from whichever object has it, so date+x is the same as x+date if x is numeric. o mean.default() and var() compute means with an additional pass and so are often more accurate, e.g. the variance of a constant vector is (almost) always zero and the mean of such a vector will be equal to the constant value to machine precision. (PR#1228) sum(), prod(), mean(), rowSums() and friends use a long double accumulator where available and so may be more accurate. (This is particularly helpful on systems such as Sparc and AMD64 where long double gives considerably greater exponent range and precision than double.) o read.dcf() now gives a warning on malformed lines. o add1.[g]lm now try harder to use the environment of the formula in the original fit to look for objects such as the 'data' and 'subset' arguments. o gaussian()$aic was inconsistent with e.g. the lm results from AIC() and extractAIC() for weighted fits: it treated the weights as case weights and not variance factors.
o system() on Unix-alikes ignored non-logical values of 'intern' and treated 'intern = NA' as true. o as.table() now produces non-NA rownames when converting a matrix of more than 26 rows. (PR#8652) o Partial sorting used an algorithm that was intended only for a few values of 'partial' and so could be far slower than a full sort. It now switches to a barebones full sort for more than 10 values of 'partial' and uses a more efficient recursive implementation for 2...10. o summary.glm() returned an estimate of dispersion of Inf for a gaussian glm with zero residual degrees of freedom and then treated that as a known value. It now uses the estimate NaN, which is consistent with summary.lm(). o Sys.sleep() on Unix-alikes was restricted to about 2147 seconds and otherwise might never have returned. (PR#8678) o is(obj, Cl) could wrongly report TRUE when Cl was a classUnion and multiple inheritance was involved. o confint[.lm / .default] used label "100 %" for level = 0.999 o Empty entries (i.e., extraneous ",") in NAMESPACE files now give a better error message early at parsing time instead of a less comprehensible one later at load time. o all.equal(n1, n2) could erroneously return NA when n1, n2 contained large integers. o anova.mlm() didn't handle multi-df effects properly in the single-model case (PR#8679) o anova.mlm() had its colnames mangled by data.frame() (needed check.names=FALSE). o summary.glm() gave an NA estimate of dispersion for fits with zero weights. (PR#8720) o qhyper() had too small a tolerance for right-continuity on some platforms so was not always an inverse to phyper(). o rownames<-.data.frame() and dimnames<-.data.frame() tested the length(s) of the replacement value(s) before coercion, which can change the length (e.g. for class "POSIXlt"). o max() and min() ignored the largest/smallest representable integer, as well as Inf/-Inf. 
(PR#8731) o write.table() assumed factors had integer codes: it now allows malformed factors with numeric codes (and otherwise throws an error). o Worked around a Solaris restriction which meant that Sys.sleep() was only effective for times of up to one second. o sink(, split=TRUE) now works correctly, but is allowed only on platforms that support va_copy or __va_copy. (PR#8716) o factanal(), prcomp() and princomp() now only check that columns in the model frame that will be used are numeric (they previously also checked columns which were part of negative terms in the formula). o Misuse of $ in apply could corrupt memory. (PR#8718) o apply() could fail if the function returned NULL (e.g. if there was a single row). o registerS3method() failed due to a typo. (It was almost never used.) o Registering an S3 method for an S3 generic in another package that was converted to an S4 generic in the same package as the S3 method, registered the method in the wrong place. o Recall() used lookup for the function in use and so could fail if that was an S3 method not on the search path. o Rdconv -t Ssgm failed if it encountered \link[opt]{arg}. o uniroot() did not give a warning (as documented) if it failed to converge in 'maxiter' steps. (PR#8751) o eapply() (and as.list.environment()) did not work for the base environment/name space. (PR#8761) o Added protection in configure against systems for which using xmkmf fails to report a C or C++ compiler. o expand.grid() was constructing a data frame 'by hand' and so setting integer row.names (which are documented to be character). It now sets character row names, and row.names.data.frame() coerces to character. o qbeta() used == on volatile doubles for its convergence test, which failed with gcc 3.3.x on ix86 Linux. We now use a less fragile test (and lose a negligible amount of accuracy). 
o ls.str() was missing inherits=FALSE, and so could have reported on an object of the same name but a different mode in the enclosure of the given environment. o logLik.nls assumed that sigma^2 had been estimated, but did not count this in the 'df' attribute. ************************************************** * * * 2.2 SERIES NEWS * * * ************************************************** CHANGES IN R VERSION 2.2.1 patched INSTALLATION CHANGES o The macro SOCKLEN_T has been replaced by R_SOCKLEN_T to work around a problem with the headers of AIX 5.3. BUG FIXES o sub(fixed = TRUE) could get wrong the length of the character string for elements of the result after the first. o legend() worked out which elements of 'lty' were valid before resizing 'lty', and so could fail if 'lty' was a different length from 'legend'. o str() sometimes used far too many spaces (in 2.2.x). o eigen(eispack=TRUE) accessed areas off the matrix in some circumstances (some asymmetric matrices with both complex conjugate pair and real eigenvalues). o strptime() in 2.2.1 sometimes did not set $isdst when it was previously set. o Another case of infinite influence has been worked around. (An addendum to PR#8367.) o qr.coef() worked incorrectly with multiple rhs in the LAPACK-using cases. (PR#8476/8) o rbind.data.frame() gave a corrupt data frame if one of the named arguments was a zero-row data frame. (PR#8506) o Checks for NULL in the rho argument of the C-level findVar function have been added. o The C-level substitute function was handling NULL in its 'rho' argument incorrectly. o The code for pgamma() introduced in 2.1.0 failed for large values of 'shape' where the previous code was perfectly acceptable, despite the claim to be uniformly better. For example, pgamma(0.9e100, 1e100) was NaN. (PR#8528) o There was no command 'ls' in browser() nor 'next' in debug(), despite the documentation (which has been corrected). Command 'where' in the browser() no longer changes to step-through mode.
o factor.scope() could report incorrectly that interaction terms were not in the upper scope when such terms in the model and the upper scope had different orders for the main effects. (Another manifestation of PR#7842.) o The "lm" method of drop1() was giving incorrect results for weighted fits (since deviance.lm() was called on a non-"lm" object). o dotchart() was miscalculating the space for the labels in the left margin. (PR#8681) o r <- glm(.....); all.equal(r,r) # now gives TRUE instead of an error o plot.acf() with a multiple time series was sometimes miscalculating the 'ylim' value for the plot after the first. (PR#8705) CHANGES IN R VERSION 2.2.1 USER-VISIBLE CHANGES o options("expressions") has been reduced to 1000: the limit of 5000 introduced in 2.1.0 was liable to give crashes from C stack overflow. NEW FEATURES o Use of 'pch' (e.g. in points) in the symbol font 5 is now interpreted in the single-byte encoding used by that font. Similarly, strwidth now recognizes that font 5 has a different encoding from that of the locale. (These are likely to affect the answer only in MBCS locales such as UTF-8.) o The URW font metrics have been updated to versions from late 2002 which cover more glyphs, including Cyrillic. o New postscript encodings for CP1250 (Windows East European), ISO Latin-7 (8859-13, Latvian, Lithuanian and Maori), Cyrillic (8859-5), KOI8-R, KOI8-U and CP1251. o configure has more support for the Intel and Portland Group compilers on ix86 and x86_64 Linux. o R CMD INSTALL will clean up if interrupted (e.g. by ctrl-C from the keyboard). o There is now a comprehensive French translation of the messages, thanks to Philippe Grosjean, Frederic Lehobey, Jean Thioulouse and Emmanuel Paradis. DEPRECATED & DEFUNCT o The undocumented use of atan() with two arguments is deprecated: instead use atan2() (as documented). o The 'vfont' argument of axis() and mtext() is deprecated (it currently warns and does nothing). 
o The function mauchley.test() is deprecated (was a misspelling) and replaced by mauchly.test() BUG FIXES o The malloc's of AIX and OSF/1 which return NULL for size 0 are now catered for in src/main/regex.c. o Names of list elements which are missing are now printed as $ and not $"NA" (which is how the non-missing name "NA" is printed). (Brought up in discussion of PR#8161.) o help.start() was not linking R.css for use by its front page and immediate links (2.2.0 only). o Indexing by character NA matched the name "NA". o The arith-true test used random inputs and did not set the seed, so it could fail very occasionally. o arima() with 'fixed' supplied and p=0 for the non-seasonal part could give spurious warnings about 'some AR parameters were fixed'. o summary.matrix() could give an infinite recursion on some classed objects (e.g. those of class "Surv"). o The 255th character in an 8-bit character set was not handled correctly as a letter on some platforms where C char is signed: for example it was printed as \377 and not allowed in variable names. (Spotted by Alexey Shipunov in Russian encodings.) o Conversion from POSIXct to POSIXlt is done more accurately around the change of DST in years not supported by the OS (pre-1970 on Windows and some others, and in the far past or future). o chisq.test(cbind(1:0, c(7,16)), simulate.p = TRUE) gave wrong P-values on some platforms. (PR#8224) o pdf() was not writing details of the encoding to the file correctly. (Spotted by Alexey Shipunov in Russian encodings.) o image() was failing with an error when plotting a matrix of all NA values. (PR#8228) image() could fail if called with add=FALSE (the default) and length(x)=1 for either x or y, as it uses the plot coordinates of the previous plot (if any). o tools::checkMD5sums was not accepting file names with spaces in. o The plot() method for TukeyHSD() needed updating after adding adjusted p-values. (PR#8229) o read.fwf() did not work for header = TRUE. 
(PR#8226) o diag() failed when its argument had NA values in its dimnames. o [g]sub(pcre=TRUE) did not work correctly with \U and \L in a UTF-8 locale, even on the example on the help page. o promptMethods() was failing if the "methods" argument was supplied. o is.loaded() now finds Fortran symbols whether or not the registration mechanism has been used. o ISODateTime() mistakenly corrected non-existent times (when DST was being started) in the current time zone. o Some replacement operations on data frames gave incorrect answers, e.g. DF[3:4, "y"] if column "y" did not exist or was a matrix. o getGraphicsEvent() would cause memory corruption if passed an empty prompt. o qr() and chol() now pivot the colnames of the result when pivoting is used. (PR#8258) o example(points) omitted pch=0, although it was valid and said in the text to be illustrated. o plot.default() had an unused 'lab' argument, thereby preventing the 'lab' graphics parameter being passed through '...'. o Although polygon(col = NA) was the stated default, specifying NA was not equivalent to omitting the argument (but col=NULL was equivalent). o Im(-1) was pi. (PR#8272, a side effect from all previous versions of R returning the same value for Im and Arg of non-complex numbers.) o symbols(fg) defaulted to colour 1, not par("col") as documented. It now defaults to par("col"). o par("family") did not check the length of the value (up to 49 bytes) and so could segfault. o aggregate.ts() did not allow for rounding in frequencies such as 1/5. o prcomp(tol=) was not dropping the sdev's corresponding to dropped columns. o Subassignment of a vector which increased the length of the vector _and_ had the wrong length of replacement could occasionally segfault. (This has been there since at least mid 1997.) o The registration of .Fortran symbols was broken: these could only be looked up if there were also .Call symbols registered!
o R CMD build was incorrectly rejecting the recommended form of name for a translation package, 'Translation-ll'. (PR#8314) o numericDeriv() gave nonsense results unless the variables were real, which was not checked. o predict.prcomp() would sometimes give an error when predicting a single observation. (PR#8324) o mapply() could segfault if MoreArgs was not a list. (PR#8332) o The arith-true test used identical() on floating-point results, and this allowed a failure when the relative difference was less than .Machine$double.eps but non-zero. o qbinom() was not accepting p = -Inf when log.p = TRUE, although it is a legitimate value. o write.csv[2] only accepted logical constants for 'row.names', and now accepts variables. o Conversion of .Rd files did not correctly match braces enclosing a whole argument, e.g. \eqn{{\bf a}}{a}. o The C function pythag (used if hypot was not available) would infinite-loop on systems with effective optimizing compilers. o Writing long formats (more than 1000 bytes) with connections that use dummy_vfprintf could fail on some systems. The limit has been changed to 100000 bytes pending a more complete fix in R 2.3.0. o Making in src/nmath/standalone without making R was not making Rmath.h. o Both the R front-end and INSTALL could find the attempted temporary directory name already in use on platforms without mktemp (and a genuine Bourne shell /bin/sh, not bash). Now both the process ID and a timestamp are used to create the directory name. o [dpqr]gamma now return NaN for an invalid 'shape' parameter (rather than throw an error), for consistency with other distribution functions. o t() no longer drops dimnames 'list(NULL,NULL)' or 'list(NULL)'. o Influence measures such as rstandard() and cooks.distance() could return infinite values rather than NaN for a case which was fitted exactly. Similarly, plot.lm() could fail on such examples. plot.lm(which = 5) had to be modified to only plot cases with hat < 1.
(PR#8367) lm.influence() was incorrectly reporting 'coefficients' and 'sigma' as NaN for cases with hat = 1, and on some platforms not detecting hat = 1 correctly. o Rmath.h for standalone Rmath was not recording HAVE_WORKING_LOG, so R_log was not available on platforms defining it. o HoltWinters() was using a slightly incorrect formula in the C code. o dir.create() could be confused by a trailing slash on the path, and by paths containing drives on Windows. o The search for tcl/tkConfig.sh looked in 'lib' before 'lib64' directories (and not at all in /usr/local/lib64) and so might prefer 32- to 64-bit versions if both are available. o nlminb() used an uninitialized variable unless bounds were supplied, and so failed on 64-bit Solaris. CHANGES IN R VERSION 2.2.0 USER-VISIBLE CHANGES o plot() uses a new default 'which = 5' for the fourth panel when 'which' is not specified. o The SVN revision number will appear after the date in the welcome message. The date shown is now the date of the last change to the sources rather than the date the sources were prepared. o is.null(expression()) now returns FALSE. Only NULL gives TRUE in is.null(). o graphics::xy.coords, xyz.coords and n2mfrow have been moved to the grDevices name space (to be available for grid as well). graphics::boxplot.stats, contourLines, nclass.*, and chull have been moved to the grDevices name space. The C code underlying chull() has been moved to package grDevices. o split(x, f), split<-() and unsplit() now by default split by all levels of a factor f, even when some are empty. Use split(x, f, drop = TRUE) if you want the old behavior of dropping empty levels. split() and split<-() are S3 generic functions with new arguments 'drop' and '...' and all methods now should have 'drop' and '...' arguments as well. o The default for 'allowEscapes' in both read.table() and scan() has been changed to FALSE. o The default for 'gcFirst' in system.time() is now TRUE. 
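The new split() default described above can be illustrated with a minimal sketch. The objects 'x' and 'f' and the result names are made up for this example; the behaviour shown is the documented one (all levels kept by default, empty levels dropped with drop = TRUE).

```r
# Illustrative data: level "b" of the factor has no observations.
x <- c(10, 20, 30)
f <- factor(c("a", "a", "c"), levels = c("a", "b", "c"))

# Default: one component per level, including the empty level "b".
all_levels <- split(x, f)

# drop = TRUE gives the earlier behaviour of omitting empty levels.
used_levels <- split(x, f, drop = TRUE)

print(names(all_levels))
print(names(used_levels))
```

Code that relied on empty levels being dropped silently may need an explicit drop = TRUE.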
NEW FEATURES o .Platform has a new component 'path.sep', the separator used between paths in environment variables such as PATH and TEXINPUTS. o anova.mlm() now handles the single-model case. o Hexadecimal values are now allowed for as.numeric() and as.integer() on all platforms, and as integer constants in R code. o attach() now prints an information message when objects are masked on the search path by or from a newly attached database. o axis() now returns 'at' positions. o axis() has a new argument 'hadj' to control horizontal adjustment of labels. o axis.Date() and axis.POSIXct() now accept a 'labels' argument (contributed by Gavin Simpson). o barplot() now has arguments 'log = ""' and 'add = FALSE' (as in barplot2() from package 'gplots'). o baseenv() has been added, to return the base environment. This is currently still NULL, but will change in a future release. o boxplot() now responds to supplying 'yaxs' (via bxp()). (Wish of PR#8072.) o capabilities() has a new component 'NLS'. o cbind() and rbind() now react to 'deparse.level' = {0,1,2} (as in another system not unlike R). o Experimental versions of cbind() and rbind() in methods package, based on new generic function cbind2(x,y) and rbind2(). This will allow the equivalent of S4 methods for cbind() and rbind() --- currently only after an explicit activation call, see ?cbind2. o New functions cdplot() and spineplot() for conditional density plots and spine plots or spinograms. Spine plots are now used instead of bar plots for x-y scatterplots where y is a factor. o checkDocFiles() in package 'tools' now checks for bad \usage lines (syntactically invalid R code). o The nonparametric variants of cor.test() now behave better in the presence of ties. The "spearman" method uses the asymptotic approximation in that case, and the "kendall" method likewise, but adds a correction for ties (this is not necessary in the Spearman case). 
o The X11 dataentry() now has support for X Input Methods (contributed by Ei-ji Nakama). o density() is now an S3 generic where density.default() {former density()} has new argument 'weights' for specifying observation masses different than the default 1/N -- based on a suggestion and code from Adrian Baddeley. o download.packages() now carries on if it encounters a download error (e.g. a repository with a corrupt index). o dump() now skips missing objects with a warning rather than throw an error. o Added "POSIXlt" methods for duplicated() and unique(). o Function encoded_text_to_latex() in package tools translates Latin 1,2,9 and UTF-8 encoded character vectors to LaTeX escape sequences where appropriate. o encodeString() allows justify = "none" for consistency with format.default(). Some argument names have been lengthened for clarity. o file(), fifo() and pipe() now (if possible) report a reason if they fail to open a connection. o format.default() now has a 'width' argument, and 'justify' can now centre character strings. format.default() has new arguments 'na.encode' to control whether NA character strings are encoded (true by default), and 'scientific' to control the use of fixed/scientific notation for real/complex numbers. How format() works on a list is now documented, and uses arguments consistently with their usage on an atomic vector. o format.info() now has a 'digits' argument, and is documented to work for all atomic vectors (it used to work for all but raw vectors.). o New function glob2rx() for translating `wildcard' aka `globbing' to regular expressions. o There is a new function gregexpr() which generalizes regexpr() to search for all matches in each of the input strings (not just the first match). o [g]sub() now have a 'useBytes' argument like grep() and regexpr(). o [g]sub(perl = TRUE) support \L and \U in the replacement. o iconv() has been moved from 'utils' to 'base'. 
o identify()'s default method has additional arguments 'atpen' and 'tolerance' (following S). o KalmanForecast() and KalmanLike() now have an optional argument fast=FALSE to prevent their arguments being modified. o Exact p-values are available in ks.test() for the one-sided and two-sided one-sample Kolmogorov-Smirnov tests. o labels() now has a method for "dist" objects (replacing that for names() which was withdrawn in 2.1.0). o library() now explicitly checks for the existence of directories in 'lib.loc': this avoids some warning messages. o loadNamespace(keep.source=) now applies only to that name space and not others it might load to satisfy imports: this is now consistent with library(). o match.arg() has a new argument 'several.ok = FALSE'. o max.col() has a new argument for non-random behavior in the case of ties. o memory.profile() now uses the type names returned by typeof() and no longer has two unlabelled entries. o methods() now warns if it appears to have been called on a non-generic function. o The default mosaicplot() method by default draws grey boxes. o nlminb(), similar to that in S-PLUS, added to package 'stats'. o New algorithm "port" (the nl2sol algorithm available in the Port library on netlib) added to the nls() function in the 'stats' package. o object.size() now supports more types, including external pointers and weak references. o options() now returns its result in alphabetical order, and is documented more comprehensively and accurately. (Now all options used in base R are documented, including platform-specific ones.) Some options are now set in the package which makes use of them (grDevices, stats or utils) if not already set when the package is loaded. o New option("OutDec") to set the decimal point for output conversions. o New option("add.smooth") to add smoothers to a plot, currently only used by plot.lm(). o pie() has new optional arguments 'clockwise' and 'init.angle'. 
o plot.lm() has two new plots (for 'which' = 5 or 6), plotting residuals or Cook's distances versus (transformed) leverages - unless these are constant. Further, the new argument 'add.smooth' adds a loess smoother to the point plots by default, and 'qqline = TRUE' adds a qqline() to the normal plot. The default for 'sub.caption' has been improved for long calls. o R.home() has been expanded to return the paths to components (which can as from this version be installed elsewhere). o readBin() and writeBin() now support raw vectors as well as filenames and connections. o read.dcf() can now read gzipped files. o read.table() now passes 'allowEscapes' to scan(). o sample(x, size, prob, replace = TRUE) now uses a faster algorithm if there are many reasonably probable values. (This does mean the results will be different from earlier versions of R.) The speedup is modest unless 'x' is very large _and_ 'prob' is very diffuse so that thousands of distinct values will be generated with an appreciable frequency. o scatter.smooth() now works a bit more like other plotting functions (e.g., accepts a data frame for argument 'x'). Improvements suggested by Kevin Wright. o signif() on complex numbers now rounds jointly to give the requested number of digits in the larger component, not independently for each component. o New generic function simulate() in the 'stats' package with methods for some classes of fitted models. o smooth.spline() has a new argument 'keep.data' which makes it possible to provide residuals() and fitted() methods for smoothing splines. o Attempting source(file, chdir=TRUE) with a URL or connection for 'file' now gives a warning and ignores 'chdir'. o source() closes its input file after parsing it rather than after executing the commands, as used to happen prior to 2.1.0. (This is probably only significant on Windows where the file is locked for a much shorter time.)
o split(), split<-(), unsplit() now have a new argument 'drop = FALSE', by default not dropping empty levels; this is *not* back compatible. o sprintf() now supports asterisk `*' width or precision specification (but not both) as well as `*1$' to `*99$'. Also the handling of `%' as conversion specification terminator is now left to the system and doesn't affect following specifications. o The plot method for stl() now allows the colour of the range bars to be set (default unchanged at "light gray"). o Added tclServiceMode() function to the tcltk package to allow updating to be suspended. o terms.formula() no longer allows '.' in a formula unless there is a (non-empty) 'data' argument or 'allowDotAsName = TRUE' is supplied. We have found several cases where 'data' had not been passed down to terms() and so '.' was interpreted as a single variable leading to incorrect results. o New functions trans3d(), the 3D -> 2D utility from persp()'s example, and extendrange(), both in package 'grDevices'. o TukeyHSD() now returns p-values adjusted for multiple comparisons (based on a patch contributed by Fernando Henrique Ferraz P. da Rosa). o New functions URLencode() and URLdecode(), particularly for use with file:// URLs. These are used by e.g. browse.env(), download.file(), download.packages() and various help() print methods. o Functions utf8ToInt() and intToUtf8() to work with UTF-8 encoded character strings (irrespective of locale or OS-level UTF-8 support). o [dqp]wilcox and wilcox.test work better with one very large sample size and an extreme first argument. o write() has a new argument 'sep'. o write.csv[2] now also support row.names = FALSE. o The specification of the substitutions done when processing Renviron files is more liberal: see ?Startup. It now accepts forms like R_LIBS=${HOME}/Rlibrary:${WORKGRP}/R/lib . o Added recommendation that packages have an overview man page -package.Rd, and the promptPackage() function to create a skeleton version. 
o Replacement indexing of a data frame by a logical matrix index containing NAs is allowed in a few more cases, in particular always when the replacement value has length one. o Conversion of .Rd files to latex now handles encoding more comprehensively, including some support for UTF-8. o The internal regex code has been upgraded to glibc-2.3.5. Apart from a number of bug fixes, this should be somewhat faster, especially in UTF-8 locales. o PCRE has been updated to version 6.2. o zlib has been updated to version 1.2.3. o bzip2 has been updated to version 1.0.3. o Complex arithmetic is now done by C99 complex types where supported. This is likely to boost performance, but is subject to the accuracy with which it has been implemented. o The printing of complex numbers has changed, handling numbers as a whole rather than in two parts. So both real and imaginary parts are shown to the same accuracy, with the 'digits' parameter referring to the accuracy of the larger component, and both components are shown in fixed or scientific notation (unless one is entirely zero when it is always shown in fixed notation). o Error messages from .C() and .Fortran(), and from parsing errors, are now more informative. o The date and date-time functions work better with dates more than 5000 years away from 1970-01-01 (by making dubious assumptions about the calendar in use). o There is now a traditional Chinese translation, and a much more extensive Russian translation. DEPRECATED & DEFUNCT o Capability "IEEE754" is defunct. o loadURL() is defunct: use load(url()). o delay() is defunct: use delayedAssign() instead. o The 'CRAN' argument to update.packages(), old.packages(), new.packages(), download.packages() and install.packages() is defunct in favour of 'repos'. o write.table0() is deprecated in favour of the much faster write.table(). o format.char() is deprecated in favour of format.default(). o R_HOME/etc/Rprofile is no longer looked for if R_HOME/etc/Rprofile.site does not exist. 
(This has been undocumented since R 1.4.0.) o CRAN.packages() is deprecated in favour of available.packages(). o Rd.sty no longer processes pre-2.0.0 conversions containing \Link. o The stubs for the defunct device GNOME/gnome have been removed. o print.matrix() (which has been identical to print.default since R 1.7.0) has been removed. INSTALLATION o LDFLAGS now defaults to -L/usr/local/lib64 on most Linux 64-bit OSes (but not ia64). The use of lib/lib64 can be overridden by the new variable LIBnn. o The default installation directory is now ${prefix}/${LIBnn}/R, /usr/local/lib64/R on most 64-bit Linux OSes and /usr/local/lib/R elsewhere. o The places where the doc, include and share directory trees are installed can be specified independently: see the R-admin manual. o We now test for wctrans_t, as apparently some broken OSes have wctrans but not wctrans_t (which is required by the relevant standards). o Any external BLAS found is now tested to see if the complex routine zdotu works correctly: this provides a compatibility test of compiler return conventions. o Installation without NLS is now cleaner, and does not install any message catalogues. o src/modules/lapack/dlamc.f is now compiled with -ffloat-store if f2c/gcc are used, as well as if g77 is used. o All the Fortran code has been checked to be fully F77 compliant so there are no longer any warnings from F95 compilers such as gfortran. o The (not-recommended) options --with-system-zlib, --with-system-bzlib and --with-system-pcre now have 'system' in the name. o If a Java runtime environment is detected at configure time its library path is appended to LD_LIBRARY_PATH or equivalent. New Java-related variables JAVA_HOME (path to JRE/JDK), JAVA_PROG (path to Java interpreter), JAVA_LD_PATH (Java library path) and JAVA_LIBS (flags to link against JNI) are made available in Makeconf. o Ei-ji Nakama has contributed a patch for FPU control with the Intel compilers on ix86 Linux.
MAC OS X INSTALLATION o --with-blas="-framework vecLib" --with-lapack and --with-aqua are now the default configure options. o The default framework version name was changed to not contain the patch level (i.e. it is now 2.2 instead of 2.2.0). Also it can be overridden at configure time by setting FW_VERSION to the desired name. o The Rmath stand-alone library is now correctly installed inside the R.framework if R was configured as a framework. In addition, make install-Rmath-framework will install a stand-alone Rmath framework in /Library/Frameworks (unless overridden by RMATH_FRAMEWORK_DIR specifying full framework path and name including the .framework extension). PACKAGE INSTALLATION o The encoding for a package's 00Index.html is chosen from the Encoding: field (if any) of the DESCRIPTION file and from the \encoding{} fields of any Rd files with non-ASCII titles. If there are conflicts, first-found wins with a warning. o R_HOME/doc/html/packages.html is now remade by R not Perl code. This may result in small changes in layout and a change in encoding (to UTF-8 where supported). o The return value of new.packages() is now updated for any packages which may be installed. o available.packages() will read a compressed PACKAGES.gz file in preference to PACKAGES if available on the repository: this will reduce considerably the download time on a dialup connection. The downloaded information about a repository is cached for the current R session. o The information about library trees found by installed.packages() is cached for the current session, and updated only if the modification date of the top-level directory has been changed. o A data index is now installed for a package with a 'data' dir but no 'man' dir (even though it will have undocumented data objects).
o contrib.url path for type="mac.binary" has changed from bin/macosx/ to bin/macosx//contrib/ where corresponds to R.version$arch UTILITIES o checkFF() used by R CMD check has since R 2.0.0 not reported missing PACKAGE arguments when testing installed packages with name spaces. It now - treats installed and source packages in the same way. - reports missing arguments unless they are in a function in the name space with a useDynLib declaration (as the appropriate DLL for such calls can be searched for). o Rd2dvi sets the encoding(s) used appropriately. If UTF-8 encoding is used, latex >= 2003/12/01 is required. o codoc() allows help files named pkg_name-defunct.Rd to have undocumented arguments (and not just base-defunct.Rd). C-LEVEL FACILITIES o C function massdist() {called from density()} has new argument 'xmass' (= weights). o Raw vectors passed to .C() are now passed as unsigned char * rather than as SEXPs. (Wish of Keith Frost, PR#7853) o The search for symbols in a .C/.Call/... call without a package argument now searches for an enclosing name space and so finds functions defined within functions in a name space. o R_max_col() has new (5th) argument '*ties_meth' allowing non-random behavior in the case of ties. o The header files have been rationalized: the BLAS routine LSAME is now declared in BLAS.h not Linpack.h, Applic.h no longer duplicates routines from Linpack.h, and Applic.h is divided into API and non-API sections. o memory.c has been instrumented so that Valgrind can track R's internal memory management. To use this, configure using --with-valgrind-instrumentation=level where level is 1 or 2. Both levels will find more bugs with gctorture(TRUE). Level 2 makes Valgrind run extremely slowly. o Some support for raw vectors has been added to Rdefines.h. o R_BaseEnv has been added, to refer to the base environment. This is currently equal to R_NilValue, but it will change in a future release. 
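At the R level, the tie handling that R_max_col() now supports is reached through max.col(). A minimal sketch follows, assuming the R-level argument is named 'ties.method' with values "first" and "last" (as in current versions of R); the matrix 'm' and the result names are made up for this example.

```r
# Each row has a tied maximum: columns 2 and 3 in row 1, columns 1 and 2 in row 2.
m <- rbind(c(1, 3, 3),
           c(2, 2, 1))

# Deterministic tie-breaking instead of the default random choice.
first_max <- max.col(m, ties.method = "first")
last_max  <- max.col(m, ties.method = "last")

print(first_max)
print(last_max)
```

With the default random tie-breaking, repeated calls on 'm' may return different column indices unless the random seed is fixed.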
BUG FIXES o %/% has been adjusted to make x == (x %% y) + y * ( x %/% y ) more likely in cases when extended-precision registers were interfering. o Operations on POSIXct objects (such as seq(), max() and subsetting) try harder to preserve time zones and warn if inconsistent time zones are used. o as.function.default() no longer asks for a bug report when given an invalid body. (PR#1880, PR#7535, PR#7702) o Hershey fonts and grid output (and therefore lattice output) now rescale correctly in fit-to-window resizing on a Windows graphics device. Line widths also scale now. o Plotmath has more support for multibyte characters (contributed by Ei-ji Nakama). o The X11() device now hints the window manager so that decorations appear reliably under e.g. the GNOME WM (contributed by Ei-ji Nakama). o Subsetting a matrix or an array as a vector used to attempt to use the row names to name the result, even though the array might be longer than the row names. Now this is only done for 1D arrays when it is done in all cases, even matrix indexing. (Tidies up after the fix to PR#937.) o Constants in mathlib are declared 'const static double' to avoid performance issues with the Intel Itanium compiler. o The parser checks the format of numeric constants more thoroughly so for example '123E-' is no longer valid. o contourLines() no longer requires an open device (used to start a device unnecessarily). Fix suggested by Barry Rowlingson. o capabilities() used partial matching but was not documented to: it no longer does so. o kernel(1,0) printed wrongly; kernel(, *) now returns a named kernel in all cases; plot(kernel(.),..) is more flexible. o qgamma(1,s) didn't give +Inf for some s. o installed.packages() and download.packages() now always return a matrix as documented, possibly with 0 rows (rather than a 0-length character vector or NULL). o Arithmetic operations on data frames no longer coerce the names to syntactically valid names.
o Units are now properly recycled in grid layouts when 'widths' or 'heights' are shorter than the number of columns or rows (PR#8014). o DF <- data.frame(A=1:2, B=3:4); DF[1, 1:3] <- NULL gave a wrong error message. o spline()/splinefun()'s C code had a memory access buglet which never led to incorrect results. (PR#8030) o sum() was promoting logical arguments to double not integer (as min() and other members of its group do). o loess() had a bug causing it to occasionally miscalculate standard errors (PR#7956). Reported by Benjamin Tyner, fixed by Berwin Turlach. o library(keep.source=) was ignored if the package had a name space (the setting of options("keep.source.pkgs") was always used). o hist.POSIXct() and hist.Date() now respect par("xaxt"). o The 'vfont' argument was not supported correctly in title(), mtext(), and axis(). The 'vfont' argument is superseded by the par(family=) approach introduced in 2.0.0. This bug-fix just updates the warning messages and documentation to properly reflect the new order of things. o The C-level function PrintGenericVector could overflow if asked to print a length-1 character vector of several thousand characters. This could happen when printing a list matrix, and was fatal up to 2.1.1 and silently truncated in 2.1.1 patched. o What happened for proc.time() and system.time() on (Unix-alike) systems which do not support timing was incorrectly documented. (They both exist but throw an error.) Further, system.time() would give an error in its on.exit expression. o weighted.residuals() now does sensible things for glm() fits: in particular it now agrees with an lm() fit for a Gaussian glm() fit. (PR#7961). o The 'lm' and 'glm' methods for add1() took the weights and offset from the original fit, and so gave errors in the (dubious) usage where the upper scope resulted in a smaller number of cases to fit (e.g. by omitting missing values in new variables).
(PR#8049) o demo() had a 'device' argument that did nothing (although it was documented to): it has been removed. o Setting new levels on a factor dropped all existing attributes, including class "ordered". o format.default(justify="none") now by default converts NA character strings, as the other values always did. o format.info() often gave a different field width from format() for character vectors (e.g. including missing values or non-printable characters). o axis() now ensures that if 'labels' are supplied as character strings or expressions then 'at' is also supplied (since the calculated value for 'at' can change under resizing). o Defining S4 methods for "[" had resulted in changed behavior of S3 dispatch in a very rare case which no longer happens. o Fixed segfault when PostScript font loading fails, e.g., when R is unable to find afm files (reported by Ivo Welch). o R CMD BATCH now also works when does not end in a newline on Unix-alike platforms. o terms.formula() got confused if the 'data' argument was a list with non-syntactic names. o prompt() and hence package.skeleton() now produce *.Rd files that give no errors (but warnings) when not edited, much more often. o promptClass() and promptMethods() now also escape "%" e.g. in '%*%' and the latter gives a message about the file written. o wilcox.test() now warns when conf.level is set higher than achievable, preventing errors (PR#3666) and incorrect answers with extremely small sample sizes. o The default (protection pointer) stack size (the default for '--max-ppsize') has been increased from 10000 to 50000 in order to match the increased default options("expressions") (in R 2.1.0). o The R front-end was expecting --gui=tk not Tk as documented, and rejecting --gui=X11. o Rdconv -t latex protected only the first << and >> in a chunk against conversion to guillemets. o callNextMethod() and callGeneric() have fixes related to handling arguments. o ls.diag() now works for fits with missing data. 
      (PR#8139)

    o window.default() had an incorrect tolerance and so sometimes created too short a series if 'start' or 'end' were zero.

    o Some (fairly pointless) cases of reshape left a temporary id variable in the result (PR#8152)

    o R CMD build used 'tar xhf' which is invalid on FreeBSD systems (and followed tar chf, so there could be no symbolic links in the tarball).

    o Subassignment of length zero vectors to NULL gave garbage answers. (PR#8157)

    o Automatic coercion of raw vectors to lists was missing, so for a list (or data frame) z, z[["a"]] <- raw_vector did not work and now does. This also affected DF$a <- raw_vector for a data frame DF.

    o The internal code for commandArgs() was missing PROTECTs.

    o The width for strwrap() was used as one less than specified.

    o R CMD INSTALL was not cleaning up after an unsuccessful install of a non-bundle which was not already installed.


                **************************************************
                *                                                *
                *               2.1 SERIES NEWS                  *
                *                                                *
                **************************************************


                CHANGES IN R VERSION 2.1.1 patched

BUG FIXES

    o runmed(x, k = -1) now gives an error instead of a seg.fault.

    o File creation errors in pdf(), postscript(), xfig() resulted in a pointer being freed twice. (Reported by Matt McCall)

    o model.matrix(~ .^2, data=foo) now works as most people would expect (it used to expand '.' after using the a^2 = a rule).

    o ftable() and xtabs() had a check for interactions that would not work correctly with '.' in the formula.

    o The formula method for pairs() was ignoring the na.action argument.

    o scan() with the default separator (only) was stripping backslashes inside quoted string inputs even if allowEscapes = FALSE.

    o The "col" parameter to pairs() is now treated consistently with plot(): it affects the data, not the axes. (Patch submitted by Olaf Mersmann)

    o Sweave failed for \Sexpr{character(0)}. Sweave-test-1.Rnw contained a print() statement that is no longer needed.
    o Typo meant that R_alloc was limited to 2^31 not 2^34 on 64-bit builds of R.

    o pgamma(Inf, shape) did not terminate for shape = 1.1 and some other values (but not all) (PR#8001). This affected pchisq(Inf, df1) and pf(Inf, df1, Inf) for some values of df1.

    o regexpr("[a-z]", ...) could cause a buffer overrun with multi-byte character sets, leading to random errors later.

    o Installing a package with a 'data' directory which contains files but those files generate no objects (e.g. the BioC package makecdfenv) created an incorrectly formatted data index that data(package="pkg") could not read.

    o density(1/(0:2)) now works again (PR#8033).

    o make.names() was not respecting the allow_=FALSE argument.

    o Arg(-1) now gives pi, not 0; Arg(0i + -1) always worked.

    o --enable-linux-lfs had been broken at 2.1.0.

    o Printing was not allowing for double-width characters in its layout.

    o axTicks() is now also correct for reverse axis. (PR#7973)

    o signif() rounded some numbers near 1e-308 to the wrong number of places (this showed up in the print-tests.R), and made unnecessary rounding errors on some platforms on e.g. signif(18000, 3).

    o window() was sometimes failing incorrectly due to representation errors when the new and old deltat were not both integers (e.g. multiples of 0.1).

    o The 'lm' methods for add1() and drop1() ignored offsets (which were added for lm() after they were written). (PR#8049)

    o atan2(0+1i, 0+0i) was incorrectly NA (from a typo in complex.c).

    o The POSIXct method for as.Date() was rounding part days before 1970-01-01 upwards rather than discarding them.

    o qpois() was using an incorrect starting point and so could be unnecessarily slow for large lambda. (PR#8058)

    o Formatting of complex numbers with nsmall > 0 could be incorrect (and was in print-tests.Rout) because of a typo.

    o The test for new levels when predicting using model.frame() sometimes reported levels that were not actually used.

    o order(c("5","6",NA,"4",NA), na.last=FALSE) was incorrect.
    o Coercion to raw could give spurious messages about discarding imaginary parts. Coercion of a list to raw was behaving inconsistently for out-of-range values (and not warning).

    o cor.test(method = "spearman") gave NA p-values for _very_ long vectors. (PR#8087)

    o Switching to the Mersenne-Twister RNG could cause a segfault on first use. A user-supplied RNG without user_unif_{nseed,seedloc} was being re-initialized at each call.

    o Reading very short tables from stdin() with read.table would fail because of a typo.


                CHANGES IN R VERSION 2.1.1

NEW FEATURES

    o bug.report() now reports the locale in use.

    o upgrade.packageStatus() allows user input "c" to cancel the upgrade, just as update.packages() does.

    o glm() now accepts 1D arrays (e.g. tables) as a response, dropping them to a vector whilst preserving names.

    o df() with one infinite df now works (to match pf()).

    o Added tclServiceMode() function to the tcltk package to allow updating to be suspended.

    o The Encoding: field of a DESCRIPTION file is now documented, and used by packageDescription() and library(help=).

    o There has been progress on translations: existing translations have been revised and expanded, and French and Korean have been added. The Windows installer supports a wide range of languages for installation.

BUG FIXES

    o lm(qr=FALSE) now works.

    o predict.glm() no longer loses names for "response" predictions. (PR#7792)

    o Typo in menu(graphics=TRUE) meant it failed on Unix if tcltk was not available.

    o When names.dist() was removed, the result of cmdscale() lost its rownames. The example also lost the labels.

    o R CMD check assumed 'tar' was GNU tar and so supported -z.

    o read.table() was not handling escaped quotes inside quoted fields in the first five lines of the file. (PR#7789) It was also not handling correctly EOF in the first five lines when reading from stdin(). (PR#7772)

    o 'make uninstall' was incomplete.
    o make.packages.html() called by help.start() was failing if there were installed packages with help titles invalid in the current locale.

    o printCoefmat(signif.legend = FALSE) was non-functional. (PR#7802)

    o Some as.data.frame() methods failed because the expression deparsed into multiple lines. (PR#7808)

    o setRepositories() had a typo. (PR#7810)

    o Printing arrays/data frames with multibyte characters in the column labels was sometimes misaligned or using excessive space. (PR#7803)

    o The Tcl/Tk console did not support multibyte characters.

    o as.POSIXlt() could give infinite recursion if passed a corrupt "POSIXct" object (generated by an incorrect call to c.POSIXct, PR#7826).

    o update.packages() was not passing 'type' correctly to install.packages().

    o Printing the result of an unbalanced model.tables() call sometimes got confused if terms() had rearranged interaction terms. (PR#7829)

    o .Platform$pkgType was wrong on the CRAN MacOS X build, and .install.macbinary() was missing.

    o as.personList() as used by citation() got confused by names containing "and". (PR#7797)

    o Subscripting an array by a matrix containing zero or negative values or the wrong number of columns was not handled consistently. (PR#7824)

    o select.list(multiple=TRUE) now detects and tries again for invalid text input.

    o add1.[g]lm could give strange results with interaction terms when the model and the upper scope had different orders for the main effects. (PR#7842)

    o A bug had sneaked into the anova.mlmlist() code, affecting the Greenhouse-Geisser epsilon. Code wrongly assumed a matrix to be symmetric. (Thanks to Bela Bauer.)

    o anova.mlmlist() and mauchley.test() are now more tolerant to rank deficiency in the M and X matrices (also when they are implicitly generated via model.matrix()).

    o anova.mlm had a scoping issue (PR#7898)

    o pf() with infinite df is allowed again. It is now more accurate for extreme ratios of dfs, especially when there is a non-centrality parameter.
    o df() was inaccurate for large df (1e16 or greater).

    o dt() was inaccurate for large df (1e9 or greater) with a non-centrality parameter.

    o runmed(*, algorithm="Turlach") seg.faulted in rare cases.

    o strwrap() now makes a reasonable job of text that is invalid in the current locale.

    o Reading with encoding "UCS-2LE" will remove any Byte Order Mark, as most implementations of iconv fail to handle BOMs (which are present in 'Windows Unicode' files).

    o unique() for a list was incorrectly reporting `unimplemented'.

    o The parser's contextstack was not protected against overflow, e.g. more than 50 unmatched '('. (PR#7859)

    o source(file, chdir = TRUE) was not checking that 'file' was a filepath (rather than a URL). For 2.1.0 only, it did not work even if 'file' was a filepath.

    o Hershey fonts were being sized based on pixels not points so came out too small on devices where pixels were noticeably different from points (e.g., win.printer() and high-resolution screens). Fix means that default size of Hershey fonts may be slightly different, for example, smaller by default on PostScript and PDF.

    o The branch cuts in the complex versions of the inverse trigonometric and hyperbolic functions were non-standard. (PR#7871)

    o truncate() on file() connections was limited to files < 2Gb. It now works for larger files at least on 64-bit OSes and others where ftruncate supports such files. (Related to PR#7879)

    o proj.aovlist() did not work correctly on objects fitted from a data frame with row names.

    o The coding standards recommendations had nuke-trailing-whitespace where newer versions of ESS need ess-nuke-trailing-whitespace. (PR#7888)

    o package.skeleton() missed the first newline in the DESCRIPTION file.

    o pbirthday() reported p = 1 too often when coincident > 2.

    o plot(1:3, exp(1:3), log = "y", ylim = c(30,1)) {reversed log-scale axis} now works, based on Uwe Ligges' suggestions. (PR#7894)

    o install.packages() was aborting when a package in a bundle was chosen from a menu.
      It failed if more than one package in a bundle was chosen from the command line.

    o qcauchy() suffered from underflow in the extreme tails. (PR#7902)

    o Printing of raw matrices/arrays was not implemented. (PR#7912)

    o getCallingDLL()'s default first argument did not correspond to its description and has been changed. The mismatch caused symbols in .C/.Call/.Fortran calls without a PACKAGE= argument to be potentially looked up in the wrong name space.

    o Binary save() of raw vectors was not working correctly on big-endian platforms. (PR#7812)

    o as.Date.factor() now accepts a format argument.

    o Workaround added for FreeBSD which does not have alloca.h _and_ does not allow alloca() to be declared.

    o identify() now respects 'cex'. (PR#660) Warnings from identify() are now printed immediately even on consoles with delayed printing.


                CHANGES IN R VERSION 2.1.0

USER-VISIBLE CHANGES

    o box plots {by boxplot() or bxp()} now have the median line three times the normal line width in order to distinguish it from the quartile ones.

    o Unix-alike versions of R can now be used in UTF-8 locales on suitably equipped OSes. See the internationalization section below.

    o The meaning of 'encoding' for a connection has changed: See the internationalization section below.

    o There has been some rationalization of the format of warning/error messages, to make them easier to translate. Generally names of functions and arguments are single-quoted, and classes double-quoted.

    o Reading text files with embedded "\" (as in Windows file names) may now need to use scan(* , allowEscapes = FALSE), see also below.

NEW FEATURES

    o %% now warns if its accuracy is likely to be affected by lack of precision (as in 1e18 %% 11, the unrealistic expectation of PR#7409), and tries harder to return a value in range when it is.

    o abbreviate() now warns if used with non-ASCII chars, as the algorithm is designed for English words.
    o The default methods for add1() and drop1() check for changes in the number of cases in use. The "lm" and "glm" methods for add1() quoted the model on the original fitted values when using (with a warning) a smaller set of cases for the expanded models.

    o Added alarm() function to generate a bell or beep or visual alert.

    o all/any() now attempt to coerce their arguments to logical, as documented in the Blue Book. This means e.g. any(list()) works.

    o New functions for multivariate linear models: anova.mlm(), SSD(), estVar(), mauchley.test() (for sphericity). vcov() now does something more sensible for "mlm" class objects.

    o as.data.frame.table() has a new argument 'responseName' (contributed by Bill Venables).

    o as.dist() and cophenetic() are now generic, and the latter has a new method for objects of class "dendrogram".

    o as.ts() is now generic.

    o binomial() has a new "cauchit" link (suggested by Roger Koenker).

    o chisq.test() has a new argument 'rescale.p'. It is now possible to simulate (slowly) the P value also in the 1D case (contributed by Rolf Turner).

    o choose(n,k) and lchoose(.) now also work for arbitrary (real) n in accordance with the general binomial theorem. choose(*,k) is more accurate (and faster) for small k.

    o Added colorRamp() and colorRampPalette() functions for color interpolation.

    o colSums()/rowSums() now allow arrays with a zero-length extent (requested by PR#7775).

    o confint() has stub methods for classes "glm" and "nls" that invoke those in package MASS. This avoids using the "lm" method for "glm" objects if MASS is not attached. confint() has a default method using asymptotic normality.

    o contr.SAS() has been moved from the 'nlme' package to the 'stats' package.

    o New function convertColors() maps between color spaces. colorRamp() uses it.

    o The cov() function in the non-Pearson cases now ranks data after removal of missing values, not before. The pairwise-complete method should now be consistent with cor.test.
      (Code contributed by Shigenobu Aoki.)

    o Added delayedAssign() function to replace delay(), which is now deprecated.

    o dir.create() has a new argument 'recursive' serving the same purpose as Unix's mkdir -p.

    o do.call() now takes either a function or a character string as its first argument. The supplied arguments can optionally be quoted.

    o duplicated() and unique() now accept "list" objects, but are fast only for simple list objects.

    o ecdf() now has jumps of the correct size (a multiple of 1/n) if there are ties. (Wished by PR#7292).

    o eff.aovlist() assumed orthogonal contrasts for any term with more than one degree of freedom: this is now documented and checked for. Where each term only occurs in only one stratum the efficiencies are all one: this is detected and orthogonal contrasts are not required.

    o New function encodeString() to encode character strings in the same way that printing does.

    o file("clipboard") now works for reading the primary selection on Unix-alikes with an active X11 display. (It has long worked for reading and writing under Windows.) The secondary selection can also be read: see ?file. file() now allows mode "w+b" as well as "w+".

    o file.append() has been tuned, including for the case of appending many files to a single file.

    o Functions flush.console() and select.list() are now available on all platforms. There is a Tcl/Tk-based version of select.list() called tk_select.list() in package tcltk.

    o gc() now reports maximum as well as current memory use.

    o A new function getGraphicsEvent() has been added which will allow mouse or keyboard input from a graphics device. (NB: currently only the Windows screen device supports this function. This should improve before the 2.1.0 release.)

    o New functions gray.colors()/grey.colors() for gray color palettes.

    o grep(), gsub(), sub() and regexpr() now always attempt to coerce their 'pattern', 'x', 'replacement' and 'text' arguments to character.
      Previously this was undocumented but done by [g]sub() and regexpr() for some values of their other arguments. (Wish of PR#7742.)

    o gsub/sub() have a new 'fixed' method.

    o New function hcl() for creating colors for a given hue, chroma and luminance (i.e. perceptual hsv).

    o isTRUE() convenience function to be used for programming.

    o kmeans() now returns an object of class "kmeans" which has a print() method. Two alternative algorithms have been implemented. If the number of centres is supplied, it has a new option of multiple random starts.

    o The limits on the grid size in layout() are now documented, and have been raised somewhat by using more efficient internal structures.

    o legend() now accepts positioning by keyword, e.g. "topleft", and can put a title within the legend. (Suggested by Elizabeth Purdom in PR#7400.)

    o mahalanobis() now has a '...' argument which is passed to solve() for computing the inverse of the covariance matrix, this replaces the former 'tol.inv' argument.

    o menu() uses a multi-column layout if possible for more than 10 choices. menu(graphics = TRUE) is implemented on most platforms via select.list() or tk_select.list().

    o New function message() in 'base' for generating "simple" diagnostic messages, replacing such a function in the 'methods' package.

    o na.contiguous() is now (S3) generic with first argument renamed to 'object'.

    o New function normalizePath() to find canonical paths (and on Windows, canonical names of components).

    o The default in options("expressions") has been increased to 5000, and the maximal settable value to 500000.

    o p.adjust() has a new method "BY".

    o pbeta() now uses a different algorithm for large values of at least one of the shape parameters, which is much faster and is accurate and reliable for very large values. (This affects pbinom(), pf(), qbeta() and other functions using pbeta at C level.)

    o pch="." now by default produces a rectangle at least 0.01" per side on high-resolution devices.
      (It used to be one-pixel square even on high-resolution screens and Windows printers, but 1/72" on postscript() and pdf() devices.) Additionally, the size is now scalable by 'cex'; see ?points and note that the details are subject to change.

    o pdf() now responds to the 'paper' and 'pagecentre' arguments. The default value of 'paper' is "special" for backward-compatibility (this is different from the default for postscript()).

    o plot.data.frame() tries harder to produce sensible plots for non-numeric data frames with one or two columns.

    o The predict() methods for "prcomp" and "princomp" now match the columns of 'newdata' to the original fit using column names if these are available.

    o New function recordGraphics() to encapsulate calculations and graphics output together on graphics engine display list. To be used with care.

    o New function RSiteSearch() to query R-related resources on-line (contributed by Jonathan Baron and Andy Liaw).

    o scan() arranges to share storage of duplicated character strings read in: this can dramatically reduce the memory requirements for large character vectors which will subsequently be turned into factors with relatively few levels. For a million items this halved the time and reduced storage by a factor of 20. scan() has a new argument 'allowEscapes' (default TRUE) that controls when C-style escapes in the input are interpreted. Previously only \n and \r were interpreted, and then only within quoted strings when no separator was supplied. scan() used on an open connection now pushes back on the connection its private `ungetc' and so is safer to use to read partial lines.

    o scatter.smooth() and loess.smooth() now handle missing values in their inputs.

    o seq.Date() and seq.POSIXt() now allow 'to' to be before 'from' if 'by' is negative.

    o sprintf() has been enhanced to allow the POSIX/XSI specifiers like "%2$6d", and also accepts "%x" and "%X". sprintf() does limited coercion of its arguments.
      sprintf() accepts vector arguments and operates on them in parallel (after re-cycling if needed).

    o New function strtrim() to trim character vectors to a display width, allowing for double-width characters in multi-byte character sets.

    o subset() now has a method for matrices, similar to that for data frames.

    o Faster algorithm in summaryRprof().

    o sunflowerplot() has new arguments 'col' and 'bg'.

    o sys.function() now has argument 'which' (as has long been presaged on its help page).

    o Sys.setlocale("LC_ALL", ) now only sets the locale categories which R uses, and Sys.setlocale("LC_NUMERIC", ) now gives a warning (as it can cause R to malfunction).

    o unclass() is no longer allowed for environments and external pointers (since these cannot be copied and so unclass() was destructive of its argument). You can still change the "class" attribute.

    o File-name matching is no longer case-insensitive with unz() connections, even on Windows.

    o New argument 'immediate.' to warning() to send an immediate warning.

    o New convenience wrappers write.csv() and write.csv2().

    o There is a new version for write.table() which is implemented in C. For simple matrices and data frames this is several times faster than before, and uses negligible memory compared to the object size. The old version (which no longer coerces a matrix to a data frame and then back to a matrix) is available for now as write.table0().

    o The functions xinch(), yinch(), and xyinch() have been moved from package 'grDevices' into package 'graphics'.

    o Plotmath now allows underline in expressions. (PR#7286, contributed by Uwe Ligges.)

    o BATCH on Unix no longer sets --gui="none" as the X11 module is only loaded if needed.

    o The X11 module (and hence the X11(), jpeg() and png() devices and the X-based dataentry editor) is now in principle available under all Unix GUIs except --gui="none", and this is reflected in capabilities().
      capabilities("X11") determines if an X server can be accessed, and so is more likely to be accurate.

    o Printing of arrays now honours the 'right' argument if there are more than two dimensions.

    o Tabular printing of numbers now has headers right-justified, as they were prior to version 1.7.0 (spotted by Rob Baer).

    o Lazy-loading databases are now cached in memory at first use: this enables R to run much faster from slow file systems such as USB flash drives. There is a small (less than 2Mb) increase in default memory usage.

    o The implicit class structure for numeric vectors has been changed, so that integer/real vectors try first methods for class "integer"/"double" and then those for class "numeric". The implicit classes for matrices and arrays have been changed to be "matrix"/"array" followed by the class(es) of the underlying vector.

    o splines::splineDesign() now allows the evaluation of a B-spline basis everywhere instead of just inside the "inner" knots, by setting the new argument `outer.ok = TRUE'.

    o Hashing has been tweaked to use half as much memory as before.

    o Readline is not used for tilde expansion when R is run with --no-readline, nor from embedded applications. Then "~name" is no longer expanded, but "~" still is.

    o The regular expression code is now based on that in glibc 2.3.3. It has stricter conformance to POSIX, so metachars such as { } + * may need to be escaped where before they did not (but could have been).

    o New encoding 'TeXtext.enc' improves the way postscript() works with Computer Modern fonts.

    o Replacement in a non-existent column of a data frame tries harder to create a column of the correct length and so avoid a corrupt data frame.

    o For Windows and readline-based history, the saved file size is re-read from R_HISTSIZE immediately before saving.

    o Collected warnings during start-up are now printed before the initial prompt rather than after the first command.
    o Changes to package 'grid':

        - preDrawDetails(), drawDetails(), and postDrawDetails() methods are now recorded on the graphics engine display list. This means that calculations within these methods are now run when a device is resized or when output is copied from one device to another.

        - Fixed bug in grid.text() when 'rot' argument has length 0. (privately reported by Emmanuel Paradis)

        - New getNames() function to return just the names of all top-level grobs on the display list.

        - Recording on the grid display list is turned off within preDrawDetails(), drawDetails(), and postDrawDetails() methods.

        - Grid should recover better from errors or user-interrupts during drawing (i.e., not leave you in a strange viewport or with strange graphical parameter settings).

        - New function grid.refresh() to redraw the grid display list.

        - New function grid.record() to capture calculations with grid graphics output.

        - grobWidth and grobHeight ("grobwidth" and "grobheight" units) for primitives (text, rects, etc, ...) are now calculated based on a bounding box for the relevant grob. NOTE: this has changed the calculation of the size of a scalar rect (or circle or lines).

        - New arguments 'warn' and 'wrap' for function grid.grab()

        - New function grid.grabExpr() which captures the output from an expression (i.e., not from the current scene) without doing any drawing (i.e., no impact on the current scene).

        - upViewport() now (invisibly) returns the path that it goes up (suggested by Ross Ihaka).

        - The 'gamma' gpar has been deprecated (this is a device property not a property of graphical objects; suggested by Ross Ihaka).

        - New 'lex' gpar; a line width multiplier.

        - grid.text() now handles any language object as mathematical annotation (instead of just expressions).

        - plotViewport() has default value for 'margins' argument (that match the default value for par(mar)).
        - The 'extension' argument to dataViewport() can now be a vector, in which case the first value is used to extend the xscale and the second value is used to extend the y scale. (suggested by Ross Ihaka).

        - All 'just' arguments (for viewports, layouts, rectangles, text) can now be numeric values (typically between 0 [left] and 1 [right]) as well as character values ("left", "right", ...). For rectangles and text, there are additional 'hjust' and 'vjust' arguments which allow numeric vectors of justification in each direction (e.g., so that several pieces of text can have different justifications). (suggested by Ross Ihaka)

        - New 'edits' argument for grid.xaxis() and grid.yaxis() to allow specification of on-the-fly edits to axis children.

        - applyEdit(x, edit) returns x if target of edit (i.e., child specified by a gPath) cannot be found.

        - Fix for calculation of length of max/min/sum unit. Length is now (correctly) reported as 1 (was reported as length of first arg).

        - Viewport names can now be any string (they used to have to be a valid R symbol).

        - The 'label' argument for grid.xaxis() and grid.yaxis() can now also be a language object or string vector, in which case it specifies custom labels for the tick marks.

INTERNATIONALIZATION

    o Unix-alike versions of R can now be used in UTF-8 and other multi-byte locales on suitably equipped OSes if configured with option --enable-mbcs (which is the default). [The changes to font handling in the X11 module are based on the Japanization patches of Ei-ji Nakama.] Windows versions of R can be used in `East Asian' locales on suitable versions of Windows. See the 'Internationalization' chapter in the 'Installation and Administration' manual.

    o New command-line flag --encoding to specify the encoding to be assumed for stdin (but not for a console).

    o New function iconv() to convert character vectors between encodings, on those OSes which support this. See the new capabilities("iconv").
    o The meaning of 'encoding' for a connection has changed: it now allows any charset encoding supported by iconv on the platform, and can re-encode output as well as input. As the new specification is a character string and the old was numeric, this should not cause incorrect operation.

    o New function localeToCharset() to find/guess encoding(s) from the locale name.

    o nchar() returns the true number of bytes stored (including any embedded nuls), this being 2 for missing values. It has an optional argument 'type' with possible non-default values "chars" and "width" to give the number of characters or the display width in columns.

    o Characters can be entered in hexadecimal as e.g. \x9c, and in UTF-8 and other multibyte locales as \uxxxx, \u{xxxx}, \Uxxxxxxxx or \U{xxxxxxxx}. Non-printable Unicode characters are displayed C-style as \uxxxx or \Uxxxxxxxx.

    o LC_MONETARY is set to the locale, which affects the result of Sys.localeconv(), but nothing else in R itself. (It could affect add-on packages.)

    o source() now has an 'encoding' argument which can be used to make it try out various possible encodings. This is made use of by example() which will convert (non-UTF-8) Latin-1 example files in a UTF-8 locale.

    o read/writeChar() work in units of characters, not bytes.

    o .C() now accepts an ENCODING= argument where re-encoding is supported by the OS. See `Writing R Extensions'.

    o delimMatch (tools) now reports match positions and lengths in units of characters, not bytes. The delimiters can be strings, not just single ASCII characters.

    o .Rd files can indicate via a \encoding{} argument the encoding that should be assumed for non-ASCII characters they contain.

    o Phrases in .Rd files can be marked by \enc{}{} to show a transliteration to ASCII for use in e.g. text help.

    o The use of 'pch' in points() now allows for multi-byte character sets: in such a locale a glyph can either be specified as a multi-byte single character or as a number, the Unicode point.
    o New function l10n_info() reports on aspects of the locale/charset currently in use.

    o scan() is now aware of double-byte locales such as Shift-JIS in which ASCII characters can occur as the second ('trail') byte.

    o Functions sQuote() and dQuote() use the Unicode directional quotes if in a UTF-8 locale.

    o The infrastructure is now in place for C-level error and warning messages to be translated and used on systems with Native Language Support. This has been used for the startup message in English and to translate Americanisms such as 'color' into English: translations to several other languages are under way, and some are included in this release. See 'Writing R Extensions' for how to make use of this in a package: all the standard packages have been set up to do translation, and the 'language' 'en@quot' is implemented to allow Unicode directional quotes in a UTF-8 locale.

    o R-level stop(), warning() and message() messages can be translated, as can other messages via the new function gettext(). Tools xgettext() and xgettext2pot() are provided in package tools to help manage error messages. gettextf() is a new wrapper to call sprintf() using gettext() on the format string.

    o Function ngettext() allows the management of singular and plural forms of messages.

UTILITIES

    o New functions mirror2html() and checkCRAN().

    o R CMD check has a new option '--use-valgrind'.

    o R CMD check now checks that Fortran and C++ files have LF line endings, as well as C files. It also checks Makevars[.in] files for portable compilation flags.

    o R CMD check will now work on a source tarball and prints out information about the version of R and the package.

    o tools:::.install_package_code_files() (used to collate R files when installing packages) ensures files are separated by a line feed.

    o vignette() now returns an object of class "vignette" whose print() method opens the corresponding PDF file. The edit() method can be used to open the code of the vignette in an editor.
    o R CMD INSTALL on Unix has a new option '--build' matching that on Windows, to package as tarball the installed package.

    o R CMD INSTALL on Unix can now install binary bundles.

    o R CMD build now changes src files to LF line endings if necessary.

    o R CMD build now behaves consistently between source and binary builds: in each case it prepares a source directory and then either packages that directory as a tarball or calls R CMD INSTALL --build on the prepared sources. This means that R CMD build --binary now respects .Rbuildignore and will rebuild vignettes (unless the option --no-vignettes is used). For the latter, it now installs the current sources into a temporary library and uses that version of the package/bundle to rebuild the vignettes.

    o R CMD build now reports empty directories in the source tree.

    o New function write_PACKAGES() in package 'tools' to help with preparing local package repositories. (Based on a contribution by Uwe Ligges.) How to prepare such repositories is documented in the 'R Installation and Administration' manual.

    o package.skeleton() adds a bit more to DESCRIPTION.

    o Sweave changes:

        - \usepackage[nogin]{Sweave} in the header of an Sweave file suppresses auto-setting of graphical parameters such as the width of graphics.

        - The new \SweaveInput{} command works similarly to LaTeX's \input{} command.

        - Option value strip.white=all strips all blank lines from the output of a code chunk.

        - Code chunks with eval=false are commented out by Stangle() and hence no longer tested by R CMD check.

DOCUMENTATION

    o File doc/html/faq.html no longer exists, and doc/manual/R-FAQ.html (which has active links to other manuals) is used instead. (If makeinfo >= 4.7 is not available, the version on CRAN is linked to.)

    o Manual 'Writing R Extensions' has further details on writing new front-ends for R using the new public header files.

    o There are no longer any restrictions on characters in the \name{} field of a .Rd file: in particular _ is supported.
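
    As a rough illustration of the R-level message functions listed under INTERNATIONALIZATION above, the sketch below uses ngettext() and gettextf(). The helper name report_skipped() and the message texts are hypothetical, and the sketch assumes no translation catalog applies to them, so the English strings come back unchanged.

```r
## Hypothetical helper (not part of R): builds a message with the
## singular or plural wording chosen by ngettext().
report_skipped <- function(n) {
    ## With no translation available, ngettext() returns the first
    ## format when n == 1 and the second format otherwise.
    fmt <- ngettext(n,
                    "%d example file was skipped",
                    "%d example files were skipped")
    sprintf(fmt, n)
}

one  <- report_skipped(1L)
many <- report_skipped(3L)

## gettextf() passes the format string through gettext() and then
## hands it to sprintf() together with the remaining arguments.
note <- gettextf("example item '%s' has %d entries", "demo", 2L)

print(c(one, many, note))
```

    Inside a package the same calls would look up translations in that package's message domain; how to set that up is covered in 'Writing R Extensions'.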
C-LEVEL FACILITIES o There are new public C/C++ header files Rinterface.h and R_ext/RStartup.h for use with external GUIs. o Added an onExit() function to graphics devices, to be executed upon user break if non-NULL. o ISNAN now works even in C++ code that undefines the 'isnan' macro. o R_alloc's limit on 64-bit systems has been raised from just under 2^31 bytes (2Gb) to just under 2^34 (16Gb), and is now checked. o New math utility functions log1pmx(x), lgamma1p(x), logspace_add(logx, logy), and logspace_sub(logx, logy). DEPRECATED & DEFUNCT o The aqua module for MacOS X has been removed: --with-aqua now refers to the unbundled Cocoa GUI. o Capabilities "bzip2", "GNOME", "libz" and "PCRE" are defunct. o The undocumented use of UseMethod() with no argument was deprecated in 2.0.1 and is now regarded as an error. o Capability "IEEE754" is deprecated. o The 'CRAN' argument to update.packages(), old.packages(), new.packages(), download.packages() and install.packages() is deprecated in favour of 'repos', which replaces it as a positional argument (so this is only relevant for calls with named args). o The S3 methods for getting and setting names of "dist" objects have been removed (as they provided names with a different length from the "dist" object itself). o Option "repositories" is no longer used and so not set. o loadURL() is deprecated in favour of load(url()). o delay() is deprecated. Use delayedAssign() instead. INSTALLATION CHANGES o New configure option --enable-mbcs to enable support for UTF-8 locales, on by default. o R_XTRA_[CF]FLAGS are now used during the configuration tests, and [CF]PICFLAGS if --enable-R-shlib was specified. This ensures that features such as inlining are only used if the compilation flags specified support them. (PR#7257) o Files FAQ, RESOURCES, doc/html/resources.html are no longer in the SVN sources but are made by 'make dist'. o The GNOME GUI is unbundled, now provided as a package on CRAN.
o Configuring without having the recommended packages is now an error unless --with-recommended-packages=no (or equivalent) is used. o Configuring without having the X11 headers and libraries is now an error unless --with-x=no (or equivalent) is used. o Configure tries harder to find a minimal set of FLIBS. Under some circumstances this may remove from R_LD_LIBRARY_PATH path elements that ought to have been specified in LDFLAGS (but were not). o The C code for most of the graphics device drivers and their afm files are now in package grDevices. o R is now linked against ncurses/termlib/termcap only if readline is specified (now the default) and that requires it. o Makeinfo 4.7 or later is now required for building the HTML and Info versions of the manuals. PACKAGE INSTALLATION CHANGES o There are new types of packages, identified by the Type field in the DESCRIPTION file. For example the GNOME console is now a separate package (on CRAN), and translations can be distributed as packages. o There is now support for installing both source and binary packages from within R on MacOS X and Windows. Most of the R functions now have a 'type' argument defaulting to getOption("pkgType") and with possible values "source", "win.binary" and "mac.binary". The default is "source" except under Windows and the CRAN GUI build for MacOS X. o install.packages() and friends now accept a vector of URLs for 'repos' or 'contriburl' and get the newest available version of a package from the first repository on the list in which it is found. The argument 'CRAN' is still accepted, but deprecated. install.packages() on Unix can now install from local .tar.gz files via repos = NULL (as has long been done on Windows). install.packages() no longer asks if downloaded packages should be deleted: they will be deleted at the end of the session anyway (and can be deleted by the user at any time).
If the repository provides the information, install.packages() will now accept the name of a package in a bundle. If 'pkgs' is omitted install.packages() will use a listbox to display the available packages, on suitable systems. 'dependencies' can be a character vector to allow only some levels of dependencies (e.g. not "Suggests") to be requested. o There is a new possible value update.packages(ask="graphics") that uses a widget to (de)select packages, on suitable systems. o The option used is now getOption("repos") not getOption("CRAN") and it is initially set to a dummy value. Its value can be a character vector (preferably named) giving one or several repositories. A new function chooseCRANmirror() will select a CRAN mirror. This is called automatically if contrib.url() encounters the initial dummy value of getOption("repos"). A new function setRepositories() can be used to create getOption("repos") from a (platform-specific) list of known repositories. o New function new.packages() to report uninstalled packages available at the requested repositories. This also reports incomplete bundles. It will optionally install new packages. o New function available.packages(), similar to CRAN.packages() but for use with multiple repositories. Both now only report packages whose R version requirements are met. o update.packages() and old.packages() have a new option 'checkBuilt' to allow packages installed under earlier versions of R to be updated. o remove.packages() can now remove bundles. o The Contains: field of the DESCRIPTION file of package bundles is now installed, so later checks can find out if the bundle is complete. o packageStatus() is now built on top of *.packages, and gains a 'method' argument. It defaults to the same repositories as the other tools, those specified by getOption("repos").
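As an illustration of the 'repos' option described above, here is a minimal sketch of setting a named character vector of repositories. The mirror URL and the second entry (its name 'local' and its file:// path) are assumptions chosen for the example, not values taken from this file.

```r
# Set 'repos' to a named character vector and keep the old value so it
# can be restored afterwards.
old <- options(repos = c(CRAN  = "https://cloud.r-project.org",  # assumed mirror
                         local = "file:///srv/R/repo"))          # hypothetical local repository

repos <- getOption("repos")
print(names(repos))

# Restore the previous setting.
options(old)
```

Naming the elements is what the entry above means by "preferably named": the names identify each repository when the vector is printed or edited.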
BUG FIXES o Configuring for Tcl/Tk makes use of ${TK_LIB_SPEC} ${TK_LIBS} not ${TK_LIB_SPEC} ${TK_XLIBSW}, which is correct for recent versions of Tk, but conceivably not for old tkConfig.sh files. o detach() was not recomputing the S4 methods for primitives correctly. o Methods package now has class "expression" partly fixed in basic classes, so S4 classes can extend these (but "expression" is pretty broken as a vector class in R). o Collected warnings had messages with unneeded trailing space. o S4 methods for primitive functions must be exported from name spaces; this is now done automatically. Note that is.primitive() is now in "base", not "methods". o Package grid: - Fixed bug in grid.text() when "rot" argument has length 0. (reported by Emmanuel Paradis) o .install_package_vignette_index() created an index even in an empty 'doc' directory. o The print() method for factors now escapes characters in the levels in the same way as they are printed. o str() removed any class from environment objects. str() no longer interprets control characters in character strings and factor levels; also no longer truncates factor levels unless they are longer than 'nchar.max'. Truncation of such long strings is now indicated ''outside'' the string. str() was misleading for the case of a single slot. str() now also properly displays S4 class definitions (such as returned by getClass()). o print.factor(quote=TRUE) was not quoting levels, causing ambiguity when the levels contained spaces or quotes. o R CMD check was confused by a trailing / on a package name. o write.table() was writing incorrect column names if the data frame contained any matrix-like columns. o write.table() was not quoting row names for a 0-column x. o t(x)'s default method now also preserves names(dimnames(x)) for 1D arrays 'x'. o r <- a %*% b no longer produces names(dimnames(r)) == c("", "") unless one of a or b has named dimnames. o Some .Internal functions that were supposed to return invisibly did not.
This was behind PR#7397 and PR#7466. o eval(expr, NULL, encl) now looks up variables in encl, as eval(expr, list(), encl) always did. o Coercing as.data.frame(NULL) to a pairlist caused an error. o p.adjust(p, ..) now correctly works when `p' contains NAs (or when it is of length 0 or length 2 for method = "hommel"). o 'methods' initialization was calling a function intended for .Call() with .C(). o optim() needed a check that the objective function returns a value of length 1 (spotted by Ben Bolker). o X11() was only scaling its fonts to pointsize if the dpi was within 0.5 of 100dpi. o X11() font selection was looking for any symbol font, and sometimes got e.g. bold italic if the server has such a font. o dpois(*, lambda=Inf) now returns 0 (or -Inf for log). o Using pch="" gave a square (pch=0)! Now it is regarded as the same as NA, which was also undocumented but omits the point. o Base graphics now notices (ab)lines which have a zero coordinate on log scale, and omits them. (PR#7559) o stop() and warning() now accept NULL as they are documented to do (although this seems of little use and is equivalent to ""). o weighted.mean() now checks the length of the weight vector w. o getAnywhere() was confused by names with leading or trailing dots (spotted by Robert McGehee). o eval() was not handling values from return() correctly. o par(omd) is now of the form c(x1, x2, y1, y2) to match the documentation and for S-PLUS compatibility. [Previously, par(omd) was of the form c(bottom, left, top, right) like par(oma) and par(omi)] o formatC() did not check its 'flag' argument, and could segfault if it was incorrect. (PR#7686) o Contrasts needed to be coerced to numeric (e.g. from integer) inside model.matrix. (PR#7695) o socketSelect() did not check for buffered input. o Reads on a non-blocking socket with no available data were not handled properly and could result in a segfault.
o The "aovlist" method for se.contrast() failed in some very simple cases that were effectively not multistratum designs, e.g. only one treatment occurring in only one stratum. o pgamma() uses completely re-written algorithms, and should work for all (even very extreme) arguments; this is based on Morten Welinder's contribution related to PR#7307. o dpois(10, 2e-308, log=TRUE) and similar cases gave -Inf. o x <- 2^(0:1000); plot(x, x^.9, type="l", log="xy")# and x <- 2^-(1070:170); plot(x, x^.9, type="l", log="xy")# now both work o summary.lm() asked for a report on a reasonable occurrence, but the check failed to take account of NAs. o lm() was miscalculating 'df.residual' for empty models with a matrix response. o summary.lm() now behaves more sensibly for empty models. o plot.window() was using the wrong sign when adjusting xlim/ylim for positive 'asp' and a reversed axis. o If malloc() fails when allocating a large object the allocator now does a gc and tries the malloc() again. o packageSlot() and getGroupMembers() are now exported from the 'methods' package as they should from documentation and the Green Book. o rhyper() was giving numbers slightly too small, due to a bug in the original algorithm. (PR#7314) o gsub() was sometimes incorrectly matching ^ inside a string, e.g. gsub("^12", "x", "1212") was "xx". o [g]sub(perl = TRUE) was giving random results for a 0-length initial match. (PR#7742) o [g]sub was ignoring most 0-length matches, including all initial ones. Note that substitutions such as gsub("[[:space:]]*", " ", ...) now work as they do in 'sed' (whereas the effect was previously the same as gsub("[[:space:]]+", " ", ...)). (In part PR#7742) o Promises are now evaluated when extracted from an environment using '$' or '[[ ]]'. o reshape(direction="wide") had some sorting problems when guessing time points (PR#7669) o par() set 'xaxp' before 'xlog' and 'yaxp' before 'ylog', causing PR#831. 
o The logic in tclRequire() to check the availability of a Tcl package turned out to be fallible. It now uses a try()-and-see mechanism instead. o Opening a unz() connection on a non-existent file left a file handle in use. o "dist" objects of length 0 failed to print. o INSTALL and the libR try harder to find a temporary directory (since there might be one left over with the same PID). o acf() could cause a segfault with some datasets. (PR#7771) o tan(1+LARGEi) now gives 0+1i rather than 0+NaNi (PR#7781) o summary(data.frame(mat = I(matrix(1:8, 4)))) does not go into infinite recursion anymore. o writeBin() performed byte-swapping incorrectly on complex vectors, also swapping real and imaginary parts. (PR#7778) o read.table() sometimes discarded as blank lines containing only white space, even if sep=",". ************************************************** * * * 2.0 SERIES NEWS * * * ************************************************** CHANGES IN R VERSION 2.0.1 patched NEW FEATURES o warnings() now looks only in the workspace for `last.warning' (suggested by PR#7363). o The search for browsers now starts with firefox, and has mozilla ahead of netscape. PACKAGE INSTALLATION CHANGES o A package DESCRIPTION file which contains a Built field (it should not!) is now worked around, with loud warnings. BUG FIXES o split() was accepting raw and list vectors as input, but not populating the output correctly. split() now handles vectors with names internally and so is almost as fast as on vectors without names (and maybe 100x faster than before). o subset() now throws an error if its 'subset' argument is not logical whereas it could appear to work and give wrong answers with e.g. a numeric 'subset' argument. o sum(), max(), min(), prod(), any() and all() were incorrectly using partial matching for their na.rm argument. o kmeans() now ensures that the initial cluster centers it chooses are distinct. o text()'s default method could segfault if passed 0-length coordinates. 
o R CMD INSTALL (Unix) was sometimes leaving a temporary dir behind. (PR#7230) o read.table() could fail when the row.names were looking like numbers. o seq(length= ) now always returns "integer" storage. o labels.lm was broken. (PR#7417) o The hashing used for character vectors in object.size() was inefficient for vectors with thousands of identical values not sharing the same storage (an unusual case). o data.matrix() now warns if applied to a data frame with classed columns. o Plotting histograms where the expression used to deparse to multiple lines now gives a sensible default title. (PR#7421) o The print method for getDLLRegisteredRoutines() was badly designed, and failed for all the standard packages' DLLs. o getDLLRegisteredRoutines.character() was broken. o write.table(x, row.names=FALSE) was incorrect for a 0-column x. o write.table() assumed dec = "," if it was not ".". o morley.tab used by R-intro was not installed on Unix since 2.0.0 (it was on Windows). o gsub(perl=TRUE) returned a string which printed with trailing garbage if there was a match at the beginning whose replacement was shorter. (PR#7479) Similarly, the result was truncated if a replacement at the beginning was longer. o Some PDF readers do not define PDFDocEncoding, so pdf()'s ISOLatin1 encoding is now derived from WinAnsi rather than PDFDocEncoding. o xfig() had not been updated for the 2.0.0 alpha changes. Using more than one plot with a non-white background and with onefile=FALSE could segfault. o sprintf() did not check for buffer overflow on character strings. (PR#7554) o The error message when evaluation depth was exceeded itself caused an error in deparsing and so was not shown. o The 'datalist' file was not being used during installation (2.0.1 only). o replications() was wrongly reporting lack of balance in designs with interactions (and had since unique() gained a matrix method). 
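Three of the fixes in the BUG FIXES list above (seq(length= ), the 'na.rm' matching in sum() and friends, and the stricter subset()) can be checked with a short sketch. This is an illustration written against current R behaviour, not code from the R sources; the data frame 'df' and the other variable names are made up for the example.

```r
# seq(length = ) returns "integer" storage.
s <- seq(length = 5)
print(storage.mode(s))

# 'na.rm' is not partially matched: an argument named 'na' is treated
# as one more value to be summed (TRUE counts as 1).
total <- sum(1, 2, na = TRUE)
print(total)

# subset() signals an error when 'subset' is not logical.
df <- data.frame(a = 1:3)                 # hypothetical example data
res <- tryCatch(subset(df, c(1, 2, 3)),   # numeric, not logical
                error = function(e) "error")
print(res)
```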
CHANGES IN R VERSION 2.0.1 NEW FEATURES o Platform equivalence in library() is tested by a new function testPlatformEquivalence() which ignores the 'vendor' field and can be customized by cognoscenti. o The assignment form of split() allows recycling of vectors within the value list. In particular, things like split(x, g) <- lapply(split(x, g), mean) now work. DOCUMENTATION o Manual `Writing R Extensions' has new sections on writing portable packages and on writing new front-ends for R -- the latter will be more comprehensive in R 2.1.0 which has new public header files. DEPRECATED & DEFUNCT o The aqua module in MacOS X is deprecated. o Capabilities "bzip2", "GNOME", "libz" and "PCRE" are deprecated. o The GNOME GUI on Unix-alikes is deprecated as part of R; it will be available in another form as from R 2.1.0. o The undocumented use of UseMethod() with no argument is now formally deprecated. INSTALLATION CHANGES o Building on Alpha OSF/1 no longer forces the C flag -std1, which appears to be no longer needed. (PR#7257) o The compiler flag -mieee-fp is no longer used on i386 Linux (these days it is only passed to the linker and was only invoked for compilation steps). o -D__NO_MATH_INLINES is only used on older ix86 glibc-based systems which need it (tested at configure time). This leads to small improvements in speed and accuracy on modern systems. o If makeinfo >= 4.5 is not available, warnings are given that some of the HTML manuals will be missing, and the index page given by help.start() will link to CRAN versions of those manuals. o Files aclocal.m4 and acinclude.m4 used in maintainer builds are no longer included in the distribution. C-LEVEL FACILITIES o It was not clear in 'Writing R Extensions' that some of the entry points in the 'Utilities' section were not declared in (they were in ). Now all the entry points in that section are declared in , included by .
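The assignment form of split() listed under NEW FEATURES for 2.0.1 can be illustrated with a small sketch. The vectors 'x' and 'g' are made-up example data; each group mean is a single number that is recycled over all members of its group.

```r
# Replace every element of x by the mean of its group.
x <- c(1, 2, 3, 4, 5, 6)
g <- factor(c("a", "b", "a", "b", "a", "b"))

# Each list element on the right-hand side has length 1 and is
# recycled to the size of the corresponding group.
split(x, g) <- lapply(split(x, g), mean)
print(x)
```

Group "a" holds 1, 3 and 5 and group "b" holds 2, 4 and 6, so the result alternates between the two group means.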
BUG FIXES o The grid.grab() function in package grid would throw an error if there were no viewports pushed (now returns NULL). o model.frame.default() takes row names from the response variable if that has suitable names and there is no 'data' argument. (This follows S but was not previously implemented in R.) o write.table() was not respecting the 'dec' argument for complex numbers. o write.table() printed a mixture of numeric and complex numbers as all complex. (PR#7260) o R CMD INSTALL failed with versioned installs on packages which save images (only). o dlogis() gave NaN not 0 for large negative arguments. o Importing from another name space was broken for versioned installs, incorrectly reporting something like "package 'imported_from' does not have a name space". o The GNOME interface under Linux/Unix was broken. (PR#7276) o For the jpeg/png devices under Linux/Unix, under certain rare circumstances clipping needed to be cleared before starting a new page. (PR#7270, which has been the case since the devices were introduced in 1.1.0.) o First lattice plot (first grid.newpage() call) did not start a new page IF there had been a previous traditional graphics plot (on the same device). o Using install.packages() to install the same package to more than one library gave an incorrect warning message. (If there were two or more such packages it might give an error.) o .packages(all.available=TRUE) returned packages with an invalid version field in their DESCRIPTION whereas .find.packages() and packageDescription() did not. Now all do not. o packageDescription() now correctly reports that a package does not exist, rather than that its DESCRIPTION file is 'missing or broken'. o 'make dist' from builddir != sourcedir was copying not linking recommended packages to *.tgz. o Slots in prototype objects can inherit from locally defined classes (which were not being found correctly before). 
o Several fixes to the behavior of as() when there are either coerce= or replace= methods supplied in a call to setIs(). Related fixes to setIs() to handle correctly previous methods, if there were any. o splinefun(1[0], 1[0])(1) doesn't segfault anymore (PR#7290). spline() and splinefun() now also work with missing values by omitting them. o ecdf() was failing on inputs containing NAs. (Part of PR#7292) o tools:::.install_package_description was splitting the Built: field across lines on platforms with very long names. o capabilities() was wrong for the Aqua GUI on MacOS X. o Using Rprof() with a non-writable 'file' argument is now a non-fatal error and does not abort R. o binom.test() did not deparse its arguments early enough such that the reported data were ugly if x was a table. o Systems based on glibc, including those using R's substitute for strptime, were handling strptime("2001", "%Y") incorrectly, in some cases crashing. R's substitute code has been corrected (but problems may remain if glibc is used). See ?strptime for what should happen (which is system-specific). o untrace() after trace() failed if package 'methods' was attached. (PR#7301) o summary.stepfun() was reporting for n > 6 summaries of the knots and levels as the actual values. Both print() and summary() methods called the constant values "step heights", although they were not the heights of the steps. o is.na/is.nan() were giving spurious warnings if applied to a raw vector. o is.atomic() gave incorrect result (false) for a raw vector. o rank() and order() accepted raw and list inputs, but did not give a sensible answer (always 1:n). Similarly, partial sorts of a raw vector were accepted but did nothing. o require() without a version argument tried for an unversioned load of a package even though a versioned install was already loaded. This often led to a message that a required package was being loaded when it was not actually being loaded.
o str() made use of attributes() instead of slot(), and hence didn't properly print NULL slots. o contrib.url() now handles URLs ending in '/' correctly. o str() removed any class from externalptr objects. o logLik() and hence AIC() failed or gave incorrect answers on "lm" fits with na.action = na.exclude (and perhaps other na.actions except na.omit and na.fail). o pmax() and pmin() sometimes used NAs in internal subassignments, and sometimes these failed. o Subassigning an expression, e.g. expr[2] <- 1, could leave an invalid object and so cause a segfault. (PR#7326) o download/install.packages() would misbehave if there was more than one version of a package in a repository. o sort(partial=) silently ignored some other arguments: using 'decreasing' or 'index.return' or supplying a factor are now errors. o The ave() function had trouble if the grouping contained unused levels. o read.fwf() got confused by skip > 0 and could loop infinitely under some circumstances. (PR#7350) o upgrade(x, ask = FALSE) was broken for a "packageStatus" object. o Class "raw" had been omitted from the list of basic classes in the "methods" package and so could not be used in S4 classes. o Function getGroupMembers(), part of the definition of S4 classes, had been promised for release 2.0, but slipped through. o toLatex(sessionInfo()) produced incorrect LaTeX on some platforms due to special characters in the platform identifier. ********************************************************* * * * News of 1.x.y and 2.0.0 is in file `ONEWS' * * News of 1.0.0 and earlier is in file `OONEWS' * * * *********************************************************