THE R TASK LIST ``Somebody, somebody has to, you see ...'' The Cat in the Hat Comes Back. ---------------------------------------------------------------------- TASK: Multiple Graphics Device Drivers STATUS: Open FROM: Everyone R needs to have multiple active device drivers and a means for copying pictures from one device to another, etc. etc. [ This is a medium-sized task. It would be most useful to ] [ do this in conjunction with moving to an event driven model. ] [ Greg Warnes has written some code which maintains, a device ] [ "display list". How much memory this might devour in the ] [ multiple device case is an open question. There is also ] [ the question of what to do about the graphics parameters. ] [ Should each device maintain a complete "par" state, or ] [ should some parameters (like col, lty, font ...) be global. ] [ Could a user have any memory of the last values in effect ] [ for a driver which had been idle for a while. ] [ This is just about to hit the top of the list. ] ---------------------------------------------------------------------- TASK: complex gamma and log gamma function not implemented STATUS: Open FROM: R@stat.auckland.ac.nz [ This is quite low priority. Complain if you need it. ] [ The Fullerton library has complex gamma function code. ] ---------------------------------------------------------------------- TASK: solution of complex linear systems STATUS: Open FROM: R@stat.auckland.ac.nz [ Really just a matter of grabbing the correct linpack code. ] [ How general do we want to be here ... ] ---------------------------------------------------------------------- TASK: "nlm" documentation inaccuracies STATUS: Open FROM: jlindsey@luc.ac.be The help for nlm is still called minimize although the contents have been updated. As well, when an illegal value is fed to nlm, the error message contains msg instead of print.level. [ The documentation looks ok. The function needs to be ] [ rewritten so that it uses derivative information. 
] ---------------------------------------------------------------------- TASK: "data.entry" problems STATUS: Open FROM: p.dalgaard@kubism.ku.dk the as.character problem in de() - probably better to fix even though it does make lists out of frames. there's no way to change a data value to NA in data.entry, etc. ... earlier message ... (Peter Dalgaard) data.entry et al do not seem to have been adjusted for the new data frame structure. This is actually a problem where a list is passed where a vector of character strings is expected. To fix it change snames <- substitute(list(...))[-1] to snames <- as.character(substitute(list(...))[-1]) However, there needs to be a look at the de... code. When a data frame is edited it is returned as a list. This can be cured with judicious use of "data.frame". [ The indicated change has been made, but other changes ] [ are needed. ] ---------------------------------------------------------------------- TASK: "x11" printcmd STATUS: Open FROM: maechler@stat.math.ethz.ch There is in theory a "printcmd" argument to x11, which is ignored. Make it do something. ---------------------------------------------------------------------- TASK: "source" requires a terminating newline on EOF STATUS: Open FROM: Kurt.Hornik@ci.tuwien.ac.at source() fails in many cases where a file has no final newline. (R&R, sorry for being ridiculouly nasty about things that don't work for files without a final newline. I have Emacs' next-line-add-newlines set to nil ...) This seems to be a problem with parse() in src/main/source.c in combo with the code in gram.y ... I know this is NOT something to quickly fix over the weekend. Please simply put it into your PROJECTS file. [ This is actually a syntax error according to the R grammar ] [ but maybe we can do something. 
] ---------------------------------------------------------------------- TASK: help file ALIAS() and LINK() constructions STATUS: Closed FROM: R@stat.auckland.ac.nz How do we know which file to LINK to? There needs to be a step which fills in the file name on the basis of all ALIAS declarations. [ A preprocessing step is needed. First we build a table ] [ of aliases and corresponding file names. Then we pass ] [ through the files building the correct LINK references. ] [ The new Rdconv and build-html... solve `everything' ] ---------------------------------------------------------------------- TASK: "paste" problem STATUS: Closed FROM: maechler@stat.math.ethz.ch in S, paste(....., collapse = string) always returns ONE string (a character vector of length 1), according to documentation and several examples. in R, this is not true: R> paste(rep(" ",0), collapse="...") #anything for collapse character(0) S> paste(rep(" ",0), collapse="...") #anything for collapse [1] "" Again, I think R is more logical than S here, but it was decided that in minor cases, compatibility comes first... [ We now return "" in the zero length case. ] ---------------------------------------------------------------------- TASK: missing functionality - modelling STATUS: Open FROM: maechler@stat.math.ethz.ch aov, print.aov, summary.aov,... (!) which I really missed for teaching a few months ago. [ We'll get to this - it actually should be fun. ] ---------------------------------------------------------------------- TASK: warnings option STATUS: Open FROM: maechler@stat.math.ethz.ch which reminds me that we/I also would like something similar to S's options(warn = k) k= 0 : [default] print warnings k= -1 : do nothing (maybe append warnings to some temp-file) k= 1 : produce an error ('warning' becomes 'stop').
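For comparison, a minimal sketch of how this request looks in later releases of R, where options(warn = k) does exist but with slightly different numbering from the proposal above: a negative k ignores warnings, 0 defers them, 1 prints them as they occur, and 2 turns them into errors. The helper name noisy is hypothetical and only serves the illustration.

```r
# Hypothetical helper that raises a warning and then returns a value.
noisy <- function() {
  warning("something looks odd")
  "finished"
}

old <- options(warn = -1)   # negative k: warnings are ignored
quiet_result <- noisy()

options(warn = 2)           # k = 2: a warning is converted into an error
strict_result <- tryCatch(noisy(), error = function(e) "stopped")

options(old)                # restore whatever setting was in effect before
```

Saving the value returned by options() and passing it back at the end keeps the sketch from leaving a changed warning level behind.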
---------------------------------------------------------------------- TASK: R has no stderr STATUS: Open FROM: Friedrich.Leisch@ci.tuwien.ac.at When I invoke R like R 2>errlog I would expect error messages to go to the file errlog instead of the screen. [ We don't have standard error. This is problematic on ] [ platforms other than Unix. ] ---------------------------------------------------------------------- TASK: "print.default" fix STATUS: Open FROM: la-jassine@aix.pacwan.net When you fix print.default, please also add prefix= ---------------------------------------------------------------------- TASK: "print.default" fix STATUS: Open FROM: jlindsey@luc.ac.be print.default in S has an option, right=T, but R does not ---------------------------------------------------------------------- TASK: "postscript" fix STATUS: Open FROM: la-jassine@aix.pacwan.net postscript() also needs the options onefile, print.it, and append (even if they are not supported yet it would be nice if the arguments could be accepted and ignored). [ I added these as arguments, but they have no effect. ] ---------------------------------------------------------------------- TASK: task scheduling STATUS: Open FROM: gwhite@cabot.bio.dfo.ca More generally, the range of things that can be done in R would be greater if there was a simple scheduling mechanism. Is there a way to have a specific function invoked just before the command prompt returns after a function? Such a function could be used to run save(...) or check for various external cues (update of a file's timestamp) to control an analysis. I doubt it would make sense to have full context switching in R, but perhaps save() could be done in a way that would allow it to be used even in a long calculation under some timer control. I expect the user would need to provide a list of the data objects that need to be saved.
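A rough sketch of the "invoke a function before the prompt returns" idea, assuming a later release of R in which addTaskCallback() is available. The object names fit_a and fit_b, the file location, and the function name checkpoint are all hypothetical. Note that this runs only after each top-level command finishes; it is not the timer control inside a long calculation that the request also mentions.

```r
# Objects the user has listed for checkpointing (hypothetical names).
fit_a <- 1:10
fit_b <- letters[1:3]
checkpoint_file <- tempfile(fileext = ".RData")
env <- environment()   # the environment that holds the objects above

# R calls a task callback with these four arguments after each
# top-level command. Returning TRUE keeps the callback registered.
checkpoint <- function(expr, value, ok, visible) {
  save(list = c("fit_a", "fit_b"), file = checkpoint_file, envir = env)
  TRUE
}

id <- addTaskCallback(checkpoint, name = "checkpoint")

# The callback is an ordinary function, so it can also be run by hand.
still_registered <- checkpoint(NULL, NULL, TRUE, FALSE)

removeTaskCallback(id)   # unregister when checkpointing is no longer wanted
```

The same callback could check a file's timestamp instead of calling save(), which covers the "external cues" case.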
---------------------------------------------------------------------- TASK: Inf numerics STATUS: Open FROM: plummer@iarc.fr Could we have an Inf object in R? I would find it useful. [ Sigh. I wish we had designed this in. ] [ It will be a pain to ADD. ] ---------------------------------------------------------------------- TASK: Auto-save STATUS: Open FROM: > BTW: How about putting auto-save-workspace on the task list? > Or just a manual save.work() currently, you can lose quite a > bit of work to an unexpected segfault. (And q()+restart is > cumbersome, esp. if you need to reattach subsetted dataframes, etc.) Perhaps call it save.image() instead and use save(list = ls(), file = ".RData") as was suggested some time ago? (Whatever the result is, it needs to go in the FAQ, which goes into great length about that under R data can get lost when a crash occurs, but does not say how to save them ...) ---------------------------------------------------------------------- TASK: "chisquare.test" problem STATUS: Open FROM: Can you change the explicit "cat" statement in the "chisquare.test" function which insists on writing to the screen even when the output is redirected to a variable? (Using "htest" class as in "t.test" function.) [ Should we switch to the library one. ] ---------------------------------------------------------------------- TASK: Graphics inconsistencies STATUS: Openish FROM: Bill.Venables@adelaide.edu.au While transferring some old S-code I came across some minor inconsistencies between R and S that are probably more nuisance value than they would take to fix. I report them here for reference, (but not in any campaigning mood, of course...) 1. No frame() command in R and so no graceful way to clear a plotting screen. (Or is there?) [ Added ] 2. There is a dev.off() function, but no other dev.xxx functions. (The dev.xxx group are S-PLUS and not vanilla S, by the way.) There is no graphics.off() function. [ Long-term project ] 3. 
If dfr is a data frame with components "x", "y" and some others then points(dfr) uses dfr as an xy-list in S but not in R. If there is some non-numeric component it actually fails in R. This may be S being a bit inconsistent, but the behaviour is different. [ Fixed? ] 4. The plotting marks are a bit gappy in R and even the ones that are there do not correspond to their S counterparts. Here is a little function to make a wall chart showing the gaps: [ We now have all the S symbols and a new set of R ones. ] show.marks <- function() { if(!exists(".Device") || is.null(.Device)) x11() plot(1, type="n", axes=F, xlab="", ylab="") oldpar <- par() par(usr = c(-0.01, 5.01, -0.01, 5.01), pty = "s") for(i in 0:18) { x <- 1/2 + (i %% 5) y <- 4.5 - (1/2 + (i %/% 5)) points(x + 1/5, y - 1/5, pch = i, cex = 3) text(x - 1/5, y + 1/5, i, adj = 0.5, cex = 1.5) } abline(h = 1:5 - 0.5, lty = 1) segments(0:5, rep(0.5, 5), 0:5, rep(4.5, 5)) par(oldpar) invisible() } 5. In S you may extend a list by assigning to a new component. For example if lis has components "x" and "y", only, you can extend it by assigning to lis$z, lis["z"] or lis[, "z"] (the last if it is also a data frame). In R only the first of these works; the others give a "subscript out of bounds" error. (This may have been discussed while I was not paying attention, in which case I apologize.) [ Fixed in 0.50. ] ---------------------------------------------------------------------- TASK: Function pointer access STATUS: Open FROM: I want to report two problems with the Fortran code of R. 1) Configure does not adapt GETSYMBOLS.in if the Fortran Compiler does not add underscores to the symbol names. 2) There is a name conflict if the Fortran Compiler does not add underscores because there exists a Fortran function FMIN and a C function fmin(). Thus the name of the Fortran FMIN should be changed. [ This is fixed I think. ] Currently I am rewriting my robust location-scale code in C. 
I intend to make this new code available as a library once a standard for such libraries has been agreed upon. As I would like to allow prospective users to experiment with private psi/chi functions I need access to the hash table of available function pointers. Is it possible that you insert a function into dotcode.c that contains the code fragment from lines 482 to 495 and returns a function pointer? ---------------------------------------------------------------------- TASK: Partial string matching STATUS: Open FROM: Is there an existing partial string match function which could be used in place of pstrmatch in subset.c??? If not can pstrmatch take on the functions of all partial match functions? ---------------------------------------------------------------------- Post 0.49 Additions ---------------------------------------------------------------------- TASK: Name Attributes on Calls STATUS: Closed (almost) FROM: A call with tagged arguments is something like a list, the tags can be used to access elements, but the names attribute is absent, until the call is coerced to a list. (Attempting to set the names() causes evaluation. Changing "list" to "blipblop" causes an 'Error: couldn't find function "blipblop"' at that point.) > j<-substitute(list(a=1, b=2)) > j list(a = 1, b = 2) > j$b [1] 2 > names(j) NULL > names(j)<-NULL > j [[1]] [1] 1 [[2]] [1] 2 [At least under SunOS this is fixed. RG] [However, 'names(j) <- NULL' has no effect in R, but does in S. MM] ---------------------------------------------------------------------- TASK: String NAs Via the Back Door. STATUS: Open FROM: Ok, the right solution seems to be names(as.list(j)), but then we run into some other fun with NA's... Shouldn't the real NA print without quotes? > ch[1]<-paste("N","A",sep="") > is.na(ch) [1] FALSE FALSE FALSE > ch [1] "NA" "a" "b" > ch[1]=="NA" [1] TRUE > ch[1]<-"NA" > is.na(ch) [1] TRUE FALSE FALSE [ We need a real NA. 
At present there is confusion between ] [ the string "NA" and the NA value for strings. One solution ] [ would be to use R_NilValue to indicate the missing string ] [ value, and let NA be just an ordinary string in all cases. ] [ This would be incompatible with S, but still an improvement. ] ---------------------------------------------------------------------- TASK: Directory Structure STATUS: Closed FROM: + Friedrich + Paul Gilbert > Regarding the location of data for libraries it might be easier if > everything for one library is included in one subdirectory. At least > it would certainly be easier to clean-up, which I like to do every few > years. Thus the code file, data, and any compiled code would be in > one subdirectory under $RHOME/library. Like library/
/ library/
/data library/
/exec (scripts and or binaries which only make sense for the add-on) library/
/funs library/
/help library/
/html library/
/objs (*.so) ??? > I realize this means a small change to the way libraries are now > found, but in the end I think it would be much cleaner. I think the changes would not be too hard, and we need to do something about the directory structure anyway. Actually, I think if R&R ok'ed something like that, Fritz and I would take a look. (In a way, I NEED to do something like that anyway, because I promised it for making an official Debian package ...) Would it mean that we also employ the S library/section concept? ---------------------------------------------------------------------- TASK: Startup Processing STATUS: Open FROM: The x11() window can be a nuisance to have popping up at startup (esp. on small screens) when you're not working with graphics. However, currently you can't get rid of it without modifying the systemwide Rprofile. Current logic is: Run $RHOME/library/Rprofile if ./.Rprofile exists run it else if $HOME/.Rprofile exists run that endif I think it should be Run $RHOME/library/Rsetup if ./.Rprofile exists run it else if $HOME/.Rprofile exists run that else if $RHOME/library/Rprofile exists run that endif i.e. essential system initialisation goes in Rsetup, the rest in Rprofile, which can be overridden by the user. Currently, the line if(interactive()) x11() is the candidate to move from one to the other. BTW, it really should read if(interactive() && getenv("DISPLAY")!="") x11() [BTW2: getenv() implemented using system()? is that really necessary?] >> I more or less agree, BUT: I'd like (in the future) to have the system-wide Rprofile searched in a site-specific location as well (similar to Emacs, following the idea of keeping the distribution and the site-specific things apart). So it would be system-wide Rsetup (which should basically be platform-specific stuff, cause otherwise it could go into base as well?) 
if .Rprofile exists run it else if ~/.Rprofile exists run it else if Rprofile exists on the default library search path, run it and that search path could e.g. specify all `library' trees with a compile-time default of ~/lib/R:/usr/local/lib/R/site:/usr/local/lib/R/${version} and settable at run time via e.g. the environment variable R_PATH. ---------------------------------------------------------------------- TASK: Old Unfixed Problems STATUS: Open FROM: I noticed the following problems (all already reported, but not in TASKS). * TASKS.OLD has Btw, here's another way to produce a segfault with admittedly nonsense code: R> x <- 1:5 R> dimnames(x)[1,2] <- NULL Segmentation fault [ Hmmm. This seems to have gone away. I get the error ] [ message "Error: incorrect number of subscripts on array". ] [ Verified by Rossini ...] On my Linux system, I still get the segfault. Perhaps others could check that? * File permissions in data should be 644. * In src/unix/system.c, one `Rdata' should be `RData' (d -> D). * The documentation for the noncentral chisquare distribution is not quite correct: rnchisq does not exist, the existing functions have x, df and the noncentrality parameter as args, and the distribution function should be pnchisq(x, df, lambda) = exp(-lambda / 2) * sum_{r=0}^\infty \frac{lambda^r}{2^r r!} pchisq(x, df + 2r) (semiTeX notation only, sorry). ---------------------------------------------------------------------- TASK: New Problems STATUS: Closed FROM: New minor remarks: * The documentation for `image' still has the old order z, x, y. * Perhaps one should add `par(ask = T)' in the image demo? * Perhaps one should save the original value of par() at the beginning of the graphics demo, and restore that at its end (s.t. typically asking is turned off again).
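The last remark (save par() at the start of a demo and restore it at the end) can be sketched as follows. This assumes a later release of R in which par(no.readonly = TRUE) and pdf(NULL) are available; the wrapper name graphics_demo is hypothetical, and mfrow stands in for whatever parameters the demo changes (ask would be handled the same way).

```r
# Hypothetical demo wrapper: save the settable graphics parameters on
# entry and restore them on exit, even if the demo stops with an error.
graphics_demo <- function() {
  oldpar <- par(no.readonly = TRUE)
  on.exit(par(oldpar))
  par(mfrow = c(1, 2))            # a change the demo makes for itself
  plot(1:10)
  image(matrix(1:12, 3, 4))
  invisible(NULL)
}

pdf(NULL)                 # null PDF device, so no file or window appears
before <- par("mfrow")
graphics_demo()
after <- par("mfrow")     # same as 'before' once the demo has returned
dev.off()
```

Using on.exit() rather than a final par(oldpar) call means the restore also happens when the demo is interrupted part way through.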
---------------------------------------------------------------------- TASK: Multiplatform Support STATUS: Open FROM: I've modified the "$RHOME/bin/R" and "$RHOME/cmd/filename" so that you can use the same directories for multiple machines. That is, machines running various flavors of UNIX can access the same directories. The modified structure adds the directories $RHOME/bin/$OSTYPE/ $RHOME/lib/$OSTYPE/ to hold the machine specific binaries. For instance, here the $RHOME directory contains two subdirectories, $RHOME/bin/solaris/ $RHOME/bin/sunos4/ which each hold the appropriate R.binary file. These two modified functions assume that the environment variable $OSTYPE is appropriately set, as is done automatically by the shell tcsh. If it is not set, the directory names collapse to the original values, $RHOME/bin/ and $RHOME/lib/ To use them, create the appropriate directories and place the correct binaries therein. ( Note that the makefiles will not do this automatically!) Then replace $RHOME/bin/R and $RHOME/cmd/filename with the modified ones. ---------------------------------------------------------------------- TASK: Platform Independence STATUS: Open FROM: Friedrich.Leisch@ci.tuwien.ac.at IMHO we should definitely have platform-dirs for everything that's possibly platform-dependent ... resulting in something like /
/ e.g. for R code and /
// e.g. for exec and dynload-objects. for exec there's a problem though, as some exec's are shell/perl/whatever-scripts and *should* work on any platform ... ---------------------------------------------------------------------- TASK: Poly STATUS: Open FROM: PS1. There was also `poly' function in your snapshot WORK tree ... do you already have a final version of that? ---------------------------------------------------------------------- TASK: Naming with Numeric Values and "unlist" STATUS: Open FROM: R> l <- list("11" = 1:5) R> l $11 [1] 1 2 3 4 5 R> unlist(l) 111 112 113 114 115 1 2 3 4 5 [ Bug or feature ? ] ---------------------------------------------------------------------- TASK: all.names needed STATUS: Open FROM: I could not find the all.names function in R so I created the enclosed. Comments, criticisms, or changes to a one-liner by creating nested anonymous functions are welcome. I'll try to work out a corresponding all.vars function. ### $Id: TASKS,v 1.3 1997/11/11 07:58:05 maechler Exp $ ### Some replacement functions that are missing in R ### Determine all the names (symbols) occuring in an object. ### This is probably grossly inefficient. all.names <- function (x) { if (mode(x) == "symbol") return(as.character(x)) if (length(x) == 0) return(NULL) if (is.recursive(x)) return(unlist(lapply(as.list(x), all.names))) character(0) } ---------------------------------------------------------------------- TASK: "sys.function" problem STATUS: Open FROM: I attempted to create a recursive anonymous function to be called within another function. You may want to stop reading for a bit and consider how that would be done. That is, how do you recursively call a function that has never been assigned a name? OK, you're back. You probably came up with a better solution than I did but I used (sys.function())(arg) to do the recursion. 
The piece of code looks like flist <- (function(x) { if (mode(x) == "call") { if (x[[1]] == as.name("/")) return(c(sys.function()(x[[2]]), sys.function()(x[[3]]))) if (x[[1]] == as.name("(")) # for R return(sys.function()(x[[2]])) } if (mode(x) == "(") return(sys.function()(x[[2]])) # for S list(x) })(getGroupsFormula(data, form, ...)[[2]]) ## I know it's horribly obscure. Blame Bill Venables for teaching me this. Regrettably, it doesn't work in R. Using the debugger one finds that sys.function() returns the function being called the first time through but the second time through it returns NULL. Is this a bug or a feature? ---------------------------------------------------------------------- TASK: Matrix multiply problems STATUS: Open FROM: Both of these used to work and seem useful and harmless (and work in S): R> matrix(1,ncol=1)%*%c(1,2) Error in matrix(1, ncol = 1) %*% c(1, 2) : non-conformable arguments R> matrix(1,ncol=1)*(1:2) Error: dim<- length of dims do not match the length of object ---------------------------------------------------------------------- TASK: "update" comments and fixes STATUS: Open FROM: 1. To make update() work with a new formula for glms, change the first line of the glm() function from call <- sys.call() to call<-match.call() (this means that the formula component of the returned call is labelled so that update can find it) 2. update.lm doesn't do anything with its weights= argument Add if (!missing(weights)) call$weights<-substitute(weights) Similarly, to get update to work properly on glms you need a lot more of these if statements (see update.glm at the end of the message). 3. update.lm evaluates its arguments in the wrong frame. It creates a modified version of the original call and evaluates it in sys.frame(sys.parent()). If update.lm is called directly this is correct, but if it is called via update() the correct frame is sys.frame(sys.parent(2)). 
Worse still, if it is called by NextMethod() from another update.foo() the correct frame is still higher up the list. My solution (a bit ugly) is to move up the list of enclosing calls checking at each stage to see if the call is NextMethod, update or an update method. It can be seen at the end of update.glm at the bottom of this message, and something of this sort needs to be added to other update methods. update.glm<-function (glm.obj, formula, data, weights, subset, na.action, offset, family, x) { call <- glm.obj$call if (!missing(formula)) call$formula <- update.formula(call$formula, formula) if (!missing(data)) call$data <- substitute(data) if (!missing(subset)) call$subset <- substitute(subset) if (!missing(na.action)) call$na.action <- substitute(na.action) if (!missing(weights)) call$weights <- substitute(weights) if (!missing(offset)) call$offset <- substitute(offset) if (!missing(family)) call$family <- substitute(family) if (!missing(x)) call$x <- substitute(x) notparent <- c("NextMethod", "update", methods(update)) for (i in 1:(1+sys.parent())) { parent <- sys.call(-i)[[1]] if (is.null(parent)) break if (is.na(match(as.character(parent), notparent))) break } eval(call, sys.frame(-i)) } ---------------------------------------------------------------------- TASK: Wisdom STATUS: Open FROM: Some of the "eternal truths" about the S language are: - every object has a mode obtainable by mode(object) [ok] - every object has a length obtainable by length(object) [ok] - every object can be coerced to a list of the same length [not yet, even for expression()s (and functions)] One can imagine that code that messes around with functions and other expressions in R will break fairly quickly when these conditions do not hold. I don't know how much work would be involved in patching over these differences between R and S but I suspect it would not be a trivial undertaking. 
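The three "eternal truths" can be checked mechanically. The following is only a sketch, written for a later release of R; the object list and the name checks are made up for the illustration. It reports mode(), length() and the length after as.list() for a vector, a call, an expression and a function.

```r
# One object of each kind, including language objects.
objs <- list(
  vector     = c(1.5, 2.5),
  call       = quote(f(x, y)),
  expression = expression(a + b, c),
  fun        = function(x, y) x + y
)

# mode(), length(), and the length after coercion to a list.
checks <- data.frame(
  mode        = sapply(objs, mode),
  length      = sapply(objs, length),
  list_length = sapply(objs, function(o) length(as.list(o)))
)
print(checks)
```

In current R the first two truths hold for all four objects. The third holds for the vector, the call and the expression, but not for the function: its length() is 1, while as.list() returns the two formal arguments plus the body.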
---------------------------------------------------------------------- TASK: frametools STATUS: Open FROM: The following three functions are designed to make manipulation of dataframes easier. I won't write detailed docs just now, but if you follow the example below, you should get the general picture. Comments are welcome, esp. re. naming conventions. Note that these functions are definitely not portable to S because they rely on R's scoping rules. Not that difficult to fix, though: The nm vector and the "parsing" functions need to get assigned to (evaluation) frame 1 (the "expression frame" of S), and preferably removed at exit. data(airquality) aq<-airquality[1:10,] select.frame(aq,Ozone:Temp) subset.frame(aq,Ozone>20) modify.frame(aq,ratio=Ozone/Temp) Notice that in modify.frame(), any *new* variable must appear as a tag, not as the result of an assignment, i.e.: modify.frame(aq,Ozone<-log(Ozone)) works as expected modify.frame(aq,lOzone<-log(Ozone)) does not. This is mainly because it was tricky to figure out what part of a left hand side constitutes a new variable to be created (note that indexing could be involved). So assignments to non-existing variables just create them as local variables within the function. Making a virtue out of necessity, that might actually be considered a feature... ---------------------------------------- "select.frame" <- function (dfr, ...) { subst.call <- function(e) { if (length(e) > 1) for (i in 2:length(e)) e[[i]] <- subst.expr(e[[i]]) e } subst.expr <- function(e) { if (is.call(e)) subst.call(e) else match.expr(e) } match.expr <- function(e) { n <- match(as.character(e), nm) if (is.na(n)) e else n } nm <- names(dfr) e <- substitute(c(...)) dfr[, eval(subst.expr(e))] } "modify.frame" <- function (dfr, ...) 
{ nm <- names(dfr) e <- substitute(list(...)) if (length(e) < 2) return(dfr) subst.call <- function(e) { if (length(e) > 1) for (i in 2:length(e)) e[[i]] <- subst.expr(e[[i]]) substitute(e) } subst.expr <- function(e) { if (is.call(e)) subst.call(e) else match.expr(e) } match.expr <- function(e) { if (is.na(n <- match(as.character(e), nm))) if (is.atomic(e)) e else substitute(e) else substitute(dfr[, n]) } tags <- names(as.list(e)) for (i in 2:length(e)) { ee <- subst.expr(e[[i]]) r <- eval(ee) if (!is.na(tags[i])) { if (is.na(n <- match(as.character(tags[i]), nm))) { n <- length(nm) + 1 dfr[[n]] <- numeric(nrow(dfr)) names(dfr)[n] <- tags[i] nm <- names(dfr) } dfr[[tags[i]]][] <- r } } dfr } "subset.frame" <- function (dfr, expr) { nm <- names(dfr) e <- substitute(expr) subst.call <- function(e) { if (length(e) > 1) for (i in 2:length(e)) e[[i]] <- subst.expr(e[[i]]) e } subst.expr <- function(e) { if (is.call(e)) subst.call(e) else match.expr(e) } match.expr <- function(e) { if (is.na(n <- match(as.character(e), nm))) e else dfr[, n] } r <- eval(subst.expr(e)) r <- r & !is.na(r) dfr[r, ] } ---------------------------------------------------------------------- TASK: General Problems STATUS: Open FROM: 1. A gentle reminder that the default has not been changed for saving .RData in batch mode (as was promised). 2. The degrees of freedom for the null deviance in glm are wrong when some observations are weighted out. This can give silly answers, for example when applying anova. The number of weighted out observations should be subtracted, as in other df calculations. 3. The null deviance itself is wrong in glm when an offset is used. It can be smaller than that when variables are added to the model! 4. R gave a segmentation fault when I tried to fit a model with 49 factor levels in glm (using R -v4). All these glm problems were with poisson. 5. R still does not read my environmental variables to set memory size. Suggestions: 1. 
d, p, q, and r functions for inverse Gauss and Laplace distributions. 2. Add a fifth function for continuous distributions, the hazard function, h. For example, ht <- function(...) dt(...)/(1-pt(...)) is the Student t hazard function. For writing likelihood functions, these would be much faster in C than R and some such as Weibull can be simplified. 3. Add the five functions for three parameter distributions such as generalized F, extreme value, etc., Box-Cox,... (I have the densities, cumulative, and hazard as R functions.) 4. Philippe Lambert and I have d and p functions working in R for the four-parameter stable family by inverting the characteristic function with a Fourier transform (requires C code). S-plus only has the r function for stables. ---------------------------------------------------------------------- TASK: Generic Print STATUS: Open FROM: Paul Gilbert I have always thought that typing the name of an object generated a call to the print method for the object, however, (in 0.49) I redefined the generic print method as print <- function(x, ...) {if (is.tframe(x)) UseMethod("print.tframe") else UseMethod("print") } Now I have an object z which returns TRUE to is.tframe(z) and > class(z) [1] "ts" "tframe" Then > print(z) [1] 1981.50 2006.25 4.00 But > z Error: comparison is possible only for vector types > traceback() [1] "c(\"print.ts(structure(c(1981.5, 2006.25, 4), class = c(\\\"ts\\\", \\\"tframe\\\"\", " [2] "c(\"print(structure(c(1981.5, 2006.25, 4), class = c(\\\"ts\\\", \\\"tframe\\\"\", " This is generating a call to the class method print.ts rather than to print.tframe.ts as is done when I use print(z). If my understanding that typing the name of an object should generate a call to the print method for the object then this is a bug. Otherwise, could someone please explain to me what it does. Thanks. 
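The behaviour reported above is consistent with ordinary method dispatch on the class vector c("ts", "tframe"): methods are tried in the order the classes are listed, so the "ts" method is found before any "tframe" one. A small sketch, using a hypothetical generic called describe so that no real print or ts method is involved:

```r
# Hypothetical generic with one method per class; each method records
# its own name and then passes control on with NextMethod().
describe <- function(x, ...) UseMethod("describe")
describe.ts      <- function(x, ...) c("ts", NextMethod())
describe.tframe  <- function(x, ...) c("tframe", NextMethod())
describe.default <- function(x, ...) "default"

z <- structure(c(1981.5, 2006.25, 4), class = c("ts", "tframe"))
order_as_given <- describe(z)     # the "ts" method runs first

class(z) <- c("tframe", "ts")
order_swapped <- describe(z)      # now the "tframe" method runs first
```

Under this reading, putting "tframe" first in the class vector is one way to give the tframe method precedence without redefining the generic.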
---------------------------------------------------------------------- TASK: getenv() STATUS: Open FROM: Paul Gilbert Here are two small problems I've pointed out before, but still seem to be in 0.49. 1/ getenv() should return everything, not complain missing item. ---------------------------------------------------------------------- TASK: summary.default STATUS: Closed FROM: Paul Gilbert 2/ In summary.default ... sumry[i, 2] <- if (is.object(ii)) class(ii) should be changed to ... sumry[i, 2] <- if (is.object(ii)) paste(class(ii), collapse=" ") so that it works with lists of lists. (This fix was suppose to be added to Splus 4.) [The solution is now different: cls <- class(ii) sumry[i, 2] <- if (length(cls) > 0) cls[1] else "-none-" ] ---------------------------------------------------------------------- TASK: Time Series Problems STATUS: Open FROM: Here are four problems with ts: 1/ ts matrix subscripting should support drop=F: > z<- matrix(1:10,5,2) > z <-ts(z) > z[,1,drop=F] Error in [.ts(z, , 1, drop = F) : unused argument to function [ok] 2/ == and other comparisons with non-ts matrices should work: > z <- matrix( 1:10,5,2) > ts(z) Time-Series: Start = c(1, 1) End = c(5, 1) Frequency = 1 [,1] [,2] [1,] 1 6 [2,] 2 7 [3,] 3 8 [4,] 4 9 [5,] 5 10 > z == ts(z) Error: invalid time series parameters specified 3/ The generic functions start and end need default methods to return a result for matrices as previously and in S. The following seems to work. start.default <- function (x) start(ts(x)) end.default <- function (x) end(ts(x)) 4/ In the function start.ts (and in end.ts) ts[1] in the last line is not defined. Perhaps I am missing something? 
start.ts function (x) { ts.eps <- .Options$ts.eps if (is.null(ts.eps)) ts.eps <- 1e-06 tsp <- attr(as.ts(x), "tsp") is <- tsp[1] * tsp[3] if (abs(is - round(is)) < ts.eps) { is <- floor(tsp[1]) fs <- floor(tsp[3] * (tsp[1] - is) + 0.001) c(is, fs + 1) } else ts[1] } ---------------------------------------------------------------------- TASK: Recycling problems STATUS: Open FROM: Paul Gilbert In R 0.49 comparison of logic matrices with & and | seems to sometimes generate false warning messages about longer object length is not a multiple of shorter object length. I have not been able to isolate the exact circumstances. ---------------------------------------------------------------------- TASK: Generic "write" function STATUS: Open FROM: Following my posting of a write.table() function, Martin suggested that one could have a generic write() function and special methods for e.g. time series, data frames, etc. Well, a month has passed since ... What does everyone think? Is it a good idea, or would write.table() be enough? If we think that it is not enough, which arguments should the write methods typically allow? What about write.xxx (x, # object file = # filename, default stdout append = # obvious sep = # obvious eol = # end of line char ...) ??? On the other hand, it seems clear that something like write.table() is nice, and what it should do. But what about classes other than data.frame? Note that S has a write(.) function which would be our write.default(.) your write.table would be our write.data.frame The only addition would be a 'write.matrix' which would be 'like' write.data.frame, the only problem being that 'matrix' is not a class (yet). [Note that in S4, everything has a class; I'm voting for matrices to have a class in R ..] write.default could 'despatch' to write.matrix if x is a matrix. 
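
The proposal above can be sketched as an S3 generic with one method per
class. This is only an illustration: the generic is called write_obj
(a hypothetical name, chosen so that base write() is not masked), and
the argument list follows the file/append/sep/eol suggestion in the
message. Present-day R dispatches on the implicit class "matrix", so no
explicit hand-off from the default method is needed in this sketch.

```r
# Hypothetical generic; 'write_obj' is not an existing R function.
write_obj <- function(x, file = "", ...) UseMethod("write_obj")

# Default: write the elements on one line, separated by 'sep'.
write_obj.default <- function(x, file = "", append = FALSE, sep = " ", ...) {
  cat(paste(x, collapse = sep), "\n", sep = "", file = file, append = append)
  invisible(x)
}

# Data frames: delegate to write.table(), which already exists in R.
write_obj.data.frame <- function(x, file = "", append = FALSE,
                                 sep = " ", eol = "\n", ...) {
  write.table(x, file = file, append = append, sep = sep, eol = eol,
              quote = FALSE, ...)
  invisible(x)
}

# Matrices: treated 'like' data frames, as suggested in the discussion.
write_obj.matrix <- function(x, file = "", ...) {
  write_obj.data.frame(as.data.frame(x), file = file, ...)
}

# Usage: with file = "" the output goes to the console.
write_obj(data.frame(a = 1:2, b = c("u", "v")))
```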
---------------------------------------------------------------------- TASK: Comparison with NA and Zero-Length Vectors STATUS: Open FROM: + Thomas: Any comparison with NULL generates an error Error: comparison is possible only for vector types whereas in S(-PLUS) it gives NA, which seems more sensible. Along similar lines, comparison with a length 0 vector returns logical(0) in R but NA in S. Martin: Isn't logical(0) more logical than NA ? I agree that it would be best (convenience) if 'NULL==1' returned the same as 'numeric(0)==1'. At the moment, I don't see why compatibility with S should be important here: if( NULL == anything) or, e.g., if( numeric(0) == numeric(0) ) give an error anyway, i.e., you have to test for length 0 _anyway_ in the cases where one comparison argument may have zero length. Thomas: I didn't (previously) make any comment on this -- I only said that NA was more logical than an error message. However, the advantage of returning NA is that NA | TRUE is TRUE, NA & FALSE is FALSE, which doesn't happen with logical(0). Also, from a compatibility point of view one of them is tested with is.na(), the other with length(), so it can matter which one you use. Of course no-one should deliberately write code where it matters, but these things happen. It seems in fact that logical(0) | TRUE causes R to freeze (R0.49, sparc solaris). Robert: Well, we thought logical(0) & T should return logical(0) logical(0) | T should return logical(0) already we have NA | T returns T and NA & T returns NA ---------------------------------------------------------------------- TASK: Modules STATUS: Open FROM: I came across a paper on scheme module design that may be under consideration for rnrs -- I'm a bit hazy on that. At any rate, it is at http://www.cs.princeton.edu/~blume/modules.dvi. I haven't read it carefully yet, but it is fairly heavily influenced by SML but doesn't go too far overboard (well maybe a bit). 
----------------------------------------------------------------------
May 1.
----------------------------------------------------------------------
TASK: "abline" incompatibility
STATUS: Closed; fixed, though it is uncertain why (Aug 6, 97).
FROM:

I found a small difference in behavior between R and S.

In R-0.49:

  > a
  [1] 12 23 22 34 44 54 55 70 78
  > plot(a)
  > abline(lsfit(seq(1,len=length(a)), a))
  Error: no applicable method for "coefficients"

In S (from AT&T '92) the same call draws the fitted line without
error.  So I think we need to define a function as follows:

  coefficients.default <- function(x) x$coef

----------------------------------------------------------------------
TASK: Legend problems
STATUS: Open
FROM:

When legend is used, the box around it has the line type of the last
call to lines or plot instead of always being solid.

----------------------------------------------------------------------
TASK: "rnorm" change
STATUS: Open
FROM: Paul Gilbert

For some reason I cannot determine, the function rnorm seems to be
returning different values in R 0.49 than it did in R 0.16.1 (in
Linux ELF).  The function runif is unchanged.

	[ I believe I changed the underlying generator.		]
	[ I was worried about behavior in the extreme tails.	]
	[ Should we change back again?				]

----------------------------------------------------------------------
TASK: "formula" problems
STATUS: Open
FROM:

Several bugs (no solutions, yet).  These might be well known.

1) If one does, e.g.,
	mymod <- lm(y ~ x); formula(mymod)
   then one does not get back the formula (one gets, Error: invalid
   formula)

   CLOSED (Aug 6, 97, RG).

2) if x is of mode numeric, then the model formula
	mymod <- lm(y ~ x + x^2)
   is not processed as S would do it.  The model is fitted ignoring
   the x^2 term; however, mymod$call includes the x^2 term.  This seems
   to be a bug (or maybe a feature) in applying model formula operators
   to numeric quantities.  I expect (from experience with S) that x^2
   will be interpreted as a math operator.
Whatever the right thing to do is, it needs to be documented. ---------------------------------------------------------------------- TASK: formula problems STATUS: Open FROM: Mike Meyer writes: > Several bugs (no solutions, yet). These might be well known. > 1) If one does, e.g., mymod <- lm(y ~ x); formula(mymod) > then one does not get back the formula (one gets, Error: invalid formula) Yep. Seems that we need a formula.lm<-function(x)formula(x$terms) > 2) if x is of mode numeric, then the model formula > mymod <- lm(y ~ x + x^2) > is not processed as S would do it. The model is fit ignoring the x^2 term, We had that topic a while back. I think it was concluded that it is a feature, because mixing model formulas and arithmetic ditto is bad practice. (I don't have any strong feeling about this, personally. As long as R won't introduce those awful Helmert contrasts as default...) ---------------------------------------------------------------------- TASK: formula problems STATUS: Open FROM: Peter Dalgaard writes: > > > > 2) if x is of mode numeric, then the model formula > > mymod <- lm(y ~ x + x^2) > > is not processed as S would do it. The model is fit[ted] > > ignoring the x^2 term... > > We had that topic a while back. I think it was concluded that > it is a feature, because mixing model formulas and arithmetic > ditto is bad practice. I don't recall we did, but in any case I'd like to re-open it. There is an anomaly in the way : and ^ terms are handled in the sense that the logical and useful thing is obvious but does not happen. Let me give an example. Suppose a and b are factors, x and y are not. A term such as (a + b + x + y)^2 should be expanded out binomial fashion, coefficients stripped away and the remaining products treated as : products. Then S copes with terms like a:a, a:b and a:x fine, even x:y is handled by having it generate a column of xy-products, as it should. 
But a term such as x:x does not generate a column of x-squares, it is merely removed as it would be if it were a factor. This is a complete anomaly, and one that I don't think would be hard or dangerous for R to rectify. Indeed it would be very useful to generate a complete second degree regression in three variables using y ~ (1 + x1 + x2 + x3)^2. As it is now it generates linear and product terms only and omits the powers. Go figure. > (I don't have any strong feeling about this, personally. As > long as R won't introduce those awful Helmert contrasts as > default...) Ah, the Helmert contrasts b\^ete noir. For ANOVA the contrast matrix used is mostly irrelevant. For regression models I agree, treatment contrasts would be generally more easily interpreted. I presume the reason they were used at all is because if you have equal replication of everything the Helmert contrasts give you a model matrix with orthogonal columns, so all estimates are uncorrelated. Whenever do you get equal replication, though? ---------------------------------------------------------------------- TASK: formula problems STATUS: Open FROM: Bill Venables writes: > A term such as (a + b + x + y)^2 should be expanded out binomial > fashion, coefficients stripped away and the remaining products > treated as : products. Then S copes with terms like a:a, a:b and > a:x fine, even x:y is handled by having it generate a column of > xy-products, as it should. I tend to agree. > Ah, the Helmert contrasts b\^ete noir. For ANOVA the contrast > matrix used is mostly irrelevant. For regression models I agree, > treatment contrasts would be generally more easily interpreted. Understatement of the year... 
Last time I bumped into them, it took me and a colleague more than an
hour to figure out how to interpret the regression coefficients, and,
I may add, the solution was *not* what the white book said it was
(it's not just one level minus the average of the preceding, the
parameter is also scaled by the reciprocal of the level number).
[There's a split-second solution -- see below -- but we sort of didn't
think of it at the time...]

> I presume the reason they were used at all is because if you have
> equal replication of everything the Helmert contrasts give you a
> model matrix with orthogonal columns, so all estimates are
> uncorrelated.  Whenever do you get equal replication, though?

Hardly ever.  Actually, I thought that the point was not so much
orthogonality, but the successive testing (A=B, A=B=C, A=B=C=D,...).
However that is just plainly wrong outside of balanced ANOVAs.
And, even in that case, once the first two levels differ, the rest
of the coefficients lose all meaning.

----------------------------------------------------------------------
TASK: formula problems
STATUS: Open
FROM:

We also need to fix formula.default.  At the moment it only looks for
x$formula.  Other standard places to keep a formula are x$call$formula
and x$terms.  How about

formula.default <- function (x)
{
    if (!is.null(x$formula))
        return(eval(x$formula))
    if (!is.null(x$call$formula))
        return(eval(x$call$formula))
    if (!is.null(x$terms))
        return(x$terms)
    switch(typeof(x),
           NULL = structure(NULL, class = "formula"),
           character = formula(eval(parse(text = x)[[1]])),
           call = eval(x),
           stop("invalid formula"))
}

One disadvantage to extracting the formula from $terms instead of
$call$formula is that in S a terms object is not a formula.  On the
other hand it doesn't really matter as long as people use the
formula() function.
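
A small sketch of the three lookup places named above, run against an
lm fit in a present-day R.  The data frame d is made up purely for
illustration; nothing here is taken from the 0.49 sources.

```r
# Hypothetical data, for illustration only.
d <- data.frame(x = 1:10,
                y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2))
fit <- lm(y ~ x, data = d)

# The places a fitted model may keep its formula.
from_call    <- eval(fit$call$formula)  # the formula as typed in the call
from_terms   <- formula(fit$terms)      # recovered from the terms object
from_generic <- formula(fit)            # what users are expected to call
```

All three should describe the same model, which is the point made
above: as long as callers go through formula(), where the formula is
stored matters little.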
---------------------------------------------------------------------- TASK: formula problems STATUS: Open FROM: Peter Dalgaard writes: > Bill Venables writes: > > Ah, the Helmert contrasts b\^ete noir. For ANOVA the contrast > > matrix used is mostly irrelevant. For regression models I agree, > > treatment contrasts would be generally more easily interpreted. > Understatement of the year... Last time I bumped into them, it took me > and a colleague more than an hour to figure out how to interpret the > regression coefficients, and, I may add, the solution was *not* what > the white book said it was (it's not just one level minus the average > of the preceding, the parameter is also scaled by the reciprocal of > the level number). [There's a split-second solution -- see below -- > but we sort of didn't think of it at the time...] A few weeks ago I gave a fairly detailed discussion of how to relate contrast matrices and their interpretation in s-news. I could re-issue it or post it to people if that was their wish. There is also to be an extended discussion of the subject in V&R2 due out in July, with a further elaboration to appear (real soon now...) in the online complements. > > I presume the reason they were used at all is because if you have > > equal replication of everything the Helmert contrasts give you a > > model matrix with orthogonal columns, so all estimates are > > uncorrelated. Whenever do you get equal replication, though? > > Hardly ever. Actually, I though that the point was not so much > ortogonality, but the successive testing (A=B, A=B=C, A=B=C=D,...). > However that is just plainly wrong outside of balanced ANOVA's. > And, even in that case, once the first two levels differ, the rest > of the coefficients lose all meaning. Indeed. That's why I tended to discount that possibility myself. Here is a contrast matrix generator I sometimes prefer to use that corresponds to testing A=B, B=C, C=D, ... 
Of course the contrasts are not mutually orthogonal.  How it works is
left as a little puzzle.  (This function works in S.  I haven't tested
it in R, but it should work if lower.tri() is available.)

contr.sdif <- function(n, contrasts = T)
{
    # contrasts generator giving `successive difference' contrasts.
    if(is.numeric(n) && length(n) == 1) {
        if(n %% 1 || n < 2)
            stop("invalid number of levels")
        lab <- as.character(seq(n))
    }
    else {
        lab <- as.character(n)
        n <- length(n)
        if(n < 2)
            stop("invalid number of levels")
    }
    if(contrasts) {
        contr <- col(matrix(nrow = n, ncol = n - 1))
        upper.tri <- !lower.tri(contr)
        contr[upper.tri] <- contr[upper.tri] - n
        structure(contr/n,
                  dimnames = list(lab,
                      paste(lab[-1], lab[ - n], sep = "-")))
    }
    else structure(diag(n), dimnames = list(lab, lab))
}

> contr.sdif(4)
    2-1  3-2   4-3
1 -0.75 -0.5 -0.25
2  0.25 -0.5 -0.25
3  0.25  0.5 -0.25
4  0.25  0.5  0.75

----------------------------------------------------------------------
TASK: startup processing
STATUS: Open
FROM:

2) Again, along the lines of something that S does that is actually
useful.  In S you can set the S_FIRST environment variable and have
this used as the equivalent of the R .Rprofile file.  Might it be a
good idea to allow an R_FIRST environment variable as well?  That way
I could set user-specific preferences that apply no matter what
directory I am working in.

----------------------------------------------------------------------
TASK: Function Argument Naming
STATUS: Open
FROM:

There is a problem with 'default argument evaluation' when I use an
existing function name as argument name :

sintest <- function(x, y = 2, sin= sin(pi/4))
{
    ## Purpose: Test of "default argument evaluation"
    ## -------- Fails for R-0.49.  Martin Maechler, Date: 9 May 97.
    c(x=x, y=y, sin=sin)
}

## R-0.49:
R> sintest(1)
##> Error in sintest(1) : recursive default argument reference

## S-plus 3.4 (being 100% ok):
S> sintest(1)
        x         y       sin
        1         2 0.7071068
Warning messages:
  looking for function "sin", ignored local non-function in: sintest(1)

-------------------------------------------------------

The following shows bugs, both in R and S:

sintest2 <- function(x ,y = 2)
{
    ## Purpose: Test of "default argument evaluation"
    ## -------- Fails for S-plus 3.4.  Martin Maechler, Date: 9 May 97.
    c(x=x, y=y, sin=sin)
}

R> sintest2(1)
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]

---------------

is almost okay, the buglet being that the names have been dropped
from the list.  But watch this:

S> sintest2(1)
function(x = 1, y = 2, sin.x) sin2 = .Internal(sin(x), "do_math", T, 109)

--- returning a function
((now we see, why S's way of treating functions as lists sometimes
  badly sucks)).

----------------------------------------------------------------------
TASK: Function argument naming
STATUS: Open
FROM:

Martin Maechler wrote:

[ Stuff above. ]

For better or worse, S and R allow default expressions to contain
references to variables that are (or rather may be) created in the
function body, so (in R and Splus)

> x<-1
> f<-function(a,b=x) { if (a) x<-2; b}
> f()
Error: Argument "a" is missing, with no default
> f(T)
[1] 2
> f(F)
[1] 1

More traditional lexical scoping would make the reference to x in the
default always be global, but lots of code would break.  I think we're
stuck with this behavior as a corollary to the way S wants default
arguments to work.

Actually S is a bit inconsistent in its error message -- if you have a
non-function argument it gives the same message as R,

> g<-function(x=x) x
> g()
Error in g(): Recursive occurrence of default argument "x"
Dumped

Also in R's lexical scoping you probably do want the argument name to
shadow any outer definitions if you want to be able to define default
arguments that are recursive functions, e.g.
> g<-function(n, nfac=function(x) { if (x <= 1) 1 else nfac(x-1)*x }) nfac(n); > g(6) [1] 720 ---------------------------------------------------------------------- TASK: Adding List Elements by Name STATUS: Open FROM: This works in Splus: > x<-list() > x[["f"]]<-1 > zz<-"g" > x[[zz]]<-2 In R both variants fail unless the name is already on the list. The first one can be replaced by x$f, but there's seems to be no substitute for the other one (oh yes I found one, but it's not fit to print!). This comes up if you e.g. want to create a variable in a data frame with a name given by a character string. ---------------------------------------------------------------------- TASK: Bug in "approx" STATUS: Open FROM: When the function approx is called with the argument rule=2, one gets the error message Error: NAs in foreign function call (arg 6) Besides, the meaning of rule=1 or rule=2 is opposite to that described in the help text and used in S-plus. For example, in R: R> approx(1:10,2:11,xout=5:15,rule=1) $x [1] 5 6 7 8 9 10 11 12 13 14 15 $y [1] 6 7 8 9 10 11 11 11 11 11 11 R> approx(1:10,2:11,xout=5:15,rule=2) Error: NAs in foreign function call (arg 6) but in S-plus: > approx(1:10,2:11,xout=5:15,rule=1) $x: [1] 5 6 7 8 9 10 11 12 13 14 15 $y: [1] 6 7 8 9 10 11 NA NA NA NA NA > approx(1:10,2:11,xout=5:15,rule=2) $x: [1] 5 6 7 8 9 10 11 12 13 14 15 $y: [1] 6 7 8 9 10 11 11 11 11 11 11 The reason for this bug can be found in the last lines of the code of approx: if (rule == 1) { low <- y[1] high <- y[length(x)] } else if (rule == 2) { low <- NA high <- low } else stop("invalid extrapolation rule in approx") y <- .C("approx", as.double(x), as.double(y), length(x), xout = as.double(xout), length(xout), as.double(low), as.double(high))$xout return(list(x = xout, y = y)) If (rule == 2) the values of low and high are set to NA. 
Immediately afterwards, the foreign function "approx" is called with
these values, leading to the error

	Error: NAs in foreign function call (arg 6)

To obtain the same behavior as in S-plus (and as in the help text) the
commands for (rule == 1) and (rule == 2) have to be exchanged.

----------------------------------------------------------------------
TASK: Names and unlisting (bug/feature)
STATUS: Open
FROM: hornik@ci.tuwien.ac.at

R> l <- list("11" = 1:5)
R> l
$11
[1] 1 2 3 4 5

R> unlist(l)
111 112 113 114 115
  1   2   3   4   5

I ran into this over the weekend ...

----------------------------------------------------------------------
TASK: "all.names" function needed
STATUS: Open
FROM:

I could not find the all.names function in R so I created the
enclosed.  Comments, criticisms, or changes to a one-liner by creating
nested anonymous functions are welcome.  I'll try to work out a
corresponding all.vars function.

### $Id: TASKS,v 1.3 1997/11/11 07:58:05 maechler Exp $
### Some replacement functions that are missing in R

### Determine all the names (symbols) occurring in an object.
### This is probably grossly inefficient.
all.names <- function (x)
{
    if (mode(x) == "symbol") return(as.character(x))
    if (length(x) == 0) return(NULL)
    if (is.recursive(x)) return(unlist(lapply(as.list(x), all.names)))
    character(0)
}

### Local variables:
### mode: R
### End:

And from Martin:

Doug, your 'all.names' function [wow, I didn't even know it in S..]
seems to have been written with S in your mind; you are exactly
demonstrating some of the 'fine' differences between R & S

 >1> all.names <- function (x)
 >2> {
 >3>     if (mode(x) == "symbol") return(as.character(x))
 >4>     if (length(x) == 0) return(NULL)
 >5>     if (is.recursive(x)) return(unlist(lapply(as.list(x), all.names)))
 >6>     character(0)
 >7> }

1) length(x) is not always defined in R; e.g. it is NOT for functions.
   --> Delete line 4

2) functions are NOT lists and cannot be coerced to lists, which makes
   line 5 fail for function objects.
As a matter of fact, I once was also a bit interested in this.  At
that time, 'args' did not yet exist, and I wanted to abuse 'as.list'
for functions.  I was told that this is 'bad' (functions have nothing
to do with lists; in R, functions can have a defining environment
going with them...) and 'args' is now provided, which helps my most
immediate need.

In short: I don't think you can define an all.names(.) function which
works with function arguments, in the current version of R.

----------------------------------------------------------------------
TASK: "sys.function" problem
STATUS: Open
FROM: +

This was either an attempt to get an early lead in some future
obfuscating R contest or a way of getting around the different scoping
rules of R and S.

I attempted to create a recursive anonymous function to be called
within another function.  You may want to stop reading for a bit and
consider how that would be done.  That is, how do you recursively call
a function that has never been assigned a name?

OK, you're back.  You probably came up with a better solution than I
did but I used (sys.function())(arg) to do the recursion.  The piece
of code looks like

    flist <- (function(x) {
        if (mode(x) == "call") {
            if (x[[1]] == as.name("/"))
                return(c(sys.function()(x[[2]]), sys.function()(x[[3]])))
            if (x[[1]] == as.name("("))    # for R
                return(sys.function()(x[[2]]))
        }
        if (mode(x) == "(")
            return(sys.function()(x[[2]])) # for S
        list(x)
    })(getGroupsFormula(data, form, ...)[[2]])
    ## I know it's horribly obscure.
    ## Blame Bill Venables for teaching me this.

Regrettably, it doesn't work in R.  Using the debugger one finds that
sys.function() returns the function being called the first time
through but the second time through it returns NULL.  Is this a bug or
a feature?
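
For comparison, a sketch of the same recursion written with Recall(),
which base R provides for calling the currently executing function
without naming it.  This is not the code from the report: the input
expression is made up, and identical() replaces == for comparing
symbols.  It is offered as one possible workaround, not as an answer
to the sys.function() question.

```r
# Split a nested "/" expression into its leaves, recursing anonymously.
parts <- (function(x) {
    if (is.call(x) && identical(x[[1]], as.name("/")))
        return(c(Recall(x[[2]]), Recall(x[[3]])))
    if (is.call(x) && identical(x[[1]], as.name("(")))
        return(Recall(x[[2]]))      # strip parentheses
    list(x)                         # a leaf: wrap it in a list
})(quote(a / (b / c)))

# The leaves as character strings.
leaf_names <- vapply(parts, as.character, "")
```

Note that the documentation for Recall() warns against passing it as a
function argument (for example to the apply family); calling it
directly, as here, avoids that case.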
---------------------------------------------------------------------- TASK: Matrix multiply problems STATUS: Open FROM: Both of these used to work and seem useful and harmless: R> matrix(1,ncol=1)%*%c(1,2) Error in matrix(1, ncol = 1) %*% c(1, 2) : non-conformable arguments R> matrix(1,ncol=1)*(1:2) Error: dim<- length of dims do not match the length of object ---------------------------------------------------------------------- TASK: "write" function STATUS: Open? FROM: Following my posting of a write.table() function, Martin suggested that one could have a generic write() function and special methods for e.g. time series, data frames, etc. Well, a month has passed since ... What does everyone think? Is it a good idea, or would write.table() be enough? If we think that it is not enough, which arguments should the write methods typically allow? What about write.xxx (x, # object file = # filename, default stdout append = # obvious sep = # obvious eol = # end of line char ...) ??? On the other hand, it seems clear that something like write.table() is nice, and what it should do. But what about classes other than data.frame? Martin Maechler: Note that S has a write(.) function which would be our write.default(.) your write.table would be our write.data.frame The only addition would be a 'write.matrix' which would be 'like' write.data.frame, the only problem being that 'matrix' is not a class (yet). [Note that in S4, everything has a class; I'm voting for matrices to have a class in R ..] write.default could 'despatch' to write.matrix if x is a matrix. ---------------------------------------------------------------------- TASK: "ls.print" problem STATUS: Closed, Aug 6, 97 RG. FROM: ls.print produces error that I don't seem to be able to trace. Output of the commands as follows: (hyeung is a 24x2 matrix of data) ------------------------------------------------- > summary(hyeung) x.1 x.2 Min. : 28.0 Min. 
: 10.0 1st Qu.: 72.0 1st Qu.: 87.5 Median : 86.5 Median : 92.5 Mean : 81.0 Mean : 82.5 3rd Qu.: 97.0 3rd Qu.:100.0 Max. :100.0 Max. :100.0 > summary(lsfit(hyeung[,1],hyeung[,2])) Length Class Mode coef 2 -none- numeric residuals 24 -none- numeric intercept 1 -none- logical qr 6 -none- list > ls.print(lsfit(hyeung[,1],hyeung[,2])) trace: ls.print(lsfit(hyeung[, 1], hyeung[, 2])) Error: missing value in ``n1 : n2'' ---------------------------------------------------------------------- TASK: Comparisons with zero length things STATUS: Open FROM: Thomas: Any comparison with NULL generates an error Error: comparison is possible only for vector types whereas in S(-PLUS) it gives NA, which seems more sensible. Along similar lines, comparison with a length 0 vector returns logical(0) in R but NA in S. Martin: Isn't logical(0) more logical than NA ? I agree that it would be best (convenience) if 'NULL==1' returned the same as 'numeric(0)==1'. At the moment, I don't see why compatibility with S should be important here: if( NULL == anything) or, e.g., if( numeric(0) == numeric(0) ) give an error anyway, i.e., you have to test for length 0 _anyway_ in the cases where one comparison argument may have zero length. Thomas: I didn't (previously) make any comment on this -- I only said that NA was more logical than an error message. However, the advantage of returning NA is that NA | TRUE is TRUE, NA & FALSE is FALSE, which doesn't happen with logical(0). Also, from a compatibility point of view one of them is tested with is.na(), the other with length(), so it can matter which one you use. Of course no-one should deliberately write code where it matters, but these things happen. It seems in fact that logical(0) | TRUE causes R to freeze (R0.49, sparc solaris). 
Robert: Well, we thought logical(0) & T should return logical(0) logical(0) | T should return logical(0) already we have NA | T returns T and NA & T returns NA Martin: Ok, given the above argument, returning NA is logical, too. However, I'd also argue that logical(0) | TRUE -> TRUE logical(0) & FALSE -> FALSE logical(0) & TRUE -> logical(0) logical(0) | FALSE -> logical(0) ThLu> It seems in fact that logical(0) | TRUE causes R to freeze ThLu> (R0.49, sparc solaris). Yes: > logical(0) | TRUE Warning in logical(0) | TRUE : longer object length is not a multiple of shorter object length Floating exception ~~~~~~~~~~~~~~~~~~ [and 'core' dump] ---------------------------------------------------------------------- TASK: Misc STATUS: Open FROM: Here are two small problems I've pointed out before, but still seem to be in 0.49. 1/ getenv() should return everything, not complain missing item. 2/ In summary.default ... sumry[i, 2] <- if (is.object(ii)) class(ii) should be changed to ... sumry[i, 2] <- if (is.object(ii)) paste(class(ii), collapse=" ") so that it works with lists of lists. (This fix was suppose to be added to Splus 4.) ---------------------------------------------------------------------- TASK: Method lookup for "print" STATUS: Open FROM: I have always thought that typing the name of an object generated a call to the print method for the object, however, (in 0.49) I redefined the generic print method as print <- function(x, ...) 
{if (is.tframe(x)) UseMethod("print.tframe") else UseMethod("print") } Now I have an object z which returns TRUE to is.tframe(z) and > class(z) [1] "ts" "tframe" Then > print(z) [1] 1981.50 2006.25 4.00 But > z Error: comparison is possible only for vector types > traceback() [1] "c(\"print.ts(structure(c(1981.5, 2006.25, 4), class = c(\\\"ts\\\", \\\"tframe\\\"\", " [2] "c(\"print(structure(c(1981.5, 2006.25, 4), class = c(\\\"ts\\\", \\\"tframe\\\"\", " This is generating a call to the class method print.ts rather than to print.tframe.ts as is done when I use print(z). If my understanding that typing the name of an object should generate a call to the print method for the object then this is a bug. Otherwise, could someone please explain to me what it does. Thanks. ---------------------------------------------------------------------- TASK: Time Series Problems STATUS: Open FROM: Here are four problems with ts: 1/ ts matrix subscripting should support drop=F: > z<- matrix(1:10,5,2) > z <-ts(z) > z[,1,drop=F] Error in [.ts(z, , 1, drop = F) : unused argument to function 2/ == and other comparisons with non-ts matrices should work: > z <- matrix( 1:10,5,2) > ts(z) Time-Series: Start = c(1, 1) End = c(5, 1) Frequency = 1 [,1] [,2] [1,] 1 6 [2,] 2 7 [3,] 3 8 [4,] 4 9 [5,] 5 10 > z == ts(z) Error: invalid time series parameters specified > 3/ The generic functions start and end need default methods to return a result for matrices as previously and in S. The following seems to work. start.default <- function (x) start(ts(x)) end.default <- function (x) end(ts(x)) 4/ In the function start.ts (and in end.ts) ts[1] in the last line is not defined. Perhaps I am missing something? 
start.ts function (x) { ts.eps <- .Options$ts.eps if (is.null(ts.eps)) ts.eps <- 1e-06 tsp <- attr(as.ts(x), "tsp") is <- tsp[1] * tsp[3] if (abs(is - round(is)) < ts.eps) { is <- floor(tsp[1]) fs <- floor(tsp[3] * (tsp[1] - is) + 0.001) c(is, fs + 1) } else ts[1] } ---------------------------------------------------------------------- TASK: False warnings STATUS: Open FROM: In R 0.49 comparison of logic matrices with & and | seems to sometimes generate false warning messages about longer object length is not a multiple of shorter object length. I have not been able to isolate the exact circumstances. ---------------------------------------------------------------------- TASK: ISO-latin1 characters STATUS: Open FROM: There seems to be a problem in print.default with some ISO-latin1 characters (the chars AFTER ASCII in western Europe...) if they appear in strings. (no problem if they are part of a function comment, see below). Some of the characters lead to 4 character Hex-codes being printed instead: "" ## ^u prints as "0xFB" If you use the funny characters in comments of functions, they are stored and printed properly. HOWEVER: In a few rare cases, the strings are not even PARSED properly; the line 'ISOdiv <- ..' below gives a SYNTAX error. The following code shows the symptoms : -- ONLY if the e-mail between here and your place is 8-bit clean! -- (else: get it ftp://ftp.stat.math.ethz.ch/U/maechler/R/string-test.R ) frenchquotes <- "«...»" ## <<...>> frenchquotes Umlaute <- "äöü ÄÖÜ" # = "a "o "u "A "O "U Umlaute #- only the last one is not printed properly... A.accents <- "àáâãäåæ ÀÁÂÃÄÅÆ" # `a 'a ^a "a oa ae `A 'A ^A "A oA AE A.accents EI.accents <- "ÈÉÊËÌÍÎÏ èéêëìíîï" EI.accents O.accents <- "ÒÓÔÕÖØòóôõöø" O.accents U.accents <- "ÙÚÛÜÝùúûüý" U.accents ISO24x <- "¡¢£¤¥¦§ ¨©ª«¬­®¯" #octal 241..257 ISO26x <- "°±²³´µ¶· ¸¹º»¼½¾¿" #octal 260..277 ##--- THIS IS a Problem: It gives a SYNTAX error ! 
ISOdiv <- "×÷ Ðð Ññ Þþ ßÿ" ##-- One of these characters even was producing the same as 'q()' !! aa_ function(x) { x^2 ##- frenchquotes <- "«...»" ## <<...>> ##- Umlaute <- "äöü ÄÖÜ" # = "a "o "u "A "O "U ##- A.accents <- "àáâãäåæ ÀÁÂÃÄÅÆ" # `a 'a ^a "a oa ae `A 'A ^A "A oA AE ##- EI.accents <- "ÈÉÊËÌÍÎÏ èéêëìíîï" ##- O.accents <- "ÒÓÔÕÖØòóôõöø" ##- U.accents <- "ÙÚÛÜÝùúûüý" ##- ##- ISO24x <- "¡¢£¤¥¦§ ¨©ª«¬­®¯" #octal 241..257 ##- ISO26x <- "°±²³´µ¶· ¸¹º»¼½¾¿" #octal 260..277 ##- ISOdiv <- "×÷ Ðð" ##-- OMITTED further: SYNTAX error !! } aa ---------------------------------------------------------------------- TASK: String length problems STATUS: Closed ? FROM: This is not a cat(.) but a string storing/parsing problem: nchar("\n \n") # gives 2 instead of 3 [ Hmmm. Was this typed to readline I wonder? There it ] [ seems that ^L must be escaped with ^V. Using the ANSI ] [ \f will now produce a literal formfeed. Indeed, using ] [ any of the ANSI C escapes will work. ] [ However, using the '^L' (emacs C-q C-l) in a string is still dropped: > "\n \n" [1] "\n\n" ---------------------------------------------------------------------- TASK: Fontend STATUS: Open FROM: Some time ago there was the suggestion to add a PLATFORM subdir level for bin (and eventually the library subdirs with `binaries'), and the idea to have the shell wrapper automagically call the right binary. I mentioned that one might be able to use the shell variables OSTYPE and HOSTTYPE for that, noticing however that e.g on my Debian Linux/GNU/ix86 bash tcsh OSTYPE Linux linux HOSTTYPE i386 i386-linux Hmm ... It seems (a colleague just checked that) that these variables are not POSIX either, and hence I'd say rather useless for our purpose. 
In the absence of a reliable run-time possibility to determine the
current platform, it seems to be natural to use `platform' as obtained
at compile-time for possibly distinguishing the various binaries etc,
and leave it at the discretion of the sysadmin to ensure that the R
script in the path calls the right binary.

If I am missing something obvious, please let me know.
----------------------------------------------------------------------
TASK:   Resetting Graphical Parameters
STATUS: Open
FROM:

BY THE WAY: It would be nice to be able to say

    par(reset = TRUE)

(or similar) for resetting all the graphical parameters to their
(device-dependent) default values.

[ This will require a little work.  Perhaps the easiest thing  ]
[ to do is to add a new device driver call "reset".  This would ]
[ be best left to the multiple active device driver project.   ]
----------------------------------------------------------------------
TASK:   .Options not working in all cases
STATUS: Open
FROM:

The .Options vector had been introduced a while ago after my
suggestion (see Ross's E-mail below).

.Options$digits is used by default in several print methods
(eg print.lm), however, deparse(.) e.g., uses options()$width,
and not .Options$width.

Another problem is that .Options is still not in the documentation
(on-line help).  Before one could add it there, we'd need ``the specs''.

I think the (at least my) idea was that options(.) queries or sets
elements in the .Options list and all functions -- including the
internal ones -- use .Options.  As far as I know, this is what S does.
Currently, this is NOT the case in R.

Ross said a while ago:

>>> From: Ross Ihaka
>>> Date: Wed, 11 Dec 1996 17:10:59 +1300 (NZDT)
>>> To: Martin Maechler
>>> Cc: R-testers mailing list
>>> Subject: R-alpha: options() and .Options -- ?
Ross> Martin Maechler writes:
  >> This is not a bug report, rather than some remarks as a
  >> "request for comments":
  >>
  >> It is clear that   options( foo = bar )
  >> sets the option and also updates the builtin() .Options list :
  >>
  >> > options(myopt = pi)
  >> > .Options$my
  >> [1] 3.14159265
  >>
  >> In S-plus, it was (is) possible to use .Options locally in a function
  >> frame in order to just affect some options during evaluation of that
  >> function.

Ross> I have made some changes so that such local assignments to .Options
Ross> will work.  The down side is that such assignments will also work at
Ross> top level with the changes shadowing the real system options.

Ross> This also may be ok.  It would have the advantage that options would
Ross> then be preserved from session to session.  Is this a good idea or a
Ross> bad idea?
----------------------------------------------------------------------
TASK:
STATUS:
FROM:
----------------------------------------------------------------------
TASK:
STATUS:
FROM:
----------------------------------------------------------------------
TASK:
STATUS:
FROM:
----------------------------------------------------------------------
TASK:
STATUS:
FROM:
----------------------------------------------------------------------
TASK:
STATUS:
FROM:
----------------------------------------------------------------------
TASK:
STATUS:
FROM:
----------------------------------------------------------------------