Note: This file hasn't been updated in quite a while. We'll be looking into
      *real* bug-tracking systems which should make it obsolete. -pd

Update: We now have the bug-tracking system, but need to walk through the 
	list and see if anything here is still relevant.

			   THE R TASK LIST

	     ``Somebody, somebody has to, you see ...''
		  The Cat in the Hat Comes Back.


----------------------------------------------------------------------
TASK:	Multiple Graphics Device Drivers
STATUS:	Open
FROM:	Everyone
	R needs to have multiple active device drivers and a means for
	copying pictures from one device to another, etc. etc.
	[ This is a medium-sized task.	It would be most useful to     ]
	[ do this in conjunction with moving to an event driven model. ]
	[ Greg Warnes has written some code which maintains a device   ]
	[ "display list".  How much memory this might devour in the    ]
	[ multiple device case is an open question.  There is also     ]
	[ the question of what to do about the graphics parameters.    ]
	[ Should each device maintain a complete "par" state, or       ]
	[ should some parameters (like col, lty, font ...) be global?  ]
	[ Could a user have any memory of the last values in effect    ]
	[ for a driver which had been idle for a while?		       ]
	[ This is just about to hit the top of the list.	       ]
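	For concreteness, the kind of workflow this should enable, using
	the dev.xxx names from S-PLUS (a sketch of intended usage, not of
	anything currently implemented):
		x11()				# a screen device
		postscript("fig.ps")		# a second, file-based device
		dev.set(dev.prev())		# make the screen current again
		plot(1:10)
		dev.copy(which = dev.next())	# replay the plot on the postscript device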

----------------------------------------------------------------------
TASK:	complex gamma and log gamma function not implemented
STATUS:	Open
FROM:	R@stat.auckland.ac.nz
	[ This is quite low priority.  Complain if you need it.	 ]
	[ The Fullerton library has complex gamma function code. ]

----------------------------------------------------------------------
TASK:	solution of complex linear systems
STATUS:	Open
FROM:	R@stat.auckland.ac.nz
	[ Really just a matter of grabbing the correct linpack code. ]
	[ How general do we want to be here ...			     ]

----------------------------------------------------------------------
TASK:	"nlm" documentation inaccuracies
STATUS:	Open
FROM:	jlindsey@luc.ac.be
	The help page for nlm is still titled "minimize" although
	the contents have been updated.  Also, when an illegal
	value is fed to nlm, the error message mentions msg
	instead of print.level.
	[ The documentation looks ok.  The function needs to be ]
	[ rewritten so that it uses derivative information.	]

----------------------------------------------------------------------
TASK:	"data.entry" problems
STATUS:	Open
FROM:	p.dalgaard@kubism.ku.dk
	The as.character problem in de() is probably better fixed even
	though it does make lists out of frames.
	There is also no way to change a data value to NA in data.entry, etc.
	... earlier message ...
	(Peter Dalgaard) data.entry et al do not seem to have been
	adjusted for the new data frame structure.  This is actually
	a problem where a list is passed where a vector of character
	strings is expected.  To fix it change
		snames <- substitute(list(...))[-1]
	to
		snames <- as.character(substitute(list(...))[-1])
	However, there needs to be a look at the de... code.  When
	a data frame is edited it is returned as a list.  This can
	be cured with judicious use of "data.frame".
	[ The indicated change has been made, but other changes ]
	[ are needed.						]

----------------------------------------------------------------------
TASK:	"x11" printcmd
STATUS: Open
FROM:	maechler@stat.math.ethz.ch
	There is in theory a "printcmd" argument to x11, which
	is ignored.  Make it do something.

----------------------------------------------------------------------
TASK:	"source" requires a terminating newline on EOF
STATUS:	Open
FROM:	Kurt.Hornik@ci.tuwien.ac.at
	source() fails in many cases where a file has no final
	newline.  (R&R, sorry for being ridiculously nasty about
	things that don't work for files without a final newline.
	I have Emacs' next-line-add-newlines set to nil ...)
	This seems to be a problem with parse() in src/main/source.c
	in combination with the code in gram.y ...
	I know this is NOT something to quickly fix over the weekend.
	Please simply put it into your PROJECTS file.
	[ This is actually a syntax error according to the R grammar ]
	[ but maybe we can do something.			     ]
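	In the meantime a user-level workaround is to reparse the file
	contents with the newlines restored (a sketch, assuming parse()
	accepts a text= argument as in S):
		txt <- scan("myfile.R", what = "", sep = "\n")
		eval(parse(text = paste(txt, collapse = "\n")))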

----------------------------------------------------------------------
TASK:	help file ALIAS() and LINK() constructions
STATUS:	Closed
FROM:	R@stat.auckland.ac.nz
	How do we know which file to LINK to?  There needs to be a step
	which fills in the file name on the basis of all ALIAS
	declarations.
	[ A preprocessing step is needed.  First we build a table  ]
	[ of aliases and corresponding file names.  Then we pass   ]
	[ through the files building the correct LINK references.  ]

	[ The new Rdconv and build-html... solve `everything' ]
----------------------------------------------------------------------
TASK:	"paste" problem
STATUS:	Closed
FROM:	maechler@stat.math.ethz.ch
	in S,
		paste(....., collapse = string)
	always returns ONE string  (a character vector of length 1),
	according to documentation and several examples.
	in R, this is not true:
		R> paste(rep(" ",0), collapse="...") #anything for collapse
		character(0)
		S> paste(rep(" ",0), collapse="...") #anything for collapse
		[1] ""
	Again, I think	R  is more logical than S here, but it was decided
	that in minor cases, compatibility comes first...
	[ We now return "" in the zero length case. ]

----------------------------------------------------------------------
TASK:	missing functionality - modelling
STATUS:	Open
FROM:	maechler@stat.math.ethz.ch
	aov,  print.aov, summary.aov,...  (!)
	which I really missed for teaching  a few months ago.
	[ We'll get to this - it actually should be fun. ]

----------------------------------------------------------------------
TASK:	warnings option
STATUS:	Open
FROM:	maechler@stat.math.ethz.ch
	which reminds me that we/I would also like something similar to S's
	options(warn = k)
	k=  0 : [default]  print warnings
	k= -1 :	do nothing (maybe append warnings to some temp-file)
	k=  1 : produce an error ('warning' becomes 'stop').
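	Intended usage under the proposed semantics would be, e.g.,
		options(warn = -1)	# discard warnings (or spool them to a file)
		options(warn = 1)	# escalate each warning to an error
	(a sketch of the proposal above; none of this is implemented yet).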

----------------------------------------------------------------------
TASK:	R has no stderr
STATUS:	Open
FROM:	Friedrich.Leisch@ci.tuwien.ac.at
	When I invoke R like
		R 2>errlog
	I would expect error messages to go to the file errlog
	instead of the screen.
	[ We don't have standard error.	 This is problematic on ]
	[ platforms other than Unix.				]


----------------------------------------------------------------------
TASK:	"print.default" fix
STATUS:	Open
FROM:	la-jassine@aix.pacwan.net
	When you fix print.default, please also add prefix=

----------------------------------------------------------------------
TASK:	"print.default" fix
STATUS:	Open
FROM:	jlindsey@luc.ac.be
	print.default in S has an option, right=T, but R does not

----------------------------------------------------------------------
TASK:	"postscript" fix
STATUS:	Open
FROM:	la-jassine@aix.pacwan.net
	postscript() also needs the options onefile, print.it, and
	append (even if they are not supported yet it would be nice if
	the arguments could be accepted and ignored).
	[ I added these as arguments, but they have no effect. ]

----------------------------------------------------------------------
TASK:	task scheduling
STATUS:	Open
FROM:	gwhite@cabot.bio.dfo.ca
	More generally, the range of things that can be done in R would
	be greater if there were a simple scheduling mechanism.	 Is
	there a way to have a specific function invoked just before the
	command prompt returns after a function?  Such a function could
	be used to run save(...) or check for various external cues
	(update of a file's timestamp) to control an analysis.
	I doubt it would make sense to have full context switching in
	R,  but perhaps save() could be done in a way that would allow
	it to be used even in a long calculation under some timer
	control.  I expect the user would need to provide a list of the
	data objects that need to be saved.
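	A hypothetical sketch of such a hook; the name .PromptTask and
	the convention that it runs before each prompt are assumptions,
	not an existing mechanism:
		.PromptTask <- function() {
			# e.g. auto-save the global workspace between commands
			save(list = ls(pos = 1), file = ".RData.auto")
		}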

----------------------------------------------------------------------
TASK:	Inf numerics
STATUS:	Closed
FROM:	plummer@iarc.fr
	Could we have an Inf object in R? I would find it useful.
	[ On systems using IEEE arithmetic, the builtin Inf and NaN	]
	[ values are recognised and used as of 0.62.			]

----------------------------------------------------------------------
TASK:	Auto-save
STATUS:	Closed
FROM:	<p.dalgaard@kubism.ku.dk> <hornik@ci.tuwien.ac.at>
	> BTW: How about putting auto-save-workspace on the task list?
	> Or just a manual save.work() currently, you can lose quite a
	> bit of work to an unexpected segfault. (And q()+restart is
	> cumbersome, esp. if you need to reattach subsetted dataframes,
	> etc.) 
	Perhaps call it save.image() instead and use
		save(list = ls(), file = ".RData")
	as was suggested some time ago?
	(Whatever the result is, it needs to go in the FAQ, which goes
	to great lengths about the fact that under R data can get lost
	when a crash occurs, but does not say how to save them ...)
	[ Added save.image() as above.  And yes, it's been in the FAQ   ]
	[ for quite some time now ...					]

----------------------------------------------------------------------
TASK:	"chisquare.test" problem
STATUS:	Closed
FROM:	<venkat@biosta.mskcc.org>
	Can you change the explicit "cat" statement in the
	"chisquare.test" function which insists on writing to the
	screen even when the output is redirected to a variable? (Using
	"htest" class as in "t.test" function.)
	[ Replaced by chisq.test(), formerly in the ctest package,      ]
	[ which properly returns an object of class "htest".		]
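	[ For reference, an "htest" object is just a list with the	]
	[ conventional components and a class attribute, so the value	]
	[ returned looks roughly like this (the lowercase names stand	]
	[ for the computed quantities):					]
		structure(list(statistic = c("X-squared" = stat),
			       parameter = c(df = df),
			       p.value = pval,
			       method = "Pearson's Chi-square test",
			       data.name = dname),
			  class = "htest")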
	
----------------------------------------------------------------------
TASK:	Graphics inconsistencies
STATUS: Closed
FROM:	Bill.Venables@adelaide.edu.au
	While transferring some old S-code I came across some minor
	inconsistencies between R and S that are probably more nuisance
	value than they would take to fix.  I report them here for
	reference, (but not in any campaigning mood, of course...)
	1. No frame() command in R and so no graceful way to clear a
	   plotting screen.  (Or is there?)
	   [ Added ]
	2. There is a dev.off() function, but no other dev.xxx functions.
	   (The dev.xxx group are S-PLUS and not vanilla S, by the way.)
	   There is no graphics.off() function.
	   [ Added in 0.62. ]
	3. If dfr is a data frame with components "x", "y" and some
	   others then points(dfr) uses dfr as an xy-list in S but not in
	   R.  If there is some non-numeric component it actually fails
	   in R.  This may be S being a bit inconsistent, but the
	   behaviour is different.
	   [ Fixed? ]
	4. The plotting marks are a bit gappy in R and even the ones
	   that are there do not correspond to their S counterparts.
	   Here is a little function to make a wall chart showing the
	   gaps:
	   [ We now have all the S symbols and a new set of R ones. ]
		show.marks <- function()
		{
		  if(!exists(".Device") || is.null(.Device)) x11()
		  plot(1, type="n", axes=F, xlab="", ylab="")
		  oldpar <- par()
		  par(usr = c(-0.01, 5.01, -0.01, 5.01), pty = "s")
		  for(i in 0:18) {
		    x <- 1/2 + (i %% 5)
		    y <- 4.5 - (1/2 + (i %/% 5))
		    points(x + 1/5, y - 1/5, pch = i, cex = 3)
		    text(x - 1/5, y + 1/5, i, adj = 0.5, cex = 1.5)
		  }
		  abline(h = 1:5 - 0.5, lty = 1)
		  segments(0:5, rep(0.5, 5), 0:5, rep(4.5, 5))
		  par(oldpar)
		  invisible()
		}
	5. In S you may extend a list by assigning to a new component.
	   For example if lis has components "x" and "y", only, you can
	   extend it by assigning to lis$z, lis["z"] or lis[, "z"] (the
	   last if it is also a data frame).  In R only the first of
	   these works; the others give a "subscript out of bounds"
	   error.  (This may have been discussed while I was not paying
	   attention, in which case I apologize.)
	   [ Fixed in 0.50. ]

----------------------------------------------------------------------
TASK:	Function pointer access
STATUS:	Open
FROM:	<schwarte@feat.mathematik.uni-essen.de>
	I want to report two problems with the Fortran code of R.
	1) Configure does not adapt GETSYMBOLS.in if the Fortran Compiler
	   does not add underscores to the symbol names.
	2) There is a name conflict if the Fortran Compiler does not add
	   underscores because there exist a Fortran function FMIN and a
	   C function fmin(). Thus the name of the Fortran FMIN should be
	   changed.
	   [ This is fixed I think. ]
	Currently I am rewriting my robust location-scale code in C. I
	intend to make this new code available as a library once a
	standard for such libraries has been agreed upon. As I would
	like to allow prospective users to experiment with private
	psi/chi functions I need access to the hash table of available
	function pointers. Is it possible that you insert a function
	into dotcode.c that contains the code fragment from lines 482
	to 495 and returns a function pointer?

----------------------------------------------------------------------
TASK:	Partial string matching
STATUS:	Open
FROM:	<R@stat.auckland.ac.nz>
	Is there an existing partial string match function which could
	be used in place of pstrmatch in subset.c???
	If not can pstrmatch take on the functions of all partial match
	functions?
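	For reference, the semantics wanted everywhere are those of
	pmatch() (assuming pmatch() is available at the R level): an
	unambiguous prefix matches, an ambiguous one does not.
		pmatch("me",  c("mean", "median"))	# NA, ambiguous
		pmatch("mea", c("mean", "median"))	# 1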

----------------------------------------------------------------------
Post 0.49 Additions
----------------------------------------------------------------------
TASK:	Name Attributes on Calls
STATUS: Closed (almost)
FROM:	<p.dalgaard@kubism.ku.dk>
	A call with tagged arguments is something like a list, the tags
	can be used to access elements, but the names attribute is absent,
	until the call is coerced to a list. (Attempting to set the names()
	causes evaluation. Changing "list" to "blipblop" causes an 'Error:
	couldn't find function "blipblop"' at that point.)

	> j<-substitute(list(a=1, b=2))
	> j
	list(a = 1, b = 2)
	> j$b
	[1] 2
	> names(j)
	NULL
	> names(j)<-NULL
	> j
	[[1]]
	[1] 1

	[[2]]
	[1] 2

	[At least under SunOS this is fixed. RG]

	[However, 'names(j) <- NULL' has no effect in R, but does in S.	 MM]

----------------------------------------------------------------------
TASK:	String NAs Via the Back Door.
STATUS:	Open
FROM:	<p.dalgaard@kubism.ku.dk>
	Ok, the right solution seems to be names(as.list(j)), but then we run
	into some other fun with NA's... Shouldn't the real NA print without
	quotes?
	> ch[1]<-paste("N","A",sep="")
	> is.na(ch)
	[1] FALSE FALSE FALSE
	> ch
	[1] "NA" "a"  "b"
	> ch[1]=="NA"
	[1] TRUE
	> ch[1]<-"NA"
	> is.na(ch)
	[1]  TRUE FALSE FALSE
	[ We need a real NA.  At present there is confusion between    ]
	[ the string "NA" and the NA value for strings.	 One solution  ]
	[ would be to use R_NilValue to indicate the missing string    ]
	[ value, and let NA be just an ordinary string in all cases.   ]
	[ This would be incompatible with S, but still an improvement. ]

----------------------------------------------------------------------
TASK:	Directory Structure
STATUS:	Closed
FROM:	<Kurt.Hornik@ci.tuwien.ac.at> + Friedrich + Paul Gilbert
	> Regarding the location of data for libraries it might be easier if
	> everything for one library is included in one subdirectory. At least
	> it would certainly be easier to clean-up, which I like to do every few
	> years.  Thus the code file, data, and any compiled code would be in
	> one subdirectory under $RHOME/library.
	Like
		library/<section>/
		library/<section>/data
		library/<section>/exec		(scripts and or binaries which
						 only make sense for the add-on)
		library/<section>/funs
		library/<section>/help
		library/<section>/html
		library/<section>/objs		(*.so)
	???
	> I realize this means a small change to the way libraries are now
	> found, but in the end I think it would be much cleaner.
	I think the changes would not be too hard, and we need to do something
	about the directory structure anyway.

	Actually, I think if R&R ok'ed something like that, Fritz and I would
	take a look.
	(In a way, I NEED to do something like that anyway, because I promised
	it for making an official Debian package ...)
	Would it mean that we also employ the S library/section concept?


----------------------------------------------------------------------
TASK:	Startup Processing
STATUS:	Open
FROM:	<p.dalgaard@kubism.ku.dk>
	The x11() window can be a nuisance to have popping up at startup (esp.
	on small screens) when you're not working with graphics. However,
	currently you can't get rid of it without modifying the systemwide
	Rprofile.
	Current logic is:
	Run $RHOME/library/Rprofile
	if ./.Rprofile exists
		run it
	else if $HOME/.Rprofile exists
		run that
	endif
	I think it should be
	Run $RHOME/library/Rsetup
	if ./.Rprofile exists
		run it
	else if $HOME/.Rprofile exists
		run that
	else if $RHOME/library/Rprofile exists
		run that
	endif
	i.e. essential system initialisation goes in Rsetup, the rest in
	Rprofile, which can be overridden by the user. Currently, the line

	if(interactive()) x11()
	is the candidate to move from one to the other. BTW, it really should read
	if(interactive() && getenv("DISPLAY")!="") x11()
	[BTW2: getenv() implemented using system()? is that really necessary?]

  >>	<Kurt.Hornik@ci.tuwien.ac.at>
	I more or less agree, BUT:
	I'd like (in the future) to have the system-wide Rprofile searched in a
	site-specific location as well (similar to Emacs, following the idea of
	keeping the distribution and the site-specific things apart).
	So it would be
		system-wide Rsetup (which should basically be platform-specific
		  stuff, because otherwise it could go into base as well?)
		if .Rprofile exists run it else
		if ~/.Rprofile exists run it else
		if Rprofile exists on the default library search path, run it
	and that search path could e.g. specify all `library' trees with a
	compile-time default of
		~/lib/R:/usr/local/lib/R/site:/usr/local/lib/R/${version}
	and settable at run time via e.g. the environment variable R_PATH.

----------------------------------------------------------------------
TASK:	Old Unfixed Problems
STATUS:	Closed
FROM:	<Kurt.Hornik@ci.tuwien.ac.at>
	I noticed the following problems (all already reported, but not in
	TASKS).
	* File permissions in data should be 644.
	* In src/unix/system.c, one `Rdata' should be `RData' (d -> D).
	* The documentation for the noncentral chisquare distribution is
	not quite correct.  (rnchisq does not exist, the existing
	functions have x, df and the noncentrality parameter as args,
	and the distribution function should be pnchisq(x, df, lambda)
	   = exp(-lambda / 2)
	     * sum_{r=0}^\infty \frac{(lambda/2)^r}{r!} pchisq(x, df + 2r)
	(semiTeX notation only, sorry).
	[ All fixed now.						]

----------------------------------------------------------------------
TASK:	New Problems
STATUS:	Closed
FROM:	<Kurt.Hornik@ci.tuwien.ac.at>
	New minor remarks:
	* The documentation for `image' still has the old order z, x, y.
	* Perhaps one should add `par(ask = T)' in the image demo?
	* Perhaps one should save the original value of par() at the
	beginning of the graphics demo, and restore that at its end
	(so that typically asking is turned off again).

----------------------------------------------------------------------
TASK:	Multiplatform Support
STATUS:	Open
FROM:	<warnes@biostat.washington.edu>
	I've modified the "$RHOME/bin/R" and "$RHOME/cmd/filename" so that you
	can use the same directories for multiple machines.  That is, machines
	running various flavors of UNIX can access the same directories.
	The modified structure adds the directories
		$RHOME/bin/$OSTYPE/
		$RHOME/lib/$OSTYPE/
	to hold the machine specific binaries.
	For instance, here the $RHOME directory contains two subdirectories,
		$RHOME/bin/solaris/
		$RHOME/bin/sunos4/
	which each hold the appropriate R.binary file.
	These two modified functions assume that the environment
	variable $OSTYPE is appropriately set, as is done automatically
	by the shell tcsh.  If it is not set, the directory names
	collapse to the original values,
		$RHOME/bin/ and $RHOME/lib/
	To use them, create the appropriate directories and place the
	correct binaries therein. ( Note that the makefiles will not do
	this automatically!)  Then replace $RHOME/bin/R and
	$RHOME/cmd/filename with the modified ones.

----------------------------------------------------------------------
TASK:	Platform Independence
STATUS:	Open
FROM:	Friedrich.Leisch@ci.tuwien.ac.at
	IMHO we should definitely have platform-dirs for everything that's
	possibly platform-dependent ... resulting in something like
		<library>/<section>/<type>
	e.g. for R code and
		<library>/<section>/<type>/<platform>
	e.g. for exec and dynload-objects.
	for exec there's a problem though, as some exec's are
	shell/perl/whatever-scripts and *should* work on any platform
	...

----------------------------------------------------------------------
TASK:	Poly
STATUS:	Open
FROM:	<Kurt.Hornik@ci.tuwien.ac.at>
	PS1.  There was also a `poly' function in your snapshot WORK tree
	... do you already have a final version of that?

----------------------------------------------------------------------
TASK:	Naming with Numeric Values and "unlist"
STATUS:	Closed
FROM:	<hornik@ci.tuwien.ac.at>
	R> l <- list("11" = 1:5)
	R> l
	$11
	[1] 1 2 3 4 5
	R> unlist(l)
	111 112 113 114 115
	  1   2	  3   4	  5
	[ The same as S does, hence deemed a feature.			]

----------------------------------------------------------------------
TASK:	all.names needed
STATUS:	Closed
FROM:	<bates@stat.wisc.edu>
	I could not find the all.names function in R so I created the
	enclosed.  Comments, criticisms, or changes to a one-liner by
	creating nested anonymous functions are welcome.  I'll try to
	work out a corresponding all.vars function.
	[ all.names() and all.vars() added in 0.61.			]
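	For example:
		all.names(expression(sin(x + y)))	# "sin" "+" "x" "y"
		all.vars(expression(sin(x + y)))	# "x" "y"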
----------------------------------------------------------------------
TASK:	"sys.function" problem
STATUS:	Open
FROM:	<bates@stat.wisc.edu>
	I attempted to create a recursive anonymous function to be called
	within another function.  You may want to stop reading for a bit and
	consider how that would be done.  That is, how do you recursively call
	a function that has never been assigned a name?
	OK, you're back.  You probably came up with a better solution than I
	did but I used (sys.function())(arg) to do the recursion.  The piece
	of code looks like
	  flist <- (function(x) {
	    if (mode(x) == "call") {
	      if (x[[1]] == as.name("/"))
		return(c(sys.function()(x[[2]]), sys.function()(x[[3]])))
	      if (x[[1]] == as.name("("))	# for R
		return(sys.function()(x[[2]]))
	    }
	    if (mode(x) == "(") return(sys.function()(x[[2]])) # for S
	    list(x)
	  })(getGroupsFormula(data, form, ...)[[2]])
	  ## I know it's horribly obscure.  Blame Bill Venables for teaching me this.
	Regrettably, it doesn't work in R.  Using the debugger one finds that
	sys.function() returns the function being called the first time
	through but the second time through it returns NULL.  Is this a bug or
	a feature?

----------------------------------------------------------------------
TASK:	"update" comments and fixes
STATUS:	Open
FROM:	<thomas@biostat.washington.edu>
	1. To make update() work with a new formula for glms, change the
	first line of the glm() function from
		call <- sys.call()
	to
		call <- match.call()
	(this means that the formula component of the returned call is
	labelled so that update can find it)

	2. update.lm doesn't do anything with its weights= argument
	Add
		if (!missing(weights))
			call$weights<-substitute(weights)
	Similarly, to get update to work properly on glms you need a lot
	more of these if statements (see update.glm at the end of the message).

	3. update.lm evaluates its arguments in the wrong frame.
	It creates a modified version of the original call and evaluates
	it in sys.frame(sys.parent()).	If update.lm is called directly
	this is correct, but if it is called via update() the correct
	frame is sys.frame(sys.parent(2)). Worse still, if it is called
	by NextMethod() from another update.foo() the correct frame is
	still higher up the list.

	My solution (a bit ugly) is to move up the list of enclosing calls
	checking at each stage to see if the call is NextMethod, update or an
	update method.	It can be seen at the end of update.glm at the bottom of
	this message, and something of this sort needs to be added to other update
	methods.

	update.glm<-function (glm.obj, formula, data, weights, subset,
		na.action, offset, family, x)
	{
		call <- glm.obj$call
		if (!missing(formula))
			call$formula <- update.formula(call$formula, formula)
		if (!missing(data))
			call$data <- substitute(data)
		if (!missing(subset))
			call$subset <- substitute(subset)
		if (!missing(na.action))
			call$na.action <- substitute(na.action)
		if (!missing(weights))
			call$weights <- substitute(weights)
		if (!missing(offset))
			call$offset <- substitute(offset)
		if (!missing(family))
			call$family <- substitute(family)
		if (!missing(x))
			call$x <- substitute(x)
		notparent <- c("NextMethod", "update", methods(update))
		for (i in 1:(1+sys.parent())) {
			parent <- sys.call(-i)[[1]]
			if (is.null(parent))
				break
			if (is.na(match(as.character(parent), notparent)))
				break
		}
		eval(call, sys.frame(-i))
	}

----------------------------------------------------------------------
TASK:	Wisdom
STATUS:	Open
FROM:	<bates@stat.wisc.edu>
	Some of the "eternal truths" about the S language are:
	 - every object has a mode obtainable by mode(object)		[ok]
	 - every object has a length obtainable by length(object)	[ok]
	 - every object can be coerced to a list of the same length
				[not yet, even for expression()s
				 (and functions)]

	One can imagine that code that messes around with functions and
	other expressions in R will break fairly quickly when these
	conditions do not hold.	 I don't know how much work would be
	involved in patching over these differences between R and S
	but I suspect it would not be a trivial undertaking.
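	A small illustration of the first two invariants (the third is
	exactly what fails in R, per the note above):
		e <- expression(x + y, sin(z))
		mode(e)		# "expression"
		length(e)	# 2
		as.list(e)	# works in S; this is the coercion R lacks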

----------------------------------------------------------------------
TASK:	frametools
STATUS:	Open
FROM:	<p.dalgaard@kubism.ku.dk>
	The following three functions are designed to make manipulation of
	dataframes easier. I won't write detailed docs just now, but if you
	follow the example below, you should get the general picture. Comments
	are welcome, esp. re. naming conventions.

	Note that these functions are definitely not portable to S because
	they rely on R's scoping rules. Not that difficult to fix, though: The
	nm vector and the "parsing" functions need to get assigned to
	(evaluation) frame 1 (the "expression frame" of S), and preferably
	removed at exit.

	data(airquality)
	aq<-airquality[1:10,]
	select.frame(aq,Ozone:Temp)
	subset.frame(aq,Ozone>20)
	modify.frame(aq,ratio=Ozone/Temp)

	Notice that in modify.frame(), any *new* variable must appear as a
	tag, not as the result of an assignment, i.e.:

	modify.frame(aq,Ozone<-log(Ozone)) works as expected
	modify.frame(aq,lOzone<-log(Ozone)) does not.

	This is mainly because it was tricky to figure out what part of a left
	hand side constitutes a new variable to be created (note that indexing
	could be involved). So assignments to non-existing variables just
	create them as local variables within the function. Making a virtue
	out of necessity, that might actually be considered a feature...

	----------------------------------------
	"select.frame" <-
	function (dfr, ...)
	{
		subst.call <- function(e) {
			if (length(e) > 1)
				for (i in 2:length(e)) e[[i]] <- subst.expr(e[[i]])
			e
		}
		subst.expr <- function(e) {
			if (is.call(e))
				subst.call(e)
			else match.expr(e)
		}
		match.expr <- function(e) {
			n <- match(as.character(e), nm)
			if (is.na(n))
				e
			else n
		}
		nm <- names(dfr)
		e <- substitute(c(...))
		dfr[, eval(subst.expr(e))]
	}
	"modify.frame" <-
	function (dfr, ...)
	{
		nm <- names(dfr)
		e <- substitute(list(...))
		if (length(e) < 2)
			return(dfr)
		subst.call <- function(e) {
			if (length(e) > 1)
				for (i in 2:length(e)) e[[i]] <- subst.expr(e[[i]])
			substitute(e)
		}
		subst.expr <- function(e) {
			if (is.call(e))
				subst.call(e)
			else match.expr(e)
		}
		match.expr <- function(e) {
			if (is.na(n <- match(as.character(e), nm)))
				if (is.atomic(e))
					e
				else substitute(e)
			else substitute(dfr[, n])
		}
		tags <- names(as.list(e))
		for (i in 2:length(e)) {
			ee <- subst.expr(e[[i]])
			r <- eval(ee)
			if (!is.na(tags[i])) {
				if (is.na(n <- match(as.character(tags[i]),
					nm))) {
					n <- length(nm) + 1
					dfr[[n]] <- numeric(nrow(dfr))
					names(dfr)[n] <- tags[i]
					nm <- names(dfr)
				}
				dfr[[tags[i]]][] <- r
			}
		}
		dfr
	}
	"subset.frame" <-
	function (dfr, expr)
	{
		nm <- names(dfr)
		e <- substitute(expr)
		subst.call <- function(e) {
			if (length(e) > 1)
				for (i in 2:length(e)) e[[i]] <- subst.expr(e[[i]])
			e
		}
		subst.expr <- function(e) {
			if (is.call(e))
				subst.call(e)
			else match.expr(e)
		}
		match.expr <- function(e) {
			if (is.na(n <- match(as.character(e), nm)))
				e
			else dfr[, n]
		}
		r <- eval(subst.expr(e))
		r <- r & !is.na(r)
		dfr[r, ]
	}

----------------------------------------------------------------------
TASK:	General Problems
STATUS:	Open
FROM:	<jlindsey@luc.ac.be>
	1. A gentle reminder that the default has not been changed for saving
	.RData in batch mode (as was promised).
	2. The degrees of freedom for the null deviance in glm are wrong when
	some observations are weighted out. This can give silly answers, for
	example when applying anova. The number of weighted out observations
	should be subtracted, as in other df calculations.
	3. The null deviance itself is wrong in glm when an offset is used. It
	can be smaller than that when variables are added to the model!
	4. R gave a segmentation fault when I tried to fit a model with 49
	factor levels in glm (using R -v4). All these glm problems were with
	poisson.
	5. R still does not read my environmental variables to set memory
	size.

	Suggestions:
	1. d, p, q, and r functions for inverse Gauss and Laplace
	distributions.
	2. Add a fifth function for continuous distributions, the hazard
	function, h. For example, ht <- function(...) dt(...)/(1-pt(...))
	is the Student t hazard function.
	For writing likelihood functions, these would be much faster in C than
	R and some such as Weibull can be simplified.
	3. Add the five functions for three parameter distributions such as
	generalized F, extreme value, etc., Box-Cox,... (I have the densities,
	cumulative, and hazard as R functions.)
	4. Philippe Lambert and I have d and p functions working in R for the
	four-parameter stable family by inverting the characteristic function
	with a Fourier transform (requires C code). S-plus only has the r
	function for stables.

----------------------------------------------------------------------
TASK:	getenv()
STATUS:	Closed
FROM:	Paul Gilbert <la-jassine@aix.pacwan.net>
	Here are two small problems I've pointed out before, but still
	seem to be in 0.49.
	1/ getenv() should return everything, not complain about a missing item.
	[ Fixed now.  In fact, at least under Unix getenv() now returns	]
	[ the whole environment, as in S.				]

----------------------------------------------------------------------
TASK:	summary.default
STATUS:	Closed
FROM:	Paul Gilbert <la-jassine@aix.pacwan.net>
	2/ In summary.default
		 ...
			sumry[i, 2] <- if (is.object(ii))
				 class(ii)
	should be changed to
		...
			sumry[i, 2] <- if (is.object(ii))
				paste(class(ii), collapse=" ")
	so that it works with lists of lists. (This fix was supposed to be
	added to Splus 4.)

	[The solution is now different:
		cls <- class(ii)
	        sumry[i, 2] <- if (length(cls) > 0) cls[1] else "-none-"
	]
----------------------------------------------------------------------
TASK:	Time Series Problems
STATUS:	Open
FROM:	<la-jassine@aix.pacwan.net>
	Here are four problems with ts:
	1/  ts matrix subscripting should support drop=F:
	> z<- matrix(1:10,5,2)
	> z <-ts(z)
	> z[,1,drop=F]
	Error in [.ts(z, , 1, drop = F) : unused argument to function
	[ok]

	2/  == and other comparisons with non-ts matrices should work:
	> z <- matrix( 1:10,5,2)
	> ts(z)
	Time-Series:
	Start = c(1, 1)
	End = c(5, 1)
	Frequency = 1
	     [,1] [,2]
	[1,]	1    6
	[2,]	2    7
	[3,]	3    8
	[4,]	4    9
	[5,]	5   10
	> z == ts(z)
	Error: invalid time series parameters specified

	3/ The generic functions start and end need default methods to
	return a result for matrices as previously and in S. The
	following seems to work.

	start.default <- function (x) start(ts(x))
	end.default   <- function (x)	end(ts(x))
	[ Added ]

	4/ In the function start.ts (and in end.ts) ts[1] in the last line
	is not defined. Perhaps I am missing something?
	start.ts
	function (x)
	{
		ts.eps <- .Options$ts.eps
		if (is.null(ts.eps))
			ts.eps <- 1e-06
		tsp <- attr(as.ts(x), "tsp")
		is <- tsp[1] * tsp[3]
		if (abs(is - round(is)) < ts.eps) {
			is <- floor(tsp[1])
			fs <- floor(tsp[3] * (tsp[1] - is) + 0.001)
			c(is, fs + 1)
		}
		else ts[1]
	}
	[ Fixed ]


----------------------------------------------------------------------
TASK:	Recycling problems
STATUS:	Open
FROM:	Paul Gilbert <la-jassine@aix.pacwan.net>
	In R 0.49, comparison of logical matrices with & and | seems
	sometimes to generate false warning messages that the longer
	object length is not a multiple of the shorter object length.
	I have not been able to isolate the exact circumstances.
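	For reference, operands of identical shape should never trigger
	the warning, which is what makes the reported messages false:
		a <- matrix(c(TRUE, FALSE, TRUE, FALSE), 2, 2)
		b <- matrix(c(TRUE, TRUE, FALSE, FALSE), 2, 2)
		a & b	# equal lengths, so no recycling is involved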

----------------------------------------------------------------------
TASK:	Modules
STATUS:	Open
FROM:	<luke@stat.umn.edu>
	I came across a paper on scheme module design that may be
	under consideration for rnrs -- I'm a bit hazy on that. At
	any rate, it is at http://www.cs.princeton.edu/~blume/modules.dvi.
	I haven't read it carefully yet, but it is fairly heavily
	influenced by SML but doesn't go too far overboard (well
	maybe a bit).

----------------------------------------------------------------------
May 1.
----------------------------------------------------------------------
TASK:	"abline" incompatibility
STATUS:	Closed; fixed, though it is uncertain why (Aug 6, 97).
FROM:	<nobu@psrc.isac.co.jp>
	I found a small difference in behavior between R and S.
	at R-0.49:
	    > a
	    [1] 12 23 22 34 44 54 55 70 78
	    > plot(a)
	    > abline(lsfit(seq(1,len=length(a)), a))
	    Error: no applicable method for "coefficients"
	in S (from AT&T '92) the result draws the coefficient line without error.
	So I think we need to define a function as follows:
	    coefficients.default <- function(x) x$coef

----------------------------------------------------------------------
TASK:	Legend problems
STATUS:	Open
FROM:	<jlindsey@luc.ac.be>
	When legend is used, the box around it has the line-type
	of the last call to lines or plot instead of solid always.
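	A minimal reproduction (the box around the legend should come
	out solid, lty = 1, regardless of the preceding calls):
		plot(1:10, type = "l", lty = 1)
		lines(10:1, lty = 2)
		legend(2, 9, legend = "dashed", lty = 2)  # box drawn with lty = 2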

----------------------------------------------------------------------
TASK:	"rnorm" change
STATUS:	Open
FROM:	Paul Gilbert <la-jassine@aix.pacwan.net>
	For some reason I cannot determine, the function rnorm
	seems to be returning different values in R 0.49 than it
	did in R 0.16.1 (in Linux ELF). The function runif is
	unchanged.
	[  I believe I changed the underlying generator.       ]
	[  I was worried about behavior in the extreme tails.  ]
	[  Should we change back again?			       ]

----------------------------------------------------------------------
TASK:	"formula" problems
STATUS:	Open
FROM:	 <mikem@stat.cmu.edu>
	Several bugs (no solutions, yet).  These might be well known.
	1) If one does, e.g.,  mymod <- lm(y ~ x); formula(mymod)
	then one does not get back the formula (one gets,
	Error: invalid formula)
	CLOSED (Aug 6, 97, RG).
	2) if x is of mode numeric, then the model formula
		mymod <- lm(y ~ x + x^2)
	is not processed as S would do it.    The model is fit
	ignoring the x^2 term, however mymod$call includes the x^2
	term.  This seems to be a bug (or maybe feature) in applying
	model formulae operators to numeric quantities.	 I expect
	(from experience with S) that x^2 will be interpreted as
	a math operator.  Whatever the right thing to do is, it
	needs to be documented.
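	For the record, the standard S idiom, which should carry over to
	R, is to shield the quadratic term from the formula operators
	with the identity function I():
		mymod <- lm(y ~ x + I(x^2))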

----------------------------------------------------------------------
TASK:	formula problems
STATUS:	Open
FROM:	<p.dalgaard@kubism.ku.dk>
	Mike Meyer <mikem@stat.cmu.edu> writes:
	> Several bugs (no solutions, yet).  These might be well known.
	> 1) If one does, e.g.,	 mymod <- lm(y ~ x); formula(mymod)
	> then one does not get back the formula (one gets, Error: invalid formula)
	Yep. Seems that we need a
	formula.lm<-function(x)formula(x$terms)
	> 2) if x is of mode numeric, then the model formula
	>	mymod <- lm(y ~ x + x^2)
	> is not processed as S would do it. The model is fit ignoring the x^2 term,
	We had that topic a while back. I think it was concluded
	that it is a feature, because mixing model formulas and
	arithmetic ditto is bad practice. (I don't have any strong
	feeling about this, personally. As long as R won't introduce
	those awful Helmert contrasts as default...)

----------------------------------------------------------------------
TASK:	formula problems
STATUS:	Open
FROM:	<wvenable@attunga.stats.adelaide.edu.au>
	Peter Dalgaard writes:
	 > >
	 > > 2) if x is of mode numeric, then the model formula
	 > > mymod <- lm(y ~ x + x^2)
	 > > is not processed as S would do it.	 The model is fit[ted]
	 > > ignoring the x^2 term...
	 >
	 > We had that topic a while back.  I think it was concluded that
	 > it is a feature, because mixing model formulas and arithmetic
	 > ditto is bad practice.
	I don't recall we did, but in any case I'd like to re-open it.

	There is an anomaly in the way : and ^ terms are handled in the
	sense that the logical and useful thing is obvious but does not
	happen.	 Let me give an example.  Suppose a and b are factors, x
	and y are not.

	A term such as (a + b + x + y)^2 should be expanded out binomial
	fashion, coefficients stripped away and the remaining products
	treated as : products.	Then S copes with terms like a:a, a:b and
	a:x fine, even x:y is handled by having it generate a column of
	xy-products, as it should.

	But a term such as x:x does not generate a column of x-squares,
	it is merely removed as it would be if it were a factor.  This is
	a complete anomaly, and one that I don't think would be hard or
	dangerous for R to rectify.  Indeed it would be very useful to
	generate a complete second degree regression in three variables
	using y ~ (1 + x1 + x2 + x3)^2.	 As it is now it generates linear
	and product terms only and omits the powers.  Go figure.

	 > (I don't have any strong feeling about this, personally.  As
	 > long as R won't introduce those awful Helmert contrasts as
	 > default...)

	Ah, the Helmert contrasts b\^ete noir.	For ANOVA the contrast
	matrix used is mostly irrelevant.  For regression models I agree,
	treatment contrasts would be generally more easily interpreted.

	I presume the reason they were used at all is because if you have
	equal replication of everything the Helmert contrasts give you a
	model matrix with orthogonal columns, so all estimates are
	uncorrelated.  Whenever do you get equal replication, though?

----------------------------------------------------------------------
TASK:	formula problems
STATUS:	Open
FROM:	<p.dalgaard@kubism.ku.dk>
	Bill Venables <wvenable@attunga.stats.adelaide.edu.au> writes:
	> A term such as (a + b + x + y)^2 should be expanded out binomial
	> fashion, coefficients stripped away and the remaining products
	> treated as : products.  Then S copes with terms like a:a, a:b and
	> a:x fine, even x:y is handled by having it generate a column of
	> xy-products, as it should.
	I tend to agree.
	> Ah, the Helmert contrasts b\^ete noir.  For ANOVA the contrast
	> matrix used is mostly irrelevant.  For regression models I agree,
	> treatment contrasts would be generally more easily interpreted.
	Understatement of the year... Last time I bumped into them, it took me
	and a colleague more than an hour to figure out how to interpret the
	regression coefficients, and, I may add, the solution was *not* what
	the white book said it was (it's not just one level minus the average
	of the preceding, the parameter is also scaled by the reciprocal of
	the level number). [There's a split-second solution -- see below --
	but we sort of didn't think of it at the time...]
	> I presume the reason they were used at all is because if you have
	> equal replication of everything the Helmert contrasts give you a
	> model matrix with orthogonal columns, so all estimates are
	> uncorrelated.	 Whenever do you get equal replication, though?
	Hardly ever. Actually, I thought that the point was not so much
	orthogonality, but the successive testing (A=B, A=B=C, A=B=C=D,...).
	However that is just plainly wrong outside of balanced ANOVA's.
	And, even in that case, once the first two levels differ, the rest
	of the coefficients lose all meaning.

----------------------------------------------------------------------
TASK:	formula problems
STATUS:	Open
FROM:	<thomas@biostat.washington.edu>
	We also need to fix formula.default. At the moment it only
	looks for x$formula.  Other standard places to keep a
	formula are x$call$formula and x$terms.	 How about

	formula.default<-function (x)
	{
		if (!is.null(x$formula))
			return(eval(x$formula))
		if (!is.null(x$call$formula))
			return(eval(x$call$formula))
		if (!is.null(x$terms))
			return(x$terms)
		switch(typeof(x), NULL = structure(NULL, class = "formula"),
			character = formula(eval(parse(text = x)[[1]])),
			call = eval(x), stop("invalid formula"))
	}
	One disadvantage to extracting the formula from $terms
	instead of $call$formula is that in S a terms object is
	not a formula. On the other hand it doesn't really matter
	as long as people use the formula() function.

----------------------------------------------------------------------
TASK:	formula problems
STATUS:	Open <Bill's code does work with 0.5a1>
FROM:	<wvenable@attunga.stats.adelaide.edu.au>
	Peter Dalgaard writes:
	 > Bill Venables <wvenable@attunga.stats.adelaide.edu.au> writes:
	 > > Ah, the Helmert contrasts b\^ete noir.  For ANOVA the contrast
	 > > matrix used is mostly irrelevant.	For regression models I agree,
	 > > treatment contrasts would be generally more easily interpreted.
	 > Understatement of the year... Last time I bumped into them, it took me
	 > and a colleague more than an hour to figure out how to interpret the
	 > regression coefficients, and, I may add, the solution was *not* what
	 > the white book said it was (it's not just one level minus the average
	 > of the preceding, the parameter is also scaled by the reciprocal of
	 > the level number). [There's a split-second solution -- see below --
	 > but we sort of didn't think of it at the time...]

	A few weeks ago I gave a fairly detailed discussion of how to
	relate contrast matrices and their interpretation in s-news.  I
	could re-issue it or post it to people if that was their wish.

	There is also to be an extended discussion of the subject in V&R2
	due out in July, with a further elaboration to appear (real soon
	now...) in the online complements.

	 > > I presume the reason they were used at all is because if you have
	 > > equal replication of everything the Helmert contrasts give you a
	 > > model matrix with orthogonal columns, so all estimates are
	 > > uncorrelated.  Whenever do you get equal replication, though?
	 >
	 > Hardly ever. Actually, I thought that the point was not so much
	 > orthogonality, but the successive testing (A=B, A=B=C, A=B=C=D,...).
	 > However that is just plainly wrong outside of balanced ANOVA's.
	 > And, even in that case, once the first two levels differ, the rest
	 > of the coefficients lose all meaning.

	Indeed.	 That's why I tended to discount that possibility myself.
	Here is a contrast matrix generator I sometimes prefer to use
	that corresponds to testing A=B, B=C, C=D, ...	Of course the
	contrasts are not mutually orthogonal.	How it works is left as a
	little puzzle.	(This function works in S.  I haven't tested it
	in R, but it should work if lower.tri() is available.)

	contr.sdif <- function(n, contrasts = T)
	{
	# contrasts generator giving `successive difference' contrasts.
	  if(is.numeric(n) && length(n) == 1) {
	    if(n %% 1 || n < 2)
	      stop("invalid number of levels")
	    lab <- as.character(seq(n))
	  }
	  else {
	    lab <- as.character(n)
	    n <- length(n)
	    if(n < 2)
	      stop("invalid number of levels")
	  }
	  if(contrasts) {
	    contr <- col(matrix(nrow = n, ncol = n - 1))
	    upper.tri <- !lower.tri(contr)
	    contr[upper.tri] <- contr[upper.tri] - n
	    structure(contr/n, dimnames = list(lab, paste(
	      lab[-1], lab[ - n], sep = "-")))
	  }
	  else structure(diag(n), dimnames = list(lab, lab))
	}
	> contr.sdif(4)
	    2-1	 3-2   4-3
	1 -0.75 -0.5 -0.25
	2  0.25 -0.5 -0.25
	3  0.25	 0.5 -0.25
	4  0.25	 0.5  0.75

----------------------------------------------------------------------
TASK:	startup processing
STATUS:	Open
FROM:	<mikem@stat.cmu.edu>
	2) Again, along the lines of something that S does that is
	actually useful.  In S you can set the S_FIRST environment
	variable and have this used as the equivalent of the R
	.Rprofile file.	  Might it be a good idea to allow an
	R_FIRST environment variable as well.  That way I could
	set user specific preferences that apply no matter what
	directory I am working in.

----------------------------------------------------------------------
TASK:	Function Argument Naming
STATUS:	Open
FROM:	<maechler@stat.math.ethz.ch>
	There is a problem with	 'default argument evaluation'
	when I use an existing function name as argument name :

	sintest <- function(x, y = 2, sin= sin(pi/4))
	{
	  ## Purpose: Test of	"default argument evaluation"
	  ## -------- Fails for	 R-0.49.  Martin Maechler, Date:  9 May 97.
	  c(x=x, y=y, sin=sin)
	}

	## R-0.49:
	R> sintest(1)
	##> Error in sintest(1) : recursive default argument reference

	## S-plus 3.4  (being 100% ok):
	S> sintest(1)
	  x y	    sin
	  1 2 0.7071068
	 Warning messages:
	   looking for function "sin", ignored local non-function in: sintest(1)

	-------------------------------------------------------
	The following shows bugs, both in R and S:

	sintest2 <- function(x ,y = 2)
	{
	  ## Purpose: Test of	"default argument evaluation"
	  ## -------- Fails for	 S-plus 3.4.  Martin Maechler, Date:  9 May 97.
	  c(x=x, y=y, sin=sin)
	}

	R> sintest2(1)
	[[1]]
	[1] 1

	[[2]]
	[1] 2

	[[3]]
	<primitive: sin>

	--------------- is almost okay,
			the buglet being that the names have been dropped from the list.

	But watch this:
	S> sintest2(1)
	function(x = 1, y = 2, sin.x)
	sin2 = .Internal(sin(x), "do_math", T, 109)

	--- returning a function

		((now we see, why S's way of treating functions as
		  lists sometimes badly sucks)).

----------------------------------------------------------------------
TASK:	Function argument naming
STATUS:	Open
FROM:	<luke@stat.umn.edu>
	Martin Maechler wrote:
	[ Stuff above. ]

	For better or worse, S and R allow default expressions to
	contain references to variables that are (or rather may be)
	created in the function body, so (in R and Splus)

	> x<-1
	> f<-function(a,b=x) { if (a) x<-2; b}
	> f()
	Error: Argument "a" is missing, with no default
	> f(T)
	[1] 2
	> f(F)
	[1] 1

	More traditional lexical scoping would make the reference
	to x in the default always be global, but lots of code
	would break. I think we're stuck with this behavior as a
	corollary to the way S wants default arguments to work.

	Actually S is a bit inconsistent in its error message --
	if you have a non-function argument it gives the same
	message as R,

	> g<-function(x=x) x
	>g()
	Error in g(): Recursive occurrence of default argument "x"
	Dumped

	Also in R's lexical scoping you probably do want the argument
	name to shadow any outer definitions if you want to be able
	to define default arguments that are recursive functions,
	e.g.

	> g<-function(n, nfac=function(x) {
		if (x <= 1) 1 else nfac(x-1)*x }) nfac(n);
	> g(6)
	[1] 720

----------------------------------------------------------------------
TASK:	Adding List Elements by Name
STATUS:	Open
FROM:	<p.dalgaard@kubism.ku.dk>
	This works in Splus:
		> x<-list()
		> x[["f"]]<-1
		> zz<-"g"
		> x[[zz]]<-2
	In R both variants fail unless the name is already on the
	list. The first one can be replaced by x$f, but there
	seems to be no substitute for the other one (oh yes I found
	one, but it's not fit to print!). This comes up if you e.g.
	want to create a variable in a data frame with a name given
	by a character string.
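	One substitute that should work in both dialects is to grow the
	list with c() and then repair the names (a sketch):
		x <- c(x, list(2))
		names(x)[length(x)] <- zz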

----------------------------------------------------------------------
TASK:	Bug in "approx"
STATUS:	Open
FROM:
	When the function approx is called with the argument rule=2, one gets
	the error message
		Error: NAs in foreign function call (arg 6)
	Besides, the meaning of rule=1 or rule=2 is opposite to that described
	in the help text and used in S-plus.
	For example, in R:
		R> approx(1:10,2:11,xout=5:15,rule=1)
		$x
		 [1]  5	 6  7  8  9 10 11 12 13 14 15
		$y
		 [1]  6	 7  8  9 10 11 11 11 11 11 11
	R> approx(1:10,2:11,xout=5:15,rule=2)
	Error: NAs in foreign function call (arg 6)
	but in S-plus:
	> approx(1:10,2:11,xout=5:15,rule=1)
		$x:
		 [1]  5	 6  7  8  9 10 11 12 13 14 15
		$y:
		 [1]  6	 7  8  9 10 11 NA NA NA NA NA
		> approx(1:10,2:11,xout=5:15,rule=2)
		$x:
		 [1]  5	 6  7  8  9 10 11 12 13 14 15
		$y:
		 [1]  6	 7  8  9 10 11 11 11 11 11 11
	The reason for this bug can be found in the last lines of the code of
	approx:
		if (rule == 1) {
			low <- y[1]
			high <- y[length(x)]
		}
		else if (rule == 2) {
			low <- NA
			high <- low
		}
		else stop("invalid extrapolation rule in approx")
		y <- .C("approx", as.double(x), as.double(y), length(x),
			xout = as.double(xout), length(xout), as.double(low),
			as.double(high))$xout
		return(list(x = xout, y = y))
	If (rule == 2) the values of low and high are set to NA. Immediately
	afterwards, the foreign function "approx" is called with these values,
	leading to the error
		Error: NAs in foreign function call (arg 6)
	To obtain the same behavior as in S-plus (and as in the help-text) the
	commands for (rule == 1) and (rule == 2) have to be exchanged.
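	That is, the corrected tail of approx() would read
		if (rule == 1) {
			low <- NA
			high <- low
		}
		else if (rule == 2) {
			low <- y[1]
			high <- y[length(x)]
		}
	though note that with rule = 1 the .C call must then be allowed
	to pass NAs (NAOK = TRUE), or the foreign function error simply
	moves from rule = 2 to rule = 1.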

----------------------------------------------------------------------
TASK:	Matrix multiply problems
STATUS:	Open
FROM:	<thomas@biostat.washington.edu>
	Both of these used to work and seem useful and harmless:
	R> matrix(1,ncol=1)%*%c(1,2)
	Error in matrix(1, ncol = 1) %*% c(1, 2) : non-conformable arguments
	R> matrix(1,ncol=1)*(1:2)
	Error: dim<- length of dims do not match the length of object

----------------------------------------------------------------------
TASK:	"write" function
STATUS:	Open?
FROM:	<Kurt.Hornik@ci.tuwien.ac.at>
	Following my posting of a write.table() function, Martin
	suggested that one could have a generic write() function
	and special methods for e.g. time series, data frames, etc.

	Well, a month has passed since ...

	What does everyone think?  Is it a good idea, or would
	write.table() be enough?  If we think that it is not enough,
	which arguments should the write methods typically allow?
	What about

	write.xxx (x,		# object
		   file =	# filename, default stdout
		   append =	# obvious
		   sep =	# obvious
		   eol =	# end of line char
		   ...)

	???

	On the other hand, it seems clear that something like
	write.table() is nice, and what it should do.  But what
	about classes other than data.frame?

	Martin Maechler:

	Note that S has a  write(.)  function  which would be our
	write.default(.)

	your	write.table	    would be our
		write.data.frame

	The only addition would be a  'write.matrix' which would be 'like'
	write.data.frame, the only problem being that  'matrix' is not a
	class (yet).  [Note that in S4, everything has a class;
		   I'm voting for matrices to have a class in R	 ..]
	write.default  could 'despatch' to write.matrix if  x  is a matrix.
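	A minimal sketch of that layout; the write.default body mimics
	the S write() semantics via cat()'s recycled sep argument, and
	write.data.frame simply defers to the posted write.table():
		write <- function(x, ...) UseMethod("write")
		write.default <- function(x, file = "data", ncolumns = 5,
					  append = FALSE, ...)
			cat(x, file = file, append = append,
			    sep = c(rep(" ", ncolumns - 1), "\n"))
		write.data.frame <- function(x, ...) write.table(x, ...)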

----------------------------------------------------------------------
TASK:	"ls.print" problem
STATUS:	Closed, Aug 6, 97 RG.
	<summary does what it should as does ls.print now>
FROM:	<VENKAT@biosta.mskcc.org>
	ls.print produces error that I don't seem to be able to
	trace.	Output of the commands as follows: (hyeung is a
	24x2 matrix of data)
	-------------------------------------------------
	> summary(hyeung)
	       x.1	       x.2
	 Min.	: 28.0	 Min.	: 10.0
	 1st Qu.: 72.0	 1st Qu.: 87.5
	 Median : 86.5	 Median : 92.5
	 Mean	: 81.0	 Mean	: 82.5
	 3rd Qu.: 97.0	 3rd Qu.:100.0
	 Max.	:100.0	 Max.	:100.0
	> summary(lsfit(hyeung[,1],hyeung[,2]))
		  Length Class	Mode
	coef	   2	 -none- numeric
	residuals 24	 -none- numeric
	intercept  1	 -none- logical
	qr	   6	 -none- list
	> ls.print(lsfit(hyeung[,1],hyeung[,2]))
	trace: ls.print(lsfit(hyeung[, 1], hyeung[, 2]))
	Error: missing value in ``n1 : n2''

----------------------------------------------------------------------
TASK:	Comparisons with zero length things
STATUS:	Open
FROM:	<thomas@biostat.washington.edu>
	<maechler@stat.math.ethz.ch>

		Thomas:
	Any comparison with NULL generates an error
	Error: comparison is possible only for vector types
	whereas in S(-PLUS) it gives NA, which seems more sensible.
	Along similar lines, comparison with a length 0 vector returns
	logical(0) in R but NA in S.

		Martin:
	Isn't  logical(0)  more	 logical than  NA ?
	I agree that it would be best  (convenience)
	if 'NULL==1' returned the same as 'numeric(0)==1'.
	At the moment, I don't see why compatibility with S  should be
	important here:
			if( NULL == anything)
	or, e.g.,	if( numeric(0) == numeric(0) )

	give an error anyway, i.e., you have to test for length 0
	_anyway_ in the cases where one comparison argument may
	have zero length.

		Thomas:
	I didn't (previously) make any comment on this -- I only
	said that NA was more logical than an error message.
	However, the advantage of returning NA is that NA | TRUE
	is TRUE, NA & FALSE is FALSE, which doesn't happen with
	logical(0). Also, from a compatibility point of view one
	of them is tested with is.na(), the other with length(),
	so it can matter which one you use.  Of course no-one should
	deliberately write code where it matters, but these things
	happen.
	It seems in fact that logical(0) | TRUE causes R to freeze
	(R0.49, sparc solaris).

		Robert:
	Well, we thought
	  logical(0) & T should return logical(0)
	  logical(0) | T should return logical(0)
	  already we have NA | T returns T
		     and  NA & T returns NA

		Martin:
	Ok, given the above argument, returning NA is logical, too.
	However, I'd also argue that
		logical(0) | TRUE	-> TRUE
		logical(0) & FALSE	-> FALSE

		logical(0) & TRUE	-> logical(0)
		logical(0) | FALSE	-> logical(0)

	    ThLu> It seems in fact that logical(0) | TRUE causes R to freeze
	    ThLu> (R0.49, sparc solaris).
	Yes:
	> logical(0) | TRUE
	Warning in logical(0) | TRUE : longer object length
		is not a multiple of shorter object length
	Floating exception
	~~~~~~~~~~~~~~~~~~ [and 'core' dump]
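
	[ Until the semantics are settled, a defensive guard avoids	]
	[ both the error and the crash.  A sketch, with a hypothetical	]
	[ helper name:							]

	safe.eq <- function(x, y) {
	    if (length(x) == 0 || length(y) == 0)
		return(FALSE)		# treat empty comparisons as FALSE
	    r <- (x == y)
	    !is.na(r) & r		# map NA comparisons to FALSE too
	}
	if (safe.eq(NULL, 1)) "equal" else "not equal"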

----------------------------------------------------------------------
TASK:	Method lookup for "print"
STATUS:	Open
FROM:	<la-jassine@aix.pacwan.net>
	I have always thought that typing the name of an object
	generated a call to the print method for the object.
	However, (in 0.49) I redefined the generic print method as

	print <- function(x, ...)
	 {if (is.tframe(x))  UseMethod("print.tframe")
	  else UseMethod("print")
	 }

	Now I have an object z which returns TRUE to is.tframe(z) and
	> class(z)
	[1] "ts"     "tframe"
	Then
	> print(z)
	[1] 1981.50 2006.25    4.00
	But
	> z
	Error: comparison is possible only for vector types
	> traceback()
	[1] "c(\"print.ts(structure(c(1981.5, 2006.25, 4), class = c(\\\"ts\\\",
	\\\"tframe\\\"\", "
	[2] "c(\"print(structure(c(1981.5, 2006.25, 4), class = c(\\\"ts\\\",
	\\\"tframe\\\"\", "

	This is generating a call to the class method print.ts
	rather than to print.tframe.ts, as happens when I use
	print(z).  If my understanding is correct that typing the
	name of an object should generate a call to the print
	method for the object, then this is a bug.  Otherwise,
	could someone please explain to me what it does.  Thanks.
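
	[ For reference, the conventional S3 arrangement avoids		]
	[ redefining the generic altogether: put the most specific	]
	[ class first and supply a method for it (a sketch):		]

	class(z) <- c("tframe", "ts")	# most specific class first

	print.tframe <- function(x, ...) {
	    cat("tframe:\n")
	    NextMethod()		# fall through to print.ts
	}

	z				# auto-printing now finds print.tframe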

----------------------------------------------------------------------
TASK:	False warnings
STATUS:	Open
FROM:	<la-jassine@aix.pacwan.net>
	In R 0.49, comparison of logical matrices with & and |
	sometimes seems to generate false warnings that the longer
	object length is not a multiple of the shorter object
	length.  I have not been able to isolate the exact
	circumstances.

----------------------------------------------------------------------
TASK:	ISO-latin1 characters
STATUS:	Open
FROM:	<maechler@stat.math.ethz.ch>
	There seems to be a problem in print.default with some
	ISO-latin1 characters (the chars AFTER ASCII in western
	Europe...) if they appear in strings.  (no problem if they
	are part of a function comment, see below).

	Some of the characters lead to 4-character hex codes being
	printed instead:  "û" ## ^u    prints as "0xFB"

	If you use the funny characters in comments of functions,
	they are stored and printed properly.

	HOWEVER: In a few rare cases, the strings are not even
	PARSED properly; the line 'ISOdiv <- ..' below gives a
	SYNTAX error.

	The following code shows the symptoms :
	 -- ONLY if  the e-mail between here and your place is
	 8-bit clean! -- (else: get it
	ftp://ftp.stat.math.ethz.ch/U/maechler/R/string-test.R )

	frenchquotes <- "«...»"	 ## <<...>>
	frenchquotes

	Umlaute <- "äöü ÄÖÜ" # = "a "o "u  "A "O "U
	Umlaute #- only the last one is not printed properly...

	A.accents <- "àáâãäåæ ÀÁÂÃÄÅÆ" # `a 'a ^a "a oa ae  `A 'A ^A "A oA AE
	A.accents
	EI.accents <- "èéêëìíîï ÈÉÊËÌÍÎÏ"
	EI.accents
	O.accents <- "òóôõöøÒÓÔÕÖØ"
	O.accents

	U.accents <- "ùúûüýÙÚÛÜÝ"
	U.accents

	ISO24x <-  "¡¢£¤¥¦§ ¨©ª«¬­®¯" #octal 241..257
	ISO26x <- "°±²³´µ¶· ¸¹º»¼½¾¿" #octal 260..277

	##--- THIS IS a Problem: It gives a SYNTAX error !
	ISOdiv <- "�� �� �� �� ��"
	##-- One of these characters even was producing the same as  'q()' !!

	aa_ function(x) {
	  x^2
	  ##- frenchquotes <- "«...»"  ## <<...>>
	  ##- Umlaute <- "äöü ÄÖÜ" # = "a "o "u	 "A "O "U
	  ##- A.accents <- "àáâãäåæ ÀÁÂÃÄÅÆ" # `a 'a ^a "a oa ae  `A 'A ^A "A oA AE
	  ##- EI.accents <- "èéêëìíîï ÈÉÊËÌÍÎÏ"
	  ##- O.accents <- "òóôõöøÒÓÔÕÖØ"
	  ##- U.accents <- "ùúûüýÙÚÛÜÝ"
	  ##-
	  ##- ISO24x <-	 "¡¢£¤¥¦§ ¨©ª«¬­®¯" #octal 241..257
	  ##- ISO26x <- "°±²³´µ¶· ¸¹º»¼½¾¿" #octal 260..277
	  ##- ISOdiv <- "�� ��" ##-- OMITTED further: SYNTAX error !!
	}
	aa

----------------------------------------------------------------------
TASK:	String length problems
STATUS:	Closed ?
FROM:	<maechler@stat.math.ethz.ch>
	This is not a  cat(.) but a  string storing/parsing problem
	(here ^L stands for a literal formfeed typed into the string):
	nchar("\n^L\n")	 # gives  2  instead of 3
	[ Hmmm.	 Was this typed to readline I wonder?  There it	 ]
	[ seems that ^L must be escaped with ^V.  Using the ANSI ]
	[ \f will now produce a literal formfeed.  Indeed, using ]
	[ any of the ANSI C escapes will work.			 ]

	[ However, a '^L' (emacs C-q C-l) typed literally into a ]
	[ string is still dropped:				 ]
	> "\n^L\n"
	[1] "\n\n"
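
	[ For example, writing the formfeed with the ANSI escape now ]
	[ gives the expected count:				     ]
	> nchar("\n\f\n")
	[1] 3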

----------------------------------------------------------------------
TASK:	Frontend
STATUS:	Open
FROM:
	Some time ago there was the suggestion to add a PLATFORM
	subdir level for bin (and eventually the library subdirs
	with `binaries'), and the idea to have the shell wrapper
	automagically call the right binary.

	I mentioned that one might be able to use the shell variables
	OSTYPE and HOSTTYPE for that, noting however that e.g. on
	my Debian Linux/GNU/ix86

			bash	tcsh
	OSTYPE		Linux	linux
	HOSTTYPE	i386	i386-linux

	Hmm ... It seems (a colleague just checked) that these
	variables are not POSIX either, and hence I'd say they are
	rather useless for our purpose.

	In the absence of a reliable run-time way to determine the
	current platform, it seems natural to use `platform' as
	obtained at compile time to distinguish the various binaries
	etc., and to leave it to the discretion of the sysadmin to
	ensure that the R script in the path calls the right binary.

	If I am missing something obvious, please let me know.


----------------------------------------------------------------------
TASK:	Resetting Graphical Parameters
STATUS:	Open
FROM:	<maechler@stat.math.ethz.ch>
	BY THE WAY:
	  It would be nice to be able to say
		par(reset = TRUE)
	  (or similar) for resetting all the graphical parameters to
	  their (device-dependent) default values.
	[ This will require a little work.  Perhaps the easiest thing	]
	[ to do is to add a new device driver call "reset".  This would ]
	[ be best left to the multiple active device driver project.	]
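
	[ A workaround in the meantime (a sketch; this restores	]
	[ previously saved values, not the device defaults, and	]
	[ read-only parameters may need to be excluded):		]

	opar <- par()			# save the current settings
	par(cex = 2, lty = 3)		# ... change things, plot ...
	par(opar)			# put the saved values back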

----------------------------------------------------------------------
TASK:	.Options not working in all cases
STATUS:	Open
FROM:	<maechler@stat.math.ethz.ch>
	The .Options vector was introduced a while ago after my
	suggestion (see Ross's e-mail below).  .Options$digits is used
	by default in several print methods (e.g. print.lm); however,
	deparse(), for example, uses options()$width and not
	.Options$width.

	Another problem is that .Options is still not in the
	documentation (on-line help).  Before one could add it there,
	we'd need ``the specs''.

	I think the idea (at least mine) was that
		options(.)  queries or sets elements in the   .Options	 list
	and all functions -- including the internal ones -- use	 .Options.
	As far as I know, this is what S does.
	Currently, this is NOT the case in R.

	Ross said a while ago:

   >>> From: Ross Ihaka <ihaka@stat.auckland.ac.nz>
   >>> Date: Wed, 11 Dec 1996 17:10:59 +1300 (NZDT)
   >>> To: Martin Maechler <maechler@stat.math.ethz.ch>
   >>> Cc: R-testers mailing list <R-testers@stat.math.ethz.ch>
   >>> Subject: R-alpha: options() and .Options -- ?

    Ross> Martin Maechler writes:
    >> This is not a bug report, but rather some remarks as a
    >> "request for comments":
    >>
    >> It is clear that	 options( foo = bar )
    >> sets the option and also updates the builtin  .Options  list:
    >>
    >> > options(myopt = pi)
    >> > .Options$my
    >> [1] 3.14159265
    >>
    >> In S-plus, it was (is) possible to use .Options locally in a function
    >> frame in order to just affect some options during evaluation of that
    >> function.

    Ross> I have made some changes so that such local assignments to .Options
    Ross> will work.  The down side is that such assignments will also work at
    Ross> top level with the changes shadowing the real system options.

    Ross> This also may be ok.	It would have the advantage that options would
    Ross> then be preserved from session to session.  Is this a good idea or a
    Ross> bad idea?
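
	[ A sketch of the local-assignment idiom under discussion	]
	[ (S-PLUS semantics; whether R's print methods consult		]
	[ .Options this way is exactly the open question above):	]

	f <- function(x) {
	    .Options$digits <- 3	# local copy, visible only inside f
	    print(x)			# a method reading .Options sees 3
	}
	f(pi)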

----------------------------------------------------------------------
TASK:
STATUS:
FROM:
----------------------------------------------------------------------