Things that were on my TODO list and have been accomplished ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (newest 1st) o Incorporate the PAM speedup patches by Matthias Studer (unige.ch). It's based on a good paper etc ---> ~/R/MM/Pkg-ex/cluster/pamonce-mail.txt ---> ~/R/MM/Pkg-ex/cluster/pamonce.R o daisy() -- now allows 'metric = "gower"' which allows to use "Gower's formula" also for the all-continuous case. The help page contains "Gower" much more visibly o Jan.2006: Create a new silhoutte.clara(...., full = FALSE) where "full = TRUE" would compute all the silhouttes, not just those of the "best sample". o Dec.2005: need ./inst/CITATION for getting a nice citation() o May, 2004: New agnes(*, method = "flexible", par.method = *) to allow Lance-Williams formula. o Mar, 2004: When using a NAMESPACE, can rename `..dClass' in R/0aaa.R (and *.q): Renamed to 'dissiCl' o Dec, 2003: cutree(diana(), h = h0) gives complete non-sense without a warning -- the same as in S-plus .. For R 1.7.0 : o agnes(), diana() and pam() have a new argument `keep.diss' with a smart default. If FALSE, the new behavior is *NOT* to keep the dissimilarities ! --> save space in result object --> somewhat changed summary/print output , i.e. minor NON-compatibility! o a) pam()$diss is not a `proper' dissimilarity (class is only "dissimilarity", missing "dist") -> done Mo,17.3.03 b) as.dist() is not generic and does too much for "dist" objects, and also for "dissimilarity" {<< these were fixed for R 1.7.0} and the completely wrong thing for the pam()$diss like "dissimilarity"s... (wrong $Size !) o plot(ylab / xlab) -- for agnes -> bannerplot() & pltree() ./R/plothier.q ===> NOT DONE -- rather tell people {on the help()} to use the two underlying plot functions separately ! o Robert G: " silhouette.default() is wrong " --> ok, I found this was on R-help, Feb.7 and fixed in my sources subsequently Oct. 2002 {for "banner"; silhouette was earlier}: 1) Silhouette und Banner Plot : Label vertical axis of horizontal barplot using HORIZONTAL text {i.e., par(las = 1) or axis(*, las = 1)} --> Should become an optional (but probably default) feature of barplot()! July 2002: clara(): Must have allocation(?) bug in Fortran --> ../cluster_tests/clara.R --- bug (not allocation!) found and eliminated. o all [ agnes, clara, daisy, diana, fanny, pam ] now have something like valmisdat <- min(x2, na.rm=TRUE) - 0.5 #(double) VALue for MISsing DATa but this will go wrong as soon as min(x2) < -5e15 !!! --> now using something more sensible o Generally: clara(), pam(), agnes(), diana(),... should *not* keep the "diss" component (with all n^2/2 dissimilarities) in the result, by default when n >= 50 {i.e. get a new argument `` keep.diss = n < 50 '' } (done for 1.7.0, but default = n < 100) clara(): R/clara.q : we transpose *large* x[,] and in Fortran src/clara.c deal with x[] as vector anyway -- change this (done for 1.7.0) May 2002: data/flower.R : Big problem: V1..V3 should be binary, i.e. 0:1 man/flower.Rd : but as factors they become 1:2 ! ==> the example (.. type = list(asymm = 1) has probably been wrong all the time -- compare to JSS paper! --- no! the changes to cluster-1.5.1 which test these,etc give no difference! o clusplot() -- drawing MVE ellipses (-> .Fortran("spannel",..)) : S-plus has ellipses that just exactly contain all points. whereas our ellipses are *always* slightly too large. o Provide a user function ellipsoidhull <- function(x, n = 201) = a combination of internal .Fortran("spannel",.), cov.wt() & ellipse() o man/flower.Rd : use \describe{ \item ... for the 8 variables * OLDER 2) help files for plot.*() --- new arguments are in *.Rd, but not really documented. 5) a more thorough test of .Fortran("spannel", *) used in clusplot(*). -> done {new ellipsoidhull() } ------------- ./R/daisy.q : Now that daisy objects inherit from "dist", we can use as.matrix() and other dist methods which can be *very* useful ! --> also upgraded ./man/dissimilarity.object.Rd {remaining: improve our code or the examples ?!} 8) Make bannerplot() an (possibly later namespace internal ?) standalone function called from both plot.agnes() and plot.diana(), even plot.mona()!