This branch contains modifications to the representation of data in
STRSXPs that try to avoid creating CHARSXP objects, or at least
defer their creation until they are needed. Integers that are
converted to strings are initially stored as integers unless and
until the string form is needed, thus avoiding the sprintf as well as
the creation of the CHARSXP. (This is also done for doubles, but that
probably isn't all that useful.) Short ASCII strings are stored
directly rather than being boxed in CHARSXPs. The hope is that
reducing the number of CHARSXPs will reduce pressure on the GC.
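
The idea can be illustrated with a small stand-alone C sketch. This is
not the code in the branch: the type str_elt, the cutoff INLINE_MAX,
the counter n_boxed and all function names are made up for
illustration, and a malloc'd string stands in for a CHARSXP. It only
shows the three cases described above: an integer kept unformatted, a
short ASCII string stored in place, and everything else boxed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical element of a string vector; NOT the branch's layout. */
enum elt_kind { ELT_INT, ELT_INLINE, ELT_BOXED };

#define INLINE_MAX 7   /* assumed cutoff for a "short" string */

typedef struct {
    enum elt_kind kind;
    union {
        int ival;                   /* integer, not yet formatted */
        char inl[INLINE_MAX + 1];   /* short ASCII string, stored in place */
        char *boxed;                /* heap string, stand-in for a CHARSXP */
    } u;
} str_elt;

static int n_boxed = 0;   /* number of heap strings created so far */

/* Storing an integer does no formatting and no allocation. */
static str_elt elt_from_int(int v)
{
    str_elt e;
    e.kind = ELT_INT;
    e.u.ival = v;
    return e;
}

/* True if s has at most INLINE_MAX bytes, all of them ASCII. */
static int is_short_ascii(const char *s)
{
    size_t i;
    for (i = 0; s[i] != '\0'; i++)
        if (i >= INLINE_MAX || (unsigned char) s[i] > 127)
            return 0;
    return 1;
}

/* Short ASCII strings are copied in place; others are boxed. */
static str_elt elt_from_cstr(const char *s)
{
    str_elt e;
    if (is_short_ascii(s)) {
        e.kind = ELT_INLINE;
        strcpy(e.u.inl, s);
    } else {
        size_t len = strlen(s);
        e.kind = ELT_BOXED;
        e.u.boxed = malloc(len + 1);
        if (e.u.boxed == NULL)
            abort();
        memcpy(e.u.boxed, s, len + 1);
        n_boxed++;
    }
    return e;
}

/* Return the text of e. A deferred integer is formatted only here,
   into the caller's buffer, which must hold at least 12 bytes. */
static const char *elt_chars(const str_elt *e, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen >= 12);
    switch (e->kind) {
    case ELT_INT:
        snprintf(buf, buflen, "%d", e->u.ival);
        return buf;
    case ELT_INLINE:
        return e->u.inl;
    default:
        return e->u.boxed;
    }
}
```

In this sketch a vector of ten million integer labels would cost no
sprintf calls and no heap strings until some element's text is
actually requested through elt_chars().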

At this point the main example where this helps is lm(), where it
avoids creating case labels. In R-devel, the timings are

> n <- 10000000
> p <- 5
> x <- matrix(rnorm(n * p), n, p)
> y <- rnorm(n)
> 
> system.time(lm(y ~ x))
   user  system elapsed 
 34.039   3.584  37.611 
> system.time(lm(y ~ x))
   user  system elapsed 
 22.645   3.118  25.755 
> system.time(lm(y ~ x))
   user  system elapsed 
 20.743   2.735  23.471 

With these changes, the timings are

> system.time(lm(y ~ x))
   user  system elapsed 
  8.889   3.422  12.308 
> system.time(lm(y ~ x))
   user  system elapsed 
  8.688   3.400  12.083 
> system.time(lm(y ~ x))
   user  system elapsed 
  8.657   3.381  12.035 

This is clearly a useful improvement (elapsed time in the third run
drops from 23.5 to 12.0 seconds), but it can be had in other ways,
so the question is whether there are benefits in other settings.
Avoiding CHARSXP creation only for shorter strings may not be enough;
an alternative is to initially allocate all string data contiguously,
and only create CHARSXPs if elements are modified and new values do
not fit. To go further, it would be useful to have some benchmarks of
code using character data.
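
The contiguous-allocation alternative might look roughly like the
following C sketch. Nothing here comes from the branch: packed_strs and
its functions are hypothetical names, "fits" is assumed to mean "no
longer than the element's original slot", and a separately malloc'd
string again stands in for a CHARSXP.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packed string vector; names and layout are assumptions. */
typedef struct {
    size_t n;
    char *data;      /* initial strings, NUL-terminated, back to back */
    size_t *off;     /* off[i] = start of slot i; off[n] = total bytes */
    char **spill;    /* non-NULL if element i was boxed separately */
    int n_spilled;   /* number of elements currently boxed */
} packed_strs;

static void *xmalloc(size_t size)
{
    void *p = malloc(size ? size : 1);
    if (p == NULL)
        abort();
    return p;
}

/* Copy all n strings into one contiguous allocation. */
static packed_strs *packed_new(const char *const *src, size_t n)
{
    packed_strs *p = xmalloc(sizeof *p);
    size_t i, total = 0;
    p->n = n;
    p->n_spilled = 0;
    p->off = xmalloc((n + 1) * sizeof *p->off);
    p->spill = xmalloc(n * sizeof *p->spill);
    for (i = 0; i < n; i++) {
        p->off[i] = total;
        p->spill[i] = NULL;
        total += strlen(src[i]) + 1;
    }
    p->off[n] = total;
    p->data = xmalloc(total);
    for (i = 0; i < n; i++)
        memcpy(p->data + p->off[i], src[i], p->off[i + 1] - p->off[i]);
    return p;
}

static const char *packed_get(const packed_strs *p, size_t i)
{
    assert(i < p->n);
    return p->spill[i] ? p->spill[i] : p->data + p->off[i];
}

/* Overwrite in place if the new value fits in the original slot;
   otherwise box it in its own allocation. */
static void packed_set(packed_strs *p, size_t i, const char *s)
{
    size_t need = strlen(s) + 1;
    assert(i < p->n);
    if (need <= p->off[i + 1] - p->off[i]) {
        memmove(p->data + p->off[i], s, need);
        if (p->spill[i] != NULL) {
            free(p->spill[i]);
            p->spill[i] = NULL;
            p->n_spilled--;
        }
    } else {
        char *copy = xmalloc(need);
        memcpy(copy, s, need);
        if (p->spill[i] == NULL)
            p->n_spilled++;
        free(p->spill[i]);
        p->spill[i] = copy;
    }
}

static void packed_free(packed_strs *p)
{
    size_t i;
    for (i = 0; i < p->n; i++)
        free(p->spill[i]);
    free(p->spill);
    free(p->off);
    free(p->data);
    free(p);
}
```

With this layout a vector that is never modified needs three
allocations in total regardless of its length, and only elements that
outgrow their slot pay for a separate object.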