Notes on enabling Atlas libs for Octave and R

I. Overview

As of the Debian releases 2.1.34-6 (for GNU Octave) and 1.3.0-3 (for GNU R),
both Octave and R can be used with Atlas, the Automatically Tuned Linear
Algebra Software, in order to obtain much faster linear algebra operations.

To make use of Atlas, Debian users need to install the Atlas libraries for
their given cpu architecture. Concretely, one of

    atlas2-base - Automatically Tuned Linear Algebra Software
    atlas2-p3 - Automatically Tuned Linear Algebra Software
    atlas2-p4 - Automatically Tuned Linear Algebra Software
    atlas2-athlon - Automatically Tuned Linear Algebra Software

must be installed. Here, 'base' provides generic libraries which run on all
platforms whereas 'p3', 'p4' and 'athlon' stand for the Pentium III and IV as
well as the AMD Athlon, respectively.  The actual libraries are installed in
/usr/lib/atlas (in the case of 'base') and in /usr/lib/$arch/atlas for the
cpu-specific versions. Here $arch stands for the cpu code used by the kernel
and shown in /proc/cpuinfo.

The Atlas libraries can be loaded dynamically instead of the (non-optimised)
blas libraries against which both Octave and R are compiled.


II. Using the Atlas libraries

In order to have the libraries loaded at run-time, the location needs to be
communicated to the dynamic linker/loader. [ There appears to be a bug in
ld.so as the location is already in ld.so.conf, but found "too late". ]

For Octave, use the variable LD_LIBRARY_PATH. On a computer with the
atlas2-base package:

    $ LD_LIBRARY_PATH=/usr/lib/atlas octave2.1 -q
    octave2.1:1> X=randn(1000,1000);t=cputime();Y=X'*X;cputime-t
    ans = 7.9600

    $ edd@homebud:~> octave2.1 -q
    octave2.1:1> X=randn(1000,1000);t=cputime();Y=X'*X;cputime-t
    ans = 61.520

For R, the R_LD_LIBRARY_PATH variable has to be used, and its value needs to
be copied out of /usr/bin/R (or edited therein). For an Athlon machine, the
example becomes

    $ R_LD_LIBRARY_PATH=/usr/lib/R/bin:/usr/local/lib:/usr/X11R6/lib:/usr/lib/3dnow/atlas:/usr/lib:/usr/X11R6/lib:/usr/lib/gcc-lib/i386-linux/2.95.4:. R --vanilla -q
    > mm <- matrix(rnorm(10^6), ncol = 10^3)
    > system.time(crossprod(mm))
    [1] 2.38 0.04 2.84 0.00 0.00

    $ R --vanilla -q
    > mm <- matrix(rnorm(10^6), ncol = 10^3)
    > system.time(crossprod(mm))
    [1] 28.28  0.08 33.54  0.00  0.00
    > 

Running such a small example is highly recommded to ascertain that the
libraries are indeed found, and to "prove" that the speed gain is real (and
significant) for problems of at least a medium size as the 1000x1000 examples
above.

Note that the example use "/usr/lib/atlas" for the atlas2-base package;
Athlon users should employ "/usr/lib/3dnow/atlas", Pentium III users should
employ "/usr/lib/xmm/atlas" and Pentium IV users should employ
"/usr/lib/26/atlas".

Lastly, it should be pointed out that it is probably worthwhile to locally
compile, and thereby optimise, the Atlas libraries if at least a moderately
intensive load is expected.


III.  See also

The Atlas packages have a very detailed README.Debian file which should be
consulted; it also details local recompilation. Sources and documentation for
Atlas are at   http://www.netlib.org/atlas.


IV.  Acknowledgements

Camm Maguire developed the scheme of overloading Atlas over the default blas
libraries and deserves all the credit. Many thanks to John Eaton for helping
debug some errors in the initial setup, and to Doug Bates for work on the R
package. 


 -- Dirk Eddelbuettel <edd@debian.org>  Tue, 21 Aug 2001 21:37:15 -0500