GFortran Issues with LAPACK II



This is an update on my previous post from May.

A number of things changed since: GFortran started adopting a fix that by default prevents optimizations which break code calling BLAS/LAPACK functions from C without hidden length arguments. R has been updated to internally add these hidden length arguments (and also in other cases where LTO type mismatch was detected). R has exported macros for use in packages so that they can follow suit when calling BLAS/LAPACK and CRAN has been working with maintainers of the affected packages. On the other hand, binaries of BLAS/LAPACK implementations in Linux distributions started to emerge with the problem present. OpenBLAS in Fedora 30 is compiled with the versions of GFortran that still perform the aggressive optimization. Hence, R packages not yet fixed to provide the hidden arguments may and do crash in some cases.

Changes in GFortran

GFortran 9.2 has been released with a new option -ftail-call-workaround, which disables tail-call optimization in procedures with character arguments that call implicitly prototyped procedures. This option is enabled by default, so GFortran 9.2 is safe again for use with code that doesn’t pass the hidden lengths to BLAS/LAPACK character arguments (of length 1). One can also use -ftail-call-workaround=2 to disable tail-call optimizations in all procedures with character arguments. The option is thus less invasive than -fno-optimize-sibling-calls, allowing for tail-call optimizations in more cases, but on the other hand it is declared to be likely withdrawn in some future release of Fortran. Also it is declared that the default for the option may change. The option has already been added also to GFortran 7 and GFortran 8, but those have not been yet released with the change. Credits to Jakub Jelinek for implementing this new option.

R does not yet use the new option, but instead still uses the more conservative option -fno-optimize-sibling-calls. It would be possible to switch to the new option based on a configure test that would check whether it is available, reverting to -fno-optimize-sibling-calls if that was not the case.

The key benefit of this (possibly temporary) option for R is that it is enabled by default: LAPACK and BLAS implementations that have note been explicitly built with -fno-optimize-sibling-calls nor another option preventing the dangerous optimizations would for some time become safe again to use even from R packages that have not yet been fixed. The same applies well beyond R, even LAPACKE and CBLAS need to be fixed.

Changes in R

The headers for BLAS and LAPACK included with R have been extended so that C declarations of BLAS and LAPACK functions include also the hidden length arguments. All calls to BLAS and LAPACK from C code of R itself have been fixed to pass this argument (it is always 1 for the computational functions). This work has been done by Brian Ripley.

The actual type of the hidden length argument and whether it is to be used or not is configurable via macros in R (FCONE, FC_LEN_T, internally FCLEN in function declarations) and is Fortran-compiler dependent. R detects at build time whether the argument is to be used by the Fortran compiler R is being built with, but it considers only size_t as the type. GFortran 7 used int, but for the purpose of passing 1, which is in fact never read, and for the purpose of providing a “scratch” space for the callee to pass another 1, which will be never read, using a wider type is fine.

This is not only simpler than properly detecting the exact type, but also safer, because when R uses an external BLAS/LAPACK implementation (not the reference one included in R), we have no control over how that one has been built and the selection is done at runtime/dynamic linking time. Moreover, optimized BLAS/LAPACK implementations may replace some functions by custom implementations in C, while other functions (typically from LAPACK) will re-use the reference Fortran implementation, hence some functions will need the hidden argument but other not. As current compilers typically use a 64-bit hidden length argument, size_t is a safer choice. Note also that on 64-bit CPUs, 64-bit slack slots will typically be used even for int arguments.

Changes in R packages

R packages should also be fixed to provide the hidden length arguments when calling from C to Fortran. CRAN now does an additional check using LTO, reporting LTO type mismatches when these problems still exist, and package authors are being asked to fix those. Many packages have already been fixed, sometimes with the help of the CRAN team.

Packages may fix their calls to BLAS and LAPACK using the same new macros that R uses in its code (FCONE, FC_LEN_T, USE_FC_LEN_T). This is documented in Writing R Extensions. These macros can be used also in those packages that for whichever reason chose to include their own version of LAPACK or even BLAS.

Alternatively, package authors may also use iso_c_binding (Fortran 2003) to create Fortran functions with standardized calling convention from C, as mentioned in the previous version of this blog and now also in Writing R Extensions.

Calls from Fortran to BLAS and LAPACK are not affected: the hidden length arguments will be correctly added by the Fortran compiler. To be safe, both the client code and BLAS/LAPACK need to be built by the same Fortran compiler, but in practice many compilers will be compatible and GFortran will add the hidden lengths, which as noted seems currently safe to do even when they are not expected.

I have seen and debugged an example where a package called to BLAS/LAPACK both from C (without the proper lengths) and from Fortran. The LTO type mismatch warnings seemed to suggest that the calls from Fortran were wrong, but that was not the case, that warning appeared because of incorrect calls from C that however matched the (old and incorrect) headers without the hidden length arguments.

Note we have considered also alternative solutions to the problem, primarily looking for some that would not require modification of source code of packages that call into BLAS/LAPACK. I’ve experimented with a C preprocessor solution based on meta-programming tricks (started from an if-then-else implementation in the C preprocessor), which allowed automatic rewriting of some calls using F77_CALL macro in the usual form of F77_CALL(foo)(x), while leaving other calls intact. This worked on simple examples, but would have been difficult to debug for package authors if problems appeared and it has been far from simple. Also, some packages insist on including their own version of LAPACK/BLAS and those would have to be adjusted anyway, possibly with more effort. Also the alternative solution could have a surprising effect in case of (yet unlikely) re-use of the name of a BLAS/LAPACK function for something else. The solution adopted in R, while it requires packages to be updated manually, is intentionally very simple.

Changes in Linux distributions

It always takes some time after a GCC release before it is used to build binary packages for Linux distributions. As only the 9.x series now has a release with -ftail-call-workaround and this release (9.2) is quite recent, Linux distributions unfortunately can be expected to come soon with BLAS/LAPACK implementations compiled with the aggressive optimizations; applications not passing hidden character lengths, including R packages not yet fixed, are expected to crash or work incorrectly. This has already been seen on Fedora 30 with OpenBLAS.

Fedora 30 now comes with OpenBLAS with this problem (all versions including the libRblas.so replacement, serial, OpenMP, threaded). OpenBLAS in Fedora 29, Ubuntu 19.04, Ubuntu 18.04, Debian 10 and Debian 9 seems to be ok. Reference (netlib) LAPACK in Fedora 30, Fedora 29, Ubuntu 19.04, Ubuntu 18.04, Debian 10 and Debian 9 seems fine.

These observations are likely to change, unfortunately in the short term for both the worse and the better. The only defense for R packages is to correctly pass the hidden length argument. Still, it would make sense if BLAS/LAPACK implementations in Linux distributions were all rebuilt in a way so that the lengths were not required.