10 March 2009

Dark secrets of the art: C pointers and Fortran pointers


Lately, I've been implementing some new numerical algorithms for benchmarking.  Normally I would write such things in C rather than Fortran, out of habit and experience.  However, since this code calls Fortran routines (specifically, in the BLAS and LAPACK) a lot, I found it less trouble to write it in Fortran, than to try to manage calling C from Fortran.  This is because C and Fortran have different linking conventions, and C has no standard interface for calling into Fortran.  Different compilers have different standards about translating Fortran subroutine and function names into linker symbols -- sometimes they add one or more underscores, sometimes not.  It's deceptively easy if you only use one compiler, such as gcc; developers that aim to build with multiple compilers often discover the differences by accident, and account for them with ad-hoc symbol translation macros (or paper them over by build systems such as GNU Autoconf, that hide the ad-hockery for you).  Fortran subroutines and functions also have no idea of "pass by value," so when calling them from C, you have to keep track of which arguments are inputs and which are "outputs," and make sure to copy the output arguments in and out. 

These factors have annoyed me enough in the past that for this project, I taught myself enough "modern" Fortran to get by.  (I define "modern" as Fortran 90 and newer, or generally, anything with free-form input and colon array slice notation.)  However, I have users who code exclusively in C.  Fortran >= 2003 has what C does not, though:  an ISO standard for interoperability with C.  You can give a Fortran routine a C binding, which solves the linker symbol name problem.  More importantly, you can get Fortran and C pointers to talk to each other.


Yes, Fortran >= 90 does actually have a pointer type!  (It has dynamic memory allocation too, unlike in previous Fortran standards!)  Fortran's pointers to arrays are "smarter" than C pointers in some sense:  they have a "rank" and a "shape," which means (for instance) that you can compute with a submatrix of a larger matrix, whose elements aren't necessarily contiguous, as if it were just a plain old matrix.  The corresponding C code would look much less like linear algebra and much more like the innards of some device driver.  The price of Fortran pointers' ease of use is that a Fortran pointer contains more information than just a memory address.  You can't just pass a Fortran pointer into C space and expect C to understand it, or vice versa.  Fortran 2003 has an ISO_C_BINDING module whose C_F_POINTER subroutine solves this particular problem:  it pulls in a C pointer from C space, and associates it to a Fortran pointer along with a rank and shape. 

I'm currently using the ISO_C_BINDING module to build a C interface to my Fortran library.  It's easy enough that it makes me wonder why many other languages don't come equipped with a language-level C binding!