10 March 2009

Dark secrets of the art: C pointers and Fortran pointers

Lately, I've been implementing some new numerical algorithms for benchmarking.  Normally I would write such things in C rather than Fortran, out of habit and experience.  However, since this code calls Fortran routines (specifically, in the BLAS and LAPACK) a lot, I found it less trouble to write it in Fortran, than to try to manage calling C from Fortran.  This is because C and Fortran have different linking conventions, and C has no standard interface for calling into Fortran.  Different compilers have different standards about translating Fortran subroutine and function names into linker symbols -- sometimes they add one or more underscores, sometimes not.  It's deceptively easy if you only use one compiler, such as gcc; developers that aim to build with multiple compilers often discover the differences by accident, and account for them with ad-hoc symbol translation macros (or paper them over by build systems such as GNU Autoconf, that hide the ad-hockery for you).  Fortran subroutines and functions also have no idea of "pass by value," so when calling them from C, you have to keep track of which arguments are inputs and which are "outputs," and make sure to copy the output arguments in and out. 

These factors have annoyed me enough in the past that for this project, I taught myself enough "modern" Fortran to get by.  (I define "modern" as Fortran 90 and newer, or generally, anything with free-form input and colon array slice notation.)  However, I have users who code exclusively in C.  Fortran >= 2003 has what C does not, though:  an ISO standard for interoperability with C.  You can give a Fortran routine a C binding, which solves the linker symbol name problem.  More importantly, you can get Fortran and C pointers to talk to each other.

Yes, Fortran >= 90 does actually have a pointer type!  (It has dynamic memory allocation too, unlike in previous Fortran standards!)  Fortran's pointers to arrays are "smarter" than C pointers in some sense:  they have a "rank" and a "shape," which means (for instance) that you can compute with a submatrix of a larger matrix, whose elements aren't necessarily contiguous, as if it were just a plain old matrix.  The corresponding C code would look much less like linear algebra and much more like the innards of some device driver.  The price of Fortran pointers' ease of use is that a Fortran pointer contains more information than just a memory address.  You can't just pass a Fortran pointer into C space and expect C to understand it, or vice versa.  Fortran 2003 has an ISO_C_BINDING module whose C_F_POINTER subroutine solves this particular problem:  it pulls in a C pointer from C space, and associates it to a Fortran pointer along with a rank and shape. 

I'm currently using the ISO_C_BINDING module to build a C interface to my Fortran library.  It's easy enough that it makes me wonder why many other languages don't come equipped with a language-level C binding!


evaline said...

If you'd be willing to post an example of your code, it would be a HUGE help to this fellow scientific programmer. I am having major issues getting my double pointer (i.e., BASIC MATRIX) in C++ to play nice when I pass it over to fortran... seems to be a lot of extra stuff floating in the memory stream. Anyway, it would help. :)

HilbertAstronaut said...

Not ignoring your comment, just really, really busy at the moment ;-)

Can you be more specific about "extra stuff floating in the memory stream"? I encountered some weird issues with C_F_POINTER and multithreading on some compilers, but not others. (gfortran should be safe. I don't want to name names for the accused compiler vendor until I have evidence that it wasn't my program's fault.) It would be nice to get some independent verification of the issues I encountered. Let me know if you would like to take the discussion to e-mail or have it here.

benetion said...

Hi, is it possible to define a C pointer in fortran and pass it back and forth calling a C function?

HilbertAstronaut said...

Yes, a C pointer in Fortran would have the C_PTR derived type. C function pointers have the C_FUNPTR derived type. C_LOC(X) resp. C_FUNLOC(X) return the C address (as a C_PTR resp. C_FUNPTR) of the argument X. If you use ISO_C_BINDING and declare the C function interfaces appropriately in your Fortran code, you can pass in or return C pointers. Combine this with C_F_POINTER to treat the C pointer as a Fortran array in your Fortran code.

benetion said...

Thanks for the response. Can I be a little more specific? I would like to create a C pointer in Fortran (as you suggested). Then I call a C function to allocate memory (a pointer array), and come back to Fortran. Is it correct that now I can use the pointer array in another C function? What if it is cuda c, if it is more interesting? Thanks!

HilbertAstronaut said...

Yes, you can use the C_PTR (in this case, it's a C_PTR and not a C_FUNPTR) as many times as you want, as long as none of the C functions deallocate it. When you see "C_PTR", think "void*". Do remember to deallocate it eventually, though. Fortran won't help you with that.

Pointers into GPU device memory are still of type C_PTR. Fortran doesn't care where the memory lives (on the GPU or on the CPU). Just be sure not to try to read or write data allocated on the GPU in your Fortran CPU code.

If you're interested in mixing Fortran and CUDA, you might want to look at some of the compiler annotation products out there, like PGI's CUDA Fortran: