Scatter/Gather thoughts

by Johan Petersson

When standards collide: the problem with dlsym

Let's say you need to write some code that dynamically loads a library and calls a function in it. A module or plugin would be typical, but for this short example, we'll use Expat. The language is C++, but could just as easily have been C. We get the ugly function pointer declarations out of the way first:

extern "C" typedef XML_Parser parsercreate_t(const XML_Char*);

...then the library needs to be loaded...

void *lib = dlopen("", RTLD_LAZY);

...and we must make sure dlopen was successful, of course. That was easy. Now we can just use dlsym to find function addresses:
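For completeness, here is a minimal sketch of that success check, assuming a POSIX system. The helper name open_library is made up for this example, and on older systems you may need to link with -ldl.

```cpp
#include <dlfcn.h>
#include <cassert>
#include <string>

// Hypothetical helper: loads a named library and, on failure, stores the
// loader's message in `error`. Returns a null pointer if loading failed.
void *open_library(const char *path, std::string &error)
{
    // This helper is meant for named libraries only; dlopen(NULL) has a
    // different meaning (a handle for the main program).
    assert(path != nullptr);

    dlerror();  // clear any stale error state
    void *lib = dlopen(path, RTLD_LAZY);
    if (!lib) {
        const char *msg = dlerror();
        error = msg ? msg : "unknown dlopen failure";
    }
    return lib;
}
```

A caller would test the returned handle before going anywhere near dlsym, and report the error string if the handle is null.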

parsercreate_t *pc = dlsym(lib, "XML_ParserCreate");

invalid conversion from `void*' to `XML_ParserStruct*(*)(const XML_Char*)'

Oops, obviously we need a cast.

void *p = dlsym(lib, "XML_ParserCreate");
parsercreate_t *pc = static_cast<parsercreate_t*>(p);

invalid static_cast from type `void*' to type `XML_ParserStruct*(*)(const XML_Char*)'

Um... maybe reinterpret_cast?

void *p = dlsym(lib, "XML_ParserCreate");
parsercreate_t *pc = reinterpret_cast<parsercreate_t*>(p);

ISO C++ forbids casting between pointer-to-function and pointer-to-object

The error message could hardly be clearer. There is no valid cast between pointer to function and pointer to object. You see, neither C nor C++ requires that a function pointer can be stored in a void pointer and vice versa; depending on the platform there may be no way to perform the conversion.

If you have ever written x86 real mode code you'll realize why such conversions may not be possible (hint: you have pointers of different sizes in certain memory models). While x86 real mode may seem irrelevant today, we can't simply dismiss such considerations as obsolete. There are reasons why we may want to resurrect something similar to near and far pointers in the future.

Yet, dlsym returns void*. What's going on here? dlsym is part of the SUSv3/POSIX standard, which does require that an object of type void* can hold a pointer to a function. Presumably, a platform where this is not the case could never be POSIX compliant, which seems a bit extreme. In any event, it doesn't really help us, because we are still not allowed to cast between the types.

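You could resort to a variety of tricks to avoid the direct object/function pointer cast, such as converting through an integer type, copying the bytes with memcpy, or type punning through a union.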

Conversions that preserve the bit pattern of the pointer will work in practice as long as the pointer representations are sufficiently similar. But there are no guarantees – well, except for the union hack, which guarantees undefined behaviour.
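As an illustration of a bit-pattern-preserving conversion, here is a sketch of the memcpy variant. It assumes object and function pointers have the same size and a compatible representation, which is exactly the part nobody guarantees. The type binary_op_t, the function add and both helper names are invented for the example and have nothing to do with Expat.

```cpp
#include <cassert>
#include <cstring>

// Illustrative function type and function.
typedef int binary_op_t(int, int);

int add(int a, int b) { return a + b; }

// The byte copy only makes sense if both pointer kinds have the same size.
static_assert(sizeof(void *) == sizeof(binary_op_t *),
              "object and function pointers differ in size");

// Store a function pointer in a void* by copying its bytes.
void *to_object_pointer(binary_op_t *fn)
{
    void *p;
    std::memcpy(&p, &fn, sizeof p);
    return p;
}

// Recover a function pointer from a void*, such as the result of dlsym.
binary_op_t *to_function_pointer(void *p)
{
    // A failed lookup should be handled before converting.
    assert(p != nullptr);

    binary_op_t *fn;
    std::memcpy(&fn, &p, sizeof fn);
    return fn;
}
```

No cast between the two pointer kinds appears anywhere, so the compiler has nothing to complain about. Whether the resulting pointer is usable is still up to the platform.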

You probably noticed that I omitted the C-style cast from my earlier example. Alas, most C and C++ compilers will allow the conversion when you use a C-style cast. You may not even get a warning, even though it's prohibited in ISO C as well as ISO C++. This kind of conversion is a common compiler extension. So common, in fact, that many people don't realize it's not in the standards.

Converting between function and object pointers is the topic of C++ Standard Core Language Active Issue #195, which has yet to be resolved. One suggestion is to allow the cast and give it implementation-defined behaviour only when the conversion is possible. That would match existing practice fairly closely, but I'm not sure it's a good idea. An inherently unportable feature should probably remain a compiler extension.

Interestingly, on Windows the problem is the same in reverse: GetProcAddress returns FARPROC, which is a function pointer. You could argue that this is marginally better than the corresponding POSIX function, since functions are more commonly loaded dynamically than data.
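The function pointer flavour is easier to live with because reinterpret_cast between two function pointer types is allowed, as long as you convert back to the real type before calling. A portable sketch of the idea follows; generic_fn and find_function are stand-ins I made up for FARPROC and GetProcAddress, not the Windows API itself.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for FARPROC: some generic function pointer type.
typedef void (*generic_fn)();

// The real type of the function we want to call.
typedef int count_chars_t(const char *);

int count_chars(const char *s) { return static_cast<int>(std::strlen(s)); }

// Hypothetical stand-in for GetProcAddress: returns a generic function
// pointer for a known name, or a null pointer otherwise.
generic_fn find_function(const char *name)
{
    assert(name != nullptr);
    if (std::strcmp(name, "count_chars") == 0)
        return reinterpret_cast<generic_fn>(&count_chars);
    return nullptr;
}

// Look the function up and convert it back to its real type.
count_chars_t *load_count_chars()
{
    return reinterpret_cast<count_chars_t *>(find_function("count_chars"));
}
```

Calling through generic_fn directly would be undefined; only the pointer converted back to count_chars_t* may be called.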

If you design the interface of the dynamically loaded object, it's possible to avoid the problem entirely by only exposing the supported kind of pointer, e.g. getting the pointer to an array of function pointers with dlsym. But that's not always possible (or desirable). Besides, both dlsym and GetProcAddress are meant to be used from C and could easily have been made compatible with C.
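A sketch of that design, with a struct of function pointers instead of an array. Everything here is hypothetical: plugin_api, plugin_api_v1 and find_object are names invented for the example, and find_object merely plays the role of dlsym so the snippet stands on its own.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical plugin interface: a table of function pointers.
struct plugin_api {
    int (*add)(int, int);
    int (*negate)(int);
};

// Plugin side: the functions stay private, only the table is exported.
static int plugin_add(int a, int b) { return a + b; }
static int plugin_negate(int a) { return -a; }

extern "C" {
    plugin_api plugin_api_v1 = { plugin_add, plugin_negate };
}

// Stand-in for dlsym(lib, name): returns the address of a data object.
void *find_object(const char *name)
{
    assert(name != nullptr);
    if (std::strcmp(name, "plugin_api_v1") == 0)
        return &plugin_api_v1;
    return nullptr;
}

// Host side: void* to object pointer is an ordinary static_cast.
const plugin_api *load_api()
{
    return static_cast<const plugin_api *>(find_object("plugin_api_v1"));
}
```

The function pointers travel inside an object, so the only conversion the host performs is from void* to an object pointer, which both languages permit.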

The Open Group page on dlsym indicates that a future version may either add a new function to return function pointers, or the current interface may be deprecated in favor of two new functions: one that returns data pointers and the other that returns function pointers. I think that is the best solution. Having to break the language rules just to use a function is silly.

10 December, 2004