C++ dlopen mini HOWTO

Aaron Isotton

aaron@isotton.com

$Id: C++-dlopen-mini-HOWTO.xml,v 1.8 2002/06/19 21:51:19 aisotton Exp $

2002-06-19

Revision History
Revision 1.00  2002-06-19  Revised by: AI
Moved copyright and license section to the beginning. Added terms section. Minor changes.
Revision 0.97  2002-06-19  Revised by: JYG
Entered minor grammar and sentence level changes.
Revision 0.96  2002-06-12  Revised by: AI
Added bibliography. Corrected explanation of extern functions and variables.
Revision 0.95  2002-06-11  Revised by: AI
Minor improvements.
Revision 0.9  2002-06-10  Revised by: AI
First draft proposed.

Table of Contents
1. Introduction
1.1. Copyright and License
1.2. Disclaimer
1.3. Credits / Contributors
1.4. Feedback
1.5. Terms Used in this Document
2. The Problem
2.1. Name Mangling
2.2. Classes
3. The Solution
3.1. extern "C"
3.2. Loading Functions
3.3. Loading Classes
4. See Also
Bibliography

1. Introduction

A question which frequently arises among Unix C++ programmers is how to load C++ functions and classes dynamically using the dlopen API.

In fact, that is not always simple and needs some explanation. That's what this mini HOWTO does.

An average understanding of the C and C++ programming languages and of the dlopen API is necessary to understand this document.

This HOWTO's master location is http://www.isotton.com/howtos/C++-dlopen-mini-HOWTO/.


1.3. Credits / Contributors

In this document, I have the pleasure of acknowledging (in alphabetic order):


2. The Problem

At some time you might have to load a library (and use its functions) at runtime; this happens most often when you are writing some kind of plug-in or module architecture for your program.

In the C language, loading a library is very simple (calling dlopen, dlsym and dlclose is enough); with C++ this is a bit more complicated. The difficulties of loading a C++ library dynamically are partially due to name mangling, and partially due to the fact that the dlopen API was written with C in mind, thus not offering a suitable way to load classes.

Before explaining how to load libraries in C++, let's analyze the problem more closely by looking at name mangling. I recommend you read the explanation of name mangling even if you're not interested in it, because it will help you understand why problems occur and how to solve them.


2.1. Name Mangling

In every C++ program (or library, or object file), all non-static functions are represented in the binary file as symbols. These symbols are special text strings that uniquely identify a function in the program, library, or object file.

In C, the symbol name is the same as the function name: the symbol of strcpy will be strcpy, and so on. This is possible because in C no two non-static functions can have the same name.

Because C++ allows overloading (different functions with the same name but different arguments) and has many features C does not — like classes, member functions, exception specifications — it is not possible to simply use the function name as the symbol name. To solve that, C++ uses so-called name mangling, which transforms the function name and all the necessary information (like the number and size of the arguments) into some weird-looking string which only the compiler knows about. The mangled name of foo might look like foo@4%6^, for example.

One of the problems with name mangling is that the C++ standard (currently [ISO14882]) does not define how names have to be mangled; thus every compiler mangles names in its own way. Some compilers even change their name mangling algorithm between different versions (notably g++ 2.x and 3.x). Even if you worked out how your particular compiler mangles names (and would thus be able to load functions via dlsym), this would most probably work with your compiler only, and might already be broken with the next version.
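
To make mangled names a little more concrete, here is a minimal sketch. It assumes GCC or Clang on a platform that follows the Itanium C++ ABI (Linux, for example); other compilers mangle differently, which is exactly the problem described above. The helper name demangle is made up for illustration; abi::__cxa_demangle from <cxxabi.h> is the demangling routine that ABI provides.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Two overloads: under the Itanium C++ ABI (an assumption, see above)
// their symbols are _Z3fooi and _Z3food.
int foo(int x) { return x; }
double foo(double x) { return x; }

// extern "C" suppresses mangling: the symbol is simply "bar".
extern "C" int bar(int x) { return x; }

// Hypothetical helper: turn a mangled symbol back into a readable
// declaration. Returns the input unchanged if it cannot be demangled.
std::string demangle(const char* symbol) {
    int status = 0;
    char* readable = abi::__cxa_demangle(symbol, 0, 0, &status);
    if (status != 0 || !readable) {
        return symbol;
    }
    std::string result(readable);
    std::free(readable);  // the buffer was allocated with malloc
    return result;
}
```

Compiling this file with g++ -c and running nm on the object file should list the two foo symbols in mangled form next to a plain bar. With the same assumptions, demangle("_Z3fooi") gives back foo(int).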


3. The Solution


3.2. Loading Functions

In C++ functions are loaded just like in C, with dlsym. The functions you want to load must be qualified as extern "C" to avoid the symbol name being mangled.

The function hello is defined in hello.cpp as extern "C"; it is loaded in main.cpp with the dlsym call. The function must be qualified as extern "C" because otherwise we wouldn't know its symbol name.
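
The loading side might look roughly like the following. This is a minimal sketch, not the original main.cpp: it assumes the library exports a function declared as extern "C" void hello(), and the names hello_t and call_hello are made up for illustration.

```cpp
#include <dlfcn.h>
#include <iostream>

// Assumed signature of the exported function: extern "C" void hello();
typedef void (*hello_t)();

// Hypothetical helper: open the library at `path`, look up "hello",
// call it, and unload the library. Returns true on success.
bool call_hello(const char* path) {
    void* handle = dlopen(path, RTLD_LAZY);
    if (!handle) {
        std::cerr << "Cannot open library: " << dlerror() << '\n';
        return false;
    }

    dlerror();  // clear any stale error before calling dlsym
    hello_t hello = reinterpret_cast<hello_t>(dlsym(handle, "hello"));
    const char* error = dlerror();
    if (error || !hello) {
        std::cerr << "Cannot load symbol 'hello': "
                  << (error ? error : "symbol is null") << '\n';
        dlclose(handle);
        return false;
    }

    hello();          // the symbol name is known because it is not mangled
    dlclose(handle);  // unload only after the last call into the library
    return true;
}
```

A call such as call_hello("./hello.so") would then run hello, provided hello.so was built as a shared library (for example with g++ -shared -fPIC). On older glibc versions the program also has to be linked with -ldl.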


3.3. Loading Classes

Loading classes is a bit more difficult because we need an instance of a class, not just a pointer to a function.

We cannot create the instance of the class using new because the class is not defined in the executable, and because (under some circumstances) we don't even know its name.

The solution is achieved through polymorphism. We define a base, interface class with virtual members in the executable, and a derived, implementation class in the module. Generally the interface class is abstract (a class is abstract if it has pure virtual functions).

As dynamic loading of classes is generally used for plug-ins — which must expose a clearly defined interface — we would have had to define an interface and derived implementation classes anyway.

Next, while still in the module, we define two additional helper functions, known as class factory functions. One of these functions creates an instance of the class and returns a pointer to it. The other function takes a pointer to a class created by the factory and destroys it. These two functions are qualified as extern "C".

To use the class from the module, we load the two factory functions using dlsym just as we loaded the hello function; then we can create and destroy as many instances as we wish.

Example 2. Loading a Class

Here we use a generic polygon class as interface and the derived class triangle as implementation.

main.cpp:

#include "polygon.hpp"
#include <iostream>
#include <dlfcn.h>

int main() {
    using std::cout;
    using std::cerr;

    // load the triangle library
    void* triangle = dlopen("./triangle.so", RTLD_LAZY);
    if (!triangle) {
        cerr << "Cannot load library: " << dlerror() << '\n';
        return 1;
    }

    // load the symbols
    create_t* create_triangle = (create_t*) dlsym(triangle, "create");
    destroy_t* destroy_triangle = (destroy_t*) dlsym(triangle, "destroy");
    if (!create_triangle || !destroy_triangle) {
        cerr << "Cannot load symbols: " << dlerror() << '\n';
        return 1;
    }

    // create an instance of the class
    polygon* poly = create_triangle();

    // use the class
    poly->set_side_length(7);
    cout << "The area is: " << poly->area() << '\n';

    // destroy the class
    destroy_triangle(poly);

    // unload the triangle library
    dlclose(triangle);
}

polygon.hpp:

#ifndef POLYGON_HPP
#define POLYGON_HPP

class polygon {
protected:
    double side_length_;

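public:
    polygon()
        : side_length_(0) {}

    // virtual destructor, so that deleting a derived object
    // through a polygon* is well defined
    virtual ~polygon() {}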

    void set_side_length(double side_length) {
        side_length_ = side_length;
    }

    virtual double area() const = 0;
};

// the types of the class factories
typedef polygon* create_t();
typedef void destroy_t(polygon*);

#endif

triangle.cpp:

#include "polygon.hpp"
#include <cmath>

class triangle : public polygon {
public:
    virtual double area() const {
        // area of an equilateral triangle: side^2 * sqrt(3) / 4
        return side_length_ * side_length_ * std::sqrt(3.0) / 4;
    }
};


// the class factories

extern "C" polygon* create() {
    return new triangle;
}

extern "C" void destroy(polygon* p) {
    delete p;
}
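
One way to make sure that every object obtained from create is eventually handed back to destroy is to tie the two together with a smart pointer. The following is only a sketch under assumptions, not part of the example above: it needs C++11 for std::unique_ptr, uses a cut-down stand-in for polygon, and replaces the function pointers obtained from dlsym with two local functions so that it compiles on its own. The names polygon_ptr, make_polygon, unit_square, create_square, destroy_square and live_objects are made up for illustration.

```cpp
#include <memory>

// Cut-down stand-in for the interface in polygon.hpp.
class polygon {
public:
    virtual ~polygon() {}
    virtual double area() const = 0;
};

// The types of the class factories, as in polygon.hpp.
typedef polygon* create_t();
typedef void destroy_t(polygon*);

// Owning pointer whose deleter is the module's destroy function.
typedef std::unique_ptr<polygon, destroy_t*> polygon_ptr;

// Hypothetical helper: create an object and bind it to its destroyer.
polygon_ptr make_polygon(create_t* create, destroy_t* destroy) {
    return polygon_ptr(create(), destroy);
}

// Stand-ins for the functions a real program would get from dlsym.
int live_objects = 0;  // counts objects created but not yet destroyed

class unit_square : public polygon {
public:
    virtual double area() const { return 1.0; }
};

polygon* create_square() {
    ++live_objects;
    return new unit_square;
}

void destroy_square(polygon* p) {
    --live_objects;
    delete p;
}
```

In a real program the two arguments to make_polygon would be the pointers returned by dlsym. The smart pointer has to go out of scope before dlclose is called, because both the deleter and the object's code live in the module.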

There are a few things to note when loading classes:


4. See Also


Bibliography

ISO14882 ISO/IEC 14882:1998, The C++ Programming Language. Available as PDF and as printed book from http://webstore.ansi.org/.

STR2000 Bjarne Stroustrup, The C++ Programming Language, Special Edition. ISBN 0-201-70073-5. Addison-Wesley.