Libraries and dynamic linking

A long time ago, people realized that rewriting everything for every program was simply not worth the effort. Let’s go back and look at the first and simplest C program we wrote:

#include <stdio.h>

int main(int argc, char **argv)
{
    /* Print a message to the screen */
    printf("Hello World!");

    return 0;
}

Take a moment and think about how much has to happen between printf("Hello World!") and the words “Hello World!” actually showing up on your screen. Seriously, think about that for a second. At bare minimum, the letters have to be converted into pixel values and those pixel values have to be somehow communicated to the graphics hardware which then sends them to the display. In reality, it gets vastly more complicated than that.

In order to handle all of these details in a reasonable way, programs are built out of layers upon layers of abstraction. One major abstraction layer is the operating system, which handles most of the details of talking to the hardware. The operating system is (roughly) composed of three things: the kernel, which talks directly to the hardware; system libraries, which allow programs to talk to the kernel; and system programs, which provide most of the rest of the functionality, including the user interface. On top of the operating system, a program may use any number of other libraries to do anything from number theory to drawing 3-D graphics.

Linking

We will eventually talk more about specific libraries and how to use them. Before we do that, however, we should continue our previous discussion of the linking stage of compiling. When you compile a standard multi-file program, the compiler first takes the C source code files and compiles them to create object files. Then, as a separate stage, it takes those object files and combines them to form an executable. This last stage, called the linking stage, is where the linker actually computes memory addresses for all of the constant data and functions.

In the final compiled executable, functions and variables no longer have names in the same sense as they do in the source file. Instead, they are simply referred to by their addresses. When you call a function, the program doesn’t actually look the function up by name; instead, it simply knows the address of the first instruction in the function and tells the computer to go there. When the object files are linked together, the linker arranges the contents of the object files into an executable and then goes through and fills in all those addresses.

During the linking stage, you can also link libraries into your code. A library is really just a bunch of compiled code that is packaged together. Libraries come in two forms: static and dynamic. A static library (.lib on Windows, .a on Linux or macOS) is little more than a bunch of object files put together. (You can think of it as a zip file full of object files.) When a static library is linked into your program, the objects in the static library simply get added to the objects from your object files and the whole thing is then linked into an executable.

Dynamic libraries, on the other hand, are quite a bit more complicated. Instead of being linked into your program during the linking stage, dynamic libraries are linked into the program when it is executed. The dynamic library itself contains not only the compiled code but also a table of symbol names and addresses within the dynamic library. During the linking stage, the linker puts in a placeholder for each function and piece of constant data contained in the dynamic library. Then, when the program gets executed, the system loads the dynamic library into memory and finishes the linking process.

Libraries

Why would you want to use libraries?