Linking and working with multiple files

As your programming projects grow larger, they will contain more and more lines of code, and keeping everything organized can become a challenge. Fortunately, thanks to the linking stage, you can break your program into multiple files. Splitting the project into one file per logical component keeps the code organized and makes the program easier to write.

Go read the Wikipedia article on linkers.

Writing programs with multiple files

For the sake of example, consider the following program that computes the factorial of a number:

/* factorial.c: Contains the factorial function */

int factorial(int n)
{
    if (n < 1) {
        return 1;
    } else {
        return n * factorial(n - 1);
    }
}

/* main.c: Contains the main function that calls the factorial function */
#include <stdio.h>
#include <stdlib.h>  /* declares atoi */

int factorial(int i);

int main(int argc, char **argv)
{
    int n, fctl;

    if (argc <= 1) {
        fprintf(stderr, "Not enough arguments\n");
        return 1;
    }
    n = atoi(argv[1]);

    fctl = factorial(n);

    printf("%d\n", fctl);

    return 0;
}

The factorial.c file is fairly simple: it contains a nice little recursive factorial function. Most of the interesting stuff is in main.c. Consider the first few lines of the main function:

if (argc <= 1) {
    fprintf(stderr, "Not enough arguments\n");
    return 1;
}
n = atoi(argv[1]);

This is a simple example of argument parsing. We have seen many examples of arguments to command-line programs but haven’t talked much about how to make your own program accept them. The standard main function takes two parameters: an int argument count (argc) and a char ** array of argument strings (argv). Every command has at least one argument, argv[0], which is the name of the executable. In the case above, we first check how many arguments we have. If the user did not provide enough, we print a short error message and return a non-zero value (a non-zero return from main indicates an error). Otherwise, we take the second argument, argv[1], which should be a number, and use the standard library function atoi (declared in stdlib.h) to convert it to an integer.

Another thing we have yet to see is the following line:

int factorial(int i);

This is called a function prototype. Normally, at any given point in the file, the compiler only knows about things declared earlier in the current file. If you try to use a function that has not yet been declared, older C compilers guess the argument and return types and may not get them correct; C99 and later require a declaration, so modern compilers emit a warning or an error. The prototype tells the compiler that there is a function called factorial defined somewhere and what its return type and arguments are. The parameter name in a prototype is only documentation, which is why it can be i here even though the definition uses n. While the prototype does not actually define the function, it does give the compiler enough information to handle the function call. This way the function can be defined lower down in the file or in a different file entirely.

Compiling programs with multiple files

The next thing to discuss is how we compile this little program. The simplest way to compile a project with multiple files is to tell GCC about all of the files at once:

$ gcc -o factorial main.c factorial.c

While this works, it becomes impractical once you have a lot of files. It also requires that you rebuild the entire project every time you make one little change. That is not a problem for small projects, but it can take a lot of time once your projects get large. The way around this is to take advantage of the linker:

$ gcc -c factorial.c
$ gcc -c main.c
$ gcc -o factorial main.o factorial.o

The first two commands tell GCC to compile each C source file into an object file (factorial.o and main.o); the -c flag means "compile, but do not link". An object file contains the compiled version of the code in a format that the linker can piece together into an actual executable later. The third line is the actual linking command. On some systems, the linker is a separate command; with the GNU toolchain, the gcc command handles both steps and runs the linker (ld) for you when it is given object files.

While compiling the source files into object files first looks more complicated, it’s not that bad. This method has the advantage that, if you change something in your code, you only have to recompile the source files that have changed and then relink. Since compiling is more time-consuming than linking, this can save a lot of time in big projects. Eventually, we will learn about a tool called make that allows you to automate the process of building from multiple files.