Memory and resource management

Probably the biggest complaint I hear from people learning C is about memory management. They are confused about where all of those pointers are going and how to keep track of them all.

When writing small programs you can often be lazy about memory (and resource) management because small programs don’t acquire many resources and often don’t run long enough for holding resources to become an issue. However, as programs grow, good resource management becomes both more difficult and more necessary. While it seems difficult at first, resource management isn’t nearly as bad as it looks. There are a number of very standard paradigms that have proven to work extremely well in the real world. While I will not claim to give a complete discussion of the topic, I will try to discuss the ones that I find to be the most useful in practice.

Before we go on, allow me to make a few definitions to aid our discussion. As a disclaimer, I do not know for sure if these terms (or the definitions I am about to give them) are standard in computer science. However, they will do for the current discussion.

Thus far, the only resources we have discussed are files and dynamically allocated memory, but there are many other things that fall into this category. The primary similarity between all of these is that they are acquired from the system and need to be released when they are no longer in use. There are a lot of different techniques for resource management, and the best way to handle a particular resource in your program depends on the nature of your program and of the resource in question. Here, I will attempt to give a decent overview of different techniques along with their pros and cons.

Before we go on to specific techniques, we should note that when a program exits, regardless of how, all of its resources are automatically released. Therefore, if your program crashes or has a fatal error, you don’t need to worry about releasing resources and you are free to simply let it close.

Ownership

When I was young and complained about having to clean my room, my mother would tell me, “If you just put your toys away when you were done playing with them, this would never be a problem.” While I never liked this piece of advice as a child, it turns out to be a very good way to think about resource management. This concept, that I will call ownership, is the most basic resource management concept and all of the others can be built on top of it.

The simplest form of ownership, which you could call non-transferable ownership, can be stated as follows: whatever entity acquires a resource is responsible for releasing it. We have already seen this form of ownership in a couple of places. For one, it is implicit in the concept of a call stack. When a function gets called, its local variables get “created” and when the function returns, its local variables get “destroyed”. Technically, nothing is being dynamically allocated and deallocated, but there is still this concept of when a variable exists and is usable. Another place where we see an implicit non-transferable ownership relationship is with the fields of a structure. Instead of thinking of a structure as a block of data that’s divided up, it is often useful to think of it as its own entity that owns its members. When the structure gets destroyed, so do the members.

Let’s look at an example of this model. Consider the following function that reads a matrix from a file. (Don’t worry too much about the matrix structure for now; we will fill it out later.)

:::c
#include <stdio.h>
#include <errno.h>
#include <stdint.h>

struct matrix {
    unsigned int rows;
    unsigned int cols;
    size_t alloc_size;
    double *data;
};

int matrix_resize(struct matrix *mat, unsigned int rows, unsigned int cols);

int
matrix_read_from_file(struct matrix *mat, char *fname)
{
    uint32_t rows, cols;
    size_t size;
    FILE *file;

    file = fopen(fname, "rb");  /* binary data, so open in binary mode */
    if (file == NULL)
        return -1;

    size = fread(&rows, sizeof(rows), 1, file);
    if (size < 1) {     /* fread returns an element count, not bytes */
        if (! ferror(file)) {
            errno = EIO;
        }
        fclose(file);
        return -1;
    }

    size = fread(&cols, sizeof(cols), 1, file);
    if (size < 1) {
        if (! ferror(file)) {
            errno = EIO;
        }
        fclose(file);
        return -1;
    }

    if (matrix_resize(mat, rows, cols) < 0) {
        fclose(file);
        return -1;
    }

    size = fread(mat->data, sizeof(*mat->data), rows * cols, file);
    if (size < rows * cols) {
        if (! ferror(file)) {
            errno = EIO;
        }
        fclose(file);
        return -1;
    }

    fclose(file);

    return 0;
}

The thing to notice about the above example is that the matrix_read_from_file function both opens the file and closes it. We could have matrix_read_from_file take a FILE * instead of a file name and, in that case, the calling function would be responsible for opening and closing the file. By putting all of the open/close logic in the same function, it is much easier to ensure that it is correct and that the file always gets closed. It also makes it easier to write code that uses that function because you don’t need extra logic to figure out whether or not you have to close the file afterwards.

The other thing to notice in the above example is that the file is always closed, regardless of whether the operation was successful or not. Resource management is very closely tied to error handling in this way. You have to make sure that any resources that are acquired are released if there is an error. The errors in the case above are likely not fatal errors. (For example, you may want to just skip that file and continue when there is an error.) Therefore, if we don’t properly clean up our open files, they will stay open for the duration of the program.
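
One common way to guarantee this in C, which I did not use in the listing above, is to funnel every error path through a single cleanup label with goto so that the fclose call appears in exactly one place. Here is a minimal sketch of the same function restructured that way; the errno handling is omitted for brevity and the function name is only illustrative.

:::c
int
matrix_read_from_file_goto(struct matrix *mat, char *fname)
{
    uint32_t rows, cols;
    int ret = -1;
    FILE *file;

    file = fopen(fname, "rb");
    if (file == NULL)
        return -1;

    if (fread(&rows, sizeof(rows), 1, file) < 1)
        goto cleanup;
    if (fread(&cols, sizeof(cols), 1, file) < 1)
        goto cleanup;
    if (matrix_resize(mat, rows, cols) < 0)
        goto cleanup;
    if (fread(mat->data, sizeof(*mat->data), rows * cols, file) < rows * cols)
        goto cleanup;

    ret = 0;    /* only reached if every read succeeded */

cleanup:
    fclose(file);   /* the file is closed on every path */
    return ret;
}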

Let us consider another example. We will fill out the matrix structure in the above example by adding a few functions for dealing with that data field.

:::c
#include <stdlib.h>
#include <errno.h>
#include <stdint.h>

struct matrix {
    unsigned int rows;
    unsigned int cols;
    size_t alloc_size;
    double *data;
};

/* Initializes an empty matrix */
void
matrix_init(struct matrix *mat)
{
    mat->rows = 0;
    mat->cols = 0;
    mat->alloc_size = 0;
    mat->data = NULL;
}

/* Initializes a matrix with a given size */
int
matrix_init_size(struct matrix *mat, unsigned int rows, unsigned int cols)
{
    mat->rows = rows;
    mat->cols = cols;
    mat->alloc_size = rows * cols;

    mat->data = malloc(mat->alloc_size * sizeof(*mat->data));
    if (mat->data == NULL) {
        errno = ENOMEM;
        return -1;
    }

    return 0;
}

/* Resizes the matrix.  The original matrix data may not be left intact */
int
matrix_resize(struct matrix *mat, unsigned int rows, unsigned int cols)
{
    double *new_data;

    if (mat->alloc_size < rows * cols || rows * cols < mat->alloc_size / 4) {
        /* Only reallocate if it is too small or way too big */

        mat->alloc_size = rows * cols;
        free(mat->data);
        mat->data = malloc(mat->alloc_size * sizeof(*mat->data));
        if (mat->data == NULL) {
            mat->alloc_size = 0;
            errno = ENOMEM;
            return -1;
        }
    }

    mat->rows = rows;
    mat->cols = cols;

    return 0;
}

void
matrix_destroy(struct matrix *mat)
{
    free(mat->data);
}

In the above example, the matrix owns the data pointed to by its data field. The matrix itself may be the child of some other object. Even though you may have a local variable of type struct matrix, it still has to be acquired by one of the matrix_init functions and released by the matrix_destroy function. When the matrix is acquired/released, it handles acquiring/releasing the dynamically allocated memory used to store its entries. This way, all of the logic required to handle memory allocation for the internals of a matrix occurs in one place and you don’t have to worry about it anywhere else. When dealing with structures that own other structures, it is often useful to invoke family tree terminology, as we will see shortly.

Along with the concept of ownership is the idea that ownership can, in certain cases, be transferred. The key rule of this transferable model of ownership is that, if you decide to transfer ownership, the transfer must be clear and well documented.

One example of ownership transfer is a return value. Say, for instance, that you have a function that returns a pointer to a structure. In this case, it is likely that the calling function will be responsible for freeing that pointer. This concept of transferring ownership is something you have already seen: the malloc and free functions have an implicit transfer of ownership from the standard library to your code and back.
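
To make the direction of the transfer concrete, here is a sketch of a pair of helpers built on top of the matrix_init_size and matrix_destroy functions defined above. Neither matrix_new nor matrix_free is part of the matrix code in this section; the names are only illustrative. The pointer returned by matrix_new is owned by the caller, and the caller hands ownership back by calling matrix_free.

:::c
/* A hypothetical constructor: the pointer it returns is owned by the
 * caller, who must eventually pass it to matrix_free(). */
struct matrix *
matrix_new(unsigned int rows, unsigned int cols)
{
    struct matrix *mat;

    mat = malloc(sizeof(*mat));
    if (mat == NULL) {
        errno = ENOMEM;
        return NULL;
    }

    if (matrix_init_size(mat, rows, cols) < 0) {
        free(mat);          /* still ours, so we clean it up */
        return NULL;
    }

    return mat;             /* ownership transfers to the caller here */
}

/* The matching release function: the caller gives ownership back. */
void
matrix_free(struct matrix *mat)
{
    matrix_destroy(mat);
    free(mat);
}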

When transferring ownership from the calling function to the callee, you need to be very careful. You can transfer ownership in this direction, but you almost always want it to be unconditional. By unconditional, I mean that ownership of the resource is always transferred to the callee regardless of how the function is called and whether or not it is successful. You want to avoid cases where there is any ambiguity over who owns a particular resource. It is tempting, for instance, to write a file I/O function that takes a FILE * and closes it only if there is an error. The problem here is that, not only do you have resource acquiring and releasing code in two different places, but you also have to make the two work together in a complex way. While it may seem like a good idea at the time, this extra logic is extremely error-prone. Instead, if you pass a FILE * into a function, then the called function should either leave it open or take ownership and always close it.
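
As a sketch of the second option, here is a hypothetical wrapper that takes ownership of the stream unconditionally. The matrix_read_stream helper is an assumption for the sake of the example: a function that reads a matrix from an already-open stream and leaves the stream open.

:::c
/* Hypothetical wrapper that takes ownership of the FILE * it is given.
 * Whether the read succeeds or fails, the stream is always closed here,
 * so the caller never has to guess who owns it. */
int
matrix_read_and_close(struct matrix *mat, FILE *file)
{
    int ret;

    ret = matrix_read_stream(mat, file);    /* assumed helper: reads from an
                                               already-open stream and does
                                               not close it */
    fclose(file);                           /* we own the stream, so we
                                               always close it */
    return ret;
}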

Tree structure (parent/child)

Another common way of thinking about ownership is with the parent/child or tree structure. In the tree structure, each object (usually a structure) can have any number of children. If A is a child of B then we say that B is the parent of A. In this setting, the parent is responsible for releasing its children when it gets released. The primary difference between this and ownership is one of terminology. Usually the tree terminology is reserved for structures (and objects in an object oriented programming setting).

There are two primary advantages to the tree terminology. The first is that it allows you to think less in terms of acquiring and releasing and more in terms of relationships between the data. This can greatly simplify tracking ownership in many cases. For example, suppose you were writing a program that could interpret and evaluate mathematical expressions. If we were given the expression 5(3 + 4)/7, we might store this expression in the following tree:

Tree representation of the expression 5(3+4)/7

When parsing 5(3 + 4)/7, we could have one function that parses the expression and then hands back its representation. Instead of thinking about ownership in terms of the calling function receiving ownership of each of those objects (numbers, products, sums, etc.), we can think of each element in the tree owning its children and the calling function receiving ownership of the top-most element in the tree. There are many other cases where we see trees. A few more examples are scenegraphs in computer graphics; document trees when parsing HTML, XML, or other document formats; or just any time you have structures inside of other structures.

A second advantage to the tree terminology is that it makes explicit the nature of the ownership graph. Specifically, the ownership graph in this setting is a directed graph where each vertex has indegree at most 1 and whose underlying simple graph is acyclic. In other words, each node only has one parent and nobody is their own grandfather. This is very important because it means that we can clean up an entire tree of resources by simply releasing the parent. In the above example, the owner of the tree simply needs to release the N/D node and the entire tree will get released.
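
As a sketch of what this looks like in code, here is a hypothetical node type for the expression tree above. The names and layout are illustrative rather than taken from a real parser, but the important point is that releasing a node releases its children first.

:::c
#include <stdlib.h>

/* A hypothetical expression-tree node.  Each node owns its children, so
 * releasing the root releases the whole tree. */
enum expr_type { EXPR_NUMBER, EXPR_SUM, EXPR_PRODUCT, EXPR_QUOTIENT };

struct expr_node {
    enum expr_type type;
    double value;               /* used when type == EXPR_NUMBER */
    struct expr_node *left;     /* children owned by this node */
    struct expr_node *right;
};

void
expr_free(struct expr_node *node)
{
    if (node == NULL)
        return;

    /* Release the children first (they may have children of their own),
     * then the node itself. */
    expr_free(node->left);
    expr_free(node->right);
    free(node);
}

With this layout, whoever built the tree for 5(3 + 4)/7 only has to call expr_free on the top-most node; every sum, product, and number underneath it is released along the way.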

Reference counting and copy-on-write

Reference counting is a more general memory management technique than trees or direct ownership. This added generality and power comes at the cost of being more complicated. A reference counted resource is one that keeps track of how many references to it exist and, when that count hits zero, is automatically released.

There are two different ways you can think about reference counting. One is to consider a reference as a resource in its own right that gets acquired and released. In this case, the reference count corresponds to the number of currently active references. The count hitting zero indicates that no more references exist, so the resource can’t be accessed anymore and can therefore be released. A second way to look at it is from a graph theory perspective. If you consider references as directed edges going from whatever holds the reference to the resource, then the reference count is the indegree of the resource vertex. As long as your reference graph is acyclic, this allows you to very easily determine whether a connected component is inaccessible from the rest of the graph and release the entire component if it is.

Before I go on to an example of reference counting, it should be noted that you will probably never have to implement it yourself. That said, many libraries use reference counting internally, so it is good for you to know about it. Usually reference counting is used when you want to have a lot of references to a resource without all of the ownership snags associated with multiple references. Some objects, such as a large matrix, are too heavy to be constantly recreating or copying around. You also don’t want more copies than necessary of a 10,000x10,000 matrix: at 8 bytes per double, each copy takes about 800 MB, and you don’t have enough RAM to store a lot of them. The obvious answer to this would be to simply create the matrix on the heap and pass pointers to it around. However, this creates the problem of whose job it is to release it. Using reference counting gives you the freedom to create multiple references while making sure that the resource gets released when your program is finished with it.

Let’s re-implement our matrix structure from above only with reference-counting in the back-end.

:::c
#include <stdlib.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct matrix_data {
    size_t alloc_size;
    unsigned int ref_count;
    double *data;
};

struct matrix {
    unsigned int rows;
    unsigned int cols;
    struct matrix_data *data;
};

struct matrix_data *
matrix_data_alloc(size_t elements)
{
    struct matrix_data *data;

    data = malloc(sizeof(*data));
    if (data == NULL) {
        errno = ENOMEM;
        return NULL;
    }

    data->ref_count = 1;    /* the caller holds the first reference */
    data->alloc_size = elements;

    data->data = malloc(data->alloc_size * sizeof(*data->data));
    if (data->data == NULL) {
        free(data);
        errno = ENOMEM;
        return NULL;
    }

    return data;
}

/* resize a matrix_data structure */
int
matrix_data_resize(struct matrix_data *data, size_t elements)
{
    if (data->alloc_size < elements || elements < data->alloc_size / 4) {
        /* Only reallocate if it is too small or way too big */

        data->alloc_size = elements;
        free(data->data);
        data->data = malloc(data->alloc_size * sizeof(*data->data));
        if (data->data == NULL) {
            data->alloc_size = 0;
            errno = ENOMEM;
            return -1;
        }
    }

    return 0;
}

/* Decrements the matrix data reference count and releases it if needed */
void
matrix_data_decref(struct matrix_data *data)
{
    if (data->ref_count > 1) {
        data->ref_count--;
    } else {
        free(data->data);
        free(data);
    }
}

/* Initializes an empty matrix */
void
matrix_init(struct matrix *mat)
{
    mat->rows = 0;
    mat->cols = 0;
    mat->data = NULL;
}

/* Initializes a matrix with a given size */
int
matrix_init_size(struct matrix *mat, unsigned int rows, unsigned int cols)
{
    mat->rows = rows;
    mat->cols = cols;
    mat->data = matrix_data_alloc(rows * cols);
    if (mat->data == NULL) {
        return -1;
    }

    return 0;
}

void
matrix_copy(struct matrix *dest, struct matrix *src)
{
    if (src->data != NULL) {
        src->data->ref_count++;
    }
    if (dest->data != NULL) {
        matrix_data_decref(dest->data);
    }

    dest->data = src->data;
    dest->rows = src->rows;
    dest->cols = src->cols;
}

int
matrix_deep_copy(struct matrix *dest, struct matrix *src)
{
    if (matrix_resize(dest, src->rows, src->cols) < 0) {
        return -1;
    }

    if (src->rows == 0 || src->cols == 0) {
        /* nothing to copy */
        return 0;
    }

    memcpy(dest->data->data, src->data->data,
            src->rows * src->cols * sizeof(*src->data->data));

    return 0;
}

void
matrix_destroy(struct matrix *mat)
{
    if (mat->data != NULL) {
        matrix_data_decref(mat->data);
    }
}

The above example code looks very similar to our original matrix code. The main difference is that we now have a secondary structure matrix_data that actually stores the giant array of double. Instead of actually destroying the data in matrix_destroy we just decrement the reference count. When the reference count gets decremented by matrix_data_decref, it checks to see if the reference count hits zero and, if it does, destroys the matrix data.

Copy-on-write

While reference counting is powerful, what is even more powerful is the concept of copy-on-write. The idea behind copy-on-write is to leverage the power of reference counting to avoid copying unless absolutely needed. This is done by having reference-counted data and then checking the reference count before anything is done that would modify the data. If the operation being performed is a “write” operation (one that modifies the data) and the reference count is above one, then it makes a copy and modifies the copy. This way you can copy the structure and perform read-only operations as much as you want without actually copying the data; however, if you ever modify the data, it ensures that no other references are affected. In this way, a copy-on-write structure “feels” like each individual reference is its own copy.

As an example, let’s modify our matrix structure again. Because of the way we constructed the structure, we really only need to modify the matrix_resize function to support copy-on-write.

:::c
/* Resizes the matrix.  The original matrix data may not be left intact */
int
matrix_resize(struct matrix *mat, unsigned int rows, unsigned int cols)
{
    if (rows == 0 || cols == 0) {
        /* If the new size would be 0, just remove the matrix_data */
        if (mat->data != NULL) {
            matrix_data_decref(mat->data);
            mat->data = NULL;
        }
    } else if (mat->data == NULL) {
        /* If we don't have a matrix_data, make a new one */
        mat->data = matrix_data_alloc(rows * cols);
        if (mat->data == NULL) {
            return -1;
        }
    } else if (mat->data->ref_count > 1) {
        /* We are not the only reference, so we need our own data */
        struct matrix_data *new_data;

        new_data = matrix_data_alloc(rows * cols);
        if (new_data == NULL) {
            return -1;
        }

        matrix_data_decref(mat->data);
        mat->data = new_data;
    } else {
        /* Finally, if we are in this case, we can just resize */
        if (matrix_data_resize(mat->data, rows * cols) < 0) {
            return -1;
        }
    }

    mat->rows = rows;
    mat->cols = cols;

    return 0;
}

Because the matrix_read_from_file function calls matrix_resize, it does not need to be modified; matrix_resize will handle making a copy if needed.

Assuming we had another function called matrix_multiply, we could work with our new matrix structure as follows:

:::c
struct matrix A;
struct matrix B;

matrix_init(&A);
matrix_init(&B);

matrix_read_from_file(&B, fname);
matrix_copy(&A, &B);

/* Now A and B both reference the same data */

matrix_multiply(&B, &A, &A);

/* At this point, B == A^2 */

matrix_destroy(&A);
matrix_destroy(&B);

Pitfalls of reference counting

Before you decide that reference counting is the solution to everything, you need to be aware of a few pitfalls. While reference counting is powerful, it is also dangerous if not used carefully. When working with, or considering using, a reference-counted system, you need to consider a few things.

  1. It is often unnecessary and only adds complexity. Much of the time, a simple ownership model is sufficient and adding reference counting only makes things more complicated.

  2. It can be unpredictable and can make it harder to find bugs. In a simple ownership (or even tree) model, it is fairly easy to track memory leaks back to the source because every acquire is paired with a release (or with the assignment of a parent). With reference counting, resources are released in a different place than they are acquired, so the missing release could be anywhere in your code.

  3. If copy-on-write is used, it can actually lead to more copies, not fewer. When using a reference-counted resource, it can be tempting to hold on to extra references because they “don’t cost anything”. However, not releasing those references may mean that some other piece of code performs an unneeded copy. For example, suppose you have two matrices A and B that are both references to the same 10,000 by 10,000 matrix. Let’s say that, at some point in your program, you are no longer using A but never released it. If you then change B in a copy-on-write system, a copy will be made and now you have two copies of the 10,000 by 10,000 matrix when only one is needed. While this sounds easy to avoid, it is often more subtle than it looks. For this reason, many reference-counted systems avoid copy-on-write and force the user to make explicit copies.

  4. Possibly the biggest issue with reference counting is cycles. At the beginning of the discussion of reference counting, we considered the references as forming a directed acyclic graph. If the reference graph ever has a cycle, you can end up with a component of the graph that is disconnected from the rest of your program but in which every resource has at least one reference. When this happens, you get a memory leak that is almost impossible to find. One partial solution to this is the concept of a weak reference which we will not discuss in detail here. A simpler solution is to ensure that all reference-counted resources are sinks; i.e. they do not hold references to anything. While it solves the problem, this significantly reduces the power of reference counting.
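
To see why cycles are so troublesome, consider the following minimal (and entirely hypothetical) sketch of two reference-counted nodes that point at each other. After we drop our own references, each node still holds a reference to the other, so neither count ever reaches zero and both leak.

:::c
#include <stdlib.h>

/* Hypothetical reference-counted node that can hold a counted reference
 * to one other node. */
struct node {
    unsigned int ref_count;
    struct node *other;
};

struct node *
node_new(void)
{
    struct node *n = calloc(1, sizeof(*n));
    if (n != NULL)
        n->ref_count = 1;   /* the creator holds the first reference */
    return n;
}

void
node_decref(struct node *n)
{
    if (n == NULL)
        return;
    if (--n->ref_count == 0) {
        node_decref(n->other);  /* release our reference to the other node */
        free(n);
    }
}

void
leak_example(void)
{
    /* Error checks omitted for brevity */
    struct node *a = node_new();
    struct node *b = node_new();

    a->other = b; b->ref_count++;   /* a holds a reference to b */
    b->other = a; a->ref_count++;   /* b holds a reference to a: a cycle */

    node_decref(a);     /* a's count drops to 1 (b still references it) */
    node_decref(b);     /* b's count drops to 1 (a still references it) */

    /* Neither count ever reaches zero, so both nodes leak even though
     * nothing in the program can reach them anymore. */
}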

Garbage Collection

Many languages, such as Python and Java, use a garbage collector. A garbage collector examines the reference graph and, when it encounters pieces that have become disconnected from the rest of the program, automatically releases them. This allows the language to remove most of the burden of memory management from the programmer (in most languages you still have to manually release things like files). This can be nice because it allows the programmer to be lazy and not worry about the details of memory management. It can also reduce memory leaks because you are trusting a properly implemented garbage collector instead of every single piece of a given program.

While garbage collectors are nice, they have their downsides as well. One is that they are frequently less efficient than well-done memory management via one of the other methods described above. (In multi-threaded applications, they can get better performance in some circumstances.) Also, garbage collection is non-deterministic in that things do not necessarily get released right away but instead wait until the garbage collector runs. When we get to C++, we will make very good use of deterministic destruction. Another issue is that, when the garbage collector runs, it stalls the entire program. While this isn’t an issue for most things, it can be a problem in high-performance or real-time applications.