Arrays and Strings

Thus far, we have only discussed variables that hold a single value. However, it is frequently useful to have a block of values that are grouped together. Two immediate examples of this are strings and matrices. A string (discussed more below) can be thought of as simply a list of characters. A matrix is a big block of (usually floating-point) numbers. In both of these cases, we would like to have a single variable that corresponds to the entire list or block of data. In C, there are two ways to group data together: arrays and structures. We will discuss arrays below and leave the discussion of structures for a later date.

Arrays

An array is a sequence of values of the same type that are stored sequentially in memory. To declare an array, you declare it just as you would a variable, only you put the number of items after the name in brackets ([]). For example, you could declare an array of 10 integers with the following line:

int arr[10];

To access the elements in the array, you use the [] (subscript) operator. If you wanted to access the fifth element in the array, you would write arr[4]. No, the number 4 there is not a typo: C array indices start at 0, not 1 as in some other languages. The first element of any C array is arr[0], and the last element of the 10-element array above is arr[9].
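As a minimal sketch of zero-based indexing, the function below fills a 10-element array and then reads its fifth element. The function name and the stored values are made up for illustration.

```c
/* Hypothetical demo: fill a 10-element array and return its fifth element. */
int fifth_element_demo(void)
{
    int arr[10];
    int i;

    /* Valid indices run from 0 to 9, so the loop stops before 10. */
    for (i = 0; i < 10; ++i)
        arr[i] = (i + 1) * 100;   /* arr[0] = 100, arr[1] = 200, ... */

    return arr[4];                /* the fifth element, which holds 500 */
}
```

Note that the loop condition is i < 10 rather than i <= 10; the latter would touch arr[10], which is one past the end of the array.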

The fact that arrays are stored sequentially in memory is important. This means that the above array of 10 integers consumes a single 40-byte block of memory (assuming a 32-bit integer). When you access an array with arr[N], the computer knows where arr[0] is stored and it knows that N * sizeof(int) is the offset in bytes from that location. We will discuss memory and offsets in detail later when we discuss pointers. The point here is that if you try to access arr[100] of the 10-element array above, the compiler will generally let you, because C does not check array bounds and the generated code works only with offsets. However, the behavior of such an access is undefined: the program may crash as soon as it tries because that element doesn’t exist, or it may silently read or corrupt unrelated memory. This is called an out-of-bounds access, and writing past the end of an array in this way is commonly called a buffer overflow.
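The size and offset arithmetic described here can be checked with the sizeof operator. The sketch below is illustrative only: all three function names are hypothetical, the -1 return value is an arbitrary choice for signalling a bad index, and the second function borrows a pointer cast ahead of the later discussion of pointers.

```c
#include <stddef.h>   /* size_t */

/* Number of elements in a 10-int array, computed from sizes in bytes. */
size_t element_count_demo(void)
{
    int arr[10];

    return sizeof(arr) / sizeof(arr[0]);   /* total bytes / bytes per element */
}

/* Distance in bytes from arr[0] to arr[4]; this equals 4 * sizeof(int). */
size_t byte_offset_demo(void)
{
    int arr[10];

    return (size_t)((char *)&arr[4] - (char *)&arr[0]);
}

/* Return arr[index] if index is in range, or -1 if it is not.
   -1 is an arbitrary "no such element" signal chosen for this sketch. */
int safe_read_demo(size_t index)
{
    int arr[10] = {0};   /* {0} sets every element to zero */
    size_t count = sizeof(arr) / sizeof(arr[0]);

    arr[9] = 42;         /* the last valid element */

    if (index >= count)
        return -1;       /* refuse instead of reading past the end */
    return arr[index];
}
```

Because C performs no bounds check of its own, a test like the one in safe_read_demo has to be written by the programmer wherever an index might be out of range.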

To demonstrate array semantics and show that arrays are useful, consider the following function, which adds up the elements of an array:

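/* Return the sum of the first num_values elements of values. */
int sum(const int *values, unsigned int num_values)   /* const: values is only read */
{
    int s;
    unsigned int i;   /* unsigned, to match the type of num_values */

    s = 0;
    for (i = 0; i < num_values; ++i)
        s += values[i];
    return s;
}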
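One way to call such a function is sketched below. The summing loop is repeated so that the example compiles on its own; the caller sum_demo and its sample data are made up for illustration.

```c
/* Same summing loop as in the text, repeated so this sketch is self-contained. */
int sum(const int *values, unsigned int num_values)
{
    int s = 0;
    unsigned int i;

    for (i = 0; i < num_values; ++i)
        s += values[i];
    return s;
}

/* Hypothetical caller: sums a small array of sample scores. */
int sum_demo(void)
{
    int scores[5] = {90, 85, 70, 100, 55};   /* made-up sample data */

    /* sizeof gives the element count here because scores is a real array
       in this scope; inside sum() only its address is visible, which is
       why the count is passed as a separate argument. */
    unsigned int count = (unsigned int)(sizeof(scores) / sizeof(scores[0]));

    return sum(scores, count);   /* 90 + 85 + 70 + 100 + 55 = 400 */
}
```

Passing a count larger than the real array size would make the loop read past the end of the array, so the caller is responsible for keeping the two in agreement.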

Strings

We have seen strings before, but only in the sense of passing them into the printf function. We can now discuss them in more detail. We said before that the purpose of the char type was to store a character. When we write on the page, our words are just a sequence of characters. Then what do you suppose a string is? That’s right, a string is stored as an array of char values. However, there is one more detail: the last element of the string is always the zero character, written '\0'. This allows us to always find the end of the string. In general, C gives us no way to find out the length of an array from the array alone, which is why the sum function above takes the number of values as a separate argument. By making \0 a special character and placing it at the end of every string, we can easily find the ends of strings and ensure that we don’t overflow. You can count the number of characters in a string (not including the terminating \0) with the following function:

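/* Named my_strlen so that it does not clash with the standard library's strlen. */
int my_strlen(const char *string)   /* const: the string is only read */
{
    int num;

    /* Walk the array until the terminating '\0' is found. */
    for (num = 0;; ++num) {
        if (string[num] == '\0') {
            return num;
        }
    }
}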
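To make the role of the terminator concrete, the sketch below builds a short string one character at a time and compares it with a string literal. The names count_chars, manual_string_demo, and literal_size_demo are hypothetical; count_chars uses the same counting logic as the function above under a name that cannot collide with the standard library.

```c
/* Hypothetical helper: counts characters up to, but not including, '\0'. */
int count_chars(const char *string)
{
    int num;

    for (num = 0; string[num] != '\0'; ++num)
        ;   /* keep walking until the terminator is reached */
    return num;
}

/* Build the string "cat" by hand to show where the '\0' goes. */
int manual_string_demo(void)
{
    char word[4];     /* 3 visible characters + 1 terminator */

    word[0] = 'c';
    word[1] = 'a';
    word[2] = 't';
    word[3] = '\0';   /* without this, count_chars would run off the end */

    return count_chars(word);   /* 3: the terminator is not counted */
}

/* For a string literal, the compiler appends the '\0' automatically. */
int literal_size_demo(void)
{
    char word[] = "cat";

    return (int)sizeof(word);   /* 4 bytes: 'c', 'a', 't', '\0' */
}
```

The array always needs one more element than the number of visible characters, which is an easy detail to forget when sizing a buffer for a string.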