Whenever you write code that does more than direct calculations, you will have to handle error cases. This includes anything that interacts with a user, parses command-line arguments, reads from a file, or even allocates dynamic memory. Consider the following example:
/* add.c: Adds two numbers and prints the result to standard output */
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int i, j;

    i = atoi(argv[1]);
    j = atoi(argv[2]);
    printf("%d\n", i + j);
    return 0;
}
This looks very simple, but a surprising number of things can go wrong. What happens if the user doesn’t give you two arguments? What happens if they give you more than two? What if one of the arguments isn’t a number? Each of these counts as an exceptional case. When dealing with anything from outside your program, you have to assume that anything can happen and handle all of those cases. A better version of the above example would look something like this:
/* sum.c: An improved version of add.c */
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int i;
    long int sum;
    char *endptr;

    /* argv[0] is the program name, so two arguments means argc == 3 */
    if (argc <= 2) {
        fprintf(stderr, "At least two arguments are required for a sum, ");
        fprintf(stderr, "%d given.\n", argc - 1);
        return 3;
    }
    sum = 0;
    for (i = 1; i < argc; i++) {
        endptr = argv[i];
        sum += strtol(argv[i], &endptr, 10);
        if (endptr == argv[i] || *endptr != '\0') {
            /* no digits were found, or junk follows the number ("12abc") */
            fprintf(stderr, "Invalid Argument: %s\n", argv[i]);
            return 3;
        }
    }
    printf("%ld\n", sum);
    return 0;
}
As you can see, we check two basic things. The first is a simple check on the number of arguments. Then, as we loop through the arguments, we check that each one is an actual integer. If either check fails, we simply print an error message to stderr and exit with a non-zero exit status.
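The per-argument check can be pulled out into a small helper. The sketch below is only one way to do it, and parse_long is a hypothetical name, not part of the original program. It rejects empty input, trailing characters, and values that do not fit in a long, which strtol reports by setting errno to ERANGE.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not part of sum.c): parses text as a base-10 long.
 * Returns 0 and stores the value in *out on success. Returns -1 if text
 * has no digits, has trailing characters, or does not fit in a long. */
int parse_long(const char *text, long *out)
{
    char *endptr;
    long value;

    errno = 0;                      /* strtol only sets errno on failure */
    value = strtol(text, &endptr, 10);
    if (endptr == text || *endptr != '\0') {
        return -1;                  /* no digits at all, or trailing junk */
    }
    if (errno == ERANGE) {
        return -1;                  /* value was clamped to LONG_MIN/LONG_MAX */
    }
    *out = value;
    return 0;
}
```

With a helper like this, the loop body reduces to a single call followed by a check of its return value.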
The first thing to consider about any exceptional case is: What is an appropriate response? For many errors, particularly in command-line software, the best response is simply to give the user an error message and quit. These are referred to as fatal errors. For example, if malloc returns NULL, the system has most likely run out of memory and there is probably nothing you can do. There are very few times that you need to “nicely” handle an out-of-memory error. On the other hand, suppose your program is processing a series of files one at a time. What should you do if one of the files fails to open or if there is an error processing it? In this case, you may want to print an error letting the user know that one of the files failed and then continue with the remaining files. It’s also possible that the best response is simply to print an error and die, canceling the entire operation.
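As a rough sketch of the "report and keep going" response, the function below counts the lines in each file it is given. The function name and the line-counting task are made up for illustration. A file that fails to open is reported on stderr and skipped, and the caller gets back the number of failures so it can still choose a non-zero exit status at the end.

```c
#include <stdio.h>

/* Hypothetical example: counts the lines in each file in turn.
 * A file that cannot be opened is reported and skipped rather than
 * ending the program. Returns the number of files that failed. */
int count_lines_in_files(int count, char **paths)
{
    int i, c, failures = 0;
    long lines;
    FILE *fp;

    for (i = 0; i < count; i++) {
        fp = fopen(paths[i], "r");
        if (fp == NULL) {
            /* non-fatal: tell the user and move on to the next file */
            fprintf(stderr, "Could not open %s, skipping it\n", paths[i]);
            failures++;
            continue;
        }
        lines = 0;
        while ((c = fgetc(fp)) != EOF) {
            if (c == '\n') {
                lines++;
            }
        }
        printf("%s: %ld lines\n", paths[i], lines);
        fclose(fp);
    }
    return failures;
}
```

Turning this into the "print an error and die" variant only means replacing the continue with a return or an exit.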
The proper response to an error depends on the type of the error, the type of program, and how the user interacts with it. Programs that have graphical interfaces should almost never have fatal errors. This is because users would much rather see a nice little error box than have the program crash. On the other hand, in a command-line program that is not meant to be interactive, fatal errors are much more acceptable. Suppose the only way to pass a filename to your program is via a command-line argument. Then, if the file fails to open, the best thing to do is probably to simply tell the user and quit.
Errors that need to be handled gracefully are a lot more difficult to deal with than errors that should simply kill the program. This is because, unlike in our simple example above, errors can occur deep inside the program far away from the main function. In this case, each function has to detect the error and either do something with it or clean up and tell the function that called it that there was an error.
Whenever you have an error that doesn’t just kill the program, you have to have some way of propagating it up the call stack. For example, suppose you have a function that opens a file. If the file fails to open, say, because it doesn’t exist, then you may want to tell the user and allow them to pick a different one. The problem is that the function that actually opens the file may be several function calls away from the point where the user is interacting with the program. Instead of handling the file open error and interacting with the user in a function that’s primarily I/O focused, it makes more sense to propagate the error back to the function interacting with the user. This way the program can ask the user for a new file and start the process over again.
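The following sketch shows this kind of propagation with three layers. All of the function names are hypothetical. The lowest layer only does I/O and reports failure through its return value, the middle layer passes the failure along, and only the top layer decides what to tell the user. To keep the example short, the top layer tries a list of candidate paths instead of prompting interactively.

```c
#include <stdio.h>

/* Lowest level: only does I/O. Returns 0 on success, -1 on failure.
 * It does not print anything; that decision belongs to the caller. */
int read_first_byte(const char *path, int *byte)
{
    FILE *fp = fopen(path, "rb");

    if (fp == NULL) {
        return -1;
    }
    *byte = fgetc(fp);
    fclose(fp);
    return (*byte == EOF) ? -1 : 0;
}

/* Middle level: passes any failure from below up to its own caller. */
int load_document(const char *path, int *first_byte)
{
    if (read_first_byte(path, first_byte) < 0) {
        return -1;
    }
    /* further processing of the document would go here */
    return 0;
}

/* Top level: the only place that talks to the user. Tries each candidate
 * in turn and returns the index of the first one that loads, or -1. */
int open_first_usable(char **candidates, int count)
{
    int i, first_byte;

    for (i = 0; i < count; i++) {
        if (load_document(candidates[i], &first_byte) == 0) {
            return i;
        }
        fprintf(stderr, "Could not load %s, trying the next file\n",
                candidates[i]);
    }
    return -1;
}
```

In an interactive program, the loop in open_first_usable would be replaced by a prompt asking the user for another file.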
People have come up with a number of different solutions to the error handling problem. Most object-oriented languages such as Java, C++, and C# have a concept of an exception that corresponds to an exceptional condition in the code and the language takes care of propagating the error up until the point where it is “caught”. The C language does not have an explicit exception concept. Instead, you have to manually propagate errors. While this can look like a lot of work, it isn’t usually too bad.
With any error, there are usually two pieces of information: that an error occurred, and what happened. Sometimes these are combined into a single error code, with one value reserved to mean that there was no error. The exact way that errors are handled will vary from library to library or program to program. In parts of the C standard library, and in most system calls, errors are handled with a combination of return values and errno. Usually, this is done by having an integer return type and returning zero or a positive number in the standard case and a negative number (usually -1) if there is an error. In the case of an error, many functions will also set the errno variable (defined in errno.h) to indicate the nature of the error.
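To make the convention concrete, here is a small sketch built on the POSIX open system call, so it assumes a POSIX system. The try_open wrapper is a made-up name. open returns -1 on failure and sets errno, and strerror turns that errno value into a readable message.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: tries to open path read-only with the POSIX
 * open() system call. Returns 0 on success, or the errno value that
 * open() reported on failure. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;  /* copy it: later library calls may change errno */
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

Copying errno into a local variable right away matters because any later library call, including fprintf, is allowed to overwrite it.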
In our array list example from before, for instance, we could change the data_array_list_add_array function to be more error-safe as follows:
#include <errno.h>   /* errno, ENOMEM */
#include <stdlib.h>  /* malloc, free */

int
data_array_list_add_array(struct data_array_list *list,
        struct data_struct *data, size_t num_elements)
{
    size_t new_alloc, new_size, i;
    struct data_struct *new_data;

    /* See if the list is big enough */
    if (list->alloc < list->size + num_elements) {
        new_alloc = list->alloc * 2;
        if (new_alloc < list->size + num_elements) {
            new_alloc = list->size + num_elements;
        }
        new_data = malloc(new_alloc * sizeof(*list->data));
        if (new_data == NULL) {
            /* The list itself is left untouched, so the caller can recover */
            errno = ENOMEM;
            return -1;
        }
        /* Copy the old data over to the new list */
        for (i = 0; i < list->size; ++i) {
            new_data[i] = list->data[i];
        }
        free(list->data);
        list->data = new_data;
    }
    /* Copy the data in */
    for (i = 0; i < num_elements; ++i) {
        list->data[i + list->size] = data[i];
    }
    list->size += num_elements;
    return 0;
}
This way, if your system runs out of memory, the function will return an error value and set errno to ENOMEM instead of simply crashing. There are other ways of doing error propagation than return values and errno. Another error handling mechanism I have seen uses an extra pass-by-reference argument for the error as follows:
void my_function(int arg1, int arg2, int *error)
{
    /* do something */
    if (/* error condition */) {
        /* error may be NULL if the caller does not want the error code */
        if (error)
            *error = -1;
        return;
    }
}
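A complete version of this pattern might look like the sketch below. The function name and the division task are made up for illustration. The return value carries the result, while the optional error argument carries the status, and the function writes through the pointer with *error rather than assigning to the pointer itself.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical example: integer division that reports failure through
 * an optional out-parameter. On error it returns 0 and sets *error to -1.
 * On success it sets *error to 0. error may be NULL. */
int checked_divide(int numerator, int denominator, int *error)
{
    /* Division by zero and INT_MIN / -1 are both undefined in C */
    if (denominator == 0 || (numerator == INT_MIN && denominator == -1)) {
        if (error != NULL) {
            *error = -1;    /* write through the pointer */
        }
        return 0;
    }
    if (error != NULL) {
        *error = 0;
    }
    return numerator / denominator;
}
```

A caller that cares about the status passes the address of an int and checks it afterwards. A caller that does not care can pass NULL.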
There are many different mechanisms for error propagation. I’m not going to try to describe them all here. You will see many different ways of doing error handling as you gain more programming experience. The point here is to make you aware of them and to get you to think about error handling and propagation.
There are a lot of cases where there is no good response to an error other than to simply quit the program. In this case, we call the error fatal. For example, if your system runs out of memory, there is likely nothing your program can do. In this case, you might simply print an error to the user and quit the program as follows:
success = data_array_list_add_array(&list, data, num_elements);
if (success < 0) {
    perror("function_name");
    abort();
}
There are a couple of new function calls there: perror and abort. The perror function prints its argument to stderr, followed by an error message corresponding to the current value of errno. This way you can tell the user what happened without knowing all of the possible errno values. The second function, abort, makes the program quit immediately and abnormally (by raising the SIGABRT signal), which the shell reports as a failure, much like a non-zero exit status. This way you don’t have to bother propagating the error up to the main function. The exit function is also good for this purpose. From a command-line perspective, there is no difference between the main function returning the value n and calling exit(n).
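One common way to package a fatal error is a small wrapper that either succeeds or ends the program. The sketch below uses the hypothetical name xmalloc and relies on malloc setting errno on failure, which POSIX specifies but the C standard does not guarantee. It uses exit rather than abort, so the process ends with an ordinary non-zero status.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocates memory or ends the program. Callers
 * never see NULL, so they need no error-propagation code of their own. */
void *xmalloc(size_t size)
{
    void *ptr;

    if (size == 0) {
        size = 1;   /* malloc(0) may legitimately return NULL */
    }
    ptr = malloc(size);
    if (ptr == NULL) {
        perror("xmalloc");      /* message depends on errno (POSIX: ENOMEM) */
        exit(EXIT_FAILURE);     /* ordinary exit with a non-zero status */
    }
    return ptr;
}
```

Code that calls xmalloc can use the returned pointer directly, since the out-of-memory case has already been handled in one place.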
In any case, it is almost always better to print an error message and quit than to simply let the program crash. This way you know what error occurred and where. Otherwise, you are liable to get a cascade of errors that build on each other until the program simply crashes. When this happens, it is much harder to find the problem because the program may crash in a very different piece of code than where the error originated.