The [GNU Scientific Library][1] (or GSL) provides implementations of various mathematical and statistical algorithms for use in scientific applications. Here, we will explore using some of the single-variable root solvers provided by GSL. The root solvers in GSL are designed so that the actual solver to use is determined by a single run-time parameter so that the solvers can be easily swapped out as long as they are of the same form. First, we will demonstrate how to use their basic root finding algorithms that need only the functions value and then we will demonstrate their newton’s method solver that also requires a derivative.
For this example, we will be solving a quadratic equation in the
general form ax2 + bx + c.
To begin, we create header file for our quadratic function called
quadratic.h
:
#ifndef __EXAMPLES_LIBGSL_QUADRATIC_H__
#define __EXAMPLES_LIBGSL_QUADRATIC_H__
struct quadratic_params {
double a, b, c;
};
/* Returns the value of the quadratic at x */
double quadratic(double x, void *params);
/* Returns the derivative of the quadratic at x */
double quadratic_deriv(double x, void *params);
/* Evaluates both the quadratic and its derivative at x and stores the
* value in *y and the derivative in *dy. */
void quadratic_fdf(double x, void *params, double *y, double *dy);
#endif /* !defined __EXAMPLES_LIBGSL_QUADRATIC_H__ */
This defines a simple data structure called
quadratic_params
that stores the coefficients for a general
quadratic along with functions that evaluate the quadratic at a point.
The value passed into params
will always be of the type
struct quadratic_params *
even though it is specified as
void *
(more on that in a bit). We also create a C source
file called quadratic.c
that implements these
functions:
#include "quadratic.h"
double quadratic(double x, void *params)
{
struct quadratic_params *p;
p = (struct quadratic_params *)params;
return (p->a * x + p->b) * x + p->c;
}
double quadratic_deriv(double x, void *params)
{
struct quadratic_params *p;
p = (struct quadratic_params *)params;
return 2.0 * p->a * x + p->b;
}
void quadratic_fdf(double x, void *params, double *y, double *dy)
{
*y = quadratic(x, params);
*dy = quadratic_deriv(x, params);
}
By now, most of the above code should look familiar to you. The one
thing you may not recognize is the pointer cast. The second argument of
each of these functions is a void pointer called params
which we immediately cast to a struct quadratic_params *
and store in p
. The cast operation doesn’t change the
address stored int the pointer, just the type of data to which it is
pointing. We have seen this before when we use malloc
. The
malloc
function returns a void pointer because it just
returns an unused block of memory which doesn’t have any type associated
to it until we give it one. In this case, we are getting something that
we assume is a pointer to a quadratic_params
structure with
meaningful values in it. Before we can see why this is a valid
assumption, we need to take a look at how GSL represents functions.
In order to represent a mathematical function, GSL provides a
gsl_function
type. To turn what we did above into a
gsl_function
representing the quadratic x2 − 5 we create a
gsl_function
called F
as follows:
gsl_function F;
struct quadratic_params params = { 1.0, 0.0, -5.0 };
F.function = &quadratic;
F.params = ¶ms;
The first line just declares the variable F of type
gsl_function
. The gsl_function
type is just a
structure with two fields. The reason we don’t need the
struct
keyword is that there is a typedef
inside of gsl_math.h
that defines the type
gsl_function
to be a particular structure. The second line
creates a quadratic_params
structure and fills it with
data. The braces notation is a nice short-hand you can use when
declaring structure variables; it is precisely equivalent to the
following:
struct quadratic_params params;
params.a = 1.0;
params.b = 0.0;
params.c = -0.5;
Declaring with the nice braces notation simply makes things a little
shorter and sometimes easier to read. The third and fourth lines fill
the gsl_function
structure with the function to use and the
parameters we created. Inside of mathematical code, you will probably
see this sort of thing a lot because it allows for a large amount of
generality (as opposed to hard-coding things). Sure we could write a
specific function such as
double x_squared_minus_5(double x)
{
return x * x - 5;
}
but that would then we would have to write a separate C function for
every mathematical function. Instead, we define a general quadratic
function that takes the variable x
and a pointer so a
structure that contains the coefficients. This way we can easily solve
any number different quadratics without writing a separate C function
for each one.
If you look at how the function quadratic
was declared
above, you will see that the second argument is of type
void *
instead of struct quadratic_params *
.
This is because GSL doesn’t have any idea what type of parameters your
function takes, so it simply accepts any pointer whatsoever and trusts
you, the programmer, to pass it the right one. We can know that the
types match because GSL guarantees that the only thing that will ever be
passed into the params
argument of quadratic
is the pointer we give it in the params
field of
gsl_function
. This way, as long as we give it a pointer to
a variable of the correct type, the value passed into
quadratic
will always be a pointer to a variable of the
correct type.
If you have been paying attention, the statement
F.function = &quadratic
should have bothered you a
little. Is quadratic
a variable? Where did it come from?
The only thing we have seen named quadratic
is a function.
How are we putting a function in a variable?
In C, there is a special type of pointer called a function
pointer. Function pointers, instead of pointing to variables or
data on the stack, point to functions. When the computer runs your
program, all of its compiled code is loaded into memory. A function
pointer simply points to the location in memory where the code for the
given function is stored. Unfortunately, the syntax for declaring
variables that are function pointers is kind of strange. It’s one of
those odd corners of C that most programmers don’t know very well. (As
you will soon see, there are very good reasons why most programmers
avoid them.) The gsl_function
type is actually defined as
follows (note the typedef):
struct gsl_function_struct
{
double (* function) (double x, void * params);
void * params;
};
typedef struct gsl_function_struct gsl_function ;
The first field is named function
that takes a pointer
to a function of a particular type. The type of function is one that
returns a double
and takes two arguments: one of type
double
and one of type void *
just like our
quadratic
function. The actual type of the
function
field is double (*)(double, void *)
(yes, that’s gross). It is important to note that the prototype of the
function is built into the function pointer’s type. This is so that C
knows how to properly call the function from just the pointer. However,
this also means that if you have a variable that is a function pointer,
it can only ever store a pointer to a function with that exact
prototype. This is why every function we use for
gsl_function
has to return a double
and take a
double
and a void *
. Whatever type of function
we put into the gsl_function
structure has to be able to
cover every type of real-valued mathematical function you would want to
solve. By using a void *
to store a sort of generic extra
structure, we can pass anything we want to the function by simply
putting it in a custom structure and passing in a pointer.
GSL provides a variety of solvers for finding roots. In our first
example, we will use a bracketed solver that requires only the
function’s value. Later we will look at a Newton’s method solver. The
people who wrote GSL have done a decent job of hiding most of the
details of the solvers behind a generic gsl_root_fsolver
type. A given root solver algorithm may have any number of state
variables to keep track of different things between iterations. Instead
of exposing all of those nasty details to us, the programmers, they have
hidden them behind this gsl_root_fsolver
type and simply
provided functions to work with it. The first thing we have to do is to
ask GSL to create a solver of the appropriate type for us.
const gsl_root_fsolver_type *solver_type;
gsl_root_fsolver *solver;
solver_type = gsl_root_fsolver_bisection;
solver = gsl_root_fsolver_alloc(solver_type);
The solver_type
variable holds a solver type. In this
case, it is gsl_root_fsolver_bisection
which is some value
(again, we don’t care what it really is) that represents a bisection
solver. The symbol gsl_root_fsolber_bisection
is a global
variable provided in gsl/gsl_roots.h
that contains whatever
data is needed to specify a bisection solver. When we call
gsl_root_fsolver_alloc
, GSL creates a new bisection solver
and gives it to us. We will be responsible for destroying it with a call
to gsl_root_fsolver_free
when we are done using it.
The next thing to do is to set a few parameters in our solver.
Specifically, it’s a bracketing solver so it needs an initial interval.
Also, it will need to know what function to solve. We have already
created a gsl_function
representing x * x - 5
,
so we can set the solver up as follows:
double x_lo = 0.0, x_hi = 5.0;
gsl_root_fsolver_set(solver, &F, x_lo, x_hi);
The gsl_root_fsolver_set
call sets the function as well
as the initial interval for the solution. We make x_lo
and
x_hi
variables because we will want to use them later
during the iteration step to test for convergence.
Now that we have the solver set up, we need to iterate over it. In
order to do that we call the gsl_root_fsolver_iterate
function that simply runs one step of the iteration. Once that step is
run, we can get current best root and error interval out of the solver
and check for convergence. A single iteration step looks like this:
gsl_root_fsolver_iterate(solver);
/* get the solver's current best solution and bounds */
r = gsl_root_fsolver_root(solver);
x_lo = gsl_root_fsolver_x_lower(solver);
x_hi = gsl_root_fsolver_x_upper(solver);
/* Check to see if the solution is within 0.001 */
status = gsl_root_test_interval(x_lo, x_hi, 0, 0.001);
if (status == GSL_SUCCESS)
printf("Converged:\n");
The first line actually performs the iteration. Then the next three
statements gets the guess and bounds generated by the iteration step.
Then, we use the gsl_root_test_interval
to see if our
interval is within the given tolerance. If
status == GSL_SUCCESS
then it is within tolerance and we
say that it has converged. Otherwise, we will have
status == GSL_CONTINUE
and we need to run another step of
the iteration.
Putting everything together and adding a few printf
statements and some error-checking, we get the following
main
function:
#include <stdio.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_roots.h>
#include "quadratic.h"
#define MAX_ITERATIONS 100
int main()
{
int status;
int i;
const gsl_root_fsolver_type *solver_type;
gsl_root_fsolver *solver;
gsl_function F;
double r;
/* We want to solve x^2 - 5 */
struct quadratic_params params = { 1.0, 0.0, -5.0 };
double r_expected = sqrt(5.0);
/* On the interval [0, 5] */
double x_lo = 0.0, x_hi = 5.0;
/* Set up the function */
F.function = &quadratic;
F.params = ¶ms;
/* Allocate a bisection solver and set it to use F */
solver_type = gsl_root_fsolver_bisection;
solver = gsl_root_fsolver_alloc(solver_type);
gsl_root_fsolver_set(solver, &F, x_lo, x_hi);
printf("using %s method\n", gsl_root_fsolver_name(solver));
printf("%5s [%9s, %9s] %9s %10s %9s\n",
"iter", "lower", "upper", "root", "err", "err(est)");
status = GSL_CONTINUE;
for (i = 1; i <= MAX_ITERATIONS && status == GSL_CONTINUE; ++i) {
/* iterate one step of the solver */
status = gsl_root_fsolver_iterate(solver);
if (status != GSL_SUCCESS)
break;
/* get the solver's current best solution and bounds */
r = gsl_root_fsolver_root(solver);
x_lo = gsl_root_fsolver_x_lower(solver);
x_hi = gsl_root_fsolver_x_upper(solver);
/* Check to see if the solution is within 0.001 */
status = gsl_root_test_interval(x_lo, x_hi, 0, 0.001);
if (status == GSL_SUCCESS)
printf("Converged:\n");
printf("%5d [%.7f, %.7f] %.7f %+.7f %.7f\n",
i, x_lo, x_hi, r, r - r_expected, x_hi - x_lo);
}
/* Free the solver */
gsl_root_fsolver_free(solver);
if (status == GSL_CONTINUE) {
printf("error: too many iterations");
} else if (status != GSL_SUCCESS) {
printf("error: %s\n", gsl_strerror(status));
}
return status;
}
Now let’s say that we want to use a more efficient Brent solver. Because of the way that GSL is designed, this is really easy. All we have to do is change one line in the above code. Specifically,
/* Allocate a bisection solver and set it to use F */
solver_type = gsl_root_fsolver_brent;
solver = gsl_root_fsolver_alloc(solver_type);
gsl_root_fsolver_set(solver, &F, x_lo, x_hi);
The line we changed is the one that assigns solver_type
now sets the solver type to gsl_root_fsolver_brent
. Now,
when we call gsl_root_fsolver_alloc
, it creates a Brent
solver instead of a bisection solver. Because GSL does everything with
these generic gsl_root_fsolver
pointers, we can easily swap
things out and let GSL figure out how to run the other solver.
Newton’s method is a little different than bisection or Brent because
Newton’s method requires derivatives. In order to do this, we have to
create a gsl_function_fdf
instead of a
gsl_function
. We can do this as follows:
gsl_function_fdf FDF;
FDF.f = &quadratic;
FDF.df = &quadratic_deriv;
FDF.fdf = &quadratic_fdf;
FDF.params = ¶ms;
The gsl_function_fdf
type has a few more fields. In
addition to a function field (now called simply f
), it also
has fields that take a derivative and a function that returns both the
derivative and the value of the function itself. You should also note
that we still have the params
field just like we did
before. There is no need to have multiple params
fields
because it is assumed that all three of the given functions correspond
to a single mathematical function and that that function can be
described by a single structure. The params
structure given
here will be passed into the params
argument of any of
those three functions whenever the function is called.
Most of the changes required to use the newton solver are simply a
matter of changing everything from gsl_root_fsolver
to
gsl_root_fdfsolver
. However things also change just a bit
at the iteration step. Like with the bracketing solvers, the iteration
step is performed by gsl_root_fdfsolver_iterate
. However,
the result is not an interval, but a new guess which we compare to the
previous guess. We do that as follows:
/* Get the new approximate solution */
x0 = x;
x = gsl_root_fdfsolver_root(solver);
/* Check to see if the solution is within 0.001 */
status = gsl_root_test_delta(x, x0, 0, 0.001);
if (status == GSL_SUCCESS)
printf("Converged:\n");
The gsl_root_test_delta
function checks the absolute
difference of x
and x0
to see if it is within
our specified tolerance. Once you put all of those pieces together, a
full Newton’s method solving function looks like this:
#include <stdio.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_roots.h>
#include "quadratic.h"
#define MAX_ITERATIONS 100
int main()
{
int status;
int i;
const gsl_root_fdfsolver_type *solver_type;
gsl_root_fdfsolver *solver;
gsl_function_fdf FDF;
double x0;
/* We want to solve x^2 - 5 */
struct quadratic_params params = { 1.0, 0.0, -5.0 };
double r_expected = sqrt(5.0);
/* Starting from a guess of 5.0 */
double x = 5;
/* Set up the function */
FDF.f = &quadratic;
FDF.df = &quadratic_deriv;
FDF.fdf = &quadratic_fdf;
FDF.params = ¶ms;
/* Allocate a newton solver and set it to use FDF */
solver_type = gsl_root_fdfsolver_newton;
solver = gsl_root_fdfsolver_alloc(solver_type);
gsl_root_fdfsolver_set(solver, &FDF, x);
printf("using %s method\n", gsl_root_fdfsolver_name(solver));
printf ("%-5s %10s %10s %10s\n", "iter", "root", "err", "err(est)");
status = GSL_CONTINUE;
for (i = 1; i <= MAX_ITERATIONS && status == GSL_CONTINUE; ++i) {
/* Iterate one step of the solver */
status = gsl_root_fdfsolver_iterate(solver);
if (status != GSL_SUCCESS)
break;
/* Get the new approximate solution */
x0 = x;
x = gsl_root_fdfsolver_root(solver);
/* Check to see if the solution is within 0.001 */
status = gsl_root_test_delta(x, x0, 0, 0.001);
if (status == GSL_SUCCESS)
printf("Converged:\n");
printf ("%5d %10.7f %+10.7f %10.7f\n", i, x, x - r_expected, x - x0);
}
/* Free the solver */
gsl_root_fdfsolver_free(solver);
if (status == GSL_CONTINUE) {
printf("error: too many iterations");
} else if (status != GSL_SUCCESS) {
printf("error: %s\n", gsl_strerror(status));
}
return status;
}
You may be wondering why we have f
, df
, and
fdf
functions on the gsl_function_fdf
structure when the fdf
function can easily be written based
on the other two. The reason is one of efficiency and this is something
you will see in many applied math applications. In a lot of algorithms
for root finding, maximization, etc. most of the time spent is in
evaluating the function (or its derivative) so we want this step to be
as fast as possible. In the above example, computing the value of the
function and its derivative are basically completely different
operations. However, this is often not the case. The usual case is one
where computing the derivative and computing the value of the function
itself require many of the same operations. By making a single function
that computes both, we can avoid duplicating work and make the solver
faster. For example, suppose we were trying to solve 1 − ex2.
In this case, we would write our functions as follows:
double f(double x, void *params)
{
return 1 - exp(x * x);
}
double df(double x, void *params)
{
return 2 * x * exp(x * x);
}
void quadratic_fdf(double x, void *params, double *y, double *dy)
{
double ex2 = exp(x * x);
*y = 1 - ex2;
*dy = 2 * x * ex2;
}
By writing the function this way, we can only evaluate
exp
once and get both the function value and the
derivative. This is a very good thing because evaluating the exponential
function is, itself, a numeric process involving a power series. By
cutting down the number of evaluations of the exponential function, we
can significantly increase the speed of the root finding algorithm.
Full source code for these examples can be found here.