setjmp and longjmp Functions
In C, we can't goto a label that's in another function. Instead, we must use the setjmp and longjmp functions to perform this type of branching. As we'll see, these two functions are useful for handling error conditions that occur in a deeply nested function call.
Consider the skeleton in Figure 7.9. It consists of a main loop that reads lines from standard input and calls the function do_line to process each line. This function then calls get_token to fetch the next token from the input line. The first token of a line is assumed to be a command of some form, and a switch statement selects each command. For the single command shown, the function cmd_add is called.
Figure 7.9. Typical program skeleton for command processing
#include <setjmp.h>
#define TOK_ADD 5
void do_line(char *);
void cmd_add(void);
int get_token(void);
int
main(void)
{
char line[MAXLINE];
while (fgets(line, MAXLINE, stdin) != NULL)
do_line(line);
exit(0);
}
char *tok_ptr; /* global pointer for get_token() */
void
do_line(char *ptr) /* process one line of input */
{
int cmd;
tok_ptr = ptr;
while ((cmd = get_token()) > 0) {
switch (cmd) { /* one case for each command */
case TOK_ADD:
cmd_add();
break;
}
}
}
void
cmd_add(void)
{
int token;
token = get_token();
/* rest of processing for this command */
}
int
get_token(void)
{
/* fetch next token from line pointed to by tok_ptr */
}
The skeleton in Figure 7.9 is typical for programs that read commands, determine the command type, and then call functions to process each command. Figure 7.10 shows what the stack could look like after cmd_add has been called.
Figure 7.10. Stack frames after cmd_add has been called
Storage for the automatic variables is within the stack frame for each function. The array line is in the stack frame for main, the integer cmd is in the stack frame for do_line, and the integer token is in the stack frame for cmd_add.
As we've said, this type of arrangement of the stack is typical, but not required. Stacks do not have to grow toward lower memory addresses. On systems that don't have built-in hardware support for stacks, a C implementation might use a linked list for its stack frames.
The coding problem that's often encountered with programs like the one shown in Figure 7.9 is how to handle nonfatal errors. For example, if the cmd_add function encounters an errorsay, an invalid numberit might want to print an error, ignore the rest of the input line, and return to the main function to read the next input line. But when we're deeply nested numerous levels down from the main function, this is difficult to do in C. (In this example, in the cmd_add function, we're only two levels down from main, but it's not uncommon to be five or more levels down from where we want to return to.) It becomes messy if we have to code each function with a special return value that tells it to return one level.
The solution to this problem is to use a nonlocal goto: the setjmp and longjmp functions. The adjective nonlocal is because we're not doing a normal C goto statement within a function; instead, we're branching back through the call frames to a function that is in the call path of the current function.
#include <setjmp.h>
int setjmp(jmp_buf env);
Returns: 0 if called directly, nonzero if returning from a call to longjmp
void longjmp(jmp_buf env, int val);
We call setjmp from the location that we want to return to, which in this example is in the main function. In this case, setjmp returns 0 because we called it directly. In the call to setjmp, the argument env is of the special type jmp_buf. This data type is some form of array that is capable of holding all the information required to restore the status of the stack to the state when we call longjmp. Normally, the env variable is a global variable, since we'll need to reference it from another function.
When we encounter an errorsay, in the cmd_add functionwe call longjmp with two arguments. The first is the same env that we used in a call to setjmp, and the second, val, is a nonzero value that becomes the return value from setjmp. The reason for the second argument is to allow us to have more than one longjmp for each setjmp. For example, we could longjmp from cmd_add with a val of 1 and also call longjmp from get_token with a val of 2. In the main function, the return value from setjmp is either 1 or 2, and we can test this value, if we want, and determine whether the longjmp was from cmd_add or get_token.
Let's return to the example. Figure 7.11 shows both the main and cmd_add functions. (The other two functions, do_line and get_token, haven't changed.)
Figure 7.11. Example of setjmp and longjmp
#include "apue.h"
#include <setjmp.h>
#define TOK_ADD 5
jmp_buf jmpbuffer;
int
main(void)
{
char line[MAXLINE];
if (setjmp(jmpbuffer) != 0)
printf("error");
while (fgets(line, MAXLINE, stdin) != NULL)
do_line(line);
exit(0);
}
...
void
cmd_add(void)
{
int token;
token = get_token();
if (token < 0) /* an error has occurred */
longjmp(jmpbuffer, 1);
/* rest of processing for this command */
}
When main is executed, we call setjmp, which records whatever information it needs to in the variable jmpbuffer and returns 0. We then call do_line, which calls cmd_add, and assume that an error of some form is detected. Before the call to longjmp in cmd_add, the stack looks like that in Figure 7.10. But longjmp causes the stack to be "unwound" back to the main function, throwing away the stack frames for cmd_add and do_line (Figure 7.12). Calling longjmp causes the setjmp in main to return, but this time it returns with a value of 1 (the second argument for longjmp).
Figure 7.12. Stack frame after longjmp has been called
Automatic, Register, and Volatile Variables
We've seen what the stack looks like after calling longjmp. The next question is, "what are the states of the automatic variables and register variables in the main function?" When main is returned to by the longjmp, do these variables have values corresponding to when the setjmp was previously called (i.e., are their values rolled back), or are their values left alone so that their values are whatever they were when do_line was called (which caused cmd_add to be called, which caused longjmp to be called)? Unfortunately, the answer is "it depends." Most implementations do not try to roll back these automatic variables and register variables, but the standards say only that their values are indeterminate. If you have an automatic variable that you don't want rolled back, define it with the volatile attribute. Variables that are declared global or static are left alone when longjmp is executed.
Example
The program in Figure 7.13 demonstrates the different behavior that can be seen with automatic, global, register, static, and volatile variables after calling longjmp.
If we compile and test the program in Figure 7.13, with and without compiler optimizations, the results are different:
$ cc testjmp.c compile without any optimization
$ ./a.out
in f1():
globval = 95, autoval = 96, regival = 97, volaval = 98, statval = 99
after longjmp:
globval = 95, autoval = 96, regival = 97, volaval = 98, statval = 99
$ cc -O testjmp.c compile with full optimization
$ ./a.out
in f1():
globval = 95, autoval = 96, regival = 97, volaval = 98, statval = 99
after longjmp:
globval = 95, autoval = 2, regival = 3, volaval = 98, statval = 99
Note that the optimizations don't affect the global, static, and volatile variables; their values after the longjmp are the last values that they assumed. The setjmp(3) manual page on one system states that variables stored in memory will have values as of the time of the longjmp, whereas variables in the CPU and floating-point registers are restored to their values when setjmp was called. This is indeed what we see when we run the program in Figure 7.13. Without optimization, all five variables are stored in memory (the register hint is ignored for regival). When we enable optimization, both autoval and regival go into registers, even though the former wasn't declared register, and the volatile variable stays in memory. The thing to realize with this example is that you must use the volatile attribute if you're writing portable code that uses nonlocal jumps. Anything else can change from one system to the next.
Some printf format strings in Figure 7.13 are longer than will fit comfortably for display in a programming text. Instead of making multiple calls to printf, we rely on ISO C's string concatenation feature, where the sequence
"string1" "string2"
is equivalent to
"string1string2"
Figure 7.13. Effect of longjmp on various types of variables
#include "apue.h"
#include <setjmp.h>
static void f1(int, int, int, int);
static void f2(void);
static jmp_buf jmpbuffer;
static int globval;
int
main(void)
{
int autoval;
register int regival;
volatile int volaval;
static int statval;
globval = 1; autoval = 2; regival = 3; volaval = 4; statval = 5;
if (setjmp(jmpbuffer) != 0) {
printf("after longjmp:\n");
printf("globval = %d, autoval = %d, regival = %d,"
" volaval = %d, statval = %d\n",
globval, autoval, regival, volaval, statval);
exit(0);
}
/*
* Change variables after setjmp, but before longjmp.
*/
globval = 95; autoval = 96; regival = 97; volaval = 98;
statval = 99;
f1(autoval, regival, volaval, statval); /* never returns */
exit(0);
}
static void
f1(int i, int j, int k, int l)
{
printf("in f1():\n");
printf("globval = %d, autoval = %d, regival = %d,"
" volaval = %d, statval = %d\n", globval, i, j, k, l);
f2();
}
static void
f2(void)
{
longjmp(jmpbuffer, 1);
}
We'll return to these two functions, setjmp and longjmp, in Chapter 10 when we discuss signal handlers and their signal versions: sigsetjmp and siglongjmp.
Potential Problem with Automatic Variables
Having looked at the way stack frames are usually handled, it is worth looking at a potential error in dealing with automatic variables. The basic rule is that an automatic variable can never be referenced after the function that declared it returns. There are numerous warnings about this throughout the UNIX System manuals.
Figure 7.14 shows a function called open_data that opens a standard I/O stream and sets the buffering for the stream.
Figure 7.14. Incorrect usage of an automatic variable
#include <stdio.h>
#define DATAFILE "datafile"
FILE *
open_data(void)
{
FILE *fp;
char databuf[BUFSIZ]; /* setvbuf makes this the stdio buffer */
if ((fp = fopen(DATAFILE, "r")) == NULL)
return(NULL);
if (setvbuf(fp, databuf, _IOLBF, BUFSIZ) != 0)
return(NULL);
return(fp); /* error */
}
The problem is that when open_data returns, the space it used on the stack will be used by the stack frame for the next function that is called. But the standard I/O library will still be using that portion of memory for its stream buffer. Chaos is sure to result. To correct this problem, the array databuf needs to be allocated from global memory, either statically (static or extern) or dynamically (one of the alloc functions).