C PROGRAMMING INTERVIEW QUESTIONS AND ANSWERS:
Basic Concepts:
1. What is a local block?
A local block is any portion of a C program that is enclosed by the left brace ({) and the right brace (}). A C function contains left and right braces, and therefore anything between the two braces is contained in a local block. An ifstatement or a switch statement can also contain braces, so the portion of code between these two braces would be considered a local block.
Additionally, you might want to create your own local block without the aid of a C function or keyword construct. This is perfectly legal. Variables can be declared within local blocks, but they must be declared only at the beginning of a local block. Variables declared in this manner are visible only within the local block. Duplicate variable names declared within a local block take precedence over variables with the same name declared outside the local block. Here is an example of a program that uses local blocks:
#include <stdio.h>
void main(void);
void main()
{
/* Begin local block for function main() */
int test_var = 10;
printf("Test variable before the if statement: %d\n", test_var);
if (test_var > 5)
{
/* Begin local block for "if" statement */
int test_var = 5;
printf("Test variable within the if statement: %d\n",
test_var);
{
/* Begin independent local block (not tied to
any function or keyword) */
int test_var = 0;
printf(
"Test variable within the independent local block:%d\n",
test_var);
}
/* End independent local block */
}
/* End local block for "if" statement */
printf("Test variable after the if statement: %d\n", test_var);
}
/* End local block for function main() */
This example program produces the following output:
Test variable before the if statement: 10
Test variable within the if statement: 5
Test variable within the independent local block: 0
Test variable after the if statement: 10
Notice that as each test_var was defined, it took precedence over the previously defined test_var. Also notice that when the if statement local block had ended, the program had reentered the scope of the original test_var, and its value was 10.
2. Should variables be stored in local blocks?
The use of local blocks for storing variables is unusual and therefore should be avoided, with only rare exceptions. One of these exceptions would be for debugging purposes, when you might want to declare a local instance of a global variable to test within your function. You also might want to use a local block when you want to make your program more readable in the current context.
Sometimes having the variable declared closer to where it is used makes your program more readable. However, well-written programs usually do not have to resort to declaring variables in this manner, and you should avoid using local blocks.
3. When is a switch statement better than multiple if statements?
A switch statement is generally best to use when you have more than two conditional expressions based on a single variable of numeric type. For instance, rather than the code
if (x == 1)
printf("x is equal to one.\n");
else if (x == 2)
printf("x is equal to two.\n");
else if (x == 3)
printf("x is equal to three.\n");
else
printf("x is not equal to one, two, or three.\n");
the following code is easier to read and maintain:
switch (x)
{
case 1: printf("x is equal to one.\n");
break;
case 2: printf("x is equal to two.\n");
break;
case 3: printf("x is equal to three.\n");
break;
default: printf("x is not equal to one, two, or three.\n");
break;
}
Notice that for this method to work, the conditional expression must be based on a variable of numeric type in order to use the switch statement. Also, the conditional expression must be based on a single variable. For instance, even though the following if statement contains more than two conditions, it is not a candidate for using a switchstatement because it is based on string comparisons and not numeric comparisons:
char* name = "Lupto";
if (!stricmp(name, "Isaac"))
printf("Your name means 'Laughter'.\n");
else if (!stricmp(name, "Amy"))
printf("Your name means 'Beloved'.\n ");
else if (!stricmp(name, "Lloyd"))
printf("Your name means 'Mysterious'.\n ");
else
printf("I haven't a clue as to what your name means.\n");
4. Is a default case necessary in a switch statement?
No, but it is not a bad idea to put default statements in switch statements for error- or logic-checking purposes. For instance, the following switch statement is perfectly normal:
switch (char_code)
{
case 'Y':
case 'y': printf("You answered YES!\n");
break;
case 'N':
case 'n': printf("You answered NO!\n");
break;
}
Consider, however, what would happen if an unknown character code were passed to this switch statement. The program would not print anything. It would be a good idea, therefore, to insert a default case where this condition would be taken care of:
...
default: printf("Unknown response: %d\n", char_code);
break;
...
Additionally, default cases come in handy for logic checking. For instance, if your switch statement handled a fixed number of conditions and you considered any value outside those conditions to be a logic error, you could insert adefault case which would flag that condition. Consider the following example:
void move_cursor(int direction)
{
switch (direction)
{
case UP: cursor_up();
break;
case DOWN: cursor_down();
break;
case LEFT: cursor_left();
break;
case RIGHT: cursor_right();
break;
default: printf("Logic error on line number %ld!!!\n",
__LINE__);
break;
}
}
5. Can the last case of a switch statement skip including the break?
Even though the last case of a switch statement does not require a break statement at the end, you should addbreak statements to all cases of the switch statement, including the last case. You should do so primarily because your program has a strong chance of being maintained by someone other than you who might add cases but neglect to notice that the last case has no break statement.
This oversight would cause what would formerly be the last case statement to "fall through" to the new statements added to the bottom of the switch statement. Putting a break after each case statement would prevent this possible mishap and make your program more "bulletproof." Besides, most of today's optimizing compilers will optimize out the last break, so there will be no performance degradation if you add it.
6. Other than in a for statement, when is the comma operator used?
The comma operator is commonly used to separate variable declarations, function arguments, and expressions, as well as the elements of a for statement. Look closely at the following program, which shows some of the many ways a comma can be used:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main()
{
/* Here, the comma operator is used to separate
three variable declarations. */
int i, j, k;
/* Notice how you can use the comma operator to perform
multiple initializations on the same line. */
i = 0, j = 1, k = 2;
printf("i = %d, j = %d, k = %d\n", i, j, k);
/* Here, the comma operator is used to execute three expressions
in one line: assign k to i, increment j, and increment k.
The value that i receives is always the rightmost expression. */
i = (j++, k++);
printf("i = %d, j = %d, k = %d\n", i, j, k);
/* Here, the while statement uses the comma operator to
assign the value of i as well as test it. */
while (i = (rand() % 100), i != 50)
printf("i is %d, trying again...\n", i);
printf("\nGuess what? i is 50!\n");
}
Notice the line that reads
i = (j++, k++);
This line actually performs three actions at once. These are the three actions, in order:
1. Assigns the value of k to i. This happens because the left value (lvalue) always evaluates to the rightmost argument. In this case, it evaluates to k. Notice that it does not evaluate to k++, because k++ is a postfix incremental expression, and k is not incremented until the assignment of k to i is made. If the expression had read ++k, the value of ++k would be assigned to i because it is a prefix incremental expression, and it is incremented before the assignment is made.
2. Increments j.
3. Increments k.
Also, notice the strange-looking while statement:
while (i = (rand() % 100), i != 50)
printf("i is %d, trying again...\n");
Here, the comma operator separates two expressions, each of which is evaluated for each iteration of the whilestatement. The first expression, to the left of the comma, assigns i to a random number from 0 to 99.
The second expression, which is more commonly found in a while statement, is a conditional expression that tests to see whether i is not equal to 50. For each iteration of the while statement, i is assigned a new random number, and the value of i is checked to see that it is not 50. Eventually, i is randomly assigned the value 50, and the whilestatement terminates.
7. How can you tell whether a loop ended prematurely?
Generally, loops are dependent on one or more variables. Your program can check those variables outside the loop to ensure that the loop executed properly. For instance, consider the following example:
#define REQUESTED_BLOCKS 512
int x;
char* cp[REQUESTED_BLOCKS];
/* Attempt (in vain, I must add...) to
allocate 512 10KB blocks in memory. */
for (x=0; x< REQUESTED_BLOCKS; x++)
{
cp[x] = (char*) malloc(10000, 1);
if (cp[x] == (char*) NULL)
break;
}
/* If x is less than REQUESTED_BLOCKS,
the loop has ended prematurely. */
if (x < REQUESTED_BLOCKS)
printf("Bummer! My loop ended prematurely!\n");
Notice that for the loop to execute successfully, it would have had to iterate through 512 times. Immediately following the loop, this condition is tested to see whether the loop ended prematurely. If the variable x is anything less than 512, some error has occurred.
8. What is the difference between goto and long jmp( ) and setjmp()?
A goto statement implements a local jump of program execution, and the longjmp() and setjmp() functions implement a nonlocal, or far, jump of program execution. Generally, a jump in execution of any kind should be avoided because it is not considered good programming practice to use such statements as goto and longjmp in your program.
A goto statement simply bypasses code in your program and jumps to a predefined position. To use the gotostatement, you give it a labeled position to jump to. This predefined position must be within the same function. You cannot implement gotos between functions. Here is an example of a goto statement:
void bad_programmers_function(void)
{
int x;
printf("Excuse me while I count to 5000...\n");
x = 1;
while (1)
{
printf("%d\n", x);
if (x == 5000)
goto all_done;
else
x = x + 1;
}
all_done:
printf("Whew! That wasn't so bad, was it?\n");
}
This example could have been written much better, avoiding the use of a goto statement. Here is an example of an improved implementation:
void better_function(void)
{
int x;
printf("Excuse me while I count to 5000...\n");
for (x=1; x<=5000; x++)
printf("%d\n", x);
printf("Whew! That wasn't so bad, was it?\n");
}
As previously mentioned, the longjmp() and setjmp() functions implement a nonlocal goto. When your program calls setjmp(), the current state of your program is saved in a structure of type jmp_buf. Later, your program can call the longjmp() function to restore the program's state as it was when you called setjmp(). Unlike the gotostatement, the longjmp() and setjmp() functions do not need to be implemented in the same function.
However, there is a major drawback to using these functions: your program, when restored to its previously saved state, will lose its references to any dynamically allocated memory between the longjmp() and the setjmp(). This means you will waste memory for every malloc() or calloc() you have implemented between your longjmp()and setjmp(), and your program will be horribly inefficient. It is highly recommended that you avoid using functions such as longjmp() and setjmp() because they, like the goto statement, are quite often an indication of poor programming practice.
Here is an example of the longjmp() and setjmp() functions:
#include <stdio.h>
#include <setjmp.h>
jmp_buf saved_state;
void main(void);
void call_longjmp(void);
void main(void)
{
int ret_code;
printf("The current state of the program is being saved...\n");
ret_code = setjmp(saved_state);
if (ret_code == 1)
{
printf("The longjmp function has been called.\n");
printf("The program's previous state has been restored.\n");
exit(0);
}
printf("I am about to call longjmp and\n");
printf("return to the previous program state...\n");
call_longjmp();
}
void call_longjmp(void)
{
longjmp(saved_state, 1);
}
9. What is an lvalue?
An lvalue is an expression to which a value can be assigned. The lvalue expression is located on the left side of an assignment statement, whereas an rvalue is located on the right side of an assignment statement. Each assignment statement must have an lvalue and an rvalue. The lvalue expression must reference a storable variable in memory. It cannot be a constant. For instance, the following lines show a few examples of lvalues:
int x;
int* p_int;
x = 1;
*p_int = 5;
The variable x is an integer, which is a storable location in memory. Therefore, the statement x = 1 qualifies x to be an lvalue. Notice the second assignment statement, *p_int = 5. By using the * modifier to reference the area of memory that p_int points to, *p_int is qualified as an lvalue. In contrast, here are a few examples of what would not be considered lvalues:
#define CONST_VAL 10
int x;
/* example 1 */
1 = x;
/* example 2 */
CONST_VAL = 5;
In both statements, the left side of the statement evaluates to a constant value that cannot be changed because constants do not represent storable locations in memory. Therefore, these two assignment statements do not contain lvalues and will be flagged by your compiler as errors.
10. Can an array be an lvalue?
Is an array an expression to which we can assign a value? The answer to this question is no, because an array is composed of several separate array elements that cannot be treated as a whole for assignment purposes. The following statement is therefore illegal:
int x[5], y[5];
x = y;
You could, however, use a for loop to iterate through each element of the array and assign values individually, such as in this example:
int i;
int x[5];
int y[5];
...
for (i=0; i<5; i++)
x[i] = y[i]
...
Additionally, you might want to copy the whole array all at once. You can do so using a library function such as thememcpy() function, which is shown here:
memcpy(x, y, sizeof(y));
It should be noted here that unlike arrays, structures can be treated as lvalues. Thus, you can assign one structure variable to another structure variable of the same type, such as this:
typedef struct t_name
{
char last_name[25];
char first_name[15];
char middle_init[2];
} NAME;
...
NAME my_name, your_name;
...
your_name = my_name;
...
In the preceding example, the entire contents of the my_name structure were copied into the your_name structure. This is essentially the same as the following line:
memcpy(your_name, my_name, sizeof(your_name));
11. What is an rvalue?
rvalue can be defined as an expression that can be assigned to an lvalue. The rvalue appears on the right side of an assignment statement.
Unlike an lvalue, an rvalue can be a constant or an expression, as shown here:
int x, y;
x = 1; /* 1 is an rvalue; x is an lvalue */
y = (x + 1); /* (x + 1) is an rvalue; y is an lvalue */
An assignment statement must have both an lvalue and an rvalue. Therefore, the following statement would not compile because it is missing an rvalue:
int x; x = void_function_call() /* the function void_function_call() returns nothing */
If the function had returned an integer, it would be considered an rvalue because it evaluates into something that the lvalue, x, can store.
12. Is left-to-right or right-to-left order guaranteed for operator precedence?
The simple answer to this question is neither. The C language does not always evaluate left-to-right or right-to-left. Generally, function calls are evaluated first, followed by complex expressions and then simple expressions.
Additionally, most of today's popular C compilers often rearrange the order in which the expression is evaluated in order to get better optimized code. You therefore should always implicitly define your operator precedence by using parentheses.
For example, consider the following expression:
a = b + c/d / function_call() * 5
The way this expression is to be evaluated is totally ambiguous, and you probably will not get the results you want. Instead, try writing it by using implicit operator precedence:
a = b + (((c/d) / function_call()) * 5)
Using this method, you can be assured that your expression will be evaluated properly and that the compiler will not rearrange operators for optimization purposes.
13. What is the difference between ++var and var++?
The ++ operator is called the increment operator. When the operator is placed before the variable (++var), the variable is incremented by 1 before it is used in the expression. When the operator is placed after the variable (var++), the expression is evaluated, and then the variable is incremented by 1.
The same holds true for the decrement operator (--). When the operator is placed before the variable, you are said to have a prefix operation. When the operator is placed after the variable, you are said to have a postfix operation.
For instance, consider the following example of postfix incrementation:
int x, y;
x = 1;
y = (x++ * 5);
In this example, postfix incrementation is used, and x is not incremented until after the evaluation of the expression is done. Therefore, y evaluates to 1 times 5, or 5. After the evaluation, x is incremented to 2.
Now look at an example using prefix incrementation:
int x, y;
x = 1;
y = (++x * 5);
This example is the same as the first one, except that this example uses prefix incrementation rather than postfix. Therefore, x is incremented before the expression is evaluated, making it 2. Hence, y evaluates to 2 times 5, or 10.
14. What does the modulus operator do?
The modulus operator (%) gives the remainder of two divided numbers. For instance, consider the following portion of code:
x = 15/7
If x were an integer, the resulting value of x would be 2. However, consider what would happen if you were to apply the modulus operator to the same equation:
x = 15%7
The result of this expression would be the remainder of 15 divided by 7, or 1. This is to say that 15 divided by 7 is 2 with a remainder of 1.
The modulus operator is commonly used to determine whether one number is evenly divisible into another. For instance, if you wanted to print every third letter of the alphabet, you would use the following code:
int x;
for (x=1; x<=26; x++)
if ((x%3) == 0)
printf("%c", x+64);
The preceding example would output the string "cfilorux", which represents every third letter in the alphabet.
Variables ans Data Storage's:
1. Where in memory are my variables stored?
Variables can be stored in several places in memory, depending on their lifetime. Variables that are defined outside any function (whether of global or file static scope), and variables that are defined inside a function as static variables, exist for the lifetime of the program's execution. These variables are stored in the "data segment." The data segment is a fixed-size area in memory set aside for these variables. The data segment is subdivided into two parts, one for initialized variables and another for uninitialized variables.
Variables that are defined inside a function as auto variables (that are not defined with the keyword static) come into existence when the program begins executing the block of code (delimited by curly braces {}) containing them, and they cease to exist when the program leaves that block of code.
Variables that are the arguments to functions exist only during the call to that function. These variables are stored on the "stack". The stack is an area of memory that starts out small and grows automatically up to some predefined limit. In DOS and other systems without virtual memory, the limit is set either when the program is compiled or when it begins executing. In UNIX and other systems with virtual memory, the limit is set by the system, and it is usually so large that it can be ignored by the programmer.
The third and final area doesn't actually store variables but can be used to store data pointed to by variables. Pointer variables that are assigned to the result of a call to the malloc() function contain the address of a dynamically allocated area of memory. This memory is in an area called the "heap." The heap is another area that starts out small and grows, but it grows only when the programmer explicitly calls malloc() or other memory allocation functions, such as calloc(). The heap can share a memory segment with either the data segment or the stack, or it can have its own segment. It all depends on the compiler options and operating system. The heap, like the stack, has a limit on how much it can grow, and the same rules apply as to how that limit is determined.
2. Do variables need to be initialized?
No. All variables should be given a value before they are used, and a good compiler will help you find variables that are used before they are set to a value. Variables need not be initialized, however. Variables defined outside a function or defined inside a function with the static keyword are already initialized to 0 for you if you do not explicitly initialize them.
Automatic variables are variables defined inside a function or block of code without the static keyword. These variables have undefined values if you don't explicitly initialize them. If you don't initialize an automatic variable, you must make sure you assign to it before using the value.
Space on the heap allocated by calling malloc() contains undefined data as well and must be set to a known value before being used. Space allocated by calling calloc() is set to 0 for you when it is allocated.
3. What is page thrashing?
Some operating systems (such as UNIX or Windows in enhanced mode) use virtual memory. Virtual memory is a technique for making a machine behave as if it had more memory than it really has, by using disk space to simulate RAM (random-access memory). In the 80386 and higher Intel CPU chips, and in most other modern microprocessors (such as the Motorola 68030, Sparc, and Power PC), exists a piece of hardware called the Memory Management Unit, or MMU.
The MMU treats memory as if it were composed of a series of "pages." A page of memory is a block of contiguous bytes of a certain size, usually 4096 or 8192 bytes. The operating system sets up and maintains a table for each running program called the Process Memory Map, or PMM. This is a table of all the pages of memory that program can access and where each is really located.
Every time your program accesses any portion of memory, the address (called a "virtual address") is processed by the MMU. The MMU looks in the PMM to find out where the memory is really located (called the "physical address"). The physical address can be any location in memory or on disk that the operating system has assigned for it. If the location the program wants to access is on disk, the page containing it must be read from disk into memory, and the PMM must be updated to reflect this action (this is called a "page fault"). Hope you're still with me, because here's the tricky part. Because accessing the disk is so much slower than accessing RAM, the operating system tries to keep as much of the virtual memory as possible in RAM. If you're running a large enough program (or several small programs at once), there might not be enough RAM to hold all the memory used by the programs, so some of it must be moved out of RAM and onto disk (this action is called "paging out").
The operating system tries to guess which areas of memory aren't likely to be used for a while (usually based on how the memory has been used in the past). If it guesses wrong, or if your programs are accessing lots of memory in lots of places, many page faults will occur in order to read in the pages that were paged out. Because all of RAM is being used, for each page read in to be accessed, another page must be paged out. This can lead to more page faults, because now a different page of memory has been moved to disk. The problem of many page faults occurring in a short time, called "page thrashing," can drastically cut the performance of a system.
Programs that frequently access many widely separated locations in memory are more likely to cause page thrashing on a system. So is running many small programs that all continue to run even when you are not actively using them. To reduce page thrashing, you can run fewer programs simultaneously. Or you can try changing the way a large program works to maximize the capability of the operating system to guess which pages won't be needed. You can achieve this effect by caching values or changing lookup algorithms in large data structures, or sometimes by changing to a memory allocation library which provides an implementation of malloc() that allocates memory more efficiently. Finally, you might consider adding more RAM to the system to reduce the need to page out.
4. What is a const pointer?
The access modifier keyword const is a promise the programmer makes to the compiler that the value of a variable will not be changed after it is initialized. The compiler will enforce that promise as best it can by not enabling the programmer to write code which modifies a variable that has been declared const.
A "const pointer," or more correctly, a "pointer to const," is a pointer which points to data that is const (constant, or unchanging). A pointer to const is declared by putting the word const at the beginning of the pointer declaration. This declares a pointer which points to data that can't be modified. The pointer itself can be modified. The following example illustrates some legal and illegal uses of a const pointer:
const char *str = "hello";
char c = *str /* legal */
str++; /* legal */
*str = 'a'; /* illegal */
str[1] = 'b'; /* illegal */
The first two statements here are legal because they do not modify the data that str points to. The next two statements are illegal because they modify the data pointed to by str.
Pointers to const are most often used in declaring function parameters. For instance, a function that counted the number of characters in a string would not need to change the contents of the string, and it might be written this way:
my_strlen(const char *str)
{
int count = 0;
while (*str++)
{
count++;
}
return count;
}
Note that non-const pointers are implicitly converted to const pointers when needed, but const pointers are not converted to non-const pointers. This means that my_strlen() could be called with either a const or a non-const character pointer.
5. When should the register modifier be used? Does it really help?
The register modifier hints to the compiler that the variable will be heavily used and should be kept in the CPU's registers, if possible, so that it can be accessed faster. There are several restrictions on the use of the register modifier.
First, the variable must be of a type that can be held in the CPU's register. This usually means a single value of a size less than or equal to the size of an integer. Some machines have registers that can hold floating-point numbers as well.
Second, because the variable might not be stored in memory, its address cannot be taken with the unary and operator. An attempt to do so is flagged as an error by the compiler. Some additional rules affect how useful the register modifier is. Because the number of registers is limited, and because some registers can hold only certain types of data (such as pointers or floating-point numbers), the number and types of register modifiers that will actually have any effect are dependent on what machine the program will run on. Any additional register modifiers are silently ignored by the compiler.
Also, in some cases, it might actually be slower to keep a variable in a register because that register then becomes unavailable for other purposes or because the variable isn't used enough to justify the overhead of loading and storing it.
So when should the register modifier be used? The answer is never, with most modern compilers. Early C compilers did not keep any variables in registers unless directed to do so, and the register modifier was a valuable addition to the language. C compiler design has advanced to the point, however, where the compiler will usually make better decisions than the programmer about which variables should be stored in registers. In fact, many compilers actually ignore the register modifier, which is perfectly legal, because it is only a hint and not a directive.
In the rare event that a program is too slow, and you know that the problem is due to a variable being stored in memory, you might try adding the register modifier as a last resort, but don't be surprised if this action doesn't change the speed of the program.
6. When should the volatile modifier be used?
The volatile modifier is a directive to the compiler's optimizer that operations involving this variable should not be optimized in certain ways. There are two special cases in which use of the volatile modifier is desirable. The first case involves memory-mapped hardware (a device such as a graphics adaptor that appears to the computer's hardware as if it were part of the computer's memory), and the second involves shared memory (memory used by two or more programs running simultaneously).
Most computers have a set of registers that can be accessed faster than the computer's main memory. A good compiler will perform a kind of optimization called "redundant load and store removal." The compiler looks for places in the code where it can either remove an instruction to load data from memory because the value is already in a register, or remove an instruction to store data to memory because the value can stay in a register until it is changed again anyway.
If a variable is a pointer to something other than normal memory, such as memory-mapped ports on a peripheral, redundant load and store optimizations might be detrimental. For instance, here's a piece of code that might be used to time some operation:
time_t time_addition(volatile const struct timer *t, int a)
{
int n;
int x;
time_t then;
x = 0;
then = t->value;
for (n = 0; n < 1000; n++)
{
x = x + a;
}
return t->value - then;
}
In this code, the variable t->value is actually a hardware counter that is being incremented as time passes. The function adds the value of a to x 1000 times, and it returns the amount the timer was incremented by while the 1000 additions were being performed.
Without the volatile modifier, a clever optimizer might assume that the value of t does not change during the execution of the function, because there is no statement that explicitly changes it. In that case, there's no need to read it from memory a second time and subtract it, because the answer will always be 0. The compiler might therefore "optimize" the function by making it always return 0.
If a variable points to data in shared memory, you also don't want the compiler to perform redundant load and store optimizations. Shared memory is normally used to enable two programs to communicate with each other by having one program store data in the shared portion of memory and the other program read the same portion of memory. If the compiler optimizes away a load or store of shared memory, communication between the two programs will be affected.
7. Can a variable be both const and volatile?
Yes. The const modifier means that this code cannot change the value of the variable, but that does not mean that the value cannot be changed by means outside this code. For instance, the timer structure was accessed through a volatile const pointer. The function itself did not change the value of the timer, so it was declared const. However, the value was changed by hardware on the computer, so it was declared volatile. If a variable is both const andvolatile, the two modifiers can appear in either order.
8. When should the const modifier be used?
There are several reasons to use const pointers. First, it allows the compiler to catch errors in which code accidentally changes the value of a variable, as in
while (*str = 0) /* programmer meant to write *str != 0 */
{
/* some code here */
str++;
}
in which the = sign is a typographical error. Without the const in the declaration of str, the program would compile but not run properly.
Another reason is efficiency. The compiler might be able to make certain optimizations to the code generated if it knows that a variable will not be changed.
Any function parameter which points to data that is not modified by the function or by any function it calls should declare the pointer a pointer to const. Function parameters that are passed by value (rather than through a pointer) can be declared const if neither the function nor any function it calls modifies the data.
In practice, however, such parameters are usually declared const only if it might be more efficient for the compiler to access the data through a pointer than by copying it.
9. How reliable are floating-point comparisons?
Floating-point numbers are the "black art" of computer programming. One reason why this is so is that there is no optimal way to represent an arbitrary number. The Institute of Electrical and Electronic Engineers (IEEE) has developed a standard for the representation of floating-point numbers, but you cannot guarantee that every machine you use will conform to the standard.
Even if your machine does conform to the standard, there are deeper issues. It can be shown mathematically that there are an infinite number of "real" numbers between any two numbers. For the computer to distinguish between two numbers, the bits that represent them must differ. To represent an infinite number of different bit patterns would take an infinite number of bits. Because the computer must represent a large range of numbers in a small number of bits (usually 32 to 64 bits), it has to make approximate representations of most numbers.
Because floating-point numbers are so tricky to deal with, it's generally bad practice to compare a floating- point number for equality with anything. Inequalities are much safer. If, for instance, you want to step through a range of numbers in small increments, you might write this:
#include <stdio.h>
const float first = 0.0;
const float last = 70.0;
const float small = 0.007;
main()
{
float f;
for (f = first; f != last && f < last + 1.0; f += small)
;
printf("f is now %g\n", f);
}
However, rounding errors and small differences in the representation of the variable small might cause f to never be equal to last (it might go from being just under it to being just over it). Thus, the loop would go past the value last. The inequality f < last + 1.0 has been added to prevent the program from running on for a very long time if this happens. If you run this program and the value printed for f is 71 or more, this is what has happened.
A safer way to write this loop is to use the inequality f < last to test for the loop ending, as in this example:
float f;
for (f = first; f < last; f += small)
;
You could even precompute the number of times the loop should be executed and use an integer to count iterations of the loop, as in this example:
float f;
int count = (last - first) / small;
for (f = first; count-- > 0; f += small)
10. How can you determine the maximum value that a numeric variable can hold?
The easiest way to find out how large or small a number that a particular type can hold is to use the values defined in the ANSI standard header file limits.h. This file contains many useful constants defining the values that can be held by various types, including these:
Value Description
CHAR_BIT - Number of bits in a char
CHAR_MAX - Maximum decimal integer value of a char
CHAR_MIN - Minimum decimal integer value of a char
MB_LEN_MAX - Maximum number of bytes in a multibyte character
INT_MAX - Maximum decimal value of an int
INT_MIN - Minimum decimal value of an int
LONG_MAX - Maximum decimal value of a long
LONG_MIN - Minimum decimal value of a long
SCHAR_MAX - Maximum decimal integer value of a signed char
SCHAR_MIN - Minimum decimal integer value of a signed char
SHRT_MAX - Maximum decimal value of a short
SHRT_MIN - Minimum decimal value of a short
UCHAR_MAX - Maximum decimal integer value of unsigned char
UINT_MAX - Maximum decimal value of an unsigned integer
ULONG_MAX - Maximum decimal value of an unsigned long int
USHRT_MAX - Maximum decimal value of an unsigned short int
For integral types, on a machine that uses two's complement arithmetic (which is just about any machine you're likely to use), a signed type can hold numbers from -2(number of bits - 1) to +2(number of bits - 1) - 1.
An unsigned type can hold values from 0 to +2(number of bits)- 1. For instance, a 16-bit signed integer can hold numbers from -215(-32768) to +215 - 1 (32767).
11. Are there any problems with performing mathematical operations on different variable types?
C has three categories of built-in data types: pointer types, integral types, and floating-point types. Pointer types are the most restrictive in terms of the operations that can be performed on them. They are limited to
- subtraction of two pointers, valid only when both pointers point to elements in the same array. The result is the same as subtracting the integer subscripts corresponding to the two pointers.
+ addition of a pointer and an integral type. The result is a pointer that points to the element which would be selected by that integer.
Floating-point types consist of the built-in types float, double, and long double. Integral types consist of char, unsigned char, short, unsigned short, int, unsigned int, long, and unsigned long. All of these types can have the following arithmetic operations performed on them:
+ Addition
- Subtraction
* Multiplication
/ Division
Integral types also can have those four operations performed on them, as well as the following operations: % Modulo or remainder of division
<< Shift left
>> Shift right
& Bitwise AND operation
| Bitwise OR operation
^ Bitwise exclusive OR operation
! Logical negative operation
~ Bitwise "one's complement" operation
Although C permits "mixed mode" expressions (an arithmetic expression involving different types), it actually converts the types to be the same type before performing the operations (except for the case of pointer arithmetic described previously). The process of automatic type conversion is called "operator promotion."
12. What is operator promotion?
If an operation is specified with operands of two different types, they are converted to the smallest type that can hold both values. The result has the same type as the two operands wind up having. To interpret the rules, read the following table from the top down, and stop at the first rule that applies.
If Either Operand Is And the Other Is Change Them To
long double - any other type - long double
double - any smaller type - double
float - any smaller type - float
unsigned long - any integral type - unsigned long
long - unsigned > LONG_MAX - long
long - any smaller type - long
unsigned - any signed type - unsigned
The following example code illustrates some cases of operator promotion. The variable f1 is set to 3/4. Because both 3 and 4 are integers, integer division is performed, and the result is the integer 0. The variable f2 is set to 3/4.0. Because 4.0 is a float, the number 3 is converted to a float as well, and the result is the float 0.75.
#include <stdio.h>
main()
{
float f1 = 3 / 4;
float f2 = 3 / 4.0;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
13. When should a type cast be used?
There are two situations in which to use a type cast. The first use is to change the type of an operand to an arithmetic operation so that the operation will be performed properly. The variable f1 is set to the result of dividing the integer iby the integer j. The result is 0, because integer division is used. The variable f2 is set to the result of dividing i by jas well. However, the (float) type cast causes i to be converted to a float. That in turn causes floating-point division to be used and gives the result 0.75.
#include <stdio.h>
main()
{
int i = 3;
int j = 4;
float f1 = i / j;
float f2 = (float) i / j;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
The second case is to cast pointer types to and from void * in order to interface with functions that expect or return void pointers. For example, the following line type casts the return value of the call to malloc() to be a pointer to afoo structure.
struct foo *p = (struct foo *) malloc(sizeof(struct foo));
14. When should a type cast not be used?
A type cast should not be used to override a const or volatile declaration. Overriding these type modifiers can cause the program to fail to run correctly.
A type cast should not be used to turn a pointer to one type of structure or data type into another. In the rare events in which this action is beneficial, using a union to hold the values makes the programmer's intentions clearer.
15. Is it acceptable to declare/define a variable in a C header?
A global variable that must be accessed from more than one file can and should be declared in a header file. In addition, such a variable must be defined in one source file. Variables should not be defined in header files, because the header file can be included in multiple source files, which would cause multiple definitions of the variable.
The ANSI C standard will allow multiple external definitions, provided that there is only one initialization. But because there's really no advantage to using this feature, it's probably best to avoid it and maintain a higher level of portability.
"Global" variables that do not have to be accessed from more than one file should be declared static and should not appear in a header file.
16. What is the difference between declaring a variable and defining a variable?
Declaring a variable means describing its type to the compiler but not allocating any space for it. Defining a variable means declaring it and also allocating space to hold the variable. You can also initialize a variable at the time it is defined. Here is a declaration of a variable and a structure, and two variable definitions, one with initialization:
extern int decl1; /* this is a declaration */
struct decl2
{
int member;
}; /* this just declares the type--no variable mentioned */
int def1 = 8; /* this is a definition */
int def2; /* this is a definition */
To put it another way, a declaration says to the compiler, "Somewhere in my program will be a variable with this name, and this is what type it is." A definition says, "Right here is this variable with this name and this type."
A variable can be declared many times, but it must be defined exactly once. For this reason, definitions do not belong in header files, where they might get #included into more than one place in your program.
17. Can static variables be declared in a header file?
You can't declare a static variable without defining it as well (this is because the storage class modifiers static andextern are mutually exclusive). A static variable can be defined in a header file, but this would cause each source file that included the header file to have its own private copy of the variable, which is probably not what was intended.
18. What is the benefit of using const for declaring constants?
The benefit of using the const keyword is that the compiler might be able to make optimizations based on the knowledge that the value of the variable will not change. In addition, the compiler will try to ensure that the values won't be changed inadvertently.
Of course, the same benefits apply to #defined constants. The reason to use const rather than #define to define a constant is that a const variable can be of any type (such as a struct, which can't be represented by a #definedconstant). Also, because a const variable is a real variable, it has an address that can be used, if needed, and it resides in only one place in memory.
BITS AND BYTES:
1. What is the most efficient way to store flag values?
A flag is a value used to make a decision between two or more options in the execution of a program. For instance, the /w flag on the MS-DOS dir command causes the command to display filenames in several columns across the screen instead of displaying them one per line. In which a flag is used to indicate which of two possible types is held in a union. Because a flag has a small number of values (often only two), it is tempting to save memory space by not storing each flag in its own int or char.
Efficiency in this case is a tradeoff between size and speed. The most memory-space efficient way to store a flag value is as single bits or groups of bits just large enough to hold all the possible values. This is because most computers cannot address individual bits in memory, so the bit or bits of interest must be extracted from the bytes that contain it.
The most time-efficient way to store flag values is to keep each in its own integer variable. Unfortunately, this method can waste up to 31 bits of a 32-bit variable, which can lead to very inefficient use of memory. If there are only a few flags, it doesn't matter how they are stored. If there are many flags, it might be advantageous to store them packed in an array of characters or integers. They must then be extracted by a process called bit masking, in which unwanted bits are removed from the ones of interest.
Sometimes it is possible to combine a flag with another value to save space. It might be possible to use high- order bits of integers that have values smaller than what an integer can hold. Another possibility is that some data is always a multiple of 2 or 4, so the low-order bits can be used to store a flag.
2. What is meant by "bit masking"?
Bit masking means selecting only certain bits from byte(s) that might have many bits set. To examine some bits of a byte, the byte is bitwise "ANDed" with a mask that is a number consisting of only those bits of interest. For instance, to look at the one's digit (rightmost digit) of the variable flags, you bitwise AND it with a mask of one (the bitwise AND operator in C is &):
flags & 1;
To set the bits of interest, the number is bitwise "ORed" with the bit mask (the bitwise OR operator in C is |). For instance, you could set the one's digit of flags like so:
flags = flags | 1;
Or, equivalently, you could set it like this:
flags |= 1;
To clear the bits of interest, the number is bitwise ANDed with the one's complement of the bit mask. The "one's complement" of a number is the number with all its one bits changed to zeros and all its zero bits changed to ones. The one's complement operator in C is ~. For instance, you could clear the one's digit of flags like so:
flags = flags & ~1;
Or, equivalently, you could clear it like this:
flags &= ~1;
Sometimes it is easier to use macros to manipulate flag values.
Example Program : Macros that make manipulating flags easier.
/* Bit Masking */
/* Bit masking can be used to switch a character
between lowercase and uppercase */
#define BIT_POS(N) ( 1U << (N) )
#define SET_FLAG(N, F) ( (N) |= (F) )
#define CLR_FLAG(N, F) ( (N) &= -(F) )
#define TST_FLAG(N, F) ( (N) & (F) )
#define BIT_RANGE(N, M) ( BIT_POS((M)+1 - (N))-1 << (N) )
#define BIT_SHIFTL(B, N) ( (unsigned)(B) << (N) )
#define BIT_SHIFTR(B, N) ( (unsigned)(B) >> (N) )
#define SET_MFLAG(N, F, V) ( CLR_FLAG(N, F), SET_FLAG(N, V) )
#define CLR_MFLAG(N, F) ( (N) &= ~(F) )
#define GET_MFLAG(N, F) ( (N) & (F) )
#include <stdio.h>
void main()
{
unsigned char ascii_char = 'A'; /* char = 8 bits only */
int test_nbr = 10;
printf("Starting character = %c\n", ascii_char);
/* The 5th bit position determines if the character is
uppercase or lowercase.
5th bit = 0 - Uppercase
5th bit = 1 - Lowercase */
printf("\nTurn 5th bit on = %c\n", SET_FLAG(ascii_char, BIT_POS(5)) );
printf("Turn 5th bit off = %c\n\n", CLR_FLAG(ascii_char, BIT_POS(5)) );
printf("Look at shifting bits\n");
printf("=====================\n");
printf("Current value = %d\n", test_nbr);
printf("Shifting one position left = %d\n",
test_nbr = BIT_SHIFTL(test_nbr, 1) );
printf("Shifting two positions right = %d\n",
BIT_SHIFTR(test_nbr, 2) );
}
BIT_POS(N) takes an integer N and returns a bit mask corresponding to that single bit position (BIT_POS(0) returns a bit mask for the one's digit, BIT_POS(1) returns a bit mask for the two's digit, and so on). So instead of writing
#define A_FLAG 4096
#define B_FLAG 8192
you can write
#define A_FLAG BIT_POS(12)
#define B_FLAG BIT_POS(13)
which is less prone to errors.
The SET_FLAG(N, F) macro sets the bit at position F of variable N. Its opposite is CLR_FLAG(N, F), which clears the bit at position F of variable N. Finally, TST_FLAG(N, F) can be used to test the value of the bit at position F of variable N, as in
if (TST_FLAG(flags, A_FLAG))
/* do something */;
The macro BIT_RANGE(N, M) produces a bit mask corresponding to bit positions N through M, inclusive. With this macro, instead of writing
#define FIRST_OCTAL_DIGIT 7 /* 111 */
#define SECOND_OCTAL_DIGIT 56 /* 111000 */
you can write
#define FIRST_OCTAL_DIGIT BIT_RANGE(0, 2) /* 111 */
#define SECOND_OCTAL_DIGIT BIT_RANGE(3, 5) /* 111000 */
which more clearly indicates which bits are meant.
The macro BIT_SHIFT(B, N) can be used to shift value B into the proper bit range (starting with bit N). For instance, if you had a flag called C that could take on one of five possible colors, the colors might be defined like this:
#define C_FLAG BIT_RANGE(8, 10) /* 11100000000 */
/* here are all the values the C flag can take on */
#define C_BLACK BIT_SHIFTL(0, 8) /* 00000000000 */
#define C_RED BIT_SHIFTL(1, 8) /* 00100000000 */
#define C_GREEN BIT_SHIFTL(2, 8) /* 01000000000 */
#define C_BLUE BIT_SHIFTL(3, 8) /* 01100000000 */
#define C_WHITE BIT_SHIFTL(4, 8) /* 10000000000 */
#define C_ZERO C_BLACK
#define C_LARGEST C_WHITE
/* A truly paranoid programmer might do this */
#if C_LARGEST > C_FLAG
Cause an error message. The flag C_FLAG is not
big enough to hold all its possible values.
#endif /* C_LARGEST > C_FLAG */
The macro SET_MFLAG(N, F, V) sets flag F in variable N to the value V. The macro CLR_MFLAG(N, F) is identical to CLR_FLAG(N, F), except the name is changed so that all the operations on multibit flags have a similar naming convention. The macro GET_MFLAG(N, F) gets the value of flag F in variable N, so it can be tested, as in
if (GET_MFLAG(flags, C_FLAG) == C_BLUE)
/* do something */;
3. Are bit fields portable?
Bit fields are not portable. Because bit fields cannot span machine words, and because the number of bits in a machine word is different on different machines, a particular program using bit fields might not even compile on a particular machine.
Assuming that your program does compile, the order in which bits are assigned to bit fields is not defined. Therefore, different compilers, or even different versions of the same compiler, could produce code that would not work properly on data generated by compiled older code. Stay away from using bit fields, except in cases in which the machine can directly address bits in memory and the compiler can generate code to take advantage of it and the increase in speed to be gained would be essential to the operation of the program.
4. Is it better to bitshift a value than to multiply by 2?
Any decent optimizing compiler will generate the same code no matter which way you write it. Use whichever form is more readable in the context in which it appears. The following program's assembler code can be viewed with a tool such as CODEVIEW on DOS/Windows or the disassembler (usually called "dis") on UNIX machines:
Example: Multiplying by 2 and shifting left by 1 are often the same.
void main()
{
unsigned int test_nbr = 300;
test_nbr *= 2;
test_nbr = 300;
test_nbr <<= 1;
}
5. What is meant by high-order and low-order bytes?
We generally write numbers from left to right, with the most significant digit first. To understand what is meant by the "significance" of a digit, think of how much happier you would be if the first digit of your paycheck was increased by one compared to the last digit being increased by one.
The bits in a byte of computer memory can be considered digits of a number written in base 2. That means the least significant bit represents one, the next bit represents 2´1, or 2, the next bit represents 2´2´1, or 4, and so on. If you consider two bytes of memory as representing a single 16-bit number, one byte will hold the least significant 8 bits, and the other will hold the most significant 8 bits. Figure shows the bits arranged into two bytes. The byte holding the least significant 8 bits is called the least significant byte, or low-order byte. The byte containing the most significant 8 bits is the most significant byte, or high- order byte.
6. How are 16- and 32-bit numbers stored?
A 16-bit number takes two bytes of storage, a most significant byte and a least significant byte. If you write the 16-bit number on paper, you would start with the most significant byte and end with the least significant byte. There is no convention for which order to store them in memory, however.
Let's call the most significant byte M and the least significant byte L. There are two possible ways to store these bytes in memory. You could store M first, followed by L, or L first, followed by M. Storing byte M first in memory is called "forward" or "big-endian" byte ordering. The term big endian comes from the fact that the "big end" of the number comes first, and it is also a reference to the book Gulliver's Travels, in which the term refers to people who eat their boiled eggs with the big end on top.
Storing byte L first is called "reverse" or "little-endian" byte ordering. Most machines store data in a big- endian format. Intel CPUs store data in a little-endian format, however, which can be confusing when someone is trying to connect an Intel microprocessor-based machine to anything else.
A 32-bit number takes four bytes of storage. Let's call them Mm, Ml, Lm, and Ll in decreasing order of significance. There are 4! (4 factorial, or 24) different ways in which these bytes can be ordered. Over the years, computer designers have used just about all 24 ways. The most popular two ways in use today, however, are (Mm, Ml, Lm, Ll), which is big-endian, and (Ll, Lm, Ml, Mm), which is little-endian. As with 16-bit numbers, most machines store 32-bit numbers in a big-endian format, but Intel machines store 32-bit numbers in a little-endian format.
Functions:
1. When should I declare a function?
Functions that are used only in the current source file should be declared as static, and the function's declaration should appear in the current source file along with the definition of the function. Functions used outside of the current source file should have their declarations put in a header file, which can be included in whatever source file is going to use that function. For instance, if a function named stat_func() is used only in the source file stat.c, it should be declared as shown here:
/* stat.c */
#include <stdio.h>
static int stat_func(int, int); /* static declaration of stat_func() */
void main(void);
void main(void)
{
...
rc = stat_func(1, 2);
...
}
/* definition (body) of stat_func() */
static int stat_func(int arg1, int arg2)
{
...
return rc;
}
In this example, the function named stat_func() is never used outside of the source file stat.c. There is therefore no reason for the prototype (or declaration) of the function to be visible outside of the stat.c source file. Thus, to avoid any confusion with other functions that might have the same name, the declaration of stat_func() should be put in the same source file as the declaration of stat_func().
In the following example, the function glob_func() is declared and used in the source file global.c and is used in the source file extern.c. Because glob_func() is used outside of the source file in which it's declared, the declaration of glob_func() should be put in a header file (in this example, named proto.h) to be included in both the global.c and the extern.c source files. This is how it's done:
/* proto.h */
int glob_func(int, int); /* declaration of the glob_func() function */
/* global.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void main(void);
void main(void)
{
...
rc = glob_func(1, 2);
...
}
/* definition (body) of the glob_func() function */
int glob_func(int arg1, int arg2)
{
...
return rc;
}
/* extern.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void ext_func(void);
void ext_func(void)
{
...
/* call glob_func(), which is defined in the global.c source file */
rc = glob_func(10, 20);
...
}
In the preceding example, the declaration of glob_func() is put in the header file named proto.h becauseglob_func() is used in both the global.c and the extern.c source files. Now, whenever glob_func() is going to be used, you simply need to include the proto.h header file, and you will automatically have the function's declaration. This will help your compiler when it is checking parameters and return values from global functions you are using in your programs. Notice that your function declarations should always appear before the first function declaration in your source file.
In general, if you think your function might be of some use outside of the current source file, you should put its declaration in a header file so that other modules can access it. Otherwise, if you are sure your function will never be used outside of the current source file, you should declare the function as static and include the declaration only in the current source file.
2. Why should I prototype a function?
A function prototype tells the compiler what kind of arguments a function is looking to receive and what kind of return value a function is going to give back. This approach helps the compiler ensure that calls to a function are made correctly and that no erroneous type conversions are taking place. For instance, consider the following prototype:
int some_func(int, char*, long);
Looking at this prototype, the compiler can check all references (including the definition of some_func()) to ensure that three parameters are used (an integer, a character pointer, and then a long integer) and that a return value of type integer is received. If the compiler finds differences between the prototype and calls to the function or the definition of the function, an error or a warning can be generated to avoid errors in your source code. For instance, the following examples would be flagged as incorrect, given the preceding
prototype of some_func():
x = some_func(1); /* not enough arguments passed */
x = some_func("HELLO!", 1, "DUDE!"); /* wrong type of arguments used */
x = some_func(1, str, 2879, "T"); /* too many arguments passed */
/* In the following example, the return value expected
from some_func() is not an integer: */
long* lValue;
lValue = some_func(1, str, 2879); /* some_func() returns an int,
not a long* */
Using prototypes, the compiler can also ensure that the function definition, or body, is correct and correlates with the prototype. For instance, the following definition of some_func() is not the same as its prototype, and it therefore would be flagged by the compiler:
int some_func(char* string, long lValue, int iValue) /* wrong order of
parameters */
{
...
}
The bottom line on prototypes is that you should always include them in your source code because they provide a good error-checking mechanism to ensure that your functions are being used correctly. Besides, many of today's popular compilers give you warnings when compiling if they can't find a prototype for a function that is being referenced.
3. How many parameters should a function have?
There is no set number or "guideline" limit to the number of parameters your functions can have. However, it is considered bad programming style for your functions to contain an inordinately high (eight or more) number of parameters. The number of parameters a function has also directly affects the speed at which it is called—the more parameters, the slower the function call. Therefore, if possible, you should minimize the number of parameters you use in a function. If you are using more than four parameters, you might want to rethink your function design and calling conventions.
One technique that can be helpful if you find yourself with a large number of function parameters is to put your function parameters in a structure. Consider the following program, which contains a function namedprint_report() that uses 10 parameters. Instead of making an enormous function declaration and proto- type, theprint_report() function uses a structure to get its parameters:
#include <stdio.h>
typedef struct
{
int orientation;
char rpt_name[25];
char rpt_path[40];
int destination;
char output_file[25];
int starting_page;
int ending_page;
char db_name[25];
char db_path[40];
int draft_quality;
} RPT_PARMS;
void main(void);
int print_report(RPT_PARMS*);
void main(void)
{
RPT_PARMS rpt_parm; /* define the report parameter
structure variable */
...
/* set up the report parameter structure variable to pass to the
print_report() function */
rpt_parm.orientation = ORIENT_LANDSCAPE;
rpt_parm.rpt_name = "QSALES.RPT";
rpt_parm.rpt_path = "C:\REPORTS";
rpt_parm.destination = DEST_FILE;
rpt_parm.output_file = "QSALES.TXT";
rpt_parm.starting_page = 1;
rpt_parm.ending_page = RPT_END;
rpt_parm.db_name = "SALES.DB";
rpt_parm.db_path = "C:\DATA";
rpt_parm.draft_quality = TRUE;
/* Call the print_report() function, passing it a pointer to the
parameters instead of passing it a long list of 10 separate
parameters. */
ret_code = print_report(&rpt_parm);
...
}
int print_report(RPT_PARMS* p)
{
int rc;
...
/* access the report parameters passed to the print_report()
function */
orient_printer(p->orientation);
set_printer_quality((p->draft_quality == TRUE) ? DRAFT : NORMAL);
...
return rc;
}
The preceding example avoided a large, messy function prototype and definition by setting up a predefined structure of type RPT_PARMS to hold the 10 parameters that were needed by the print_report() function. The only possible disadvantage to this approach is that by removing the parameters from the function definition, you are bypassing the compiler's capability to type-check each of the parameters for validity during the compile stage.
Generally, you should keep your functions small and focused, with as few parameters as possible to help with execution speed. If you find yourself writing lengthy functions with many parameters, maybe you should rethink your function design or consider using the structure-passing technique presented here. Additionally, keeping your functions small and focused will help when you are trying to isolate and fix bugs in your programs.
4. What is a static function?
A static function is a function whose scope is limited to the current source file. Scope refers to the visibility of a function or variable. If the function or variable is visible outside of the current source file, it is said to have global, orexternal, scope. If the function or variable is not visible outside of the current source file, it is said to have local, orstatic, scope.
A static function therefore can be seen and used only by other functions within the current source file. When you have a function that you know will not be used outside of the current source file or if you have a function that you do not want being used outside of the current source file, you should declare it as static. Declaring local functions as static is considered good programming practice. You should use static functions often to avoid possible conflicts with external functions that might have the same name.
For instance, consider the following example program, which contains two functions. The first function,open_customer_table(), is a global function that can be called by any module. The second function,open_customer_indexes(), is a local function that will never be called by another module. This is because you can't have the customer's index files open without first having the customer table open. Here is the code:
#include <stdio.h>
int open_customer_table(void); /* global function, callable from
any module */
static int open_customer_indexes(void); /* local function, used only in
this module */
int open_customer_table(void)
{
int ret_code;
/* open the customer table */
...
if (ret_code == OK)
{
ret_code = open_customer_indexes();
}
return ret_code;
}
static int open_customer_indexes(void)
{
int ret_code;
/* open the index files used for this table */
...
return ret_code;
}
Generally, if the function you are writing will not be used outside of the current source file, you should declare it asstatic.
5. Should a function contain a return statement if it does not return a value?
In C, void functions (those that do not return a value to the calling function) are not required to include a return statement. Therefore, it is not necessary to include a return statement in your functions declared as being void.
In some cases, your function might trigger some critical error, and an immediate exit from the function might be necessary. In this case, it is perfectly acceptable to use a return statement to bypass the rest of the function's code. However, keep in mind that it is not considered good programming practice to litter your functions with return statements-generally, you should keep your function's exit point as focused and clean as possible.
6. How can you pass an array to a function by value?
An array can be passed to a function by value by declaring in the called function the array name with square brackets ([ and ]) attached to the end. When calling the function, simply pass the address of the array (that is, the array's name) to the called function. For instance, the following program passes the array x[] to the function namedbyval_func() by value:
#include <stdio.h>
void byval_func(int[]); /* the byval_func() function is passed an
integer array by value */
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call byval_func(), passing the x array by value. */
byval_func(x);
}
/* The byval_function receives an integer array by value. */
void byval_func(int i[])
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", i[y]);
}
In this example program, an integer array named x is defined and initialized with 10 values. The functionbyval_func() is declared as follows:
int byval_func(int[]);
The int[] parameter tells the compiler that the byval_func() function will take one argument—an array of integers. When the byval_func() function is called, you pass the address of the array to byval_func():byval_func(x);
Because the array is being passed by value, an exact copy of the array is made and placed on the stack. The called function then receives this copy of the array and can print it. Because the array passed to byval_func() is a copy of the original array, modifying the array within the byval_func() function has no effect on the original array.
Passing arrays of any kind to functions can be very costly in several ways. First, this approach is very inefficient because an entire copy of the array must be made and placed on the stack. This takes up valuable program time, and your program execution time is degraded. Second, because a copy of the array is made, more memory (stack) space is required. Third, copying the array requires more code generated by the compiler, so your program is larger.
Instead of passing arrays to functions by value, you should consider passing arrays to functions by reference: this means including a pointer to the original array. When you use this method, no copy of the array is made. Your programs are therefore smaller and more efficient, and they take up less stack space. To pass an array by reference, you simply declare in the called function prototype a pointer to the data type you are holding in the array.
Consider the following program, which passes the same array (x) to a function:
#include <stdio.h>
void const_func(const int*);
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call const_func(), passing the x array by reference. */
const_func(x);
}
/* The const_function receives an integer array by reference.
Notice that the pointer is declared as const, which renders
it unmodifiable by the const_func() function. */
void const_func(const int* i)
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", *(i+y));
}
In the preceding example program, an integer array named x is defined and initialized with 10 values. The function const_func() is declared as follows:
int const_func(const int*);
The const int* parameter tells the compiler that the const_func() function will take one argument—a constant pointer to an integer. When the const_func() function is called, you pass the address of the array toconst_func():
const_func(x);
Because the array is being passed by reference, no copy of the array is made and placed on the stack. The called function receives simply a constant pointer to an integer. The called function must be coded to be smart enough to know that what it is really receiving is a constant pointer to an array of integers. The const modifier is used to prevent the const_func() from accidentally modifying any elements of the original array.
The only possible drawback to this alternative method of passing arrays is that the called function must be coded correctly to access the array—it is not readily apparent by the const_func() function prototype or definition that it is being passed a reference to an array of integers. You will find, however, that this method is much quicker and more efficient, and it is recommended when speed is of utmost importance.
7. Is it possible to execute code even after the program exits the main() function?
The standard C library provides a function named atexit() that can be used to perform "cleanup" operations when your program terminates. You can set up a set of functions you want to perform automatically when your program exits by passing function pointers to the atexit() function. Here's an example of a program that uses the atexit()function:
#include <stdio.h>
#include <stdlib.h>
void close_files(void);
void print_registration_message(void);
int main(int, char**);
int main(int argc, char** argv)
{
...
atexit(print_registration_message);
atexit(close_files);
while (rec_count < max_records)
{
process_one_record();
}
exit(0);
}
This example program uses the atexit() function to signify that the close_files() function and theprint_registration_message() function need to be called automatically when the program exits. When themain() function ends, these two functions will be called to close the files and print the registration message. There are two things that should be noted regarding the atexit() function. First, the functions you specify to execute at program termination must be declared as void functions that take no parameters. Second, the functions you designate with the atexit() function are stacked in the order in which they are called with atexit(), and therefore they are executed in a last-in, first-out (LIFO) method. Keep this information in mind when using the atexit() function. In the preceding example, the atexit() function is stacked as shown here:
atexit(print_registration_message);
atexit(close_files);
Because the LIFO method is used, the close_files() function will be called first, and then theprint_registration_message() function will be called.
The atexit() function can come in handy when you want to ensure that certain functions (such as closing your program's data files) are performed before your program terminates.
8. What does a function declared as PASCAL do differently?
A C function declared as PASCAL uses a different calling convention than a "regular" C function. Normally, C function parameters are passed right to left; with the PASCAL calling convention, the parameters are passed left to right.
Consider the following function, which is declared normally in a C program:
int regular_func(int, char*, long);
Using the standard C calling convention, the parameters are pushed on the stack from right to left. This means that when the regular_func() function is called in C, the stack will contain the following parameters:
long
char*
int
The function calling regular_func() is responsible for restoring the stack when regular_func() returns.
When the PASCAL calling convention is being used, the parameters are pushed on the stack from left to right.
Consider the following function, which is declared as using the PASCAL calling convention:
int PASCAL pascal_func(int, char*, long);
When the function pascal_func() is called in C, the stack will contain the following parameters:
int
char*
long
The function being called is responsible for restoring the stack pointer. Why does this matter? Is there any benefit to using PASCAL functions?
Functions that use the PASCAL calling convention are more efficient than regular C functions—the function calls tend to be slightly faster. Microsoft Windows is an example of an operating environment that uses the PASCAL calling convention. The Windows SDK (Software Development Kit) contains hundreds of functions declared as PASCAL.
When Windows was first designed and written in the late 1980s, using the PASCAL modifier tended to make a noticeable difference in program execution speed. In today's world of fast machinery, the PASCAL modifier is much less of a catalyst when it comes to the speed of your programs. In fact, Microsoft has abandoned the PASCAL calling convention style for the Windows NT operating system.
In your world of programming, if milliseconds make a big difference in your programs, you might want to use the PASCAL modifier when declaring your functions. Most of the time, however, the difference in speed is hardly noticeable, and you would do just fine to use C's regular calling convention.
9. Is using exit() the same as using return?
No. The exit() function is used to exit your program and return control to the operating system. The return statement is used to return from a function and return control to the calling function. If you issue a return from themain() function, you are essentially returning control to the calling function, which is the operating system. In this case, the return statement and exit() function are similar. Here is an example of a program that uses the exit()function and return statement:
#include <stdio.h>
#include <stdlib.h>
int main(int, char**);
int do_processing(void);
int do_something_daring();
int main(int argc, char** argv)
{
int ret_code;
if (argc < 3)
{
printf("Wrong number of arguments used!\n");
/* return 1 to the operating system */
exit(1);
}
ret_code = do_processing();
...
/* return 0 to the operating system */
exit(0);
}
int do_processing(void)
{
int rc;
rc = do_something_daring();
if (rc == ERROR)
{
printf("Something fishy is going on around here..."\n);
/* return rc to the operating system */
exit(rc);
}
/* return 0 to the calling function */
return 0;
}
In the main() function, the program is exited if the argument count (argc) is less than 3. The statement exit(1);tells the program to exit and return the number 1 to the operating system. The operating system can then decide what to do based on the return value of the program. For instance, many DOS batch files check the environment variable named ERRORLEVEL for the return value of executable programs.
POINTERS:
1. What is indirection?
If you declare a variable, its name is a direct reference to its value. If you have a pointer to a variable, or any other object in memory, you have an indirect reference to its value. If p is a pointer, the value of p is the address of the object. *p means "apply the indirection operator to p"; its value is the value of the object that p points to. (Some people would read it as "Go indirect on p.")
*p is an lvalue; like a variable, it can go on the left side of an assignment operator, to change the value. If p is a pointer to a constant, *p is not a modifiable lvalue; it can't go on the left side of an assignment.
Consider the following program. It shows that when p points to i, *p can appear wherever i can.
#include <stdio.h>
int main()
{
int i;
int *p;
i = 5;
p = & i; /* now *p == i */
printf("i=%d, p=%P, *p=%d\n", i, p, *p);
*p = 6; /* same as i = 6 */
printf("i=%d, p=%P, *p=%d\n", i, p, *p);
return 0;
}
After p points to i (p = &i), you can print i or *p and get the same thing. You can even assign to *p, and the result is the same as if you had assigned to i.
2. How many levels of pointers can you have?
The answer depends on what you mean by "levels of pointers." If you mean "How many levels of indirection can you have in a single declaration?" the answer is "At least 12."
int i = 0;
int *ip01 = & i;
int **ip02 = & ip01;
int ***ip03 = & ip02;
int ****ip04 = & ip03;
int *****ip05 = & ip04;
int ******ip06 = & ip05;
int *******ip07 = & ip06;
int ********ip08 = & ip07;
int *********ip09 = & ip08;
int **********ip10 = & ip09;
int ***********ip11 = & ip10;
int ************ip12 = & ip11;
************ip12 = 1; /* i = 1 */
If you mean "How many levels of pointer can you use before the program gets hard to read," that's a matter of taste, but there is a limit. Having two levels of indirection (a pointer to a pointer to something) is common. Any more than that gets a bit harder to think about easily; don't do it unless the alternative would be worse.
If you mean "How many levels of pointer indirection can you have at runtime," there's no limit. This point is particularly important for circular lists, in which each node points to the next. Your program can follow the pointers forever.
Consider the following program "A circular list that uses infinite indirection".
/* Would run forever if you didn't limit it to MAX */
#include <stdio.h>
struct circ_list
{
char value[ 3 ]; /* e.g., "st" (incl '\0') */
struct circ_list *next;
};
struct circ_list suffixes[] = {
"th", & suffixes[ 1 ], /* 0th */
"st", & suffixes[ 2 ], /* 1st */
"nd", & suffixes[ 3 ], /* 2nd */
"rd", & suffixes[ 4 ], /* 3rd */
"th", & suffixes[ 5 ], /* 4th */
"th", & suffixes[ 6 ], /* 5th */
"th", & suffixes[ 7 ], /* 6th */
"th", & suffixes[ 8 ], /* 7th */
"th", & suffixes[ 9 ], /* 8th */
"th", & suffixes[ 0 ], /* 9th */
};
#define MAX 20
int main()
{
int i = 0;
struct circ_list *p = suffixes;
while (i <= MAX)
{
printf( "%d%s\n", i, p->value );
++i;
p = p->next;
}
return 0;
}
Each element in suffixes has one suffix (two characters plus the terminating NUL character) and a pointer to the next element. next is a pointer to something that has a pointer, to something that has a pointer, ad infinitum.
The example is dumb because the number of elements in suffixes is fixed. It would be simpler to have an array of suffixes and to use the i%10'th element. In general, circular lists can grow and shrink.
3. What is a null pointer?
There are times when it's necessary to have a pointer that doesn't point to anything. The macro NULL, defined in<stddef.h>, has a value that's guaranteed to be different from any valid pointer. NULL is a literal zero, possibly cast to void* or char*. Some people, notably C++ programmers, prefer to use 0 rather than NULL.
You can't use an integer when a pointer is required. The exception is that a literal zero value can be used as the nullpointer. (It doesn't have to be a literal zero, but that's the only useful case. Any expression that can be evaluated at compile time, and that is zero, will do. It's not good enough to have an integer variable that might be zero at runtime.)
4. When is a null pointer used?
The null pointer is used in three ways:
1. To stop indirection in a recursive data structure.
2. As an error value.
3. As a sentinel value.
1. Using a Null Pointer to Stop Indirection or Recursion
Recursion is when one thing is defined in terms of itself. A recursive function calls itself. The following factorial function calls itself and therefore is considered recursive:
/* Dumb implementation; should use a loop */
unsigned factorial( unsigned i )
{
if ( i == 0 || i == 1 )
{
return 1;
}
else
{
return i * factorial( i - 1 );
}
}
A recursive data structure is defined in terms of itself. The simplest and most common case is a (singularly) linked list. Each element of the list has some value, and a pointer to the next element in the list:
struct string_list
{
char *str; /* string (in this case) */
struct string_list *next;
};
There are also doubly linked lists (which also have a pointer to the preceding element) and trees and hash tables and lots of other neat stuff. You'll find them described in any good book on data structures. You refer to a linked list with a pointer to its first element. That's where the list starts; where does it stop? This is where the null pointer comes in. In the last element in the list, the next field is set to NULL when there is no following element. To visit all the elements in a list, start at the beginning and go indirect on the next pointer as long as it's not null:
while ( p != NULL )
{
/* do something with p->str */
p = p->next;
}
Notice that this technique works even if p starts as the null pointer.
2. Using a Null Pointer As an Error Value
The second way the null pointer can be used is as an error value. Many C functions return a pointer to some object. If so, the common convention is to return a null pointer as an error code:
if ( setlocale( cat, loc_p ) == NULL )
{
/* setlocale() failed; do something */
/* ... */
}
This can be a little confusing. Functions that return pointers almost always return a valid pointer (one that doesn't compare equal to zero) on success, and a null pointer (one that compares equal to zero) pointer on failure. Other functions return an int to show success or failure; typically, zero is success and nonzero is failure. That way, a "true" return value means "do some error handling":
if ( raise( sig ) != 0 ) {
/* raise() failed; do something */
/* ... */
}
The success and failure return values make sense one way for functions that return ints, and another for functions that return pointers. Other functions might return a count on success, and either zero or some negative value on failure. As with taking medicine, you should read the instructions first.
Using a Null Pointer As a Sentinel Value
The third way a null pointer can be used is as a "sentinel" value. A sentinel value is a special value that marks the end of something. For example, in main(), argv is an array of pointers. The last element in the array (argv[argc]) is always a null pointer. That's a good way to run quickly through all the elements:
/* A simple program that prints all its arguments.
It doesn't use argc ("argument count"); instead,
it takes advantage of the fact that the last value
in argv ("argument vector") is a null pointer. */
#include <stdio.h>
#include <assert.h>
int
main( int argc, char **argv)
{
int i;
printf("program name = \"%s\"\n", argv[0]);
for (i=1; argv[i] != NULL; ++i)
printf("argv[%d] = \"%s\"\n", i, argv[i]);
assert(i == argc);
return 0;
}
5. What is a void pointer?
A void pointer is a C convention for "a raw address." The compiler has no idea what type of object a void pointer "really points to." If you write
int *ip;
ip points to an int. If you write
void *p;
p doesn't point to a void!
In C and C++, any time you need a void pointer, you can use another pointer type. For example, if you have a char*, you can pass it to a function that expects a void*. You don't even need to cast it. In C (but not in C++), you can use a void* any time you need any kind of pointer, without casting. (In C++, you need to cast it.)
6. When is a void pointer used?
A void pointer is used for working with raw memory or for passing a pointer to an unspecified type.
Some C code operates on raw memory. When C was first invented, character pointers (char *) were used for that. Then people started getting confused about when a character pointer was a string, when it was a character array, and when it was raw memory.
For example, strcpy() is used to copy data from one string to another, and strncpy() is used to copy at most a certain length string to another:
char *strcpy( char *str1, const char *str2 );
char *strncpy( char *str1, const char *str2, size_t n );
memcpy() is used to move data from one location to another:
void *memcpy( void *addr1, void *addr2, size_t n );
void pointers are used to mean that this is raw memory being copied. NUL characters (zero bytes) aren't significant, and just about anything can be copied. Consider the following code:
#include "thingie.h" /* defines struct thingie */
struct thingie *p_src, *p_dest;
/* ... */
memcpy( p_dest, p_src, sizeof( struct thingie) * numThingies );
This program is manipulating some sort of object stored in a struct thingie. p1 and p2 point to arrays, or parts of arrays, of struct thingies. The program wants to copy numThingies of these, starting at the one pointed to byp_src, to the part of the array beginning at the element pointed to by p_dest. memcpy() treats p_src and p_destas pointers to raw memory; sizeof( struct thingie) * numThingies is the number of bytes to be copied.
7. Can you subtract pointers from each other? Why would you?
If you have two pointers into the same array, you can subtract them. The answer is the number of elements between the two elements.
Consider the street address analogy presented in the introduction of this chapter. Say that I live at 118 Fifth Avenue and that my neighbor lives at 124 Fifth Avenue. The "size of a house" is two (on my side of the street, sequential even numbers are used), so my neighbor is (124-118)/2 (or 3) houses up from me. (There are two houses between us,120 and 122; my neighbor is the third.) You might do this subtraction if you're going back and forth between indices and pointers.
You might also do it if you're doing a binary search. If p points to an element that's before what you're looking for, andq points to an element that's after it, then (q-p)/2+p points to an element between p and q. If that element is before what you want, look between it and q. If it's after what you want, look between p and it.
(If it's what you're looking for, stop looking.)
You can't subtract arbitrary pointers and get meaningful answers. Someone might live at 110 Main Street, but I can't subtract 110 Main from 118 Fifth (and divide by 2) and say that he or she is four houses away!
If each block starts a new hundred, I can't even subtract 120 Fifth Avenue from 204 Fifth Avenue. They're on the same street, but in different blocks of houses (different arrays).
C won't stop you from subtracting pointers inappropriately. It won't cut you any slack, though, if you use the meaningless answer in a way that might get you into trouble.
When you subtract pointers, you get a value of some integer type. The ANSI C standard defines a typedef,ptrdiff_t, for this type. (It's in <stddef.h>.) Different compilers might use different types (int or long or whatever), but they all define ptrdiff_t appropriately.
Below is a simple program that demonstrates this point. The program has an array of structures, each 16 bytes long. The difference between array[0] and array[8] is 8 when you subtract struct stuff pointers, but 128 (hex 0x80) when you cast the pointers to raw addresses and then subtract.
If you subtract 8 from a pointer to array[8], you don't get something 8 bytes earlier; you get something 8 elements earlier.
#include <stdio.h>
#include <stddef.h>
struct stuff {
char name[16];
/* other stuff could go here, too */
};
struct stuff array[] = {
{ "The" },
{ "quick" },
{ "brown" },
{ "fox" },
{ "jumped" },
{ "over" },
{ "the" },
{ "lazy" },
{ "dog." },
{ "" }
};
int main()
{
struct stuff *p0 = & array[0];
struct stuff *p8 = & array[8];
ptrdiff_t diff = p8 - p0;
ptrdiff_t addr_diff = (char*) p8 - (char*) p0;
/* cast the struct stuff pointers to void* */
printf("& array[0] = p0 = %P\n", (void*) p0);
printf("& array[8] = p8 = %P\n", (void*) p8);
/* cast the ptrdiff_t's to long's
(which we know printf() can handle) */
printf("The difference of pointers is %ld\n",
(long) diff);
printf("The difference of addresses is %ld\n",
(long) addr_diff);
printf("p8 - 8 = %P\n", (void*) (p8 - 8));
printf("p0 + 8 = %P (same as p8)\n", (void*) (p0 + 8));
return 0;
}
8. Is NULL always defined as 0(zero)?
NULL is defined as either 0 or (void*)0. These values are almost identical; either a literal zero or a void pointer is converted automatically to any kind of pointer, as necessary, whenever a pointer is needed (although the compiler can't always tell when a pointer is needed).
1. Where in memory are my variables stored?
Variables can be stored in several places in memory, depending on their lifetime. Variables that are defined outside any function (whether of global or file static scope), and variables that are defined inside a function as static variables, exist for the lifetime of the program's execution. These variables are stored in the "data segment." The data segment is a fixed-size area in memory set aside for these variables. The data segment is subdivided into two parts, one for initialized variables and another for uninitialized variables.
Variables that are defined inside a function as auto variables (that are not defined with the keyword static) come into existence when the program begins executing the block of code (delimited by curly braces {}) containing them, and they cease to exist when the program leaves that block of code.
Variables that are the arguments to functions exist only during the call to that function. These variables are stored on the "stack". The stack is an area of memory that starts out small and grows automatically up to some predefined limit. In DOS and other systems without virtual memory, the limit is set either when the program is compiled or when it begins executing. In UNIX and other systems with virtual memory, the limit is set by the system, and it is usually so large that it can be ignored by the programmer.
The third and final area doesn't actually store variables but can be used to store data pointed to by variables. Pointer variables that are assigned to the result of a call to the malloc() function contain the address of a dynamically allocated area of memory. This memory is in an area called the "heap." The heap is another area that starts out small and grows, but it grows only when the programmer explicitly calls malloc() or other memory allocation functions, such as calloc(). The heap can share a memory segment with either the data segment or the stack, or it can have its own segment. It all depends on the compiler options and operating system. The heap, like the stack, has a limit on how much it can grow, and the same rules apply as to how that limit is determined.
2. Do variables need to be initialized?
No. All variables should be given a value before they are used, and a good compiler will help you find variables that are used before they are set to a value. Variables need not be initialized, however. Variables defined outside a function or defined inside a function with the static keyword are already initialized to 0 for you if you do not explicitly initialize them.
Automatic variables are variables defined inside a function or block of code without the static keyword. These variables have undefined values if you don't explicitly initialize them. If you don't initialize an automatic variable, you must make sure you assign to it before using the value.
Space on the heap allocated by calling malloc() contains undefined data as well and must be set to a known value before being used. Space allocated by calling calloc() is set to 0 for you when it is allocated.
3. What is page thrashing?
Some operating systems (such as UNIX or Windows in enhanced mode) use virtual memory. Virtual memory is a technique for making a machine behave as if it had more memory than it really has, by using disk space to simulate RAM (random-access memory). In the 80386 and higher Intel CPU chips, and in most other modern microprocessors (such as the Motorola 68030, Sparc, and Power PC), exists a piece of hardware called the Memory Management Unit, or MMU.
The MMU treats memory as if it were composed of a series of "pages." A page of memory is a block of contiguous bytes of a certain size, usually 4096 or 8192 bytes. The operating system sets up and maintains a table for each running program called the Process Memory Map, or PMM. This is a table of all the pages of memory that program can access and where each is really located.
Every time your program accesses any portion of memory, the address (called a "virtual address") is processed by the MMU. The MMU looks in the PMM to find out where the memory is really located (called the "physical address"). The physical address can be any location in memory or on disk that the operating system has assigned for it. If the location the program wants to access is on disk, the page containing it must be read from disk into memory, and the PMM must be updated to reflect this action (this is called a "page fault"). Hope you're still with me, because here's the tricky part. Because accessing the disk is so much slower than accessing RAM, the operating system tries to keep as much of the virtual memory as possible in RAM. If you're running a large enough program (or several small programs at once), there might not be enough RAM to hold all the memory used by the programs, so some of it must be moved out of RAM and onto disk (this action is called "paging out").
The operating system tries to guess which areas of memory aren't likely to be used for a while (usually based on how the memory has been used in the past). If it guesses wrong, or if your programs are accessing lots of memory in lots of places, many page faults will occur in order to read in the pages that were paged out. Because all of RAM is being used, for each page read in to be accessed, another page must be paged out. This can lead to more page faults, because now a different page of memory has been moved to disk. The problem of many page faults occurring in a short time, called "page thrashing," can drastically cut the performance of a system.
Programs that frequently access many widely separated locations in memory are more likely to cause page thrashing on a system. So is running many small programs that all continue to run even when you are not actively using them. To reduce page thrashing, you can run fewer programs simultaneously. Or you can try changing the way a large program works to maximize the capability of the operating system to guess which pages won't be needed. You can achieve this effect by caching values or changing lookup algorithms in large data structures, or sometimes by changing to a memory allocation library which provides an implementation of malloc() that allocates memory more efficiently. Finally, you might consider adding more RAM to the system to reduce the need to page out.
4. What is a const pointer?
The access modifier keyword const is a promise the programmer makes to the compiler that the value of a variable will not be changed after it is initialized. The compiler will enforce that promise as best it can by not enabling the programmer to write code which modifies a variable that has been declared const.
A "const pointer," or more correctly, a "pointer to const," is a pointer which points to data that is const (constant, or unchanging). A pointer to const is declared by putting the word const at the beginning of the pointer declaration. This declares a pointer which points to data that can't be modified. The pointer itself can be modified. The following example illustrates some legal and illegal uses of a const pointer:
const char *str = "hello";
char c = *str /* legal */
str++; /* legal */
*str = 'a'; /* illegal */
str[1] = 'b'; /* illegal */
The first two statements here are legal because they do not modify the data that str points to. The next two statements are illegal because they modify the data pointed to by str.
Pointers to const are most often used in declaring function parameters. For instance, a function that counted the number of characters in a string would not need to change the contents of the string, and it might be written this way:
my_strlen(const char *str)
{
int count = 0;
while (*str++)
{
count++;
}
return count;
}
Note that non-const pointers are implicitly converted to const pointers when needed, but const pointers are not converted to non-const pointers. This means that my_strlen() could be called with either a const or a non-const character pointer.
5. When should the register modifier be used? Does it really help?
The register modifier hints to the compiler that the variable will be heavily used and should be kept in the CPU's registers, if possible, so that it can be accessed faster. There are several restrictions on the use of the register modifier.
First, the variable must be of a type that can be held in the CPU's register. This usually means a single value of a size less than or equal to the size of an integer. Some machines have registers that can hold floating-point numbers as well.
Second, because the variable might not be stored in memory, its address cannot be taken with the unary and operator. An attempt to do so is flagged as an error by the compiler. Some additional rules affect how useful the register modifier is. Because the number of registers is limited, and because some registers can hold only certain types of data (such as pointers or floating-point numbers), the number and types of register modifiers that will actually have any effect are dependent on what machine the program will run on. Any additional register modifiers are silently ignored by the compiler.
Also, in some cases, it might actually be slower to keep a variable in a register because that register then becomes unavailable for other purposes or because the variable isn't used enough to justify the overhead of loading and storing it.
So when should the register modifier be used? The answer is never, with most modern compilers. Early C compilers did not keep any variables in registers unless directed to do so, and the register modifier was a valuable addition to the language. C compiler design has advanced to the point, however, where the compiler will usually make better decisions than the programmer about which variables should be stored in registers. In fact, many compilers actually ignore the register modifier, which is perfectly legal, because it is only a hint and not a directive.
In the rare event that a program is too slow, and you know that the problem is due to a variable being stored in memory, you might try adding the register modifier as a last resort, but don't be surprised if this action doesn't change the speed of the program.
6. When should the volatile modifier be used?
The volatile modifier is a directive to the compiler's optimizer that operations involving this variable should not be optimized in certain ways. There are two special cases in which use of the volatile modifier is desirable. The first case involves memory-mapped hardware (a device such as a graphics adaptor that appears to the computer's hardware as if it were part of the computer's memory), and the second involves shared memory (memory used by two or more programs running simultaneously).
Most computers have a set of registers that can be accessed faster than the computer's main memory. A good compiler will perform a kind of optimization called "redundant load and store removal." The compiler looks for places in the code where it can either remove an instruction to load data from memory because the value is already in a register, or remove an instruction to store data to memory because the value can stay in a register until it is changed again anyway.
If a variable is a pointer to something other than normal memory, such as memory-mapped ports on a peripheral, redundant load and store optimizations might be detrimental. For instance, here's a piece of code that might be used to time some operation:
time_t time_addition(volatile const struct timer *t, int a)
{
int n;
int x;
time_t then;
x = 0;
then = t->value;
for (n = 0; n < 1000; n++)
{
x = x + a;
}
return t->value - then;
}
In this code, the variable t->value is actually a hardware counter that is being incremented as time passes. The function adds the value of a to x 1000 times, and it returns the amount the timer was incremented by while the 1000 additions were being performed.
Without the volatile modifier, a clever optimizer might assume that the value of t does not change during the execution of the function, because there is no statement that explicitly changes it. In that case, there's no need to read it from memory a second time and subtract it, because the answer will always be 0. The compiler might therefore "optimize" the function by making it always return 0.
If a variable points to data in shared memory, you also don't want the compiler to perform redundant load and store optimizations. Shared memory is normally used to enable two programs to communicate with each other by having one program store data in the shared portion of memory and the other program read the same portion of memory. If the compiler optimizes away a load or store of shared memory, communication between the two programs will be affected.
7. Can a variable be both const and volatile?
Yes. The const modifier means that this code cannot change the value of the variable, but that does not mean that the value cannot be changed by means outside this code. For instance, the timer structure was accessed through a volatile const pointer. The function itself did not change the value of the timer, so it was declared const. However, the value was changed by hardware on the computer, so it was declared volatile. If a variable is both const andvolatile, the two modifiers can appear in either order.
8. When should the const modifier be used?
There are several reasons to use const pointers. First, it allows the compiler to catch errors in which code accidentally changes the value of a variable, as in
while (*str = 0) /* programmer meant to write *str != 0 */
{
/* some code here */
str++;
}
in which the = sign is a typographical error. Without the const in the declaration of str, the program would compile but not run properly.
Another reason is efficiency. The compiler might be able to make certain optimizations to the code generated if it knows that a variable will not be changed.
Any function parameter which points to data that is not modified by the function or by any function it calls should declare the pointer a pointer to const. Function parameters that are passed by value (rather than through a pointer) can be declared const if neither the function nor any function it calls modifies the data.
In practice, however, such parameters are usually declared const only if it might be more efficient for the compiler to access the data through a pointer than by copying it.
9. How reliable are floating-point comparisons?
Floating-point numbers are the "black art" of computer programming. One reason why this is so is that there is no optimal way to represent an arbitrary number. The Institute of Electrical and Electronic Engineers (IEEE) has developed a standard for the representation of floating-point numbers, but you cannot guarantee that every machine you use will conform to the standard.
Even if your machine does conform to the standard, there are deeper issues. It can be shown mathematically that there are an infinite number of "real" numbers between any two numbers. For the computer to distinguish between two numbers, the bits that represent them must differ. To represent an infinite number of different bit patterns would take an infinite number of bits. Because the computer must represent a large range of numbers in a small number of bits (usually 32 to 64 bits), it has to make approximate representations of most numbers.
Because floating-point numbers are so tricky to deal with, it's generally bad practice to compare a floating- point number for equality with anything. Inequalities are much safer. If, for instance, you want to step through a range of numbers in small increments, you might write this:
#include <stdio.h>
const float first = 0.0;
const float last = 70.0;
const float small = 0.007;
main()
{
float f;
for (f = first; f != last && f < last + 1.0; f += small)
;
printf("f is now %g\n", f);
}
However, rounding errors and small differences in the representation of the variable small might cause f to never be equal to last (it might go from being just under it to being just over it). Thus, the loop would go past the value last. The inequality f < last + 1.0 has been added to prevent the program from running on for a very long time if this happens. If you run this program and the value printed for f is 71 or more, this is what has happened.
A safer way to write this loop is to use the inequality f < last to test for the loop ending, as in this example:
float f;
for (f = first; f < last; f += small)
;
You could even precompute the number of times the loop should be executed and use an integer to count iterations of the loop, as in this example:
float f;
int count = (last - first) / small;
for (f = first; count-- > 0; f += small)
10. How can you determine the maximum value that a numeric variable can hold?
The easiest way to find out how large or small a number that a particular type can hold is to use the values defined in the ANSI standard header file limits.h. This file contains many useful constants defining the values that can be held by various types, including these:
Value | Description | |
---|---|---|
CHAR_BIT | - | Number of bits in a char |
CHAR_MAX | - | Maximum decimal integer value of a char |
CHAR_MIN | - | Minimum decimal integer value of a char |
MB_LEN_MAX | - | Maximum number of bytes in a multibyte character |
INT_MAX | - | Maximum decimal value of an int |
INT_MIN | - | Minimum decimal value of an int |
LONG_MAX | - | Maximum decimal value of a long |
LONG_MIN | - | Minimum decimal value of a long |
SCHAR_MAX | - | Maximum decimal integer value of a signed char |
SCHAR_MIN | - | Minimum decimal integer value of a signed char |
SHRT_MAX | - | Maximum decimal value of a short |
SHRT_MIN | - | Minimum decimal value of a short |
UCHAR_MAX | - | Maximum decimal integer value of unsigned char |
UINT_MAX | - | Maximum decimal value of an unsigned integer |
ULONG_MAX | - | Maximum decimal value of an unsigned long int |
USHRT_MAX | - | Maximum decimal value of an unsigned short int |
For integral types, on a machine that uses two's complement arithmetic (which is just about any machine you're likely to use), a signed type can hold numbers from -2(number of bits - 1) to +2(number of bits - 1) - 1.
An unsigned type can hold values from 0 to +2(number of bits)- 1. For instance, a 16-bit signed integer can hold numbers from -215(-32768) to +215 - 1 (32767).
11. Are there any problems with performing mathematical operations on different variable types?
C has three categories of built-in data types: pointer types, integral types, and floating-point types. Pointer types are the most restrictive in terms of the operations that can be performed on them. They are limited to
- subtraction of two pointers, valid only when both pointers point to elements in the same array. The result is the same as subtracting the integer subscripts corresponding to the two pointers.
+ addition of a pointer and an integral type. The result is a pointer that points to the element which would be selected by that integer.
Floating-point types consist of the built-in types float, double, and long double. Integral types consist of char, unsigned char, short, unsigned short, int, unsigned int, long, and unsigned long. All of these types can have the following arithmetic operations performed on them:
+ Addition
- Subtraction
* Multiplication
/ Division
Integral types also can have those four operations performed on them, as well as the following operations: % Modulo or remainder of division
<< Shift left
>> Shift right
& Bitwise AND operation
| Bitwise OR operation
^ Bitwise exclusive OR operation
! Logical negative operation
~ Bitwise "one's complement" operation
Although C permits "mixed mode" expressions (an arithmetic expression involving different types), it actually converts the types to be the same type before performing the operations (except for the case of pointer arithmetic described previously). The process of automatic type conversion is called "operator promotion."
12. What is operator promotion?
If an operation is specified with operands of two different types, they are converted to the smallest type that can hold both values. The result has the same type as the two operands wind up having. To interpret the rules, read the following table from the top down, and stop at the first rule that applies.
If Either Operand Is | And the Other Is | Change Them To | ||
---|---|---|---|---|
long double | - | any other type | - | long double |
double | - | any smaller type | - | double |
float | - | any smaller type | - | float |
unsigned long | - | any integral type | - | unsigned long |
long | - | unsigned > LONG_MAX | - | long |
long | - | any smaller type | - | long |
unsigned | - | any signed type | - | unsigned |
The following example code illustrates some cases of operator promotion. The variable f1 is set to 3/4. Because both 3 and 4 are integers, integer division is performed, and the result is the integer 0. The variable f2 is set to 3/4.0. Because 4.0 is a float, the number 3 is converted to a float as well, and the result is the float 0.75.
#include <stdio.h>
main()
{
float f1 = 3 / 4;
float f2 = 3 / 4.0;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
13. When should a type cast be used?
There are two situations in which to use a type cast. The first use is to change the type of an operand to an arithmetic operation so that the operation will be performed properly. The variable f1 is set to the result of dividing the integer iby the integer j. The result is 0, because integer division is used. The variable f2 is set to the result of dividing i by jas well. However, the (float) type cast causes i to be converted to a float. That in turn causes floating-point division to be used and gives the result 0.75.
#include <stdio.h>
main()
{
int i = 3;
int j = 4;
float f1 = i / j;
float f2 = (float) i / j;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
The second case is to cast pointer types to and from void * in order to interface with functions that expect or return void pointers. For example, the following line type casts the return value of the call to malloc() to be a pointer to afoo structure.
struct foo *p = (struct foo *) malloc(sizeof(struct foo));
14. When should a type cast not be used?
A type cast should not be used to override a const or volatile declaration. Overriding these type modifiers can cause the program to fail to run correctly.
A type cast should not be used to turn a pointer to one type of structure or data type into another. In the rare events in which this action is beneficial, using a union to hold the values makes the programmer's intentions clearer.
15. Is it acceptable to declare/define a variable in a C header?
A global variable that must be accessed from more than one file can and should be declared in a header file. In addition, such a variable must be defined in one source file. Variables should not be defined in header files, because the header file can be included in multiple source files, which would cause multiple definitions of the variable.
The ANSI C standard will allow multiple external definitions, provided that there is only one initialization. But because there's really no advantage to using this feature, it's probably best to avoid it and maintain a higher level of portability.
"Global" variables that do not have to be accessed from more than one file should be declared static and should not appear in a header file.
16. What is the difference between declaring a variable and defining a variable?
Declaring a variable means describing its type to the compiler but not allocating any space for it. Defining a variable means declaring it and also allocating space to hold the variable. You can also initialize a variable at the time it is defined. Here is a declaration of a variable and a structure, and two variable definitions, one with initialization:
extern int decl1; /* this is a declaration */
struct decl2
{
int member;
}; /* this just declares the type--no variable mentioned */
int def1 = 8; /* this is a definition */
int def2; /* this is a definition */
To put it another way, a declaration says to the compiler, "Somewhere in my program will be a variable with this name, and this is what type it is." A definition says, "Right here is this variable with this name and this type."
A variable can be declared many times, but it must be defined exactly once. For this reason, definitions do not belong in header files, where they might get #included into more than one place in your program.
17. Can static variables be declared in a header file?
You can't declare a static variable without defining it as well (this is because the storage class modifiers static andextern are mutually exclusive). A static variable can be defined in a header file, but this would cause each source file that included the header file to have its own private copy of the variable, which is probably not what was intended.
18. What is the benefit of using const for declaring constants?
The benefit of using the const keyword is that the compiler might be able to make optimizations based on the knowledge that the value of the variable will not change. In addition, the compiler will try to ensure that the values won't be changed inadvertently.
Of course, the same benefits apply to #defined constants. The reason to use const rather than #define to define a constant is that a const variable can be of any type (such as a struct, which can't be represented by a #definedconstant). Also, because a const variable is a real variable, it has an address that can be used, if needed, and it resides in only one place in memory.
1. What is the most efficient way to store flag values?
A flag is a value used to make a decision between two or more options in the execution of a program. For instance, the /w flag on the MS-DOS dir command causes the command to display filenames in several columns across the screen instead of displaying them one per line. In which a flag is used to indicate which of two possible types is held in a union. Because a flag has a small number of values (often only two), it is tempting to save memory space by not storing each flag in its own int or char.
Efficiency in this case is a tradeoff between size and speed. The most memory-space efficient way to store a flag value is as single bits or groups of bits just large enough to hold all the possible values. This is because most computers cannot address individual bits in memory, so the bit or bits of interest must be extracted from the bytes that contain it.
The most time-efficient way to store flag values is to keep each in its own integer variable. Unfortunately, this method can waste up to 31 bits of a 32-bit variable, which can lead to very inefficient use of memory. If there are only a few flags, it doesn't matter how they are stored. If there are many flags, it might be advantageous to store them packed in an array of characters or integers. They must then be extracted by a process called bit masking, in which unwanted bits are removed from the ones of interest.
Sometimes it is possible to combine a flag with another value to save space. It might be possible to use high- order bits of integers that have values smaller than what an integer can hold. Another possibility is that some data is always a multiple of 2 or 4, so the low-order bits can be used to store a flag.
2. What is meant by "bit masking"?
Bit masking means selecting only certain bits from byte(s) that might have many bits set. To examine some bits of a byte, the byte is bitwise "ANDed" with a mask that is a number consisting of only those bits of interest. For instance, to look at the one's digit (rightmost digit) of the variable flags, you bitwise AND it with a mask of one (the bitwise AND operator in C is &):
flags & 1;
To set the bits of interest, the number is bitwise "ORed" with the bit mask (the bitwise OR operator in C is |). For instance, you could set the one's digit of flags like so:
flags = flags | 1;
Or, equivalently, you could set it like this:
flags |= 1;
To clear the bits of interest, the number is bitwise ANDed with the one's complement of the bit mask. The "one's complement" of a number is the number with all its one bits changed to zeros and all its zero bits changed to ones. The one's complement operator in C is ~. For instance, you could clear the one's digit of flags like so:
flags = flags & ~1;
Or, equivalently, you could clear it like this:
flags &= ~1;
Sometimes it is easier to use macros to manipulate flag values.
Example Program : Macros that make manipulating flags easier.
/* Bit Masking */
/* Bit masking can be used to switch a character
between lowercase and uppercase */
#define BIT_POS(N) ( 1U << (N) )
#define SET_FLAG(N, F) ( (N) |= (F) )
#define CLR_FLAG(N, F) ( (N) &= -(F) )
#define TST_FLAG(N, F) ( (N) & (F) )
#define BIT_RANGE(N, M) ( BIT_POS((M)+1 - (N))-1 << (N) )
#define BIT_SHIFTL(B, N) ( (unsigned)(B) << (N) )
#define BIT_SHIFTR(B, N) ( (unsigned)(B) >> (N) )
#define SET_MFLAG(N, F, V) ( CLR_FLAG(N, F), SET_FLAG(N, V) )
#define CLR_MFLAG(N, F) ( (N) &= ~(F) )
#define GET_MFLAG(N, F) ( (N) & (F) )
#include <stdio.h>
void main()
{
unsigned char ascii_char = 'A'; /* char = 8 bits only */
int test_nbr = 10;
printf("Starting character = %c\n", ascii_char);
/* The 5th bit position determines if the character is
uppercase or lowercase.
5th bit = 0 - Uppercase
5th bit = 1 - Lowercase */
printf("\nTurn 5th bit on = %c\n", SET_FLAG(ascii_char, BIT_POS(5)) );
printf("Turn 5th bit off = %c\n\n", CLR_FLAG(ascii_char, BIT_POS(5)) );
printf("Look at shifting bits\n");
printf("=====================\n");
printf("Current value = %d\n", test_nbr);
printf("Shifting one position left = %d\n",
test_nbr = BIT_SHIFTL(test_nbr, 1) );
printf("Shifting two positions right = %d\n",
BIT_SHIFTR(test_nbr, 2) );
}
BIT_POS(N) takes an integer N and returns a bit mask corresponding to that single bit position (BIT_POS(0) returns a bit mask for the one's digit, BIT_POS(1) returns a bit mask for the two's digit, and so on). So instead of writing
#define A_FLAG 4096
#define B_FLAG 8192
you can write
#define A_FLAG BIT_POS(12)
#define B_FLAG BIT_POS(13)
which is less prone to errors.
The SET_FLAG(N, F) macro sets the bit at position F of variable N. Its opposite is CLR_FLAG(N, F), which clears the bit at position F of variable N. Finally, TST_FLAG(N, F) can be used to test the value of the bit at position F of variable N, as in
if (TST_FLAG(flags, A_FLAG))
/* do something */;
The macro BIT_RANGE(N, M) produces a bit mask corresponding to bit positions N through M, inclusive. With this macro, instead of writing
#define FIRST_OCTAL_DIGIT 7 /* 111 */
#define SECOND_OCTAL_DIGIT 56 /* 111000 */
you can write
#define FIRST_OCTAL_DIGIT BIT_RANGE(0, 2) /* 111 */
#define SECOND_OCTAL_DIGIT BIT_RANGE(3, 5) /* 111000 */
which more clearly indicates which bits are meant.
The macro BIT_SHIFT(B, N) can be used to shift value B into the proper bit range (starting with bit N). For instance, if you had a flag called C that could take on one of five possible colors, the colors might be defined like this:
#define C_FLAG BIT_RANGE(8, 10) /* 11100000000 */
/* here are all the values the C flag can take on */
#define C_BLACK BIT_SHIFTL(0, 8) /* 00000000000 */
#define C_RED BIT_SHIFTL(1, 8) /* 00100000000 */
#define C_GREEN BIT_SHIFTL(2, 8) /* 01000000000 */
#define C_BLUE BIT_SHIFTL(3, 8) /* 01100000000 */
#define C_WHITE BIT_SHIFTL(4, 8) /* 10000000000 */
#define C_ZERO C_BLACK
#define C_LARGEST C_WHITE
/* A truly paranoid programmer might do this */
#if C_LARGEST > C_FLAG
Cause an error message. The flag C_FLAG is not
big enough to hold all its possible values.
#endif /* C_LARGEST > C_FLAG */
The macro SET_MFLAG(N, F, V) sets flag F in variable N to the value V. The macro CLR_MFLAG(N, F) is identical to CLR_FLAG(N, F), except the name is changed so that all the operations on multibit flags have a similar naming convention. The macro GET_MFLAG(N, F) gets the value of flag F in variable N, so it can be tested, as in
if (GET_MFLAG(flags, C_FLAG) == C_BLUE)
/* do something */;
3. Are bit fields portable?
Bit fields are not portable. Because bit fields cannot span machine words, and because the number of bits in a machine word is different on different machines, a particular program using bit fields might not even compile on a particular machine.
Assuming that your program does compile, the order in which bits are assigned to bit fields is not defined. Therefore, different compilers, or even different versions of the same compiler, could produce code that would not work properly on data generated by compiled older code. Stay away from using bit fields, except in cases in which the machine can directly address bits in memory and the compiler can generate code to take advantage of it and the increase in speed to be gained would be essential to the operation of the program.
4. Is it better to bitshift a value than to multiply by 2?
Any decent optimizing compiler will generate the same code no matter which way you write it. Use whichever form is more readable in the context in which it appears. The following program's assembler code can be viewed with a tool such as CODEVIEW on DOS/Windows or the disassembler (usually called "dis") on UNIX machines:
Example: Multiplying by 2 and shifting left by 1 are often the same.
void main()
{
unsigned int test_nbr = 300;
test_nbr *= 2;
test_nbr = 300;
test_nbr <<= 1;
}
5. What is meant by high-order and low-order bytes?
We generally write numbers from left to right, with the most significant digit first. To understand what is meant by the "significance" of a digit, think of how much happier you would be if the first digit of your paycheck was increased by one compared to the last digit being increased by one.
The bits in a byte of computer memory can be considered digits of a number written in base 2. That means the least significant bit represents one, the next bit represents 2´1, or 2, the next bit represents 2´2´1, or 4, and so on. If you consider two bytes of memory as representing a single 16-bit number, one byte will hold the least significant 8 bits, and the other will hold the most significant 8 bits. Figure shows the bits arranged into two bytes. The byte holding the least significant 8 bits is called the least significant byte, or low-order byte. The byte containing the most significant 8 bits is the most significant byte, or high- order byte.
6. How are 16- and 32-bit numbers stored?
A 16-bit number takes two bytes of storage, a most significant byte and a least significant byte. If you write the 16-bit number on paper, you would start with the most significant byte and end with the least significant byte. There is no convention for which order to store them in memory, however.
Let's call the most significant byte M and the least significant byte L. There are two possible ways to store these bytes in memory. You could store M first, followed by L, or L first, followed by M. Storing byte M first in memory is called "forward" or "big-endian" byte ordering. The term big endian comes from the fact that the "big end" of the number comes first, and it is also a reference to the book Gulliver's Travels, in which the term refers to people who eat their boiled eggs with the big end on top.
Storing byte L first is called "reverse" or "little-endian" byte ordering. Most machines store data in a big- endian format. Intel CPUs store data in a little-endian format, however, which can be confusing when someone is trying to connect an Intel microprocessor-based machine to anything else.
A 32-bit number takes four bytes of storage. Let's call them Mm, Ml, Lm, and Ll in decreasing order of significance. There are 4! (4 factorial, or 24) different ways in which these bytes can be ordered. Over the years, computer designers have used just about all 24 ways. The most popular two ways in use today, however, are (Mm, Ml, Lm, Ll), which is big-endian, and (Ll, Lm, Ml, Mm), which is little-endian. As with 16-bit numbers, most machines store 32-bit numbers in a big-endian format, but Intel machines store 32-bit numbers in a little-endian format.
1. When should I declare a function?
Functions that are used only in the current source file should be declared as static, and the function's declaration should appear in the current source file along with the definition of the function. Functions used outside of the current source file should have their declarations put in a header file, which can be included in whatever source file is going to use that function. For instance, if a function named stat_func() is used only in the source file stat.c, it should be declared as shown here:
/* stat.c */
#include <stdio.h>
static int stat_func(int, int); /* static declaration of stat_func() */
void main(void);
void main(void)
{
...
rc = stat_func(1, 2);
...
}
/* definition (body) of stat_func() */
static int stat_func(int arg1, int arg2)
{
...
return rc;
}
In this example, the function named stat_func() is never used outside of the source file stat.c. There is therefore no reason for the prototype (or declaration) of the function to be visible outside of the stat.c source file. Thus, to avoid any confusion with other functions that might have the same name, the declaration of stat_func() should be put in the same source file as the declaration of stat_func().
In the following example, the function glob_func() is declared and used in the source file global.c and is used in the source file extern.c. Because glob_func() is used outside of the source file in which it's declared, the declaration of glob_func() should be put in a header file (in this example, named proto.h) to be included in both the global.c and the extern.c source files. This is how it's done:
/* proto.h */
int glob_func(int, int); /* declaration of the glob_func() function */
/* global.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void main(void);
void main(void)
{
...
rc = glob_func(1, 2);
...
}
/* definition (body) of the glob_func() function */
int glob_func(int arg1, int arg2)
{
...
return rc;
}
/* extern.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void ext_func(void);
void ext_func(void)
{
...
/* call glob_func(), which is defined in the global.c source file */
rc = glob_func(10, 20);
...
}
In the preceding example, the declaration of glob_func() is put in the header file named proto.h becauseglob_func() is used in both the global.c and the extern.c source files. Now, whenever glob_func() is going to be used, you simply need to include the proto.h header file, and you will automatically have the function's declaration. This will help your compiler when it is checking parameters and return values from global functions you are using in your programs. Notice that your function declarations should always appear before the first function declaration in your source file.
In general, if you think your function might be of some use outside of the current source file, you should put its declaration in a header file so that other modules can access it. Otherwise, if you are sure your function will never be used outside of the current source file, you should declare the function as static and include the declaration only in the current source file.
2. Why should I prototype a function?
A function prototype tells the compiler what kind of arguments a function is looking to receive and what kind of return value a function is going to give back. This approach helps the compiler ensure that calls to a function are made correctly and that no erroneous type conversions are taking place. For instance, consider the following prototype:
int some_func(int, char*, long);
Looking at this prototype, the compiler can check all references (including the definition of some_func()) to ensure that three parameters are used (an integer, a character pointer, and then a long integer) and that a return value of type integer is received. If the compiler finds differences between the prototype and calls to the function or the definition of the function, an error or a warning can be generated to avoid errors in your source code. For instance, the following examples would be flagged as incorrect, given the preceding
prototype of some_func():
x = some_func(1); /* not enough arguments passed */
x = some_func("HELLO!", 1, "DUDE!"); /* wrong type of arguments used */
x = some_func(1, str, 2879, "T"); /* too many arguments passed */
/* In the following example, the return value expected
from some_func() is not an integer: */
long* lValue;
lValue = some_func(1, str, 2879); /* some_func() returns an int,
not a long* */
Using prototypes, the compiler can also ensure that the function definition, or body, is correct and correlates with the prototype. For instance, the following definition of some_func() is not the same as its prototype, and it therefore would be flagged by the compiler:
int some_func(char* string, long lValue, int iValue) /* wrong order of
parameters */
{
...
}
The bottom line on prototypes is that you should always include them in your source code because they provide a good error-checking mechanism to ensure that your functions are being used correctly. Besides, many of today's popular compilers give you warnings when compiling if they can't find a prototype for a function that is being referenced.
3. How many parameters should a function have?
There is no set number or "guideline" limit to the number of parameters your functions can have. However, it is considered bad programming style for your functions to contain an inordinately high (eight or more) number of parameters. The number of parameters a function has also directly affects the speed at which it is called—the more parameters, the slower the function call. Therefore, if possible, you should minimize the number of parameters you use in a function. If you are using more than four parameters, you might want to rethink your function design and calling conventions.
One technique that can be helpful if you find yourself with a large number of function parameters is to put your function parameters in a structure. Consider the following program, which contains a function namedprint_report() that uses 10 parameters. Instead of making an enormous function declaration and proto- type, theprint_report() function uses a structure to get its parameters:
#include <stdio.h>
typedef struct
{
int orientation;
char rpt_name[25];
char rpt_path[40];
int destination;
char output_file[25];
int starting_page;
int ending_page;
char db_name[25];
char db_path[40];
int draft_quality;
} RPT_PARMS;
void main(void);
int print_report(RPT_PARMS*);
void main(void)
{
RPT_PARMS rpt_parm; /* define the report parameter
structure variable */
...
/* set up the report parameter structure variable to pass to the
print_report() function */
rpt_parm.orientation = ORIENT_LANDSCAPE;
rpt_parm.rpt_name = "QSALES.RPT";
rpt_parm.rpt_path = "C:\REPORTS";
rpt_parm.destination = DEST_FILE;
rpt_parm.output_file = "QSALES.TXT";
rpt_parm.starting_page = 1;
rpt_parm.ending_page = RPT_END;
rpt_parm.db_name = "SALES.DB";
rpt_parm.db_path = "C:\DATA";
rpt_parm.draft_quality = TRUE;
/* Call the print_report() function, passing it a pointer to the
parameters instead of passing it a long list of 10 separate
parameters. */
ret_code = print_report(&rpt_parm);
...
}
int print_report(RPT_PARMS* p)
{
int rc;
...
/* access the report parameters passed to the print_report()
function */
orient_printer(p->orientation);
set_printer_quality((p->draft_quality == TRUE) ? DRAFT : NORMAL);
...
return rc;
}
The preceding example avoided a large, messy function prototype and definition by setting up a predefined structure of type RPT_PARMS to hold the 10 parameters that were needed by the print_report() function. The only possible disadvantage to this approach is that by removing the parameters from the function definition, you are bypassing the compiler's capability to type-check each of the parameters for validity during the compile stage.
Generally, you should keep your functions small and focused, with as few parameters as possible to help with execution speed. If you find yourself writing lengthy functions with many parameters, maybe you should rethink your function design or consider using the structure-passing technique presented here. Additionally, keeping your functions small and focused will help when you are trying to isolate and fix bugs in your programs.
4. What is a static function?
A static function is a function whose scope is limited to the current source file. Scope refers to the visibility of a function or variable. If the function or variable is visible outside of the current source file, it is said to have global, orexternal, scope. If the function or variable is not visible outside of the current source file, it is said to have local, orstatic, scope.
A static function therefore can be seen and used only by other functions within the current source file. When you have a function that you know will not be used outside of the current source file or if you have a function that you do not want being used outside of the current source file, you should declare it as static. Declaring local functions as static is considered good programming practice. You should use static functions often to avoid possible conflicts with external functions that might have the same name.
For instance, consider the following example program, which contains two functions. The first function,open_customer_table(), is a global function that can be called by any module. The second function,open_customer_indexes(), is a local function that will never be called by another module. This is because you can't have the customer's index files open without first having the customer table open. Here is the code:
#include <stdio.h>
int open_customer_table(void); /* global function, callable from
any module */
static int open_customer_indexes(void); /* local function, used only in
this module */
int open_customer_table(void)
{
int ret_code;
/* open the customer table */
...
if (ret_code == OK)
{
ret_code = open_customer_indexes();
}
return ret_code;
}
static int open_customer_indexes(void)
{
int ret_code;
/* open the index files used for this table */
...
return ret_code;
}
Generally, if the function you are writing will not be used outside of the current source file, you should declare it asstatic.
5. Should a function contain a return statement if it does not return a value?
In C, void functions (those that do not return a value to the calling function) are not required to include a return statement. Therefore, it is not necessary to include a return statement in your functions declared as being void.
In some cases, your function might trigger some critical error, and an immediate exit from the function might be necessary. In this case, it is perfectly acceptable to use a return statement to bypass the rest of the function's code. However, keep in mind that it is not considered good programming practice to litter your functions with return statements-generally, you should keep your function's exit point as focused and clean as possible.
6. How can you pass an array to a function by value?
An array can be passed to a function by value by declaring in the called function the array name with square brackets ([ and ]) attached to the end. When calling the function, simply pass the address of the array (that is, the array's name) to the called function. For instance, the following program passes the array x[] to the function namedbyval_func() by value:
#include <stdio.h>
void byval_func(int[]); /* the byval_func() function is passed an
integer array by value */
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call byval_func(), passing the x array by value. */
byval_func(x);
}
/* The byval_function receives an integer array by value. */
void byval_func(int i[])
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", i[y]);
}
In this example program, an integer array named x is defined and initialized with 10 values. The functionbyval_func() is declared as follows:
int byval_func(int[]);
The int[] parameter tells the compiler that the byval_func() function will take one argument—an array of integers. When the byval_func() function is called, you pass the address of the array to byval_func():byval_func(x);
Because the array is being passed by value, an exact copy of the array is made and placed on the stack. The called function then receives this copy of the array and can print it. Because the array passed to byval_func() is a copy of the original array, modifying the array within the byval_func() function has no effect on the original array.
Passing arrays of any kind to functions can be very costly in several ways. First, this approach is very inefficient because an entire copy of the array must be made and placed on the stack. This takes up valuable program time, and your program execution time is degraded. Second, because a copy of the array is made, more memory (stack) space is required. Third, copying the array requires more code generated by the compiler, so your program is larger.
Instead of passing arrays to functions by value, you should consider passing arrays to functions by reference: this means including a pointer to the original array. When you use this method, no copy of the array is made. Your programs are therefore smaller and more efficient, and they take up less stack space. To pass an array by reference, you simply declare in the called function prototype a pointer to the data type you are holding in the array.
Consider the following program, which passes the same array (x) to a function:
#include <stdio.h>
void const_func(const int*);
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call const_func(), passing the x array by reference. */
const_func(x);
}
/* The const_function receives an integer array by reference.
Notice that the pointer is declared as const, which renders
it unmodifiable by the const_func() function. */
void const_func(const int* i)
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", *(i+y));
}
In the preceding example program, an integer array named x is defined and initialized with 10 values. The function const_func() is declared as follows:
int const_func(const int*);
The const int* parameter tells the compiler that the const_func() function will take one argument—a constant pointer to an integer. When the const_func() function is called, you pass the address of the array toconst_func():
const_func(x);
Because the array is being passed by reference, no copy of the array is made and placed on the stack. The called function receives simply a constant pointer to an integer. The called function must be coded to be smart enough to know that what it is really receiving is a constant pointer to an array of integers. The const modifier is used to prevent the const_func() from accidentally modifying any elements of the original array.
The only possible drawback to this alternative method of passing arrays is that the called function must be coded correctly to access the array—it is not readily apparent by the const_func() function prototype or definition that it is being passed a reference to an array of integers. You will find, however, that this method is much quicker and more efficient, and it is recommended when speed is of utmost importance.
7. Is it possible to execute code even after the program exits the main() function?
The standard C library provides a function named atexit() that can be used to perform "cleanup" operations when your program terminates. You can set up a set of functions you want to perform automatically when your program exits by passing function pointers to the atexit() function. Here's an example of a program that uses the atexit()function:
#include <stdio.h>
#include <stdlib.h>
void close_files(void);
void print_registration_message(void);
int main(int, char**);
int main(int argc, char** argv)
{
...
atexit(print_registration_message);
atexit(close_files);
while (rec_count < max_records)
{
process_one_record();
}
exit(0);
}
This example program uses the atexit() function to signify that the close_files() function and theprint_registration_message() function need to be called automatically when the program exits. When themain() function ends, these two functions will be called to close the files and print the registration message. There are two things that should be noted regarding the atexit() function. First, the functions you specify to execute at program termination must be declared as void functions that take no parameters. Second, the functions you designate with the atexit() function are stacked in the order in which they are called with atexit(), and therefore they are executed in a last-in, first-out (LIFO) method. Keep this information in mind when using the atexit() function. In the preceding example, the atexit() function is stacked as shown here:
atexit(print_registration_message);
atexit(close_files);
Because the LIFO method is used, the close_files() function will be called first, and then theprint_registration_message() function will be called.
The atexit() function can come in handy when you want to ensure that certain functions (such as closing your program's data files) are performed before your program terminates.
8. What does a function declared as PASCAL do differently?
A C function declared as PASCAL uses a different calling convention than a "regular" C function. Normally, C function parameters are passed right to left; with the PASCAL calling convention, the parameters are passed left to right.
Consider the following function, which is declared normally in a C program:
int regular_func(int, char*, long);
Using the standard C calling convention, the parameters are pushed on the stack from right to left. This means that when the regular_func() function is called in C, the stack will contain the following parameters:
long
char*
int
The function calling regular_func() is responsible for restoring the stack when regular_func() returns.
When the PASCAL calling convention is being used, the parameters are pushed on the stack from left to right.
Consider the following function, which is declared as using the PASCAL calling convention:
int PASCAL pascal_func(int, char*, long);
When the function pascal_func() is called in C, the stack will contain the following parameters:
int
char*
long
The function being called is responsible for restoring the stack pointer. Why does this matter? Is there any benefit to using PASCAL functions?
Functions that use the PASCAL calling convention are more efficient than regular C functions—the function calls tend to be slightly faster. Microsoft Windows is an example of an operating environment that uses the PASCAL calling convention. The Windows SDK (Software Development Kit) contains hundreds of functions declared as PASCAL.
When Windows was first designed and written in the late 1980s, using the PASCAL modifier tended to make a noticeable difference in program execution speed. In today's world of fast machinery, the PASCAL modifier is much less of a catalyst when it comes to the speed of your programs. In fact, Microsoft has abandoned the PASCAL calling convention style for the Windows NT operating system.
In your world of programming, if milliseconds make a big difference in your programs, you might want to use the PASCAL modifier when declaring your functions. Most of the time, however, the difference in speed is hardly noticeable, and you would do just fine to use C's regular calling convention.
No. The exit() function is used to exit your program and return control to the operating system. The return statement is used to return from a function and return control to the calling function. If you issue a return from themain() function, you are essentially returning control to the calling function, which is the operating system. In this case, the return statement and exit() function are similar. Here is an example of a program that uses the exit()function and return statement:
#include <stdio.h>
#include <stdlib.h>
int main(int, char**);
int do_processing(void);
int do_something_daring();
int main(int argc, char** argv)
{
int ret_code;
if (argc < 3)
{
printf("Wrong number of arguments used!\n");
/* return 1 to the operating system */
exit(1);
}
ret_code = do_processing();
...
/* return 0 to the operating system */
exit(0);
}
int do_processing(void)
{
int rc;
rc = do_something_daring();
if (rc == ERROR)
{
printf("Something fishy is going on around here..."\n);
/* return rc to the operating system */
exit(rc);
}
/* return 0 to the calling function */
return 0;
}
In the main() function, the program is exited if the argument count (argc) is less than 3. The statement exit(1);tells the program to exit and return the number 1 to the operating system. The operating system can then decide what to do based on the return value of the program. For instance, many DOS batch files check the environment variable named ERRORLEVEL for the return value of executable programs.
POINTERS:
1. What is indirection?
If you declare a variable, its name is a direct reference to its value. If you have a pointer to a variable, or any other object in memory, you have an indirect reference to its value. If p is a pointer, the value of p is the address of the object. *p means "apply the indirection operator to p"; its value is the value of the object that p points to. (Some people would read it as "Go indirect on p.")
*p is an lvalue; like a variable, it can go on the left side of an assignment operator, to change the value. If p is a pointer to a constant, *p is not a modifiable lvalue; it can't go on the left side of an assignment.
Consider the following program. It shows that when p points to i, *p can appear wherever i can.
#include <stdio.h>
int main()
{
int i;
int *p;
i = 5;
p = & i; /* now *p == i */
printf("i=%d, p=%P, *p=%d\n", i, p, *p);
*p = 6; /* same as i = 6 */
printf("i=%d, p=%P, *p=%d\n", i, p, *p);
return 0;
}
After p points to i (p = &i), you can print i or *p and get the same thing. You can even assign to *p, and the result is the same as if you had assigned to i.
2. How many levels of pointers can you have?
The answer depends on what you mean by "levels of pointers." If you mean "How many levels of indirection can you have in a single declaration?" the answer is "At least 12."
int i = 0;
int *ip01 = & i;
int **ip02 = & ip01;
int ***ip03 = & ip02;
int ****ip04 = & ip03;
int *****ip05 = & ip04;
int ******ip06 = & ip05;
int *******ip07 = & ip06;
int ********ip08 = & ip07;
int *********ip09 = & ip08;
int **********ip10 = & ip09;
int ***********ip11 = & ip10;
int ************ip12 = & ip11;
************ip12 = 1; /* i = 1 */
If you mean "How many levels of pointer can you use before the program gets hard to read," that's a matter of taste, but there is a limit. Having two levels of indirection (a pointer to a pointer to something) is common. Any more than that gets a bit harder to think about easily; don't do it unless the alternative would be worse.
If you mean "How many levels of pointer indirection can you have at runtime," there's no limit. This point is particularly important for circular lists, in which each node points to the next. Your program can follow the pointers forever.
Consider the following program "A circular list that uses infinite indirection".
/* Would run forever if you didn't limit it to MAX */
#include <stdio.h>
struct circ_list
{
char value[ 3 ]; /* e.g., "st" (incl '\0') */
struct circ_list *next;
};
struct circ_list suffixes[] = {
"th", & suffixes[ 1 ], /* 0th */
"st", & suffixes[ 2 ], /* 1st */
"nd", & suffixes[ 3 ], /* 2nd */
"rd", & suffixes[ 4 ], /* 3rd */
"th", & suffixes[ 5 ], /* 4th */
"th", & suffixes[ 6 ], /* 5th */
"th", & suffixes[ 7 ], /* 6th */
"th", & suffixes[ 8 ], /* 7th */
"th", & suffixes[ 9 ], /* 8th */
"th", & suffixes[ 0 ], /* 9th */
};
#define MAX 20
int main()
{
int i = 0;
struct circ_list *p = suffixes;
while (i <= MAX)
{
printf( "%d%s\n", i, p->value );
++i;
p = p->next;
}
return 0;
}
Each element in suffixes has one suffix (two characters plus the terminating NUL character) and a pointer to the next element. next is a pointer to something that has a pointer, to something that has a pointer, ad infinitum.
The example is dumb because the number of elements in suffixes is fixed. It would be simpler to have an array of suffixes and to use the i%10'th element. In general, circular lists can grow and shrink.
3. What is a null pointer?
There are times when it's necessary to have a pointer that doesn't point to anything. The macro NULL, defined in<stddef.h>, has a value that's guaranteed to be different from any valid pointer. NULL is a literal zero, possibly cast to void* or char*. Some people, notably C++ programmers, prefer to use 0 rather than NULL.
You can't use an integer when a pointer is required. The exception is that a literal zero value can be used as the nullpointer. (It doesn't have to be a literal zero, but that's the only useful case. Any expression that can be evaluated at compile time, and that is zero, will do. It's not good enough to have an integer variable that might be zero at runtime.)
4. When is a null pointer used?
The null pointer is used in three ways:
1. To stop indirection in a recursive data structure.
2. As an error value.
3. As a sentinel value.
1. Using a Null Pointer to Stop Indirection or Recursion
Recursion is when one thing is defined in terms of itself. A recursive function calls itself. The following factorial function calls itself and therefore is considered recursive:
/* Dumb implementation; should use a loop */
unsigned factorial( unsigned i )
{
if ( i == 0 || i == 1 )
{
return 1;
}
else
{
return i * factorial( i - 1 );
}
}
A recursive data structure is defined in terms of itself. The simplest and most common case is a (singularly) linked list. Each element of the list has some value, and a pointer to the next element in the list:
struct string_list
{
char *str; /* string (in this case) */
struct string_list *next;
};
There are also doubly linked lists (which also have a pointer to the preceding element) and trees and hash tables and lots of other neat stuff. You'll find them described in any good book on data structures. You refer to a linked list with a pointer to its first element. That's where the list starts; where does it stop? This is where the null pointer comes in. In the last element in the list, the next field is set to NULL when there is no following element. To visit all the elements in a list, start at the beginning and go indirect on the next pointer as long as it's not null:
while ( p != NULL )
{
/* do something with p->str */
p = p->next;
}
Notice that this technique works even if p starts as the null pointer.
2. Using a Null Pointer As an Error Value
The second way the null pointer can be used is as an error value. Many C functions return a pointer to some object. If so, the common convention is to return a null pointer as an error code:
if ( setlocale( cat, loc_p ) == NULL )
{
/* setlocale() failed; do something */
/* ... */
}
This can be a little confusing. Functions that return pointers almost always return a valid pointer (one that doesn't compare equal to zero) on success, and a null pointer (one that compares equal to zero) pointer on failure. Other functions return an int to show success or failure; typically, zero is success and nonzero is failure. That way, a "true" return value means "do some error handling":
if ( raise( sig ) != 0 ) {
/* raise() failed; do something */
/* ... */
}
The success and failure return values make sense one way for functions that return ints, and another for functions that return pointers. Other functions might return a count on success, and either zero or some negative value on failure. As with taking medicine, you should read the instructions first.
Using a Null Pointer As a Sentinel Value
The third way a null pointer can be used is as a "sentinel" value. A sentinel value is a special value that marks the end of something. For example, in main(), argv is an array of pointers. The last element in the array (argv[argc]) is always a null pointer. That's a good way to run quickly through all the elements:
/* A simple program that prints all its arguments.
It doesn't use argc ("argument count"); instead,
it takes advantage of the fact that the last value
in argv ("argument vector") is a null pointer. */
#include <stdio.h>
#include <assert.h>
int
main( int argc, char **argv)
{
int i;
printf("program name = \"%s\"\n", argv[0]);
for (i=1; argv[i] != NULL; ++i)
printf("argv[%d] = \"%s\"\n", i, argv[i]);
assert(i == argc);
return 0;
}
5. What is a void pointer?
A void pointer is a C convention for "a raw address." The compiler has no idea what type of object a void pointer "really points to." If you write
int *ip;
ip points to an int. If you write
void *p;
p doesn't point to a void!
In C and C++, any time you need a void pointer, you can use another pointer type. For example, if you have a char*, you can pass it to a function that expects a void*. You don't even need to cast it. In C (but not in C++), you can use a void* any time you need any kind of pointer, without casting. (In C++, you need to cast it.)
6. When is a void pointer used?
A void pointer is used for working with raw memory or for passing a pointer to an unspecified type.
Some C code operates on raw memory. When C was first invented, character pointers (char *) were used for that. Then people started getting confused about when a character pointer was a string, when it was a character array, and when it was raw memory.
For example, strcpy() is used to copy data from one string to another, and strncpy() is used to copy at most a certain length string to another:
char *strcpy( char *str1, const char *str2 );
char *strncpy( char *str1, const char *str2, size_t n );
memcpy() is used to move data from one location to another:
void *memcpy( void *addr1, void *addr2, size_t n );
void pointers are used to mean that this is raw memory being copied. NUL characters (zero bytes) aren't significant, and just about anything can be copied. Consider the following code:
#include "thingie.h" /* defines struct thingie */
struct thingie *p_src, *p_dest;
/* ... */
memcpy( p_dest, p_src, sizeof( struct thingie) * numThingies );
This program is manipulating some sort of object stored in a struct thingie. p1 and p2 point to arrays, or parts of arrays, of struct thingies. The program wants to copy numThingies of these, starting at the one pointed to byp_src, to the part of the array beginning at the element pointed to by p_dest. memcpy() treats p_src and p_destas pointers to raw memory; sizeof( struct thingie) * numThingies is the number of bytes to be copied.
7. Can you subtract pointers from each other? Why would you?
If you have two pointers into the same array, you can subtract them. The answer is the number of elements between the two elements.
Consider the street address analogy presented in the introduction of this chapter. Say that I live at 118 Fifth Avenue and that my neighbor lives at 124 Fifth Avenue. The "size of a house" is two (on my side of the street, sequential even numbers are used), so my neighbor is (124-118)/2 (or 3) houses up from me. (There are two houses between us,120 and 122; my neighbor is the third.) You might do this subtraction if you're going back and forth between indices and pointers.
You might also do it if you're doing a binary search. If p points to an element that's before what you're looking for, andq points to an element that's after it, then (q-p)/2+p points to an element between p and q. If that element is before what you want, look between it and q. If it's after what you want, look between p and it.
(If it's what you're looking for, stop looking.)
You can't subtract arbitrary pointers and get meaningful answers. Someone might live at 110 Main Street, but I can't subtract 110 Main from 118 Fifth (and divide by 2) and say that he or she is four houses away!
If each block starts a new hundred, I can't even subtract 120 Fifth Avenue from 204 Fifth Avenue. They're on the same street, but in different blocks of houses (different arrays).
C won't stop you from subtracting pointers inappropriately. It won't cut you any slack, though, if you use the meaningless answer in a way that might get you into trouble.
When you subtract pointers, you get a value of some integer type. The ANSI C standard defines a typedef,ptrdiff_t, for this type. (It's in <stddef.h>.) Different compilers might use different types (int or long or whatever), but they all define ptrdiff_t appropriately.
Below is a simple program that demonstrates this point. The program has an array of structures, each 16 bytes long. The difference between array[0] and array[8] is 8 when you subtract struct stuff pointers, but 128 (hex 0x80) when you cast the pointers to raw addresses and then subtract.
If you subtract 8 from a pointer to array[8], you don't get something 8 bytes earlier; you get something 8 elements earlier.
#include <stdio.h>
#include <stddef.h>
struct stuff {
char name[16];
/* other stuff could go here, too */
};
struct stuff array[] = {
{ "The" },
{ "quick" },
{ "brown" },
{ "fox" },
{ "jumped" },
{ "over" },
{ "the" },
{ "lazy" },
{ "dog." },
{ "" }
};
int main()
{
struct stuff *p0 = & array[0];
struct stuff *p8 = & array[8];
ptrdiff_t diff = p8 - p0;
ptrdiff_t addr_diff = (char*) p8 - (char*) p0;
/* cast the struct stuff pointers to void* */
printf("& array[0] = p0 = %P\n", (void*) p0);
printf("& array[8] = p8 = %P\n", (void*) p8);
/* cast the ptrdiff_t's to long's
(which we know printf() can handle) */
printf("The difference of pointers is %ld\n",
(long) diff);
printf("The difference of addresses is %ld\n",
(long) addr_diff);
printf("p8 - 8 = %P\n", (void*) (p8 - 8));
printf("p0 + 8 = %P (same as p8)\n", (void*) (p0 + 8));
return 0;
}
8. Is NULL always defined as 0(zero)?
NULL is defined as either 0 or (void*)0. These values are almost identical; either a literal zero or a void pointer is converted automatically to any kind of pointer, as necessary, whenever a pointer is needed (although the compiler can't always tell when a pointer is needed).
9. Is NULL always equal to 0(zero)?
The answer depends on what you mean by "equal to." If you mean "compares equal to," such as
if ( /* ... */ )
{
p = NULL;
}
else
{
p = /* something else */;
}
/* ... */
if ( p == 0 )
then yes, NULL is always equal to 0. That's the whole point of the definition of a null pointer.
If you mean "is stored the same way as an integer zero," the answer is no, not necessarily. That's the most common way to store a null pointer. On some machines, a different representation is used.
The only way you're likely to tell that a null pointer isn't stored the same way as zero is by displaying a pointer in a debugger, or printing it. (If you cast a null pointer to an integer type, that might also show a nonzero value.)
10. What does it mean when a pointer is used in an if statement?
Any time a pointer is used as a condition, it means "Is this a non-null pointer?" A pointer can be used in an if, while,for, or do/while statement, or in a conditional expression. It sounds a little complicated, but it's not.
Take this simple case:
if ( p )
{
/* do something */
}
else
{
/* do something else */
}
An if statement does the "then" (first) part when its expression compares unequal to zero. That is,
if ( /* something */ )
is always exactly the same as this:
if ( /* something */ != 0 )
That means the previous simple example is the same thing as this:
if ( p != 0 )
{
/* do something (not a null pointer) */
}
else
{
/* do something else (a null pointer) */
}
This style of coding is a little obscure. It's very common in existing C code; you don't have to write code that way, but you need to recognize such code when you see it.
9. Is NULL always equal to 0(zero)?
The answer depends on what you mean by "equal to." If you mean "compares equal to," such as
if ( /* ... */ )
{
p = NULL;
}
else
{
p = /* something else */;
}
/* ... */
if ( p == 0 )
then yes, NULL is always equal to 0. That's the whole point of the definition of a null pointer.
If you mean "is stored the same way as an integer zero," the answer is no, not necessarily. That's the most common way to store a null pointer. On some machines, a different representation is used.
The only way you're likely to tell that a null pointer isn't stored the same way as zero is by displaying a pointer in a debugger, or printing it. (If you cast a null pointer to an integer type, that might also show a nonzero value.)
10. What does it mean when a pointer is used in an if statement?
Any time a pointer is used as a condition, it means "Is this a non-null pointer?" A pointer can be used in an if, while,for, or do/while statement, or in a conditional expression. It sounds a little complicated, but it's not.
Take this simple case:
if ( p )
{
/* do something */
}
else
{
/* do something else */
}
An if statement does the "then" (first) part when its expression compares unequal to zero. That is,
if ( /* something */ )
is always exactly the same as this:
if ( /* something */ != 0 )
That means the previous simple example is the same thing as this:
if ( p != 0 )
{
/* do something (not a null pointer) */
}
else
{
/* do something else (a null pointer) */
}
This style of coding is a little obscure. It's very common in existing C code; you don't have to write code that way, but you need to recognize such code when you see it.
11. Can you add pointers together? Why would you?
No, you can't add pointers together. If you live at 1332 Lakeview Drive, and your neighbor lives at 1364 Lakeview, what's 1332+1364? It's a number, but it doesn't mean anything. If you try to perform this type of calculation with pointers in a C program, your compiler will complain.
The only time the addition of pointers might come up is if you try to add a pointer and the difference of two pointers:
p = p + p2 - p1;
which is the same thing as this:
p = (p + p2) - p1.
Here's a correct way of saying this:
p = p + ( p2 - p1 );
Or even better in this case would be this example:
p += p2 - p1;
12. How do you use a pointer to a function?
The hardest part about using a pointer-to-function is declaring it. Consider an example. You want to create a pointer,pf, that points to the strcmp() function. The strcmp() function is declared in this way:
int strcmp( const char *, const char * )
To set up pf to point to the strcmp() function, you want a declaration that looks just like the strcmp() function's declaration, but that has *pf rather than strcmp:
int (*pf)( const char *, const char * );
Notice that you need to put parentheses around *pf. If you don't include parentheses, as in
int *pf( const char *, const char * ); /* wrong */
you'll get the same thing as this:
(int *) pf( const char *, const char * ); /* wrong */
That is, you'll have a declaration of a function that returns int*.
After you've gotten the declaration of pf, you can #include <string.h> and assign the address of strcmp() topf:
pf = strcmp;
or
pf = & strcmp; /* redundant & */
You don't need to go indirect on pf to call it:
if ( pf( str1, str2 ) > 0 ) /* ... */
13. When would you use a pointer to a function?
Pointers to functions are interesting when you pass them to other functions. A function that takes function pointers says, in effect, "Part of what I do can be customized. Give me a pointer to a function, and I'll call it when that part of the job needs to be done. That function can do its part for me." This is known as a "callback." It's used a lot in graphical user interface libraries, in which the style of a display is built into the library but the contents of the display are part of the application.
As a simpler example, say you have an array of character pointers (char*s), and you want to sort it by the value of the strings the character pointers point to. The standard qsort() function uses function pointers to perform that task. qsort() takes four arguments
1. a pointer to the beginning of the array,
2. the number of elements in the array,
3. the size of each array element, and
4. a comparison function.
and returns an int.
The comparison function takes two arguments, each a pointer to an element. The function returns 0 if the pointed-to elements compare equal, some negative value if the first element is less than the second, and some positive value if the first element is greater than the second. A comparison function for integers might look like this:
int icmp( const int *p1, const int *p2 )
{
return *p1 - *p2;
}
The sorting algorithm is part of qsort(). So is the exchange algorithm; it just copies bytes, possibly by callingmemcpy() or memmove(). qsort() doesn't know what it's sorting, so it can't know how to compare them. That part is provided by the function pointer.
You can't use strcmp() as the comparison function for this example, for two reasons. The first reason is thatstrcmp()'s type is wrong; more on that a little later. The second reason is that it won't work. strcmp() takes two pointers to char and treats them as the first characters of two strings. The example deals with an array of character pointers (char*s), so the comparison function must take two pointers to character pointers (char*s). In this case, the following code might be an example of a good comparison function:
int strpcmp( const void *p1, const void *p2 )
{
char * const *sp1 = (char * const *) p1;
char * const *sp2 = (char * const *) p2;
return strcmp( *sp1, *sp2 );
}
The call to qsort() might look something like this:
qsort( array, numElements, sizeof( char * ), pf2 );
qsort() will call strpcmp() every time it needs to compare two character pointers (char*s).
Why can't strcmp() be passed to qsort(), and why were the arguments of strpcmp() what they were?
A function pointer's type depends on the return type of the pointed-to function, as well as the number and types of all its arguments. qsort() expects a function that takes two constant void pointers:
void qsort( void *base,
size_t numElements,
size_t sizeOfElement,
int (*compFunct)( const void *, const void *) );
Because qsort() doesn't really know what it's sorting, it uses a void pointer in its argument (base) and in the arguments to the comparison function. qsort()'s void* argument is easy; any pointer can be converted to a void*without even needing a cast. The function pointer is harder.
For an array of character arrays, strcmp() would have the right algorithm but the wrong argument types. The simplest, safest way to handle this situation is to pass a function that takes the right argument types for qsort() and then casts them to the right argument types. That's what strpcmp() does.
If you have a function that takes a char*, and you know that a char* and a void* are the same in every environment your program might ever work in, you might cast the function pointer, rather than the pointed- to function's arguments, in this way:
char table[ NUM_ELEMENTS ][ ELEMENT_SIZE ];
/* ... */
/* passing strcmp() to qsort for array of array of char */
qsort( table, NUM_ELEMENTS, ELEMENT_SIZE,
( int (*)( const void *, const void * ) ) strcmp );
Casting the arguments and casting the function pointer both can be error prone. In practice, casting the function pointer is more dangerous.
The basic problem here is using void* when you have a pointer to an unknown type. C++ programs sometime solve this problem with templates.
14. Can the size of an array be declared at runtime?
No. In an array declaration, the size must be known at compile time. You can't specify a size that's known only at runtime. For example, if i is a variable, you can't write code like this:
char array[i]; /* not valid C */
Some languages provide this latitude. C doesn't. If it did, the stack would be more complicated, function calls would be more expensive, and programs would run a lot slower.
If you know that you have an array but you won't know until runtime how big it will be, declare a pointer to it and usemalloc() or calloc() to allocate the array from the heap.
If you know at compile time how big an array is, you can declare its size at compile time. Even if the size is some complicated expression, as long as it can be evaluated at compile time, it can be used.
/* A program that copies the argv array and all the pointed-to
strings. It also deallocates all the copies. */
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
char** new_argv;
int i;
/* Since argv[0] through argv[argc] are all valid, the
program needs to allocate room for argc+1 pointers. */
new_argv = (char**) calloc(argc+1, sizeof (char*));
/* or malloc((argc+1) * sizeof (char*)) */
printf("allocated room for %d pointers starting at %P\n", argc+1, new_argv);
/* now copy all the strings themselves
(argv[0] through argv[argc-1]) */
for (i = 0; i < argc; ++i) {
/* make room for '\0' at end, too */
new_argv[i] = (char*) malloc(strlen(argv[i]) + 1);
strcpy(new_argv[i], argv[i]);
printf("allocated %d bytes for new_argv[%d] at %P, ""copied \"%s\"\n",
strlen(argv[i]) + 1, i, new_argv[i], new_argv[i]);
}
new_argv[argc] = NULL;
/* To deallocate everything, get rid of the strings (in any
order), then the array of pointers. If you free the array
of pointers first, you lose all reference to the copied
strings. */
for (i = 0; i < argc; ++i)
{
free(new_argv[i]);
printf("freed new_argv[%d] at %P\n", i, new_argv[i]);
argv[i] = NULL;
}
free(new_argv);
printf("freed new_argv itself at %P\n", new_argv);
return 0;
}
15. Is it better to use malloc() or calloc()?
Both the malloc() and the calloc() functions are used to allocate dynamic memory. Each operates slightly different from the other. malloc() takes a size and returns a pointer to a chunk of memory at least that big:
void *malloc( size_t size );
calloc() takes a number of elements, and the size of each, and returns a pointer to a chunk of memory at least big enough to hold them all:
void *calloc( size_t numElements, size_t sizeOfElement );
There's one major difference and one minor difference between the two functions. The major difference is thatmalloc() doesn't initialize the allocated memory. The first time malloc() gives you a particular chunk of memory, the memory might be full of zeros. If memory has been allocated, freed, and reallocated, it probably has whatever junk was left in it. That means, unfortunately, that a program might run in simple cases (when memory is never reallocated) but break when used harder (and when memory is reused).
calloc() fills the allocated memory with all zero bits. That means that anything there you're going to use as a charor an int of any length, signed or unsigned, is guaranteed to be zero. Anything you're going to use as a pointer is set to all zero bits. That's usually a null pointer, but it's not guaranteed.
Anything you're going to use as a float or double is set to all zero bits; that's a floating-point zero on some types of machines, but not on all.
The minor difference between the two is that calloc() returns an array of objects; malloc() returns one object. Some people use calloc() to make clear that they want an array. Other than initialization, most C programmers don't distinguish between
calloc( numElements, sizeOfElement)
and
malloc( numElements * sizeOfElement)
There's a nit, though. malloc() doesn't give you a pointer to an array. In theory (according to the ANSI C standard), pointer arithmetic works only within a single array. In practice, if any C compiler or interpreter were to enforce that theory, lots of existing C code would break. (There wouldn't be much use for realloc(), either, which also doesn't guarantee a pointer to an array.)
Don't worry about the array-ness of calloc(). If you want initialization to zeros, use calloc(); if not, usemalloc().
16. How do you declare an array that will hold more than 64KB of data?
The coward's answer is, you can't, portably. The ANSI/ISO C standard requires compilers to handle only single objects as large as (32KB - 1) bytes long.
Why is 64KB magic? It's the biggest number that needs more than 16 bits to represent it.
For some environments, to get an array that big, you just declare it. It works, no trouble. For others, you can't declare such an array, but you can allocate one off the heap, just by calling malloc() or calloc().
On a PC compatible, the same limitations apply, and more. You need to use at least a large data model. You might also need to call "far" variants of malloc() or calloc(). For example, with Borland C and C++ compilers, you could write
far char *buffer = farmalloc(70000L);
Or with Microsoft C and C++ compilers, you could write
far char *buffer = fmalloc(70000L);
to allocate 70,000 bytes of memory into a buffer. (The L in 70000L forces a long constant. An int constant might be only 15 bits long plus a sign bit, not big enough to store the value 70,000.)
17. What is the difference between far and near ?
Compilers for PC compatibles use two types of pointers.
near pointers are 16 bits long and can address a 64KB range. far pointers are 32 bits long and can address a 1MB range.
near pointers operate within a 64KB segment. There's one segment for function addresses and one segment for data.
far pointers have a 16-bit base (the segment address) and a 16-bit offset. The base is multiplied by 16, so a far pointer is effectively 20 bits long. For example, if a far pointer had a segment of 0x7000 and an offset of 0x1224, the pointer would refer to address 0x71224. A far pointer with a segment of 0x7122 and an offset of 0x0004 would refer to the same address.
Before you compile your code, you must tell the compiler which memory model to use. If you use a small- code memory model, near pointers are used by default for function addresses. That means that all the functions need to fit in one 64KB segment. With a large-code model, the default is to use far function addresses. You'll get near pointers with a small data model, and far pointers with a large data model. These are just the defaults; you can declare variables and functions as explicitly near or far.
far pointers are a little slower. Whenever one is used, the code or data segment register needs to be swapped out.far pointers also have odd semantics for arithmetic and comparison. For example, the two far pointers in the preceding example point to the same address, but they would compare as different! If your program fits in a small-data, small-code memory model, your life will be easier. If it doesn't, there's not much you can do.
If it sounds confusing, it is. There are some additional, compiler-specific wrinkles. Check your compiler manuals for details.
18. When should a far pointer be used?
Sometimes you can get away with using a small memory model in most of a given program. There might be just a few things that don't fit in your small data and code segments.
When that happens, you can use explicit far pointers and function declarations to get at the rest of memory. A farfunction can be outside the 64KB segment most functions are shoehorned into for a small-code model. (Often, libraries are declared explicitly far, so they'll work no matter what code model the program uses.)
A far pointer can refer to information outside the 64KB data segment. Typically, such pointers are used withfarmalloc() and such, to manage a heap separate from where all the rest of the data lives.
If you use a small-data, large-code model, you should explicitly make your function pointers far.
19. What is the stack?
A "stack trace" is a list of which functions have been called, based on this information. When you start using a debugger, one of the first things you should learn is how to get a stack trace.
The stack is very inflexible about allocating memory; everything must be deallocated in exactly the reverse order it was allocated in. For implementing function calls, that is all that's needed. Allocating memory off the stack is extremely efficient. One of the reasons C compilers generate such good code is their heavy use of a simple stack.
There used to be a C function that any programmer could use for allocating memory off the stack. The memory was automatically deallocated when the calling function returned. This was a dangerous function to call; it's not available anymore.
20. What is the heap?
The heap is where malloc(), calloc(), and realloc() get memory.
Getting memory from the heap is much slower than getting it from the stack. On the other hand, the heap is much more flexible than the stack. Memory can be allocated at any time and deallocated in any order. Such memory isn't deallocated automatically; you have to call free().
Recursive data structures are almost always implemented with memory from the heap. Strings often come from there too, especially strings that could be very long at runtime.
If you can keep data in a local variable (and allocate it from the stack), your code will run faster than if you put the data on the heap. Sometimes you can use a better algorithm if you use the heap—faster, or more robust, or more flexible. It's a tradeoff.
If memory is allocated from the heap, it's available until the program ends. That's great if you remember to deallocate it when you're done. If you forget, it's a problem. A "memory leak" is some allocated memory that's no longer needed but isn't deallocated. If you have a memory leak inside a loop, you can use up all the memory on the heap and not be able to get any more. (When that happens, the allocation functions return a null pointer.) In some environments, if a program doesn't deallocate everything it allocated, memory stays unavailable even after the program ends.
Some programming languages don't make you deallocate memory from the heap. Instead, such memory is "garbage collected" automatically. This maneuver leads to some very serious performance issues. It's also a lot harder to implement. That's an issue for the people who develop compilers, not the people who buy them. (Except that software that's harder to implement often costs more.) There are some garbage collection libraries for C, but they're at the bleeding edge of the state of the art.
21. What happens if you free a pointer twice?
If you free a pointer, use it to allocate memory again, and free it again, of course it's safe
If you free a pointer, the memory you freed might be reallocated. If that happens, you might get that pointer back. In this case, freeing the pointer twice is OK, but only because you've been lucky. The following example is silly, but safe:
#include <stdlib.h>
int main(int argc, char** argv)
{
char** new_argv1;
char** new_argv2;
new_argv1 = calloc(argc+1, sizeof(char*));
free(new_argv1); /* freed once */
new_argv2 = (char**) calloc(argc+1, sizeof(char*));
if (new_argv1 == new_argv2)
{
/* new_argv1 accidentally points to freeable memory */
free(new_argv1); /* freed twice */
}
else
{
free(new_argv2);
}
new_argv1 = calloc(argc+1, sizeof(char*));
free(new_argv1); /* freed once again */
return 0;
}
In the preceding program, new_argv1 is pointed to a chunk of memory big enough to copy the argv array, which is immediately freed. Then a chunk the same size is allocated, and its address is assigned to new_argv2. Because the first chunk was available again, calloc might have returned it again; in that case, new_argv1 and new_argv2 have the same value, and it doesn't matter which variable you use. (Remember, it's the pointed- to memory that's freed, not the pointer variable.) new_argv1 is pointed to allocated memory again, which is again freed. You can free a pointer as many times as you want; it's the memory you have to be careful about.
What if you free allocated memory, don't get it allocated back to you, and then free it again? Something like this:
void caller( ... )
{
void *p;
/* ... */
callee( p );
free( p );
}
void callee( void* p )
{
/* ... */
free( p );
return;
}
In this example, the caller() function is passing p to the callee() function and then freeing p. Unfortunately,callee() is also freeing p. Thus, the memory that p points to is being freed twice. The ANSI/ ISO C standard says this is undefined. Anything can happen. Usually, something very bad happens.
The memory allocation and deallocation functions could be written to keep track of what has been used and what has been freed. Typically, they aren't. If you free() a pointer, the pointed-to memory is assumed to have been allocated bymalloc() or calloc() but not deallocated since then. free() calculates how big that chunk of memory was and updates the data structures in the memory "arena." Even if the memory has been freed already, free() will assume that it wasn't, and it will blindly update the arena. This action is much faster than it would have been if free() had checked to see whether the pointer was OK to deallocate.
If something doesn't work right, your program is now in trouble. When free() updates the arena, it will probably write some information in a wrong place. You now have the fun of dealing with a wild pointer;
22. What is the difference between NULL and NUL?
NULL is a macro defined in <stddef.h> for the null pointer.
NUL is the name of the first character in the ASCII character set. It corresponds to a zero value. There's no standard macro NUL in C, but some people like to define it.
NULL can be defined as ((void*)0), NUL as '\0'. Both can also be defined simply as 0. If they're defined that way, they can be used interchangeably. That's a bad way to write C code. One is meant to be used as a pointer; the other, as a character. If you write your code so that the difference is obvious, the next person who has to read and change your code will have an easier job. If you write obscurely, the next person might have problems.
23. What is a "null pointer assignment" error? What are bus errors, memory faults, and core dumps?
These are all serious errors, symptoms of a wild pointer or subscript.
Null pointer assignment is a message you might get when an MS-DOS program finishes executing. Some such programs can arrange for a small amount of memory to be available "where the NULL pointer points to" (so to speak). If the program tries to write to that area, it will overwrite the data put there by the compiler. When the program is done, code generated by the compiler examines that area. If that data has been changed, the compiler-generated code complains with null pointer assignment.
This message carries only enough information to get you worried. There's no way to tell, just from a null pointer assignment message, what part of your program is responsible for the error. Some debuggers, and some compilers, can give you more help in finding the problem.
Bus error: core dumped and Memory fault: core dumped are messages you might see from a program running under UNIX. They're more programmer friendly. Both mean that a pointer or an array subscript was wildly out of bounds. You can get these messages on a read or on a write. They aren't restricted to null pointer problems.
The core dumped part of the message is telling you about a file, called core, that has just been written in your current directory. This is a dump of everything on the stack and in the heap at the time the program was running. With the help of a debugger, you can use the core dump to find where the bad pointer was used.
That might not tell you why the pointer was bad, but it's a step in the right direction. If you don't have write permission in the current directory, you won't get a core file, or the core dumped message.
The same tools that help find memory allocation bugs can help find some wild pointers and subscripts, sometimes. The best such tools can find almost all occurrences of this kind of problem.
24. How can you determine the size of an allocated portion of memory?
You can't, really. free() can, but there's no way for your program to know the trick free() uses.
25. How does free() know how much memory to release?
There's no standard way. It can vary from compiler to compiler, even from version to version of the same compiler.free(), malloc(), calloc(), and realloc() are functions; as long as they all work the same way, they can work any way that works.
Most implementations take advantage of the same trick, though. When malloc() (or one of the other allocation functions) allocates a block of memory, it grabs a little more than it was asked to grab. malloc() doesn't return the address of the beginning of this block. Instead, it returns a pointer a little bit after that.
At the very beginning of the block, before the address returned, malloc() stores some information, such as how big the block is. (If this information gets overwritten, you'll have wild pointer problems when you free the memory.)
There's no guarantee free() works this way. It could use a table of allocated addresses and their lengths. It could store the data at the end of the block (beyond the length requested by the call to malloc()). It could store a pointer rather than a count.
26. Can math operations be performed on a void pointer?
No. Pointer addition and subtraction are based on advancing the pointer by a number of elements. By definition, if you have a void pointer, you don't know what it's pointing to, so you don't know the size of what it's pointing to.
If you want pointer arithmetic to work on raw addresses, use character pointers.
27. How do you print an address?
The safest way is to use printf() (or fprintf() or sprintf()) with the %P specification. That prints a voidpointer (void*). Different compilers might print a pointer with different formats. Your compiler will pick a format that's right for your environment.
If you have some other kind of pointer (not a void*) and you want to be very safe, cast the pointer to a void*:
printf( "%P\n", (void*) buffer );
There's no guarantee any integer type is big enough to store a pointer. With most compilers, an unsigned long is big enough. The second safest way to print an address (the value of a pointer) is to cast it to an unsigned long, then print that.