Understanding Pointers
One of the lectures at my university is called “Introduction into structured programming”, which basically is “learning C in one and a half months”. To get our ECTS points we have to team up and submit small programs written in C that pass a given test suite, together with a design document.
Unfortunately many of the students in my group are totally new to C, and understanding pointers is tricky: first because it's not quite clear why pointers are useful at all, and second because all these stars and some weird features of the C language make it even trickier to get one's head around them.
Introduction
But pointers are pretty easy once you understand how they actually work. Basically your computer has some amount of memory available. All of that memory is divided into one-byte cells, and each of these cells has a number: the memory address.
If you have this piece of C code:
int value = 42; int *pointer = &value;
The & operator gives you the memory address of the operand (the thing on the right side of it). int * means “a pointer to an integer”. Because we assign it &value it means “a pointer to an integer with the memory address of value”. So it now points to value. In memory this could look like this (simplified):
| Memory Address | Value | Variable |
|---|---|---|
| 100 | 42 | int value |
| 104 | 100 | int *pointer |
Now as you can see from the table above, the pointer itself (the int *pointer thing) takes up memory too, and as such it also has a memory address. Now we can expand that example one more time:
int value = 42; int *pointer = &value; int **pointer_pointer = &pointer;
Which looks like this in memory:
| Memory Address | Value | Variable |
|---|---|---|
| 100 | 42 | int value |
| 104 | 100 | int *pointer |
| 108 | 104 | int **pointer_pointer |
This now is a pointer to a pointer. I'll show you later why this is useful, but note that this is pretty much the same thing. We now have a pointer to an integer pointer and the value stored in that pointer is the memory address of the integer pointer we've had before (If that sounds scary, look at the table and notice that it's not ^^).
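If you want to see this on your own machine, here is a minimal sketch of the three variables from the table. The function names `read_twice` and `show_addresses` are made up for this illustration. The addresses you get will not be 100, 104 and 108, and on a 64-bit system a pointer usually takes 8 bytes instead of 4.

```c
#include <stdio.h>

/* Hypothetical helper: follows a pointer to a pointer down to the int. */
int read_twice(int **pointer_pointer)
{
    int *pointer = *pointer_pointer; /* first hop: the address of the int */
    return *pointer;                 /* second hop: the int itself */
}

/* Hypothetical helper: prints where each variable lives and what it stores. */
void show_addresses(void)
{
    int value = 42;
    int *pointer = &value;
    int **pointer_pointer = &pointer;

    printf("value:           stored at %p, contains %d\n",
           (void *)&value, value);
    printf("pointer:         stored at %p, contains %p\n",
           (void *)&pointer, (void *)pointer);
    printf("pointer_pointer: stored at %p, contains %p\n",
           (void *)&pointer_pointer, (void *)pointer_pointer);
    printf("**pointer_pointer = %d\n", read_twice(pointer_pointer));
}
```

Call `show_addresses()` from your `main` function. The value printed for pointer should be the address printed for value, and the value printed for pointer_pointer should be the address printed for pointer, just like in the table.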
So now that we have a pointer here we probably want to do something with it. And that's why there is the * operator which is the reverse operator of &. *pointer “dereferences” the pointer and returns the value:
int integer = 42; int *pointer = &integer; printf("%d, %d\n", integer, *pointer);
This will print the number 42 twice. *pointer means “look at the value of my pointer and go to the memory address stored in there. Then return me the value stored there”. We can also use the star-operator to change the value at the address stored in our pointer:
int integer = 0; int *pointer = &integer; *pointer = 42; printf("%d, %d\n", integer, *pointer);
And fair enough, this will print 42 twice again. Now please note that int *pointer = foo; and *pointer = foo; are not the same thing. The first declares “pointer” as an integer pointer and stores the address foo in it, the latter changes the value stored at the address pointer is pointing to. int *pointer = foo; is basically a shortcut for int *pointer; pointer = foo;
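A small sketch of why writing through a pointer matters: the function `swap_ints` below (the name and the demo function are my own, not from any library) exchanges two integers that live in the caller. That only works because it receives their addresses instead of copies of their values.

```c
#include <stdio.h>

/* Hypothetical helper: exchanges the two ints that a and b point to. */
void swap_ints(int *a, int *b)
{
    int temporary = *a; /* read the value a points to */
    *a = *b;            /* overwrite it with the value b points to */
    *b = temporary;     /* store the saved value where b points */
}

void swap_demo(void)
{
    int first = 1, second = 2;

    swap_ints(&first, &second); /* pass the addresses, not the values */
    printf("%d, %d\n", first, second); /* prints "2, 1" */
}
```

If `swap_ints` took plain int parameters it would only swap its own local copies and the caller would never notice.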
Arrays
So now to what you can use pointers for. The most common use case for pointers is arrays, because arrays and pointers are closely related in C: in most expressions an array is converted to a pointer to its first element, so foo[1] is exactly the same thing as *(foo + 1). Confusing? We will resolve that shortly. Take this code as an example:
int foo[] = {4, 8, 15, 16, 23, 42}; printf("%d, %d\n", foo[1], *(foo + 1));
This example prints 8 two times. In memory it looks like this:
| Memory Address | Value |
|---|---|
| 100 | 4 |
| 104 | 8 |
| 108 | 15 |
| 112 | 16 |
| 116 | 23 |
| 120 | 42 |
Now “foo” is used here like a pointer that points to the memory address 100 in that example (the array name is converted to a pointer to its first element). *foo gives you 4 and so does foo[0]. The interesting part is getting the second item of that array. As you can guess from the table above, each integer is 4 bytes in size here (that's why the first item is at memory address 100, the second at 104 etc.). If you add a number to a pointer you are actually adding that number times the size in bytes of the type it points to, in our case four for an integer. So foo + 1 gives you 104 if “foo” was 100. foo + 2 gives you 108 and so on.
*(foo + 2) gives you 15 and so does foo[2]. And because that is really implemented like that in C you can also do 2[foo] instead of foo[2] which then translates to *(2 + foo) which obviously is the same thing as *(foo + 2).
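Here is a short sketch that puts the pointer arithmetic to work. `sum_ints` and `array_demo` are names I made up for this example. The function adds up an array using only `*(values + i)`, and the demo shows that all three spellings of the index give the same element.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: adds up count ints using only pointer arithmetic. */
int sum_ints(const int *values, size_t count)
{
    int total = 0;

    for (size_t i = 0; i < count; i++)
        total += *(values + i); /* same as values[i] */
    return total;
}

void array_demo(void)
{
    int foo[] = {4, 8, 15, 16, 23, 42};
    size_t count = sizeof foo / sizeof foo[0]; /* 6 elements */

    /* All three spellings read the third element. */
    printf("%d, %d, %d\n", foo[2], *(foo + 2), 2[foo]); /* prints "15, 15, 15" */
    printf("sum = %d\n", sum_ints(foo, count));         /* prints "sum = 108" */
}
```

Note that `sizeof foo` measures the whole array, which only works where foo is still visible as an array. Inside `sum_ints` the parameter is a plain pointer, so the number of elements has to be passed along separately.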
So yes, arrays work like pointers, and so do...
Strings
Because strings in C are nothing more than an array of chars with a closing 0 character. So, quickly, this code:
char *foo = "Hello";
Looks like this in memory:
| Memory Address | Value |
|---|---|
| 100 | 'H' |
| 101 | 'e' |
| 102 | 'l' |
| 103 | 'l' |
| 104 | 'o' |
| 105 | 0 |
Of course memory does not store letters but the ASCII code of each letter. Because a string is nothing more than a pointer to its first character, we can process it by iterating over the characters until we reach the null character. The following code, for example, iterates over the string and prints the ASCII code of each character:
char *string = "Hello"; char *p = string; while (*p) printf("%d\n", *p++);
(I created a second pointer that points to the string so that we can safely increment that one until we reach the end without losing our pointer to the start of the string. If you incremented string instead of p, string would point to the null character after the while loop.)
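The same walking-pointer trick is enough to measure a string. The sketch below uses a made-up function `count_chars`. It does roughly what the standard strlen() does, but it is only an illustration and not how any particular C library implements it.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters up to, not including, the 0 byte. */
size_t count_chars(const char *string)
{
    const char *p = string; /* walking pointer; string keeps the start */

    while (*p)
        p++;
    return (size_t)(p - string); /* distance between the two pointers */
}
```

Subtracting two pointers into the same array gives the number of elements between them, so `count_chars("Hello")` returns 5.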
Multiple Return Values
Another good example of why pointers are useful is a function that returns more than one value. Many functions in C return an integer that is 0 on success or an error number otherwise, and hand back the real values through pointers. For example we could implement a safe division of numbers like this:
int divide_number(int a, int b, int *result) { if (b == 0) return 1; *result = a / b; return 0; }
You can use that function like this:
int a = 42, b = 0, result; if (divide_number(a, b, &result)) printf("Error: division by zero\n"); else printf("The result of %d / %d is %d\n", a, b, result);
Note that we do not pass “result” but the memory address of “result” to the function so that the function can store a value in there.
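The pattern scales to as many results as you like. As a sketch, here is a variation with two output parameters. The name `divide_with_remainder` is my own invention for this example.

```c
/* Hypothetical helper: returns 0 on success, 1 if b is zero.
   On success the two results are written through the pointers;
   on failure the pointed-to variables are left untouched. */
int divide_with_remainder(int a, int b, int *quotient, int *remainder)
{
    if (b == 0)
        return 1;
    *quotient = a / b;
    *remainder = a % b;
    return 0;
}
```

Called as `divide_with_remainder(42, 5, &q, &r)` with two int variables q and r, it returns 0 and leaves 8 in q and 2 in r.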
Null Pointers
There is one special pointer value that does not point to any valid object. It's called a null pointer, and on most platforms it is represented as memory address “0”. Dereferencing it is undefined behavior; in practice the application will usually crash if you try. So don't do that ;) If you use the NULL macro you get a null pointer.
Null pointers are useful for error checking. Many functions that operate on pointers will return a null pointer on error. To see if a pointer is a null pointer, just do if (pointer == NULL) or just if (!pointer).
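A quick sketch of such a check, using the standard strchr() function from string.h, which returns a pointer to the first occurrence of a character or a null pointer if the character is not in the string. The wrapper `index_of` is a made-up name for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the position of c in string, or -1 if absent. */
long index_of(const char *string, char c)
{
    const char *found = strchr(string, c); /* NULL when c does not occur */

    if (found == NULL)
        return -1; /* never dereference or subtract a null pointer */
    return (long)(found - string);
}
```

`index_of("Hello", 'l')` gives 2 and `index_of("Hello", 'x')` gives -1.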
Dynamic Memory Management
Which ultimately brings us to dynamic memory management, where you need pointers. Imagine you don't know at compile time how much memory you will need for something. In this situation you have to allocate the memory yourself and free it afterwards. To allocate memory you can use malloc(), which takes the number of bytes it should allocate and returns the address of a block of free memory of that size. If there is not enough memory available (unlikely on modern systems, but you should still handle it) malloc() returns a null pointer. To free the memory again later you can use the free() function, which takes the pointer malloc() returned. Let's implement a simple string copy function that allocates memory itself:
char *copy_string(const char *original) { size_t length = strlen(original); char *result = malloc(sizeof(char) * (length + 1)); if (!result) return NULL; strcpy(result, original); return result; }
(You will have to include string.h and stdlib.h for the functions used in that function). Now that function looks a bit more complex than what we've done so far, but it's not that complicated. Basically that function is called with the pointer to the start of a string we want to copy. The strlen() function loops over the string and returns the number of characters before the null char (aka the length of the string). Then we allocate memory for length + 1 characters (the +1 for the null byte that terminates every C string). If the allocation failed (malloc returns NULL in that case) we return NULL from the function and stop there. Otherwise strcpy is called and copies the bytes over, including the closing null byte. Finally our newly copied string is returned.
We can then use the copy like we would use any string, but we have to make sure that the memory is released with the free() function once we are done with it:
char *original = "Hello World"; char *copy = copy_string(original); printf("%s\n", copy); free(copy);
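The same idea works for any kind of data, not just strings. As a sketch, the made-up function `make_squares` below builds an array whose size is only known at runtime.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd array holding 0*0, 1*1, 2*2, ...
   or NULL if the allocation fails. The caller must free() the result.
   Assumes count is at least 1, because malloc(0) is allowed to return NULL. */
int *make_squares(size_t count)
{
    int *squares = malloc(count * sizeof *squares);

    if (!squares)
        return NULL;
    for (size_t i = 0; i < count; i++)
        squares[i] = (int)(i * i);
    return squares;
}
```

Writing `sizeof *squares` instead of `sizeof(int)` is a design choice: the size follows the pointer's type automatically if you ever change it. Every successful call needs exactly one matching free().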
Pointers to Pointers
Lastly a quick example of why pointers to pointers are useful:
void set_my_string(char **string) { *string = copy_string("Hello"); } /* and use it like this */ char *my_string; set_my_string(&my_string); printf("%s\n", my_string); free(my_string);
This is basically the same thing as our divide function above, just that it copies a string and puts the memory address of the first copied character into the value of the pointer “my_string”. Before copying the memory looks like this:
| Memory Address | Value | Variable |
|---|---|---|
| 96 | / | &my_string |
| 100 | / | my_string |
| 101 | / | |
| 102 | / | |
| 103 | / | |
| 104 | / | |
| 105 | / | |
The slash symbolizes uninitialized values in the memory.
After copying it looks like this in memory:
| Memory Address | Value | Variable |
|---|---|---|
| 96 | 100 | &my_string |
| 100 | 'H' | my_string |
| 101 | 'e' | |
| 102 | 'l' | |
| 103 | 'l' | |
| 104 | 'o' | |
| 105 | 0 | |
&my_string returns 96 because that is the memory address of that variable in this example. The function set_my_string() then copies the string into memory addresses 100 to 105 and sets the value at memory address 96 (&my_string) to 100, which is the address of the first character of the copied string.
Hope that helps someone understand pointers in C. Have fun ;)
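Pointers to pointers also let a function return an error code and a freshly allocated string at the same time. A minimal sketch, with a made-up function `make_greeting`:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores a freshly allocated "Hello, <name>" in *out.
   Returns 0 on success, 1 if memory ran out (then *out is NULL).
   The caller must free() the string after a successful call. */
int make_greeting(const char *name, char **out)
{
    static const char prefix[] = "Hello, ";
    size_t length = strlen(prefix) + strlen(name);

    *out = malloc(length + 1); /* +1 for the closing 0 byte */
    if (*out == NULL)
        return 1;
    strcpy(*out, prefix); /* copy the prefix including its 0 byte */
    strcat(*out, name);   /* append the name after the prefix */
    return 0;
}
```

You would call it with `char *greeting; if (make_greeting("World", &greeting) == 0) { ... free(greeting); }`, passing the address of your own pointer variable so the function can fill it in.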
This reminds me why I don't do C anymore... yeah I know. That makes me a wuss, but I got spoiled easy with vm based memory management like with Python especially so no loss unless I want to make a C extension. So, one thing I would like to commend your school and prof on is using a pseudo test driven development system. Tests are very important and I have met entirely too many grads that don't understand this. Bravo.
— Jason on Saturday, November 8, 2008 18:41 #
Thanks. This is a good introduction and a helpful reference. I read all the way through it on (unofficial) Planet Python.
— Carl T. on Sunday, November 9, 2008 1:10 #
Thanks. Good explanation about a difficult topic :-)
— meisterluk on Sunday, November 9, 2008 8:01 #
Ah, finally I understand pointers! I never understood them when writing C in school. Then after working as a software engineer for 7 years I never had to work with C enough to learn them :)
— Kumar McMillan on Monday, November 10, 2008 19:37 #