Pointers
type *name; // declare a pointer
type *name = address; // declare + initialize a pointer
- A pointer is a variable that contains a memory address.
- Source of confusion: *p vs p. The pointer variable is just p, without the *.
#include <stdio.h>
int main(void) {
    int x = 42;
    int *p; // int* p; also works
    p = &x; // p now stores the address of x
    printf("x is %d\n", x);
    printf("&x is %p\n", (void *)&x); // %p expects a void *
    printf("p is %p\n", (void *)p); // prints the same address as &x
    return 0;
}
Stylistic Choice
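- C provides flexibility in declaring pointers: the * may sit next to the type or next to the name.
- Writing int* p suggests that the * belongs to the type, which leads to visual confusion when declaring multiple pointers on a single line.
- Writing int *p matches how C parses the declaration (the * binds to the name) and is often preferred.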
int* p1;
int *p2; // preferred
int* p1, p2; // bug? equivalent to int *p1; int p2;
int* p1, * p2; // correct
int *p1, *p2; // correct, preferred (generally)
Dereferencing Pointers
- Dereference: access the memory referred to by a pointer.
*pointer // dereference a pointer
- *pointer is an alias for the variable the pointer points to.
*pointer = value; // dereference / assign
pointer_p = pointer_q; // pointer assignment
#include <stdio.h>
int main(int argc, char **argv) {
    int x = 42;
    int *p; // p is a pointer to an integer
    p = &x; // p now stores the address of x
    printf("x is %d\n", x); // prints "x is 42"
    *p = 99; // writes to x through p
    printf("x is %d\n", x); // prints "x is 99"
    return 0;
}
Pointers as Function Arguments
- Pointers allow C to emulate pass by reference.
- Enables modifying out parameters and efficient passing of in parameters.
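#include <stdio.h>
void min_max(int array[], int len, int *min, int *max) {
    int min_i = 0, max_i = 0; // indices of the smallest and largest values (assumes len > 0)
    for (int i = 1; i < len; i++) {
        if (array[i] > array[max_i]) max_i = i; // index of the largest value so far
        if (array[i] < array[min_i]) min_i = i; // index of the smallest value so far
    }
    *max = array[max_i];
    *min = array[min_i];
}
int main(void) {
    int x[100];
    int largest, smallest;
    for (int i = 0; i < 100; i++)
        x[i] = (i * 7) % 50; // placeholder data; any code that populates x works here
    min_max(x, 100, &smallest, &largest);
    printf("smallest = %d, largest = %d\n", smallest, largest);
}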
Pointers as Return Values
- It is okay to return a passed-in pointer (or dynamically allocated memory).
- It is NOT okay to return the address of a local variable.
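// OK: returns one of the pointers that was passed in
int *max(int *a, int *b) {
    if (*a > *b) return a;
    return b;
}
int main(void) {
    int x = 5, y = 6;
    printf("max of %d and %d is %d\n", x, y, *max(&x, &y));
}
// NOT OK: x stops existing when foo returns, so the returned pointer dangles
int *foo(void) {
    int x = 5;
    return &x;
}
int main(void) {
    printf("%d\n", *foo()); // undefined behavior
}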
Variable Storage Classes
- C (like other languages) has several storage classes, which determine a variable's scope and lifetime:
  - auto: Automatically allocated and deallocated variables (local function variables declared on the stack).
  - global: Globally defined variables that can be accessed anywhere within the program.
    - The keyword extern is used in .c/.h files to indicate a variable defined elsewhere.
  - static: A variable that is visible only within its own file (static global variable) or only within a function (static local variable).
    - The keyword static is used to identify a variable as local only.
static int localScopeVariable; // Static variable
extern int GlobalVariable; // Global variable defined elsewhere
- More examples: https://overiq.com/c-programming-101/local-global-and-static-variables-in-c/
Rules for Initialization
- In general, static or global variables are given a default value (zero, or a null pointer for pointer types), and auto storage class variables are indeterminate.
  - Indeterminate means that the compiler can do anything it wants, which for most compilers is to take whatever value is in memory.
- You cannot depend on indeterminate values.
- C89 Specification:
  - If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
  - If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.
- C99 Specification:
  - If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
  - If an object that has static storage duration is not initialized explicitly, then:
    - if it has pointer type, it is initialized to a null pointer;
    - if it has arithmetic type, it is initialized to (positive or unsigned) zero;
    - if it is an aggregate, every member is initialized (recursively) according to these rules;
    - if it is a union, the first named member is initialized (recursively) according to these rules.
Variable Storage Class Example
- global: Declared outside of any function (without any keyword).
  - Accessible from anywhere within the same file.
  - Accessible in other files via the extern keyword.
- static: Declared outside of any function with the static keyword.
  - Prepending static to the definition of x in main.c (int x = 42;) limits the scope of x to main.c; the build then fails at link time because triple.c can no longer resolve x.
// main.c
#include <stdio.h>
int x = 42;
int triple(void);
int main(void) {
    printf("%d\n", x);
    printf("%d\n", triple());
}
// triple.c
extern int x;
int triple(void) {
    return x * 3;
}
Variable Storage Class Example (cont.)
- Global and static variables:
- Initialized to a supplied or default value before program execution begins.
- Preserve changes until the end of program execution.
- Static variables can also appear within a function:
- Limits the scope to the function.
- Unlike automatic variables, preserves changes across invocations.
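#include <stdio.h>
void foo(void) {
    int x = 0; // re-created and reset to 0 on every call
    static int y = 0; // initialized once; keeps its value between calls
    x += 1;
    y += 1;
    printf("x=%d y=%d\n", x, y);
}
int main(void) {
    foo(); // prints "x=1 y=1"
    foo(); // prints "x=1 y=2"
}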
Function Storage Class Example
- Storage classes also apply to functions.
- By default, functions are global: the declaration int triple(void); in main.c has an implicit extern.
- To limit the visibility of triple to a file, prepend static to its definition in triple.c; the build then fails at link time because main.c can no longer resolve triple.
- Functions in C cannot be nested; hence, there is no static function within a function.
// main.c
#include <stdio.h>
int x = 42;
int triple(void);
int main(void) {
    printf("%d\n", x);
    printf("%d\n", triple());
}
// triple.c
extern int x;
int triple(void) {
    return x * 3;
}
Arrays
type name[size];
- Example: int scores[100]; allocates 100 integers worth of memory.
  - Initially, each element of a local array contains garbage data.
- An array does not know its own size.
  - sizeof(scores) gives the array's size in bytes only where scores is declared as an array; inside a function that receives it as a parameter, it gives the size of a pointer.
- The C99 standard allows the array size to be a run-time expression (a variable-length array):
int n = 100;
int scores[n]; // OK in C99
- C arrays are zero-indexed!
Initializing and Using Arrays
type name[size] = {value, value, ..., value};
- Allocates an array and fills it with supplied values.
- If fewer values are given than the array size, fills the rest with 0.
name[index] = <expression>;
- Sets the value of an array element.
int primes[6] = {2, 3, 5, 6, 11, 13};
primes[3] = 7;
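primes[100] = 0; // smash! out-of-bounds write, undefined behavior
- A concise way to init all values to the same constant (range designators are a GNU extension supported by GCC and Clang, not standard C):
int val4[3] = { [0 ... 2] = 1 };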
Multi-Dimensional Arrays
type name[rows][columns] = {{values}, ..., {values}};
- Allocates a 2D array and fills it with predefined values.
double grid[2][3]; // a 2 row, 3 column array of doubles
int matrix[3][5] = {
    {0, 1, 2, 3, 4},
    {0, 2, 4, 6, 8},
    {1, 3, 5, 7, 9}
};
grid[0][2] = (double) matrix[2][4]; // which val?
Arrays and Pointers
- The array name can be used as a pointer to the 0th element.
- Array names and pointers are different otherwise.
int a[6] = { [0 ... 5] = 42 };
printf("%d\n", *a); // prints 42
*a = 10;
int a[5];
int *p;
int x = 5;
p = &x; // OK
a = &x; // WILL FAIL TO COMPILE
p++; // OK
a++; // WILL FAIL TO COMPILE
Pointer Arithmetic
- We can make pointers point to array elements.
- We can access array elements through pointers.
- We can do arithmetic on a pointer that points to array elements:
- Add an integer to a pointer.
- Subtract an integer from a pointer.
- Subtract one pointer from another.
int a[6] = { [0 ... 5] = 42 };
int *p = &a[0];
*p = 8;
Pointer Arithmetic (cont.)
- Adding integer i to pointer p yields a pointer to the element i places after the one that p points to.
- Using array name as a pointer:
a[i] == *(a+i)
a[0] == *(a+0) == *a
a[5] == *(a+5)
- Fun trick: a[2] == 2[a]
  - Try it:
int a[5] = {0, 1, 2, 3, 4};
printf("%d %d\n", a[2], 2[a]);
int a[6] = { [0 ... 5] = 42 };
int *p = &a[2];
int *q = p + 3;
++p;
Pointer Arithmetic (cont.)
- In pointer arithmetic, the pointer is incremented based on the size of a single array element.
- Pointer arithmetic is applicable only to pointers that point to arrays (undefined otherwise).
- Subtract pointers only if they point to the same array (undefined otherwise).
#include <stdio.h>
int main(void) {
    int foo[10];
    int *p = foo; // point to the 0-th element
    printf("%p\n", (void *)p);
    ++p; // point to the next element (sizeof(int) bytes later)
    printf("%p\n", (void *)p);
    char bar[10];
    char *q = bar; // point to the 0-th element
    printf("%p\n", (void *)q);
    ++q; // point to the next element (1 byte later)
    printf("%p\n", (void *)q);
}
Pointer Arithmetic (cont.)
#include <stdio.h>
int main(void) {
    int foo[10] = { [0 ... 9] = 42 };
    int sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;
    for (int i = 0; i < 10; ++i)
        sum1 += foo[i];
    for (int *p = &foo[0]; p < &foo[10]; ++p)
        sum2 += *p;
    for (int *p = foo; p < foo + 10; ++p)
        sum3 += *p;
    int *p = foo;
    for (int i = 0; i < 10; ++i)
        sum4 += p[i];
    printf("%d %d %d %d\n", sum1, sum2, sum3, sum4);
}
Arrays as Function Parameters
- When passing an array as an argument to a function, the address of the 0-th element is passed by value – copied to a pointer in the callee.
- Therefore, int foo(int a[]); is equivalent to int foo(int *a);
- Unlike array variables, array arguments inside a function are real pointers.
- You can increment, decrement, and assign to them.
- Arrays are effectively passed by reference – a function doesn’t get a copy of the array.
int foo(int a[]) { ... }
int main(void) {
    int x[5];
    ...
    foo(x);
}
Passing Array Size to Function
- Solution 1: Declare the array size in the function.
- Problem: Code isn’t very flexible; you need different functions for different-sized arrays.
- The array size in the function signature is ignored by the compiler.
int sumAll(int a[5]);
int main(void) {
    int numbers[5] = {3, 4, 1, 7, 4};
    int sum = sumAll(numbers);
    return 0;
}
int sumAll(int a[5]) {
    int i, sum = 0;
    for (i = 0; i < 5; i++) {
        sum += a[i];
    }
    return sum;
}
Passing Array Size to Function (cont.)
- Solution 2: Pass the size as a parameter.
#include <stdio.h>
int sumAll(int a[], int size);
int main(void) {
    int numbers[5] = {3, 4, 1, 7, 4};
    int sum = sumAll(numbers, 5);
    printf("sum is: %d\n", sum);
    return 0;
}
int sumAll(int a[], int size) {
    int i, sum = 0;
    for (i = 0; i < size; i++) { // i < size: a[size] is past the end
        sum += a[i];
    }
    return sum;
}
Returning an Array
- Local variables, including arrays, are stack allocated.
- They disappear when a function returns.
- Therefore, local arrays can’t be safely returned from functions.
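// BROKEN: dst is a local array, so the returned pointer dangles
int *copyarray(int src[], int size) {
    int i, dst[size]; // OK in C99
    for (i = 0; i < size; i++) {
        dst[i] = src[i];
    }
    return dst; // dst no longer exists once the function returns
}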
Solution: An Output Parameter
- Create the “returned” array in the caller.
- Pass it as an output parameter to copyarray.
- This works because arrays are effectively passed by reference.
void copyarray(int src[], int dst[], int size) {
    for (int i = 0; i < size; i++)
        dst[i] = src[i];
}
int main(void) {
    int foo[5] = { [0 ... 4] = 42 };
    int bar[5];
    copyarray(foo, bar, 5);
}
Virtual vs Physical Address
- &foo produces the virtual address of foo.
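#include <stdio.h>
int foo(int x) {
    return x+1;
}
int main(void) {
    int x, y;
    int a[2];
    printf("x is at %p\n", (void *)&x); // %p expects a void *
    printf("y is at %p\n", (void *)&y);
    printf("a[0] is at %p\n", (void *)&a[0]);
    printf("a[1] is at %p\n", (void *)&a[1]);
    // converting a function pointer to void * is supported on POSIX systems, not guaranteed by ISO C
    printf("foo is at %p\n", (void *)&foo);
    printf("main is at %p\n", (void *)&main);
    return 0;
}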
OS and Processes (redux)
- The OS lets you run multiple applications at once.
- An application runs within an OS “process.”
- The OS “timeslices” each CPU between runnable processes.
- This happens very fast; approximately 100 times per second!
Processes and Virtual Memory
- The OS gives each process the illusion of its own, private memory.
- Called the process’s address space.
- Contains the process’s virtual memory, visible only to it.
- 32-bit pointers on 32-bit machines.
- 64-bit pointers on 64-bit machines.
Loading
- When the OS loads a program, it:
- Creates an address space.
- Inspects the executable file to see what’s in it.
- (Lazily) copies regions of the file into the right place in the address space.
- Does any final linking, relocation, or other needed preparation.
Something Curious
- Let’s try running the addr program several times:
x is at 0x7fff8ed15588
y is at 0x7fff8ed1558c
a[0] is at 0x7fff8ed15590
a[1] is at 0x7fff8ed15594
foo is at 0x561785be9169
main is at 0x561785be917c
x is at 0x7ffe944b1ee8
y is at 0x7ffe944b1eec
a[0] is at 0x7ffe944b1ef0
a[1] is at 0x7ffe944b1ef4
foo is at 0x5575250db169
main is at 0x5575250db17c
ASLR
- Linux uses address-space layout randomization for added security.
- Linux randomizes:
- Executable code location.
- Base of the stack.
- Shared library (mmap) location.
- Makes stack-based buffer overflow attacks tougher.
- Makes debugging tougher.
- Google "disable Linux address space randomization".