
CMPSC 311 - Arrays and Pointers Vocabulary

Pointers

  • type *name; // declare a pointer
  • type *name = address; // declare + initialize a pointer
  • A pointer is a variable that contains a memory address.
  • Source of confusion: *p vs p – the pointer variable is just p, without *.
#include <stdio.h>

int main(void) {
    int x = 42;
    int *p;  // int* p; also works
    p = &x;  // p now stores the address of x
    printf("x is %d\n", x);
    printf("&x is %p\n", (void *)&x);  // %p expects a void *
    printf("p is %p\n", (void *)p);
    return 0;
}

Stylistic Choice

  • C provides flexibility in declaring pointers.
  • Writing int* p can mislead when several pointers are declared on a single line, because the * applies only to the name that follows it.
  • Writing int *p is often preferred.
int* p1;
int *p2; // preferred

int* p1, p2;  // bug? equivalent to int *p1; int p2;

int* p1, * p2;  // correct
int *p1, *p2;   // correct, preferred (generally)

Dereferencing Pointers

  • Dereference: access the memory referred to by a pointer.
  • *pointer // dereference a pointer
  • *pointer is an alias for the variable the pointer points to.
  • *pointer = value; // dereference / assign
  • pointer_p = pointer_q; // pointer assignment: both now point to the same object
#include <stdio.h>

int main(int argc, char **argv) {
    int x = 42;
    int *p;  // p is a pointer to an integer
    p = &x;  // p now stores the address of x
    printf("x is %d\n", x);
    *p = 99;
    printf("x is %d\n", x);
    return 0;
}

Pointers as Function Arguments

  • Pointers allow C to emulate pass by reference.
  • Enables modifying out parameters and efficient passing of in parameters.
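void min_max(int array[], int len, int *min, int *max) {
    // indices of the smallest and largest values seen so far (requires len >= 1)
    int min_i = 0, max_i = 0;
    for (int i = 1; i < len; i++) {
        if (array[i] > array[max_i]) max_i = i;
        if (array[i] < array[min_i]) min_i = i;
    }
    *max = array[max_i];  // write results through the out parameters
    *min = array[min_i];
}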

int main(void) {
    int x[100];
    int largest, smallest;
    // some code that populates the array x...
    min_max(x, 100, &smallest, &largest);
    printf("smallest = %d, largest = %d\n", smallest, largest);
}

Pointers as Return Values

  • It is okay to return a passed-in pointer (or dynamically allocated memory).
  • It is NOT okay to return the address of a local variable.
int *max(int *a, int *b) {
    if (*a > *b) return a;
    return b;
}

int main(void) {
    int x = 5, y = 6;
    printf("max of %d and %d is %d\n", x, y, *max(&x, &y));
}

int *foo(void) {
    int x = 5;
    return &x;  // BUG: x ceases to exist when foo returns
}

int main(void) {
    printf("%d\n", *foo());  // undefined behavior: dereferences a dangling pointer
}

Variable Storage Classes

  • C (and other languages) have several storage classes, which determine a variable's lifetime and visibility:
    • auto: Automatically allocated and deallocated variables (local function variables, typically stored on the stack).
    • global: Variables defined outside any function; they can be accessed anywhere within the program.
      • The keyword extern is used in .c/.h files to declare a variable that is defined elsewhere.
    • static: A variable whose visibility is limited to the file that defines it (static global variable) or to the function that defines it (static local variable).
      • The keyword static marks a variable as local to its file or function.
static int localScopeVariable; // Static variable
extern int GlobalVariable;  // Global variable defined elsewhere
  • More examples: https://overiq.com/c-programming-101/local-global-and-static-variables-in-c/

Rules for Initialization

  • Static and global variables are given a default value (zero, or a null pointer for pointers), while auto storage class variables start out indeterminate.
  • Indeterminate means the standard places no requirement on the value; most compilers simply leave whatever bits were already in that memory.
  • You cannot depend on indeterminate values.
  • C89 Specification:
    • If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
    • If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.
  • C99 Specification:
    • If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
    • If an object that has static storage duration is not initialized explicitly, then:
      • if it has pointer type, it is initialized to a null pointer;
      • if it has arithmetic type, it is initialized to (positive or unsigned) zero;
      • if it is an aggregate, every member is initialized (recursively) according to these rules;
      • if it is a union, the first named member is initialized (recursively) according to these rules.

Variable Storage Class Example

  • global: Declared outside of any function (without any keyword).
    • Accessible from anywhere within the same file.
    • Accessible in other files via the extern keyword.
  • static: Declared outside of any function with the static keyword.
    • Prepending static to the definition of x in main.c (int x = 42;) limits the visibility of x to main.c; the build then fails at link time because triple.c can no longer find x.
// main.c
#include <stdio.h>

int x = 42;

int triple(void);

int main(void) {
    printf("%d\n", x);
    printf("%d\n", triple());
}

// triple.c
extern int x;

int triple(void) {
    return x * 3;
}

Variable Storage Class Example (cont.)

  • Global and static variables:
    • Initialized to a supplied or default value before program execution begins.
    • Preserve changes until the end of program execution.
  • Static variables can also appear within a function:
    • Limits the scope to the function.
    • Unlike automatic variables, preserves changes across invocations.
#include <stdio.h>

void foo(void) {
    int x = 0;
    static int y = 0;
    x += 1;
    y += 1;
    printf("x=%d y=%d\n", x, y);
}

int main(void) {
    foo();
    foo();
}

Function Storage Class Example

  • Storage classes also apply to functions.
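  • By default, functions are global: the declaration int triple(void); in main.c has an implicit extern.
  • To limit the visibility of triple to its file, prepend static to its definition in triple.c; the build then fails at link time because main.c can no longer find triple.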
  • Functions in C cannot be nested; hence, there is no static function within a function.
// main.c
#include <stdio.h>

int x = 42;

int triple(void);

int main(void) {
    printf("%d\n", x);
    printf("%d\n", triple());
}

// triple.c
extern int x;

int triple(void) {
    return x * 3;
}

Arrays

  • type name[size];
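  • Example: int scores[100]; allocates 100 integers' worth of memory.
  • Initially, each element of a local (automatic) array contains garbage data.
  • An array does not know its own size.
  • sizeof(scores) yields the array's size in bytes only where the array declaration is in scope; once the array is passed to a function, sizeof sees only a pointer.
  • The C99 standard allows the size of a local array to be a run-time expression (a variable-length array):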
    • int n = 100;
    • int scores[n]; // OK in C99
  • C Arrays are zero-indexed!

Initializing and Using Arrays

  • type name[size] = {value, value, ..., value};
    • Allocates an array and fills it with supplied values.
    • If fewer values are given than the array size, fills the rest with 0.
  • name[index] = <expression>;
    • Sets the value of an array element.
int primes[6] = {2, 3, 5, 6, 11, 13};
primes[3] = 7;
primes[100] = 0;  // smash!
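  • A compact way to initialize all values (the [first ... last] range designator is a GNU C extension supported by GCC and Clang, not standard C):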
int val4[3] = { [0 ... 2] = 1 };

Multi-Dimensional Arrays

  • type name[rows][columns] = {{values}, ..., {values}};
    • Allocates a 2D array and fills it with predefined values.
double grid[2][3]; // a 2 row, 3 column array of doubles
int matrix[3][5] = {
    {0, 1, 2, 3, 4},
    {0, 2, 4, 6, 8},
    {1, 3, 5, 7, 9}
};

grid[0][2] = (double) matrix[2][4];  // which val?

Arrays and Pointers

  • The array name can be used as a pointer to the 0th element.
  • Array names and pointers are different otherwise.
int a[6] = { [0 ... 5] = 42 };
printf("%d\n", *a);  // prints 42

*a = 10;

int a[5];
int *p;
int x = 5;

p = &x; // OK
a = &x; // WILL FAIL TO COMPILE

p++;  // OK
a++;  // WILL FAIL TO COMPILE

Pointer Arithmetic

  • We can make pointers point to array elements.
  • We can access array elements through pointers.
  • We can do arithmetic on a pointer that points to array elements:
    • Add an integer to a pointer.
    • Subtract an integer from a pointer.
    • Subtract one pointer from another.
int a[6] = { [0 ... 5] = 42 };
int *p = &a[0];
*p = 8;

Pointer Arithmetic (cont.)

  • Adding integer i to pointer p yields a pointer to the element i places after the one that p points to.
  • Using array name as a pointer: a[i] == *(a+i)
    • a[0] == *(a+0) = *a
    • a[5] == *(a+5)
  • Fun trick: a[2] == 2[a]
  • Try it: int a[5] = {0, 1, 2, 3, 4}; printf("%d %d\n", a[2], 2[a]);
int a[6] = { [0 ... 5] = 42 };
int *p = &a[2];
int *q = p + 3;
++p;

Pointer Arithmetic (cont.)

  • In pointer arithmetic, the pointer is incremented based on the size of a single array element.
  • Pointer arithmetic is defined only for pointers into an array, and the result must stay within the array or one element past its end (undefined otherwise).
  • Subtract pointers only if they point to the same array (undefined otherwise).
#include <stdio.h>

int main(void) {
    int foo[10];
    int *p = foo; // point to the 0-th element
    printf("%p\n", (void *)p);
    ++p;  // point to the next element
    printf("%p\n", (void *)p);

    char bar[10];
    char *q = bar; // point to the 0-th element
    printf("%p\n", (void *)q);
    ++q;  // point to the next element
    printf("%p\n", (void *)q);
}

Pointer Arithmetic (cont.)

#include <stdio.h>

int main(void) {
    int foo[10] = { [0 ... 9] = 42 };
    int sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;

    for (int i = 0; i < 10; ++i)
        sum1 += foo[i];

    for (int *p = &foo[0]; p < &foo[10]; ++p)
        sum2 += *p;

    for (int *p = foo; p < foo + 10; ++p)
        sum3 += *p;

    int *p = foo;
    for (int i = 0; i < 10; ++i)
        sum4 += p[i];

    printf("%d %d %d %d\n", sum1, sum2, sum3, sum4);
}

Arrays as Function Parameters

  • When passing an array as an argument to a function, the address of the 0-th element is passed by value – copied to a pointer in the callee.
    • Therefore, int foo(int a[]) is equivalent to int foo(int *a);
  • Unlike array variables, array arguments inside a function are real pointers.
    • You can increment, decrement, and assign to them.
  • Arrays are effectively passed by reference – a function doesn’t get a copy of the array.
int foo(int a[]) { ... }

int main(void) {
    int x[5];
    ...
    foo(x);
}

Passing Array Size to Function

  • Solution 1: Declare the array size in the function.
    • Problem: Code isn’t very flexible; you need different functions for different-sized arrays.
    • The array size in the function signature is ignored by the compiler.
int sumAll(int a[5]);

int main(void) {
    int numbers[5] = {3, 4, 1, 7, 4};
    int sum = sumAll(numbers);
    return 0;
}

int sumAll(int a[5]) {
    int i, sum = 0;
    for (i = 0; i < 5; i++) {
        sum += a[i];
    }
    return sum;
}

Passing Array Size to Function (cont.)

  • Solution 2: Pass the size as a parameter.
int sumAll(int a[], int size);

int main(void) {
    int numbers[5] = {3, 4, 1, 7, 4};
    int sum = sumAll(numbers, 5);
    printf("sum is: %d\n", sum);
    return 0;
}

int sumAll(int a[], int size) {
    int i, sum = 0;
    for (i = 0; i < size; i++) {
        sum += a[i];
    }
    return sum;
}

Returning an Array

  • Local variables, including arrays, are stack allocated.
    • They disappear when a function returns.
    • Therefore, local arrays can’t be safely returned from functions.
int *copyarray(int src[], int size) {
    int i, dst[size];  // OK in C99
    for (i = 0; i < size; i++) {
        dst[i] = src[i];
    }
    return dst;  // BUG: dst is a local array; the returned pointer dangles
}

Solution: An Output Parameter

  • Create the “returned” array in the caller.
  • Pass it as an output parameter to copyarray.
  • This works because arrays are effectively passed by reference.
void copyarray(int src[], int dst[], int size) {
    for (int i = 0; i < size; i++)
        dst[i] = src[i];
}

int main(void) {
    int foo[5] = { [0 ... 4] = 42 };
    int bar[5];
    copyarray(foo, bar, 5);
}

Virtual vs Physical Address

  • &foo produces the virtual address of foo
#include <stdio.h>

int foo(int x) {
    return x+1;
}

int main(void) {
    int x, y;
    int a[2];

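    printf("x  is at %p\n", (void *)&x);  // %p expects a void *
    printf("y  is at %p\n", (void *)&y);
    printf("a[0] is at %p\n", (void *)&a[0]);
    printf("a[1] is at %p\n", (void *)&a[1]);
    // converting a function pointer to void * is a common extension (POSIX), not ISO C
    printf("foo is at %p\n", (void *)&foo);
    printf("main is at %p\n", (void *)&main);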
    return 0;
}

OS and Processes (redux)

  • The OS lets you run multiple applications at once.
  • An application runs within an OS “process.”
  • The OS “timeslices” each CPU between runnable processes.
  • This happens very fast; approximately 100 times per second!

Processes and Virtual Memory

  • The OS gives each process the illusion of its own, private memory.
    • Called the process’s address space.
    • Contains the process’s virtual memory, visible only to it.
  • 32-bit pointers on 32-bit machines.
  • 64-bit pointers on 64-bit machines.

Loading

  • When the OS loads a program, it:
    • Creates an address space.
    • Inspects the executable file to see what’s in it.
    • (Lazily) copies regions of the file into the right place in the address space.
    • Does any final linking, relocation, or other needed preparation.

Something Curious

  • Let’s try running the addr program several times:
x  is at 0x7fff8ed15588
y  is at 0x7fff8ed1558c
a[0] is at 0x7fff8ed15590
a[1] is at 0x7fff8ed15594
foo is at 0x561785be9169
main is at 0x561785be917c

x  is at 0x7ffe944b1ee8
y  is at 0x7ffe944b1eec
a[0] is at 0x7ffe944b1ef0
a[1] is at 0x7ffe944b1ef4
foo is at 0x5575250db169
main is at 0x5575250db17c

ASLR

  • Linux uses address-space layout randomization for added security.
  • Linux randomizes:
    • Executable code location.
    • Base of the stack.
    • Shared library (mmap) location.
  • Makes stack-based buffer overflow attacks tougher.
  • Makes debugging tougher.
  • Google "disable Linux address space randomization".