CMPSC 311 - Intro to Systems Programming - C Fundamentals

Introduction to C

C Workflow

  • Source files have extensions .c and .h. Example: foo.c, foo.h, bar.c.
  • Use an editor (e.g., emacs, vim) or an IDE (e.g., VS Code, Eclipse) to edit the code.
  • Compilation: Source files are compiled into object files (.o). For example, foo.c compiles to foo.o and bar.c to bar.o.
  • Linking: Object files are linked together to create an executable.
  • Libraries: Libraries can be statically linked (.a) or shared (.so). For example, libZ.a, libc.so.
  • Execution: The executable is run, debugged, and profiled.

Defining a Function

  • General structure:

    returnType name(type name, ..., type name) {
        statements;
    }
    
  • Example:

    // sum integers from 1 to max
    int sumTo(int max) {
        int i, sum = 0;
        for (i = 1; i <= max; i++) {
            sum += i;
        }
        return sum;
    }
    

From C to Machine Code

  • C source file (dosum.c) is compiled using a C compiler (e.g., gcc -S).

  • C compiler generates assembly source file (dosum.s).

  • Assembler (as) transforms the assembly file into machine code (dosum.o).

  • Example:

    • C source file (dosum.c):

      int dosum(int i, int j) {
          return i + j;
      }
      
    • Assembly source file (dosum.s):

      dosum:
      pushl  %ebp
      movl  %esp, %ebp
      movl  12(%ebp), %eax
      addl  8(%ebp), %eax
      popl  %ebp
      ret
      
    • Machine code (dosum.o):

      80483b0: 55 89 e5 8b  45 0c 03 45  08 5d c3
      
  • Most C compilers directly generate object files without saving the .s assembly file.

  • Object code is relocatable machine code, but it generally cannot be executed until it has been processed further (e.g., by linking).

Anatomy of a C Program

  • Example:

    #include <stdio.h>
    
    int myfunc(int i) {
        printf("Got into function with %d\n", i);
        return 0;
    }
    
    int main(void) {
        myfunc(10);
        return 0;
    }
    
  • Execution of every C program begins at the main() function.

  • Compilation and execution:

    gcc -g -Wall main.c -o main
    ./main
    

Running a Program

  • UNIX searches for programs in directories listed in the PATH environment variable.

  • To run a program in the current directory, prefix it with ./.

  • Example:

    mcdaniel@ubuntu:~/tmp/helloworld$ emacs helloworld.c
    mcdaniel@ubuntu:~/tmp/helloworld$ gcc helloworld.c -o helloworld
    mcdaniel@ubuntu:~/tmp/helloworld$ helloworld
    helloworld: command not found
    mcdaniel@ubuntu:~/tmp/helloworld$ echo $PATH
    /usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
    mcdaniel@ubuntu:~/tmp/helloworld$ ./helloworld
    Hello world!
    mcdaniel@ubuntu:~/tmp/helloworld$ export PATH=$PATH:/new/path
    

Multi-File C Programs

  • Example:

    • dosum.c:

      int dosum(int i, int j) {
          return i + j;
      }
      
    • sumnum.c:

      #include <stdio.h>
      
      int dosum(int i, int j);
      
      int main(int argc, char **argv) {
          printf("%d\n", dosum(1, 2));
          return 0;
      }
      
  • A prototype of dosum() in sumnum.c tells the compiler about the arguments and return value of dosum(), which is implemented in dosum.c.

  • #include is needed to provide function declarations (prototypes) and other definitions.

Compiling Multi-File Programs

  • Multiple object files are linked to produce an executable.

  • Standard libraries (e.g., libc, crt1) are also linked.

  • A library is a pre-assembled collection of .o files.

  • Compilation and linking process:

    gcc -c dosum.c
    gcc -c sumnum.c
    gcc -o sumnum sumnum.o dosum.o    # gcc invokes the linker (ld) and links libc by default
    

Object Files Revisited

  • .o files (e.g., sumnum.o, dosum.o) contain machine code produced by the compiler.
  • Each may contain references to external symbols (variables and functions not defined in the associated .c file).
  • For example, sumnum.o contains code that relies on printf() and dosum(), which are defined in libc.a and dosum.o, respectively.
  • Linking resolves these external symbols while combining object files and libraries.

Diving into C

  • Similarities with Java:

    • Syntax for statements, control structures, function calls.
    • Types: int, double, char, long, float.
    • Type-casting syntax: float x = (float) 5 / 3;
    • Expressions, operators, precedence: +, -, *, /, %, ++, --, =, +=, -=, *=, /=, %=, <, <=, ==, !=, >, >=, &&, ||, !.
    • Scope (local scope is within {} braces).
    • Comments: /* comment */ or // comment *to EOL*.

Primitive Types in C

  • Integer types: char, int

  • Floating-point types: float, double

  • Modifiers: short [int], long [int, double], signed [char, int], unsigned [char, int]

  • Type sizes and ranges (32-bit and 64-bit systems):

    Type                Bytes (32-bit)  Bytes (64-bit)  32-bit Range                                 printf
    char                1               1               [0, 255]                                     %c
    short int           2               2               [-32768, 32767]                              %hd
    unsigned short int  2               2               [0, 65535]                                   %hu
    int                 4               4               [-2147483648, 2147483647]                    %d
    unsigned int        4               4               [0, 4294967295]                              %u
    long int            4               8               [-2147483648, 2147483647]                    %ld
    long long int       8               8               [-9223372036854775808, 9223372036854775807]  %lld
    float               4               4               approx. [10^-38, 10^38]                      %f
    double              8               8               approx. [10^-308, 10^308]                    %lf
    long double         12              16              approx. [10^-4932, 10^4932]                  %Lf
    pointer             4               8               [0, 4294967295]                              %p

C99 Extended Integer Types

  • Solves the problem of platform-dependent long int size.

  • Defined in <stdint.h>.

  • Examples:

    #include <stdint.h>
    
    void foo(void) {
        int8_t  w;  // exactly 8 bits, signed
        int16_t x;  // exactly 16 bits, signed
        int32_t y;  // exactly 32 bits, signed
        int64_t z;  // exactly 64 bits, signed
        uint8_t uw; // exactly 8 bits, unsigned
        // ...etc.
    }
    

Similar to Java (Variables)

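  • Before C99, variables had to be declared at the start of a function or block; C99 lifted this restriction.

  • The compiler does not require variables to be initialized before use, but reading an uninitialized variable yields an indeterminate value (gcc -Wall will warn); always initialize your variables.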

  • Example:

    #include <stdio.h>
    
    int main(void) {
        int x, y = 5;  // note x is uninitialized!
        long z = x + y;
        printf("z is '%ld'\n", z); // what’s printed?
        {
            int y = 10;
            printf("y is '%d'\n", y);
        }
        int w = 20;  // ok in c99
        printf("y is '%d', w is '%d'\n", y, w);
        return 0;
    }
    

Similar to Java (Const)

  • const is a qualifier that indicates the variable’s value cannot change.

  • The compiler will issue an error if you try to violate this.

  • Useful for defining constants.

  • Example:

    #include <stdio.h>
    
    int main(void) {
        const double MAX_GPA = 4.0;
        printf("MAX_GPA: %g\n", MAX_GPA);
        // MAX_GPA = 5.0;  // illegal!
        return 0;
    }
    

Similar to Java (Loops and Conditionals)

  • for loops: before C99, variables could not be declared in the loop header; C99 allows it.

  • if/else, while, and do while work as in Java.

  • No built-in boolean type before C99; C99 adds bool, true, and false via #include <stdbool.h>.

  • Any scalar value can be used as a condition; 0 means false, everything else is true.

  • Example:

    int i;
    for (i = 0; i < 100; i++) {
        if (i % 10 == 0) {
            printf("i: %d\n", i);
        }
    }
    

Pointers

  • Key concepts:

    • Taking the address of a variable: &
    • Dereferencing a pointer: *
    • Aliasing: *ip is an alias for i
  • Example:

    #include <stdio.h>
    
    int main(void) {
        int i = 5;
        int *ip = &i;
        printf("%d\n", i);  // output: 5
        printf("%p\n", (void *)ip); // output: memory address of i (%p expects a void *)
        *ip = 42;
        printf("%d\n", i);  // output: 42
        printf("%d\n", *ip); // output: 42
        return 0;
    }
    

Pass by Value vs. Pass by Reference

  • C always passes arguments by value.

  • A copy of the value is passed to the function.

  • Local modifications do not affect the original value.

  • Pointers allow you to pass by reference.

  • Pass the memory location of a variable.

  • Example:

    #include <stdio.h>
    
    void add_pbv(int c) {
        c += 10;
        printf("pbv c: %d\n", c);
    }
    
    void add_pbr(int *c) {
        *c += 10;
        printf("pbr *c: %d\n", *c);
    }
    
    int main(void) {
        int x = 1;
        printf("x: %d\n", x);     // output: 1
        add_pbv(x);
        printf("x: %d\n", x);     // output: 1
        add_pbr(&x);
        printf("x: %d\n", x);     // output: 11
        return 0;
    }
    

Pass-by-Value Explained

  • Callee receives a copy of the argument.

  • Modifications do not affect the caller's copy.

  • Example:

    #include <stdio.h>
    
    void swap(int a, int b) {
        int tmp = a;
        a = b;
        b = tmp;
    }
    
    int main(void) {
        int a = 42, b = -7;
        swap(a, b);
        printf("a: %d, b: %d\n", a, b); // Output: a: 42, b: -7
        return 0;
    }
    

    The values of a and b in main are unchanged after calling swap.

Pass-by-Reference

  • Use pointers to pass by reference.

  • The callee receives a copy of the pointer.

  • The pointer's value points to the variable in the caller's scope.

  • Allows the callee to modify a variable in the caller's scope.

  • Example:

    #include <stdio.h>
    
    void swap(int *a, int *b) {
        int tmp = *a;
        *a = *b;
        *b = tmp;
    }
    
    int main(void) {
        int a = 42, b = -7;
        swap(&a, &b);
        printf("a: %d, b: %d\n", a, b); // Output: a: -7, b: 42
        return 0;
    }
    

Arrays

  • A bare, contiguous block of memory.

  • An array of 6 integers requires 6 × 4 = 24 bytes of memory (assuming a 4-byte int).

  • Arrays have no methods and do not know their own length (no bounds checking).

  • C does not prevent you from overstepping the end of an array.

  • This is a common source of security bugs (buffer overflow).

  • Example:

    X[7] = 45; // Compiles, but is undefined behavior (possibly a memory fault) if X has fewer than 8 elements!
    

Strings

  • An array of char terminated by the null character '\0' (NUL, not to be confused with the NULL pointer).

  • Strings are not objects and have no methods.

  • string.h has helpful utilities.

  • Example:

    const char *x = "hello\n"; // string literals must not be modified, hence const
    

Errors and Exceptions

  • C has no exceptions (no try/catch).
  • Errors are returned as integer error codes from functions.
  • Error handling can be ugly and inelegant.
  • Some support from the OS using signals.
  • Crashes: doing something bad can spray bytes around memory, hopefully causing a