CMPSC 311 - Intro to Systems Programming - C Fundamentals
Introduction to C
C Workflow
- Source files have the extensions `.c` and `.h`. Example: `foo.c`, `foo.h`, `bar.c`.
- Use an editor (e.g., emacs, vim) or an IDE (e.g., VS Code, Eclipse) to edit the code.
- Compilation: Source files are compiled into object files (`.o`). For example, `foo.c` compiles to `foo.o` and `bar.c` to `bar.o`.
- Linking: Object files are linked together to create an executable.
- Libraries: Libraries can be statically linked (`.a`) or shared (`.so`). For example, `libZ.a`, `libc.so`.
- Execution: The executable is run, debugged, and profiled.
Defining a Function
General structure:

    returnType name(type name, ..., type name) {
        statements;
    }

Example:

    // sum integers from 1 to max
    int sumTo(int max) {
        int i, sum = 0;
        for (i = 1; i <= max; i++) {
            sum += i;
        }
        return sum;
    }

From C to Machine Code
The C source file (`dosum.c`) is compiled using a C compiler (e.g., `gcc -S`).
The C compiler generates an assembly source file (`dosum.s`).
The assembler (`as`) transforms the assembly file into machine code (`dosum.o`).

Example:

C source file (`dosum.c`):

    int dosum(int i, int j) {
        return i + j;
    }

Assembly source file (`dosum.s`):

    dosum:
        pushl %ebp
        movl  %esp, %ebp
        movl  12(%ebp), %eax
        addl  8(%ebp), %eax
        popl  %ebp
        ret

Machine code (`dosum.o`):

    80483b0: 55 89 e5 8b 45 0c 03 45 08 5d c3

Most C compilers generate object files directly without saving the `.s` assembly file.
Object code is relocatable machine code but generally cannot be executed without further manipulation (e.g., linking).
Anatomy of a C Program
Example:

    #include <stdio.h>

    int myfunc(int i) {
        printf("Got into function with %d\n", i);
        return 0;
    }

    int main(void) {
        myfunc(10);
        return 0;
    }

All C programs start with the `main()` function.

Compilation and execution:

    gcc -g -Wall main.c -o main
    ./main

Running a Program
UNIX searches for programs in the directories listed in the `PATH` environment variable.
To run a program in the current directory, prefix it with `./`.

Example:

    mcdaniel@ubuntu:~/tmp/helloworld$ emacs helloworld.c
    mcdaniel@ubuntu:~/tmp/helloworld$ gcc helloworld.c -o helloworld
    mcdaniel@ubuntu:~/tmp/helloworld$ helloworld
    helloworld: command not found
    mcdaniel@ubuntu:~/tmp/helloworld$ echo $PATH
    /usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
    mcdaniel@ubuntu:~/tmp/helloworld$ ./helloworld
    Hello world!
    mcdaniel@ubuntu:~/tmp/helloworld$ export PATH=$PATH:/new/path

Multi-File C Programs
Example:

`dosum.c`:

    int dosum(int i, int j) {
        return i + j;
    }

`sumnum.c`:

    #include <stdio.h>

    int dosum(int i, int j);

    int main(int argc, char **argv) {
        printf("%d\n", dosum(1, 2));
        return 0;
    }

A prototype of `dosum()` in `sumnum.c` tells the compiler about the arguments and return value of `dosum()`, which is implemented in `dosum.c`.
`#include` is needed to provide function declarations (prototypes) and other definitions.
Compiling Multi-File Programs
Multiple object files are linked to produce an executable.
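Standard libraries (e.g., `libc`, `crt1`) are also linked.
A library is a pre-assembled collection of `.o` files.

Compilation and linking process:

    gcc -c dosum.c                    # compiles dosum.c to dosum.o
    gcc -c sumnum.c                   # compiles sumnum.c to sumnum.o
    gcc sumnum.o dosum.o -o sumnum    # ld (invoked by gcc) links the object files and libraries (e.g., libc) into sumnum
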
Object Files Revisited
- `.o` files (e.g., `sumnum.o`, `dosum.o`) contain machine code produced by the compiler.
- Each may contain references to external symbols (variables and functions not defined in the associated `.c` file).
- For example, `sumnum.o` contains code that relies on `printf()` and `dosum()`, which are defined in `libc.a` and `dosum.o`, respectively.
- Linking resolves these external symbols while combining object files and libraries.
Diving into C
Similarities with Java:
- Syntax for statements, control structures, function calls.
- Types: `int`, `double`, `char`, `long`, `float`.
- Type-casting syntax: `float x = (float) 5 / 3;`
- Expressions, operators, precedence: `+`, `-`, `*`, `/`, `%`, `++`, `--`, `=`, `+=`, `-=`, `*=`, `/=`, `%=`, `<`, `<=`, `==`, `!=`, `>`, `>=`, `&&`, `||`, `!`.
- Scope (local scope is within `{}` braces).
- Comments: `/* comment */` or `// comment` *to EOL*.
Primitive Types in C
Integer types: `char`, `int`
Floating-point types: `float`, `double`
Modifiers: `short [int]`, `long [int, double]`, `signed [char, int]`, `unsigned [char, int]`

Type sizes and ranges (32-bit and 64-bit systems):

| Type | Bytes (32-bit) | Bytes (64-bit) | 32-bit Range | printf |
|---|---|---|---|---|
| `char` | 1 | 1 | [0, 255] | `%c` |
| `short int` | 2 | 2 | [-32768, 32767] | `%hd` |
| `unsigned short int` | 2 | 2 | [0, 65535] | `%hu` |
| `int` | 4 | 4 | [-2147483648, 2147483647] | `%d` |
| `unsigned int` | 4 | 4 | [0, 4294967295] | `%u` |
| `long int` | 4 | 8 | [-2147483648, 2147483647] | `%ld` |
| `long long int` | 8 | 8 | [-9223372036854775808, 9223372036854775807] | `%lld` |
| `float` | 4 | 4 | approx [$10^{-38}$, $10^{38}$] | `%f` |
| `double` | 8 | 8 | approx [$10^{-308}$, $10^{308}$] | `%lf` |
| `long double` | 12 | 16 | approx [$10^{-4932}$, $10^{4932}$] | `%Lf` |
| pointer | 4 | 8 | [0, 4294967295] | `%p` |
C99 Extended Integer Types
Solves the problem of the platform-dependent `long int` size.
Defined in `<stdint.h>`.

Examples:

    #include <stdint.h>

    void foo(void) {
        int8_t  w;   // exactly 8 bits, signed
        int16_t x;   // exactly 16 bits, signed
        int32_t y;   // exactly 32 bits, signed
        int64_t z;   // exactly 64 bits, signed
        uint8_t u;   // exactly 8 bits, unsigned
        // ...etc.
    }

Similar to Java (Variables)
Variables must be declared at the start of a function or block (not required since C99).
Variables need not be initialized before use (but `gcc -Wall` will warn); always initialize your variables.

Example:

    #include <stdio.h>

    int main(void) {
        int x, y = 5;   // note x is uninitialized!
        long z = x + y;
        printf("z is '%ld'\n", z);   // what's printed?
        {
            int y = 10;
            printf("y is '%d'\n", y);
        }
        int w = 20;   // ok in c99
        printf("y is '%d', w is '%d'\n", y, w);
        return 0;
    }

Similar to Java (Const)
`const` is a qualifier that indicates the variable's value cannot change.
The compiler will issue an error if you try to violate this.
Useful for defining constants.

Example:

    #include <stdio.h>

    int main(void) {
        const double MAX_GPA = 4.0;
        printf("MAX_GPA: %g\n", MAX_GPA);
        // MAX_GPA = 5.0;   // illegal!
        return 0;
    }

Similar to Java (Loops and Conditionals)
`for` loops: cannot declare variables in the loop header (changed in C99).
`if`/`else`, `while`, and `do`/`while` loops.
No boolean type (changed in C99: `#include <stdbool.h>`).
Any scalar type can be used as a condition; 0 means false, everything else is true.

Example:

    int i;
    for (i = 0; i < 100; i++) {
        if (i % 10 == 0) {
            printf("i: %d\n", i);
        }
    }

Pointers
Key concepts:
- Taking the address of a variable: `&`
- Dereferencing a pointer: `*`
- Aliasing: `*ip` is an alias for `i`

Example:

    #include <stdio.h>

    int main(void) {
        int i = 5;
        int *ip = &i;
        printf("%d\n", i);            // output: 5
        printf("%p\n", (void *) ip);  // output: memory address of i
        *ip = 42;
        printf("%d\n", i);            // output: 42
        printf("%d\n", *ip);          // output: 42
        return 0;
    }

Pass by Value vs. Pass by Reference
C always passes arguments by value.
A copy of the value is passed to the function.
Local modifications do not affect the original value.
Pointers allow you to pass by reference.
Pass the memory location of a variable.
Example:

    #include <stdio.h>

    void add_pbv(int c) {
        c += 10;
        printf("pbv c: %d\n", c);
    }

    void add_pbr(int *c) {
        *c += 10;
        printf("pbr *c: %d\n", *c);
    }

    int main(void) {
        int x = 1;
        printf("x: %d\n", x);   // output: 1
        add_pbv(x);
        printf("x: %d\n", x);   // output: 1
        add_pbr(&x);
        printf("x: %d\n", x);   // output: 11
        return 0;
    }

Pass-by-Value Explained
Callee receives a copy of the argument.
Modifications do not affect the caller's copy.
Example:

    #include <stdio.h>

    void swap(int a, int b) {
        int tmp = a;
        a = b;
        b = tmp;
    }

    int main(void) {
        int a = 42, b = -7;
        swap(a, b);
        printf("a: %d, b: %d\n", a, b);   // Output: a: 42, b: -7
        return 0;
    }

The values of `a` and `b` in `main` are unchanged after calling `swap`.
Pass-by-Reference
Use pointers to pass by reference.
The callee receives a copy of the pointer.
The pointer's value points to the variable in the caller's scope.
Allows the callee to modify a variable in the caller's scope.
Example:

    #include <stdio.h>

    void swap(int *a, int *b) {
        int tmp = *a;
        *a = *b;
        *b = tmp;
    }

    int main(void) {
        int a = 42, b = -7;
        swap(&a, &b);
        printf("a: %d, b: %d\n", a, b);   // Output: a: -7, b: 42
        return 0;
    }

Arrays
A bare, contiguous block of memory.
An array of 6 integers requires $6 \times 4 = 24$ bytes of memory (assuming a 4-byte `int`).
Arrays have no methods and do not know their own length (no bounds checking).
C does not prevent you from overstepping the end of an array.
This is a common source of security bugs (buffer overflow).
Example:

    X[7] = 45;   // Compiles, but can cause a memory fault if X has fewer than 8 elements!

Strings
An array of `char` terminated by the null character `'\0'` (NUL, not to be confused with the `NULL` pointer).
Strings are not objects and have no methods.
`string.h` has helpful utilities.

Example:

    char *x = "hello\n";

Errors and Exceptions
- C has no exceptions (no `try`/`catch`).
- Errors are returned as integer error codes from functions.
- Error handling can be ugly and inelegant.
- Some support from the OS using signals.
- Crashes: doing something bad can spray bytes around memory, hopefully causing a