SH

Comprehensive Notes on Data Structures

Exam Information

  • 20% of the final exam marks will be allocated to computer science topics.
  • Marks are not evenly distributed; each topic may have 2-4 marks allocated, except for human-computer interaction, which will not be on the exam.
  • For each topic, a list of possible question types will be provided.

Data Structures

Key Concepts

  • Definition: Understanding what a data structure is and how it organizes data in a structured way. While a direct definition might not be required, understanding the concept is essential.
  • Types: Understanding different data structures like arrays, lists, queues, stacks, linked lists, hash tables, trees, and graphs.
  • Use Cases: Recognizing scenarios where each data structure is applicable and explaining why a particular data structure is chosen for a given scenario.

What is a Data Structure?

  • A storage method that stores and organizes data.
  • When faced with a problem, consider what data structure would be suitable to load and organize the data into main memory.
  • The goal is to arrange data in a way that makes it easy to access and update efficiently.
  • Efficiency and performance are key factors in choosing a data structure. Dictionaries, for example, can be significantly faster than lists in certain scenarios.
  • Data structures involve storing and organizing data and performing operations such as adding, removing, inserting, traversing, sorting, and searching.

Array

  • The first data structure learned to store a list of multiple items.
  • When creating an array, you need to specify its size either explicitly or implicitly during initialization.
  • Arrays have a fixed size.
  • Elements are accessed using unique index numbers, starting from 0 to the length of the array minus one: index = length - 1
  • Arrays provide random access, meaning you can retrieve items in any order using their index.
  • Use Cases
    • Suitable when you have a fixed size of data and require fast random access to its items.
    • Arrays are efficient but do not support adding, inserting, or removing elements once the array is defined.
    • If you need to add more elements than the array's initial size, you must create a new array with the desired size, copy the old elements and add the new one.
    • Arrays are more memory efficient than list.

List

  • Variable size, allowing you to add or remove elements without worrying about the initial size.
  • Lists provide random access using indices, similar to arrays.
  • Use Cases
    • Suitable for scenarios with a variable size of data.
    • Lists make it easy to add, insert, and remove elements due to built-in operations.
  • Downsides
    • Lists are generally less efficient than arrays.
    • When a list reaches its capacity (e.g., 2/3 full), its size is doubled, which can impact performance.
    • Shrinking of the list is also managed by the implementation.

Queue

  • Variable size data structure that grows and shrinks based on the data it stores.
  • Follows the principle of "First In, First Out" (FIFO).
  • Elements are added to the end of the queue (enqueue) and removed from the front of the queue (dequeue).
  • No insertion or random access using index.
  • Use Cases
    • Web servers use queues to manage incoming requests from users, processing them in the order they were received.
    • Customer orders are queued for inventory, checking, packing, and shipping.
    • Print jobs are typically queued for processing in the order they were submitted.
  • Only support two operations: enqueue item and dequeue item.

Stack

  • Variable-size data structure that operates on the principle of "Last In, First Out" (LIFO).
  • Think of a stack of pancakes, where the last pancake placed on the stack is the first one served.
  • Elements are added (pushed) and removed (popped) from the top of the stack.
  • No random access using index or insertion operations.
  • Implementation Details
    • The end part is blocked and it always has a pointer pointed to the top of the stack.
  • Use Cases
    • Programming languages use stacks to keep track of nested or recursive function calls (stack memory).
    • Web browsers use stacks to implement the back button functionality, keeping track of previously visited web pages.
    • Any scenario that requires backtracking can make use of a stack.
    • Maze solving algorithms often use stacks to keep track of visited paths.

Linked List

  • Variable size data structure where elements are not stored contiguously in memory.
  • Each element (node) contains data and a pointer (reference variable) to the next node in the list.
  • Nodes can be scattered in memory, and they are linked together through pointers.
  • The last node points to null to indicate the end of the list.
  • A pointer typically points to the head of the list, allowing traversal by following the next pointers.
  • Operations
    • Adding elements to the end of the list is easy, set the new node to point to null.
    • Inserting and removing elements in the middle of the list involves adjusting pointers.
  • No random access using index.
  • Elements are accessed sequentially by following the next pointers.
  • Double Linked List
    • Each node has a pointer to the next node and a pointer to the previous node.
    • The first node's previous pointer and the last node's next pointer point to null.
    • Double linked lists often have pointers to both the head and tail of the list, facilitating traversal in both directions.
  • Use Cases
    • Undo and redo functionality in software applications (e.g., Word documents, PowerPoints).
    • Music players with playlist functionality, allowing users to go forward and backward through the playlist.
    • Slide shows and browsing history.

Hash Tables (Dictionaries)

  • Also known as dictionaries (in C#).
  • Data is stored in key-value pairs, similar to a dictionary where you look up the definition (value) of a word (key).
  • Provide fast access to values using their corresponding keys which has O(1) complexity.
  • Operations are performed based on keys.
  • Hash Function
    • A function that takes input data of arbitrary size and maps it to a fixed-size value (hash code).
    • The hash code is used as an index to store the value in the hash table.
  • Use Cases
    • Dictionaries.
    • Phone books (name as key, contact details as value).
    • Online word dictionaries.
    • Blacklists (e.g., storing addresses and checking emails against them).
  • Implementation details vary between languages (e.g., Java overwrites old key-value pairs, while C# may not allow it without removal).

Tree

  • Variable-size data structure with various types.
  • The tree is probably the most useful data structure.
  • Elements can be added, inserted, and removed, depending on the type of the tree.
  • Types of Trees
    • Binary tree (each node has two children), ternary tree (three children), B+ tree, black-red tree, balanced tree, etc.
  • Use Cases
    • Family trees.
    • Hierarchical structures.
    • Computer directories (file systems).
    • Website menus and dropdowns (DOM).
      • DOM (Document Object Model) are forms of tree structure.

Graphs

  • Data structure consisting of vertices (nodes) and edges (connections).
  • Graphs can be directed or undirected, and edges can have weights.
  • Graphs may contain loops and can be synchronic or asynchronic (with or without cycles).
  • Use Cases
    • Social networks.
    • Recommendation systems.
    • Computer networks.
      * The internet is an example, using nodes and links.

Programming Language Implementations

  • Languages like C# and Java provide built-in implementations of these data structures.
  • Collections are an umbrella term for data structures that store data.
  • Examples include lists, queues, stacks, linked lists, hash sets, and dictionaries.
  • The C# Queue class supports methods like Enqueue and Dequeue for adding and removing items, respectively.

Exam Focus

  • Focus on stacks, queues, linked lists, and hash tables.
  • Understand what they are, what they do, and their use cases.
  • Be able to choose an appropriate data structure for a given scenario and explain the rationale behind the choice.
  • Know the basic concepts of stacks (LIFO), queues (FIFO), linked lists (nodes linked by pointers), and hash tables (key-value pairs).