The Science of Computing II - High-Level Data Structures

The Science of Computing II

Living with Cyber High-Level Data Structures

Pillar: Data Structures

This lesson focuses on data structures, which are the various ways that data can be organized within a computer program. The lesson introduces four common high-level data structures: lists, stacks, queues, and trees. These structures provide methods to efficiently organize data for problem-solving.

Types of Data Structures
  1. Linear Data Structures

    • Lists

    • Stacks

    • Queues

  2. Non-Linear Data Structures

    • Trees

    The first three structures (lists, stacks, and queues) are linear, meaning their items exist sequentially. The tree structure is non-sequential, as its contents cannot be represented by a simple sequential listing.

Lists

When designing algorithms, we frequently need to store and manipulate data, often in list form.

  • Lists have been previously observed in lessons regarding searching and sorting algorithms.

  • They are used in programming languages like Scratch and Python.

  • A list is characterized by:

    • First item

    • Last item

    • Middle items

Accessing Elements

Accessing individual items in a list can be achieved using its position or index. Lists typically have two implementations: array-based lists and linked lists.

Array-Based Lists

Arrays can be compared to a numbered list (e.g., grocery list). They store multiple instances of the same data type (all numbers, all letters).

  • Elements: Items contained in an array.

  • Order: The position of an element matters for accessing it appropriately.

  • Address: Refers to the actual memory location of an element.

  • Index: The relative distance from the first element to that element.

Memory Allocation and Weakness of Arrays
  • Arrays must be declared with their capacity, allocating contiguous memory.

  • Overestimating capacity leads to wasted memory, and underestimating can necessitate costly array duplication.

Linked Lists

Unlike arrays, linked lists can adaptively grow or shrink.

  • Advantages: Memory usage suits the current needs.

  • Disadvantages: Requires more memory for linking elements.

Structure of Linked Lists
  • Node: Contains data and a link component to the next node.

  • The head is the first node, while the tail is the last node, whose link equals null/None or zero.

  • The link stores the memory address of the next node.

Example: Creating a Linked List
  1. Insert the value 5: create a node (data=5, link=0)

    • Memory Address: 5B44

  2. Insert the value 9: create a node (data=9) at 5B46 and link it to node 5

    • Updated head: 5B44 points to 5B46

  3. Continue inserting values, e.g., 2 at 5B50, and linking appropriately.

Deleting Nodes

To delete a node (e.g., node containing value 2):

  • Adjust the linking of the previous node to skip the deleted node.

Terminology Recap for Linked Lists
  • Node: An element containing data and a link.

  • Data: Value held within the node.

  • Link: Reference to the next node.

  • Head: First node.

  • Tail: Last node.

  • Traversal: Visiting all nodes from head to tail.

Stacks

Stacks model the behavior of objects stacked on top of one another.

  • LIFO (Last-In, First-Out): The last added block is the first removed.

  • Operations:

    • Push: Add an item to the top.

    • Pop: Remove an item from the top.

Real-World Analogy

In a stack of blocks: C is at the top, followed by B, then A. When A is pushed, it is placed at the bottom, and C, being added last, is the most accessible.

Stacks in Computing
  • Used for tasks requiring LIFO processing, e.g., task management.

  • Example of stacking tasks: Typing a paper, answering the phone, and attending another call – all handled in a LIFO manner.

Practice with Stacks
  • Push elements of 'PUPILS' and pop them sequentially to reverse the order.

Parentheses Matching
  • Use stacks to ensure parentheses in expressions are matched correctly.

  • For each left parenthesis: push onto the stack; for right parenthesis: pop and match with left parenthesis.

Queues

Queues are similar to lines in real life (FIFO - First-In, First-Out).

  • Ex: A queue at a movie theater: First person gets served first.

  • Operations:

    • Enqueue: Add item at the back of the queue.

    • Dequeue: Remove item from the front.

Terminology Recap for Queues
  • Queue: List where insertions are at the back and deletions at the front.

  • Item: Element in the queue.

  • Front: Item at the front.

  • Back: Item at the back.

Trees

We'll focus on binary trees in this lesson, relating back to efficient searching algorithms like binary search.

  • Binary trees consist of nodes where each can have two children.

  • Root: Top node of the tree.

  • Children: nodes attached to a parent; Leaf: nodes without children.

  • Leaf nodes collectively are called leaves.

Binary Search Trees
  • Ordered property: Left subtree less than parent, right greater than parent.

  • Example: A binary tree structure can be represented in both graphical and array formats.

    • Finding a node: Compare values recursively.

Array Representation of Trees
  • A binary tree can be represented in an array, aiding in direct indexing for parent and child relationships.

  • Left child node at index $2i+1$ and right child node at index $2i+2$, while the parent can be accessed at $ ext{floor}igg( rac{i-1}{2}igg)$.

Terminology Recap for Trees
  • Binary tree: Tree with at most two children per node.

  • Node: Contains data and children references.

  • Root: Topmost node.

  • Leaf: Node without children.

  • Subtree: A subset of nodes within a tree.

  • Balanced: Tree with log base 2 of the number of nodes as the maximum height.

  • Ordered binary tree: A binary tree where child nodes are ordered based on size relative to their parent.

This comprehensive overview covers the basic concepts of lists, stacks, queues, and trees, all fundamental data structures in computer science essential for effective algorithm design and data management.