Data Structures

Introduction to Data Structures

  • Lecturer: Mr. J.I. Magutorima.
  • Definition of Data Structure:   - A storage mechanism used to store and organize data.   - A method of arranging data on a computer to ensure efficient access and updating.   - The logical or mathematical model associated with a particular organization of data.   - In the field of computer science, it refers to a specific way of storing and organizing data within a computer’s memory.
  • Factors for Selection:   - Richness: The depth of features provided by the structure.   - Simplicity and Effectiveness: The ease with which data can be processed when necessary.   - Requirement-Based Choice: Selection depends on the specific needs of a project. For instance, storing data sequentially in memory warrants the use of an Array data structure.
  • Data Structure vs. Data Types:   - These concepts are slightly different. A data structure is defined as a collection of data types arranged in a specific, predetermined order.
  • Significance of Knowledge:   - Understanding how each data structure works allows for the selection of the most appropriate one for a project, resulting in code that is efficient in terms of both memory and time.

The Need for Data Structures

  • Organization: Essential for organizing vast amounts of data systematically.
  • Technological Advancement: Necessary to complement more powerful computers.
  • Memory Efficiency: Facilitates the effective use of memory cache.
  • Problem Solving: Required for solving complex problems, such as finding the Shortest path.
  • Resource Management: Crucial for the saving of resources in two primary areas:   - Space   - Time

Classification of Data Structures

  • Primitive Data Structure:   - Integer   - Boolean   - Character   - Float
  • Non-primitive Data Structure:   - Linear Data Structure:     - Arrays     - Linked List     - Queue     - Stack   - Non-Linear Data Structure:     - Tree     - Graph

Linear Data Structures

  • General Characteristics:   - Elements are arranged in a sequence, appearing one after the other.   - They are relatively easy to implement due to the particular order of elements.   - Limitation: As program complexity increases, linear structures may become inefficient due to operational complexities.
  • Array Data Structure:   - A container that holds a fixed number of items.   - All items in an array must be of the same type.   - Arrays are the foundational tool used by most other data structures to implement their algorithms.   - It consists of a list of a finite number nn referenced by a set of nn consecutive numbers, typically 1,2,3n1, 2, 3 \dots n.   - Elements are arranged in continuous memory blocks.   - The programming language determines the type of elements allowed in an array.
  • Stack Data Structure:   - Operates on the LIFO (Last In First Out) principle.   - The last element stored in the stack is the first one to be removed.   - Metaphor: Functions like a pile of plates where the last plate placed on top is the first one taken off.
  • Queue Data Structure:   - Operates on the FIFO (First In First Out) principle.   - The first element stored in the queue is the first one to be removed.   - Metaphor: Functions like a queue of people at a ticket counter where the person at the front gets served first.
  • Linked List Data Structure:   - A linear collection of data elements known as nodes.   - The sequential order is maintained through the use of pointers.   - Structure of a Node (Chain):     - Data: The actual information being stored.     - Reference: A pointer or address to the next element in the list.   - Data elements are connected through a series of these nodes.

Non-Linear Data Structures

  • General Characteristics:   - Elements are not arranged in a sequence.   - They are organized in a hierarchical manner where one element connects to one or more other elements.   - Divided into graph-based and tree-based structures.
  • Graph Data Structure:   - Composed of nodes called vertices.   - Vertices are connected to one another through edges.   - Popular Graph-Based Structures:     - Spanning Tree and Minimum Spanning Tree     - Strongly Connected Components     - Adjacency Matrix     - Adjacency List
  • Trees Data Structure:   - Similar to graphs, trees are collections of vertices and edges.   - Constraint: In a tree, there can only be exactly one edge between any two vertices.   - Popular Tree-Based Structures:     - Binary Tree     - Binary Search Tree     - AVL Tree     - B-Tree     - B+ Tree     - Red-Black Tree

Comparison: Linear vs. Non-linear Data Structures

  • Organization:   - Linear: Items arranged in sequential order, one after the other.   - Non-Linear: Items arranged in non-sequential (hierarchical) order.
  • Layers:   - Linear: All items exist on a single layer.   - Non-Linear: Items are present at different layers.
  • Traversal:   - Linear: Can be traversed in a single run; starting from the first element, one can pass through all elements sequentially in one pass.   - Non-Linear: Requires multiple runs; starting from the first element, it may not be possible to reach all elements in a single pass.
  • Memory Utilization:   - Linear: Utilization is not efficient.   - Non-Linear: Different structures utilize memory in varied efficient ways depending on specific needs.
  • Time Complexity:   - Linear: Complexity increases as the data size increases.   - Non-Linear: Time complexity remains the same.
  • Examples:   - Linear: Arrays, Stack, Queue.   - Non-Linear: Tree, Graph, Map.

Data Structure Operations

  • Overview: Several operations can be invoked to uproot and process data. The choice of data structure often depends on which operations suit the specific situation.
  • Main Operations:   - Traversing: Accessing each record or node exactly once so that specific items within the record can be processed.   - Searching: Finding the location of a desired node using a specific key value.   - Inserting: According to the provided text, this involves removing a node or record from the structure (Note: Potential source error noted, but documented per transcript).   - Sorting: Arranging data items within a structure in a specific order, such as ascending or descending.   - Deleting: Removing a node or record from the structure.   - Merging: Combining the records of two different sorted files into a single, unified sorted file.