Study Notes: Data Structures, Abstract Data Types, and Algorithms

Data Structures and Abstract Data Types (ADT)

  • Data structures are a particular way of organizing data in a computer to be used efficiently. It is the arrangement of data in memory locations to represent values of the carrier set of an abstract data type.

  • They provide a means to manage large amounts of data for users, such as in large databases and internet indexing services. This is key to designing efficient algorithms.

  • Data structures rely on the ability of a computer to fetch and store data at any place in memory, which is specified by a pointer — a bit string representing a memory address that can be stored in memory and manipulated by the program.

  • Examples of Data Structures include:

    • Array: A fixed-length, ordered collection of values of the same type stored in contiguous memory locations. One- or two-dimensional (matrix) arrays exist. Elements are identified by at least one array index or key. Used to implement tables (especially lookup tables), lists, and strings.

    • List: An abstract data type representing a sequence of values, where duplicates may occur. Represents a finite sequence; the infinite analogue is a stream. Lists are containers that hold values.

    • Linked List: Chains of nodes where each node contains data and a pointer to the next node. Nodes may include additional references. Benefits over arrays include easy insertion/removal without reallocation or reorganization, since items need not be stored contiguously in memory.

  • A data structure is often designed to support the operations required by the abstract data types it implements.

Common Data Structures

  • Stack: A collection with push and pop operations; Last-In-First-Out (LIFO). The top of the stack is where push/pop occur. Optional peek/top returns the top value without removing it. Stacks may become overflow in a full state if there is no space to push more data. They have a natural order: elements are removed in the reverse order of their addition.

  • Queue: A collection with enqueue (add to rear) and dequeue (remove from front) operations. First-In-First-Out (FIFO); first element added is the first removed. Linear data structure.

  • Hashing: A method for storing and retrieving records based on a search key value. Records are stored in a hash table and located by a hash function applied to the search key. Records are not stored in value order.

    • Let M be the number of slots in the hash table, with slots numbered from 0 to M−1. The calculation locations are determined by the hash function; the position is a slot in the table: the set of possible indices is ext{slots} = igl\u2211{0,1,
      , ext{M-1}igr
      rbracket (represented as the range 0,1,<br>,extM10,1,<br>, ext{M}-1).

  • Trees: A data structure made up of nodes (or vertices) and edges without cycles. An empty tree is the null tree. A non-empty tree has a root and levels of nodes forming a hierarchy. Recursively, a node consists of a value and a list of references to its children; there are constraints that no reference duplicates and none point to the root.

Abstract Data Type (ADT)

  • An Abstract Data Type is a mathematical model for a class of data structures that share similar behavior, or for certain data types in programming languages with identical semantics. An ADT is defined by the operations that may be performed on it and by mathematical preconditions/constraints on the effects and possibly the cost of those operations.

  • ADT Operations (examples of common ADTs):

    • Abstract Array

    • Adding elements

    • Sorting elements

    • Searching elements

    • Re-arranging the elements

    • Performing matrix operations

    • Prefix and postfix operations

    • Abstract List

    • Inserting

    • Searching

    • Deletion

    • Abstract Link (Linked List)

    • Checking whether the list is empty

    • Accessing a node to modify it or obtain its information

    • Traversing the list to access all elements (e.g., print, find element)

    • Determining the size (number of elements)

    • Inserting or removing a specific element

    • Creating a list by reading elements from an input stream

    • Converting a list to and from an array, string, etc.

    • Abstract Stack

    • Push (inserts an item)

    • Pop (extracts an item)

    • Peek or top (examine top element without removal)

    • Abstract Queue

    • Enqueue (join the queue)

    • Dequeue (remove the first element)

    • Front (access and serve the first element)

    • Abstract Hashing

    • Add (Insert)

    • Delete (Remove)

    • Abstract Tree

    • Searching

    • Insertion

    • Deletion

    • Traversal

    • Sort

Introduction to Algorithms

  • An algorithm is a finite sequence of steps for accomplishing a computational task.

  • It must have steps that are simple and definite enough to be done by a computer and it must terminate after a finite number of steps.

  • An algorithm can be considered as a computational procedure that takes input values and produces output values. It can be described as a procedure that accepts data, manipulates it following prescribed steps, to eventually yield the required result.

  • Example intuition: a recipe is a good illustration of an algorithm.

  • Algorithms are essential to computing because a computer program is essentially an algorithm that tells the computer what steps to perform in a specific order to carry out a task.

Characteristics of an Algorithm

  • Each step must be exact (precise and unambiguous).

  • Algorithms must terminate (finite number of steps); endless loops must be avoided.

  • It must be effective (provide correct answers at all times).

  • It must be general (solve every instance of a problem).

  • It must be unique (the result of each step is uniquely defined depending only on input and previous steps).

  • Finiteness (the algorithm stops after a finite number of instructions).

  • Output (the algorithm always produces output).

Ways to Express Algorithms

  • Human Language: describe a sequence of steps in plain natural language.

  • Pseudocode: a high-level informal description using the conventions of a programming language; omits non-essential details to aid human understanding. Example structure:

    • Start the program

    • Enter two numbers X, Y

    • Multiply the two numbers together

    • Print product

    • End program

  • Flowchart: a diagram representing an algorithm, workflow, or process using boxes of various shapes connected by arrows. Used for analysis, design, documenting, or managing processes; helps sketch the structure before coding.

Flowchart Symbols (Standard Symbols)

  • Terminal: start and end of a process.

  • Input/Output: any input or output operation.

  • Computer Processing: any processing performed by the computer system.

  • Predefined Processing: indicates a process not specially defined in the flowchart.

  • Comment: explanatory statement to clarify something.

  • Flow line: connects symbols and shows the flow of control.

  • Document Input/Output: input from a document, output to a document.

  • Decision: a point where a decision is made to determine the next action.

  • On-page Connector: connects parts of a flowchart on the same page.

  • Off-page Connector: connects parts of a flowchart that continue on separate pages.

References and attribution in the original material include Chaudhuri (2020) for flowchart symbolism and LibreTexts (2023) on databases and data structures.