Data Structures Notes
CSF102: Data Structures Session: 2025
B. Tech 2nd Semester
Course Coordinator: Dr. Madhu Sharma
Course Instructor: Mohd Khalid
Unit 1: Introduction to Algorithms & Data Structure
Introduction:
Data types, Abstraction, Abstract Data Type (ADT), Concept of data structure, Types of data structures, Operations on Data Structures
Introduction to Algorithms, Writing Pseudocodes, Algorithm analysis, Complexity of algorithms and Time space trade-off
Searching: Linear and Binary Search Techniques and their complexity analysis.
Data Structure
A logical and mathematical model for storing and organizing data in a particular way to facilitate algorithm design and implementation.
Basic Terminologies
Data: An elementary value or a collection of values (e.g., Employee's name and ID).
Data Items: A single unit of value.
Record: A collection of various data items.
File: A collection of records of one type.
Elementary Items: Data items that cannot be further divided (e.g., the ID of an Employee).
Information: Data with attributes, i.e., meaningful data.
Introduction
Data structure affects the design of both structural & functional aspects of a program.
Program = Algorithm + Data Structure
Algorithm: A step-by-step procedure to solve a particular function.
Classification of Data Structures
Data structures are normally divided into two broad categories:
Primitive Data Structure
Non-Primitive Data Structure
Primitive Data Structures
Basic structures directly operated upon by machine instructions.
Examples:
INTEGER
CHARACTER
FLOAT
POINTERS
Can hold a single value in a specific location.
Non-Primitive Data Structures
ARRAYS
Single Dimension
Two Dimension
Multi Dimension
LISTS
Linear
Stack
Queue
Linked List
Non-Linear
Trees
Graphs
Operations on Primitive Data Structures
Create
Selection
Updating
Destroy or Delete
Non-Primitive Data Structure characteristics
Can hold multiple values either contiguously or randomly.
Defined by the programmer.
Linear Data Structures
Have homogeneous elements.
Elements are in a sequence, forming a linear series.
Easy to implement because computer memory is organized linearly.
Examples:
Stack
Queue
Linked Lists
Non-Linear Data Structures
Data items are connected to several other data items.
Exhibit hierarchical or parent-child relationships.
Data elements are not arranged sequentially.
Examples:
Trees
Graphs
Operations on Non-Primitive Data Structures
Traversal
Insertion
Selection
Searching
Sorting
Merging
Destroy or Delete
Primitive vs. Non-primitive Data Structures
Primitive:
Stores data of only one type.
Examples: integer, character, float.
Cannot be NULL.
Size depends on the data type.
Starts with a lowercase character.
Cannot be used to call methods directly.
Non-primitive:
Stores data of more than one type.
Examples: Array, Linked list, stack.
Can consist of a NULL value.
Size is not fixed.
Starts with an uppercase character.
*cannot be used to call methods.
Linear vs. Non-linear Data Structures
Linear:
Data elements are arranged linearly; each element is attached to its previous and next adjacent.
Single level is involved.
Easier implementation.
Elements can be traversed in a single run.
Memory may not be utilized efficiently.
Examples: array, stack, queue, linked list.
Applications in application software development.
Useful for simple data storage and manipulation.
Good performance for adding/removing at ends, slower for searching/removing in the middle.
Non-linear:
Data elements are attached hierarchically.
Multiple levels are involved.
Complex implementation.
Elements cannot be traversed in a single run.
Memory is utilized efficiently.
Examples: trees and graphs.
Applications in Artificial Intelligence and image processing.
Useful for representing complex relationships and data hierarchies.
Performance varies, can be optimized for specific operations.
Operations on Data Structures (Detailed)
Traversing: Visiting each element stored in the data structure.
Searching: Finding a particular element; successful if the element is found.
Insertion: Adding an element to the data structure.
Deletion: Removing an element from the data structure.
In stack, it's called "Pop."
In queue, it's called "Dequeue."
Merging: Combining lists of two sorted data items to form a single sorted list.
Abstract Data Type (ADT)
A mathematical model for data types defined by its behavior (semantics) from a user perspective.
Defined in terms of:
Possible values
Possible operations on data of this type
Behavior of these operations
Contrasts with data structures, which are concrete representations of data from an implementer's view.
Only specifies what operations are to be performed, not how they will be implemented.
Gives an implementation-independent view, providing only essentials and hiding the details (abstraction).
Can be defined as a set of data elements along with the operations on the data.
ADT Implementation
Contains a storage structure to store data elements and algorithms for fundamental operations.
Examples: array, linked list, queue, stack.
Separates a data structure from its implementation information.
List-ADT
Contains elements of the same type arranged sequentially.
Operations:
get(): Return an element at a given position.insert(): Insert an element at any position.remove(): Remove the first occurrence of an element.removeAt(): Remove the element at a specified location.replace(): Replace an element at any position.size(): Return the number of elements.isEmpty(): Return true if empty, otherwise false.isFull(): Return true if full, otherwise false.
ADT Model
Consists of public and private functions.
Involves:
Encapsulation: Wrapping all data in a single unit (ADT).
Abstraction: Showing operations that can be performed and the data structures used.
Real-World Example (Smartphone)
Data (Specifications):
4 GB RAM
Snapdragon 2.2ghz processor
5 inch LCD screen
Dual camera
Android 8.0
Operations:
call(): Make a call.text(): Send a text message.photo(): Click a photo.video(): Make a video.
Applications of Data Structures
Organization of data in computer's memory.
Representing information in databases.
Implementation of algorithms for searching data (e.g., search engine).
Implementation of algorithms to manipulate data (e.g., word processors).
Implementation of algorithms to analyze data (e.g., data miners).
Support for algorithms to generate data (e.g., a random number generator).
Support for algorithms to compress and decompress data (e.g., a zip utility).
Implementation of algorithms to encrypt and decrypt data (e.g., a security system).
Algorithm
A process or set of rules required to perform calculations or problem-solving operations, especially by a computer.
Formal definition: A finite set of instructions carried out in a specific order to perform a specific task.
Not a complete program but a solution (logic) of a problem, represented as a flowchart or pseudocode.
An algorithm is a finite set of Instructions for solving a particular problem in a finite amount of time using a finite amount of data.
Characteristics Required of an Algorithm
Each instruction should be precise and unambiguous.
Each instruction should be performable in a finite time.
One or more instructions should not be repeated infinitely (ensuring termination).
The desired results must be obtained after the algorithm terminates.
Algorithm can be represented as programs, flowcharts and pseudo-codes.
Characteristics of an Algorithm
Input: Zero or more input values.
Output: One or more output values.
Unambiguity: Instructions should be clear and simple.
Finiteness: Should contain a limited (countable) number of instructions.
Effectiveness: Each instruction affects the overall process.
Language independence: Can be implemented in any language with the same output.
Dataflow of an Algorithm
Problem: A real-world problem or instance.
Algorithm: Step-by-step procedure designed for the problem.
Input: Required and desired inputs provided to the algorithm.
Processing unit: Processes the input to produce the desired output.
Output: The outcome or result of the program.
Why Algorithms
Scalability: Helps to scale down a large real-world problem into smaller steps for easy analysis.
Performance: Indicates if the problem can be easily broken into smaller steps, making it feasible.
Sample Algorithm (Example 1)
Problem: Write an algorithm to find the sum of two numbers.
Algorithm:
Step 1: Read two numbers in variable num1 and num2
Step 2: Add num1 and num2 and sum=num1+num2
Step 3: Display result sum
Algorithm Example: Even or Odd
Step 1: Start
Step 2: Read a number to N
Step 3: Divide the number by 2 and store the remainder in R.
Step 4: If R = 0 go to Step 6
Step 5: Print “N is odd” go to step 7
Step 6: Print “N is even”
Step 7: Stop
Factors of an Algorithm
Modularity: Ability to break down a problem into small modules or steps.
Correctness: The algorithm produces the desired output for given inputs.
Maintainability: Should be designed simply for easy redefinition without major changes.
Functionality: Considers various logical steps to solve the real-world problem.
Robustness: How clearly an algorithm can define a problem.
User-friendly: Easy to explain to a programmer.
Simplicity: Easy to understand.
Extensibility: Others can use and extend the algorithm.
Issues of Algorithms
How to design algorithms (step-by-step procedure).
How to analyze algorithm efficiency.
Categories of Algorithms
Sort: For sorting items in a certain order.
Search: For searching items inside a data structure.
Delete: For deleting existing elements from the data structure.
Insert: For inserting items inside a data structure.
Update: For updating existing elements inside a data structure.
Pseudocode
An informal high-level description of the operating principle of a computer program or algorithm.
Uses structural conventions of a standard programming language for human reading.
More structured than prose but less formal than a programming language.
Methodology that allows programmers to represent the implementation of an algorithm.
Expressions:
Use standard mathematical symbols for numeric and boolean expressions
Use
equalkeyword for assignment (=in C)Use
=for equality relationship (==in C)
Method Declarations:
Algorithm name(param1, param2)
Good vs Bad Ways of Writing Pseudocode
Good Example
FOR each index of string
IF string(index) is equal to hash THEN
RETURN "Hash found"
END IF
END FOR
Bad Example
FOR i = 0 to 20
IF string[i] = '#'
RETURN "Hash found"
END IF
END FOR
Worst Example
FOR each index of string
if string[i] = '#'
return "Hash found"
END if
END FOR
Advantages of Pseudocode
Easily transformed into actual programming language.
Easily understood by non-programmers.
Easily modifiable compared to flowcharts.
Beneficial for structured, designed elements.
Errors can be detected before coding.
Disadvantages of Pseudocode
Lacks a standardized style or format.
Higher error possibility when transforming into code.
May require a tool for extracting and drawing flowcharts.
Does not depict the design.
Algorithm vs. Pseudocode
Algorithm: A problem-solving process, a series of instructions to solve a problem.
Pseudocode: A way of writing an algorithm using informal, simple language without strict syntax.
Problem Example: Calculating Absentees
Problem: Suppose there are 60 students in the class. How will you calculate the number of absentees in the class?
Pseudo Approach:
Initialize a variable called as Count to zero, absent to zero, total to 60
FOR EACH Student PRESENT DO the following:
Increase the Count by One
Then Subtract Count from total and store the result in absent
Display the number of absent students
Algorithmic Approach:
Count <- 0, absent <- 0, total <- 60
REPEAT till all students counted
Count <- Count + 1
absent <- total - Count
Print "Number absent is:" , absent
Algorithm Analysis
Can be analyzed before (priori) and after (posterior) creation.
Priori Analysis: Theoretical analysis before implementation; factors like processor speed are not considered.
Posterior Analysis: Practical analysis after implementation using a programming language; evaluates running time and space taken by the algorithm.
Algorithm Complexity
Performance measured by:
Time complexity: Amount of time required to complete execution, denoted by Big O notation.
Space complexity: Amount of space required to solve a problem and produce an output, also expressed in Big O notation.
Analysis of Algorithms
Worst-case, Average-case, Best-case
Worst-case: Upper bound on the running time for any input.
Average-case: Estimate of the running time for an 'average' input, assuming all inputs are equally likely.
Best-case: Analysis under optimal conditions (rarely used for decision-making).
Asymptotic Analysis and Notations
Evaluates algorithm performance in terms of input size, not actual running time.
Measures how time (or space) taken by an algorithm increases with input size.
Uses machine-independent constants.
Asymptotic notations are mathematical tools to represent the time complexity.
BIG O Notation
Defines an upper bound of an algorithm.
O(g(n)) = { f(n): \text{ there exist positive constants } c \text{ and } n0 \text{ such that } 0 <= f(n) <= c*g(n) \text{ for all } n >= n0 }
Categories of Algorithms (Based on Big O Notation)
Constant time algorithm: running time complexity given as
Linear time algorithm: running time complexity given as
Logarithmic time algorithm: running time complexity given as
Polynomial time algorithm: running time complexity given as , where k > 1
Exponential time algorithm: running time complexity given as
O(1) < O(log n) < O(n) < O(n^2) < O(2^n) < O(n!)
BIG Omega Notation
Notation provides an asymptotic lower bound.
\Omega(g(n)) = {f(n): \text{ there exist positive constants } c \text{ and } n0 \text{ such that } 0 <= c*g(n) <= f(n) \text{ for all } n >= n0}
BIG Theta Notation
Bounds a function from above and below, defining exact asymptotic behavior.
Drop low order terms and ignore leading constants.
Example:
(g(n)) = {f(n) \text{ there exist positive constants } c1, c2 \text{ and } n0 \text{ such that } 0 <= c1g(n) <= f(n) <= c2g(n) \text{ for all } n >= n0}
Theta Notation (Θ-Notation)
Encloses the function from above and below.
Represents the upper and lower bound of the running time of an algorithm.
Used for analyzing the average-case complexity of an algorithm.
Theta (Average Case): Add the running times for each possible input combination and take the average.
Linear Search
Also called sequential search algorithm.
Simplest searching algorithm.
Traverse the list completely and match each element with the item to be found.
Returns the location of the item if found; otherwise, returns NULL.
Linear Search Algorithm
Linear_Search(a, n, val)// 'a' is the given array, 'n' is the size of given array, 'val' is the value to searchStep 1: set pos = -1
Step 2: set i = 1
Step 3: repeat step 4 while i <= n
Step 4: if a[i] == val
set pos = i
print pos
go to step 6
set i = i + 1
Step 5: if pos = -1
print "value is not present in the array "
Step 6: exit
Linear Search - Time Complexity
Best Case:
Average Case:
Worst Case:
Best Case: Element is at the first position of the array.
Average Case: The average time it takes to find element.
Worst Case: Element is at the end of the array, or not present in the array.
Binary Search
Works efficiently on sorted lists.
Ensures the list is sorted before searching.
Follows the divide and conquer approach.
Divides the list into two halves, compares the item with the middle element.
If a match is found, the location of the middle element is returned.
Otherwise, search either of the halves, depending on the result produced through the match.
Binary Search Algorithm
Search a sorted array by repeatedly dividing the search interval in half.
Begin with an interval covering the whole array.
If the value of the search key is less than the item in the middle of the interval, narrow the interval to the lower half.
Otherwise narrow it to the upper half. Repeatedly check until the value is found or the interval is empty.
Binary Search - Iterative Pseudocode
BinarySearchIT(A[0..N-1], value)low = 0
high = N-1
while (low <= high)
mid=(low +(high-low)/2)
if (A[mid] > value)
high = mid-1
else if (A[mid] < value)
low = mid + 1
else
return mid
return -1
Binary Search - Recursive Pseudocode
BinarySearchREC(A[0..N-1], value, low, high)if (high>=low)
mid = low +((high-low)/2)
if (A[mid] > value)
return BinarySearchREC(A, value, low, mid-1)
else if (A[mid] <value)
return BinarySearchREC(A, value, mid+1, high)
else
return mid
return -1
Binary Search - Time Complexity
Best Case Complexity - In Binary search, best case occurs when the element to search is found in first comparison, i.e., when the first middle element itself is the element to be searched. The best-case time complexity of Binary search is O(1).
Average Case Complexity - The average case time complexity of Binary search is .
Worst Case Complexity - In Binary search, the worst case occurs, when we have to keep reducing the search space till it has only one element. The worst-case time complexity of Binary search is .
Time-Space Trade-Off
The best algorithm requires less memory space and takes less time to complete its execution.
Designing such an ideal algorithm is not a trivial task.
More than one algorithm can solve a particular problem.
It is not uncommon to sacrifice one thing for the other.
There exists a time–space trade-off among algorithms.
If space is a big constraint, then choose a program that takes less space at the cost of more CPU time.
On the contrary, if time is a major constraint, then choose a program that takes minimum time to execute at the cost of more space.
References
https://www.geeksforgeeks.org/
Thareja, R. (2014). Data structures using C. Oxford University.
Lipschutz, S. (2003). Schaum's outline of theory and problems of data structures. McGraw-Hill.