UNIT I BASIC CONCEPTS OF DATA STRUCTURES
1.1 Introduction
In programming, variables store single data, but storing multiple related data items is often necessary.
Data Organization: Structuring data for convenience; a container for data is a data structure.
Data Structure: A means of organizing data in primary memory for convenient processing.
Data is stored in main or secondary memory.
Models (logical or mathematical) represent/organize/store data in main memory.
Data structures are collections of data organized in a specific manner in a computer's main memory, implementing Abstract Data Types (ADTs).
1.2 Basic Terminology: Elementary Data Organization
Data: A value or set of values (e.g., student marks).
Data Item: A single unit of values (e.g., roll number).
Entity: Something with qualities, characteristics, properties, or attributes that may contain values (e.g., a Student entity).
Attributes of a Student: roll number, name, address.
Values: 100, Ram, House No:133-A, Pragati Vihar, Dehradun.
Entity Set: A group or set of similar entities (e.g., employees of an organization).
Information: Processed data by applying certain rules, useful for decision-making.
Field: A single elementary unit of information representing an attribute of an entity (e.g., roll number field).
Record: A collection of field values of a given entity (e.g., roll number, name, address of a student).
File: A collection of records of the entities in a given entity set (e.g., students of a class).
Key: One or more fields in a record that take unique values and can distinguish one record from the others.
1.3 ALGORITHM DESIGN AND ANALYSIS
1.3.1 Algorithm and Specification
Algorithm: A step-by-step procedure with instructions executed in a specific order to get the desired output.
Algorithms are language-independent.
Characteristics of an Algorithm:
Unambiguous: Clear steps with single meaning.
Input: Zero or more well-defined inputs.
Output: One or more well-defined outputs.
Finiteness: Must terminate after a finite number of steps.
Feasibility: Feasible with available resources.
Independent: Step-by-step directions independent of programming code.
How to Write an Algorithm:
No well-defined standards; problem and resource-dependent.
Algorithms are not written for specific programming code.
Use common code constructs (loops, flow-control).
Write algorithms step-by-step after defining the problem domain.
Example Problem: Algorithm to add two numbers and display the result.
Step 1: START
Step 2: Declare three integers a, b, & c
Step 3: Define values of a & b
Step 4: Add values of a & b
Step 5: Store output of Step 4 to c
Step 6: Print c
Step 7: STOP
Alternative representation:
Step 1: START ADD
Step 2: Get values of a & b
Step 3: c = a + b
Step 4: Display c
Step 5: STOP
The second method is preferred for analysis as it ignores unwanted definitions.
1.3.2 Different approaches to design algorithms
Algorithmic Paradigms
General approaches to constructing efficient solutions.
Provide templates for solving diverse problems.
Translated into common control and data structures.
Temporal and spatial requirements can be analyzed.
Paradigms:
Brute-force
Divide and Conquer
Recursive
Dynamic Programming
Greedy Method
Backtracking
Brute Force
Straightforward approach based on problem statement and definitions.
Useful for small-size instances.
Examples: Computing a^n by multiplying a*a*...*a (n times), computing n!, Selection sort, Bubble sort.
Divide-and-Conquer, Decrease-and-Conquer
Split the problem into smaller sub-instances, solve independently, and combine solutions.
Divide-and-conquer reduces the problem size by a factor.
Decrease-and-conquer reduces the size by a constant.
Examples: Computing a^n by recursion, Binary search, Mergesort, Quicksort.
RECURSION AND TYPES OF RECURSION
What is Recursion?
A technique in which a function calls itself; solutions written this way are often simple.
Problems defined in terms of themselves are good candidates.
Example: Factorial function: n! = n * (n-1) * (n-2) * ... * 1
Can be written as: n! = n * (n-1)!
Factorial is recursive: n! = n * (n-1)!, where n > 0
Criteria for writing a recursive function:
Base case (halting case): Solvable without recursion.
General case: Recursive call.
Recursive call makes the problem smaller and approaches the base case.
Base Case
The base case stops the recursion.
Every recursive function must have at least one base case.
Factorial example: If n == 0 or n == 1, then n! = 1. Factorial is undefined for values less than 0.
Implementation:
int factorial(int n)
{
    if (n < 0)
        return 0;                  /* error value for inappropriate input */
    else if (n <= 1)
        return 1;                  /* if n==1 or n==0, n! = 1 */
    else
        return n * factorial(n-1); /* n! = n * (n-1)! */
}
Example: factorial(3).
Types of Recursion
Linear recursion
Function makes a single call to itself each time the function runs.
Example: Factorial function
int factorial(int n)
{
    if (n < 0)
        return 0;                  /* error value for inappropriate input */
    else if (n <= 1)
        return 1;                  /* if n==1 or n==0, n! = 1 */
    else
        return n * factorial(n-1); /* n! = n * (n-1)! */
}
Binary Recursion
Functions with two (or more) recursive calls.
Example: Fibonacci series
int recursiveFib(int n)
{
    /* base case */
    if (n <= 1)
        return n;
    /* binary recursive call */
    return recursiveFib(n-1) + recursiveFib(n-2);
}
Exponential Recursion
Exponential number of calls relative to the data set size.
Example: Ackerman's function
int ackermann(int m, int n)
{
    if (m == 0)
        return n + 1;
    else if (n == 0)
        return ackermann(m - 1, 1);
    else
        return ackermann(m - 1, ackermann(m, n - 1));
}
Indirect Recursion
Functions call each other in a cycle.
Example:
int is_odd(unsigned int n); /* forward declaration, since the functions call each other */

int is_even(unsigned int n)
{
    if (n == 0)
        return 1;
    else
        return is_odd(n - 1);
}

int is_odd(unsigned int n)
{
    return !is_even(n);
}
Tail Recursion
Recursive call is the last thing the function does; often the value of the recursive call is returned.
Can be easily implemented iteratively.
Good compilers can optimize tail recursion into iteration.
Example: GCD (Greatest Common Divisor)
int gcd(int m, int n)
{
    int r;
    if (m < n)
        return gcd(n, m); /* ensure m >= n */
    r = m % n;
    if (r == 0)
        return n;
    else
        return gcd(n, r);
}
Pros and Cons of Recursion
Advantage: Less code, more concise, cleaner, and easier to understand.
Disadvantage: No saving in storage or time; every recursive call pushes a frame onto the stack.
Pushing a call frame onto the stack and popping it off again takes more time than the record keeping an iterative solution requires.
Stack overflow is possible if the recursion goes too deep.
1.4 Dynamic Programming
Dynamic Programming optimizes the divide and conquer approach by storing the results of sub-instances in a table, avoiding repeated calculations.
Bottom-Up Technique: solves the smallest sub-instances first and constructs solutions to larger sub-instances.
Top-Down Technique: progresses from the initial instance down to the smallest sub-instance via intermediate sub-instances.
Examples: Fibonacci numbers computed by iteration
1.5 PERFORMANCE ANALYSIS OF ALGORITHMS (ALGORITHM EFFICIENCY)
Algorithm Efficiency: Some algorithms are more efficient than others.
Complexity: A function describing algorithm efficiency in terms of the amount of data the algorithm must process.
Complexity Measures:
Time complexity
Space complexity
SPACE COMPLEXITY
The amount of computer memory required to solve a given problem of a particular size.
Components:
Fixed Part: Instruction space, variable space, constants space, etc.
Variable Part: Instance of input and output data.
Example 1:
Algorithm sum(x,y) { return x+y; }
Two computer words required to store variables x and y.
Space Complexity (S) = 2
Example 2:
sum(a[], n)
{
    sum = 0;
    for (i = 0; i < n; i++)
    {
        sum = sum + a[i];
    }
    return sum;
}
Space complexity calculation:
One computer word to store n.
n space to store array a[].
One space for variable sum and one for i.
Total Space(S) = 1 + n + 1 + 1 = n + 3.
TIME COMPLEXITY
A function describing the amount of time an algorithm takes in terms of the amount of input to the algorithm.
Components
Fixed Part – Compile time
Variable Part – Run time dependent on problem instance
Note: Usually only the run time is considered; compile time is ignored.