final report
Cover Page
Course Name: Analysis of Algorithms (SENG 313 Section 3)Instructor: Öğr. Gör. Alpaslan ErdağPrepared By: Basmalah Mohamed Ahmed Elsayyed (Ahmed 220208760)Group Members: Maimuna Aminu Suleiman Mirane Moussa Omar (210208882)
Sorting Algorithms
Problem Description
Task: Optimize product listing on an e-commerce platform by sorting products based on sales volume and user ratings.Goal: Enhance user experience by prioritizing highly rated and popular products.
Time Complexity Comparison
Merge Sort
Best Case: O(n log n)Even when the array is sorted, it continues to split and merge elements, ensuring consistent performance.
Average Case: O(n log n)This complexity reflects the methodical nature of the divide-and-conquer approach for diverse data arrangements.
Worst Case: O(n log n)Maintains performance standards due to its predictable recursive process, ensuring dependability under varying conditions.
Quick Sort
Best Case: Ω(n log n)Optimal performance occurs when the pivot divides the array into nearly equal parts, significantly enhancing efficiency.
Average Case: O(n log n)Efficient partitioning typically leads to superior runtime, making it suitable for many applications.
Worst Case: O(n²)Suboptimal performance arises under poor pivot selection scenarios, where partitions are imbalanced, leading to inefficiency.
Space Complexity Comparison
Merge Sort
Space Requirement: O(n)Additional space is needed for merging temporary arrays. The extra memory utilization can be a limitation in memory-constrained environments.
Quick Sort
Average Case Space: O(log n)Space usage aligns with the recursion depth, providing an advantage in memory efficiency during execution.
Worst Case Space: O(n)Poor pivot selection can escalate space consumption to linear in the worst-case scenario, which may impact performance negatively in resource-limited systems.
Performance Analysis & Graph Interpretation
Time vs. Input Size Analysis
Merge Sort: Consistently demonstrates predictable growth (O(n log n)) across varying input sizes, making it a reliable choice for applications requiring guaranteed timing.
Quick Sort: Shows enhanced performance for smaller input sizes due to in-place sorting, but this advantage may diminish with larger datasets if pivot selection strategies are not optimal.
Key Insights
Execution Speed: Quick Sort is fast with small datasets, while Merge Sort's predictability makes it preferable for larger datasets that require consistent timing for successful application.
Scalability: Merge Sort scales efficiently with increasing data size, whereas Quick Sort's performance is more sensitive to pivot selection methods that can influence time complexity significantly.
Observations From Graphs
Deployment and analysis of graphs suggested that both algorithms exhibit O(n log n) growth trends, though certain anomalies emerged in edge cases.Quick Sort may demonstrate increased execution times when processing pre-sorted inputs owing to its method of operation, necessitating careful consideration for practical use cases.
Execution Time Comparison Table (in seconds)
Input Size Merge Sort Quick Sort | ||
1000 | 0.0070 | 0.0146 |
5000 | 0.0405 | 0.1002 |
10000 | 0.0886 | 0.2008 |
Space Usage Graph Analysis
Merge Sort's space complexity increases linearly (O(n)), which could be a disadvantage in scenarios where memory availability is limited.
Quick Sort exhibits logarithmic growth in space usage under optimal conditions, but this can shift to linear growth in situations where pivot selection leads to suboptimal partitioning.
Conclusions on Sorting Algorithms
From the comparative analysis, it is clear that Merge Sort offers reliability for large datasets with its stable run times. Quick Sort, on the other hand, demonstrates remarkable speed and efficient memory usage for smaller to moderate datasets. To achieve optimal performance from Quick Sort, careful pivot strategies must be implemented, as this can significantly alter the execution profile.
Task Scheduling Using Greedy Algorithm
Introduction
Efficient task scheduling remains pivotal in various domains such as conference planning, CPU management, and resource allocation scenarios. Utilizing the greedy approach allows for the maximization of non-overlapping sessions, effectively optimizing the schedule for all stakeholders involved.
Problem Description
Objective: Select the maximum number of non-overlapping sessions based on start and end times to ensure maximum utilization of resources.
Algorithm Design Steps
Sort sessions by their end times in ascending order to facilitate optimal selection.
Initialize an empty list to accumulate selected sessions based on the non-overlapping criterion.
Iterate through the sorted list of sessions, adding a session to the selected list only if it starts after the last selected session ends.
Return the constructed list of selected sessions reflecting the optimal non-overlapping configuration.
Complexity Analysis
In terms of computational pathways, the Greedy Algorithm maintains a time complexity of O(n log n) which is primarily dictated by the sorting step. This efficiency is crucial for applications requiring real-time scheduling.The space complexity remains efficient at O(n), representing the primary storage requirement needed for session management during the process. Total space usage is characterized by input storage, which is essential to track all the sessions being processed.
Performance Testing Results
Empirical testing and analysis indicate that as the input size grows, the execution time exhibits logarithmic growth trends, aligning well with theoretical expectations drawn from the complexity analysis. Both execution and space utilization graphs reinforce this trend across various test cases, suggesting scalability and robustness in performance.
Key Observations
The Greedy Algorithm effectively maximizes the number of non-overlapping sessions, which is imperative in scenarios like conference planning where efficient time management is essential. Execution time for large datasets remains efficient, showcasing the capability of the algorithm under practical conditions.
Recommendations
For practical implementation, it is recommended to utilize parallel sorting techniques when dealing with large datasets in order to enhance processing times significantly. Implementing tests with various edge cases can improve robustness and ensure reliability in varied operational scenarios.
Dynamic Programming (Longest Common Subsequence - LCS)
Problem Description
The process of developing a plagiarism detection tool necessitates the comparison of text documents using the LCS algorithm to identify genuine matching sequences between texts. This technique has broad applications not only in plagiarism detection but also in fields such as data comparison and bioinformatics for sequence alignment.
Time & Space Complexity
Time Complexity: O(m * n) for all cases, where m and n represent the lengths of the two documents being compared. This time efficiency makes LCS feasible for moderate-sized document comparisons.
Space Complexity: O(m * n) primarily for storing the DP table, although optimizing it can reduce storage requirements to O(min(m, n)), which allows for more memory-efficient operations in practice.
Step-by-Step LCS Explanation
Cleansing the text data involves processes such as tokenization and standardization to facilitate accurate comparison.
Constructing and populating the dynamic programming table based on character matches between the two sequences aids in effective comparison.
The backtrack operation is crucial in extracting relevant sequences and managing multiple outputs, thus providing a comprehensive view of the similarity between text documents.
Dijkstra’s Algorithm for Shortest Paths
Overview
Implementing Dijkstra’s algorithm is critical for finding the shortest path in weighted graphs through a greedy approach. The algorithm operates by initializing distances from a designated starting node. This process is essential for applications in routing and network analysis.
Key Steps
Initialize distances to infinity for all nodes except for the start node, which is set to zero.
Employing a priority queue, process each node to update the shortest distances to its neighboring nodes based on the established weights.
Manage dynamic updates as necessary, facilitating real-time adjustments to the shortest paths as graph conditions evolve.
Complexity Analysis
Time Complexity: O((V + E) log V) where V refers to the number of vertices and E refers to the number of edges, ensuring efficiency across varied graph structures.
Space Complexity: O(V + E) reflecting the storage required for maintaining graph details and distance information.
Ford-Fulkerson Algorithm for Maximum Flow
Overview
The Ford-Fulkerson algorithm plays a vital role in calculating the maximum flow in networks while focusing on identifying augmenting paths effectively. This principle is highly applicable in network optimization problems.
Key Steps
Commence with an initial flow of zero throughout the network.
Utilize BFS to identify augmenting paths and calculate the maximum adjustment to the flow until no further paths remain, ensuring optimal flow positioning.
Complexity Analysis
Time Complexity: O(E * V), emphasizing the dependence on the number of edges and vertices, which is significant for large-scale networks.
Space Complexity: O(V²), illustrating the storage needs associated with maintaining flow network information.
Recommendations for Further Work
It is advisable to explore variants of the Ford-Fulkerson algorithm, such as the Edmonds-Karp algorithm, which introduces BFS for finding augmenting paths to optimize performance further. Additionally, enhancing visual representations can greatly assist in illustrating the dynamics of flow network changes to stakeholders involved.
References
Compiling a list of relevant links, literature, and resources is crucial for further reading and academic validation, contributing to a well-rounded understanding of algorithms and their applications in various fields.