Lecture 4: Thread-Based Parallelism

Lecture 4 discusses thread-based parallelism in Python, including how threads work, how to create them, and thread synchronization techniques [1, 2].

Key aspects covered in the lecture:

  • Thread-Based Parallelism in Python:

  • Multithreading splits a program into concurrent, independent tasks that operate simultaneously [1].

  • While multithreading can speed up programs, its primary goal is concurrency [1].

  • Challenges include handling synchronization and deadlocks to ensure smooth program execution [2].

  • A thread is an independent execution flow that runs in parallel and concurrently with other threads [3].

  • Thread programming is a shared-memory paradigm: threads in the same process share data and resources, and variables in the shared address space are accessible to all of them [3]. Each thread has its own program counter, registers, and stack, but shares data and resources with the other threads of the same process [3, 4].
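
The shared-memory model described above can be illustrated with a minimal sketch. The names `shared_squares` and `record_square` are hypothetical and chosen only for this example: each thread computes a value in a local variable (on its own stack) and then writes it into a dictionary that every thread in the process can see.

```python
import threading

# Hypothetical shared structure: one dict visible to every thread in the process.
shared_squares = {}

def record_square(n):
    local_value = n * n              # local variable, private to this thread's stack
    shared_squares[n] = local_value  # write into memory shared by all threads

threads = [threading.Thread(target=record_square, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_squares)
```

Each thread writes to a distinct key here, so the example sidesteps the interference problems that synchronization is meant to solve.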

  • States of Threads:

  • A thread is created and enters a ready state [4].

  • Once the thread is started, the OS schedules it and it begins running [4].

  • If a thread waits for a condition, it enters a blocked state and returns to the ready state when unblocked [4, 5].

  • Once execution is complete, the thread is terminated [5].
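
As a rough sketch of this lifecycle: Python's threading module does not expose the ready, running, and blocked states directly, but `is_alive()` shows whether a thread has been started and has not yet terminated. The `gate` event and `worker` function below are hypothetical names used only to keep the thread blocked for a moment.

```python
import threading

gate = threading.Event()  # hypothetical signal the worker blocks on

def worker():
    gate.wait()  # the thread is blocked here until the main thread sets the event

t = threading.Thread(target=worker)
alive_before_start = t.is_alive()   # created but not started yet

t.start()
alive_while_blocked = t.is_alive()  # started and waiting on the event

gate.set()                          # unblock the worker
t.join()                            # wait until it terminates
alive_after_join = t.is_alive()

print(alive_before_start, alive_while_blocked, alive_after_join)
```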

  • Creating Threads:

  • The threading.Thread() object allows concurrent execution of multiple functions [5].

  • Key parameters include:

  • target: The callable function to be executed [5, 6].

  • args: A tuple of arguments passed to the target function [5, 6].

  • kwargs: A dictionary of keyword arguments for the target function [5, 6].

  • .start(): Starts the thread, making it available for scheduling so that its target function begins executing [7].

  • .join(): Blocks the calling thread until the joined thread has finished its task, so the next line of code runs only after completion [7].

  • Threads can be named, although using Python's default naming is often better, especially with multiple processes [6, 8]. You can use threading.current_thread().name to read the name of the currently running thread [8, 9].

  • Threads can also be created using object-oriented programming by creating a child class that inherits from the Thread class, overriding the constructor (__init__) and the execution method (run) [9].
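
The two creation styles above might look like the following sketch. The function `greet`, the class `SquareThread`, and the thread name "greeter" are hypothetical and exist only for illustration.

```python
import threading

greetings = {}

def greet(name, punctuation="!"):
    # Store the message under the name of the thread that produced it.
    greetings[threading.current_thread().name] = f"Hello, {name}{punctuation}"

# Style 1: pass a callable through target, with args (tuple) and kwargs (dict).
t = threading.Thread(
    target=greet,
    args=("Ada",),
    kwargs={"punctuation": "?"},
    name="greeter",
)
t.start()  # begin execution
t.join()   # wait for completion before continuing

# Style 2: subclass Thread, overriding __init__ and run.
class SquareThread(threading.Thread):
    def __init__(self, number):
        super().__init__()  # the parent constructor must run first
        self.number = number
        self.result = None

    def run(self):
        # run() holds the work that start() executes in the new thread.
        self.result = self.number ** 2

sq = SquareThread(7)
sq.start()
sq.join()

print(greetings, sq.result)
```

Note that the code calls `start()`, not `run()`, in both styles; calling `run()` directly would execute the work in the calling thread instead of a new one.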

  • Thread Synchronization:

  • Thread synchronization ensures threads work together smoothly by preventing them from interfering with each other when accessing shared data or resources [10].

  • Mechanisms include Locks, RLocks, Semaphores, Conditions, and Events [11]; Barriers and Queues are also covered below.

  • Locks:

  • A lock is an object a thread must acquire before accessing a protected section, ensuring only one thread accesses the code at a time [12].

  • Threads use acquire() to request the lock and release() to relinquish it [12, 13].

  • Locks are useful in scenarios like e-commerce checkout systems, reservation systems, and operating systems to prevent conflicts [14, 15].
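
A minimal sketch of a lock protecting a shared counter follows. The `inventory` dictionary and `sell_items` function are hypothetical, loosely modeled on the checkout scenario mentioned above. The `with` statement calls `acquire()` on entry and `release()` on exit.

```python
import threading

inventory = {"sold": 0}          # hypothetical shared state
inventory_lock = threading.Lock()

def sell_items(times):
    for _ in range(times):
        # Equivalent to inventory_lock.acquire() ... inventory_lock.release(),
        # but the lock is released even if the body raises an exception.
        with inventory_lock:
            inventory["sold"] += 1  # read-modify-write, done by one thread at a time

threads = [threading.Thread(target=sell_items, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(inventory["sold"])
```

Without the lock, the read-modify-write on the counter could interleave between threads and lose updates.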

  • RLocks (Re-entrant Locks):

  • An RLock allows the thread that holds it to acquire it again, which prevents the self-deadlock that would occur if a thread tried to re-acquire a plain Lock it already holds [16].

  • It tracks the number of acquisitions and requires an equal number of releases to be fully released [16].

  • RLocks are useful in recursive functions or complex operations like undo/redo functionality in collaborative editing [17-19].
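
The difference from a plain lock can be sketched as follows. The `countdown` function is a hypothetical recursive routine that takes the same RLock at every level of recursion. The non-blocking `acquire(blocking=False)` calls are used only to show the contrast without actually deadlocking.

```python
import threading

rlock = threading.RLock()

def countdown(n, trace):
    with rlock:  # re-acquired by the same thread on every recursive call
        trace.append(n)
        if n > 0:
            countdown(n - 1, trace)

trace = []
t = threading.Thread(target=countdown, args=(3, trace))
t.start()
t.join()

# Contrast: a plain Lock cannot be re-acquired by the thread that holds it.
plain_lock = threading.Lock()
plain_lock.acquire()
plain_reacquired = plain_lock.acquire(blocking=False)  # fails instead of deadlocking
plain_lock.release()

# An RLock can, as long as every acquire is matched by a release.
rlock.acquire()
rlock_reacquired = rlock.acquire(blocking=False)
rlock.release()
rlock.release()

print(trace, plain_reacquired, rlock_reacquired)
```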

  • Semaphores:

  • A semaphore controls access to a common resource by multiple threads, limiting the number of concurrent threads in a section [20].

  • Threads must acquire a permit to enter a critical section and release it upon exit [20, 21].

  • Semaphores are useful for managing network connection pools and database access control [22, 23].
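
As an illustrative sketch, the semaphore below caps the number of threads inside a section at two. The connection-pool framing, `MAX_CONNECTIONS`, `use_connection`, and the `stats` bookkeeping are all assumptions made for the example; the short sleep stands in for real work.

```python
import threading
import time

MAX_CONNECTIONS = 2  # hypothetical pool size
pool_slots = threading.Semaphore(MAX_CONNECTIONS)

stats_lock = threading.Lock()
stats = {"active": 0, "peak": 0}

def use_connection():
    with pool_slots:  # acquire a permit; blocks while two threads are already inside
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.01)  # stand-in for work done while holding a connection
        with stats_lock:
            stats["active"] -= 1
    # the permit is released when the with-block exits

threads = [threading.Thread(target=use_connection) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stats)
```

The recorded peak never exceeds `MAX_CONNECTIONS`, even though six threads compete for the section.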

  • Conditions:

  • A condition variable allows threads to wait for certain conditions to be met before proceeding [24].

  • Threads wait for a condition to become true and are notified when the state changes [24].

  • Conditions are ideal for producer-consumer scenarios and inter-thread communication [25, 26].
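
A minimal producer-consumer sketch with a condition variable is shown below. The `buffer` list and the `producer` and `consumer` functions are hypothetical. The consumer re-checks its predicate in a `while` loop, which is the usual way to guard against waking up when the buffer is still empty.

```python
import threading

buffer = []     # hypothetical shared buffer
consumed = []
condition = threading.Condition()

def consumer(count):
    for _ in range(count):
        with condition:
            while not buffer:     # wait until the producer has added something
                condition.wait()  # releases the lock while waiting
            consumed.append(buffer.pop(0))

def producer(values):
    for value in values:
        with condition:
            buffer.append(value)
            condition.notify()    # wake one waiting thread: the state has changed

c = threading.Thread(target=consumer, args=(3,))
p = threading.Thread(target=producer, args=([1, 2, 3],))
c.start()
p.start()
p.join()
c.join()

print(consumed)
```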

  • Events:

  • An event object communicates the occurrence of an event between threads, where threads wait until an event is set by another thread [26].

  • Threads can set (event.set()) or clear (event.clear()) an internal flag and test it with is_set(), while other threads wait (event.wait()) until the event is set [27].

  • Events are useful for signaling state changes or triggering actions in other threads, such as asynchronous task initialization or pause/resume operations [27-29].
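
The four event methods can be sketched together as follows. The `ready` event, the `waiter` function, and the `log` list are hypothetical names for this example.

```python
import threading

ready = threading.Event()
log = []

def waiter():
    ready.wait()  # block until another thread sets the flag
    log.append("worker saw the event")

t = threading.Thread(target=waiter)
t.start()

flag_before = ready.is_set()  # flag starts out cleared
log.append("main sets the event")
ready.set()                   # release every thread waiting on the event
t.join()

flag_after = ready.is_set()
ready.clear()                 # reset the flag so the event can be reused
flag_cleared = ready.is_set()

print(log, flag_before, flag_after, flag_cleared)
```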

  • Barriers:

  • A barrier is a synchronization mechanism that makes multiple threads wait until a predefined number of threads have reached the barrier point before proceeding [29].

  • Once all threads have called barrier.wait(), they are simultaneously released to continue execution [30].

  • Barriers are useful in parallel data processing, scientific computing, and multi-stage pipeline processing [31, 32].
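
A small sketch of a two-phase computation separated by a barrier follows. `NUM_WORKERS`, `stage_worker`, and the `log` list are hypothetical. No worker starts phase 2 until all three have finished phase 1 and called `barrier.wait()`.

```python
import threading

NUM_WORKERS = 3  # hypothetical number of parties
barrier = threading.Barrier(NUM_WORKERS)

log_lock = threading.Lock()
log = []

def stage_worker(worker_id):
    with log_lock:
        log.append(("phase1", worker_id))
    barrier.wait()  # blocks until NUM_WORKERS threads have reached this point
    with log_lock:
        log.append(("phase2", worker_id))

threads = [threading.Thread(target=stage_worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(log)
```

The order of workers within a phase can vary between runs, but every phase 1 entry is logged before any phase 2 entry.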

  • Queues:

  • Queues are data structures that follow the First-In-First-Out (FIFO) principle, so items are retrieved in the order in which they were added [32, 33].

  • The queue.Queue class in Python is thread-safe and automatically handles locking [34].

  • Essential methods include put(item) to add an item, get() to remove an item, task_done() to indicate task completion, and join() to block until all items are processed [34, 35].

  • Queues are ideal for managing information flow between producer and consumer threads, as well as in web server request handling, task scheduling systems, and logging systems [36-38].
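
The four queue methods listed above can be combined in a short producer-consumer sketch. The `tasks` queue, the `consumer` function, and the use of `None` as a stop signal are assumptions made for this example, not something the lecture prescribes.

```python
import queue
import threading

tasks = queue.Queue()  # thread-safe FIFO queue; locking is handled internally
processed = []
STOP = None            # hypothetical sentinel telling the consumer to exit

def consumer():
    while True:
        item = tasks.get()        # blocks until an item is available
        if item is STOP:
            tasks.task_done()
            break
        processed.append(item * 2)
        tasks.task_done()         # mark this item as fully handled

worker = threading.Thread(target=consumer)
worker.start()

for n in range(5):                # the main thread acts as the producer
    tasks.put(n)
tasks.put(STOP)

tasks.join()                      # block until every put() has a matching task_done()
worker.join()

print(processed)
```

With a single consumer, the results come out in the same order the items were put in, reflecting the FIFO behavior.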

  • Choosing a Synchronization Mechanism: The lecture concludes by noting the importance of choosing the correct mechanism for each application [39].