Design a multi-threaded merge sort with multiple parallelism strategies, benchmarked against a sequential baseline.
Implement four variants:
(1) Sequential Merge Sort — standard divide-and-conquer with an insertion sort cutoff at n ≤ 32 for cache efficiency.
(2) Parallel Thread-per-Split — fork a new thread for the left half, compute the right half in the current thread, then join; depth-limited to max_depth levels, so at most 2^max_depth threads run concurrently.
(3) Thread Pool Merge Sort — submit tasks to a fixed-size ThreadPoolExecutor with a sequential threshold (e.g., 10,000 elements); below the threshold, sort sequentially to avoid task overhead.
(4) Fork-Join Merge Sort — SortTask modelled as a RecursiveTask: fork() the left half, compute() the right, join() the left; inspired by Java's work-stealing ForkJoinPool.
All implementations share a thread-safe SortMetrics class that atomically tracks comparisons, threads used, and recursion depth.
Sequential merge sort (baseline)
Standard single-threaded divide-and-conquer merge sort with insertion sort cutoff at n ≤ 32 for cache efficiency
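The baseline above might be sketched in Python as follows (a minimal sketch; function names like `merge_sort_seq` are illustrative, the n ≤ 32 cutoff is the one stated in the design):

```python
INSERTION_CUTOFF = 32   # switch to insertion sort at or below this size

def insertion_sort_range(a, lo, hi):
    """Sort a[lo..hi] in place; fast for small, cache-resident slices."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def merge_runs(a, lo, mid, hi):
    """Merge sorted runs a[lo..mid] and a[mid+1..hi] via temporary buffers."""
    left, right = a[lo:mid + 1], a[mid + 1:hi + 1]
    i = j = 0
    k = lo
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1
        k += 1
    while i < len(left):
        a[k] = left[i]; i += 1; k += 1
    while j < len(right):
        a[k] = right[j]; j += 1; k += 1

def merge_sort_seq(a, lo=0, hi=None):
    """Sequential baseline: recursive merge sort with an insertion sort cutoff."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= INSERTION_CUTOFF:
        insertion_sort_range(a, lo, hi)
        return
    mid = (lo + hi) // 2
    merge_sort_seq(a, lo, mid)
    merge_sort_seq(a, mid + 1, hi)
    merge_runs(a, lo, mid, hi)
</imports>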
Parallel thread-per-split
At each divide step, fork a new thread for the left half and compute the right half in the current thread; depth-limited to avoid thread explosion (2^d threads at depth d)
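A hedged Python sketch of the thread-per-split strategy (note that CPython's GIL prevents real CPU-bound speedup from threads, so this illustrates the structure of the strategy rather than a guaranteed win; `heapq.merge` stands in for a hand-written merge):

```python
import heapq
import threading

def threaded_merge_sort(a, depth=0, max_depth=3):
    """Thread-per-split sketch: the left half is sorted in a freshly forked
    thread while the current thread sorts the right half. Recursion past
    max_depth falls back to sequential sorted(), bounding forked threads
    at 2**max_depth - 1."""
    if len(a) <= 1:
        return list(a)
    if depth >= max_depth:
        return sorted(a)                      # depth limit hit: stay sequential
    mid = len(a) // 2
    left_out = []
    forked = threading.Thread(
        target=lambda: left_out.extend(
            threaded_merge_sort(a[:mid], depth + 1, max_depth)))
    forked.start()                            # fork: left half in a new thread
    right = threaded_merge_sort(a[mid:], depth + 1, max_depth)  # right half here
    forked.join()                             # wait for the left half
    return list(heapq.merge(left_out, right)) # combine the two sorted runs
```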
Thread pool merge sort
Submit sort tasks to a fixed-size ThreadPoolExecutor; avoids thread creation/destruction overhead; sequential fallback below threshold
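One safe way to express this with Python's `concurrent.futures.ThreadPoolExecutor` is to split the input into one chunk per worker, sort chunks in parallel, and merge the sorted runs pairwise; since every submitted task is independent, no worker ever blocks on another task's result (a sketch; names are illustrative):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

SEQ_THRESHOLD = 10_000   # sequential cutoff from the design above

def pool_merge_sort(a, workers=4):
    """Thread-pool merge sort: chunk, sort chunks in parallel, then run
    pairwise merge rounds until a single sorted run remains."""
    if len(a) <= SEQ_THRESHOLD:
        return sorted(a)                      # small input: skip the pool entirely
    chunk = -(-len(a) // workers)             # ceil division
    parts = [a[i:i + chunk] for i in range(0, len(a), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(sorted, parts))  # sort all chunks in parallel
        while len(runs) > 1:                  # pairwise merge rounds
            pairs = [(runs[i], runs[i + 1]) for i in range(0, len(runs) - 1, 2)]
            merged = list(pool.map(lambda p: list(heapq.merge(*p)), pairs))
            if len(runs) % 2:                 # odd run carries into next round
                merged.append(runs[-1])
            runs = merged
    return runs[0]
```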
Fork-Join merge sort
SortTask as RecursiveTask: fork() left subtask, compute() right in current thread, join() left; Java ForkJoinPool-inspired with work stealing
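Python has no direct `ForkJoinPool` equivalent, but the fork/compute/join shape can be mimicked with an executor (a sketch under that assumption; since `ThreadPoolExecutor` does not steal work, the pool must be sized so every forked task gets its own thread, otherwise a join inside a worker can deadlock):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def fj_sort(a, pool, depth=0, max_depth=2):
    """fork() left, compute() right, join() left -- the RecursiveTask shape.
    Forks happen only in the top max_depth levels, producing 2**max_depth - 1
    tasks; a pool of 2**max_depth workers therefore runs them all without
    any join blocking forever."""
    if depth >= max_depth or len(a) <= 1:
        return sorted(a)                      # sequential below the fork depth
    mid = len(a) // 2
    left = pool.submit(fj_sort, a[:mid], pool, depth + 1, max_depth)  # fork()
    right = fj_sort(a[mid:], pool, depth + 1, max_depth)              # compute()
    return list(heapq.merge(left.result(), right))                    # join() + merge
```

Java's real `ForkJoinPool` avoids the sizing constraint because an idle or joining worker steals queued subtasks; that is the property the design's work-stealing remark refers to.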
Depth-limited parallelism
Only fork new threads/tasks for the top max_depth levels of recursion; below that, fall back to sequential sort to avoid task overhead
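The thread bound can be made concrete with a small counting sketch (the function is illustrative, not part of the design): one fork per split in the top max_depth levels gives 2^max_depth − 1 forked threads or tasks in total for a large enough input.

```python
def forks_spawned(n, depth=0, max_depth=3):
    """Count the threads/tasks a depth-limited sort would fork: one per split
    in the top max_depth levels, i.e. 2**max_depth - 1 for large n."""
    if depth >= max_depth or n <= 1:
        return 0                              # below the limit: sequential, no fork
    half = n // 2
    return (1                                 # the fork at this split
            + forks_spawned(half, depth + 1, max_depth)
            + forks_spawned(n - half, depth + 1, max_depth))
```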
Sequential threshold cutoff
Below a configurable threshold (e.g., 10,000 elements), sort sequentially to avoid task submission overhead exceeding parallel benefit
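The cutoff decision can be sketched as a dispatch guard (illustrative names; this sketch assumes the pool has enough idle workers for the nested joins, since `ThreadPoolExecutor` does not steal work):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

SEQ_THRESHOLD = 10_000   # the design's example value; tune per machine

def sort_with_cutoff(a, pool):
    """Below the threshold, task-submission overhead outweighs any parallel
    gain, so sort inline; above it, offload the left half to the pool."""
    if len(a) <= SEQ_THRESHOLD:
        return sorted(a)                      # inline: no submission overhead
    mid = len(a) // 2
    left = pool.submit(sort_with_cutoff, a[:mid], pool)
    right = sort_with_cutoff(a[mid:], pool)
    return list(heapq.merge(left.result(), right))
```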
Insertion sort for small subarrays
Switch to insertion sort for n ≤ 32; better cache locality and constant-factor performance for small arrays
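For reference, the small-array routine itself is short (a standard in-place insertion sort; the ~32-element crossover point is the design's figure and varies by machine):

```python
def insertion_sort(a):
    """In-place insertion sort: quadratic compares, but a tight loop over a
    small contiguous slice, which in practice beats recursive merging for
    n up to around 32."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:          # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```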
Thread-safe metrics collection
Atomic counters track comparisons, threads used, and max recursion depth across concurrent tasks
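A possible shape for the shared metrics object (CPython has no atomic integer type, so a `threading.Lock` stands in for Java's `AtomicLong`/`AtomicInteger`; method names are illustrative):

```python
import threading

class SortMetrics:
    """Thread-safe counters shared by all sort variants."""
    def __init__(self):
        self._lock = threading.Lock()
        self.comparisons = 0
        self.threads_used = 0
        self.max_depth = 0

    def add_comparisons(self, n=1):
        with self._lock:                      # lock makes read-modify-write atomic
            self.comparisons += n

    def record_thread(self):
        with self._lock:
            self.threads_used += 1

    def record_depth(self, depth):
        with self._lock:                      # keep the running maximum
            self.max_depth = max(self.max_depth, depth)
```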
Benchmarking and correctness verification
Compare sequential vs parallel implementations on various array sizes; verify sorted output matches expected
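A minimal harness for this step might look like the following (a sketch: it expects a sorting callable that returns a new sorted list, and checks correctness against Python's built-in `sorted()`):

```python
import random
import time

def benchmark(sort_fn, sizes=(1_000, 50_000)):
    """Time a sorting callable on random input of each size, verify its
    output, and return {size: elapsed_seconds}."""
    results = {}
    for n in sizes:
        data = [random.randint(0, n) for _ in range(n)]
        expected = sorted(data)
        start = time.perf_counter()
        out = sort_fn(list(data))             # copy so sort_fn may mutate freely
        elapsed = time.perf_counter() - start
        assert out == expected, f"incorrect output for n={n}"
        results[n] = elapsed
    return results
```

Running it over all four variants at several sizes makes the crossover visible: small inputs favor the sequential baseline, while the parallel variants only pay off once the work per task dominates the coordination overhead.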
Before diving into code, clarify the use cases and edge cases. Understanding the problem deeply leads to better class design.
Identify the primary actions users will perform. For a parking lot: park vehicle, exit vehicle, check availability. Each becomes a method.
Who interacts with the system? Customers, admins, automated systems? Each actor type may need different interfaces.
What are the limits? Max vehicles, supported vehicle types, payment methods. Constraints drive your data structures.
What happens on overflow? Concurrent access? Payment failures? Thinking about edge cases reveals hidden complexity.
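As a sketch of how these questions shape a class, the parking-lot actions above might map onto methods like this (all names hypothetical, not from any specific design):

```python
class ParkingLot:
    """Illustrative skeleton: each user action becomes a method, and the
    capacity constraint drives the overflow check."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._occupied = {}                   # plate -> vehicle type

    def park_vehicle(self, plate, vehicle_type):
        if len(self._occupied) >= self.capacity:
            return False                      # overflow edge case: lot is full
        self._occupied[plate] = vehicle_type
        return True

    def exit_vehicle(self, plate):
        return self._occupied.pop(plate, None)

    def check_availability(self):
        return self.capacity - len(self._occupied)
```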