Java's Fork/Join Framework (introduced in Java 7) is a production-ready implementation of the fork-join parallel programming model. It provides a high-performance work-stealing thread pool optimized for divide-and-conquer algorithms.
The framework elegantly combines the theoretical concepts we've studied—fork operations, join synchronization, task decomposition, and work-stealing load balancing—into a practical API used in countless production applications, including Java's parallel streams.
By the end of this page, you will understand: ForkJoinPool architecture and configuration, RecursiveTask vs RecursiveAction, the work-stealing algorithm, best practices and common pitfalls, integration with parallel streams, and performance tuning strategies.
ForkJoinPool is the execution engine of the framework. Unlike a traditional thread pool that uses a single shared queue, ForkJoinPool gives each worker thread its own deque (double-ended queue) and implements work-stealing for load balancing.
Key Components:
```java
import java.util.concurrent.*;

public class ForkJoinPoolBasics {
    public static void main(String[] args) {
        // Option 1: Common pool (shared, default parallelism)
        ForkJoinPool commonPool = ForkJoinPool.commonPool();
        System.out.println("Common pool parallelism: " + commonPool.getParallelism());

        // Option 2: Custom pool with specific parallelism
        ForkJoinPool customPool = new ForkJoinPool(4);

        // Option 3: Fully configured pool
        ForkJoinPool configuredPool = new ForkJoinPool(
            Runtime.getRuntime().availableProcessors(),      // parallelism
            ForkJoinPool.defaultForkJoinWorkerThreadFactory, // factory
            null,                                            // UncaughtExceptionHandler
            true      // asyncMode (FIFO scheduling for event-style tasks)
        );

        // Submitting tasks
        ForkJoinTask<Long> task = customPool.submit(() -> computeSum(1_000_000));

        try {
            Long result = task.get();  // Blocking wait
            System.out.println("Result: " + result);
        } catch (Exception e) {
            e.printStackTrace();
        }

        customPool.shutdown();
    }

    static long computeSum(int n) {
        return (long) n * (n + 1) / 2;
    }
}
```

| Parameter | Default | Purpose |
|---|---|---|
| parallelism | Runtime.availableProcessors() | Number of worker threads |
| factory | defaultForkJoinWorkerThreadFactory | Creates worker threads |
| handler | null | Handles uncaught exceptions |
| asyncMode | false | LIFO (false) vs FIFO (true) local task order |
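Beyond construction, the pool offers three ways to hand it work. The sketch below (class and method names are ours) contrasts `invoke()`, which blocks the caller until the result is ready, with `submit()`, which returns a `ForkJoinTask` (a `Future`) you can `join()` later; `execute()` is fire-and-forget with no result handle:

```java
import java.util.concurrent.*;

public class ForkJoinSubmitDemo {

    // A tiny RecursiveTask so the demo is self-contained
    static class Square extends RecursiveTask<Integer> {
        final int n;
        Square(int n) { this.n = n; }
        @Override protected Integer compute() { return n * n; }
    }

    public static int invokeStyle() {
        // invoke(): submits and blocks until the result is ready
        return ForkJoinPool.commonPool().invoke(new Square(6));
    }

    public static int submitStyle() {
        // submit(): returns immediately with a ForkJoinTask;
        // call join() (or get()) later for the result
        ForkJoinTask<Integer> task = ForkJoinPool.commonPool().submit(new Square(7));
        return task.join();
    }

    public static void main(String[] args) {
        System.out.println("invoke: " + invokeStyle());  // 36
        System.out.println("submit: " + submitStyle());  // 49

        // execute(): fire-and-forget, no result handle
        ForkJoinPool pool = new ForkJoinPool(2);
        pool.execute(new Square(8));
        pool.shutdown();
    }
}
```

Prefer `invoke()` at the top level of a divide-and-conquer computation; `submit()` is useful when you want to overlap other work before collecting the result.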
The framework provides two abstract classes for defining fork-join tasks: RecursiveTask&lt;V&gt; for computations that return a result, and RecursiveAction for those that do not.
Both require implementing the compute() method, which contains your parallel algorithm logic.
```java
import java.util.concurrent.*;

/**
 * RecursiveTask example: Parallel array sum with result
 */
public class ParallelSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000;
    private final long[] array;
    private final int lo, hi;

    public ParallelSum(long[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        // Base case: small enough for sequential
        if (hi - lo <= THRESHOLD) {
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += array[i];
            }
            return sum;
        }

        // Recursive case: divide and conquer
        int mid = (lo + hi) / 2;
        ParallelSum left = new ParallelSum(array, lo, mid);
        ParallelSum right = new ParallelSum(array, mid, hi);

        // Fork left, compute right directly
        left.fork();
        Long rightResult = right.compute();
        Long leftResult = left.join();

        return leftResult + rightResult;
    }

    public static void main(String[] args) {
        long[] data = new long[100_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;

        long start = System.nanoTime();
        Long result = ForkJoinPool.commonPool()
            .invoke(new ParallelSum(data, 0, data.length));
        long elapsed = System.nanoTime() - start;

        System.out.printf("Sum: %d in %.2f ms%n", result, elapsed / 1_000_000.0);
    }
}
```
```java
import java.util.concurrent.*;

/**
 * RecursiveAction example: In-place parallel transform (no return)
 */
public class ParallelTransform extends RecursiveAction {
    private static final int THRESHOLD = 10_000;
    private final double[] array;
    private final int lo, hi;

    public ParallelTransform(double[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            // Sequential: transform in place
            for (int i = lo; i < hi; i++) {
                array[i] = Math.sqrt(array[i]) * Math.sin(array[i]);
            }
            return;
        }

        int mid = (lo + hi) / 2;

        // invokeAll forks all tasks and joins all
        invokeAll(
            new ParallelTransform(array, lo, mid),
            new ParallelTransform(array, mid, hi)
        );
    }
}
```

Notice that ParallelSum calls fork() on the left task but compute() directly on the right. This keeps the current thread working rather than forking both subtasks and waiting. This "fork one, compute the other" pattern is a key optimization: always compute one subtask directly. ParallelTransform gets the same effect from invokeAll, which forks all but one of its arguments and computes the remaining one in the calling thread.
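To make the fork-once point concrete, here is a sketch (class name and `forkBoth` flag are ours) that runs the same sum both ways. Both variants produce the same answer, but the fork-both variant parks the calling thread at the first join instead of letting it compute a half itself:

```java
import java.util.concurrent.*;

public class ForkOnceDemo extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] a;
    private final int lo, hi;
    private final boolean forkBoth;

    public ForkOnceDemo(long[] a, int lo, int hi, boolean forkBoth) {
        this.a = a; this.lo = lo; this.hi = hi; this.forkBoth = forkBoth;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += a[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        ForkOnceDemo left = new ForkOnceDemo(a, lo, mid, forkBoth);
        ForkOnceDemo right = new ForkOnceDemo(a, mid, hi, forkBoth);
        if (forkBoth) {
            // Anti-pattern: fork both halves, then wait; the calling thread
            // does no summing itself, and every level pays extra scheduling cost
            left.fork();
            right.fork();
            return left.join() + right.join();
        }
        // Fork-once: hand off one half, compute the other in this thread
        left.fork();
        long rightResult = right.compute();
        return rightResult + left.join();
    }

    public static long sum(long[] a, boolean forkBoth) {
        return ForkJoinPool.commonPool()
            .invoke(new ForkOnceDemo(a, 0, a.length, forkBoth));
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        System.out.println(sum(data, true));   // same result, more overhead
        System.out.println(sum(data, false));  // same result, less overhead
    }
}
```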
Work-stealing is the load-balancing magic that makes fork-join efficient. Each worker maintains a deque of tasks:
Local Operations (LIFO - last-in-first-out):

- fork(): push the task onto the bottom of the worker's own deque
- When the worker needs its next task, it pops from that same end, so it always resumes the most recently created (smallest, cache-warm) subtask

Stealing Operations (FIFO - first-in-first-out):

- An idle worker picks a random victim and steals from the opposite end (the top) of the victim's deque
- That end holds the oldest, and therefore typically the largest, subtasks, so a single steal transfers a big chunk of work and steals stay infrequent
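The deque discipline above can be sketched single-threadedly with ArrayDeque. This is a toy model of the two access patterns, not the pool's real implementation (which uses lock-free array-based deques):

```java
import java.util.ArrayDeque;

public class WorkStealingSketch {
    // Toy model: the owner pushes and pops at one end (LIFO);
    // thieves take from the opposite end (FIFO).
    private final ArrayDeque<String> deque = new ArrayDeque<>();

    public void fork(String task) { deque.addLast(task); }      // owner pushes
    public String popLocal()      { return deque.pollLast(); }  // owner: newest task
    public String steal()         { return deque.pollFirst(); } // thief: oldest task

    public static void main(String[] args) {
        WorkStealingSketch worker = new WorkStealingSketch();
        worker.fork("whole-array");   // oldest = biggest subproblem
        worker.fork("left-half");
        worker.fork("left-quarter");  // newest = smallest, cache-warm

        System.out.println(worker.popLocal()); // left-quarter (LIFO)
        System.out.println(worker.steal());    // whole-array (FIFO)
    }
}
```

The asymmetry is the point: the owner keeps working on small, recently created subtasks, while a thief walks away with the oldest entry, the largest remaining subproblem.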
Following best practices ensures you get maximum performance from the Fork/Join framework.
```java
import java.util.concurrent.*;

/**
 * Handling blocking operations in Fork/Join tasks.
 * Use ManagedBlocker to prevent thread starvation.
 */
public class BlockingExample {

    // BAD: Direct blocking starves the pool
    static void badBlocking() {
        ForkJoinPool.commonPool().invoke(new RecursiveAction() {
            @Override
            protected void compute() {
                try {
                    Thread.sleep(1000); // BAD: Blocks worker thread
                } catch (InterruptedException e) {}
            }
        });
    }

    // GOOD: Use ManagedBlocker for necessary blocking
    static void goodBlocking() {
        ForkJoinPool.commonPool().invoke(new RecursiveAction() {
            @Override
            protected void compute() {
                try {
                    ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
                        boolean done = false;

                        @Override
                        public boolean block() throws InterruptedException {
                            Thread.sleep(1000); // Pool can compensate
                            done = true;
                            return true;
                        }

                        @Override
                        public boolean isReleasable() {
                            return done;
                        }
                    });
                } catch (InterruptedException e) {}
            }
        });
    }
}
```

Java 8's parallel streams are built on top of the Fork/Join framework. When you call .parallel() on a stream, operations are automatically decomposed and executed on the common ForkJoinPool.
```java
import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;

public class ParallelStreamsExample {
    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 10_000_000)
            .boxed()
            .collect(Collectors.toList());

        // Sequential stream
        long seqStart = System.nanoTime();
        long seqSum = numbers.stream()
            .mapToLong(i -> i)
            .sum();
        long seqTime = System.nanoTime() - seqStart;

        // Parallel stream - uses ForkJoinPool.commonPool()
        long parStart = System.nanoTime();
        long parSum = numbers.parallelStream()
            .mapToLong(i -> i)
            .sum();
        long parTime = System.nanoTime() - parStart;

        System.out.printf("Sequential: %d in %.2f ms%n", seqSum, seqTime / 1e6);
        System.out.printf("Parallel:   %d in %.2f ms%n", parSum, parTime / 1e6);
        System.out.printf("Speedup: %.2fx%n", (double) seqTime / parTime);

        // Custom pool for parallel stream (if needed)
        ForkJoinPool customPool = new ForkJoinPool(2);
        try {
            long result = customPool.submit(() ->
                numbers.parallelStream()
                    .mapToLong(i -> i)
                    .sum()
            ).get();
            System.out.println("Custom pool result: " + result);
        } catch (Exception e) {
            e.printStackTrace();
        }
        customPool.shutdown();
    }
}
```

Parallel streams add overhead. Avoid them for small collections (under roughly 10,000 elements), I/O-bound operations, operations with side effects, or operations that aren't CPU-intensive. Profile before parallelizing; sequential is often faster for simple operations.
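One caveat above, operations with side effects, is easy to demonstrate. In this sketch (class and method names are ours), the safe version lets collect() merge per-thread partial results, while the unsafe version mutates a shared ArrayList from many pool workers at once:

```java
import java.util.*;
import java.util.stream.*;

public class SideEffectPitfall {

    // Safe: collect() builds per-thread partial lists and merges them,
    // so no thread ever mutates shared state
    public static List<Integer> safeSquares(int n) {
        return IntStream.range(0, n)
            .parallel()
            .map(i -> i * i)
            .boxed()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // UNSAFE: ArrayList is not thread-safe; concurrent add() calls from
        // the common pool's workers can drop elements or throw
        List<Integer> unsafe = new ArrayList<>();
        try {
            IntStream.range(0, 100_000).parallel().forEach(i -> unsafe.add(i * i));
            System.out.println("unsafe size (often < 100000): " + unsafe.size());
        } catch (RuntimeException e) {
            System.out.println("unsafe version threw: " + e.getClass().getSimpleName());
        }

        List<Integer> safe = safeSquares(100_000);
        System.out.println("safe size: " + safe.size()); // always 100000
    }
}
```

collect() also preserves encounter order even on a parallel stream, so the safe result is identical to the sequential one.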
You now understand Java's Fork/Join framework—from ForkJoinPool architecture to RecursiveTask implementation. You can write efficient parallel algorithms using this production-ready framework. Next, we'll explore parallel divide-and-conquer algorithms in depth.