In Java, an ExecutorService manages a pool of threads that can run tasks. Most ExecutorServices treat all tasks the same. Somebody hands it something to do, the ExecutorService parcels it out to a thread, the thread runs it. Next!
A ForkJoinPool is an ExecutorService that recognizes explicit dependencies between tasks. It is designed for the kind of computation that wants to run in parallel, and then maybe more parallel, and then some of those can run in parallel too. Results of the parallel computations are then combined, so it's like the threads want to split off and then regroup.
Maybe it's a computation like, "What's the shortest path of followers from me to @richhickey?" or "What is the total size of all files in all directories?" or "What's the most relevant ad to display to this customer?" where we don't know what all we're going to have to execute until we're in the middle of executing it.
On an ordinary ExecutorService, when we split a computation up, each task goes its separate way. Each one is allocated to a thread individually. This becomes a problem when the tasks are small, and the overhead of allocating them to threads takes longer than running them. It becomes a bigger problem when threads split off tasks and wait for all the results to come back to combine them: pretty soon so many threads are waiting that there are no more threads to do the work. This can reach deadlock.
ForkJoinPool embraces many small computations that spawn off and then come back together. It says, "When my thread wants to split its work into many small computations, it shall create them, and then start working on them. If another thread wants to come along and help, great."
That's a completely unrealistic scenario in my household, but ForkJoinPools are more disciplined than my children. They support unpredictable parallel computation, preventing pool-induced deadlock, and minimize the work of switching back and forth between threads on the CPU.
What's not to love? Well, a ForkJoinPool is harder to use than a regular old ExecutorService. It's more complicated than calling "submit." External threads submit jobs to a ForkJoinPool in an ordinary way, but within the pool, tasks are created differently. ForkJoinTask subclasses get constructed, forked off for execution, and then joined. It's custom handling, and that requires planning ahead, and that means you have to guess that ForkJoinPool is the solution before you start coding. Or retrofit it later.
Scala does something clever to hide the difference between ForkJoinPools and regular ExecutorServices, so that its Futures work this way by default. Akka uses ForkJoinPools behind the scenes for its actor messaging. Clojure uses ForkJoinPools in its collection processing with Reducers. In Scala and Clojure, you can get these optimizations without extra headache. The abstractions, they keep getting deeper!
Doug Lea wrote ForkJoin for Java and Scala. http://gee.cs.oswego.edu/dl/papers/fj.pdf