Sunday, February 2, 2014

Scala: the global ExecutionContext makes your life easier

TL;DR - when in doubt, stick with scala.concurrent.ExecutionContext.global

When you want to run some asynchronous code, choosing a thread pool isn't any fun. Scala has your back, with its global ExecutionContext.

When I try to put some code in a Future without specifying where to run it, Scala tells me what to do:

scala> Future(println("Do something slow"))
<console>:14: error: Cannot find an implicit ExecutionContext, either require one yourself or import ExecutionContext.Implicits.global
      
There are some good reasons to use that recommended ExecutionContext. It tries to do things right in several ways. See below for how you can help it along.
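Here's a minimal sketch of both options the compiler offers. The object and method names (GlobalContextExample, slowSquare, slowSquareWith) are made up for illustration; only the import and the Future/Await calls come from the standard library.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object GlobalContextExample {
  // Option 1: import the global context so it is found implicitly.
  def slowSquare(n: Int): Future[Int] = {
    import scala.concurrent.ExecutionContext.Implicits.global
    Future {
      Thread.sleep(100) // stand-in for something slow
      n * n
    }
  }

  // Option 2: "require one yourself" and let the caller decide where this runs.
  def slowSquareWith(n: Int)(implicit ec: ExecutionContext): Future[Int] =
    Future(n * n)

  def main(args: Array[String]): Unit = {
    println(Await.result(slowSquare(7), 5.seconds))
    println(Await.result(slowSquareWith(7)(ExecutionContext.global), 5.seconds))
  }
}
```

The second form is the friendlier one for library code, since it doesn't decide on a thread pool for the caller.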

The global ExecutionContext aims to keep your CPU busy while limiting the time spent context switching between threads. To do this, it starts up a ForkJoinPool[3] whose desired degree of parallelism is the number of CPUs.[1]
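As a rough sketch, this is how you might check the CPU count the JVM reports and override the default parallelism using the system property from footnote [1]. The object name ParallelismExample is hypothetical, and I'm assuming the property is read only when the global context is first initialized, so it has to be set before anything touches it (passing -Dscala.concurrent.context.numThreads=x2 on the JVM command line is the safer route).

```scala
import scala.concurrent.ExecutionContext

object ParallelismExample {
  // The number of CPUs the JVM sees; the default parallelism is based on this.
  def cpuCount: Int = Runtime.getRuntime.availableProcessors()

  def main(args: Array[String]): Unit = {
    // Assumption: this must run before the global context is first used.
    // "x2" asks for twice the number of CPUs; a plain number like "4" is a hard count.
    System.setProperty("scala.concurrent.context.numThreads", "x2")

    val ec = ExecutionContext.global // first use; the pool is configured here
    ec.execute(new Runnable {
      def run(): Unit = println(s"Running on ${Thread.currentThread.getName}")
    })

    println(s"CPUs reported by the JVM: $cpuCount")
    Thread.sleep(200) // global threads are daemons, so give the task a moment to print
  }
}
```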

ForkJoinPool is particularly smart, able to run small computations with less overhead. It's more work for its users, who must implement each computation inside a ForkJoinTask. Scala's global ExecutionContext takes this burden from you: any task submitted to the global context from within the global context is quietly wrapped in a ForkJoinTask.

But wait, there's more! We also get special handling for blocking tasks. Scala's Futures resist supplying their values unless you pass them to Await.result(). That's because Future knows that its result may not be available yet, so this is a blocking call. The Await object wraps the access in scala.concurrent.blocking { ... }, which passes the code on to BlockContext.

The BlockContext object says, "Hey, current Thread, do you have anything special to do before I start this slow thing?" and the special thread created inside the global ExecutionContext says, "Why yes! I'm going to tell the ForkJoinPool about this."

The thread's block context defers to the managedBlock method in ForkJoinPool, which activates the ForkJoinPool's powers of compensation. ForkJoinPool is trying to keep the CPU busy by keeping degree-of-parallelism threads computing all the time. When informed that one of those threads is about to block, it compensates by starting an additional thread. This way, while your thread is sitting around, a CPU doesn't have to. As a bonus, this prevents pool-induced deadlock.

In this way, Scala's Futures and its global ExecutionContext work together to keep your computer humming without going Thread-wild. You can invoke the same magic yourself by wrapping any Thread-hanging code in blocking { ... }.[2]
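Here's a small sketch of that, assuming the global context and a Thread.sleep standing in for the thread-hanging call. The names BlockingExample and slowCall are mine, not part of any library. It starts twice as many blocking tasks as there are CPUs and reports how many distinct threads ended up running them; the exact number depends on your machine and Scala version, so I'm not promising any particular output.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object BlockingExample {
  // Wrapping the slow part in blocking { ... } tells the current thread's
  // block context that this thread is about to sit idle for a while.
  def slowCall(): Future[String] = Future {
    blocking {
      Thread.sleep(200) // stand-in for slow I/O or a lock wait
    }
    Thread.currentThread.getName
  }

  def main(args: Array[String]): Unit = {
    val tasks = 2 * Runtime.getRuntime.availableProcessors()
    // List.fill re-evaluates slowCall() for each element, so this starts `tasks` futures.
    val all = Future.sequence(List.fill(tasks)(slowCall()))
    val threadNames = Await.result(all, 30.seconds)
    println(s"${threadNames.size} tasks ran on ${threadNames.distinct.size} distinct threads")
  }
}
```

Try removing the blocking { ... } wrapper and running it again to compare the thread counts.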

All this makes scala.concurrent.ExecutionContext.global an excellent general-purpose ExecutionContext.

When should you not use it? When you're writing an asynchronous library, or when you know you're going to do a lot of blocking, declare your own thread pool. Leave the global one for everyone else.
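One way to declare your own pool, sketched with a fixed-size java.util.concurrent thread pool. The helper withBlockingPool and the pool size of 8 are assumptions for illustration; pick a size that fits your blocking workload.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object DedicatedPoolExample {
  // Runs `body` on a dedicated pool and shuts the pool down afterwards,
  // so the global context is left alone for everyone else.
  def withBlockingPool[A](threads: Int)(body: ExecutionContext => Future[A]): A = {
    val pool: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(threads))
    try Await.result(body(pool), 30.seconds)
    finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    val answer = withBlockingPool(threads = 8) { implicit ec =>
      Future {
        Thread.sleep(100) // stand-in for a blocking call
        42
      }
    }
    println(answer)
  }
}
```

In a library you would more likely take the ExecutionContext as an implicit parameter and let the application own the pool's lifecycle; this sketch just shows the pool being created and shut down in one place.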

----------
[1] You can alter this: set scala.concurrent.context.numThreads to a hard number or to a multiple, such as "x2" for double your CPUs. The documentation is the source code.
[2] Here's some code that illustrates using blocking { ... } to get the global ExecutionContext to start extra threads.
[3] Scala has its own ForkJoinPool implementation, because Java doesn't get it until 7, and Scala runs on Java 6 or higher.

4 comments:

  1. You need to switch to comparing against Java 8. It goes GA in March, has CompletableFuture, and uses the common ForkJoinPool.

    http://download.java.net/jdk8/docs/api/java/util/concurrent/CompletableFuture.html

  2. Thanks for this post, that dang global ExecutionContext was biting me bad!

  3. How about using something like `ExecutionContext.fromExecutor(new ForkJoinPool)`? The constructor of `ForkJoinPool` seems to create a pool identical to the default one when called without parameters, but this way you get, for example, full lifecycle control over your pool.

    Replies
    1. The ForkJoinPool created there won't have the extra magic of Scala's global one, which converts blocking{...} code and other submitted work into ForkJoin tasks.
      If you care about thread pool lifecycle, and are paying attention to using the ForkJoinPool, then your suggestion is the right way.
