When you need an ExecutorService, the Executors class has several for you. Sometimes it doesn't matter which you choose, and other times it matters a LOT. The above flow chart is approximate. If it's simple parallel computation you're doing, then a fixed pool with as many threads as CPUs works. If those computations start other computations, then you'll want a ForkJoinPool (that's another post). If the purpose of threading is to avoid blocking on I/O, that's different. Maybe you don't want to limit the number of threads that can wait on I/O, and then Executors.newCachedThreadPool is a good choice.
When you would like to limit the number of threads you start AND your tasks might start other tasks in the same thread pool, then you must worry about Pool Induced Deadlock. Then you need to think about what happens when one task needs to start another. It's time to get into the nitty-gritty of a ThreadPoolExecutor. This is what the methods on Executors construct for you under the hood, and you can make your own for finer control. To choose them, you need to understand what the ThreadPoolExecutor does when a new task comes in.
Here's a nice new ThreadPoolExecutor.
Some tasks come in, Runnables passed through execute or submit. At first the ThreadPoolExecutor happily starts a new thread per task. It does this up to the core pool size. Note that even if a thread is idle, it'll still start a new thread per task until the core number of threads are running. These are up until the pool shuts down, unless you configure it otherwise.
If all the core threads are busy, then the ThreadPoolExecutor will begin storing Runnables in its queue. The BlockingQueue passed in at construction can have capacity 0, MAX_INT, or anywhere in between. Note that Runnables are stored here even if the maximum number of threads are not running. This can cause deadlock if the tasks in the core threads are waiting on tasks they submitted.
Only if the queue is full will the ThreadPoolExecutor start more than the core number of threads. It'll start them up to the maximum pool size, only as long as the queue is full and more tasks are coming in.
Finally, if there's no more room in the queue and no more threads allowed, submitting a task to this ThreadPoolExecutor will throw RejectedExecutionException. (You can configure it to drop work, or to tell the calling thread to do its own task instead.)
If you need something in between, you can construct it yourself. The fixed thread pool is good to put a limit on I/O or CPU context switches. The cached one avoids pool-induced deadlock. If you're doing interesting recursive calculations, then look into ForkJoinPool.
Aside: All of these ExecutorServices will, by default, start Threads that keep your application alive until they're done. If you don't like that, build a ThreadFactory that sets them up as daemon threads, and pass it in when you create the ExecutorService.
Bonus material: Scala REPL code that I used to poke around in the REPL and check these out.
If you're in Scala, a great way to create a thread pool is Akka's ThreadPoolBuilder.
Example: The thermometer-looking drawings represent tasks either in process or queued inside the ExecutorService. Red and green represent tasks in process, and amber ones are sitting in the queue. As tasks are completed, the level drops. As tasks are submitted to the ExecutorService, the level goes up.
5, // core pool size
8, // max pool size, for at most (8 - 5 = 3) red threads
10, TimeUnit.SECONDS, // idle red threads live this long
new ArrayBlockingQueue(7)); // queue with finite maximum size
Assuming no tasks complete yet: The first 5 tasks submitted will start up threads; the next 7 tasks will queue; the next 3 tasks coming in will cause more threads to be started. (Tasks will be pulled from the front of the queue and put on the new threads, so the last tasks submitted can fit at the back of the queue.) Any more submission attempts throw RejectedExecution exception.
 Core threads will be closed when idle only if you set that up: threadPoolExecutor.allowCoreThreadTimeout(true)
 Do this by supplying a different RejectedExecutionHandler in the constructor.