Testing akka actor termination

When testing akka code, I want to make sure a particular actor gets shut down within a time limit. I used to do it like this:

 Thread.sleep(2.seconds)
 assertTrue(actorRef.isTerminated())

That isTerminated method is deprecated since Akka 2.2, and good thing too, since my test was wasting everyone’s time. Today I’m doing this instead:

import akka.testkit.TestProbe

val probe = new TestProbe(actorSystem)
probe.watch(actorRef)
probe.expectMsgPF(2.seconds){ case Terminated(actorRef) => true }

This says: set up a TestProbe actor, and have it watch the actorRef of interest. Wait for the TestProbe to receive notification that the actor of interest has been terminated. If actorRef has already terminated, that message will come right away. My test doesn’t have to wait the maximum allowed time.[1]

This works in any old test method with access to the actorSystem — I don’t have to extend akka.testkit.TestKit to use the TestProbe.

BONUS: In a property-based test, I don’t want to throw an exception, but rather return a result, a property with a nice label. In that case my function gets a little weirder:

def shutsDown(actorSystem: ActorSystem, 
              actorRef: ActorRef): Prop = {
  val maxWait = 2.seconds
  val probe = new TestProbe(actorSystem)
  probe.watch(actorRef)
  try {
   probe.expectMsgPF(maxWait){case Terminated(actorRef) => true }
  } catch { 
   case ae: AssertionError => 
    false 😐 s”actor not terminated within $maxWait
  }
}

———–
[1] This is still blocking the thread until the Terminated message is received or the timeout expires. I eagerly await the day when test methods can return a Future[TestResult].

Weakness and Vulnerability

Weakness and vulnerability are different. Separate the concerns: [1]

Vulnerability is an openness to being wounded.
Weakness is inability to live through wounds.

In D&D terms: vulnerability is a low armor class, weakness is low hit points. Armor class determines how hard it is for an enemy to hit you, and hit points determine how many hits you can take. So you have a choice: prevent hits, or endure more hits.

If you try to make your software perfect, so that it never experiences a failure, that’s a high armor class. That’s aiming for invulnerability.

Thing is, in D&D, no matter how high your armor class, if the enemy makes a perfect roll (a 20 on a d20, a twenty-sided die), that’s a critical hit and it strikes you. Even if your software is bug-free, hardware goes down or misbehaves.

If you’ve spent all your energy on armor class and little on hit points, that single hit can kill you.

Embrace failure by letting go of ideal invulnerability, and think about recovery instead. I could implement signal handlers, and maintain them, and this is a huge pain and makes my code ugly. Or I could implement a separate cleanup mechanism for crashed processes. That’s a separation of concerns, and it’s more robust: signal handlers don’t help when the app is out of memory, a separate recovery does.

In the software I currently work on, I take the strategy of building safety nets at the application, process, subsystem, and module levels, as feasible.[3] Then while I try to get my code right, I don’t convolute my code looking for hardware and network failures, bad data and every error I can conceive. There are always going to be errors I don’t conceive. Fail gracefully, and pick up the pieces.

—–
An expanded version of this post, adding the human element, is on True in Software, True in Life.

—–
[1] Someone tweeted a quote from some book on this, on the difference between weakness and vulnerability, a few weeks ago and it clicked with me. I can’t find the tweet or the quote anymore. Anyone recognize this?
[3] The actor model (Akka in my case) helps with recovery. It implements “Have you restarted your computer?” at the small scale.

Testing in Akka: sneaky automatic restarts

Restarts are awesome when stuff fails and you want it to work. Akka does this by default for every actor, and that’s great in production. In testing, we’re looking for failure. We want to see it, not hide it. In testing, it’s sneaky of Akka to restart our whole actor hierarchy when it barfs.

Change this sneaky behavior in actor system configuration by altering the supervisor strategy of the guardian actor. The guardian is the parent of any actor created with system.actorOf(…), so its supervision strategy determines what happens when your whole hierarchy goes down.

Then, in your test’s assertions, make sure your actor hasn’t been terminated. If the guardian has a StoppingSupervisionStrategy and your actor goes down, it’ll stay down.

Code example in this gist.

Testing akka: turn on debug

Testing is easier when you can see what messages are received by your actors. To do this, add logging to your real code, and then change your configuration. This post shows how to hard-code some debugging configuration for testing purposes.

Step 1: put logging into your receive method, by wrapping your existing receive definition in LoggingReceive:

import akka.event.LoggingReceive
def receive = LoggingReceive({ case _ => “your code here” })

Step 2: In actor system creation, put some hard-coded configuration in front of the default configuration.

val config: Config = ConfigFactory.parseString(“””akka {
         loglevel = “DEBUG”
         actor {
           debug {
             receive = on
             lifecycle = off
           }
         }
       }”””).withFallback(ConfigFactory.load())
val system = ActorSystem(“NameOfSystem”, config)

Now when you run you’ll see messages like:

[DEBUG] [01/03/2014 16:41:57.227] [NameOfSystem-akka.actor.default-dispatcher-4] [akka://Seqosystem/user/$c] received handled message Something

Step 3: This is a good time to add names for your actors at creation, so you’ll see them in the log. 

val actor = system.actorOf(Props(new MyActor()), “happyActor”)

[DEBUG] [01/03/2014 16:41:57.227] [NameOfSystem-akka.actor.default-dispatcher-4] [akka://Seqosystem/user/happyActor] received handled message Something

Now at least you have some indication of messages flying around during your tests. Do you have any other pointers for debugging actor systems?