a Rug story: adding test cases

These days I work on Rug, Atomist’s library for coding code modifications.

Adding a feature, I start by creating a test. While it’s tempting to create a narrow test around the piece of code I want to change, it’s better to create an API-level test. Testing at the outside has a few benefits: it tells the story of why this feature is needed; it drives pleasing API design; and it places minimum constraints on the implementation. The cost is, it’s more work.

The API level of Rug is in TypeScript, where people write programs to modify other programs. The test compiles the TypeScript to JavaScript, and rug executes that inside the JVM, where our Scala code does the tricky work of implementing the Rug programming model — navigating the project, parsing code, and making atomic modifications (it all works or none of it is saved). This means that my API-level tests include a TypeScript file and a Scala file, plus a bunch of wiring to hook them together. I get tired of remembering how to do this. Plus, we’re constantly improving the programming model in TypeScript, so “the right way” is a shifting target.

Last year, I would have copied an existing test (which one is up to date? I don’t know! guess and hope it works), modified parts of it for my needs (and forgotten some), embedded the TypeScript code as a string in Scala (seems easier than making a .ts file), and tried to abstract away some of the repetitive bits that are shared between tests (even though that obscures the storyline of the test).

This year, I have a new tool. About the third time I needed a new test, I wrote a program to create it for me. I wrote a Rug! My AddTypeScriptTest Rug editor creates a new TypeScript file in test/resources, and a new Scala file in test/scala. It bases these off of sample files that exemplify the current standard in Rugs and their tests, performing all the modifications that I mess up in the copy-paste-modify strategy.

me:

rug edit -l AddTypeScriptTest class_under_test=com.atomist.rug.NewFeature

my Rug program:

  • copies SampleTypeScriptTest.scala to a new location. Changes the package name, the class name, and the location of the TypeScript file it will load.
  • copies SampleTypeScriptTest.ts to a new location. Changes the name of the class and the exported instance.

SampleTypeScriptTest.scala and SampleTypeScriptTest.ts form a real test in rug’s test suite, so I know that my baseline continues to work. When I update the style of them (as I did today), I can run the sample test to be sure it works (caught two errors today). I maximize their design to best tell the story of how rug goes from a TypeScript file to a Rug archive to running that program on a separate project and seeing the results. This helps people spinning up on Rug understand it. Repetition (of the Scala package name and the path to the test program, for instance) doesn’t hurt because a program is modifying them consistently (bonus: IntelliJ will ctrl-click into the referenced file on the classpath. It didn’t when that repetition was abstracted). If I want to change the way all these tests work, I can do that with a Rug editor too, since they’re consistent. Ahhhh the consistency: when a test breaks, and it looks exactly like the other tests except for meaningful differences, debugging is easier.

I created this Rug editor inside the rug project itself, since it’s only relevant to this particular project. Then I run the rug CLI in local mode, on the local project, and poof. I’ve used rug to modify rug using a Rug inside rug. Super meta! (It doesn’t have to be so incestuous. Other days, I use rug to modify any project using a Rug in any Rug archive.)

If you want to create a Rug to automate your own frequent tasks, install the Rug CLI and, from your project root, use this Rug:

rug edit atomist-rugs:rug-editors:AddLocalEditor editorName=WhatDoYouWantToCallIt

Find your starting point in .atomist/editors/WhatDoYouWantToCallIt.ts

Pop into Atomist community slack with questions and we will be soooo happy to help you.

Scala Maven rugs

“Add a pom to my toy Scala project so I can build it with Maven” sounds simple. It is, if you do it every day. I can look it up, yet again. And get it wrong, yet again. And consult an expert with “Why doesn’t it see my Scala sources! This works fine from the IDE.” And get the answer, and then forget it again.

This time I encoded the answer into a Rug. Now I can run this program to add the pom.xml to toy Scala projects. While I was at it, I encoded how to add Scalatest, so I can stop looking that up over and over.

These rugs aren’t published. To use them, clone the repository and run rug in local mode. That repository has collected three related Rugs now; it seems worthwhile to publish them as a group. Then nobody else would have to look this up again either! I might do this … after I create an Atomist executor to move Rugs from one repository to another. (I have a feeling I’m going to do a lot of this while working on Atomist. Automate all the things!!)

That good feeling that I get from encoding this knowledge so that I don’t have to look it up again is deceptive. Version numbers increment, practices evolve. This carefully encoded knowledge grows stale.

Publishing the Rugs is a responsibility of ownership, of keeping them up to date. Only then do they shine with enduring value to the community. Am I ready to accept that obligation? Not today, because I’m waiting until I can automate that too. This day will come.

For now, I’m happy with making my near-future-self more productive — when it’s a step toward making oodles of developers more productive someday.

Scaling Intelligence

You can watch the full keynote from Scala eXchange 2015 (account creation required, but free). The talk includes examples and details; this post is a summary of one thread.

Scala is a scalable language, from small abstractions to large ones. This helps with the one scaling problem every software system has: scaling the feature set while still fitting it in our heads. Scaling our own intelligence.

Scala offers complicated powerful language features built from combinations of simpler language features. The aim is a staircase of learning: gradually learn features as you need them. The staircase starts in the green grass of toy programs, moves through the blue sky of useful business software, and finally into the outer space of abstract libraries and frameworks. (That dark blob is supposed to represent outer space.)

This is not how people experience the language.

The green grass is great: Odersky’s Coursera courses, Atomic Scala. Next, we want to write something useful for work: the blue sky. It is time to use libraries and frameworks. I want a web app, so I bring in Spray. Suddenly I need to understand typeclasses and the magnet pattern. The magnet pattern? The docs link to a post on this. It’s five thousand words long. I’m shooting into outer space — I don’t want to be an astronaut yet!

The middle of the staircase is missing.

Who can repair this? Not the astronauts, the compiler and library authors. They can write posts around programming language theory, defining one feature in terms of a bunch of other concepts I don’t understand yet. I need explanations by people who share my objectives, people a little bit ahead of me in the blue sky, who recently learned how to use Spray themselves. I don’t need research papers, I need StackOverflow. Blog posts, not textbooks.

This is where we need each other. As a community, we can fill this staircase. At a macro level, we scale intelligence with teaching.

Scala as a language is not enough. We don’t work in languages, especially not in the blue sky. We work in language systems, including all the libraries and tooling and all the people. The resources we create, and the personal interactions in real life and online. When we teach each other, we scale our collective intelligence, we scale our community.

Scaling the community is important, because only a large, diverse group can answer two crucial questions. To make the language and libraries great, we need to know about each feature: is this useful? And to make this staircase solid, we need to know about each source and document: is this clear?

Useful isn’t determined by the library author, but by its users. Clear isn’t determined by the writer, but by the reader. If you read the explanation of Futures on the official Scala site and you don’t get it, if you feel stupid, that is not your fault. When documentation is not clear to you, its maintainers fail. Teaching means entering the context of the learner, and starting there. It means reaching for the person a step or two down, and pulling them up to where you are.

Michael Bernstein described his three years of learning Haskell. “I tried over and over again to turn my self doubt into a pure functional program, and eventually, it clicked.”

Ouch. Not everyone has this tenacity. Not everyone has three years to spend becoming an astronaut. Teaching makes the language accessible to more people. At the same time, it makes everyone’s life easier: what might Mr Bernstein have accomplished during those years?

Scala, the language system, does not belong to Martin Odersky.  It belongs to everyone who makes Scala useful. We can each be part of this.

Ask and answer questions on StackOverflow. Blog about what you learned, especially about why it was useful.[1] Request more detail — if something is not clear to you, then it is not clear. Speak at your local user group.[2] The less type theory you understand, the more people you can help!

Publish your useful Scala code. We need examples from the blue sky. If you do, tweet about it with #blueSkyScala.

It is up to all of us to teach each other, to scale our intelligence. Then we can make use of those abstractions that Scala builds up. Then it will be a scalable language.

[1] example: Remco Beckers’s post on Option and Either and Try.
[2] example: Heather Miller’s talk compensates for bad documentation around Scala Futures.

The Emperor has no clothes: Bad actors in tech

Maybe you are interested in a language, or an open source project, but you feel like the community is unwelcoming: Some big voices are rude, they’re downright hostile to newcomers and anyone who disagrees with them. Let’s not get involved.


Or in your workplace: influential people in the organization aren’t nearly as helpful, or as smart, as your teammates. Yet their opinions, and bad behavior, are followed by everyone else. You feel bullied into decisions, and you accept their little abuses. Ultimately the people with the most freedom move away, until the workplace is a surreal island where time stood still.


There’s this old Hans Christian Andersen tale, The Emperor’s New Clothes. Some weavers sold the emperor a suit that supposedly could not be seen by those who were incompetent or unfit for their position. It was really not a suit: they gave him nothing, and the emperor was running around nude. Nobody said a thing. Each assumed they were in the wrong. Only when a child cried out “he has no clothes!” did everyone else realize that they were not alone in pretending.


While the Emperor’s situation is humorous and extreme, the open source and workplace situations are not. They are real, and they have mathematical underpinnings! A paper called “The Majority Illusion in Social Networks” studies this very phenomenon. The Washington Post describes how public opinion appears to change rapidly, when sometimes it’s really that public opinion suddenly became known.


Those whose opinions are the most visible control what opinions are seen as acceptable and polite. The moment enough of the social network sees their own opinion as acceptable, bam! a major change in sentiment appears. That feeling was already there, but it was masked by a few local celebrities who didn’t share the values of the majority.


When we dislike bad behavior, we feel alone, but we are not special. We all see it, we all dislike it. We all wish we were using the right tools for the job. We all wish the mailing list contained only polite helpful responses. But social norms, set by the few who are also the loudest, make us not care enough, not invest enough, to communicate our opinions with such volume. Shame and fear of rejection freeze us. Or we lack the hunger for conflict. None of us say the emperor has no clothes, and so bad behavior persists.


The math says the evident culture is not always the predominant culture. A few confident, inconsiderate people in key positions intimidate an entire department. A few derogatory voices fence off a language, leaving erstwhile contributors to chew on rocks outside.


If you care about your organization, work to make everyone’s voice heard.
If you care about your open source community, speak out against behavior that masks the majority attitude. Watch for a red flag in your head: “I think his opinion is wrong, but I’m not going to say anything because arguing with him is not worth it.” You’re not the only one who sees through that clothing.


Testing akka actor termination

When testing akka code, I want to make sure a particular actor gets shut down within a time limit. I used to do it like this:

 Thread.sleep(2.seconds.toMillis)
 assertTrue(actorRef.isTerminated)

That isTerminated method has been deprecated since Akka 2.2, and good thing too, since my test was wasting everyone’s time. Today I’m doing this instead:

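import akka.actor.Terminated
import akka.testkit.TestProbe
import scala.concurrent.duration._

val probe = new TestProbe(actorSystem)
probe.watch(actorRef)
// Backticks match the existing actorRef instead of binding a new variable.
probe.expectMsgPF(2.seconds){ case Terminated(`actorRef`) => true }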

This says: set up a TestProbe actor, and have it watch the actorRef of interest. Wait for the TestProbe to receive notification that the actor of interest has been terminated. If actorRef has already terminated, that message will come right away. My test doesn’t have to wait the maximum allowed time.[1]

This works in any old test method with access to the actorSystem — I don’t have to extend akka.testkit.TestKit to use the TestProbe.

BONUS: In a property-based test, I don’t want to throw an exception, but rather return a result, a property with a nice label. In that case my function gets a little weirder:

def shutsDown(actorSystem: ActorSystem, 
              actorRef: ActorRef): Prop = {
  val maxWait = 2.seconds
  val probe = new TestProbe(actorSystem)
  probe.watch(actorRef)
  try {
   probe.expectMsgPF(maxWait){case Terminated(`actorRef`) => true }
  } catch {
   case ae: AssertionError =>
    false :| s"actor not terminated within $maxWait"
  }
}

———–
[1] This is still blocking the thread until the Terminated message is received or the timeout expires. I eagerly await the day when test methods can return a Future[TestResult].

Left to right, top to bottom

TL;DR – Clojure’s threading macro keeps code in a legible order, and it’s more extensible than methods.

When we create methods in classes, we like that we’re grouping operations with related data. It’s a useful organizational scheme. There’s another reason to like methods: they put the code in an order that’s easy to read. In the old days it might read top-to-bottom, with subject and then verb and then the objects of the verb:

With a fluent interface that supports immutability, methods still give us a pleasing left-to-right ordering:

Methods look great, but it’s hard to add new ones. Maybe I sometimes want to add functionality for returns, or print a gift receipt. With functions, there is no limit to this. The secret is: methods are the same thing as functions, except with an extra secret parameter called this.

For example, consider JavaScript. A method there can be any old function, and it can use properties of this.


var completeSale = function(num) {
  console.log("Sale " + num + ": selling "
    + this.items + " to " + this.customer);
}

Give that value to an object property, and poof, the property is a method:

var sale = {
  customer: "Fred",
  items: ["carrot","eggs"],
  complete: completeSale
};
sale.complete(99);
// Sale 99: selling carrot,eggs to Fred

Or, call the function directly, and the first argument plays the role of “this”:

completeSale.call(sale, 100)
// Sale 100: selling carrot,eggs to Fred

In Scala we can create methods or functions for any operation, and still organize them right along with the data. I can choose between a method in the class:

class Sale(…) {
   def complete(num: Int) {…}
}

or a function in the companion object:

object Sale {
   def complete(sale: Sale, num: Int) {…}
}

Here, the function in the companion object can even access private members of the class[1]. The latter style is more functional. I like writing functions instead of methods because (1) all input is explicit and (2) I can add more functions as needed, and only as needed, and without jumbling up the two styles. When I write functions about data, instead of attaching functions to data, I can import the functions I need and no more. Methods are always on a class, whether I like it or not.

There’s a serious disadvantage to the function-with-explicit-parameter choice, though. Instead of a nice left-to-right reading style, we get:

It’s all inside-out-looking! What happens first is in the middle, and the objects are separated from the verbs they serve. Blech! It sucks that function application reads inside-out, right-to-left. The code is hard to follow.

We want the output of addCustomer to go to addItems, and the output of addItems to go to complete. Can I do this in a readable order? I don’t want to stuff all my functions into the class as methods.
In Scala, I wind up with this:

Here it reads top-down, and the arguments aren’t spread out all over the place. But I still have to draw lines, mentally, between what goes where. And sometimes I screw it up.
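A minimal sketch of the two shapes described above, assuming a hypothetical immutable Sale with addCustomer, addItems, and complete written as companion-object functions. The fields and function bodies are my assumptions for illustration, not the original listing.

```scala
// Hypothetical Sale type; fields and function bodies are illustrative assumptions.
final case class Sale(customer: Option[String] = None, items: List[String] = Nil)

object Sale {
  def addCustomer(sale: Sale, name: String): Sale = sale.copy(customer = Some(name))
  def addItems(sale: Sale, items: List[String]): Sale = sale.copy(items = sale.items ++ items)
  def complete(sale: Sale, num: Int): String =
    s"Sale $num: selling ${sale.items.mkString(",")} to ${sale.customer.getOrElse("nobody")}"
}

object SaleOrdering {
  import Sale._

  // Inside-out: the first step, Sale(), is buried in the middle.
  val nested: String =
    complete(addItems(addCustomer(Sale(), "Fred"), List("carrot", "eggs")), 99)

  // Top-down with intermediate values: readable, but each name must be wired correctly.
  val empty = Sale()
  val withCustomer = addCustomer(empty, "Fred")
  val withItems = addItems(withCustomer, List("carrot", "eggs"))
  val stepwise: String = complete(withItems, 99)
}
```

Both values come out the same. The difference is only in how easy it is to pass the wrong intermediate name to the next step.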

Clojure has the ideal solution. It’s called the threading macro. It has a terrible name, because there’s no relation to threads, nothing asynchronous. Instead, it’s about cramming the output of one function into the first argument of the next. If addCustomer, addItems, and complete are all functions which take a sale as the first parameter, the threading macro says, “Start with this. Cram it into the first argument of the function call, and take that result and cram it into the first argument of the next function call, and so on.” The result of the last operation comes out.


\\ Sale 99 : selling [carrot eggs] to Fred


This has a clear top-down ordering to it. It’s subject, verb, object. It’s a great substitute for methods. It’s kinda like stitching the data in where it belongs, weaving the operations together. Maybe that’s why it’s called the threading macro. (I would have called it cramming instead.)

Clojure’s prefix notation has a reputation for being harder to read, but this changes that. The threading macro pulls the subject out of the first function argument and puts it at the top, at the beginning of the sentence. I wish Scala had this!
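Scala has since gained something in this direction: the standard library’s scala.util.chaining (Scala 2.13 and later) adds a pipe method to any value. Here is a minimal sketch of the sale example using it. The Sale type and the three functions are hypothetical stand-ins, and unlike Clojure’s macro, the slot for the previous result has to be marked with an underscore.

```scala
import scala.util.chaining._ // standard library since Scala 2.13; provides pipe and tap

// Hypothetical Sale type and functions, assumed for illustration only.
final case class Sale(customer: String = "", items: List[String] = Nil)

object Threading {
  def addCustomer(sale: Sale, name: String): Sale = sale.copy(customer = name)
  def addItems(sale: Sale, items: List[String]): Sale = sale.copy(items = sale.items ++ items)
  def complete(sale: Sale, num: Int): String =
    s"Sale $num: selling ${sale.items.mkString(",")} to ${sale.customer}"

  // Subject first, then each verb in order; `_` marks where the previous result goes.
  val result: String =
    Sale()
      .pipe(addCustomer(_, "Fred"))
      .pipe(addItems(_, List("carrot", "eggs")))
      .pipe(complete(_, 99))
}
```

Because the underscore can sit in any argument position, the same helper covers both the first-parameter and last-parameter flavors of threading.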
—————–
Encore:
In case you’re still interested, here’s a second example: list processing.

Methods in Scala look nice:

but they’re not extensible. If these were functions I’d have:

which is hideous. So I wind up with:

That is easy to mess up; I have to get the intermediate variables right.
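A small sketch of those three Scala shapes, using an assumed pipeline (keep the even numbers, multiply by ten, add them up). The function names keepEven, timesTen, and total are hypothetical; they are not from the original listings.

```scala
object ListProcessing {
  val numbers = List(1, 2, 3, 4, 5, 6)

  // Method style: reads left to right.
  val viaMethods: Int = numbers.filter(_ % 2 == 0).map(_ * 10).sum

  // The same operations as standalone functions that take the list explicitly.
  def keepEven(xs: List[Int]): List[Int] = xs.filter(_ % 2 == 0)
  def timesTen(xs: List[Int]): List[Int] = xs.map(_ * 10)
  def total(xs: List[Int]): Int = xs.sum

  // Nested calls: the first step is innermost.
  val nested: Int = total(timesTen(keepEven(numbers)))

  // Intermediate variables: top-down, but every name has to line up.
  val evens = keepEven(numbers)
  val scaled = timesTen(evens)
  val stepwise: Int = total(scaled)
}
```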
In Haskell it’s function composition:

That reads backwards, right-to-left, but it does keep the objects with the verbs.

Notice that in Haskell the map, filter, reduce functions take the data as their last parameter.[2] This is also the case in Clojure, so we can use the last-parameter threading macro. It has the cramming effect of shoving the previous result into the last parameter:

Once again, Clojure gives us a top-down, subject-verb-object form. See? The Lisp is perfectly readable, once you know which paths to twist your brain down.

Update: As @ppog_penguin reminded me, F# has the best syntax of all. Its pipe operator acts a lot like the Unix pipe, and sends data into the last parameter.
F# is my favorite!
————
[1] technical detail: the companion object can’t see members that are private[this]
[2] technical detail: all functions in Haskell take one parameter; applying map to a predicate returns a function of one parameter that expects the list.

Modularity in Scala: Isolation of dependencies

Today at work, I said, “I wish I could express the right level of encapsulation here.” Oh, but this is Scala! Many things are possible!

We have a class, an akka Actor, whose job is to keep an eye on things. Let’s pretend its job is to clean up every so often: sweep the corners, wash the dishes, and wipe the TV screen. At construction, it receives all the tools needed to do these things.

class CleanerUpper(broom: Broom,
                   rag: Dishcloth, 
                   wiper: MicrofiberTowel, 
                   config: CleanConfig) … {

  def work(…) {
    broom.sweep(config, corners)
    rag.wipe(config, dishes) 
    wiper.clear(tv)
  }
}

Today, we added reporting to the sweeping functionality. This made the sweeping part complicated enough to break out into its own class. At construction of the Sweeper, we provide everything that remains constant (from config) and the tools it needs (the broom). When it’s time to sweep, we pass in the parts that vary each time (the corners).[1]

class CleanerUpper(broom: Broom,
                   rag: Dishcloth, 
                   wiper: MicrofiberTowel, 
                   config: CleanConfig) … {
  val sweeper = new Sweeper(config, broom)

  def work(…) {
    sweeper.sweep(corners)
    rag.wipe(config, dishes) 
    wiper.clear(tv)
  }
}

Looking at this, I don’t like that broom is still available everywhere in the CleanerUpper. With the refactor, all broom-related functionality belongs in the Sweeper. The Broom constructor parameter serves only to construct the dependency. Yet, nothing stops me (or someone else) from adding a call directly to broom anywhere in CleanerUpper. Can I change this?

One option is to construct the Sweeper outside and pass it in, in place of the Broom. Then construction would look like

new CleanerUpper(new Sweeper(config, broom), rag, wiper, config)

I don’t like this because no one outside of CleanerUpper should have to know about the submodules that CleanerUpper uses. I want to keep this internal refactor from having so much impact on callers.

More importantly, I want to express “A Broom is needed to initialize dependencies of CleanerUpper. After that it is not available.”

The solution we picked separates construction of dependencies from the class’s functionality definition. I made the class abstract, with an uninitialized Sweeper field. The Broom is gone.

abstract class CleanerUpper(
                   rag: Dishcloth, 
                   wiper: MicrofiberTowel, 
                   config: CleanConfig) … {
  val sweeper: Sweeper

  def work(…) {
    sweeper.sweep(corners)
    rag.wipe(config, dishes) 
    wiper.clear(tv)
  }
}

Construction happens in the companion object. Its apply method accepts the same arguments as the original constructor — the same objects a caller is required to provide. Here, a Sweeper is initialized.

object CleanerUpper {
  def apply(broom: Broom,
            rag: Dishcloth,
            wiper: MicrofiberTowel, 
            config: CleanConfig): CleanerUpper = 
    new CleanerUpper(rag, wiper, config) {
      val sweeper = new Sweeper(config, broom)
    }
}

The only change to construction is use of the companion object instead of explicitly new-ing one up. Next time I make a similar refactor, it’ll require no changes to external construction.

val cleaner = CleanerUpper(broom, rag, wiper, config)

I like this solution because it makes the dependency on submodule Sweeper explicit in CleanerUpper. Also, construction of that dependency is explicit.

There are several other ways to accomplish encapsulation of the broom within the sweeper. Scala offers all kinds of ways to modularize and break apart the code — that’s one of the fascinating things about the language. Modularity and organization are two of the biggest challenges in programming, and Scala offers many paths for exploring these.
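One of those other ways, as a hedged sketch: keep the class concrete, make its constructor private, and let the companion’s apply build the Sweeper. The stub types and method bodies below are assumptions so the sketch compiles by itself, and it ignores the Akka Actor side of the real class.

```scala
// Stub collaborators so the sketch compiles on its own; all are hypothetical.
class Broom
class Dishcloth
class MicrofiberTowel
class CleanConfig

class Sweeper(config: CleanConfig, broom: Broom) {
  def sweep(corners: List[String]): List[String] = corners.map(c => s"swept $c")
}

// The private constructor takes the Sweeper, so no Broom is in scope inside the class.
class CleanerUpper private (sweeper: Sweeper,
                            rag: Dishcloth,
                            wiper: MicrofiberTowel,
                            config: CleanConfig) {
  def work(corners: List[String]): List[String] = sweeper.sweep(corners)
}

object CleanerUpper {
  // Callers still pass a Broom; it is consumed here and never reaches CleanerUpper.
  def apply(broom: Broom,
            rag: Dishcloth,
            wiper: MicrofiberTowel,
            config: CleanConfig): CleanerUpper =
    new CleanerUpper(new Sweeper(config, broom), rag, wiper, config)
}
```

Callers construct it the same way, with CleanerUpper(broom, rag, wiper, config). The trade-off is that the dependency shows up as a constructor parameter rather than as an abstract field.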

————-
[1] This example is silly. It is not my favorite kind of example, but all the realistic ones I came up with were higher-cognitive-load.

When OO and FP meet: returning the same type

In the left corner, we have Functional Programming. FP says, “Classes shall be immutable!”

In the right corner, we have Object-Oriented programming. It says, “Classes shall be extendable!”

The battlefield: define a method on the abstract class such that, when you call it, you get the same concrete class back. In Scala.
Fight!

Here comes the sample problem —

We have some insurance policies, auto policies and home policies. On any policy, you can adjust by a discount and receive a policy of the same type. Here is the test:

case class Discount(name: String)

  def test() {
    def adjust[P <: Policy](d: Discount, p: P): P = p.adjustBy(d)
    val p = new AutoPolicy
    val d = Discount("good driver")
    val adjustedP: AutoPolicy = adjust(d, p)
    println("if that compiles, we’re good")
  }

OO says, no problem!

abstract class Policy {
   protected def changeCost(d: Discount)

   def adjustBy(d: Discount) : this.type = {
       changeCost(d:Discount)
       return this
   }
}

class AutoPolicy extends Policy {
  protected def changeCost(d: Discount) { /* MUTATE */ }
}

FP punches OO in the head and says, “Mutation is not allowed! We must return a new version of the policy and leave the old one be!”[1] The easiest way is to move the adjust method into an ordinary function, with a type parameter:

object Policy {
   def adjust[P <: Policy](p: P, d: Discount): P = p match {
     case ap: AutoPolicy => new AutoPolicy
     … all the other cases for all other policies …
   }
}

But no no no, we’d have to change this code (and every method like it) every time we add a Policy subclass. This puts us on the wrong side of the Expression Problem.[2]

If we step back from this fight, we can find a better way. Where we declare adjustBy, we have access to two types: the superclass (Policy) and this.type, which is the special-snowflake type of that particular instance. The type we’re trying to return is somewhere in between.

How can we specify this intermediate type? It seems obvious to us as humans. “It’s the class that extends Policy!” but an instance of AutoPolicy has any number of types — it could include lots of traits. Somewhere we need to specify “This is the type it makes sense to return,” and then in Policy say “adjustBy returns the type that makes sense.” Abstract types do this cleanly:

abstract class Policy {
  type Self <: Policy
   protected def changeCost(d: Discount): Self

   def adjustBy(d: Discount) : Self = {
       changeCost(d:Discount)
   }
}

class AutoPolicy extends Policy {
  type Self = AutoPolicy
  protected def changeCost(d: Discount) = 
    { /* copy self */ new AutoPolicy }
}

I like this because it expresses cleanly “There will be a type, a subclass of this one, that methods can return.”

There’s one problem:

error: type mismatch;
 found   : p.Self
 required: P
           def adjust[P <: Policy](d: Discount, p:P):P = p.adjustBy(d)

The adjust method doesn’t return P; it returns the inner type P#Self. You and I know that’s the same as P, but the compiler doesn’t. OO punches FP in the head!

Wheeeet! The Scala compiler steps in as the referee. Scala offers us a way to say to the compiler, “P#Self is the same as P.” Check this version out:

def adjust[P <: Policy](d: Discount, p: P)
               (implicit ev: P#Self =:= P): P = p.adjustBy(d)

This says, “Look Scala, these two things are the same, see?” And Scala says, “Oh you’re right, they are.” The compiler comes up with the implicit value by itself.

The cool part is, if we define a new Policy poorly, we get a compile error:

class BadPolicy extends Policy {
  type Self = AutoPolicy
  protected def changeCost(d: Discount) = { new AutoPolicy }
}
adjust(d, new BadPolicy)

error: Cannot prove that FunctionalVersion.BadPolicy#Self =:= FunctionalVersion.BadPolicy.
           adjust(d, new BadPolicy)

Yeah, bad Policy, bad.

This method isn’t quite ideal, but it’s close. The positive is: the abstract type is expressive of the intent. The negative is: any function that wants to work polymorphically with Policy subtypes must require the implicit evidence. If you don’t like this, there’s an alternative using type parameters, called F-bounded polymorphism. It’s not quite as ugly as that sounds.
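For comparison, a minimal sketch of what the F-bounded alternative could look like. The discounts field on AutoPolicy is a hypothetical addition so the copy is observable; everything else mirrors the names used above.

```scala
final case class Discount(name: String)

// F-bounded: each subclass passes itself as the type parameter.
abstract class Policy[P <: Policy[P]] {
  protected def changeCost(d: Discount): P
  def adjustBy(d: Discount): P = changeCost(d)
}

// `discounts` is a hypothetical field, added so the new copy can be inspected.
final case class AutoPolicy(discounts: List[Discount] = Nil) extends Policy[AutoPolicy] {
  protected def changeCost(d: Discount): AutoPolicy = copy(discounts = d :: discounts)
}

object FBounded {
  // No implicit evidence needed; the bound itself says adjustBy returns P.
  def adjust[P <: Policy[P]](d: Discount, p: P): P = p.adjustBy(d)

  val original = AutoPolicy()
  val adjusted: AutoPolicy = adjust(Discount("good driver"), original)
}
```

The cost moves from the function signatures to the class hierarchy: every mention of Policy now carries a type parameter.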

Scala is a language of many options. Something as tricky as combining OO and FP certainly demands it. See the footnotes for further discussion on this particular game.

The referee declares that FP can have its immutability, OO can have its extension. A few function declarations suffer, but only a little.

————–
[1] FP prefers to simply return a Policy from adjustBy; all instances of an ADT have the same interface, so why not return the supertype? But we’re not playing the Algebraic Data Type game. OO insists that AutoPolicy has additional methods (like penalizeForTicket) that we might call after adjustBy. The game is to combine immutability with extendible superclasses, and Scala is playing this game.
[2] The solution to the expression problem here — if we want to be able to add both functions and new subclasses — is typeclasses. I was totally gonna go there, until I found this solution. For the case where we don’t plan to add functions, only subclasses, abstract types are easier.

More references:
F-bounded type polymorphism (“Give Up Now”)
MyType problem
Abstract self types