Left to right, top to bottom

TL;DR – Clojure’s threading macro keeps code in a legible order, and it’s more extensible than methods.

When we create methods in classes, we like that we’re grouping operations with related data. It’s a useful organizational scheme. There’s another reason to like methods: they put the code in an order that’s easy to read. In the old days it might read top-to-bottom, with subject and then verb and then the objects of the verb:

With a fluent interface that supports immutability, methods still give us a pleasing left-to-right ordering:
Methods look great, but it’s hard to add new ones. Maybe I sometimes want to add functionality for returns, or print a gift receipt. With functions, there is no limit to this. The secret is: methods are the same thing as functions, except with an extra secret parameter called this
For example, consider JavaScript. (full gist) A method there can be any old function, and it can use properties of this.


var
completeSale = function(num) {
console.log("Sale " + num + ": selling " 


+ this.items + " to " + this.customer);
}

Give that value to an object property, and poof, the property is a method:

var sale = {

customer: "Fred",

items: ["carrot","eggs"],

complete: completeSale

};
sale.complete(99);
// Sale 99: selling carrot,eggs to Fred

Or, call the function directly, and the first argument plays the role of “this”:

completeSale.call(sale, 100)
// Sale 100: selling carrot,eggs to Fred
In Scala we can create methods or functions for any operation, and still organize them right along with the data. I can choose between a method in the class:
class Sale(…) {
   def complete(num: Int) {…}
}
or a function in the companion object:
object Sale {
   def complete(sale: Sale, num: Int) {…}
}
Here, the function in the companion object can even access private members of the class[1]. The latter style is more functional. I like writing functions instead of methods because (1) all input is explicit and (2) I can add more functions as needed, and only as needed, and without jumbling up the two styles. When I write functions about data, instead of attaching functions to data, I can import the functions I need and no more. Methods are always on a class, whether I like it or not.
There’s a serious disadvantage to the function-with-explicit-parameter choice, though. Instead of a nice left-to-right reading style, we get:

It’s all inside-out-looking! What happens first is in the middle, and the objects are separated from the verbs they serve. Blech! It sucks that function application reads inside-out, right-to-left. The code is hard to follow.

We want the output of addCustomer to go to addItems, and the output of addItems to go to complete. Can I do this in a readable order? I don’t want to stuff all my functions into the class as methods.
In Scala, I wind up with this:

Here it reads top-down, and the arguments aren’t spread out all over the place. But I still have to draw lines, mentally, between what goes where. And sometimes I screw it up.

Clojure has the ideal solution. It’s called the threading macro. It has a terrible name, because there’s no relation to threads, nothing asynchronous. Instead, it’s about cramming the output of one function into the first argument of the next. If addCustomer, addItems, and complete are all functions which take a sale as the first parameter, the threading macro says, “Start with this. Cram it into first argument of the function call, and take that result and cram it into the first argument of the next function call, and so on.” The result of the last operation comes out. (full gist


\\ Sale 99 : selling [carrot eggs] to Fred


This has a clear top-down ordering to it. It’s subject, verb, object. It’s a great substitute for methods. It’s kinda like stitching the data in where it belongs, weaving the operations together. Maybe that’s why it’s called the threading macro. (I would have called it cramming instead.)

Clojure’s prefix notation has a reputation for being harder to read, but this changes that. The threading macro pulls the subject out of the first function argument and puts it at the top, at the beginning of the sentence. I wish Scala had this!
—————–
Encore:
In case you’re still interested, here’s a second example: list processing.

Methods in Scala look nice:

but they’re not extensible. If these were functions I’d have:

which is hideous. So I wind up with:
That is easy to mess up; I have to get the intermediate variables right.
In Haskell it’s function composition:
That reads backwards, right-to-left, but it does keep the objects with the verbs.

Notice that in Haskell the map, filter, reduce functions take the data as their last parameter.[2] This is also the case in Clojure, so we can use the last-parameter threading macro. It has the cramming effect of shoving the previous result into the last parameter:

Once again, Clojure gives us a top-down, subject-verb-object form. See? the Lisp is perfectly readable, once you know which paths to twist your brain down.

Update: As @ppog_penguin reminded me, F# has the best syntax of all. Its pipe operator acts a lot like the Unix pipe, and sends data into the last parameter.
F# is my favorite!
————
[1] technical detail: the companion object can’t see members that are private[this]
[2] technical detail: all functions in Haskell take one parameter; applying map to a predicate returns a function of one parameter that expects the list.

F# made Java easier today

Today I had a sticky problem to solve, the kind business app programmers drool over because it’s nontrivial and contained. Halfway through the TDD process, my logic got all jumbled and I couldn’t see the solution. Stepping back, I took ten minutes to write pseudocode in an F# style, then translated that into Java. Bam! green all over. Turns out this problem is more suited to recursion and pattern matching than loops and variables. The same style works just fine in Java, but thinking in F# brought it to light.

As an illustrative example, here’s the code in both languages.

The problem:
Money comes in through deposits. Money goes out through dispensations. Take an ordered list of each and pair them up into Utilizations, which identify which deposits were used to satisfy which dispensations. If any dispensation cannot be satisfied with the provided deposits, do not satisfy the dispensation at all.

The F#:

type Deposit = { amount : int }
type Dispensation { amount : int }


type Utilization = 
  | DepUtil of Deposit * Dispensation * int


let rec utilizeInternal outputSoFar currentDispOutput (deposits:Deposit list) (dispensations:Dispensation list) depUsed dispUsed : Utilization list = 
 match (deposits, dispensations) with
  | [] , _ -> outputSoFar
  | _ , [] -> outputSoFar
  | dep :: depTail, disp:: dispTail ->
     match(dep.amount – depUsed, disp.amount – dispUsed) with 
       | (depAmt, dispAmt) when depAmt = dispAmt ->
           utilizeInternal (DepUtil(dep, disp, depAmt) :: currentDispOutput @ outputSoFar) [] depTail dispTail 0 0
       | (depAmt, dispAmt) when depAmt  
           utilizeInternal outputSoFar (DepUtil(dep, disp, depAmt) ::  currentDispOutput) depTail (disp :: dispTail) 0 (dispUsed + depAmt)
       | (depAmt, dispAmt) when dispAmt  
           utilizeInternal( DepUtil(dep, disp, dispAmt) :: currentDispOutput @ outputSoFar) [] (dep :: depTail) dispTail (depUsed + dispAmt) 0


let utilize deposits dispensations = utilizeInternal [] [] deposits dispensations 0 0

The key portion, the recursive function definition, works out to nine long lines. It divides the problem up neatly into: the deposit and dispensation match exactly; the deposit is more than the dispensation; the dispensation is more than the deposit. Those are the three non-trivial cases.

Translating this into Java simply puts these cases into if-then statements.

The Java:

public class DepositUtilizer {
    public static class Utilization {
        public final Deposit deposit;
        public final Dispensation dispensation;
        public final long amountInPennies;
        public Utilization(Deposit deposit, Dispensation dispensation, long amountInPennies) {
            this.deposit = deposit;
            this.dispensation = dispensation;
            this.amountInPennies = amountInPennies;
        }
    }
    public Iterable utilize(Iterable deposits, Iterable dispensations) {
        return utilizeInternal(new ArrayList(), new ArrayList(),
                peekingIterator(deposits.iterator()), peekingIterator(dispensations.iterator()), 0l, 0l);
    }
    private Iterable utilizeInternal(Iterable outputSoFar,
            List currentDispensationOutput, PeekingIterator deposits,
            PeekingIterator dispensations, long amountUsedSoFarInCurrentDeposit,
            long amountUsedSoFarInCurrentDispensation) {
        if (!deposits.hasNext() || !dispensations.hasNext()) {
            return outputSoFar;
        }
        long dispensationAmount = dispensations.peek().getAmountForJPA() – amountUsedSoFarInCurrentDispensation;
        long depositAmount = deposits.peek().getUnutilizedAmount().getPennies() – amountUsedSoFarInCurrentDeposit;
        if (depositAmount == dispensationAmount) {
            // great, use them both up
            Deposit usedDeposit = deposits.next();
            Dispensation usedDispensation = dispensations.next();
            currentDispensationOutput.add(new Utilization(usedDeposit, usedDispensation, depositAmount));
            return utilizeInternal(concat(outputSoFar, currentDispensationOutput), new ArrayList(), deposits,
                    dispensations, 0l, 0l);
        }
        if (depositAmount < dispensationAmount) {
            // use all of the deposit
            Deposit usedDeposit = deposits.next();
            currentDispensationOutput.add(new Utilization(usedDeposit, dispensations.peek(), depositAmount));
            return utilizeInternal(outputSoFar, currentDispensationOutput, deposits, dispensations, 0l,
                    amountUsedSoFarInCurrentDispensation + depositAmount);
        }
        if (dispensationAmount < depositAmount) {
            // use all of the dispensation
            Dispensation usedDispensation = dispensations.next();
            currentDispensationOutput.add(new Utilization(deposits.peek(), usedDispensation, dispensationAmount));
            return utilizeInternal(concat(outputSoFar, currentDispensationOutput), new ArrayList(), deposits,
                    dispensations, amountUsedSoFarInCurrentDeposit + dispensationAmount, 0l);
        }
        throw new IllegalStateException(“The sky is falling!”);
    }
}

The Java is twice as long as the F#, sure, but it works beautifully. What matters is that knowing a different language, and thinking about the problem in terms of the better-suited language, led me to a good solution in the language of my day job.

Functional programming! It’s not just for math geeks!

Continuation Style Without the Fugly

The previous post discussed advantages of using a continuation style to send our code to the data instead of making our code wait for the data to come back to it.

This pseudocode example has three I/O operations, each encasing the operations which should follow in a callback function parameter.

let filename = // calculate filename
new File(file).readAllLines().andThen( { data ->
// filter the data
// summarize the data
// reformat the data
new File("output").writeAllLines(newData).andThen( { status ->
println("done!")
sendEmail("done! status = " + status).andThen( {
println("email sent")
})
})
})

There’s an obvious negative here: it’s ugly. All this stuff happens in a clear sequence, the same sequence as in the first pseudocode, but it’s hard to see that with all those indentations.

It would be nice if we could describe this without all those curly braces and indentation.
F# async workflows are an example of that goal achieved. The following is pseudocode stuck in the middle of an F# async workflow block.

async {
let filename = // calculate filename
let! data = new File(file).readAllLines();
// filter the data
// summarize the data
// reformat the output
let! status = new File("output").writeAllLines(newData);
println("done!");
do! sendEmail("done! status = " + status);
println("email sent");
} |> Async.Start

This is precisely what we’d like the code to look like, with a few exceptions. The “async” at the beginning and the piping of the block to Async.Start are the F# overhead to trigger the magic asynchronicity.
The let! and do! keywords are where the magic happens: they trigger F# to wrap the rest of the block into a super-sneaky callback that gets passed to an asynchronous operation which evaluates the expression in the let! or do! in another thread. When the file read is complete, the rest of the code proceeds, in whatever thread it’s in.

The second example executes exactly like the first one. But it reads sooo much more smoothly!

The tricky bit is that let! and do! sneakily break up the block of code. Anything after the let! may happen in a different thread, and then the next let! might switch execution to yet another thread. As a programmer we don’t have to worry about that, but it’s mind-bending when the code looks so sequential.

I hope we’ll see more languages and frameworks that can operate like this: pass the code to the data, but do so in a manner that keeps our code readable and organized.

The confluence of FP and OO

Michael Feathers has a great blog post up today about how object-oriented and functional programming styles are suited to different portions of an application.

Specifically, the object-oriented style fits the high-level application components. The original intention of OO was for objects to pass messages to each other, which is suited to a SOA style and asynchronous message-passing. At this level, components should tell each other what to do, rather than asking for information (as return values). OO and top-down design work well together.

The functional style is better suited to the low-level components. Within objects, a functional style is cleaner and simpler for computations. A functional style is friendlier to an iterative, test-driven approach.

There is one more piece where functional precepts can help us become even more object-oriented, in the message-passing sense. We can make more messages asynchronous when we can pass callbacks. These functions-as-data let us specify what happens next, removing the need to wait for a return value. This looks different from an imperative style, but it is embraced by JavaScript and Node.js, and it’s getting more ubiquitous by the day. It’s suited to asynchronicity, which enables more OO-style message passing.

We can embrace OO and FP in their areas of strength. F# and Scala seem very important in this light. They take away the pain of crossing the seam between OO and FP. We can even embrace the callback (or continuation) style without making our code unreadable: in F#, computation expressions can break imperative-looking code into pieces that execute as each message-pass is completed.

The hybrid languages give us opportunity to optimize our code at both high and low levels, to break our application into large object-oriented chunks and implement each solution functionally. When both paradigms have full language support, we can employ the strengths of each and interoperate at will.

Humility in programming languages

John Backus (pdf) distinguishes two parts of a programming language: “First, its framework which gives the overall rules of the system, and second, its changeable parts, whose existence is anticipated by the framework but whose particular behaviour is not specified by it.”

The framework part is all the built-in language constructs.

The changeable parts are everything we create in the language. Classes, methods, functions, modules, libraries.

Backus’s point is that strong languages are a platform for developers to build on. Adding a bunch of framework — for loops, case statements, new syntax — might increase productivity, but it’s what we can build on top of the language that makes it take off.

When Scala tries to be a SCAlable LAnguage, this in part of what scalable means. The language lets developers add to it.

For example –

Java and C# give us case statements, very restricted matching mechanisms that only work on constants. It was a Big Deal when Java added Enum and supported in case statements, in addition to int.

Scala has more pattern syntaxes, including matching strings and lists and arrays and tuples etc. Scala gives the program access to the matched parts.

F# goes one better and lets the programmer define patterns. Active patterns let the programmer create whatever matcher is needed, exponentially increasing the power of pattern matching.

For details and examples on active patterns, read my article at developerfusion.

In general, functional languages please Backus because they let the programmer create and use changeable parts in all sizes with minimal ceremony. F# active patterns exemplify this goal.

Let the language remain humble, and let its framework get out of the way of the changeable parts.

Dependency Injection vs Functional

At work, we use dependency injection to build up small, decoupled classes each with a single responsibility. DI lets us assemble these in multiple different ways, so we can use the same class for similar processes. This strategy of combining small pieces of functionality in different ways is also a goal of functional programming. Let’s contrast the two methods.

There’s a chunk of code whose job it is to string together the contents of a list of segments. The contents can be in one of two different fields in the segments. The segments might have connectors, which might need to go between the contents. We might need to skip the first and last segment if they’re not interesting.

To avoid duplication of any part of this, our current Spring-y solution at work uses dependency injection to construct different SegmentContentsConcatenators with different components. The result is 9 Java classes and 6 Interfaces.

  • Builder
    • has a SegmentRemover (or a not-remover)
    • has a Concatenator
      • has a ConnectorRetriever (or a not-retriever)
      • has a first-or-lastness determiner
      • has a ContentExtractor (two implementations, one for each possible field)

The Spring configuration defines 3 beans that are useful, but in order to hydrate these in all possible combinations, 17 bean definitions are required. Fortunately for us we care about three of the eight possible behaviour combinations, so we only need 12 bean definitions. Only!

That’s a at least 120 lines of Java in 15 files, plus about 30 lines of XML. Not counting tests.

Just for fun, I implemented the same functionality in F# using tryfsharp.org. The result is 12 little function (including two that are generic and could be placed in a list-processing library). It took 30 lines of F# code. The functions build on each other, and the three desired functionality combinations are pieced together from other small functions. Best of all, there’s no XML at all!

Both methods have the same goal: take small, testable pieces of functionality and piece them together in multiple ways. This is what functional languages DO. Trying to achieve the same goal in Java with Spring was not so pretty.

For reference, here’s the F# code:


// defining the types
type Contents = string

type ConnectorMethod =
| Easy
| Complicated of Contents

type Segment = { ExpectedContents : Contents; ObservedContents : Contents; Connector : ConnectorMethod; IsExtraneous : bool;}

// Functions that build up all the pieces
let findConnector c : Contents =
match c with
| Easy -> ""
| Complicated x -> x

let expected s = s.ExpectedContents
let observed s = s.ObservedContents

let expectedInternalJunction s = expected s + findConnector s.Connector

let rec concatSegmentsInternal f fIJ acc list =
match list with
| [] -> acc
| [x] -> acc + f x
| head :: tail -> concatSegmentsInternal f fIJ ( acc + fIJ head ) tail

let concatSegments f fIJ l : Contents =
match l with
| [] -> ""
| x -> concatSegmentsInternal f fIJ "" l

let isExtraneous s = s.IsExtraneous

// these last two are generic, could apply to any predicate and list
let rec removeLastIf s l : 'a list =
match l with
| [] -> []
| [x] when s x -> []
| [x] -> [x]
| head :: tail -> head :: (removeLastIf s tail)

let removeEndsIf s l =
match l with
| [] -> []
| head :: tail when s head -> removeLastIf s tail
| head :: tail -> head :: removeLastIf s tail

// Publically useful methods - these put the pieces together.
let concatExpectedSegments : Segment list -> Contents = concatSegments expected expectedInternalJunction
let concatObservedSegments : Segment list -> Contents = concatSegments observed observed
let concatInterestingObservedSegments ss : Contents = concatSegments observed observed (removeEndsIf isExtraneous ss)

// Tests
let s1 = {ExpectedContents = "ABC"; ObservedContents = "ABCD"; Connector = Complicated "-"; IsExtraneous = true;}
let s2 = {ExpectedContents = "EFG"; ObservedContents = "EFGH"; Connector = Complicated "a"; IsExtraneous = false }
let s3 = {s1 with IsExtraneous = false}

let ss = [s1;s2;s3]

assert (concatExpectedSegments ss == "ABC-EFGaABC")
assert (concatObservedSegments ss == "ABCDEFGHABCD" )
assert (concatInterestingObservedSegments ss == "EFGHABCD")