Saturday, November 23, 2013

Trains within trains

In Java, there are primitive types and there are objects. Often we want to work with everything as an object, so those primitives get boxed up into object wrappers.
Ruby and Scala say, "That's silly. Let everything be an object to begin with." That keeps parameter-passing semantics and comparison and printing operations consistent.

Yesterday while writing code, I thought I was working with a sequence, so I wrote

input map { transformation } map { transformation2 }

Then the compiler complained, because my input was a simple value. Value? pshaw. I've clearly gotten used to working with Functors. As that last post says, functors are like trains that hold a value. You can attach transformations like extra train cars, and the result is still one train.
So I wrapped my value in an Option and went about my business.

Some(input) map { transformation } map { transformation2 }

Option is a functor, a context that can hold a value and incorporate transformations. However, Option is more than that. As a context, it has a special power: it might hold no value at all.
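Here is a small sketch of that train in Scala. The two transformations are stand-ins I made up for illustration (the post never says what the real ones were); any plain functions would do.

```scala
object OptionTrain {
  // Hypothetical stand-ins for the transformations in the post.
  val transformation: Int => Int = _ + 1
  val transformation2: Int => String = n => s"value: $n"

  val input = 41

  // A full Option: each map attaches another car, and the result is still one Option.
  val full: Option[String] = Some(input) map transformation map transformation2

  // An empty Option: map has nothing to transform, so the train stays empty.
  val stillEmpty: Option[String] = Option.empty[Int] map transformation map transformation2
}
```

Here `OptionTrain.full` holds the transformed value and `OptionTrain.stillEmpty` is still `None`.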

The map method on Option doesn't let me use that special power. Map works within the context in that transforming an empty Option has no effect, but map won't let me switch from a full Option to an empty Option. If I have a transformation that fails, I'd want the Option to turn out empty.

This is the flatmap method: it might map it, or it might flat it.... OK, that doesn't make any sense. That's the name of the method, anyway.[0]

As a train, flatmap says, "Instead of telling you what the next car is, I'll wait for you to give me the value and then I'll give you a whole new train." 

The net result is still exactly one train. Unlike map, flatmap lets us use the special power of the context.
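A sketch of a transformation that can fail, assuming two made-up functions, `parse` and `reciprocal`, that each hand back a whole new Option. (Scala spells the method `flatMap`.)

```scala
import scala.util.Try

object FailingCar {
  // Hypothetical transformations that can fail; each returns a new, possibly empty, train.
  def parse(s: String): Option[Int] = Try(s.trim.toInt).toOption
  def reciprocal(n: Int): Option[Double] = if (n == 0) None else Some(1.0 / n)

  // Every step succeeds, so the Option stays full.
  val worked: Option[Double] = Option("4") flatMap parse flatMap reciprocal

  // parse fails, so the Option switches to empty and reciprocal never runs.
  val notANumber: Option[Double] = Option("four") flatMap parse flatMap reciprocal

  // parse succeeds but reciprocal fails, so the result is empty again.
  val divideByZero: Option[Double] = Option("0") flatMap parse flatMap reciprocal
}
```

With map alone, there would be no way for `parse` to say "nothing came out."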

List (or Seq) is another functor with a special power: it holds 0 to many values. map transforms values, but flatmap takes advantage of the special power, so the result can contain a different number of values. Maybe twice as many, maybe none.
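For instance, with a made-up list of numbers, map keeps the count fixed while `flatMap` is free to change it:

```scala
object ListTrain {
  val numbers = List(1, 2, 3)

  // map: one value in, one value out, so the count never changes.
  val mapped: List[Int] = numbers map (n => n * 10)

  // flatMap: each value becomes a whole new List, so the count can grow...
  val twiceAsMany: List[Int] = numbers flatMap (n => List(n, n))

  // ...or shrink...
  val evensOnly: List[Int] = numbers flatMap (n => if (n % 2 == 0) List(n) else Nil)

  // ...or drop to nothing at all.
  val nothingLeft: List[Int] = numbers flatMap (_ => List.empty[Int])
}
```

Every result is still one flat List, never a List of Lists.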

One more example: Future holds a value that may not exist yet. Future.flatmap lets you produce another value that may not exist yet, and the result is still one Future.
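A rough sketch with two invented lookups (`lookUpUserId` and `lookUpScore` are hypothetical, and the global execution context is just the simplest choice for a demo):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureTrain {
  // Hypothetical lookups; each returns a value that may not exist yet.
  def lookUpUserId(name: String): Future[Int] = Future { name.length }
  def lookUpScore(userId: Int): Future[Int] = Future { userId * 10 }

  // One Future in, one Future out: the type is Future[Int], not Future[Future[Int]].
  def scoreFor(name: String): Future[Int] = lookUpUserId(name) flatMap lookUpScore

  // Blocking here is only so the demonstration can show a result.
  def demo(): Int = Await.result(scoreFor("alice"), 5.seconds)
}
```

The second lookup cannot start until the first value exists, yet `scoreFor` returns its single Future right away.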

Each context has a special power, and flatmap lets you exercise that power a second time, always returning one context. The context doesn't get deeper. You don't get an Option of an Option back from flatmap; you still have one layer of Option around a value.
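To see the "doesn't get deeper" part, compare map and `flatMap` with the same Option-returning function. `half` is a made-up example:

```scala
object OneLayer {
  // Hypothetical transformation that only works on even numbers.
  def half(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None

  // map stacks a train inside a train.
  val nested: Option[Option[Int]] = Some(8) map half

  // flatMap keeps a single layer.
  val flat: Option[Int] = Some(8) flatMap half

  // Use the special power again and again; still one layer.
  val chained: Option[Int] = Some(8) flatMap half flatMap half flatMap half

  // 6 halves to 3, then 3 cannot be halved, so the train ends up empty.
  val derailed: Option[Int] = Some(6) flatMap half flatMap half
}
```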

The Option train probably evaluates your trains-formation immediately, and gives you back a Some or None right away. A lazy List or a Future might store the instruction for later, or execute it right away -- the important thing is you can't tell the difference.

You know where this is going, right?

A monad is a context with a special power, and a method flatmap [1] that lets you use its special power again and again.

There's one more consistent feature of all monads: they all provide a constructor that lets you put a value in the context. For Option, that's Some(x). There's List(x), too.[2] It says, "if you have a value, you can put it on a train."
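In Scala's standard library those constructors look like this. Future is my addition to the list: `Future.successful` wraps a value that is already known, without scheduling any work.

```scala
import scala.concurrent.Future

object PutOnTrain {
  val x = 42

  val inOption: Option[Int] = Some(x)
  val inList: List[Int] = List(x)

  // An already-completed Future holding x.
  val inFuture: Future[Int] = Future.successful(x)
}
```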

What they DON'T have in common is a way to pull values OUT of the train. This is different for every monad, if it exists at all. We have Option.get, and List can be accessed by index. Future doesn't really want you to ever get the thing out; if you insist, Await can dig it out for you.
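Side by side, the three exits look nothing alike. This is only a sketch; `getOrElse` is included as the gentler alternative to `get`, which throws on an empty Option.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object GetOffTrain {
  // Option.get works on Some, but throws NoSuchElementException on None.
  val fromOption: Int = Some(1).get

  // getOrElse supplies a fallback instead of throwing.
  val withFallback: Int = Option.empty[Int].getOrElse(0)

  // List: access by index (throws if the index is out of range).
  val fromList: Int = List(10, 20, 30)(1)

  // Future: Await blocks the calling thread until the value exists or the timeout passes.
  def fromFuture: Int = Await.result(Future.successful(3), 1.second)
}
```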

Since you have to do something different to get the value out every time, what's the point of naming them "monad" at all? It turns out there are things we can do to "anything inside a monad." One is to build up a computation without (necessarily) running it right away. Right now I'm looking at Process in scalaz-stream, which does this. Beyond that, many things you now do to a List could be done to a monad in general. You can always map or flatmap, without knowing what special powers you're working with.
Ever write a function to operate on a sequence of its arguments rather than just one, so that you can call it either way? Take a function that saves items to a database. It can save one or many; maybe you even made it accept a variable argument list. What if, instead, you wrote that function to operate on any monad containing items? Then different callers could pass multiple items in a List, a possibly-nonexistent item in an Option, or an item that hasn't been determined yet in a Future. The Option would execute the database-inserting action immediately, while the Future would save it for later. Your function returns the result of the insertion in that same context. Now that's accommodating![3]
If the caller really does want to pass in exactly one item, she could wrap it in the Boring monad that has no special powers at all.[4] Why, with that, I could work in monadic style all the time! map and flatmap all the things!
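Here is roughly what that could look like. Everything in this sketch is hypothetical: the `Monad` trait is a minimal hand-rolled one (libraries such as scalaz provide real ones), `Boring` is my made-up wrapper, and `insert` only pretends to talk to a database. Note that `save` only ever calls map, which is the point of footnote [3].

```scala
import scala.concurrent.{ExecutionContext, Future}

// A minimal, hypothetical Monad type class, written out only for this sketch.
trait Monad[F[_]] {
  def unit[A](a: A): F[A]
  def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  // map comes for free from unit and flatMap.
  def map[A, B](fa: F[A])(f: A => B): F[B] = flatMap(fa)(a => unit(f(a)))
}

// The made-up Boring monad: exactly one value, no special powers.
final case class Boring[A](value: A)

object Monad {
  implicit val optionMonad: Monad[Option] = new Monad[Option] {
    def unit[A](a: A): Option[A] = Some(a)
    def flatMap[A, B](fa: Option[A])(f: A => Option[B]): Option[B] = fa.flatMap(f)
  }
  implicit val listMonad: Monad[List] = new Monad[List] {
    def unit[A](a: A): List[A] = List(a)
    def flatMap[A, B](fa: List[A])(f: A => List[B]): List[B] = fa.flatMap(f)
  }
  implicit def futureMonad(implicit ec: ExecutionContext): Monad[Future] = new Monad[Future] {
    def unit[A](a: A): Future[A] = Future.successful(a)
    def flatMap[A, B](fa: Future[A])(f: A => Future[B]): Future[B] = fa.flatMap(f)
  }
  implicit val boringMonad: Monad[Boring] = new Monad[Boring] {
    def unit[A](a: A): Boring[A] = Boring(a)
    def flatMap[A, B](fa: Boring[A])(f: A => Boring[B]): Boring[B] = f(fa.value)
  }
}

object Database {
  final case class Item(name: String)

  // Hypothetical stand-in for a real insert; pretends the new row id is the name's length.
  def insert(item: Item): Int = item.name.length

  // Accepts items in any context F that has a Monad instance, and answers in that same context.
  def save[F[_]](items: F[Item])(implicit M: Monad[F]): F[Int] = M.map(items)(insert)
}

object Callers {
  import Database._

  val many: List[Int] = save(List(Item("a"), Item("bb")))
  val maybe: Option[Int] = save(Option(Item("ccc")))
  val absent: Option[Int] = save(Option.empty[Item])
  val exactlyOne: Boring[Int] = save(Boring(Item("dddd")))
}
```

A caller holding a `Future[Item]` could use the same `save` once an ExecutionContext is in implicit scope. One wrinkle: I wrote `Option(...)` rather than `Some(...)` so the compiler infers `Option` as the context.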

It sounds crazy now, but hey, we used to manually box up our primitives in Java, too.
[0] I can think of a few ways to interpret "flatmap" as a reasonable name for the method, but it'll be more fun to let people comment.

[1] Haskellers call this flatmap method "bind." You can combine one within another without getting any deeper, and if you can combine two you can combine as many as you like: therefore, monads are composable.

[2] Haskellers call the single-value constructor "unit."

[3] Technically, you could accept any functor, I think.

[4] I totally made that up.

1 comment:

  1. To [3] I agree, as long as your db-save function only uses "map" to transform the result. This is exactly the functor abstraction.

    To [4] I add that your Boring monad of course exists and is the so-called Id monad.