Saturday, June 1, 2013

What FP taught me about OO: Liskov Substitution Principle explained

TL;DR - functional programming taught me that LSP is a special case of PLS: Principle of Least Surprise.

One thing that bugs me while reading Java: I'm reading along, come to a method call, ctrl-click on the method name, and get a list of implementations of the interface. IntelliJ can't tell me exactly what will run at this point in the code. Java has dynamic dispatch: it decides at runtime which implementation to run.

Functional style has a different concept to offer instead of Java's interface. It's called "typeclass," which is incredibly confusing. Please accept that this idea has little to do with "type" or "class."

While an interface says, "This class can perform this method," a typeclass says, "There exists something that can perform this method on this class" (using 'class' loosely). An interface is baked into a class, while a typeclass instance is provided separately.

With typeclasses, the method implementation isn't attached to the runtime class. Instead, the compiler digs up the typeclass instance, based on a declared type instead of a concrete runtime type. This isn't flexible in the same way as OO's dynamic dispatch, but it has the advantage that we can tell by looking at the code exactly which implementation of the method will execute.
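To make that concrete, here is a rough Java sketch of the typeclass idea. All names below are invented for illustration, not from any real library: the "instance" is an ordinary value supplied explicitly, so the implementation that runs is visible right at the call site.

```java
// Illustrative sketch only: a typeclass-style "Show" capability,
// defined separately from the data it describes.
interface Show<T> {
    String show(T value);
}

public class TypeclassSketch {
    // The "instances" live outside the classes they describe.
    static final Show<Integer> SHOW_INT = n -> "Int(" + n + ")";
    static final Show<String> SHOW_STRING = s -> "\"" + s + "\"";

    // The caller hands over the instance explicitly, so we can read off
    // exactly which implementation will run -- no runtime dispatch to chase.
    static <T> String describe(Show<T> instance, T value) {
        return instance.show(value);
    }

    public static void main(String[] args) {
        System.out.println(describe(SHOW_INT, 42));      // prints Int(42)
        System.out.println(describe(SHOW_STRING, "hi")); // prints "hi"
    }
}
```

In real Haskell or Scala the compiler finds the instance for you, but the resolution still happens at compile time, which is the point of the contrast above.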

This gives us better ability to read the code and figure out what will happen in all circumstances, without using the debugger. Functional programmers call this "reasoning about code."

In Java, how can we make statements about what the code will do when we don't know exactly what code will run? One answer is: the Liskov Substitution Principle. This OO design principle says: if you inherit from some other class or implement an interface, make sure your class meets all the expectations people have of the superclass or interface. The compiler enforces that our methods have the right signatures, while LSP demands that our implementations have the same invariants, postconditions, and other properties implied by the documentation of the superclass or interface.

For instance, if you create an implementation of List with a .length() method that sometimes returns -1, you violate the expectations people have for Lists. This surprises people. Surprises are bad. Surprises stop us from accurately predicting how code will execute.

Therefore, the Liskov Substitution Principle says, "Don't surprise people." If the documentation of List implies that length() will return a nonnegative integer, always do that. Also make sure that your List is 0-indexed like everyone else's. To implement an interface is to take on all the implicit and explicit promises made by that interface. That's LSP.
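Here is a deliberately broken sketch of that violation in Java (where the analogous List method is actually named size()). It also shows how the surprise cascades: AbstractCollection implements isEmpty() as size() == 0, so one broken promise silently breaks another method.

```java
import java.util.AbstractList;

// Deliberately broken example: a List whose size() violates the documented
// contract that it returns a nonnegative count.
class SurprisingList extends AbstractList<String> {
    @Override
    public String get(int index) {
        throw new IndexOutOfBoundsException("always empty");
    }

    @Override
    public int size() {
        return -1; // LSP violation: every caller of List assumes size() >= 0
    }
}

public class LspDemo {
    public static void main(String[] args) {
        SurprisingList list = new SurprisingList();
        // isEmpty() is inherited as size() == 0, so this empty list
        // now claims to be non-empty -- the surprise spreads.
        System.out.println(list.size());    // prints -1
        System.out.println(list.isEmpty()); // prints false
    }
}
```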

In Haskell or Scala, we can use typeclasses and be certain what the code will do at runtime. In Java, we can have the flexibility of dynamic dispatch and still have some knowledge of what will happen at runtime as long as we respect the Liskov Substitution Principle.


  1. You said:

    "[...] but it has the advantage that we can tell by looking at the code exactly which implementation of the method will execute."

    I'm not sure what example you had in mind, but the two approaches – interfaces vs. typeclasses – are more similar than they're different. If some function's signature depends on some typeclass, you can't tell which typeclass
    instance will be used. You'll have to see the call site of that function to find out that piece of information. Even then, you might need to traverse a few more layers until you reach the place where the actual type is being used.

    A simple example from Haskell:

    > :t print
    print :: Show a => a -> IO ()

    Will it print an Int or a String? You can't tell, because you don't see the call site. Same thing happens with interfaces.

    1. Great point. With typeclasses, you can know which method will be called _from a particular call site_. But if you're looking at a type, you can't know which instance will be called for it; that depends on what is in scope at each call site.
      Interfaces have the opposite property. A call site does not have all the information to determine an implementation. But for a given type... well, a given final type... you know which implementation will be called for objects of that type. That's a tradeoff.

      It's the call-site knowledge that I've wished for when reading code.

  2. +1 on the "more similar than they are different" - and I suspect the way you're explaining the ideas is kind of obscuring this (sorry).

    Also consider that there's nothing magic about type classes – e.g. we can (in FP at least) define equivalent things as records of functions and pass around/construct records for certain types as and when we need them. Haskell just adds a bit of convenience on top of this.
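    A sketch of that "record of functions" view in Java (names invented for illustration): an instance is just a value, so we can construct one for a new type whenever we need it, e.g. deriving equality on pairs from equality on the components.

    ```java
    import java.util.Map;

    // Illustrative only: a typeclass as a plain record of functions.
    interface Eq<T> {
        boolean eq(T a, T b);
    }

    public class DictionarySketch {
        static final Eq<Integer> EQ_INT = (a, b) -> a.intValue() == b.intValue();

        // Build an Eq for pairs out of Eqs for the components -- the instance
        // is constructed on demand rather than baked into any class.
        static <A, B> Eq<Map.Entry<A, B>> pairEq(Eq<A> ea, Eq<B> eb) {
            return (p, q) -> ea.eq(p.getKey(), q.getKey())
                          && eb.eq(p.getValue(), q.getValue());
        }

        public static void main(String[] args) {
            Eq<Map.Entry<Integer, Integer>> eq = pairEq(EQ_INT, EQ_INT);
            System.out.println(eq.eq(Map.entry(1, 2), Map.entry(1, 2))); // true
            System.out.println(eq.eq(Map.entry(1, 2), Map.entry(1, 3))); // false
        }
    }
    ```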

    Also, type classes don't prevent nonsense code, e.g. "show" could be defined to do the sensible thing for booleans and then return "foo" for Ints. So we still need something like the LSP - or, my preference, a more flexible type system - to avoid such silliness.

    Do you agree though, that LSP or more types means that you don't really need to know which code is running?

    1. That's what laws are for, and transitively, why people get so het up about lawless type-classes.

      A lawless type-class may be a very useful thing – certainly syntactically – but it doesn't do anything to prevent misuse, or _surprising_ use.

      This is a good intro on laws: