Thursday, March 1, 2012

Strong Typing in Java: a religious argument

Strongly-typed, type-inferred languages like F#, Scala, and Haskell make Java feel like its static typing system is half-assed. This is not Java's fault entirely; it's the way we use it.

Say we have a domain class like

public class User {

public User(String firstName, String lastName, String email, long balanceInDollars) {}

...
}

I'm in agreement with Stephan that using String and primitive types here is wimping out. That is weak typing. Weak! Chuck Norris does not approve!

firstName, lastName, and email are conceptually different. If each has its own type, then we get real compile-time type checking.

public class User {

public User(FirstName firstName, LastName lastName, EmailAddress email, Money balance) {}

...
}


This gives us type safety. As we pass a FirstName or an EmailAddress from class to class, we know what we're getting. If we mix up the order of the arguments, we hear about it from the compiler.
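To make the compile-time check concrete, here is a minimal sketch. The wrapper bodies, the field names `value` and `dollars`, and the placeholder email address are my assumptions for illustration; only the type names come from the constructor above.

```java
// Hypothetical minimal wrappers: the names follow the post, the bodies are assumed.
final class FirstName {
    final String value;
    FirstName(String value) { this.value = value; }
}

final class LastName {
    final String value;
    LastName(String value) { this.value = value; }
}

final class EmailAddress {
    final String value;
    EmailAddress(String value) { this.value = value; }
}

final class Money {
    final long dollars;
    Money(long dollars) { this.dollars = dollars; }
}

final class User {
    final FirstName firstName;
    final LastName lastName;
    final EmailAddress email;
    final Money balance;

    User(FirstName firstName, LastName lastName, EmailAddress email, Money balance) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
        this.balance = balance;
    }
}

class ArgumentOrderDemo {
    public static void main(String[] args) {
        // Correct order: compiles.
        User ok = new User(new FirstName("Joe"), new LastName("Frank"),
                           new EmailAddress("joe@example.com"), new Money(30));

        // Swapped first and last name: the compiler rejects this line,
        // because a LastName is not a FirstName.
        // User bad = new User(new LastName("Frank"), new FirstName("Joe"),
        //                     new EmailAddress("joe@example.com"), new Money(30));

        System.out.println(ok.firstName.value + " " + ok.lastName.value);
    }
}
```

With the all-String constructor from the first version, the swapped call would compile and the mistake would surface only at runtime, if at all.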

"But they're just strings!" you say. "Don't make a bunch of cruft - just call them what they are!"

NO! I say. They are not Strings. They are stored as strings. That, my friend, is an implementation detail. FirstName and EmailAddress represent distinct concepts, and they should have distinct types. The first part of wisdom is calling things by their right names.

There are other OO-style benefits from this, such as putting the validation for each type in its type class and changing the internal representation of the type without affecting its interface. Those may be significant in some situations, but in my argument they're icing. The benefit of strong typing is compile-time checking, and that's reason enough to call a FirstName a FirstName and not a vague String.
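As a sketch of that icing: validation can live in the constructor, and the stored representation can stay private. The validation rule below is deliberately simplistic and is my assumption, not a real email check; `asString` is a hypothetical accessor name.

```java
final class EmailAddress {
    // Private representation: callers never depend on it being a String.
    private final String value;

    EmailAddress(String value) {
        // Toy rule (an assumption for illustration): there must be an '@'
        // that is neither the first nor the last character.
        if (value == null
                || value.indexOf('@') <= 0
                || value.indexOf('@') == value.length() - 1) {
            throw new IllegalArgumentException("not an email address: " + value);
        }
        this.value = value;
    }

    // Hypothetical accessor for code that has to hand the address to a String-based API.
    String asString() {
        return value;
    }
}

class EmailAddressDemo {
    public static void main(String[] args) {
        System.out.println(new EmailAddress("joe@example.com").asString());
        try {
            new EmailAddress("not-an-email");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Once the check is in the constructor, every EmailAddress that exists has already passed it, so no other class needs to validate again.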

Now, let's address this "bunch of cruft" argument. No question, Java does not make this kind of type-wrapping pretty. In Haskell, it takes a one-line newtype declaration (a plain type alias would not give you a distinct type). By OO principles, we ought to be able to inherit from String to get its behavior. But noooo, this is Java, and String is final, so we wind up with

public class FirstName {
public final String stringValue;

public FirstName(final String value) {
this.stringValue = value;
}

public String toString() {...}
public boolean equals(Object other) {...}
public int hashCode() {...}

}

(notice that I used a *gasp* public field. That's another religious argument for another post. I include it here just to stir people up even further.)
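For reference, here is one way the elided methods might be filled in. This is a sketch, not the only reasonable implementation; the null check in the constructor is my addition.

```java
import java.util.Objects;

final class FirstName {
    public final String stringValue;

    FirstName(final String value) {
        // Assumption: a FirstName is never null. Drop this if null is meaningful to you.
        this.stringValue = Objects.requireNonNull(value, "value");
    }

    @Override
    public String toString() {
        return stringValue;
    }

    @Override
    public boolean equals(Object other) {
        // Only another FirstName with the same text is equal;
        // a raw String or a LastName with the same text is not.
        return other instanceof FirstName
                && stringValue.equals(((FirstName) other).stringValue);
    }

    @Override
    public int hashCode() {
        return stringValue.hashCode();
    }
}

class FirstNameDemo {
    public static void main(String[] args) {
        System.out.println(new FirstName("Joe").equals(new FirstName("Joe")));
        System.out.println(new FirstName("Joe").equals("Joe"));
    }
}
```

Value-based equals and hashCode are what let these wrappers be compared and used as map keys the way the underlying Strings could be.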

Then every time we want a user:

User u = new User(new FirstName("Joe"), new LastName("Frank"), new EmailAddress("..."), new Money(30));


We can get a little better by providing static methods and statically importing them:

(in FirstName)

public static FirstName FirstName(final String value) {
return new FirstName(value);
}


User u = new User(FirstName("Joe"), LastName("Frank"), EmailAddress("..."), Money(30));

That's a little better. But now we get into the part where User is an immutable value type, and we want to update one field.

User updated = new User(FirstName("Joseph"), old.getLastName(), old.getEmailAddress(), old.getBalance());

Ugh! And every time we add a new field to User, we have to modify this code, even though all we want is to copy the field across.
Copy constructors don't help when the type is immutable.

Let's talk about F# for a minute. F# has records:

type User = { firstName : FirstName; lastName : LastName ; email : EmailAddress; balance : Money }

and then the cool bit is, when you want another record that's almost the same:

let updated = { oldUser with firstName = FirstName("Joseph") }

I want to do this with my Java user. I want to say

User updated = oldUser.with(FirstName("Joseph"));

which is cool and all; with strong typing we can overload the "with" method all day long. We can chain them. We can add implementations of "with" for common combinations of fields to reduce instantiations.

(in User)

public User with(FirstName firstName) {
if (this.firstName.equals(firstName)) {
return this; // avoid instantiating an identical value object
}
return new User(firstName, this.lastName, this.email, this.balance);
}

Now you have a whole ton of "with" methods that can be chained. If you add a new field to User, you need to change all of them, but they're all in the same place so that's just fine. What changes together, stays together.
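Here is a compact, self-contained sketch of the overloaded "with" methods. The `StringValue` base class is a hypothetical shortcut of mine to keep the wrappers to one line each; it is not part of the design above, and neither are the field names or the placeholder email address.

```java
// Hypothetical shared base class so each String wrapper is a one-liner.
abstract class StringValue {
    final String value;

    StringValue(String value) { this.value = value; }

    @Override
    public boolean equals(Object o) {
        // Same concrete wrapper type and same text.
        return o != null && o.getClass() == getClass()
                && value.equals(((StringValue) o).value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }
}

final class FirstName extends StringValue { FirstName(String v) { super(v); } }
final class LastName extends StringValue { LastName(String v) { super(v); } }
final class EmailAddress extends StringValue { EmailAddress(String v) { super(v); } }

final class Money {
    final long dollars;
    Money(long dollars) { this.dollars = dollars; }
}

final class User {
    final FirstName firstName;
    final LastName lastName;
    final EmailAddress email;
    final Money balance;

    User(FirstName firstName, LastName lastName, EmailAddress email, Money balance) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
        this.balance = balance;
    }

    // One overload per field type; the argument type selects the field.
    User with(FirstName firstName) {
        if (this.firstName.equals(firstName)) {
            return this; // avoid instantiating an identical value object
        }
        return new User(firstName, lastName, email, balance);
    }

    User with(LastName lastName) {
        if (this.lastName.equals(lastName)) {
            return this;
        }
        return new User(firstName, lastName, email, balance);
    }

    User with(EmailAddress email) {
        if (this.email.equals(email)) {
            return this;
        }
        return new User(firstName, lastName, email, balance);
    }

    User with(Money balance) {
        return new User(firstName, lastName, email, balance);
    }
}

class WithDemo {
    public static void main(String[] args) {
        User old = new User(new FirstName("Joe"), new LastName("Frank"),
                            new EmailAddress("joe@example.com"), new Money(30));
        User updated = old.with(new FirstName("Joseph")).with(new LastName("Frankenfurter"));
        System.out.println(updated.firstName.value + " " + updated.lastName.value);
    }
}
```

Note that the overloading only works because each field has its own type. With four String fields, `with(String)` could not know which field you meant.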

Now, if you don't like an instantiation per changed field, or if you don't like all those "with" methods cluttering up your user class, here's another idea:

User updatedUser = new UserBuilder(oldUser).with(FirstName("Joseph")).with(LastName("Frankenfurter")).build();

where the UserBuilder keeps the oldUser and uses its values for any fields that aren't provided by the caller to build the new user. That's one instantiation, only one method that instantiates a user, and it's encapsulated into one builder class.
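A sketch of what such a builder might look like. For brevity it copies the old user's fields up front rather than holding on to the old user itself, which has the same effect; the wrapper bodies, field names, and placeholder email address are assumptions.

```java
// Hypothetical minimal wrappers so the example stands alone.
final class FirstName { final String value; FirstName(String v) { value = v; } }
final class LastName { final String value; LastName(String v) { value = v; } }
final class EmailAddress { final String value; EmailAddress(String v) { value = v; } }
final class Money { final long dollars; Money(long d) { dollars = d; } }

final class User {
    final FirstName firstName;
    final LastName lastName;
    final EmailAddress email;
    final Money balance;

    User(FirstName firstName, LastName lastName, EmailAddress email, Money balance) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
        this.balance = balance;
    }
}

final class UserBuilder {
    // Mutable scratch state lives here, not in User.
    private FirstName firstName;
    private LastName lastName;
    private EmailAddress email;
    private Money balance;

    // Start from an existing user; any field not overridden keeps its old value.
    UserBuilder(User original) {
        this.firstName = original.firstName;
        this.lastName = original.lastName;
        this.email = original.email;
        this.balance = original.balance;
    }

    UserBuilder with(FirstName firstName) { this.firstName = firstName; return this; }
    UserBuilder with(LastName lastName) { this.lastName = lastName; return this; }
    UserBuilder with(EmailAddress email) { this.email = email; return this; }
    UserBuilder with(Money balance) { this.balance = balance; return this; }

    // The only place a User is instantiated, once per chain.
    User build() {
        return new User(firstName, lastName, email, balance);
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        User oldUser = new User(new FirstName("Joe"), new LastName("Frank"),
                                new EmailAddress("joe@example.com"), new Money(30));
        User updatedUser = new UserBuilder(oldUser)
                .with(new FirstName("Joseph"))
                .with(new LastName("Frankenfurter"))
                .build();
        System.out.println(updatedUser.firstName.value + " " + updatedUser.lastName.value);
    }
}
```

Adding a field to User now means touching the User constructor and this one builder class, and nothing else.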

Some people may argue that immutable types and strong typing in Java go against the way the language is intended to be used, and therefore do nothing but make our lives more difficult. "That's why God gave us POJOs," they say. Java is a powerful language, and it is capable of supporting more idioms than the ones the language designers envisioned. We can grow as programmers within our language of choice. Java supports strong typing.

Strong typing gives us compiler errors on what would otherwise be caught only during testing. It creates some extra code, sure, but it's localized. It can make working with immutable types a little cleaner.

I say, never use "String!"

13 comments:

  1. Cool post. You could also argue that if you're in a place where the devs are progressive enough to even consider such an idea, you might as well just switch to Scala :-)

  2. Great post! Just yesterday I was contemplating a similar builder approach to constructing SQL statements out of string parts.

  3. It gives you more compile time safety at the cost of extra runtime work. I wish Java either had a typedef-like compile-time mechanism or a JIT strategy for rearranging data structures at runtime.

  4. In Scala, you can use "tagged types", which give you compile-time safety while using the underlying String or (boxed) primitive at runtime:

    http://etorreborre.blogspot.com/2011/11/practical-uses-for-unboxed-tagged-types.html

    You would have something like:

    type FirstName = String @@ FirstNameTag
    type LastName = String @@ LastNameTag
    type EMailAddress = String @@ EMailAddressTag
    type Money = Int @@ MoneyTag

    class User(
    val firstName:FirstName,
    val lastName:LastName,
    val emailAddress:EMailAddress,
    val money:Money
    )

    With some helpers to cast into the tagged types:

    def firstName(value:String) = value.asInstanceOf[FirstName];
    def lastName(value:String) = value.asInstanceOf[LastName];
    def emailAddress(value:String) = value.asInstanceOf[EMailAddress];
    def money(value:Int) = value.asInstanceOf[Money];

    val user = new User(
    firstName("Joe"),
    lastName("Bagofdonuts"),
    emailAddress("joebagofdonuts@foo.com"),
    money(100)
    )

    Implicit conversions could be used for the casting, but there would be trade-offs regarding type safety, since untagged types would get implicitly converted and you might not want that to happen.

    In Scala 2.9.1, there are some issues with using them in case classes, which would be preferable in the User example above. When those issues get ironed out, this could be an interesting technique for those who like stronger type safety.

    Replies
    1. Also, I think value classes from SIP-15 (http://docs.scala-lang.org/sips/pending/value-classes.html) will be interesting when they hit mainline. Value classes may deprecate the tagged type idiom (although I haven't thought through a comprehensive comparison).

  5. Tim, that's perfect, thank you! I was wondering how to best implement the same ideas in Scala.

    Nayan -- we are, indeed, considering switching to Scala.

  6. Some things:
    1) Avro uses builders and it makes for a really nice programming environment.
    2) I write posts which are 'this is good - that is bad' but I never really 100% agree with them (my own, or other people's). I like the domain object model but am loath to say 'never' use String. There are places where it is just going too far to define everything as its own object. You have actually done this; you do not use an abstract Name type, for example.
    3) Immutability is becoming more and more in line with the design of the JVM and so cannot be considered to be against the idiom of Java. If objects are immutable then there is a better chance the JVM will be able to remove them from the heap using escape analysis (for example).

    Replies
    1. Alexander - #2, I agree completely. Posts like this are an exploration of a principle. In real life, being aware of both extremes lets us choose intelligently where in the middle is optimal for each application. If correctness were paramount, I'd use strong typing all the time, but in real life, efficiency and simplicity matter too.
      #3, that's good news!

      Thank you for this input.

  7. I agree, all (or at least most) systems should be built without primitive types. But you don't need to switch to Scala for this; it can be done perfectly well using Java.

    The system I work on has approx. 240 entities and 340 immutable value-type classes (Java with JPA). The BigDecimal valuetypes handle rounding, scaling and precision issues. I can not add an 'Amount' to a 'Price' unless I explicitly convert, which is a good thing. I can not provide a 'Name' where a 'Reference' is required as they are different concepts (and have different lengths). I can write getVatPercentage().ofAmount(getInvoiceAmountExt()) because value type 'Percentage' has extra methods. I can validate a value type on instantiation so I only ever have valid values. I can make type hierarchies where 'SalesAmount' is-a 'Amount'. Etc. etc...

    The business logic is simpler and more expressive and the compiler will tell you when you mix up concepts.

    By using code-generation for entities and value-types the extra effort is minimal. Why are we not all doing this... I think for most systems the benefits outweigh the performance loss by far.

    Replies
    1. Marc, this is inspiring! Thank you.

  8. Could I request a Name class instead of a FirstName and LastName pair? It won't be long before someone wishes to store their MiddleName and SecondMiddleName!

    Replies
    1. Of course! As long as Name contains a FirstName and LastName, rather than Strings. :-)

  9. I've had this idea for a while; why not use the static type system to support data semantics? It's great to read someone else had the idea too.

    Spend some time with a dynamic language and you start to wonder what the benefit of the static type system really is, other than reducing the most gross of mistakes in a project with developers who don't communicate or pay attention.

    Since most classes aren't as much types as mere namespaces, why not give all domain data its own type, not just the aggregates?
