Friday, May 31, 2013

Twisting the rules of logic in our code

In philosophy, there are very few things that can't be doubted. The basic laws of logic are among them. There's one that seems completely obvious and indisputable to normal people:

Law of Identity: Everything is identical to itself.

In my audiobook, the professor is going on about how not only is this statement true in our world, it's true in every conceivable world. And I'm thinking, "Oh, I can conceive of such a world where this is false! I've created one!" The programmer can create worlds the philosopher cannot conceive of.

Take Hibernate for example, which tries to represent a database in Java objects. It's optimal to define object identity based on a set of fields, rather than creating a gratuitous sequential ID. But woe betide the programmer who includes in the set of identity fields one that can be updated! I change your last name, and bam, your entity is not identical to itself. Been there, done that, learned not to do it again. It takes rigor, knowledge, and discipline to use Hibernate without creating a world that violates this most basic law of logic.
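
Here's a minimal sketch of the trap in plain Java, with a hypothetical Person class standing in for the Hibernate entity (no annotations, same idea): identity is computed from a field that can later change.

import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for an entity whose identity includes a mutable field.
class Person {
  private final String firstName;
  private String lastName;  // mutable, yet part of equals() and hashCode() below

  Person(String firstName, String lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  void setLastName(String lastName) { this.lastName = lastName; }

  @Override public boolean equals(Object o) {
    if (!(o instanceof Person)) return false;
    Person p = (Person) o;
    return Objects.equals(firstName, p.firstName) && Objects.equals(lastName, p.lastName);
  }

  @Override public int hashCode() { return Objects.hash(firstName, lastName); }

  public static void main(String[] args) {
    Set<Person> people = new HashSet<>();
    Person me = new Person("Ada", "Lovelace");
    people.add(me);
    me.setLastName("Byron");                  // an identity field changes in place
    System.out.println(people.contains(me));  // false: the set can no longer find the very object it holds
  }
}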

Whaaa? Why is this even possible? This is exactly what functional programmers are talking about with "reasoning about code." FP is about languages or practices that make sure our programs conform to the laws of logic, so they don't up and surprise us with objects that are not identical to themselves.

In Java, we can violate the Law of Identity when we define .hashCode() poorly and then stick objects in a HashMap. One key goes in, a property referenced in the key's .hashCode() changes, and then we try to get that entry out - no luck. It is not identical to itself. Things also get nutty when .equals() and .hashCode() are not dependent on the same fields. In Java, we have to be careful to create objects that follow the Law of Identity.
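
And a sketch of the mismatched-fields flavor, with a hypothetical BadKey class: equals() looks at one field, hashCode() at another, so on a typical HashMap two "equal" keys never find each other.

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key whose equals() and hashCode() disagree about which fields matter.
class BadKey {
  final String id;
  final String label;

  BadKey(String id, String label) { this.id = id; this.label = label; }

  @Override public boolean equals(Object o) {
    return o instanceof BadKey && Objects.equals(id, ((BadKey) o).id);  // identity by id...
  }

  @Override public int hashCode() { return Objects.hash(label); }       // ...but hashed by label

  public static void main(String[] args) {
    Map<BadKey, String> map = new HashMap<>();
    map.put(new BadKey("42", "first"), "hello");
    // An "equal" key with a different label lands under a different hash, so the lookup misses.
    System.out.println(map.get(new BadKey("42", "second")));  // null
  }
}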

A Haskell programmer says, ridiculous! Why would we do this to ourselves?

After the Law of Identity, there are two laws attributed to Leibniz. The first one says identical objects have the same properties. This is a little harder to screw up in Java, unless:

class TrickyBugger {
  // The "same" object reports a different value for this property every time you ask.
  public double getHappiness() { return Math.random(); }
}

The second form of Leibniz's law says if two things have the same properties, then they are identical. In reality and philosophy this law is controversial. Yet, when it's true, it's useful. In programming we can choose to keep this one true.

Enter the Algebraic Data Type. This is every type in Haskell, and the most useful sort of data type in Scala. A class is an algebraic data type if, when two instances have identical properties, they are considered identical. (All the properties are immutable.) In OO, the Flyweight pattern is an example of this. The concept of ADTs means that even if two instances live at different places in memory, when they have all the same properties, they might as well be the same instance for all we care.
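
Java eventually grew a feature that captures this discipline directly: records (Java 16, well after this post was written). Every component is immutable, and equals() and hashCode() are generated from all of them. A tiny sketch:

// Two Points with the same coordinates are the same point, for all we care.
record Point(int x, int y) { }

class AdtDemo {
  public static void main(String[] args) {
    Point a = new Point(1, 2);
    Point b = new Point(1, 2);
    System.out.println(a == b);       // false: different places in memory
    System.out.println(a.equals(b));  // true: same properties, same identity
  }
}

In 2013 terms, a final class with final fields and equals()/hashCode() computed over all of them does the same job; it's just more typing.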

This is useful because it keeps things simple in my head. I don't have to ask myself "should I use == or .equals in this comparison?" It's always .equals(), because properties determine identity. Back in Hibernate, this is the concept I needed. Keys should be ADTs, so naturally their fields will never update.

Programming is supposed to be the epitome of logical work. I went into programming because physics wasn't deterministic enough. Yet, here we are, in super-popular languages and frameworks, violating the most basic laws of logical inference! Programming is deterministic, and we can choose to create modules that conform to the logical principles our brain expects. Let's do that.

Wednesday, May 22, 2013

The Silver Pill

There is no silver bullet. What if there is a silver pill?

The silver pill is not a single change that rockets our productivity. It is a change in the rate of change.
There are two outputs of everything we write: some code, and a new version of ourselves. If we stop thinking of our product as the code, and focus also on improving ourselves with everything we write, then we increase our own productivity in all future code. Then our abilities grow with compound interest.

The other day, I asserted that our code should be concrete, because it is more clear and maintainable. Daniel Spiewak argued, abstract early! This policy has benefited him: once he has formed the abstraction, the next time a seemingly disparate requirement comes up that boils down to the same abstraction, he can tell immediately, without experimentation, what problems lurk inside it.
He was right, because what we do in ourselves is more valuable than what we do in the code. So what if that carefully abstracted code gets deleted two days later? The patterns created in the brain pay off for the rest of his life. And he can build to higher-level abstractions he'd never reach without that investment.

I've lived this payoff in another way: when I start a job, I'm less productive than other new developers are for 2-4 months. They want to jump right in and be productive. They're focused on their current code output. I want to understand the system, so I ask a ton of questions and dig around in the code to find the root cause of problems. This makes me slower at first, but by 6 months in, I'm one of the most productive people on the whole team, and still improving. The code we write pays off today, but learning pays off every day for the rest of our career.

It's the difference between building wheels, and building a machine that can make wheels. When we keep improving the builder of the machine, then production accelerates. From position to velocity to acceleration: raise the second derivative and the limit is infinity.

Trivial example: today, git merge came back with a pile of conflicts. I flipped through git documentation and asked a friend, learning about git's concepts of file status. This cost twenty minutes today, and it makes all future dealings with merge conflicts a bit easier. Now I know that git status -s will give me a grep-friendly summary.

Daniel is right -- spending time on code that never deploys to production is wasteful only if we learn nothing while writing it. The silver pill is: time spent coding is wasted if we learn nothing from it. The return value of our day is the self we become for the next day, while code is a handy side effect.

Monday, May 13, 2013

Two Models of Computation: or, Why I'm Switching Trains


"In cognitive science, we only use the lambda-calculus model of computation," says Dr Philipp Koralus (Phd, Philosophy & Neuroscience, on his way to be a lecturer at Oxford). "We want to talk about what the system is doing, and abstract the how."

Two models of computation: lambda calculus and the Turing machine. Teacher and student, Church and Turing.
Years later, engineering catches up and we start building computers.
John McCarthy favored an encoding of the lambda calculus. John Von Neumann built a Turing machine, with a separation of processing and data. "Von Neumann won because he was smarter," says Barbara Liskov. Von Neumann's particular type of intelligence was a huge working memory: he could hold a crapton of state in his head and manipulate it. Practical computation took the Turing machine route, stateful and imperative.

Von Neumann's intelligence is not reproducible. We looked for ways that normal people could write and maintain programs in this architecture. We abstracted to assembly, then C, then object-orientation. All abstractions that let us partition the program into small enough bits that each bit can fit in our heads. We created some design principles to add enough formal structure that, with great effort, we could port some pieces of solutions to other problems.

Meanwhile, lambda calculus and functional languages fell to academia. Now they're coming back. Now memory and processors are plentiful enough that we can afford to force a different computational model onto Von Neumann's architecture. But why would we?

What's the difference that makes the functional model of computation more useful to theoretical cognitive scientists, who are modeling the reasoning method of the human mind? Philipp says it's the semantics and the systematicity. "You can't get arbitrary novelty and recombination without a formal system."

But what about reuse in imperative languages? We do manage this. "You can get similarity in templates," says Philipp. We can have design patterns, frameworks, and libraries. We can attend conferences and learn the tools people have crafted so we can recognize when our problem might mostly fit. But for arbitrary novelty, we have to get down to the language and code it ourselves. With our fingers.

I used to think that FP concepts like monads and monoids were like OO design patterns, only more abstract. Now I see that they are fundamentally different. If code can be represented in a formal symbolic notation, then any piece can be broken away and used differently in a different place. Like you can pull any subexpression out of a mathematical equation and think about it on its own.

It's the difference between a neural network, which can learn to recognize faces, and a symbolic representation of faces that says, "This is a nose. This is an eye. This is the distance between the nose and eye." The symbolic system gives us meaningful dimensions we can use independently and learn from. The neural network tells us "You" or "Not you."

Then, composition. In OO our idea of composition is a has-a relationship. Yet, the owner object must be written to accommodate and pass messages through to its components. Contrast this with functional composition, which works like this:

find . -name '*.txt' | xargs cat | grep 'ERROR' | cut -d ':' -f 2 | sort | uniq -c

Each symbol here ("find", "cat", "grep", "sort") has an independent meaning and is useful alone. Functional composition is a fits-together relationship. Neither part knows anything about the others. We piece them together to serve arbitrary purposes. Best of all, we don't have to be Von Neumann to understand any given piece or conjunction.
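
The same fits-together style shows up in Java once functions become values (java.util.function arrived with Java 8, shortly after this post). Each piece below means something on its own; andThen() just plugs them together, like the pipes above. A minimal sketch:

import java.util.function.Function;

class ComposeDemo {
  public static void main(String[] args) {
    Function<String, String> trim = String::trim;
    Function<String, String> upper = String::toUpperCase;
    Function<String, Integer> length = String::length;

    // Neither function knows about the others; composition is just plumbing.
    Function<String, Integer> pipeline = trim.andThen(upper).andThen(length);

    System.out.println(pipeline.apply("  error  "));  // 5
  }
}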

Now that the field of software development recognizes the value of a more "tell us what it does" declarative and recombinable-without-template-matching computational model, some of us are struggling. We grew up learning to think like Von Neumann.
I used to say at interviews that my best skill was holding the whole system in my head. Now I recognize that this was a crappy way to reason. It doesn't scale, and I can't pass that understanding on to others as a whole. The wiser goal is to eliminate a need to load everything into one head. It's going to be a tough transition from the relatively-informal structure I'm used to, but now I'm sure it's worth it. I only hope I can take some of you with me.