Why is code so much easier than people?

Taking computers as the universe, where hardware determines the laws of nature and binary data are the basic substrate, we can understand the world bottom-up.

We know at each level of abstraction, at each size of particle, what rules the components follow. Binary composes to assembler, which follows rules according to the CPU’s construction. Assembler composes to higher level code. Code follows rules which, if we get them just right, begin to produce output more complex than the simplicity of the rules suggests.

In the natural world our investigations take us in the opposite direction. From people to cells to proteins to DNA and molecules to atoms to neutrons to quarks… at each level, the components interact with each other according to rules. We are working to learn those rules.

Working up in computing, working down in biology and chemistry and physics: these two studies are complementary. We have two goals:

1) define rules for computers that result in behaviour as complex, interesting, meaningful, and productive as life

2) figure out the rules of the basic components of the natural universe, to lower and lower levels

The awesome part is that we can use computers to model the components of the natural universe. Often it takes millions of small components acting on similar rules to achieve complexity that defies the simplicity of the ruleset. To defy entropy and produce organization where there was none.

As our level of abstraction in computing rises, and our models of the natural world grow more precise, the two come together. This is an exciting time.

Amoeba Code

If organisms are programs, then DNA is our source code. Some people attempt to measure the complexity of an organism by the number of base pairs in its DNA.

For instance, the human genome is 250x longer than the yeast genome. Makes sense.

But then, the amoeba genome is 225x longer than the human genome[1]. Does this mean amoebas are the real intelligent life on the planet?

The complexity of the code does not determine the complexity of the results. 

We’ve all seen extremely complex code that performs a simple task. This is like the amoeba genome.

Our goal as developers is to keep our code simple. As in NKS (Stephen Wolfram’s A New Kind of Science), simple rules can lead to very complex output. This is the holy grail: finding a new way to look at problems, so that the solution is simpler than you’d expect from the meaningfulness of the output.
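
To make the simple-rules point concrete, here is a minimal Java sketch of an elementary cellular automaton running Rule 30, one of the rules Wolfram studies. The class and method names (ElementaryAutomaton, step, render, run) are invented for illustration, and treating cells beyond the edges as dead is an assumption made to keep the sketch short.

```java
/** Hypothetical sketch of a one-dimensional, two-state cellular automaton. */
final class ElementaryAutomaton {
    private ElementaryAutomaton() {}

    /**
     * Computes the next row for a Wolfram rule number (0-255).
     * Assumption: cells past either edge are treated as dead.
     */
    static boolean[] step(boolean[] cells, int rule) {
        boolean[] next = new boolean[cells.length];
        for (int i = 0; i < cells.length; i++) {
            int left = (i > 0 && cells[i - 1]) ? 4 : 0;
            int center = cells[i] ? 2 : 0;
            int right = (i < cells.length - 1 && cells[i + 1]) ? 1 : 0;
            // The three neighbours form a number 0-7; that bit of the rule is the new state.
            next[i] = ((rule >> (left | center | right)) & 1) == 1;
        }
        return next;
    }

    /** Renders live cells as '#' and dead cells as '.'. */
    static String render(boolean[] cells) {
        StringBuilder sb = new StringBuilder();
        for (boolean cell : cells) {
            sb.append(cell ? '#' : '.');
        }
        return sb.toString();
    }

    /** Starts from a single live cell in the middle and returns one line per generation. */
    static String run(int rule, int width, int generations) {
        boolean[] cells = new boolean[width];
        cells[width / 2] = true;
        StringBuilder out = new StringBuilder();
        for (int g = 0; g < generations; g++) {
            out.append(render(cells)).append('\n');
            cells = step(cells, rule);
        }
        return out.toString();
    }
}
```

Calling ElementaryAutomaton.run(30, 7, 3) gives the rows ...#..., ..###.., and .##..#. in that order. Widen the grid and run more generations, and the pattern stops looking predictable, even though the whole ruleset fits in a single byte.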

Don’t write the amoeba genome. Learn from the yeast, and maybe someday we’ll build a human.

[1] Complexity, A Guided Tour, by Melanie Mitchell, chapter 7

Victory through failure

The status code you want to get from Microsoft: ERROR_SUCCESS. Or the next best, ERROR_SUCCESS_REBOOT_INITIATED.

Often in life our biggest victories, the best surprises, come out of failure to get what we thought we wanted.

And the best successes of all cause a reboot – a re-evaluation of what’s important, as we learn whole new parts of ourselves. New drivers installed. Reboot to integrate new hardware, new connections in our brains and our relationships.

Control limits potential

With centralized control, we’re limited to what we can hold in our heads.

A centralized economy doesn’t work beyond a few hundred people. A free-agent-based economy can build far more at larger scales.

A project controlled and understood by one person can only be so complex. A community of individuals can build something immensely greater – not predictable, much higher potential.

Inspired by:

Tom DeMarco: “Implicit in [‘You can’t control what you can’t measure’] … is that control is an important aspect, maybe the most important, of any software project. But it isn’t.”

Nietzsche: “Why should we not prefer untruth? And uncertainty?”

and complexity theory. 

Don Quixote was an Enterprise Architect

We need to invent a word:

?: (n) The goal that you aim for, not because you expect to hit it, but because aiming for it points you in the right direction.

Julian Browne in The Death of Architecture accepts that while an Enterprise Architecture will never be achieved, it can direct our efforts: “Great ideas and plans are what inspires and guides people to focus on making those tactical changes well.”

Having a plan is good. Sticking religiously to the plan is bad. The business is constantly changing, and so the architecture should align with the current status and the latest goals. Architecture is a map, but we operate in Street View. The map is always out of date.

So it is in life. As people we are constantly changing. There are objectives to shoot for, some concrete and achievable, others unrealistic but worth trying. Aim for enlightenment, if you like, but redefine what enlightenment means with each learning. Embrace change, and direct it by choosing where to focus your attention. If our goal is readable code, we’re never going to transform our entire legacy codebase to our current coding standards. Our coding standards will have changed by then. Instead, as we modify pieces of the code, we make these parts the best they can be according to the direction we’ve set.

Aim for the mountaintop, but recalibrate the location of that top. Appreciate each foot of ascent, as the good-software mountain is constantly shifting, and each improvement — each refactor or new test or simplification — counts as progress. An architecture achieved is an architecture out of date – touch the sky, find that it is made of paper, tear through it and aim again for the highest thing you can see.

What makes a functional programmer?

Michael O’Church has a lot to say about the functional programming community. His post, Functional Programming is a Ghetto (in the “isolated, exclusive neighborhood” sense of ghetto) contains some great descriptions of what goes through a programmer’s head after he or she learns to think in a functional style. The following is a paraphrase of his points, pointing out that the best functional programmers aren’t religious about it.

“What real functional programmers do is ‘multi-paradigm’– mostly functional, but with imperative techniques used when appropriate.” Writing to a database or to the console or calling services is what makes an application useful. Instead of eschewing these, we try to make all dependencies and influence on the environment localized and explicit.

The difference between imperative and functional thinking is “what should be the primary, default ‘building block’ of a program. To a functional programmer, it’s a referentially-transparent (i.e. returning the same output every time per input, like a mathematical function) function. In imperative programming, it’s a stateful action.”

Imperative thinking treats every line in the code as an action, as doing something. Functional thinking treats every line as calculating something, and very specific lines as performing an action with external impact (database access, external API calls, etc). “Immutable data and referentially transparent functions should be the default, except in special cases where something else is clearly more appropriate.”

The result of isolating those external effects is that more code is easily testable. More code can be properly unit-tested, while the specific code that interacts with the outside world can only be integration tested: “One needs to be able to know (and usually, to control) the environment in which the action occurs in order to know if it’s being done right” in imperative code. Those interactions are the failure points in our application. Don’t bury them under mounds of indirection or scatter them throughout your code.

In sum, “we don’t always write stateless programs, but we aim for referential transparency or for obvious state effects.”
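
As a rough illustration of that default, here is a small Java sketch. The invoice scenario and the names (InvoiceTotals, totalCents, report) are hypothetical and do not come from O’Church’s post; the point is only that the calculation is referentially transparent while the one side effect is passed in explicitly.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical example: pure calculation by default, one clearly marked effect. */
final class InvoiceTotals {
    private InvoiceTotals() {}

    /** Pure: the same arguments always give the same result, and nothing outside is touched. */
    static long totalCents(List<Long> lineItemCents, int taxPercent) {
        long subtotal = 0;
        for (long cents : lineItemCents) {
            subtotal += cents;
        }
        // Integer arithmetic; tax is rounded down to the nearest cent.
        return subtotal + subtotal * taxPercent / 100;
    }

    /**
     * The effect lives here and nowhere else. The sink (console, database writer,
     * service call) is an explicit parameter, so the dependency is visible at the call site.
     */
    static void report(List<Long> lineItemCents, int taxPercent, Consumer<String> sink) {
        sink.accept("Total due: " + totalCents(lineItemCents, taxPercent) + " cents");
    }
}
```

totalCents can be unit-tested with nothing but values in and a value out. report needs a sink, but a test can hand it one that collects strings, which leaves the real console or database as the only piece requiring an integration test.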

Without the Ghetto-slang of referential transparency, partial application, and reasoning about code, functional thinking boils down to: don’t change shit until you have to, and when you have to, call it out.

Database versioning, Android style

Question: how can we track database schema upgrades? How can we make sure our database structure matches the deployed code?

One answer: Android’s SQLite support solves this problem by storing a version number in a database property (SQLite’s user_version). When an application opens a connection, the version number in the code is checked against the version number in the database. If they don’t match, Android calls a hook to let the application update the schema.

For our web app purposes, we used the idea of storing the version number in the database. We threw it in a table. Our upgrade scripts are separate from the code, but they still do the job of converting data, applying DDL changes, and finally increasing the version number in the database.

An application should fail early if the database is out of date. For this, we created a ConnectionProvider that validates the version number when a database connection is established. Thus, there is a version stored in the database and a version hard-coded into the application. If we forget to upgrade the dev or test database before deploying the corresponding code, we find out on startup.
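
A minimal sketch of that startup check in Java. This is not the original ConnectionProvider; the names (SchemaVersionGuard, EXPECTED_SCHEMA_VERSION) and the version value 7 are hypothetical, and the database lookup is passed in as an IntSupplier so the sketch does not assume any particular driver, table, or column name.

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch: refuse to run against a schema the code was not written for. */
final class SchemaVersionGuard {
    /** Assumed value; in practice it is bumped together with each upgrade script. */
    static final int EXPECTED_SCHEMA_VERSION = 7;

    private SchemaVersionGuard() {}

    /**
     * Compares the version hard-coded in the application with the one stored in the database.
     * storedVersion stands in for whatever query reads the version table.
     */
    static void validate(int expectedVersion, IntSupplier storedVersion) {
        int actualVersion = storedVersion.getAsInt();
        if (actualVersion != expectedVersion) {
            throw new IllegalStateException(
                "Database schema is at version " + actualVersion
                    + " but this build expects version " + expectedVersion);
        }
    }

    /** Convenience overload that uses the version compiled into the application. */
    static void validate(IntSupplier storedVersion) {
        validate(EXPECTED_SCHEMA_VERSION, storedVersion);
    }
}
```

A connection provider would call validate once, right after the first connection is established, so a stale dev or test database stops the application at startup instead of failing later on some query.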

Programming experience in a different environment made our lives easier that day. Design was simple because we were aware of a pattern that worked in one architecture. It adapted well to ours.

Using checkstyle in Gradle

The checkstyle plugin brings code quality checks into the gradle build. This way, if team members are using disparate IDEs or have different settings, consistent coding standards are enforced in the build.

Here’s how to add checkstyle to your Java build in two steps:
1) In build.gradle, add this:

    apply plugin: 'checkstyle'

    checkstyle {
       configFile = new File(rootDir, "checkstyle.xml")
    }

The configFile setting is not necessary if you use the default location, which is config/checkstyle/checkstyle.xml. More configuration options are listed in the gradle userguide.


If yours is a multiproject build, put that configuration in a subprojects block in the parent build.gradle. This uses the parent’s rootDir/checkstyle.xml so the checkstyle configuration is consistent between projects.

2) Create checkstyle.xml. For reasonable default settings, google it and steal someone else’s. I took mine from a google-api repo and stripped it down. Here’s a really basic example, which checks only for tab characters and unused imports:


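<!DOCTYPE module PUBLIC
    "-//Puppy Crawl//DTD Check Configuration 1.3//EN"
    "http://www.puppycrawl.com/dtds/configuration_1_3.dtd">

<module name="Checker">
    <module name="FileTabCharacter"/>
    <module name="TreeWalker">
        <module name="UnusedImports"/>
    </module>
</module>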

Visit the checkstyle site to find a million more options for checks.

When a git branch goes bad

Want to merge some good code into the master branch, but stymied by code you don’t care about causing a bunch of conflicts?
The other day, some crap got committed to master and pushed, accidentally. It didn’t belong. Since it was pushed to origin, we don’t want to change history*. When it was time to do a git-flow release, a bunch of conflicts stymied the merge from release branch into master. I wanted to say, “Merge this branch into master, but take all the code from the branch; don’t worry about what’s currently in master.”

There’s a merge strategy for merging in a branch while ignoring the changes in that branch: the “ours” merge strategy. But there’s no “theirs” merge strategy for choosing all the code from the branch. Here is one way to accomplish this:

  1. Start the merge

if using git-flow to do a release: git flow release finish versionName
OR
general case of merging something into master (or whatever branch you like; master is an example):

git checkout master
git merge branchWithGoodCode

This leaves the merge open, with conflicts. git status shows the files successfully merged (these are in the index) and the files with conflicts (not yet in the index). While the merge is in progress, git is in a special state, something like the below diagram. The objective is to get all the right files into the index and then commit, which will complete the merge.

  2. Get all the code from the branch

git checkout MERGE_HEAD -- .

Here, MERGE_HEAD means “the tip of the branch we’re trying to merge in.” MERGE_HEAD is a ref (pointer to a commit) that exists while a merge is in progress. This command pulls the files from there into the working directory and index.
The “.” at the end is important: when a path is provided as the last argument to git checkout, git updates the files without changing your current branch. git checkout with no path will switch to that branch. (The -- is optional, but it makes that . harder to miss.)

  3. Commit to finish the merge

git commit -m "Merge branchWithGoodCode, taking all files from branchWithGoodCode"
Hurray, now the tree looks like this:

* If the commits that I don’t care about existed only locally, not on origin, then I could wipe those commits from the history entirely.
git checkout master
git reset --hard goodBranch
The reset says “take my current branch and move it to point at the same commit as goodBranch.” (Keep in mind that a branch is nothing but a label, a pointer to a commit.) The “--hard” says “and while you’re at it, replace everything in my working directory with the goodBranch code.” The commits that were only on master are gone from the tree, and eventually forgotten.