People aren’t born knowing what to do in all situations. We learn based on what does go wrong, based on the contingencies we do encounter. Gradually.

A chess program must have all situations programmed ahead of time, an algorithm for everything physically possible. A human player encounters a situation, then finds a solution – maybe not in time for this game, but for the next.

Humans add error handling only after we encounter the error. We can do that with our programs, too. With good logging and error notification, we can teach the program how to deal with what it’s proven is a real situation.

Then our programs can be like small children: we don’t teach them how to deal with everything. We teach them how to ask for help. In a program, include visibility and monitoring from the start, and add error handling later.
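The “ask for help” pattern can be sketched in a few lines of Clojure. Everything here is illustrative – `notify-human!` is a placeholder you’d wire to email, chat, or a pager, not a real library:

```clojure
;; A minimal sketch of "teach the program to ask for help."
;; notify-human! is a stand-in -- wire it to your real alerting.
(defn notify-human! [context e]
  (println "HELP NEEDED in" context ":" (.getMessage e)))

;; Wrap work we don't yet know how to recover from: log it, notify
;; a person, and re-throw. Specific handlers get added only after an
;; error proves itself to be a real situation.
(defn with-monitoring [context f]
  (try
    (f)
    (catch Exception e
      (notify-human! context e)
      (throw e))))
```

The point is the shape, not the code: the catch-all doesn’t pretend to handle anything. It makes the failure visible and passes it up.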

A victory for abstraction, re-use, and small libraries

The other day at Outpace, while breaking some coupling, Eli and I decided to retain some information from one run of our program to another. We need to bookmark how far we read in each input data table. How can we persist this small piece of data?

Let’s put it in a file. Sure, that’ll work.[1] 

Next step, make an abstraction. Each of three configurations needs its own “how to read the bookmark” and “how to write the bookmark.”[2] What can we name it?

After some discussion we notice this is basically a Clojure atom – a “place” to store data that can change – except persistent between runs.
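A sketch of the shape we settled on (the map of `:read-fn`/`:write-fn` described in footnote [2]). The `file-bookmark` name and the file format are illustrative, not from our codebase:

```clojure
;; Hypothetical sketch: a "bookmark" is a map holding a reader and a
;; writer. Each of the three configurations would build its own pair.
(defn file-bookmark [path]
  {:read-fn  (fn []
               (let [f (java.io.File. path)]
                 (when (.exists f)
                   (read-string (slurp f)))))
   :write-fn (fn [position]
               (spit path (pr-str position)))})

;; Each part of the program destructures out only the function it needs.
(let [{:keys [read-fn write-fn]} (file-bookmark "/tmp/bookmark.edn")]
  (write-fn 1234)
  (read-fn))
```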

Eli googles “clojure persist atom to disk” and bam! He finds a library. Enduro, by @alandipert. Persistent atoms for Clojure, backed by a file or Postgres. Complete with an in-memory implementation for testing. And thread safety, which we would not have bothered with. Hey, come to think of it, Postgres is a better place to store our bookmarks.
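From memory of Enduro’s README – check the project itself for current names and coordinates – usage looks roughly like this:

```clojure
;; Sketch of Enduro usage as I recall it from the README; verify
;; against the project before relying on these names.
(require '[alandipert.enduro :as e])

;; A file-backed atom: the familiar swap!/deref interface, except the
;; value is written through to disk and survives restarts.
(def bookmarks (e/file-atom {} "/tmp/bookmarks.clj"))

(e/swap! bookmarks assoc :input-table-a 1234)

@bookmarks
```

Same abstraction we were groping toward, plus the thread safety and Postgres backing we hadn’t thought to want.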

From a need to an abstraction to an existing implementation! With better ideas! Win!

Enduro has no commits in the last year, but who cares? When a library is small enough, it can reach feature-completeness. For a solid abstraction, there is such a thing as “done.”

Now, it happens that the library isn’t as complete as we hoped. There are no tests for the Postgres implementation. The release! method mentioned in the README doesn’t exist.

But hey, we can add these to the library faster and with less risk than implementing it all ourselves. Alan’s design is better than ours. Building on a solid foundation from an expert is more satisfying than building from scratch. And with pull requests, everybody wins!

This is re-use at its best. We paused to concentrate on abstraction before implementation, and it paid off.

[1] If something happens to the file, our program will require a command-line argument to tell it where to start.

[2] In OO, I’d put that in an object, implementing two single-method interfaces for ISP, since each function is needed in a different part of the program. In Clojure, I’m more inclined to create a pair of functions. Without types, though, it’s hard to see the meaning of the two disparate elements of the pair. The best we come up with is JavaScript-object-style: a map containing :read-fn and :write-fn. At least that gives them names.

REST as a debugging strategy

In REST there’s this rule: don’t save low-level links. Instead, start from the top and navigate the returned hyperlinks, as they may have changed. Detailed knowledge is transitory.
This same philosophy helps in daily programming work.

Say a bug report comes in: “Data is missing from this report.” My pair is more familiar with the reporting system. They say, “That report runs on machine X, so let’s log in to X and look at the logs.”

I say, “Wait. What determines which machine a report runs on? How could I figure this out myself?” and “Are all log files in the same place? How do we know?”

The business isn’t in a panic about this report, so we can take a little extra time to do knowledge transfer during the debugging. Hopefully my pair is patient with my high-level questions.

I want to start from sources of information I can always access. Deployment configuration, the AWS console, etc. Gather the context outside-in. Then I can investigate bugs like this alone in the future. And not only for this report, but any report.

“How can we ascertain which database it connected to? How can I find out how to access that database?”
“How can I find the right source repository? Which script runs it, with which command-line options? What runs that script?”

Perhaps the path is:
– deployment configuration determines which machine, and what repository is deployed
– cron configuration runs a script
– that script opens a configuration file, which contains the exact command run
– database connection parameters come from a service call, which I can make too
– log files are in a company-standard location
– source code reveals the rest.

This is top-down navigation from original sources to specific details. It is tempting to skip ahead, and if both of us already knew the whole path and had confidence nothing changed since last week, we might skip into the dirty details, go right to the log file and database. If that didn’t solve the mystery, we’d step back and trace from the top, verifying assumptions, looking for surprises. Even when we “know” the full context, tracing deployment and execution top-down helps us pin down problems.

Debugging strategy that starts from the top is re-usable: solve many bugs, not just this one. It is stateless: not dependent on environmental assumptions that may have changed when we weren’t looking.

REST as more than a service architecture. REST as a work philosophy.