Saturday, August 11, 2012

The opposite of simple is not complex

Studying biology or economics, one finds organisms, ecosystems, and economies that are more than the sum of their parts. Somehow many interacting agents with limited information produce increasing organization, creating amazing complexity out of relatively simple components.
In computing, if we want to harvest this potential for surprise and see results this interesting, we have to write complex systems.

This doesn't mean we want to write complicated systems.

Wait, isn't that contradictory? Complex systems that are not complicated?
Ah, but that is the whole point of complexity theory. 

Herbert Simon described a complex system as being composed of hierarchy and nearly decomposable parts.[1] Just as an economy is composed of many humans making decisions independently, a piece of software can be composed of many parts that can't see inside each other. This preserves the potential for producing complex behavior, solving a complex problem. Yet, each part within the system may provide a simple abstraction.

The difference between complex and complicated - or, as Rich Hickey calls it, complected - is the intertwining of the modules in the system. An application may consist of many uniform modules, each of which knows about the inner workings of the others -- complicated. Or it may consist of even more modules, each of which interacts only with the abstractions exposed by the others -- simpler, and yet with potentially complex results.
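A minimal sketch of the simpler style in Python (the `Inventory` abstraction, `MemoryInventory`, and `can_ship` are all invented for illustration): one module publishes an abstraction, another consumes it, and neither sees the other's internals.

```python
from abc import ABC, abstractmethod

# The abstraction: callers know only this contract, never the internals.
class Inventory(ABC):
    @abstractmethod
    def in_stock(self, sku: str) -> bool: ...

# One team's implementation: a simple in-memory dict.
class MemoryInventory(Inventory):
    def __init__(self, counts):
        self._counts = counts          # private: no other module peeks here
    def in_stock(self, sku):
        return self._counts.get(sku, 0) > 0

# Another module interacts only with the abstraction, saying in effect
# "I don't know how you do what you do, and I don't care."
def can_ship(order_skus, inventory: Inventory) -> bool:
    return all(inventory.in_stock(sku) for sku in order_skus)

inv = MemoryInventory({"book-42": 3, "lamp-7": 0})
print(can_ship(["book-42"], inv))   # True
print(can_ship(["lamp-7"], inv))    # False
```

Swapping `MemoryInventory` for a database-backed or remote implementation changes nothing in `can_ship`; the modules stay independent.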

The key is breaking the software into independent modules, each of which says to the others "I don't know how you do what you do, and I don't care." Each module or component can be developed by a separate team with different coding standards or in a different language. Uniformity is compromised *gasp*. No single architect understands all parts of the system. Strict top-down, God-like orchestration is sacrificed. Instead, component interactions are abstract. Teams interact at higher levels, without knowledge of each other's code.

Each component handles versioning, security, releases, backwards compatibility - everything, in its own way. Reuse is at the component level, at the level of release, not at the class or function level.

A system based on independent components like these probably contains more code than a functionally equivalent system that is orchestrated top-down. That's okay. It has potential to support much greater solution complexity.

A fact of software development: 
For every 25 percent increase in problem complexity, there is a 100 percent increase in complexity of the software solution. That's not a condition to try to change (even though reducing complexity is always a desirable thing to do); that's just the way it is. [2]
This is true because every feature we add potentially impacts every feature we already support. The increase in solution complexity is combinatorial, not linear. The way to combat this is to reduce the complectedness of our software. Stop intertwining all the bits -- a coherent architecture is more efficient in lines of code, but it is inherently limiting. It is limited by what the God Architect can hold in his head.
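The combinatorial claim can be made concrete with a small sketch: if every new feature can potentially interact with every existing one, the number of pairwise interactions grows as n choose 2, so doubling the feature count roughly quadruples the interactions.

```python
def pairwise_interactions(n_features: int) -> int:
    # Each feature can potentially interact with every other one,
    # so potential interactions grow as n choose 2: n * (n - 1) / 2.
    return n_features * (n_features - 1) // 2

for n in [4, 8, 16]:
    print(n, pairwise_interactions(n))
# 4 features  ->   6 potential interactions
# 8 features  ->  28 (doubling features roughly quadruples interactions)
# 16 features -> 120
```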

Break the software into simpler components. Some of these may be open-source components, which someone has to learn well enough to configure and implement. Others will be custom. Break them off into teams and give those teams autonomy. Let each team implement its piece in the simplest way possible for that particular problem.

Meanwhile, the systemwide architects don't know the particulars of each solution. In fact, they should not know the particulars: that would let them base decisions on implementation details. Don't let that happen! Raise the level of abstraction. Components must interact with abstractions, not with each other's internals. Knowledge of the inner workings of other modules is a negative.

Jeff Bezos forced this strategy at Amazon around 2002 by decree: all teams will interact via service interfaces only; they can use whatever technology they want; and all interfaces must be externalizable, ready to be exposed to the outside world. The result? The largest online bookseller became the largest vendor of cloud computing. Did anyone predict that ten years ago?

When you and I interact, our brains do not interact directly. Rather, we both interact with the physical world. I say something, you hear it. Abstractions on both sides reduce the granularity of the interaction, but maintain the independence of our individual brains. No mind-melds allowed. While this seems limiting, observation shows that amazing unpredictable structures - nations, communities, economies - emerge from these limited interactions.

Complicated systems are limited in growth. Complex systems have even greater potential for growth than the designers of the components conceived. Keep your components simple and independent, and there is no end to the problems we can solve.

[1] Complexity, A Guided Tour, by Melanie Mitchell, chapter 7
[2] Facts and Fallacies of Software Engineering, by Robert L. Glass, p. 58

Thursday, August 9, 2012

Brains, computers, and problem solutions

Programmers tell the computer what to do.

The computer has a processor, it has some memory, and it follows a set of instructions. The instructions say what to load into memory, what to do with the values in memory, how to change that memory, and when to store it off somewhere else. There is an instruction pointer that shows the next instruction to execute. There are tests that can change where the instruction pointer goes next.
(approximation of a Harvard architecture)
Assembly language is the closest to pure computer programming, to telling the computer exactly what to do. Imperative languages like C are an abstraction above that. We still tell the computer what to do, in what order, and how to change the values in memory. Code is in one place, data in another. Object oriented languages like Java offer a few more abstractions, organizing relevant code with data, hiding memory management. These languages are still imperative: we tell the processor what to do and in what order.
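The machine model described above (memory, an instruction pointer, and tests that redirect it) can be sketched as a toy interpreter; all of the opcodes here are invented for illustration.

```python
# A toy machine: memory cells, an instruction pointer, and a jump-on-test,
# echoing the fetch-execute loop described above. Opcodes are invented.
def run(program, memory):
    ip = 0                               # instruction pointer
    while ip < len(program):
        op, *args = program[ip]
        if op == "load":                 # load a constant into a memory cell
            memory[args[0]] = args[1]
        elif op == "add":                # memory[a] += memory[b]
            memory[args[0]] += memory[args[1]]
        elif op == "jump_if_lt":         # test: move the instruction pointer
            if memory[args[0]] < memory[args[1]]:
                ip = args[2]
                continue
        ip += 1
    return memory

# Sum 0+1+2+3+4 imperatively:
# counter in cell 0, total in cell 1, limit in cell 2, step in cell 3.
mem = run([
    ("load", 0, 0), ("load", 1, 0), ("load", 2, 5), ("load", 3, 1),
    ("add", 1, 0),                       # total += counter
    ("add", 0, 3),                       # counter += 1
    ("jump_if_lt", 0, 2, 4),             # loop while counter < limit
], {})
print(mem[1])                            # 10
```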

Now that we don't program business software in assembler, do programmers still tell the computer what to do? We tell the compiler what to do, and that tells the runtime what to do, and that tells the computer what to do. What do we really do?

Programmers solve problems.

When we learn to code, we learn to think like the computer. But what if that is not the optimal way to think? What if, instead, we think about how the problem looks, about what the solution looks like?
(it's abstract, OK? It's supposed to represent a data flow.)
Programming in a declarative style means stating solutions, not steps. State what is true, or state what we're doing - don't describe how the computer goes about it. That's an implementation detail for the compiler to optimize. (I'm talking business software here, not embedded systems.) Code in a language suited to the problem. This gives us more ways to solve problems, and cleaner ways to solve problems.
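A small illustration of the contrast, assuming nothing beyond standard Python: the imperative version spells out each step and mutates memory; the declarative version states what the result is and leaves the how to the language.

```python
data = [3, 1, 4, 1, 5, 9, 2, 6]

# Imperative: tell the computer each step and mutate memory.
evens_squared = []
for n in data:
    if n % 2 == 0:
        evens_squared.append(n * n)

# Declarative: state what the result is; iteration order and
# intermediate storage are the language's concern, not ours.
declarative = [n * n for n in data if n % 2 == 0]

print(evens_squared == declarative)  # True
print(declarative)                   # [16, 4, 36]
```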

We're not limited to what comes naturally to a computer, or what comes naturally to our brains.

Abstraction is key. The right abstractions make our code easy to read and faster to write and maintain. Don't limit abstractions to the ones that fit the computer. Don't limit them to objects that model business entities. This is why learning math is useful, and it's why functional programming is useful: the more abstractions we know how to hold in our head -- abstractions with no analog in either the real world or the computer world -- the more tools we have for solving problems.
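As one example of such an abstraction, function composition has no analog in the business domain or in the machine, yet it is a reusable problem-solving tool. A sketch in Python, where `compose` is a hypothetical helper:

```python
from functools import reduce

# Function composition: an abstraction with no real-world or machine
# analog, yet a general-purpose tool for building pipelines.
def compose(*fns):
    # compose(f, g)(x) == f(g(x)): apply right-to-left.
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fns), x)

clean = compose(str.title, str.strip)
print(clean("  ada lovelace  "))  # Ada Lovelace
```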

Wednesday, August 8, 2012

A third way

Programming is about translating what a human wants into instructions a computer can understand.

Or is it?

Thinking down this path, we find two ends of a programming language spectrum. A language can be close to the computer's perspective: imperative languages declare data, move and store data, and carry out instructions in a fixed order. At the other end, spoken languages aren't specific enough to convey instructions to a computer.
But there's more to this concept than one dimension! The key is: speak not to the human, nor to the computer, but to the problem we're solving. The appropriate abstractions, and therefore elegance and simplicity and productivity, lie in the language of the problem.

Some problems are well-suited to an imperative language, some to an object-oriented one. Others to neither. The key is to fit the language to the problem, not the programmer and not the computer. Every problem has its conceptual sweet spot.

Aim for a declarative style: state the solution rather than a series of instructions.

For instance, functional languages speak not in instruction sets, not in English, but in calculations. People naturally think in a more imperative manner, and computers operate with defined instructions and changing the contents of memory. But when we're calculating an output from some input, a functional language is more effective at stating the solution than either computer or human language.

Why do we care about this problem space? "Calculations" include more than math. Any translation fits: reporting from a database; taking a URL plus request parameters and outputting HTML. This sweet spot is relevant to all web programming and a big chunk of business software.
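A sketch of that sweet spot in Python: the handler below treats a web request as a pure calculation from path and parameters to HTML. The function name and markup are invented for illustration.

```python
# The "calculation" sweet spot: a web request handled as a pure function
# from input (path + params) to output (HTML). No state, no side effects.
def render_greeting(path: str, params: dict) -> str:
    name = params.get("name", "world")
    return f"<h1>{path.strip('/')}: hello, {name}</h1>"

html = render_greeting("/greet", {"name": "Ada"})
print(html)  # <h1>greet: hello, Ada</h1>
```

Because the handler is a pure calculation, it can be tested without a web server: same input, same output, every time.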

Coding in a third language space means we need translations in two places. The computer must be taught to understand instructions in this different form. That's the job of a compiler. The human must learn to write code in a shape unfamiliar to his or her brain. That's our job as developers.

Don't think like a computer. Think about the problem you're solving. Think and write in its language, not yours and not the hardware's.