Don’t build systems. Build subsystems.

Always consider your design a subsystem.

Jabe Bloom

When we build software, we aren’t building it in nowhere. We aren’t building a closed system that doesn’t interact with its environment. We aren’t building it for our own computer (unless we are; personal automation is fun). We are building it for a purpose. Chances are, we build it for a unique purpose — because why else would they pay us to do it?

Understanding that surrounding system, the “why” of our product and each feature, makes a big difference in making good design decisions within the system.

It’s like, the system we’re building is our own house. We build on a floor of infrastructure other people have created (language, runtime, dependency manager, platform), making use of materials that we find in the world (libraries, services, tools). We want to understand how those work, and how our own software works. This is all inside our house.

To do that well, keep the windows open. Look outside, ask questions of the world. What purpose is our system serving? What effects does it have, and what effects from other subsystems does it strengthen?

Whenever you’re designing something, the first step is: What is the system my system lives in? I need to understand that system to understand what my system does.

Jabe Bloom

It is a big world out there, and these are questions we can never answer completely. It’s tempting to stay indoors where it’s warm. We can’t know everything, but we gotta try for more.

Closing the feedback loop from the customer

Feedback loops are essential to learning. In business, they’re essential to getting the product right. We need to know what the customers think, what they’re struggling with, what they value.

There’s one department that has a lot of contact with customers. Whole conversations, where we can learn a lot about what frustrates people. Yet, customer service is generally operated as a cost center, optimized for low pay instead of high knowledge acquisition.

Business experts talk to developers, who create an app, which is used by a whole slew of customers, who then call customer service. Does customer service get to send that feedback to the business experts?

Is the business getting feedback from this rich source of customer contact? Or are we too busy coping with a quantity of calls? So many different people using the app. Each call represents only a tiny piece of customer experience.

In software-as-a-service, where the customers are developers, each of them is responsible for millions of uses of the app. Developers are high-leverage this way. What they struggle with, what stops them from using the product more, stops their applications from using the product LOTS more. The impact of each developer-customer is orders of magnitude larger than that of a single customer of a consumer product.

For this reason, in place of (or in addition to) customer service, we have Developer Advocates. A developer advocate answers questions, gathers experiences, and interacts with developer-customers at high bandwidth. Developer advocates are hired for impact, not for low pay.

Replace the slew of customers with a few developers controlling their software, which uses our app; these devs talk to a developer advocate, who talks to developers and business experts, and influences our app directly.

Developer advocates share feedback with developers of the product. They can impact the customer’s experience with the product directly: by changing it, and by adding plugins, tutorials, documentation, etc.

Feedback loops are short and thick compared to traditional customer service. It makes sense that this is possible in software, because the quantity of humans we need to interact with is much lower, and the impact of each is higher.

This seems like a win. I 💓 software-as-a-service.

Give me feedback that is qualitative, broad, and random

Today in standup, a colleague reported that GitHub isn’t sending the hooks it did last week, after we wrote code to handle them, and it’s a problem for us.

“Did you contact GitHub?”

“I asked around, and I heard they’re very unhelpful.”

So no, he didn’t contact GitHub. The people writing that hook will not hear from us about how it makes their integrations less useful to everyone. Customer service, people. It’s so important!

Today T-Mobile did something very wrong with my very straightforward order. I’m gonna have to call customer service and be like, “Hey, when I order a new phone and add a new line, I want the new phone connected to the new line.” Will this request ever make it back to the people who specify their software? If I had confidence it would, I would feel good about making the call. But I’ve worked at a telecom before, so I expect that data to stay with the rep, who has no power to change the system.

You know what we need in order to make great software? Great feedback.

Feedback loops are the soul of any ongoing system. What is fed back gets sustained; it gets built into the very structure of the system, so that it is perpetuated. Only what is fed back.

So we need more metrics, right? Ha ha ha, no

Metrics sustain numbers. Not whatever you thought you were measuring.

We need broad feedback loops. We need information that we didn’t know we needed. We need to know what our customers experienced, what they expected and didn’t get, what surprised them, what pieces of our software make the larger system better. We need all the information we can get.

Except! All the information is the same as no information. We could look at log dumps that show all the usage. But we’d never see any of it, because it’s too much. Our attention is precious.

What we need is some of the information, and a different “some” each day. Little enough that we can listen, random enough that we sometimes encounter that precious bit that provides insight.
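To make “a different some each day” concrete, here is a minimal sketch. It assumes the feedback already sits in a list (support tickets, say); the function name `todays_sample` and the ticket strings are made up for illustration. Seeding the random generator with the date gives everyone on the team the same small sample today and a different one tomorrow.

```python
import random
from datetime import date


def todays_sample(items, day, k=5):
    """Return up to k items, picked at random but reproducibly for a given day."""
    # Seeding with the date means the sample is stable within a day
    # and changes when the date changes.
    rng = random.Random(day.isoformat())
    return rng.sample(items, min(k, len(items)))


# Hypothetical feedback items; in practice these might be support tickets.
feedback = [f"ticket {n}" for n in range(1, 101)]
for item in todays_sample(feedback, date.today()):
    print(item)
```

This is only a way to ration attention. It doesn’t replace the conversation.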

Breadth and surprises come from talking to people. So talk to people! Different people each day, and get their stories about today and whatever sticks in their heads.

Metrics are tiny thin paths of feedback. Conversation is a big thick broad feedback path.

This is qualitative data. For real feedback about the real systems we contribute to, this should be most of what we look for.

Quantitative data, metrics, should be the exception. Every metric has its dark side. Use them with caution, because they distract us from everything else that we aren’t measuring.

All kinds of harm, all kinds of opportunities are obscured by a few shining metrics.

Seek out feedback loops that are incomplete and random and broad. Cherish the ones that bring you unexpected information. And please, please, meet your company’s customer service people and beg them for their stories.

Hat tip to John Ohno for inspiring this post with a tweet thread.

Let’s reason about behavior

When we learn math, geometry, and logic in school, we’re always talking about things that are holding still. The element is in the set or it isn’t. The angle is acute or obtuse or right or we can’t know. Things are related or not. A thing has a property, or doesn’t.

Code can have properties. A function has side effects, or doesn’t. Data is mutated, or never. An API call can be idempotent. Whole programs can have properties and we like to reason about them.
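A small sketch of what I mean by properties, in Python. The function names are invented for the example. The first function is free of side effects and idempotent; the second mutates its argument and is not idempotent. You can see both facts by reading the code, without running anything in production.

```python
def add_tag(tags: frozenset, tag: str) -> frozenset:
    """No side effects, and idempotent: repeating the call changes nothing."""
    return tags | {tag}


def append_tag(tags: list, tag: str) -> None:
    """Mutates its argument, and is not idempotent: each call grows the list."""
    tags.append(tag)


once = add_tag(frozenset(), "urgent")
twice = add_tag(once, "urgent")
assert once == twice  # applying it again gives the same result

log = []
append_tag(log, "urgent")
append_tag(log, "urgent")
assert log == ["urgent", "urgent"]  # the second call changed the outcome
```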

Yesterday talking with Will Larson on >Code (episode 142), he pointed out that once we move beyond a single process, especially when we go to microservices and have oodles of processes running around, we don’t get to talk about properties anymore. Instead, we have behaviors. He said:

Properties, you can statically analyze.
Behaviors, you can verify they happen.

A call to one program, or even a series of calls, can be transactional. Once you’re in a distributed system, not a thing. You can talk about how the system behaves.

Rein pointed out that in distributed systems, properties are incredibly expensive. Guarantees like all-or-nothing transactions, exactly-once delivery, consistency are never perfect in the real world, and the closer you choose to be, the more you pay in money and latency. Coordination is expensive.

Instead, can we get better at verifying and reasoning about behaviors?

Will pointed out that fault injection is a way to verify behaviors. That makes sense: psychologists learn a lot about the way we think from a few people with localized brain injuries. Emitting and querying events is another way.
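Here is a toy version of both ideas, as a sketch and not anything Will described in detail. `FlakyService` is a made-up stand-in for a remote dependency, with faults injected on purpose. `fetch_with_retry` is the behavior under test, and it emits events we can query afterwards to verify what happened.

```python
class FlakyService:
    """Hypothetical dependency that fails its first `failures` calls."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("injected fault")
        return "ok"


def fetch_with_retry(service, attempts=3):
    """Keep trying until a call succeeds or the attempts run out."""
    events = []  # emitted events, so the behavior can be queried afterwards
    for attempt in range(1, attempts + 1):
        try:
            result = service.fetch()
            events.append(("success", attempt))
            return result, events
        except ConnectionError:
            events.append(("failure", attempt))
    return None, events


# Inject two faults, then verify the behavior: we still get an answer,
# and the events show how we got there.
result, events = fetch_with_retry(FlakyService(failures=2))
assert result == "ok"
assert events == [("failure", 1), ("failure", 2), ("success", 3)]
```

Nothing here proves a property. It shows that a behavior happened, under one kind of fault, this time.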

Then how do we reason about behaviors? Systems thinking helps. Will recommends Donella Meadows’s Primer as a start. (I loved that book too.) Also, the social sciences have been studying behaviors forever. Maybe their methods, like Grounded Theory, can help us.

We’re people, right? We have behaviors. If we can get better at naming and reasoning about them, maybe we can get better at being people. It could happen.

Mostly we orient

Observe, Orient, Decide, Act. This is the OODA loop, first recognized in fighter pilots and then in the Toyota Production System. It represents every choice of action in humans and higher level systems: take in sensory data, form a model of the world, choose the next action, make a change in the world.

At least in fighter pilots, and in all our daily life, most of this is automatic. We can’t help observing while we are awake. We constantly decide and act; it is part of being alive. The leverage point here is Orient.

The model we form of the world guides our decisions, both conscious and unconscious. Once the pilot has a geometric plane of battle in mind, the decisions are obvious. Once you see the bottleneck in production, you can’t look away from it. When I have an idea what’s going on in my daughter’s mind, I can talk to her.

Our power to change our actions, our habits, and our impact on the world lies in Orient. When we direct our attention to finding new models of the world, whole new possibilities of action open to us.

Fighter pilots can see what is possible when they picture the battle in the best geometric plane. Production managers need to look at the flow of work. In software, I look at the flow of data through services and functions, which is different from when I used to see in objects or think about spots in memory.

The power of breaking work into smaller chunks is the chance to re-Orient in between them. TDD gives us lots of little stable points to stop and think. Pairing lets one person think about where we are in the problem space while the other is busy acting. Mob programming gives us the chance to negotiate an orientation among the whole group.

That co-orientation is crucial to collaboration. With that, we can predict each other’s decisions and understand each other’s actions. If we have a shared model of the world and where we are going, plus trust in the competence of our team in their respective specialties, that’s when we can really fly.

(This post is based on a conversation with Zack Kanter.)

Implementing all the interfaces

Humans are magic because we are components of many systems at once. We don’t just build into systems one level higher, we participate in systems many levels higher and everywhere in between.

In code, a method is part of a class which is part of a library which is part of a service which is part of a distributed system. There is a hierarchy, and each piece fits where it does.

An atom is part of one molecule, which combines into one protein which functions in one cell in one tissue in one organ, if it’s lucky to be part of something exciting like a person.

But as a person, I am an individual and a mother and a team member and an employee and a citizen (of town, state, country) and a human animal. I am myself, and I participate in systems from relationship to family to community to culture. We function at all these levels, and often they load us with conflicting goals.

Gregory Bateson describes native Bali culture: each full citizen participates in the village council. Outside of village council meetings, they speak for themselves. In the council, they speak in the interests of I Desa (literally, Mr. Village).

Stewart Brand lists these levels of pace and size in a civilization:

  • Fashion/art (changes fastest, most experimental)
  • Commerce
  • Infrastructure
  • Governance
  • Culture
  • Nature (changes slowest, moderates everything else)

Each of these works at a different timescale. Each of us participates in each of them.

We each look out for our own interests (what is the fashionable coding platform of the day) and our family and company’s economic interest (what can we deliver and charge for this quarter) and infrastructure (what will let us keep operating and delivering long-term) and so on.

Often these are in conflict. The interests of commerce can conflict with the interests of nature. My personal finances conflict with the city building infrastructure. My nation might be in opposition to the needs of the human race. Yet, my nation can’t continue to exist without the stability of our natural world. My job won’t exist without an economic system, which depends on stable governance.

If we were Java classes, we’d implement twenty different interfaces, none of them perfectly, all of them evolving at different rates, and we’re single-threaded with very long GC pauses.

Tough stuff, being human.

Domain-specific laws

“there appear new laws and even new kinds of laws, which apply in the domain in question.”

David Bohm, quoted by Alicia Juarrero

He’s talking about the qualitative transformation that happens in a system when certain quantitative transition points are passed.

Qualitative transformation

I notice this when something that used to be a pain gets easier, sufficiently easier that I stop thinking about it and just use it. Like git log. There is such a thing as svn log but it’s so slow that I used it once ever in my years of svn. The crucial value in git log is that it’s so fast I can use it over and over again, each time tweaking the output.

  • git log
  • git log --oneline
  • git log --oneline | grep test
  • etc.

Now git log has way more functionality, because I can combine it with other shell commands, because it’s fast enough. This changes the system in more ways than “I use the commit log”: because I use the log, I make more commits with better messages. Now my system history is more informative than it used to be, all because the log command is faster.

The REPL has that effect in many languages. We try stuff all the time instead of thinking about it or looking it up, and as a result we learn faster, which changes the system.

Non-universal laws

I love the part about “laws, which apply in the domain in question.” There are laws of causality which are not universal, which apply only in specific contexts. The entire system history (including all its qualitative transformations) contributes to these contexts, so it’s very hard to generalize these laws even with conditions around them.

But can we study them? Can we observe the context-specific laws that apply on our own team, in our own symmathesy?

Can we each become scientists in the particular world we work in?

Designing Change vs Change Management

Our job as developers is to change software. And that means that when we decide what to do, we’re not designing new code, we’re designing change.

Our software (if it is useful) does not work in isolation. It does not poof transition to a new state and take the rest of the world with it.

If our software is used by people, they need training (often in the form of careful UI design). They need support. Hint: your support team is crucial, because they talk to people. They can help with change.

If our software is a service called by other software, then that software needs to change too, if it’s going to use anything new that we implemented. Hint: that software is changed by people. You need to talk to the people.

If our software is a library imported by other software, then changing it does nothing at all by itself. People need to upgrade it.

The biggest barriers to change are outside your system.

Designing change means thinking about observability (how will I know it worked? how will I know it didn’t hurt anything else?). It means progressive delivery. It often means backwards compatibility, gradual data migrations, and feature flags.
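For a feel of what a feature flag with progressive delivery can look like, here is a minimal sketch. The names (`in_rollout`, `greeting`, the `"new-greeting"` flag) are hypothetical, and a real team would more likely use a flag service than roll their own. The idea is that each user lands in a stable bucket, so raising the percentage only ever adds people to the new behavior, and the old behavior stays in place for everyone else.

```python
import hashlib


def in_rollout(user_id: str, flag: str, percent: int) -> bool:
    """Deterministically place a user inside or outside a rollout of `percent` percent."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket from 0 to 99
    return bucket < percent


def greeting(user_id: str, percent: int) -> str:
    # The old path stays available; the new path is taken only inside the rollout.
    if in_rollout(user_id, "new-greeting", percent):
        return "Welcome back!"
    return "Hello."


# At 0 percent nobody sees the change; at 100 percent everybody does.
assert greeting("user-42", 0) == "Hello."
assert greeting("user-42", 100) == "Welcome back!"
```

Starting at a small percentage gives us time to watch for unintended consequences before the change reaches everyone.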

Our job is not to change code, it is to change systems. Systems that we are part of, and that our code is part of (symmathesy).

If we look at our work this way, then “Change Management” sounds ridiculous. Wait, there’s a committee to tell me when I’m allowed to do my job? Like, they might as well call it “Work Management.”

It is my team’s job to understand enough of the system context to guess at the implications of a change and check for unintended consequences. We don’t all have that, yet. We can add visibility to the software and infrastructure, so that we can react to unintended consequences, and lead the other parts of the system forward toward the change we want.

Reductionism with Command and Control

In hard sciences, we aim to describe causality from the bottom up, from elementary particles. Atoms form molecules, molecules form objects, and the reason objects bounce off each other is reduced to electromagnetic interactions between the molecules in their surfaces.

Molecules in DNA determine production of proteins which result in cell operations which construct organisms.

This is reductionism, and it’s valuable. The elementary particle interactions follow universal laws. They are predictable and deterministic (to the limits of quantum mechanics). From this level we learn fundamental constraints and abilities that are extremely useful. We can build objects that are magnetic or low friction or super extra hard. We can build plants immune to a herbicide.

Bottom-up causality. It’s science!

In Dynamics in Action, Juarrero spends pages and pages asserting and justifying that causality in systems is not only bottom-up; the whole impacts the parts. Causality goes both ways.

Why is it foreign to us that causality is also top-down?

In business, the classic model is all top-down. Command and control hierarchies are all about the big dog at the top telling the next level down what to do. Intention flows from larger (company) levels to smaller (division), and on down to the elementary humans at the sharp end of work.

Forces push upward from particles to objects; intentions flow downward through an org chart

Of course when life is involved, there is top-down causality as well as bottom-up. Somehow we try to deny that in the hard sciences.

Juarrero illustrates how top-down and bottom-up causality interact more intimately than we usually imagine. In systems as small as a forming snowflake, levels of organization influence each adjacent level.

We see this in software development, where our intention (design) is influenced by what is possible given available building blocks (implementation). A healthy development process tightens this interplay to short time scales, like daily.

Software design in our heads learns from what happens in the real world implementation

Now that I think about how obviously human (and organization) intention flows downward, impacted by limitations and human psychology pushing upward; and physical causality flows upward, impacted by what is near what and what moves together mattering downward; why is it even strange to us that causality moves both ways?

Mission Statement

“Code, as a medium, is unlike anything humans have worked with before. You can almost design right into it.”

me, in my Camerata keynote


But not totally, because we always find surprises. Complex systems are always full of surprises. That is their frustration and their beauty. 

We live in complex systems. From biology up through cultures and nations and economies, we breathe complexity. And yet in school we learned science as reductive.

In software, we now have seriously complex systems that we can play with on a time scale that helps us learn. We have incidents we can learn from, with many clues to the real events, to the rich causalities, and sometimes we can trace those back to social pressures in the human half of our software systems. What is more, we can introduce new clues. We can add tracing, and we can make better tools that help the humans (and also provide a trail of what we did). So we have access to complex systems that are (1) malleable and (2) observable. 

My work in automating delivery increases that malleability. My speaking about collaborative automation aims to increase observability.

My quest is: as people, let’s create software systems that are complex and malleable and observable enough that we learn how to work with and within complex systems. That we develop instincts and sciences to change systems from the inside, in ways that benefit the whole system as well as ourselves. And that we apply that learning to the systems we live and breathe in: biology, ecology, economy, culture.

That’s my mission as a symmathecist.