Thursday, April 6, 2017

The Architects Below

This is the text of a short keynote for O'Reilly Software Architecture Conference 2017, New York.

Software developers have a particular power over the daily lives of our users.

A hospital

The nurse interacts with the patient, and they record those interactions in software.
an ellipse containing patient, nurse, and software

Software impacts the nurse: some things might be easier, but others are harder. On paper, leave a field in a form blank and you've still filled out the rest; on a computer, it can stop you from saving. The nurse must complete the form for each patient, recording interactions. Which interactions are easily recorded influences which interactions take place. The less time a nurse is at the computer, the more they can spend on direct care.

As developers, we control a piece of the sociotechnical system nurses work in. (Sociotechnical: includes both humans and software.) Software is practically a coworker now, a coworker we create.

the portion of the previous ellipse containing 'software'; an overlapping ellipse containing that software plus developers

We don't create the software for the nurses, though. We take our orders -- I mean, requirements -- from hospital administrators.

Administrators have different priorities than nurses. Yes, they care about quality of patient care. They also care about Safety and Legibility. Safety includes not only patient safety, but safety of the hospital from lawsuits or from losing certifications. Legibility is about understanding the system; leadership needs to understand what's going on in the hospital in order to improve it. They need the data to roll up into aggregate reports. They need required fields and dropdown boxes with valid values.

This impacts the nurses' choices. If the software makes it harder to do their jobs a particular way, they'll do it that way less often, or in a way that circumvents the record system. If this improves patient safety, great; if it only makes the hospital administration easier, bad.

It is possible for the designers and developers of the system to make both the administrators' and nurses' jobs easier. Recognize the conflict between these, and we can work to smooth it.

In order to give a concrete example of that, I have to switch from the compelling domain of hospitals into a domain I have personal experience in.

A furniture store

ellipse containing customer, cashier, software

A customer finds an item they like, but it's damaged. They want to buy it, but only if the price is adjusted for the damage. The cashier wants to sell the item, and they want to adjust the price. To do this, they interact with software: software whose requirements are set by retail management.
Retail management has other priorities. They care about safety.
Safety, in this context, is more than the physical integrity of humans. Safety is preventing disasters: in this case, it's a disaster if the company goes out of business. The associated safety constraint is: prevent cashiers from committing fraud using price adjustments.

Every dynamic system has control loops, other systems that watch for danger and adjust. In this case it's the software checking the size of price adjustments.

 ellipse containing customer, cashier, software; arrows go back and forth to a smaller box on the side representing the control loop
How does retail management get this safety control in place, and make sure it stays in place? Safety constraints must also be addressed in the system that builds the system: here, the developers who build the software. Management tells the developers to build this in, and checks that it is done.
ellipse containing software, developers; arrows go back and forth to a smaller box on the side representing management
As developers, this is where we can help out the users. Retail administration tells us that every price adjustment should require the cashier to type in their password and get approval from a manager. We go to the stores, we observe cashiers adjusting twelve items one at a time -- so we increment. We make a way for them to select many items before applying the discount. And we get a compromise from retail: only adjustments over 10% require manager approval.
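
A minimal sketch of that compromise in Python, not the store's actual code. The 10% threshold is from the story above; the `Item` type, the function names, and the use of fractions like 0.10 for discounts are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from decimal import Decimal

# From the story: only adjustments over 10% require manager approval.
APPROVAL_THRESHOLD = Decimal("0.10")


@dataclass(frozen=True)
class Item:
    """Hypothetical item on the sale; not a real point-of-sale type."""
    sku: str
    price: Decimal


def needs_manager_approval(discount: Decimal) -> bool:
    """Return True only when the discount is strictly over the threshold."""
    return discount > APPROVAL_THRESHOLD


def adjust_prices(items: list[Item], discount: Decimal) -> list[Item]:
    """Apply one discount to many selected items in a single step."""
    if not Decimal("0") <= discount <= Decimal("1"):
        raise ValueError("discount must be a fraction between 0 and 1")
    return [
        Item(item.sku, (item.price * (1 - discount)).quantize(Decimal("0.01")))
        for item in items
    ]
```

The cashier selects all twelve damaged items, calls something like `adjust_prices` once, and the approval check runs once for the whole batch instead of twelve times.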

For legibility, we make it so the back office app lets managers see which cashiers make the most price adjustments. Discover fraud; don't make the cashier make the customer wait.
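
As a rough illustration of that back office view, here is a sketch in Python. The `Adjustment` record and its fields are hypothetical; a real system would read these from the store's transaction log.

```python
from collections import Counter
from typing import NamedTuple


class Adjustment(NamedTuple):
    """Hypothetical log record for one price adjustment."""
    cashier: str
    discount_percent: int


def adjustments_by_cashier(log):
    """Count adjustments per cashier, most frequent first."""
    return Counter(entry.cashier for entry in log).most_common()
```

The check happens after the sale, in a report for managers, so nobody at the register has to wait for it.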

And then what happens? Almost all adjustments are exactly 10%.

Software influences behavior.

Software architecture

As architects, we want to influence the behavior of software developers. We care about the output and the process: that our system is high quality and that we can keep improving it.

We want safety from disasters like:
   * downtime, breaking SLAs
   * losing important data
   * leaking private data
   * congealment: when the software gets so big and complicated it is super expensive to change.

To prevent congealment, we need to continue understanding the system as it grows. We need legibility.
I learned this concept from the book Seeing Like a State, which is about politics and city planning. Back in the day, people went by first names or nicknames, and everyone in the village knew where everyone lived. Then governments got bigger, and they wanted to tax. Everyone was assigned a forename and surname, every house got a number and every street a name. This let governments view populations and land in a way that scales up. These things can be aggregated and tracked. This wasn't done for the people, but top-down, to benefit the top.

In software architecture, legibility means we have enough harmony that we can scale up our view of our applications, and draw diagrams at each level that are accurate enough that we can reason about them. We can explain how it works, and how it addresses the concerns specific to the business we're in. Explain it to the business, to ourselves, to the developers so they can know where their app fits.

The trick is to maintain legibility without losing too much flexibility.

There are two ways to enforce safety and legibility constraints on our software-building systems. One is imposition: rules and processes enforced by management. Another is inclination: make the thing we want easy.
ellipse containing software, developers, software. There's a box at the side with arrows to and from the ellipse, representing management. "software" at the bottom of the ellipse is highlighted.
Developers build software using software. Those who control that software influence the behavior of developers.

At a trivial level, there's editors, IDEs, compilers. Version control: when I switched from svn to git, my behavior changed. I save my work way more often. I leave detailed stories in commit messages. I search the history all the time, because it's easy now. It's also easy to leave a lot of long-running local branches around, which is not good, but I do it.

What else can we influence? Frequency of deployment. How often we deploy is a function of how easy and how scary it is. This is determined by deployment automation and monitoring.

What programming language do we use? It's tempting to impose this, but then we introduce coupling (at a technical level, unnecessary to the business). Instead, we can incline people toward one programming language. Which language is easiest to monitor, deploy, log, etc. in our infrastructure? Developers want to deliver features to users. They'll use the language we've made easy for them -- unless there's a specific reason not to. If Ruby is easy to test and deploy, they'll use Ruby -- unless this specific app has serious performance constraints. Then they'll write it in Go.

Caveat: beware the internal library or framework. If all Ruby apps need to use this internal framework that was exceptional at the time but has since been surpassed by open source... they might use Go for reasons that aren't about the project's needs. Internal frameworks: code that starts out as leverage quickly becomes baggage.

How about a new feature: create a new service, or tack it on to an existing one? This depends on how easy it is to spin up a new service. If it's all the same, developers will put the code where it belongs, in the place that provides information about the project.

All these incentives are set by infrastructure code.
bottom half of an ellipse containing developers and software; top half of an overlapping ellipse contains software and architects
When I had the title 'Architect,' I wrote infrastructure code. When I had the title 'Infrastructure Engineer,' I influenced architecture.

Beyond infrastructure: we care about the development flow, too. About making this legible. How many bugs are fixed, or features added? How long does each one take? We want to know what developers are working on, in a way that rolls up to managers, and to managers' managers. And so we bring in JIRA. With its required fields, and its valid values in dropdown boxes.

This is legibility through imposition. Nobody wants to context-switch over to JIRA to fill out its tracking forms. But the need for this information is real.

What if we can gather this data in tools that make the developer's life easier?

What if I get in on Monday and ask in Slack, "@atomist what am I working on?" and receive a list of issues assigned to me, pull requests that need my review, and my PRs that are ready to merge? And a button on each issue called "Start" that moves the ticket to 'in process' in JIRA?

If we want every commit associated with an issue, how about a bot that says, "I see that you made this commit. Is it for this issue you're working on? Or would you like me to create one for you?"
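
To make that concrete, here is a small Python sketch of the decision such a bot might make. This is not Atomist's programming model or API; the `Issue` type, the `bot_prompt` function, and the assumption that issue keys look like `SHOP-12` are all invented for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical issue record; a real bot would fetch these from the tracker."""
    key: str
    title: str


# Assumption: issue keys are an uppercase project code, a dash, and a number.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def bot_prompt(commit_message: str, in_progress: list[Issue]) -> str | None:
    """Return the question the bot should ask, or None if no question is needed."""
    if ISSUE_KEY.search(commit_message):
        return None  # the commit already names an issue
    if in_progress:
        issue = in_progress[0]  # guess the first issue the developer has started
        return (
            "I see that you made this commit. "
            f"Is it for {issue.key} ({issue.title})? "
            "Or would you like me to create one for you?"
        )
    return "I see that you made this commit. Would you like me to create an issue for it?"
```

The developer answers with one click in chat, and the tracking data lands in the issue tracker without a context switch.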

At Atomist we're working on a programming model that lets you automate these interactions, smooth out your process until developers are thrilled to use the tool that also gives you the tracking you need. Some companies have teams for developer tools that implement some of this automation; we're trying to make it easy enough that one person, part time, can create this magic for their teams.

This kind of automation is not easy. If you value it only as "how long the task takes" X "how often I do it," it won't be worth it. Take spinning up a new project: it's way more than creating the code, setting up the repo with your preferred labels and team access. There's setting up continuous integration. Logging maybe, or service discovery. Nginx configuration, deployment procedures: multiple repositories. We're trying to streamline this, and believe me, it is not trivial.

But automation isn't just savings. Automation adds value. Consistency. Repeatability. Documentation (the only real documentation is code). No context switching. Fewer errors -- the less frequently we do something, the higher the error rate, so the higher the value of automation. We don't just remove work! We remove fear.

Automation isn't doing the same things faster. It changes what we do.

When architects create this automation, we can bring to bear our deep understanding of the development process, and of our particular business.

Architects should code.
Because code is power
 to help developers
 to build flexible software
 to improve the lives of cashiers and nurses
 to get us all out of the furniture store and the hospital a little quicker.

Books referenced in this talk:

Engineering a Safer World (pdf), by Nancy Leveson
Seeing Like a State, by James C. Scott
37 Things One Architect Knows About IT Transformation, by Gregor Hohpe

If you have a Safari membership you can see the whole video.
