Friday, May 5, 2017

Code and Coders: components of the sociotechnical system

TL;DR: Study all the interactions between people, code, and our mental models; gather data and we can make real improvements instead of guessing in our retros.

Software is hard to change. Even when it's clean, well-factored, and everyone working on it is sharp and nice. Why?

Consider a software team and its software. It's a sociotechnical system; people create the code and the code affects the people.
[diagram: a blob of code and several people, with two-way arrows between the code and the people, and between the people themselves]

When we want to optimize this system to produce more useful code, what do we do? How do we make the developer->code interactions more productive?

[diagram: the sociotechnical system, highlight on each person]
As a culture, we started by focusing on the individual people: hire those 10x developers! As the software gets more complex, that doesn't go far. An individual can only do so much.

[diagram: the sociotechnical system, highlight on the arrows between people]

The Agile software movement shifted the focus to the interactions between the people. This lets us make improvements at the team level.

[diagram: the sociotechnical system, highlight on the blob of code]

The technical debt metaphor let us focus on how the code influences the developers. Some code is easier to change than other code.
We shape our tools, and thereafter our tools shape us. - Marshall McLuhan
[diagram: the sociotechnical system, highlight on the arrows reaching the code]

Test-driven development focuses on a specific aspect of the developer<->code interaction: tightening the feedback loop on "will this work as I expected?" Continuous Integration has a similar effect: tightening the feedback loop on "will this break anything else?"

All of these focuses are useful in optimizing this system. How can we do more?

Thereʼs a component in this system that we haven't explicitly called out yet. It lives in the heads of the coders. Itʼs the developerʼs mental model of the software.
[diagram: a blob of code and two people. The people have small blobs in their heads. Two-way arrows run between the code and the small blobs, and between the people]
Each developerʼs mental model of the software matches the code (or doesn't)
Every controller must contain a model of the process being controlled.
Nancy Leveson, Engineering a Safer World
When you write a program, you have a model of it in your head. When you come to modify someone else's code, you have to build a mental model of it first, through reading and experimenting. When someone else changes your code, your mental model loses accuracy. Depending on the completeness and accuracy of your mental model of the target software, adding features can be fun and productive or full of pain.

Janelle Klein models the developer⟺code interaction in her book Idea Flow.  We want to make a change, so we look around for a bit, then try something. If that works, we move forward (the Confirm loop). If it doesn't work, we shift into troubleshooting mode: we investigate, then experiment until we figure it out (the Conflict loop). We update our mental model. When weʼre familiar with the software, we make forward progress (Confirm). When weʼre not, pain! From the book:

[diagram: to make a change, start with learn; modify; validate. If the validation works, Confirm! back to learn. If the validation is negative, Conflict! on to troubleshooting; rework; validate]
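Klein's cycle can be sketched as a toy model. This is my illustration, not code from Idea Flow; the names and shape of the data are my own assumptions:

```python
# A toy model of the Confirm/Conflict loop: each change attempt either
# confirms our mental model or drops us into troubleshooting until an
# experiment works. (Names are illustrative, not from the book.)

def attempt_change(experiments):
    """Count Confirm and Conflict transitions.

    `experiments` is a list of booleans: True means validation passed.
    Returns (confirms, conflicts).
    """
    confirms = 0
    conflicts = 0
    for passed in experiments:
        if passed:
            confirms += 1      # forward progress: back to learn
        else:
            conflicts += 1     # troubleshooting: investigate, rework, re-validate
    return confirms, conflicts

# A familiar codebase: mostly Confirm, an occasional Conflict.
print(attempt_change([True, True, True, False, True]))  # (4, 1)
```

When we're familiar with the software, the first number dominates; when we're not, the second one does, and that's the pain.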

That 10x developer is the one with a strong mental model of this software. Probably they wrote it, and no one else understands it. Agile (especially pairing) lets us transfer our mental model to others on the team. Readable code makes it easier for others to construct an accurate mental model. TDD makes that Confirm loop happen many more times, so that Conflict loops are smaller.

We can optimize this developer⟺code interaction by studying it further. Which parts of the code cause a lot of conflict pain? Focus refactoring there. Who has a strong mental model of each part of the system, and who needs that model? Pair them up.

Idea Flow includes tools for measuring friction, for collecting data on the developer⟺code interaction so we can address these problems directly. Recording the switch from Confirm to Conflict tells us how much of our work is forward progress and how much is troubleshooting, so we can recognize when we're grinding.

Even better, we have data on the causes of the grinding.

We can reflect and choose actions based on what's causing the most pain, rather than on gut feel of what we remember on the day of the retrospective.
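As a sketch of what that data collection might look like, assuming a simple event log rather than Idea Flow's actual format:

```python
# A minimal sketch of friction measurement: log every switch between
# Confirm (forward progress) and Conflict (troubleshooting), then report
# how much of the session was spent grinding. The event format here is
# an assumption for illustration.

def conflict_ratio(events):
    """events: list of (timestamp_in_minutes, mode) pairs, in time order.

    mode is 'confirm' or 'conflict'; the last event just marks the end
    of the session. Returns the fraction of time spent in Conflict.
    """
    total = 0.0
    grinding = 0.0
    for (start, mode), (end, _) in zip(events, events[1:]):
        span = end - start
        total += span
        if mode == 'conflict':
            grinding += span
    return grinding / total if total else 0.0

session = [(0, 'confirm'), (30, 'conflict'), (90, 'confirm'), (120, 'confirm')]
print(conflict_ratio(session))  # 0.5 -- half the session was troubleshooting
```

A number like that, per area of the codebase, tells you where to aim the refactoring.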

Picturing those internal models as part of the sociotechnical system changes my actions in subtle ways. For instance I now:
  • observe which of my coworkers are familiar with each part of the system.
  • refactor and then throw it away, because that improves my mental model without damaging anyone else's.
  • avoid writing flexible code if I don't need it yet, because alternatives inflate the mental model other people have to build.
  • spend more time reviewing PRs in order to keep my model up to date.

We can't do this by focusing on people or code alone. We have to optimize for learning. Well-factored code can help, but it isn't everything. Positive personal interactions help, but they aren't everything. Tests are only one way to minimize conflict. No individual skill or familiarity can overcome these challenges.

If we capture and optimize our conflict loops, consciously and with data, we can optimize the entire sociotechnical system. We can make collaborative decisions that let us change our software faster and faster.

Thursday, April 6, 2017

The Architects Below

This is the text of a short keynote for O'Reilly Software Architecture Conference 2017, New York.

Software developers have a particular power over the daily lives of our users.

A hospital

the nurse interacts with the patient, and they record those interactions in software.
[diagram: an ellipse containing patient, nurse, and software]

Software impacts the nurse: some things might be easier, but others are harder. On paper, leave a field in a form blank and you've still filled out the rest; on a computer, it can stop you from saving. The nurse must complete the form for each patient, recording interactions. Which interactions are easily recorded influences which interactions take place. The less time a nurse is at the computer, the more they can spend on direct care.
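That blank-field difference is concrete. Here's a sketch of the behavior (the field names are invented for illustration):

```python
# On paper, a blank field is just blank; in software, a required field
# can refuse to save the whole record. Field names are hypothetical.

REQUIRED = {"patient_id", "medication", "dose"}

def save_record(record):
    """Save a patient record, but only if every required field is present."""
    missing = REQUIRED - record.keys()
    if missing:
        raise ValueError(f"cannot save, missing: {sorted(missing)}")
    return "saved"

print(save_record({"patient_id": 7, "medication": "ibuprofen", "dose": "200mg"}))
```

Three lines of validation, and the software has decided which interactions get recorded.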

As developers, we control a piece of the sociotechnical system nurses work in. (Sociotechnical: includes both humans and software.) Software is practically a coworker now, a coworker we create.

[diagram: the portion of the previous ellipse containing 'software'; an overlapping ellipse containing that software plus developers]

We don't create the software for the nurses, though. We take our orders -- I mean, requirements -- from hospital administrators.

Administrators have different priorities than nurses. Yes, they care about quality of patient care. They also care about Safety and Legibility. Safety includes not only patient safety, but safety of the hospital from lawsuits or from losing certifications. Legibility is about understanding the system; leadership needs to understand what's going on in the hospital in order to improve it. They need the data to roll up into aggregate reports. They need required fields and dropdown boxes with valid values.

This impacts the nurses' choices. If the software makes it harder to do their jobs a particular way, they'll do it that way less often, or in a way that circumvents the record system. If this improves patient safety, great; if it only makes the hospital administration easier, bad.

It is possible for the designers and developers of the system to make both the administrators' and nurses' jobs easier. Recognize the conflict between these, and we can work to smooth it.

In order to give a concrete example of that, I have to switch from the compelling domain of hospitals into a domain I have personal experience in.

A furniture store

[diagram: ellipse containing customer, cashier, software]

A customer finds an item they like, but it's damaged. They want to buy it, but only if the price is adjusted for the damage. The cashier wants to sell the item, and they want to adjust the price. To do this, they interact with software: software whose requirements are set by retail management.
Retail management has other priorities. They care about safety.
Safety, in this context, is more than the physical integrity of humans. Safety is preventing disasters: in this case, it's a disaster if the company goes out of business. The associated safety constraint is: prevent cashiers from committing fraud using price adjustments.

Every dynamic system has control loops, other systems that watch for danger and adjust. In this case it's the software checking the size of price adjustments.

[diagram: ellipse containing customer, cashier, software; arrows go back and forth to a smaller box on the side representing the control loop]
How does retail management get this safety control in place, and make sure it stays in place? Safety constraints must also be enforced in the system that builds the system -- here, with the developers who build the software. Management tells the developers to build this in, and checks that it is done.
[diagram: ellipse containing software, developers; arrows go back and forth to a smaller box on the side representing management]
As developers, this is where we can help out the users. Retail administration tells us that every price adjustment should require the cashier to type in their password and get approval from a manager. We go to the stores, and we observe cashiers adjusting twelve items one at a time. So we iterate: we make a way for them to select many items before applying the discount, and we get a compromise from retail: only adjustments over 10% require manager approval.
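The compromise might look something like this in code (function and threshold names are my own, for illustration):

```python
# A sketch of the compromise: discounts apply to a whole selection of
# items at once, and only adjustments over 10% need a manager.

APPROVAL_THRESHOLD = 0.10

def adjust_prices(prices, discount, manager_approved=False):
    """Apply one discount (a fraction, e.g. 0.10) to a list of prices."""
    if discount > APPROVAL_THRESHOLD and not manager_approved:
        raise PermissionError("adjustments over 10% need manager approval")
    return [round(price * (1 - discount), 2) for price in prices]

# Twelve damaged items, one discount, no manager needed:
print(adjust_prices([100.0, 40.0, 25.0], 0.10))  # [90.0, 36.0, 22.5]
```

The cashier's job gets easier, and the safety constraint stays enforced.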

For legibility, we make it so the back office app lets managers see which cashiers make the most price adjustments. Discover fraud; don't make the cashier make the customer wait.

And then what happens? Almost all adjustments are exactly 10%.
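The legibility side makes that spike visible. A sketch of the back-office report (data invented):

```python
# Count adjustments per cashier, and bucket them by size -- which is how
# you'd notice the cluster at exactly the approval threshold.

from collections import Counter

adjustments = [
    ("ana", 0.10), ("ana", 0.10), ("bo", 0.10),
    ("bo", 0.05), ("ana", 0.10), ("cy", 0.10),
]

per_cashier = Counter(cashier for cashier, _ in adjustments)
by_size = Counter(size for _, size in adjustments)

print(per_cashier.most_common(1))  # [('ana', 3)] -- who adjusts the most
print(by_size[0.10])               # 5 -- the pile-up right at 10%
```

Management gets the fraud-detection view; the software's threshold shapes the data it sees.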

Software influences behavior.

Software architecture

As architects, we want to influence the behavior of software developers. We care about the output and the process, that our system is high quality and that we can keep improving it.

We want safety from disasters like:
   * downtime, breaking SLAs
   * losing important data
   * leaking private data
   * congealment: when the software gets so big and complicated it is super expensive to change.

To prevent congealment, we need to continue understanding the system as it grows. We need legibility.
I learned this concept from the book Seeing Like a State, which is about politics and city planning. Back in the day, people went by first names or nicknames, and everyone in the village knew where everyone lived. Then governments got bigger, and they wanted to tax. Everyone was assigned a forename and surname, every house got a number and every street a name. This let governments view populations and land in a way that scales up. These things can be aggregated and tracked. This wasn't done for the people, but top-down, to benefit the top.

In software architecture, legibility means we have enough harmony that we can scale up our view of our applications, and draw diagrams at each level that are accurate enough that we can reason about them. We can explain
how it works, and how it addresses the concerns specific to the business we're in. Explain it to the business, to ourselves, to the developers so they can know where their app fits.

The trick is to maintain legibility without losing too much flexibility.

There are two ways to enforce safety and legibility constraints on our software-building systems. One is imposition: rules and processes enforced by management. Another is inclination: make the thing we want easy.
[diagram: ellipse containing software, developers, software. There's a box at the side with arrows to and from the ellipse, representing management. "software" at the bottom of the ellipse is highlighted]
Developers build software using software. Those who control that software influence the behavior of developers.

At a trivial level, there's editors, IDEs, compilers. Version control: when I switched from svn to git, my behavior changed. I save my work way more often. I leave detailed stories in commit messages. I search the history all the time, because it's easy now. It's also easy to leave a lot of long-running local branches around, which is not good, but I do it.

What else can we influence? Frequency of deployment. How often we deploy is a function of how easy and how scary it is. This is determined by deployment automation and monitoring.

What programming language do we use? It's tempting to impose this, but then we introduce coupling (at a technical level, unnecessary to the business). Instead, we can incline people toward one programming language. Which language is easiest to monitor, deploy, and log in our infrastructure? Developers want to deliver features to users. They'll use the language we've made easy for them -- unless there's a specific reason not to. If Ruby is easy to test and deploy, they'll use Ruby -- unless this specific app has serious performance constraints. Then they'll write it in Go.

Caveat: beware the internal library or framework. If all Ruby apps need to use this internal framework that was exceptional at the time but has since been surpassed by open source... they might use Go for reasons that aren't about the project's needs. Internal frameworks: code that starts out as leverage quickly becomes baggage.

How about a new feature: create a new service, or tack it on to an existing one? This depends on how easy it is to spin up a new service. If it's equally easy either way, developers will put the code where it belongs, in the place that tells future readers something about the project.

All these incentives are set by infrastructure code.
[diagram: bottom half of an ellipse containing developers and software; top half of an overlapping ellipse contains software and architects]
When I had the title 'Architect,' I wrote infrastructure code. When I had the title 'Infrastructure Engineer,' I influenced architecture.

Beyond infrastructure: we care about the development flow, too. About making this legible. How many bugs are fixed, or features added? How long does each one take? We want to know what developers are working on, in a way that rolls up to managers, and to managers' managers. And so we bring in JIRA. With its required fields, and its valid values in dropdown boxes.

This is legibility through imposition. Nobody wants to context-switch over to JIRA to fill out its tracking forms. But the need for this information is real.

What if we can gather this data in tools that make the developer's life easier?

What if I get in on Monday and ask in slack, "@atomist what am I working on?" and receive a list of issues assigned to me, pull requests that need my review, and my PRs that are ready to merge? And a button on each issue called "Start" that moves the ticket to 'in process' in JIRA?

If we want every commit associated with an issue, how about a bot that says, "I see that you made this commit. Is it for this issue you're working on? or would you like me to create one for you?"
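This is not the Atomist API -- just a generic sketch of that bot rule, with invented names:

```python
# Sketch of a commit-watching bot: if a commit message references an open
# issue, link it; otherwise offer to create one. Everything here is
# hypothetical illustration, not any real bot's API.

import re

def commit_prompt(message, open_issues):
    """Return what the bot would say about a commit message."""
    match = re.search(r"#(\d+)", message)
    if match and int(match.group(1)) in open_issues:
        return f"linked to issue #{match.group(1)}"
    return ("I see that you made this commit. Is it for an issue "
            "you're working on, or would you like me to create one?")

print(commit_prompt("fix rounding #42", {42, 43}))  # linked to issue #42
```

The tracking data gets gathered as a side effect of work the developer was doing anyway.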

At Atomist we're working on a programming model that lets you automate these interactions, smooth out your process until developers are thrilled to use the tool that also gives you the tracking you need. Some companies have teams for developer tools that implement some of this automation; we're trying to make it easy enough that one person, part time, can create this magic for their teams.

This kind of automation is not easy. If you value it only as "how many times I do the task" X "how often I do it," it won't be worth it. Take spinning up a new project: it's way more than creating the code, setting up the repo with your preferred labels and team access. There's setting up continuous integration. Logging maybe, or service discovery. Nginx configuration, deployment procedures: multiple repositories. We're trying to streamline this, and believe me, it is not trivial.

But automation isn't just savings. Automation adds value. Consistency. Repeatability. Documentation (the only real documentation is code). No context switching. Fewer errors -- the less frequently we do something, the higher the error rate, so the higher the value of automation. We don't just remove work! We remove fear.

Automation isn't doing the same things faster. It changes what we do.

When architects create this automation, we can bring to bear our deep understanding of the development process, and of our particular business.

Architects should code.
Because code is power
 to help developers
 to build flexible software
 to improve the lives of cashiers and nurses
 to get us all out of the furniture store and the hospital a little quicker.

Books referenced in this talk:

Engineering a Safer World (pdf), by Nancy Leveson
Seeing Like a State, by James C. Scott
37 Things One Architect Knows About IT Transformation, by Gregor Hohpe

If you have a Safari membership you can see the whole video.

Thursday, February 23, 2017


Developers have a love-hate relationship with code re-use. As in, we used to love it. We love our code and we want it to run everywhere and help everyone. We want to get faster with time by harnessing the work of our former selves.
And yet, we come to hate it. Reuse means dependencies. It means coupling. It means surprises, when changing code impacts something we did not expect; or else it means don't touch it, it's too scary. It means trusting code we don't understand, because it's code we didn't write.

Here's the thing: sharing code is dangerous. Do it sparingly.

When reuse is bad

Let's talk about sharing code. Take a business, developing software for its employees or its customers. Let's talk about code within an organization that is referenced in more than one service, or by multiple flows in a monolith. (Monolith is defined as "one deployable unit maintained by more than one small team.")

Let's see some pictures. Purple Service here has some classes or functions that it finds useful, and the team thinks these would be useful elsewhere. Purple team breaks this code out into a library, the peachy circle.

[diagram: purple circle, peach circle inside]

Then someone from Purple team joins Blue team, and uses that library in Blue Service. You think it looks like this:
[diagram: peach circle under blue and purple circles]

Nah, it's really more like this:
[diagram: purple circle with peach circle inside. Blue circle has a line to peach circle]

This is called coupling. When Purple team changes their library, Blue team is affected. (If it's a monolith, their code changed underneath them. I hope they have good tests.)
Now, you could say, Blue team doesn't have to update their version. The level of reuse is the release, we broke out the library, so this is fine.
[diagram: purple circle with orange circle inside, blue circle with peach circle inside]

At that point you've basically forked, the code isn't shared anymore. When Blue team needs to make their own changes, they first must upgrade, so they get surprised some unpredictable time later. (This happened to us at Outpace all the time with our shared "util" libraries and it was the worst. So painful. Those "timesavers" cost us a lot of time and frustration.)

This shared code is a coupling between two services that otherwise have nothing to do with each other. The whole point of microservices was to decouple! To make it so our changes impact only code that our team operates! That's dead now. And for what?

To answer that, consider the nature of the shared code. Why is it shared?
Perhaps it is unrelated to the business: it is general utilities that would otherwise be duplicated, but we're being DRY and avoiding the extra work of writing and testing and debugging them a second time. In this case, I propose: cut and paste. Or fork. Or best of all, try a more formalized reuse-without-sharing procedure [link to my next post].

What if this is business-related code? What if we had good reason to DRY it out, because it would be wrong for this code to be different in Purple Service and Blue Service? Well sorry, it's gonna be different. Purple and Blue do not have the same deployment schedules, that's the point of decoupling into services. In this case, either you've made yourself a distributed monolith (requiring coordinated deployments), or you're ignoring reality. If the business requires exactly one version of this code, then make it its own service.
[diagram: yellow, purple, and blue circles, separate; dotted lines from yellow to purple and to blue]

Now you're not sharing code anymore. You're sharing a service. Changes to Peachy can impact Purple and Blue at the same time, because that's inherent in this must-be-consistent business logic.
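A sketch of the difference (the in-process call here stands in for an HTTP call; all names are illustrative):

```python
# "Make it its own service": Purple and Blue stop compiling in a shared
# library and both call the one deployed Peachy service, so exactly one
# version of the must-be-consistent logic runs in production.

class PeachyService:
    """The one deployment of the shared business rule."""
    def tax(self, amount):
        return round(amount * 0.07, 2)  # change this once; both callers see it

peachy = PeachyService()

def purple_checkout(amount):
    return amount + peachy.tax(amount)  # in real life, an HTTP call

def blue_invoice(amount):
    return amount + peachy.tax(amount)  # same service, same answer

print(purple_checkout(100.0), blue_invoice(100.0))  # 107.0 107.0
```

The coupling is still there, but now it's visible, deliberate, and deployed exactly once.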

It's easier with a monolith; that shared code stays consistent in production, because there is only one deployment. Any surprises happen immediately, hopefully in testing. In a monolith, if Peachy is utility classes or functions, and Purple (or Blue) team wants to change them, the safest strategy is: make a copy, use the copy, and change that copy. Over time, this results in less shared code.

This crucial observation is #2 in Modern Software Over-engineering Mistakes by RMX.
"Shared logic and abstractions tend to stabilise over time in natural systems. They either stay flat or relatively go down as functionality gets broader."
Business software is an expanding problem. It will always grow, and not with more of the same: it will grow in ways you didn't plan for. This kind of code must optimize for change. Reuse is the enemy of change. (I'm talking about reuse of internal code.)

Back in the beginning, Blue team reused the peach library and saved time writing code. But writing code isn't the expensive part, compared to changing code. We don't add features faster as our systems get larger and we have more code hypothetically available for re-use. We add features more slowly, because every change has more impacts and is less safe. Shared code makes change less safe. The only code safe to share is code that doesn't change. Which means no versioning. Heck, you might as well have cut and pasted it.

When reuse is good

We didn't advance as an industry by rewriting, or cut and pasting, everything we need over and over. We build on libraries published by developers and companies all over the globe. They release them, we reuse them. Yes, we get into dependency hell, but it beats writing your own web framework. We get reuse not only of the code, but of understanding: Rails knowledge transfers between employers.

There is a tipping point where reuse is magical.

I argue that this point is well past a release, past a separate jar.
It is past a stable API
past a coherent abstraction
past automated tests
past solid documentation...

All these might be achieved within the organization if responsibility for the shared utilities lives in a separate team; you can try to use Conway's Law to enforce architectural boundaries, but within an org, those boundaries are soft. And this code isn't your business, and you don't have incentives to spend the time on these. Why have backwards compatibility when you can perform human coordination instead? It isn't worth it. In my past organizations, shared code has instead been the responsibility of no one. What starts out as "leverage" becomes baggage, as all the Ruby code is tied to an old version of Sinatra. Some switch to Go to get a clean slate.
Break those chains! Copy the pieces you need out of that internal library and make them yours.

At the level of winning reuse, that code has its own marketing department
its own sales team
its own office manager
its own stock price.

The level of reuse is the company.

(Pay for software.)

When the responsible organization succeeds by making its code stable and backwards-compatible and easy to work with and well-documented and extensively tested, that is code I want to reuse!

In addition to SaaS companies and vendors, there are organizations built around open-source software. This is why we look for packages and frameworks with a broad community around them. Or better, a foundation for keeping shared projects healthy. (Contribute to them.)


Reuse is dangerous because it introduces coupling. Share business code only when that coupling is inherent to the business domain. Share library and utility code only when it is maintained by an organization dedicated to publishing that code. (Same with services. If you can pay for infrastructure-level tools, you'll get better tools without distracting your organization.)

Why did we want to reuse internal code anyway?
For speed, but speed of change is more important.
For consistency, but that means coupling. Don't hold your teams back with it.
For propagation of bug fixes, which I've not seen happen.

All three of these can be automated [LINK to my next post] without dependencies.

Next time you consider making your code reusable, ask "who will I sell this to?"
Next time someone (including you) suggests you reuse their code, ask "who publishes that?" and if they say "me," copy it instead.


How important is correctness?

This is a raging debate in our industry today. I think the answer depends strongly on the kind of problem a developer is trying to solve: is the problem contracting or expanding? A contracting problem is well-defined, or has the potential to be well-defined with enough rigorous thought. An expanding problem cannot be; as soon as you've defined "correct," you're wrong, because the context has changed.

A contracting problem: the more you think about it, the clearer it becomes. This includes anything you can define with math, or with a stable specification: image conversion, file compression. There are others, ones we've solved so many times or used in so many ways that they stabilize: web servers, grep. The problem space is inherently specified, or it has become well-defined over time.
Correctness is possible here, because there is such a thing as "correct." These programs are useful to many people, so correctness is worth the effort. Use of such a program or library is freeing; it scales up the capacity of the industry as a whole, as this becomes something we don't have to think about.

An expanding problem: the more you think about it, the more ways it can go. This includes pretty much all business software; we want our businesses to grow, so we want our software to do more and different things with time. It includes almost all software that interacts directly with humans. People change, culture changes, expectations get higher. I want my software to drive change in people, so it will need to change with us.
There is no complete specification here. No amount of thought and care can get this software perfect. It needs to be good enough, it needs to be safe enough, and it needs to be amenable to change. It needs to give us the chance to learn what the next definition of "good" might be.

I propose we change our aim for correctness to an aim for safety. Safety means, nothing terrible happens (for your business's definition of terrible). Correctness is an extreme form of safety. Performance is a component of safety. Security is part of safety.

Tests don't provide correctness, yet they do provide safety. They tell us that certain things aren't broken yet. Process boundaries provide safety. Error handling, monitoring, everything we do to compensate for the inherent uncertainty of running software in production, all of these help enforce safety constraints.
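A small sketch of the distinction, with an invented discount function: these checks don't specify the "correct" result, they assert that terrible things don't happen.

```python
# Tests as safety constraints, not proofs of correctness: we assert that
# prices never go negative and that a zero discount changes nothing.
# The function and its invariants are illustrative assumptions.

def discounted(price, fraction):
    return max(0.0, price * (1 - fraction))

# Safety checks, not a specification:
for price in [0.0, 9.99, 100.0]:
    assert discounted(price, 0.0) == price   # a no-op is a no-op
    assert discounted(price, 1.5) >= 0.0     # never a negative price
print("nothing terrible happened")
```

Whether 1.5 is even a sensible discount is a business question; the safety constraint holds either way.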

In an expanding software system, business matters (like profit) determine what is "good enough." Risk tolerance goes into what is "safe enough." Optimizing for the future means optimizing our ability to change.

In a contracting solution, we can progress through degrees of safety toward correctness and optimal performance. Break out the formal specification; write great documentation.

Any piece of our expanding system that we can break out into a contracting problem space is a win. We can solve it with rigor, even make it eligible for reuse.

For the rest of it - embrace uncertainty, keep the important parts working, and make the code readable so we can change it. In an expanding system, where tests are limited and limiting, documentation becomes more wrong every day, the code is the specification. Aim for change.

Monday, January 16, 2017


Dependency management.
Nobody wants to think about it. We just want this stuff to work.
It is one of the nasty sneaky unsolved problems in software.

Each language system says, we've got a package manager and a build tool. This is how dependencies work. Some are better than others (I <3 Elm; npm OMG) but none of them are complete. We avert our eyes.

Dependencies are important. They're the edges in the software graph, and edges are always where the meaning lies. Edges are also harder to focus on than nodes.

They can be relatively explicit, declared in a pom.xml or package.json.
They can be hard to discover, like HTTP calls to URLs constructed from configuration + code + input.
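Here's why the second kind is hard to discover, sketched with invented names: the URL only exists at runtime, assembled from configuration, code, and input, so there's nothing to grep for in any manifest.

```python
# A service dependency no pom.xml or package.json declares: the URL is
# built from config + code + input. (Host and path are hypothetical.)

config = {"billing_host": "billing.internal.example.com"}

def billing_url(config, customer_id):
    # configuration + code + input -> an undeclared dependency
    return f"https://{config['billing_host']}/v2/customers/{customer_id}/invoices"

print(billing_url(config, 1234))
# https://billing.internal.example.com/v2/customers/1234/invoices
```

To find this edge of the software graph, you have to read the code, or watch the traffic.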

Dependencies describe how we hook things together. Which means, they also determine our options for breaking things apart. And breaking things apart is how we scale a software system -- scale in our heads, that is; scale in terms of complexity, not volume.

If we look carefully at them, stop wishing they would stop being such a bother, maybe we can get this closer to right. There's a lot more to this topic than is covered in this post, but it's a start.

Libraries v Services

The biggest distinction. Definitions:

Libraries are compiled in. They're separate modules, in different repositories (or directories in a giant repository of doom (aka monorepo)). They are probably maintained by different companies or at least teams. Code re-use is achieved by compiling the same code into multiple applications (aka services). [I'm talking about compile-scope libraries here. Provided-scope, and things like .dll's (what is that even called) are another thing that should probably be a separate category in this post but isn't included.]

Services: one application calls another over the network (or sockets on the same machine); the code runs in different processes. There's some rigmarole in how to find each other: service discovery is totally a problem of its own, with DNS as the most common solution.



Libraries are declared explicitly, although not always specifically. Something physically brings their code into my code, whether as a jar or as source pulled in directly.

Service dependencies are declared informally if at all. They may be discovered in logging. They may be discernible from security groups, if you're picky about which applications are allowed to access which other ones.


Here's a crucial difference IMO. Libraries: you can release it and ask people to upgrade. If your library is internal, you may even upgrade the version in other teams' code. But it's the users of your library that decide when that new version goes into production. Your new code is upgraded when your users choose to deploy the upgraded code.

Services: You choose when it's upgraded. You deploy that new code, you turn off the old code, and that's it. Everyone who uses your service is using the new code. Bam. You have the power.
This also means you can choose how many of them are running at a time. This is the independent-scalability thing people get excited about.

If your library/service has data backing it, controlling code deployment means a lot for the format of the data. If your database is accessed only by your service, then you can put any necessary translations into the code. If your database is accessed by a library that other people incorporate, you'd better keep that schema compatible.


There's a lovely Rich Hickey talk, my notes here, about versioning libraries. Much of it also applies to services.

If you change the interface to a library, what you have is a different library. If you name it the same and call it a new version, then what you have is a different library that refuses to compile with the other one and will fight over what gets in. Then you get into the whole question of version conflicts, which different language systems resolve in different ways. Version conflicts occur when the application declares dependencies on two libraries, each of which declares a dependency on a different version of the same third library. In JavaScript, whatever, we can compile in both of them, it's just code copied in anyway. In Java, thou mayest have only one definition of each class name in a given ClassLoader, so the tools choose one version (the newest, or the nearest declaration, depending on the tool) and hope everyone can cope.
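To make that diamond concrete, here's a hypothetical pom.xml fragment (artifact names invented) where the application pulls in two libraries that disagree about a shared third one:

```xml
<!-- Hypothetical diamond dependency:
     lib-a brings in lib-common 1.2, lib-b brings in lib-common 2.0.
     Only one lib-common can win per ClassLoader. -->
<dependencies>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>lib-a</artifactId>   <!-- depends on lib-common 1.2 -->
    <version>3.1</version>
  </dependency>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>lib-b</artifactId>   <!-- depends on lib-common 2.0 -->
    <version>5.0</version>
  </dependency>
</dependencies>
```

How the conflict resolves depends on the tool: Maven picks the declaration nearest the root of the dependency tree, Gradle defaults to the highest version, and npm can install both copies side by side.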

Services, you can get complicated and do some sort of routing by version; you can run multiple versions of a service in production at the same time. See? You call it two versions of the same service, but it's actually two different services. Same as the libraries. Or, you can support multiple versions of the API within the same code. Backwards compatibility, it's all the pain for you and all the actual-working-software for your users.

API Changes and Backwards Compatibility

So you want to change the way users interact with your code. There's an important distinction here: changing your code (refactoring, bug fix, complete rewrite) is very different from requiring customers to change their code in order to keep working with yours. That's a serious impact.

Services: who uses it? Maybe it's an internal service and you have some hope of grepping all company code for your URL. You have the option of personally coordinating with those teams to change the usage of your service.
Or it's a public-facing service. DON'T CHANGE IT. You can never know who is using it. I mean maybe you don't care about your users, and you're OK with breaking their code. Sad day. Otherwise, you need permanent backwards-compatibility forever, and yes, your code will be ugly.

Libraries: if your package manager is respectable (meaning: immutable; if it ever provides a certain library-version, it will continue to provide the same download forever), then your old versions are still around, and they can stay in production. You can't take that code away. However, you can break any users who aren't ultra-specific about their version numbers. That's where semantic versioning comes in; it's rude to change the API in anything short of a major version, and people are supposed to be careful about picking up a new major version of your library.
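That care shows up in how users declare the dependency. A sketch, with invented coordinates:

```xml
<!-- Exact pin: immune to your new releases until the user edits this line. -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>widget-lib</artifactId>
  <version>1.4.2</version>
</dependency>

<!-- Maven version range: picks up new 1.x versions automatically --
     and breaks the day one of them sneaks in an API change. -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>widget-lib</artifactId>
  <version>[1.4,2.0)</version>
</dependency>
```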
But if you're nice you could name it something different, instead of pretending it's a different number of the same thing.


A tricky thing about libraries: it's way harder to know "what is an API change?"
With services it's clear; we recognize certain requests, and provide certain responses.
With libraries, there's all the public methods and public classes and packages and ... well, as a Java/Scala coder, I've never been especially careful about what I expose publicly. But library authors need to be if they're ever going to safely change anything inside the library.

Services are isolated: you can't depend on my internals because you physically can't access them. In order to expose anything to external use I have to make an explicit decision. This is much stronger modularity. It also means you can write them in different languages. That's a bonus.

There are a few companies that sell libraries. Those are some serious professionals, there. They have to test versions from way-back, on every OS they could run on. They have to be super aware of what is exposed, and test the new versions against a lot of scenarios. Services are a lot more practical to throw out there - even though backwards compatibility is a huge pain, at least you know where it is.


Libraries: it fails, your code fails. It runs out of memory, goodbye process. Failures are communicated synchronously, and if it fails, the app knows it.

Services: it fails, or it doesn't respond and you don't really know whether it failed ... ouch. Partial failures, indeterminate failures, are way harder. Even on the same machine coordinating over a socket, we can't guarantee the response time or whether responses are delivered at all. This is all ouch, a major cost of using this modularization mechanism.
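The contrast can be sketched in a few lines of Java (the remote call is simulated with a thread that never answers in time; all names here are mine, for illustration):

```java
import java.util.concurrent.*;

public class FailureModes {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // Library-style failure: synchronous. The exception tells us exactly what happened.
        try {
            "not a number".charAt(-1); // blows up right here, in our process
        } catch (IndexOutOfBoundsException e) {
            System.out.println("library call failed, and we know it");
        }

        // Service-style failure: we time out, but the work may still be in flight.
        Future<String> remote = pool.submit(() -> {
            Thread.sleep(60_000); // a slow or dead service
            return "ok";
        });
        try {
            remote.get(100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Did the service process the request? Fail? Never receive it? Unknown.
            System.out.println("remote call timed out: outcome indeterminate");
        }
        pool.shutdownNow();
    }
}
```

The library failure arrives as a specific exception at the call site; the service failure arrives as an absence of information, which is why retries, idempotency, and timeouts become your problem.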


I think the biggest consideration in choosing whether to use libraries or services for distribution of effort / modularization is that choice of who decides when it deploys. Who controls which code is in production at a given time.

Libraries are more efficient, and their failures are easier to handle. That's a thing. In-process communication is faster, failure handling is simpler, and consistency is possible.

Services are actual decoupling. They let a team be responsible for their own software, writing it and operating it. They let a team choose what is in production at a given time -- which means there's hope of ever changing data sources or schemas. Generally, I think the inertia present in data, data which has a lot of value, is underemphasized in discussion of software architecture. If you have a solid service interface guarding access to your data, you can (with a lot of painful work) move it into a different database or format. Without that, data migrations may be impossible.

Decoupling of time-of-deployment is essential for maintaining forward momentum as an organization grows from one team to many. Decoupling of features and of language systems, versions, tools helps too. To require everyone use the same tools (or heaven forbid, repository) is to couple every team to another in ways that are avoidable. I want my teams and applications coupled (integrated) in ways that streamline the customer's experience. I don't need them coupled in ways that streamline the development manager's experience.

Overall: libraries are faster until coordination is the bottleneck. Services add more openings to your bottle. That can make your bottle harder to understand.

There's a lot more to the problems of dependency management. This is one crucial distinction. All choices are valid, when made consciously in context. Try to focus through your tears.

Friday, January 13, 2017

Today's Rug: maven executable jar

I like being a polyglot developer, even though it's painful sometimes. I use lots of languages, and in every one I have to look stuff up. That costs me time and concentration.

Yesterday I wanted to promote my locally-useful project from "I can run it in the IDE" to "I can run it at the command line." It's a Scala project built in maven, so I need an executable jar. I've looked this up and figured this out at least twice before. There's a maven plugin you have to add, and then I have to remember how to run an executable jar, and put that in a script. All this feels like busywork.

What's more satisfying than cut-and-pasting into my pom.xml and writing another script? Automating these! So I wrote a Rug editor. Rug editors are code that changes code. There's a Pom type in Rug already, with a method for adding a build plugin, so I cut and paste the example from the internet into my Rug. Then I fill in the main class; that's the only thing that changes from project to project so it's a parameter to my editor. Then I make a script that calls the jar. (The script isn't executable. I submitted an issue in Rug to add that function.) The editor prints out little instructions for me, too.
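For reference, the plugin in question is maven-shade-plugin; a typical configuration looks roughly like this (a sketch: the `executable` jar name matches the run script below, and `__I_AM_THE_MAIN__` is the placeholder the editor replaces, but the exact XML the Rug generates may differ):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <!-- name the jar consistently so a script can find it -->
        <finalName>executable</finalName>
        <transformers>
          <!-- set Main-Class in the manifest so `java -jar` works -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>__I_AM_THE_MAIN__</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```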

$ rug edit -lC ~/code/scala/org-dep-graph MakeExecutableJar main_class=com.jessitron.jessMakesAPicture.MakeAPicture

Resolving dependencies for jessitron:scattered-rugs:0.1.0 ← local completed
Loading jessitron:scattered-rugs:0.1.0 ← local into runtime completed
run `mvn package` to create an executable jar
Find a run script in your project's bin directory. You'll have to make it executable yourself, sorry.
Running editor MakeExecutableJar of jessitron:scattered-rugs:0.1.0 ← local completed

→ Project
  ~/code/scala/org-dep-graph/ (8 mb in 252 files)

→ Changes
  ├── pom.xml updated 2 kb
  ├── pom.xml updated 2 kb
  ├── bin/run created 570 bytes
  └── .atomist.yml created 702 bytes

Successfully edited project org-dep-graph

It took a few iterations to get it working, probably half an hour more than doing the task manually.
It feels better to do something permanently than to do it again.

Encoded in this editor is knowledge:
* what is that maven plugin that makes an executable jar? [1]
* how do I add it to the pom? [2]
* what's the maven command to build it? [3]
* how do I get it to name the jar something consistent? [4]
* how do I run an executable jar? [5]
* how do I find the jar in a relative directory from the script? [6]
* how do I get that right even when I call the script from a symlink? [7]

It's like saving my work, except it's saving the work instead of the results of the work. This is going to make my brain scale to more languages and build tools.

below the fold: the annotated editor. source here, instructions here in case you want to use it -> or better, change it -> or even better, make your own.

@description "teaches a maven project how to make an executable jar"
@tag "maven"
editor MakeExecutableJar

@displayName "Main Class"
@description "Fully qualified Java classname"
@minLength 1
@maxLength 100
param main_class: ^.*$

let pluginContents = """<plugin>
    ...
[4]         </configuration>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
    ...
""" [2]

let runScript = """#!/bin/bash

SOURCE="${BASH_SOURCE[0]}"
while [ -h "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
  DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
  SOURCE="$(readlink "$SOURCE")"
  [[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done [7]
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"

java -jar $DIR/../target/executable.jar "$@" [5]
"""

with Pom p
  do addOrReplaceBuildPlugin "org.apache.maven.plugins" "maven-shade-plugin" pluginContents [1]

with File f when path = "pom.xml" begin
  do replace "__I_AM_THE_MAIN__" main_class
  do eval { print("run `mvn package` to create an executable jar") } [3]
end

with Project p begin
  do eval { print("Find a run script in your project's bin directory. You'll have to make it executable yourself, sorry") }
  do addFile "bin/run" runScript
end

Monday, January 9, 2017

It's Atomist Time!

I'm hella excited to get to work on Atomist full-time starting now (January 2017). Why? What do they do? Oh let me tell you!

I love developing software, not least because we (as an industry) have not yet figured out how to software. We know it's powerful, but not yet how powerful. Software is like engineering except that the constraints aren't in what we can build, but what we can design and specify. Atomist is expanding this capacity.
Atomist builds tooling to smooth some bumps in software development. There are three components that I'm excited about, three components that open new options in how we develop software.

Component 1: Code that changes code

First, there are code editors, called Rugs. On the surface, these automate the typing part. Like code generators, except they continue to work with the code after you modify it. Like refactorings in an IDE, except they appear as a pull request, and then you can continue development on that branch. If you have some consistent code structure (and if you use a framework, you do), Rugs can perform common feature-adding or upgrading or refactoring operations. Use standard Rugs to, say, add graph database support to an existing Spring Boot project. Customize Rugs to set up your Travis build uniformly in your projects. Create your own Rugs to implement metrics integration according to your company's standards -- and to upgrade existing code when those standards change.

On the surface this is an incremental improvement over existing code generation and IDE refactoring tools. Yet, I see it as something more. I see it as a whole new answer to the question of "indirection or repetition?" in code. Take for instance: adding a field to a Rails app makes us change the controller, the model, and four other places. Or creating a new service means changing deployment configuration, provisioning, and service discovery. Whenever a single conceptual change requires code changes in multiple spots, we complain about the work and we make mistakes. Then we start to get clever with it: we introduce some form of indirection that localizes that change to one place. Configuration files get generated in the build, Ruby metaprogramming introduces syntax that I can't even figure out how it executes -- magic happens. The code gets less explicit, so that we can enforce consistency and make changing it ... well, I'm not gonna say "easier" because learning to cast the spell is tricky, but it is less typing.

Atomist introduces a third alternative: express that single intention ("create a new service" or "add this field") as a Rug editor. This makes writing it one step, and then the editor makes all those code changes in a single commit in a branch. From there, customize your field or your new service; each commit that you make shows how your feature is special. The code remains explicit, without additional magic. When I come back and read it, I have some hope of understanding what it's doing. When I realize that I forgot something ("oops! I also need to add that service to the list of log sources") then I fix it once, in the NewService.rug editor. Now I never forget, and I never have to remember.

I love this about developing with Rugs: as I code, I'm asking myself, "how could I automate this?" and then what I learn is encoded in the Rug, for the benefit of future-me and (if I publish it) of future-everyone-else. That is when I feel productive.

Component 2: Coordination between projects

Editors are cute when applied to one project. When applied across an organization, they start to look seriously useful. Imagine: A library released a security update, and we need to upgrade it across the organization. Atomist creates a pull request on every project that uses that library. The build runs, maybe we even auto-merge it when the build passes. Or perhaps there are breaking changes; the editor can sometimes be taught how to make those changes in our code.

And if a Rug can change the way we use a library, then it can change the way we use ours. This is cross-repository refactoring: I publish an internal library, and I want to rename this function in the next version. Here's my game: I publish not only the new version of my library, but an editor - and then I ask Atomist to create pull requests across the organization. Now it is a quick code review and "accept" for teams to upgrade to the new version.

Atomist coordinates with teams in GitHub and in Slack. Ask Atomist in Slack to start that new feature for you, or to check all repositories in the organization and create pull requests. Atomist can also coordinate with continuous integration. It ties these pieces together across repositories, and including humans. It can react to issues, to build results, to merges; and it can ping you in Slack if it needs more information to act appropriately. I have plans to use this functionality to link libraries to the services that use them: when the build passes on my branch, go build the app that uses my library with this new version, and tell me whether those tests pass.

This is cross-repository refactoring and cross-repository build coordination. This gives companies an alternative to the monorepo, to loading all their libraries and services into one giant repository in order to test them together. The monorepo is a lie: our deployments are heterogenous, so while the monorepo is like "look at this lovely snapshot of a bunch of code that works together" the production environment is something different. The monorepo is also painful because git gets slow when the repository gets large; because it's hard to tell which commits affect which deployed units; and because application owners lose control over when library upgrades are integrated. Atomist will provide a layer on top of many repositories, letting us coordinate change while our repositories reflect production realities.

Atomist tooling will make multirepo development grow with our codebases.

Component 3: is still a secret

I'm not sure I can talk about the third piece of possibility-expanding tooling yet. So have this instead:

Automated coordination among systems and people who interact with code -- this is useful everywhere, but it's a lot of work to create our own bots for this. Some companies put the resources into creating enough automation for their own needs. No one business-software-building organization has a reason to develop, refine, and publish a general solution for this kind of development-process automation. Atomist does.

When it becomes easy for any developer to script this coordination and the reactions just happen -- "Tell me when an issue I reported was closed" "Create a new issue for this commit and then mark it closed as soon as this branch is merged" -- then we can all find breakages earlier and we can all keep good records. This automates my work at a higher level than coding. This way whenever I feel annoyed by writing a status report, or when I forget to update the version in one place to match the version in another, my job is not to add an item to a checklist. My job is to create an Atomist handler script to make that happen with no attention from me.

My secret

I love shaving yaks. Shaving them deeply, tenderly, finding the hidden wisdom under their hair. I love adding a useful feature, and then asking "How could that be easier?" and then "How could making that easier be easier?" This is Atomist's level of meta: We are making software to make it easier for you to make your work easier, as you work to make software to make your customers' lives easier.

I think we're doing this in depths and ways other development tools don't approach. At this level of meta (software for building software for building software for doing work), there's a lot of leverage, a lot of potential. This level of meta is where orders-of-magnitude changes happen. Software changes the world. I want to be part of changing the software world again, so we can change the real world even faster.

With Atomist, I get to design and specify my own reality, the reality of my team's work. (Atomist does the operations bit.) Without spending tons of time on it! Well, I get to spend tons of time on it because I get to work for Atomist, because that's my thing. But you don't have to spend tons of time on it! You get to specify what you want to happen, in the simplest language we can devise.
We're looking for teams to work with us on alpha-testing, if you're interested now. (join our slack, or email me) Let's learn together the next level of productivity and focus in software development.