Thursday, February 23, 2017

Reuse

Developers have a love-hate relationship with code re-use. As in, we used to love it. We love our code and we want it to run everywhere and help everyone. We want to get faster with time by harnessing the work of our former selves.
And yet, we come to hate it. Reuse means dependencies. It means couplings. It means surprises, when changing code impacts something we did not expect, or else it means don't touch it, it's too scary. It means trusting code we don't understand because it's code we didn't write.

Here's the thing: sharing code is dangerous. Do it sparingly.

When reuse is bad


Let's talk about sharing code. Take a business, developing software for its employees or its customers. Let's talk about code within an organization that is referenced in more than one service, or by multiple flows in a monolith. (Monolith is defined as "one deployable unit maintained by more than one small team.")

Let's see some pictures. Purple Service here has some classes or functions that it finds useful, and the team thinks these would be useful elsewhere. Purple team breaks this code out into a library, the peachy circle.

purple circle, peach circle inside

Then someone from Purple team joins Blue team, and uses that library in Blue Service. You think it looks like this:
peach circle under blue and purple circles


Nah, it's really more like this:
purple circle with peach circle inside. Blue circle has a line to peach circle


This is called coupling. When Purple team changes their library, Blue team is affected. (If it's a monolith, their code changed underneath them. I hope they have good tests.)
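
To make that concrete, here is a minimal sketch of the surprise. Everything in it is made up for illustration: `format_customer_name_v1` and `format_customer_name_v2` stand in for the peach library before and after a Purple team change, and `blue_mailing_label` stands in for Blue Service code that calls it.

```python
# Hypothetical peach library, owned by Purple team (original behavior).
def format_customer_name_v1(first, last):
    return f"{first} {last}"


# The same library function after Purple team changes it for their own
# needs: last name first, because Purple's reports are sorted that way.
def format_customer_name_v2(first, last):
    return f"{last}, {first}"


# Hypothetical Blue Service code, written against the original behavior.
# The library function is passed in so both versions can be compared.
def blue_mailing_label(format_customer_name, first, last):
    return "To: " + format_customer_name(first, last)


before = blue_mailing_label(format_customer_name_v1, "Ada", "Lovelace")
after = blue_mailing_label(format_customer_name_v2, "Ada", "Lovelace")

print(before)  # To: Ada Lovelace
print(after)   # To: Lovelace, Ada
```

Nobody on Blue team touched `blue_mailing_label`, and its output changed anyway.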
Now, you could say, Blue team doesn't have to update their version. The level of reuse is the release, we broke out the library, so this is fine.
picture of purple with orange circle, blue with peach circle.

At that point you've basically forked; the code isn't shared anymore. When Blue team needs to make their own changes, they first must upgrade, so they get surprised some unpredictable time later. (This happened to us at Outpace all the time with our shared "util" libraries and it was the worst. So painful. Those "timesavers" cost us a lot of time and frustration.)

This shared code is a coupling between two services that otherwise have nothing to do with each other. The whole point of microservices was to decouple! To make it so our changes impact only code that our team operates! Dead. And for what?

To answer that, consider the nature of the shared code. Why is it shared?
Perhaps it is unrelated to the business: it is general utilities that would otherwise be duplicated, but we're being DRY and avoiding the extra work of writing and testing and debugging them a second time. In this case, I propose: cut and paste. Or fork. Or best of all, try a more formalized reuse-without-sharing procedure [link to my next post].

What if this is business-related code? What if we had good reason to DRY it out, because it would be wrong for this code to be different in Purple Service and Blue Service? Well sorry, it's gonna be different. Purple and Blue do not have the same deployment schedules, that's the point of decoupling into services. In this case, either you've made yourself a distributed monolith (requiring coordinated deployments), or you're ignoring reality. If the business requires exactly one version of this code, then make it its own service.
picture with yellow, purple, and blue circles separate, dotty lines from yellow to purple and to blue.


Now you're not sharing code anymore. You're sharing a service. Changes to Peachy can impact Purple and Blue at the same time, because that's inherent in this must-be-consistent business logic.

It's easier with a monolith; that shared code stays consistent in production, because there is only one deployment. Any surprises happen immediately, hopefully in testing. In a monolith, if Peachy is utility classes or functions, and Purple (or Blue) team wants to change them, the safest strategy is: make a copy, use the copy, and change that copy. Over time, this results in less shared code.
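
A rough sketch of the copy-then-change strategy, assuming a made-up shared helper called `slugify` and a made-up Purple-owned copy called `purple_slugify`:

```python
# Hypothetical shared utility in the monolith, used by both teams.
# Nobody changes this one; Blue team keeps relying on it as-is.
def slugify(text):
    return text.strip().lower().replace(" ", "-")


# Purple team's own copy, living in Purple's package.
# Purple's change (dropping apostrophes) is made here only.
def purple_slugify(text):
    return text.strip().lower().replace("'", "").replace(" ", "-")


blue_result = slugify("Blue's Report")
purple_result = purple_slugify("Purple's Report")

print(blue_result)    # blue's-report
print(purple_result)  # purples-report
```

Purple gets the behavior they want, and Blue's output is exactly what it was yesterday. The price is a few duplicated lines, which is cheap next to a surprise in someone else's flow.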

This crucial observation is #2 in Modern Software Over-engineering Mistakes by RMX.
"Shared logic and abstractions tend to stabilise over time in natural systems. They either stay flat or relatively go down as functionality gets broader."
Business software is an expanding problem. It will always grow, and not with more of the same: it will grow in ways you didn't plan for. This kind of code must optimize for change. Reuse is the enemy of change. (I'm talking about reuse of internal code.)

Back in the beginning, Blue team reused the peach library and saved time writing code. But writing code isn't the expensive part, compared to changing code. We don't add features faster as our systems get larger and we have more code hypothetically available for re-use. We add features more slowly, because every change has more impacts and is less safe. Shared code makes change less safe. The only code safe to share is code that doesn't change. Which means no versioning. Heck, you might as well have cut and pasted it.

When reuse is good


We didn't advance as an industry by rewriting, or cutting and pasting, everything we need over and over. We build on libraries published by developers and companies all over the globe. They release them, we reuse them. Yes, we get into dependency hell, but it beats writing your own web framework. We get reuse not only of the code, but of understanding: Rails knowledge transfers between employers.

There is a tipping point where reuse is magical.

I argue that this point is well past a release, past a separate jar.
It is past a stable API
past a coherent abstraction
past automated tests
past solid documentation...

All these might be achieved within the organization if responsibility for the shared utilities lives in a separate team; you can try to use Conway's Law to enforce architectural boundaries, but within an org, those boundaries are soft. And this code isn't your business, and you don't have incentives to spend the time on these. Why have backwards compatibility when you can perform human coordination instead? It isn't worth it. In my past organizations, shared code has instead been the responsibility of no one. What starts out as "leverage" becomes baggage, as all the Ruby code is tied to an old version of Sinatra. Some switch to Go to get a clean slate.
Break those chains! Copy the pieces you need out of that internal library and make them yours.

At the level of winning reuse, that code has its own marketing department
its own sales team
its own office manager
its own stock price.

The level of reuse is the company.

(Pay for software.)

When the responsible organization succeeds by making its code stable and backwards-compatible and easy to work with and well-documented and extensively tested, that is code I want to reuse!

In addition to SaaS companies and vendors, there are organizations built around open-source software. This is why we look for packages and frameworks with a broad community around them. Or better, a foundation for keeping shared projects healthy. (Contribute to them.)

Conclusion


Reuse is dangerous because it introduces coupling. Share business code only when that coupling is inherent to the business domain. Share library and utility code only when it is maintained by an organization dedicated to publishing that code. (Same with services. If you can pay for infrastructure-level tools, you'll get better tools without distracting your organization.)

Why did we want to reuse internal code anyway?
For speed, but speed of change is more important.
For consistency, but that means coupling. Don't hold your teams back with it.
For propagation of bug fixes, which I've not seen happen.

All three of these can be automated [LINK to my next post] without dependencies.

Next time you consider making your code reusable, ask "who will I sell this to?"
Next time someone (including you) suggests you reuse their code, ask "who publishes that?" and if they say "me," copy it instead.

Correctness

How important is correctness?

This is a raging debate in our industry today. I think the answer depends strongly on the kind of problem a developer is trying to solve: is the problem contracting or expanding? A contracting problem is well-defined, or has the potential to be well-defined with enough rigorous thought. An expanding problem cannot; as soon as you've defined "correct," you're wrong, because the context has changed.

A contracting problem: the more you think about it, the clearer it becomes. This includes anything you can define with math or a stable specification: image conversion, file compression. It also includes problems we've solved so many times or used so many ways that they have stabilized: web servers, grep. The problem space is inherently specified, or it has become well-defined over time.
Correctness is possible here, because there is such a thing as "correct." These programs are useful to many people, so correctness is worth the effort. Using such a program or library is freeing: it scales up the capacity of the industry as a whole, because this becomes something we don't have to think about.

An expanding problem: the more you think about it, the more ways it can go. This includes pretty much all business software; we want our businesses to grow, so we want our software to do more and different things with time. It includes almost all software that interacts directly with humans. People change, culture changes, expectations get higher. I want my software to drive change in people, so it will need to change with us.
There is no complete specification here. No amount of thought and care can get this software perfect. It needs to be good enough, it needs to be safe enough, and it needs to be amenable to change. It needs to give us the chance to learn what the next definition of "good" might be.

Safety
I propose we change our aim for correctness to an aim for safety. Safety means nothing terrible happens (for your business's definition of terrible). Correctness is an extreme form of safety. Performance is a component of safety. Security is part of safety.

Tests don't provide correctness, yet they do provide safety. They tell us that certain things aren't broken yet. Process boundaries provide safety. Error handling, monitoring, everything we do to compensate for the inherent uncertainty of running software in production, all of these help enforce safety constraints.
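
Here's a small sketch of a test that aims for safety rather than correctness. The function `apply_discount` and its rounding rule are invented for the example; the point is that the check does not pin down the "right" discount, only that nothing terrible comes out (a negative price, or a discount that raises the price).

```python
def apply_discount(price_cents, percent):
    """Hypothetical business function; its exact rules will keep changing."""
    discounted = price_cents - (price_cents * percent) // 100
    return max(discounted, 0)


def check_safety(prices, percents):
    """Assert the safety constraint for every combination of inputs."""
    for price in prices:
        for pct in percents:
            result = apply_discount(price, pct)
            # Safety constraint: never negative, never above the original.
            assert 0 <= result <= price, (price, pct, result)
    return True


safe = check_safety(prices=[0, 1, 999, 10_000], percents=[0, 15, 50, 100])
print(safe)  # True
```

The business can change how discounts round or stack next month, and this check still says something useful: the terrible outcomes haven't shown up yet.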

In an expanding software system, business matters (like profit) determine what is "good enough." Risk tolerance goes into what is "safe enough." Optimizing for the future means optimizing our ability to change.

In a contracting problem, we can progress through degrees of safety toward correctness and optimal performance. Break out the formal specification, write great documentation.

Any piece of our expanding system that we can break out into a contracting problem space is a win. We can solve it with rigor, even make it eligible for reuse.

For the rest of it - embrace uncertainty, keep the important parts working, and make the code readable so we can change it. In an expanding system, where tests are limited and limiting and documentation becomes more wrong every day, the code is the specification. Aim for change.