Saturday, June 6, 2015

Ultratestable Coding Style

Darn side-effecting programs. Programs that change things in the outside world are so darn useful, and such a pain to test.
For every piece of code, there is another piece of code that answers the question, "How do I know that code works?" Sometimes that's more work than the code itself -- but there is hope.

The other day, I made a program to copy some code from one project to another - two file copies, with one small change to the namespace declaration at the top of each file. Sounds trivial, right?

I know better: there are going to be a lot of subtleties. And this isn't throwaway code. I need good, repeatable tests.

Where do I start? Hmm, I'll need a destination directory with the expected structure, an empty source directory, files with the namespace at the top... oh, and cleanup code. All of these are harder than I expected, and the one test I did manage to write is specific to my filesystem. Writing code to verify code is so much harder than just writing the code!

Testing side-effecting code is hard. This is well established. The tests are also convoluted, complex, and generally brittle.
The test process looks like this:
[Diagram: grumpy cat says "0 out of 10". Input -> code under test -> output; but also: prep the files in the right place and clear old files out, then the code under test reads & writes the filesystem, then check that the files are correct.]


Before the test, create the input AND go to the filesystem, prepare the input and the spot where output is expected.
After the test, check the output AND go to the filesystem, read the files from there and check their contents.
Everything is intertwined: the prep, the implementation of the code under test, and the checks at the end. It's specific to my filesystem. And it's slow. No way can I run more than a few of these each build.


The usual solution to this is to mock the filesystem. Use a ports-and-adapters approach. In OO you might use dependency injection; in FP you'd pass functions in for "how to read" and "how to write." This isolates our code from the real filesystem. Tests are faster and less tightly coupled to the environment. The test process looks like this:
[Diagram: input plus "how to read" and "how to write" go into the test, with prepared results in "how to read"; the code under test calls "how to read" and "how to write"; at the end, check the number and arguments of the calls to "how to write".]

Before the test, create the input AND prepare the mock read results and initialize the mock for write captures.
After the test, check the output AND interrogate the mock for write captures.

It's an improvement, but we can do better. The test is still convoluted. Elaborate mocking frameworks might make it cleaner, but conceptually, all those ties are still there, with the stateful how-to-write that we pass in and then ask later, "What were your experiences during this test?"

If I move the side effects out of the code under test -- gather all input beforehand, perform all writes afterward -- then the decisionmaking part of my program becomes easier and clearer to test. It can look like this (code):
[Diagram: grumpy cat smiles and says "YES". Input, plus any contents you might read, go in; code under test; "please write this to here" and "please write that to there" come out with the output.]

The input includes everything my decisions need to know from the filesystem: the destination directory and list of all files in it; the source directory and list plus contents of all files in it.
The output includes a list of instructions, for the side effects the code would like to perform. This is super easy to check at the end of a test.
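To make that concrete, here's a rough sketch of the decisioning function. This isn't the actual microlib code -- every name here (copy-instructions, change-namespace, the shape of the input map) is made up for illustration:

(ns copier.core
  (:require [clojure.string :as str]))

;; Pure text transformation: swap the namespace at the top of the file.
(defn change-namespace
  [contents old-ns new-ns]
  (str/replace-first contents
                     (str "(ns " old-ns)
                     (str "(ns " new-ns)))

;; Everything already read from the filesystem comes in;
;; a list of instructions (the writes we'd like performed) comes out.
;; No side effects happen in here.
(defn copy-instructions
  [{:keys [dest-dir files old-ns new-ns]}]
  (for [{:keys [filename contents]} files]
    {:instruction :write
     :path        (str dest-dir "/" filename)
     :contents    (change-namespace contents old-ns new-ns)}))

Checking this in a test is just comparing data: call copy-instructions with a literal map, compare the result to the instructions you expected. No setup, no cleanup, no filesystem.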

The real main method looks different in this design. It has to gather all the input up front[1], then call the key program logic, then carry out the instructions. In order to keep all the decisionmaking, parsing, etc. in the "code under test" block, I keep the interface to that function as close as possible to that of the built-in filesystem-interaction commands. It isn't the cleanest interface, but I want all the parts outside "code-under-test" to be trivial.
[Diagram: simplest-possible code to gather input -> well-tested code that makes all the decisions -> simplest-possible code to carry out the instructions.]
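A sketch of what that main might look like -- again hypothetical names, and eager slurps where the real program uses a delay (see footnote [1]):

(ns copier.main
  (:require [clojure.java.io :as io]
            [copier.core :as core]))

;; Simplest-possible code: read everything the decisions will need.
;; (The real thing also lists the files already in the destination directory.)
(defn gather-input
  [source-dir dest-dir]
  {:dest-dir dest-dir
   :files    (for [f (file-seq (io/file source-dir))
                   :when (.isFile f)]
               {:filename (.getName f)
                :contents (slurp f)})})

;; Simplest-possible code: perform each instruction.
(defn carry-out!
  [instructions]
  (doseq [{:keys [path contents]} instructions]
    (spit path contents)))

(defn -main [source-dir dest-dir old-ns new-ns]
  (-> (gather-input source-dir dest-dir)
      (assoc :old-ns old-ns :new-ns new-ns)
      core/copy-instructions
      carry-out!))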

With this, I answer "How do I know this code works?" in two components. For the real-filesystem interactions, the documentation plus some playing around in the REPL tell me how they work. For the decisioning part of the program, my tests tell me it works. Manual tests for the hard-to-test bits, lots of tests for the hard-to-get-right bits. Reasoning glues them together.

Of course, I'm keeping my one umbrella test that interacts with the real filesystem. The decisioning part of the program is covered by poncho tests. With an interface like this, I can write property-based tests for my program, asserting things like "I never try to write a file in a directory that doesn't exist" and "the output filename always matches the input filename."[2]
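I haven't written those property tests (see [2]), but with test.check one of them might look roughly like this, against the hypothetical copy-instructions sketched above:

(ns copier.properties-test
  (:require [clojure.string :as str]
            [clojure.test.check :as tc]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]
            [copier.core :as core]))

(def gen-file
  (gen/hash-map :filename (gen/not-empty gen/string-alphanumeric)
                :contents gen/string-ascii))

;; "the output filename always matches the input filename"
(def filenames-preserved
  (prop/for-all [files (gen/vector gen-file)]
    (let [instructions (core/copy-instructions
                         {:dest-dir "dest" :files files
                          :old-ns "old.ns" :new-ns "new.ns"})
          out-names    (map #(last (str/split (:path %) #"/")) instructions)]
      (= (set (map :filename files))
         (set out-names)))))

(tc/quick-check 100 filenames-preserved)

No filesystem anywhere: generate a pile of pretend files, run the decisions, check a property of the instructions that come out.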

As a major bonus, error handling becomes more modular. If, on trying to copy the second file, it isn't found or isn't valid, the second write instruction is replaced with an "error" instruction. Before any instructions are carried out, the program checks for "error" anywhere in the list (code). If found, stop before carrying out any real action. This way, validations aren't separated in code from the operations they apply to, and yet all validations happen before operations are carried out. Real stuff happens only when all instructions are possible (as far as the program can tell). It's close to atomic.
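A sketch of how that can look, continuing the made-up names from above: a file that can't be read yields an :error instruction in place of its :write, and nothing real happens if any error is present.

;; hypothetical; the real check lives in the linked code
(defn file->instruction
  [dest-dir old-ns new-ns {:keys [filename contents]}]
  (if contents
    {:instruction :write
     :path        (str dest-dir "/" filename)
     :contents    (change-namespace contents old-ns new-ns)}
    {:instruction :error
     :message     (str "could not read " filename)}))

(defn carry-out-if-possible!
  [instructions]
  (if-let [errors (seq (filter #(= :error (:instruction %)) instructions))]
    (doseq [e errors]
      (println "aborting:" (:message e)))
    (carry-out! instructions)))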

There are limitations to this straightforward approach to isolating decisions from side-effects. It works for this program because it can gather all the input, produce all the output, and hold all of it in memory at the same time. For a more general approach to this same goal, see Functional Programming in Scala.

By moving all the "what does the world around me look like?" side effects to the beginning of the program, and all the "change the world around me!" side effects to the end, we achieve maximum testability of the program logic. And minimum convolution. And separation of concerns: one module makes the decisions, another one carries them out. Consider this possibility the next time you find yourself in testing pain.


The code that inspired this approach is in my microlib repository.
Interesting bits:
- Umbrella test (integration)
- Poncho tests (around the decisioning module) (I only wrote a few. It's still a play project right now.)
- Code under test (decisioning module)
- Main program
- Instruction carrying-out part

Diagrams made with Monodraw. Wanted to paste them in as ASCII instead of screenshots, but that'd be crap on mobile.

[1] This is Clojure, so I put the "contents of each file" in a delay. Files whose contents are not needed are never opened.
[2] I haven't written property tests, because time.



Monday, May 25, 2015

git: handy alias to find the repository root

To quickly move to the root of the current git repository, I set up this alias:

git config --global alias.home 'rev-parse --show-toplevel'

Now, git home prints the full path to the root directory of the current project.
To go there, type (Mac/Linux only)

cd `git home`

Notice the backticks. They're not single quotes. This executes the command and then uses its output as the argument to cd.

This trick is particularly useful in Scala, where I have to get to the project root to run sbt compile. (Things that make me miss Clojure!)

Saturday, May 2, 2015

Fitting in v. Belonging

In your team, do you feel like you fit in? Do you have a feeling of belonging?

These are very different questions.[2] When I fit in, it's because everyone is sufficiently alike. We have inside jokes, TV shows or sports we talk about, opinions we share and common targets of ridicule. New people can fit in by adopting these opinions and following these sports.

When I belong, it's because everyone wants me to be there, because the group wouldn't be the same without me. We value each other for our differences. We have values that we share, and opinions we discuss. New people are integrated as we come to know and appreciate each other.

"Fitting in," that superficial team culture of common interests and inside jokes, is much easier to establish. And it's easier to hire for, because we can base choices on appearances and social cues. Hoodies and quotes from The Princess Bride. But it doesn't get us a strong team. On a strong team, people share ideas, they pull from their varied perspectives, they emphasize their differences because that's their unique contribution. This weaving together of various strengths, respect for the unexpected -- this is how a team can be stronger than its parts, this is where novel solutions come from. It emerges from feelings of belonging, which come from the group's deeper culture. That's much harder to establish.

On a wholehearted team, we show up as our whole selves and put all our creativity into the team's goals. How can we achieve this? Hire for value fit, not culture fit. Don't settle for the comfort of "fitting in" - aim for the safety of belonging.

I had this kind of team at my last job, at Outpace. We loved each other as people and respected each other as developers. And this was a remote team - we didn't fall back on physical proximity as an appearance of teamwork. We shared goals, discussed them and evolved them. We shared our frustrations, both work and personal. When opinions clashed, we asked why, and learned. On this team, I explored ideas like the sea map. We grew individually and together.

That feeling of belonging makes it safe to express ideas and to run with them. And to take ideas from others and expand on them. Poof: innovation. Without that feeling of belonging, when the aim is to fit in, we express agreement with dominant voices.[1] Superficial cultural fit actively represses new ideas.

How can we move our teams toward a greater sense of belonging? Ask people about their interests that you don't share. Respect each person's experiences and opinions, especially when these are unique among the group. Instead of "We all agree, so we must be right," say, "We all agree. This is dangerous; can we find another view?" When pair programming, if you think your pair has the wrong idea, try it anyway. When someone says something dumb, their perspective differs; respond with curiosity, not judgement. Cherish our differences, not superficial similarities. Sacrifice the comfort of fitting in for the safety to be ourselves.


[1] Research has shown that teams of similar-looking people emphasize similarities. They're driven toward groupthink, quiet silencing of dissent. When someone breaks the uniformity, the not-obviously-different people start expressing the parts of themselves that are unique. (I can't find the reference, anyone know it?)

[2] The dichotomy between fitting-in and belonging comes from Brené Brown's book, Daring Greatly.


Wednesday, April 29, 2015

Data v Awareness

In the computer industry, data and conscious thinking are praised, as opposed to an integrated awareness.[1] How is the work going? The task-tracking tools, the commits, and the build results provide data, but only conversations with the team can provide awareness. Awareness of mood and relationships and trends, of uncertainties and risks. Perhaps this is part of organizations' fear of remote work: colocation provides opportunities to read the mood of the team. Data alone can't provide that.

In sociological research like Brené Brown's, the researcher starts with awareness: interviews, a person's story in context. Then she codes (in this context, "to code" is to categorize and label) the answers, and they become data. She aggregates that data to get a broader picture, and that leads to a broader awareness.

The key is: local awareness -> data -> aggregated data -> broader awareness.

On my last team, we were working on this. I wanted to track what was holding us back, and what was helping us move. Which tools in our technology stack cost us the most energy, and which improvements are paying off. To do this, we started posting in Slack whenever something frustrated us or helped us along, with custom emoticons as labels. For instance:
weight: clojure set operations behave unpredictably if passed a vector; lift: test-data generation utility for X service; weight: local elasticsearch version different from prod
This turns our awareness of the current situation into data, which a program can aggregate later. At retro time, I turned the words next to the hot-air balloon ("lift," because it helps us move the project up and forward) into a word cloud.[2] The words next to the kettlebell ("weight," because it's weighing down the balloon, holding us back) formed a separate word cloud. This gave us a visualization to trigger discussion.
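The aggregation itself is simple. Here's a sketch of that step -- not the actual script, and the message shape and names are invented -- turning labeled messages into word counts for a word-cloud generator:

(ns retro.wordcloud
  (:require [clojure.string :as str]))

;; Given exported messages and a label ("lift" or "weight"),
;; count how often each word appears next to that label.
(defn word-frequencies
  [messages label]
  (->> messages
       (filter #(str/starts-with? % (str label ":")))
       (mapcat #(str/split (str/lower-case %) #"\W+"))
       (remove #{label ""})
       frequencies))

(word-frequencies ["weight: local elasticsearch version different from prod"
                   "lift: test-data generation utility for X service"
                   "weight: elasticsearch elasticsearch elasticsearch"]
                  "weight")
;;=> counts with "elasticsearch" at 4, everything else at 1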

The aggregation of the data produced a broader level of awareness in our retrospective. This contrasts with our remembered experience of the prior sprint. Our brains are lousy at aggregating these experiences; we remember the peak and the end. The most emotional moment, and the most recent feelings. The awareness -> data -> aggregation -> awareness translation gives us a less biased overview.

The honest recording of local awareness happens when the data is interpreted within the team, within the circle of trust, within context. There's no incentive to game the system, except where that is appropriate and deliberate. For instance, the week after the first word cloud, Tanya posted in the channel:
weight: elasticsearch elasticsearch elasticsearch elasticsearch elasticsearch
She's very deliberately inflating a word in the word cloud, corresponding to the level of pain she's experiencing. (context: we were using Elasticsearch poorly, totally nothing wrong with the tech, it was us.) Her knowledge of how the data would be used allowed her to translate her local awareness into a useful representation.

Data alone is in conflict with a broad, compassionate awareness of the human+technological interactions in the team. But if the data starts with awareness, and is aggregated and interpreted with context, it can help us overcome other limitations and biases of our human brains. In this way, we can use both data and awareness, and perhaps gain wisdom.

----
[1] "Computing: Yet Another Reality Construction," by Rodney Burstall, inside Software Development and Reality Construction
[2] Thank you @hibikir1 for suggesting the first reasonable use of a word cloud in my experience

The Quality Wheel

"Quality software." It means something different to everyone who hears it.

You know quality when you see it, right? Or maybe when you smell it. Like a good perfume. Perfume preferences are different for everyone, and quality means something different for every application.

In perfume, we can discover and describe our preferences using the Fragrance Wheel. This is a spectrum of scent categories, providing a vocabulary for describing each perfume, the attributes of a scent.
[The Fragrance Wheel: Floral notes (Floral, Soft Floral); Oriental notes (Floral Oriental, Soft Oriental, Woody Oriental); Woody notes (Mossy Woods, Dry Woods); Fresh notes (Citrus, Green, Water)]
Perhaps a similar construction could help with software quality?

When a developer talks about quality, we often mean code consistency and readability, plus automated testing. A tester means lack of bugs. A designer means a great UI; a user means a great experience, exactly the right features, and no errors or waiting. An analyst means insightful reporting and the right integrations; a system administrator means low CPU usage, consistent uptime, and informative logging. Our partners mean well-documented, discoverable APIs and testing tools.
[The Quality Wheel: Usability (Features, Discoverability, User Experience); Performance (Responsiveness, Availability, Scalability); Flexibility (Speed of Evolution, Configurability); Correctness (Visibility, Automated Tests, Accuracy)]
Each of these is an attribute of quality. For any given software system, and for each component, different quality attributes matter most. What's more, some aspects of quality complement each other; each makes the other easier - for instance, a good design facilitates a great user experience. Readable code facilitates lack of bugs. Consistent uptime facilitates lack of waiting. Beautiful (consistent, modular, readable) code facilitates all the externally-visible aspects of quality.

However, other aspects of quality are in conflict. Quantity of features hurts code readability. More integrations lead to more error messages. Logging can increase response time.

If we add nuance to our vocabulary, we can discuss quality with more detail, less ambiguity. We can decide which attributes are essential to our software system, and to each piece of our system. Make the tradeoffs explicit, and allocate time and attention to carefully chosen quality attributes. This gets our system closer to something even greater: usefulness.

The quality wheel pictured above is oversimplified; it's designed to parallel the original version of the Fragrance Wheel. I have a lot more quality attributes in mind. I'd love to have definitions of each piece, along with Chinese-Zodiac-style "compatible with/poor match" analysis. If this concept seems useful to you, please contribute your opinions in the comments, and we can expand this together.

Sunday, April 26, 2015

Your Code as a Crime Scene: book review

What can we learn about our projects with a little data science and a lot of version control?
Locate the most dangerous code! Find where Conway's Law is working against us! Know who to talk to about a change you want to make! See whether our architectural principles are crumbling!

Adam Tornhill's book shows both how and why to answer each of these questions. The code archaeology possibilities are intriguing; he shows how to get raw numbers and how to turn them into interactive graphs with a few open-source scripts. He's pragmatic about the numbers, reminding the reader what not to use them for. For instance: "the value of code coverage is in the implicit code review when you look at uncovered lines." Trends are emphasized over making judgements about particular values.

Even better are Adam's expansive insights into psychology, architecture, and the consequences of our decisions as software engineers. For instance: we know about the virtues of automated tests, but what about the costs? And, what is beauty in code? (answer: lack of surprise)

There's plenty of great material in here, especially for a developer joining an existing team or open-source project, looking to get their mind around the important bits of the source quickly. I also recommend this book for junior- to mid-level developers who want new insight into both their team's code and coding in general. If you want to accelerate your team, to contribute in ways beyond your own share of the coding, then run Adam's analyses against your codebase.

One word of caution: it gets repetitive, with an intro and a conclusion for the book as a whole, for each section, and for each chapter. Whoever keeps repeating "Tell them what you're going to tell them, tell them, tell them what you just told them," can we please get past that now??

A few factoids I learned today from this book:
- Distributed teams have the most positive commit messages
- Brainstorming is more effective over chat than in an in-person meeting

and when it comes to the costs of coordination among too-large teams: "The only winning strategy is not to scale." (hence, many -independent- teams)

Post-agile: microservices and heads-up development

Notes from Craft Conference 2015, Budapest.

Craft conference was all about microservices this year.[1] Yet, it was about so much more at the same time -- even when it was talking about microservices.

[Photo: lobby of the venue. Very cool, and always packed.]
Dan and I went on about microservices in our opening keynote,[2] about how it's not about size, it's about each service being a responsible adult and taking care of its own data and dependencies. And being about one bounded context, so that it has fewer conflicting cross-cutting concerns (security, durability, resilience, availability, etc) to deal with at any one time.

But it was Mary Poppendieck, in her Friday morning keynote,[3] who showed us why microservices aren't going away, not any more than the internet is going away. This is how systems grow: through federation and wide participation. (I wish "federated system" wasn't taken by some 1990s architecture; I like it better than "microservices.") Our job is no longer to control everything all the computers do, to make it perfectly predictable.[a]

Instead, we need to adapt to the sociotechnical system around us and our code. No one person can understand all the consequences of their decision, according to Michael Nygard.[4] We can't SMASH our will upon a complex system, Mary says, but we can poke-poke-poke it; see how it responds; and adjust it to our purposes.

What fun is this?? I went into programming because physics became unsatisfying once I hit quantum mechanics, and I couldn't know everything all at once anymore. Now I'm fascinated by systems; to work with a system is to be part of something bigger than me, bigger than my own mental model. This is going to be a tough transition for many programmers, though. We spent our training time learning to control computers, and now we are exhorted to give up control, to experiment instead.

And worse: as developers must adapt, so must our businesses. In the closing keynote,[5] Marty Cagan made it very clear that our current model is broken. When most ideas come from executives, implemented according to the roadmap, it doesn't matter how efficient our agile teams are: we're wasting our time. Most ideas fail to make money. And the ones that do make money usually take far longer than expected. He ridicules the business case: "How much revenue will it generate? How much will it cost?" We don't know, and we don't know! Instead of measuring the impact of an idea after months of development, product teams need to measure in hours or days. And instead of a few ideas from upper management, we need to try out many ideas from the most fruitful source: developers. Because we're most in the position to see what has just become possible.

[Photo: exterior of the venue, after the tent is down.]
I'd say "developers are a great source of innovation," except Alf Rehn reports that the word has been drained of meaning.[6] Marty Cagan corroborates that by using "ideas" throughout his keynote instead of "innovation." So where do these ideas come from? Diversity, says Pieter Hintjens,[7] let people try lots of things. Discovery, says Mike Nygard, let them see what other teams are doing.

Ideas come from having our heads up, not buried only in the code. They come from the first objective of software architecture: understanding the business problem. They come from handing teams an objective, NOT a roadmap. Marty Cagan made that point very clear. Adrian Trenaman concurred,[8] describing how Gilt went from a single IT team, to a team per line of business, to a team per initiative. It is about results, measured outcomes.

All these measurements, of results, of expectations, of production service activity, come down to my favorite question - "How do we know what we know?"[b] Property-based (aka generative) testing is experiencing a resurgence (maybe its first major surgence) lately, as black-box testing around service-level components. In my solo talk,[10] I proposed a possible design for lowering the risk around interacting components. Mary had some other ideas in her talk too, which I will check out. Considering properties of a service can help us find the seams that align simplicity with options.

Mike Nygard remarked that the most successful microservices implementations he's seen started as a monolith, where refactoring identified those seams. There's nothing wrong with a monolith when that supports the business objectives; Randy Shoup said that microservices solve scaling problems, not business problems.[9] Mike and Adrian both pointed out that a target architecture is not a revolution, but an evolving direction. Architecture is like a city: as we build microservices in the new, hip part of town, those legacy tenements are still useful. The architecture is done only when the company goes out of business. Instead of working to a central plan, we want to develop situational awareness ("knowing what's happening in time to do something about it"[3]), and choose to work on what's most important right now.

It isn't enough to be good at coding anymore. The new "full-stack" is from network to customer. Marty: if your developers are only coding, you're not getting half their value. I want to do heads-up development. "Software Craftsmanship is less about internal efficiency, and more about engaging with the world around us," says Alf Rehn. "Creators need an immediate connection to what they are creating," quotes Mary Poppendieck.

As fun as it is to pop the next story off the roadmap and sit down and code it, we can have more impact. We can look up, as developers, as organizations. We can look at results, not requirements. We can learn from consequences, as well as conferences.

This transition won't be easy. It's the next step after agile. Microservices are a symptom of this kind of focus, the way good retrospectives are associated with constant improvement. Sure, it's all about microservices - in that microservices are about reducing friction and lowering risk. The faster we can learn, the farther we can get.



I'll add the links as Gergely posts the videos.

[1] Maciej was starting to get bored
[2] my keynote with Dan, "Complexity is Outside the Code"
[3] Mary Poppendieck's keynote, "The New New Software Development Game"
[a] Viktor Klang: "Writing software that is completely deterministic is nonsense because no machine is completely deterministic," much less the network.
[4] Mike Nygard's talk, "Architecture Without an End State"
[5] Marty's keynote
[6] Alf Rehn (ah!  what a beautiful speaker! such rhythm!) keynote. Maybe he didn't allow recording?
[7] Pieter's talk
[8] Adrian's talk, "Scaling Micro-services at Gilt"
[b] OK my real favorite question is "What is your favorite color?" but this is a deep second.
[9] Randy's talk, "From the Monolith to Microservices"
[10] my talk, "Contracts in Clojure: a compromise between types and tests"