Monday, December 5, 2016

Using Rug with Elm

At elm-conf and CodeMesh and YOW! Australia this year, I did live demos using automated code modification with Atomist Rug.

Rug is now officially open source, and the Rug CLI is available so that you can try (and change! and improve!) these editors on your own Elm code. This blog post tells you how.

I usually start a new Elm project as a static page, make it look like something; then turn it into a beginner program, add some interactivity; then turn it into an advanced program and add subscriptions. I like how this flow lets me start super-simple, and then add the pieces for access to the world as I need it.

Now you can do this too!

Watch out: these editors (and the parser behind them) work for the code I've tried them on. As you try them, you'll find cases I didn't cover. Please file an issue when you do, or find me on Atomist-Community slack.

Install Rug

The local version of the Rug runtime is the Rug CLI. Complete installation instructions are here.

TL;DR for Mac:
brew tap atomist/tap
brew install rug-cli

Generate a project

This creates a directory containing a new static Elm app, with a build script and supporting files. The command below puts a project named banana under your current directory, makes it a git repo, and makes an initial commit:
rug generate -R jessitron:elm-rugs:StaticPage banana
Inside banana, edit src/Main.elm. Put something in that empty div.
Run ./build
Open target/index.html to see the results.


Upgrade it to a beginner program

After your banana looks OK, make it interactive. Run this inside your project directory:
rug edit jessitron:elm-rugs:UpgradeToBeginnerProgram
Now your src/Main.elm contains the beginnings of a beginner program. The model is empty and the only message is Noop, which does nothing. This is the beginner program template from the Elm tutorial, except that the view function is populated based on your main from the static page.

You could add a button:
rug edit jessitron:elm-rugs:AddButton button_text="Push Me" button_message=ButtonPushed
Now your src/Main.elm contains a new message type, ButtonPushed. Your update function handles it, but does nothing interesting.
type Msg
    = Noop
    | ButtonPushed

update : Msg -> Model -> Model
update msg model =
    case msg of
        Noop ->
            model

        ButtonPushed ->
            model
Find a new function hanging out at the end of the file, buttonPushedButton. Incorporate that into your view to display the button. Run ./build and refresh target/index.html; push the button and see the message in the debugger.
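Here is a minimal sketch of what "incorporate that into your view" can look like, assuming Elm 0.18 and assuming the generated buttonPushedButton is a plain Html Msg value. The body of buttonPushedButton below is a guess written for illustration, not the editor's actual output, so check the end of your own src/Main.elm.

```elm
module Main exposing (main)

import Html exposing (Html)
import Html.Events


type alias Model =
    {}


type Msg
    = Noop
    | ButtonPushed


update : Msg -> Model -> Model
update msg model =
    case msg of
        Noop ->
            model

        ButtonPushed ->
            model


-- Assumed shape of the generated helper (hypothetical body).
buttonPushedButton : Html Msg
buttonPushedButton =
    Html.button [ Html.Events.onClick ButtonPushed ] [ Html.text "Push Me" ]


-- The button is just another element in the list of children.
view : Model -> Html Msg
view model =
    Html.div []
        [ Html.text "banana"
        , buttonPushedButton
        ]


main : Program Never Model Msg
main =
    Html.beginnerProgram { model = {}, view = view, update = update }
```

The only change to the view is the extra list item; everything else is the beginner program skeleton described above.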

Try adding a text input in a similar way, with
rug edit jessitron:elm-rugs:AddTextInput input_name=favoriteColor
This adds a function, a message, and a field to the model so that you'll have access to the content of the text input.

Try passing -R to rug, and it'll make a commit for you after the editor completes. You have to make a commit yourself right before running rug, or it'll complain.

For further edit operations, see my elm-rugs repo. You can upgrade to a full program, and add subscriptions to clicks and window size.

Change these editors! Add more!

The best part of running locally is running local versions.
Clone my repository: git clone git@github.com:jessitron/elm-rugs.git
Now, go to the secret directory holding the editors: cd elm-rugs/.atomist/editors
Here, you can see the scripts that work on the code, like AddButton.rug.

To run the local versions, be in that elm-rugs directory, and point rug at your project directory with -C:
rug -l -C /path/to/my/projects/banana edit AddButton button_message=Yay button_text="Say hooray"
I don't have to qualify the editor name with jessitron:elm-rugs when it's local.

There's more information in the Atomist docs on how rug works. TL;DR is, the files in the top level of elm-rugs/ are the starting point for a newly generated project. NewStaticPage.rug, as a generator, starts from those and then changes the project name. The editors all start from whatever project they're invoked on, and they can change files in place, or create new ones from templates in the elm-rugs/.atomist/templates directory. (Most of my templates are straight files, with a .vm suffix to make Rug's merge function work.)

Questions very welcome on either the elm-lang slack in the #atomist channel, or the Atomist community slack on the #rug-elm channel.

Pull requests are even more welcome. Issues, too. These rugs work for the narrow cases I've tested them on. It'll be a community effort to refine and expand them!

Thursday, November 10, 2016

Areas of responsibility

"And the Delivery team is in charge of puppet...." said our new manager.

"Wait we're in charge of WHAT?" - me

"Well I thought that it fits in with your other responsibilities."

"That's true. But we're not working on it, we're working on these other things. You can put whatever you want in our yellow circle, but that's it."

"The yellow circle?"





See, I model our team's areas of responsibility as three circles. The yellow circle is everything we're responsible for -- all the legacy systems and infrastructure have to belong to some team, and these are carved out for us. Some we know little about.
Inside the yellow circle is an orange circle: the systems we plan to improve. These appear on our backlog in JIRA epics. We talk about them sometimes.
Inside the orange circle, a red circle: active work. These systems are currently under development by our team. We talk about them every day, we add features and tests, we garden them.


That yellow circle holds a lot of risks: when something there breaks we'll stop our active work and learn until we can stand it back up. Management may add items here, as they recognize the schedule risk. We sometimes spend bits of time researching these, to reduce our fear of pager duty.

The orange circle holds a lot of hope, our ambitions for the year and the quarter. We choose these in negotiation with management.

The red circle is ours. We decide what to work on each day, based on plans, problems, and pain. Pushing anything directly into active work is super rude and disruptive.

"OK, it's in the yellow circle, cool. I'll work on hiring more people, so we can expand the orange and red circles too."







Sunday, September 25, 2016

Provenance and causality in distributed systems

Can you take a piece of data in your system and say what version of code put it in there, based on what messages from other systems? and what information a human viewed before triggering an action?

Me neither.

Why is this acceptable? (because we're used to it.)
We could make this possible. We could trace the provenance of data. And at the same time, mostly-solve one of the challenges of distributed systems.

Speaking of distributed systems...

In a distributed system (such as a web app), we can't say for sure what events happened before others. We get into general relativity complications even at short distances, because information travels through networks at unpredictable speeds. This means there is no one such thing as time, no single sequence of events that says what happened before what. There is time-at-each-point, and inventing a convenient fiction to reconcile them is a pain in the butt.

We usually deal with this by funneling every event through a single point: a transactional database. Transactions prevent simultaneity. Transactions are a crutch.

Some systems choose to apply an ordering after the fact, so that no clients have to wait their turn in order to write events into the system. We can construct a total ordering, like the one that the transactional database is constructing in realtime, as a batch process. Then we have one timeline, and we can use this to think about what events might have caused which others. Still: putting all events in one single ordering is a crutch. Sometimes, simultaneity is legit.

When two different customers purchase two different items from two different warehouses, it does not matter which happened first. When they purchase the same item, it still doesn't matter - unless we only find one in inventory. And even then: what matters more, that Justyna pushed "Buy" ten seconds before Edith did, or that Edith upgraded to 1-day shipping? Edith is in a bigger hurry. Prioritizing these orders is a business decision. If we raise the time-ordering operation to the business level, we can optimize that decision. At the same time, we stop requiring the underlying system to order every event with respect to every other event.
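As a small sketch of raising that ordering to the business level: the rule below ranks orders by shipping urgency first and uses click time only as a tie-breaker. The Order fields and the rule itself are assumptions made up for illustration; the real rule is whatever the business chooses.

```elm
module Fulfillment exposing (Order, prioritize)


type alias Order =
    { customer : String
    , placedAtSeconds : Int
    , shippingDays : Int
    }


-- Business rule (hypothetical): faster shipping wins; earlier click breaks ties.
prioritize : List Order -> List Order
prioritize orders =
    List.sortBy (\order -> ( order.shippingDays, order.placedAtSeconds )) orders
```

With Justyna at placedAtSeconds = 0 and shippingDays = 5, and Edith at placedAtSeconds = 10 and shippingDays = 1, prioritize puts Edith first even though Justyna clicked earlier.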

On the other hand, there are events that we definitely care happened in a specific sequence. If Justyna cancels her purchase, that was predicated on her making it. Don't mix those up. Each customer saw a specific set of prices, a tax amount, and an estimated ship date. These decisions made by the system caused (in part) the customer's purchase. They must be recorded either as part of the purchase event, or as events that happened before the purchase.

Traditionally we record prices and estimated ship date as displayed to the customer inside the purchase. What if instead, we thought of the pricing decision and the ship date decision as events that happened before the purchase? and the purchase recorded that those events definitely happened before the purchase event?

We would be working toward establishing a different kind of event ordering. Did Justyna's purchase happen before Edith's? We can't really say; they were at different locations, and neither influenced the other. That pricing decision though, that did influence Justyna's purchase, so the price decision happened before the purchase.

This allows us to construct a more flexible ordering, something wider than a line.

Causal ordering

Consider a git history. By default, git log prints a line of commits as if they happened in that order -- a total ordering.

But that's not reality. Some commits happen before others: each commit I make is based on its parent, and every parent of that parent commit, transitively. So the parent happened before mine. Meanwhile, you might commit to a different branch. Whether my commit happened before yours is irrelevant. The merge commit brings them together; both my commit and yours happen before the merge commit, and after the parent commit. There's no need for a total ordering here. The graph expresses that.

This is a causal ordering. It doesn't care so much about clock time. It cares what commits I worked from when I made mine. I knew about the parent commit, I started from there, so it's causal. Whatever you were doing on your branch, I didn't know about it, it wasn't causal, so there is no "before" or "after" relationship to yours and mine.

We can see the causal ordering clearly, because git tracks it: each commit knows its parents. The cause of each commit is part of the data in the commit.

Back to our retail example. If we record each event along with the events that caused it, then we can make a graph with enough of a causal ordering.
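A minimal sketch of that graph, in plain Elm with only the core Dict module: each event lists the ids of the events that caused it, the way a commit lists its parents. The event ids are invented for this example, and the lookup assumes the recorded causes never form a cycle.

```elm
module Causality exposing (Event, happenedBefore, retailEvents)

import Dict exposing (Dict)


-- Each event names the events that caused it.
type alias Event =
    { id : String
    , causes : List String
    }


-- Hypothetical event ids for the retail example.
retailEvents : Dict String Event
retailEvents =
    [ { id = "price-decision-justyna", causes = [] }
    , { id = "purchase-justyna", causes = [ "price-decision-justyna" ] }
    , { id = "cancel-justyna", causes = [ "purchase-justyna" ] }
    , { id = "price-decision-edith", causes = [] }
    , { id = "purchase-edith", causes = [ "price-decision-edith" ] }
    ]
        |> List.map (\event -> ( event.id, event ))
        |> Dict.fromList


-- True when `earlier` can be reached from `later` by following recorded causes.
-- Assumes the causes form an acyclic graph.
happenedBefore : Dict String Event -> String -> String -> Bool
happenedBefore events earlier later =
    case Dict.get later events of
        Nothing ->
            False

        Just event ->
            List.member earlier event.causes
                || List.any (happenedBefore events earlier) event.causes
```

Here happenedBefore retailEvents "price-decision-justyna" "cancel-justyna" is True, by way of the purchase. Asking about "purchase-justyna" and "purchase-edith" gives False in both directions: neither happened before the other, which is exactly the "wider than a line" ordering.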


There are two reasons we want an ordering here: external consistency and internal legibility.

External Consistency

External consistency means that Justyna's experience remains true. Some events are messages from our software system to Justyna (the price is $), and others are messages coming in (Confirm Purchase, Cancel Purchase). The sequence of these external interactions constrains any event ordering we choose. Messages crossing the system boundary must remain valid.

Here's a more constricting example of external consistency: when someone runs a report and sees a list of transactions for the day, that's an external message. That message is caused by all the transactions reported in it. If another transaction comes in late, it must be reported later as an amendment to that original report -- whereas, if no one had run the report for that day yet, it could be lumped in with the other ones. No one needs to know that it was slow, if no one had looked.

Have you ever run a report, sent the results up the chain, and then had the central office accuse you of fudging the numbers because they run the same report (weeks later) and see different totals? This happens in some organizations, and it's a violation of external consistency.

Internal Legibility

Other causal events are internal messages: we displayed this price because the pricing system sent us a particular message. The value of retaining causal information here is troubleshooting, and figuring out how our system works.

I'm using the word "legibility"[1] in the sense of "understandability": as people, we have visibility into the system's workings and can follow along with what it's doing. We can distinguish its features, locate problems, and change it.

 If Justyna's purchase event is caused by a ship date decision, and the ship date decision ("today") tracked its causes ("the inventory system says we have one, with more arriving today"), then we can construct a causal ordering of events. If Edith's purchase event tracked a ship date decision ("today") which tracked its causes ("the inventory system says we have zero, with more arriving today"), then we can track a problem to its source. If in reality we only send one today, then it looks like the inventory system's shipment forecasts were inaccurate.

How would we even track all this?

The global solution to causal ordering is: for every message sent by a component in the system, record every message received before that. Causality at a point-in-time-at-a-point-in-space is limited to information received before that point in time, at that point in space. We can pass this causal chain along with the message.

"Every message received" is a lot of messages. Before Justyna confirmed that purchase, the client component received oodles of messages, from search results, from the catalog, from the ad optimizer, from the review system, from the similar-purchases system, from the categorizer, many more. The client received and displayed information about all kinds of items Justyna did not purchase. Generically saying "this happened before, therefore it can be causal, so we must record it ALL" is prohibitive.

This is where business logic comes in. We know which of these are definitely causal. Let's pass only those along with the message.

There are others that might be causal. The ad optimizer team probably does want to know which ads Justyna saw before her purchase. We can choose whether to include that with the purchase message, or to reconstruct an approximate timeline afterward based on clocks in the client or in the components that persist these events. For something as aggregated as ad optimization, approximate is probably good enough. This is a business tradeoff between accuracy and decoupling.

Transitive causality

How deep is the causal chain passed along with a message?

We would like to track backward along this chain. When we don't like the result of Justyna and Edith's purchase fulfillment, we trace it back. Why did the inventory system say the ship date would be today in both cases? This decision is an event, with causes of "The current inventory is 1" and "Normal turnover for this item is less than 1 per day"; or "The current inventory is 0" and "a shipment is expected today" and "these shipments usually arrive in time to be picked the same day." From there we can ask whether the decision was valid, and trace further to learn whether each of these inputs was correct.

If every message comes with its causal events, then all of this data is part of the "Estimated ship date today" sent from the inventory system to the client. Then the client packs all of that into its "Justyna confirmed this purchase" event. Even with slimmed-down, business-logic-aware causal listings, messages get big fast.

Alternately, the inventory system could record its decision, and pass a key with the message to the client, and then the client only needs to retain that key. Recording every decision means a bunch of persistent storage, but it doesn't need to be fast-access. It'd be there for troubleshooting, and for aggregate analysis of system performance. Recording decisions along with the information available at the time lets us evaluate those decisions later, when outcomes are known.
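Sketched in Elm, the key-passing option might look like this. All the names here (Decision, DecisionLog, ShipDateMessage, decide) are hypothetical, and using the log's size as the next key is a simplification that only works if decisions are never deleted.

```elm
module DecisionLog exposing (Decision, DecisionLog, ShipDateMessage, decide)

import Dict exposing (Dict)


-- What was decided, by which code, based on which inputs.
type alias Decision =
    { codeVersion : String
    , inputs : List String
    , outcome : String
    }


type alias DecisionLog =
    Dict Int Decision


-- The message to the client carries the answer plus a key, not the whole causal chain.
type alias ShipDateMessage =
    { estimatedShipDate : String
    , decisionKey : Int
    }


-- Record the decision with what was known at the time, and return the slim message.
decide : Decision -> DecisionLog -> ( ShipDateMessage, DecisionLog )
decide decision log =
    let
        key =
            Dict.size log
    in
    ( { estimatedShipDate = decision.outcome, decisionKey = key }
    , Dict.insert key decision log
    )
```

The client only has to keep decisionKey inside its purchase event. Anyone troubleshooting later can look up the key with Dict.get and see the inputs and code version behind "today".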

Incrementalness

A system component that chooses to retain causality in its events has two options: repeat causal inputs in the messages it sends outward; or record the causal inputs and pass a key in the messages it sends outward.

Not every system component has to participate. This is an idea that can be rolled out gradually. The client can include in the purchase event as much as it knows: the messages it received, decisions it made, and relevant messages sent outward before this incoming "Confirm Purchase" message was received from Justyna. That's useful by itself, even when the inventory system isn't yet retaining its causalities.

Or the inventory system could record its decisions, the code version that made them, and the inputs that contributed to them, even though the client doesn't retain the key it sends in the message. It isn't as easy to find the decision of interest without the key, but it could still be possible. And some aggregate decision evaluation can still happen. Then as other system components move toward the same architecture, more benefits are realized.

Conscious Causal Ordering

The benefits of a single, linear ordering of events are consistency, legibility, and visibility into what might be causal. A nonlinear causal ordering gives us more flexibility, consistency, a more accurate but less simplified legibility, and clearer visibility into what might be causal. Constructing causal ordering at the generic level of "all messages received cause all future messages sent" is expensive and also less meaningful than a business-logic-aware, conscious causal ordering. This conscious causal ordering gives us external consistency, accurate legibility, and visibility into what we know to be causal.

At the same time, we can have provenance for data displayed to the users or recorded in our databases. We can know why each piece of information is there, and we can figure out what went wrong, and we can trace all the data impacted by an incorrect past event.

I think this is something we could do, it's within our ability today. I haven't seen a system that does it, yet. Is it because we don't care enough -- that we're willing to say "yeah, I don't know why it did that, can't reproduce, won't fix"? Is it because we've never had it before -- if we once worked in a system with this kind of traceability, would we refuse to ever go back?




[1] This concept of "legibility" comes from the book Seeing Like a State.

Wednesday, August 17, 2016

Harder and better (slides)

At Scenic City Summit in Chattanooga last week, I gave a closing keynote about 3 ways our jobs are harder than they used to be, and how each of these makes our jobs better.

Annotated slides are on Dropbox.

Saturday, May 7, 2016

Tradeoffs in Coordination Among Teams

The other day in Budapest, Jez Humble and I wondered, what is the CAP theorem for teams? In distributed database systems, the CAP theorem says: choose two of Consistency, Availability, and Partitioning — and you must choose Partitioning.
Consider a system for building software together. Unless the software is built by exactly one person, we have to choose Partitioning. We can’t meld minds, and talking is slow.
In databases we choose between Consistency (the data is the same everywhere) and Availability (we can always get the data). As teams grow, we choose between Consensus (doing things for the same reasons in the same way) and Actually-getting-things-done.
Or, letting go of the CAP acronym: we balance Moving Together against Moving Forward.

Moving Together


A group of 1 is the trivial case. Decision-making is the same as consensus. All work is forward work, but output is very limited, and when one person is sick everything stops.

A group of 2-7 is ideal: the communication comes with interplay of ideas, and whole new outputs of dialogue make up for the time cost of talking to each other. It is still feasible for everyone in the group to have a mental model of each other person, to know what that person needs to know. Consensus is easy to reach when every stakeholder is friends with every other stakeholder.

Beyond one team, the tradeoffs begin. Take one team of 2-7 people working closely together. Represent their potential output with this tall, hollow arrow pointing up.


This team is building software to run an antique store. Look at them go, full forward motion. (picture: tall, filled arrow.)

Next we add more to the web site while continuing development on the register point-of-sale tools. We break into two teams. We’re still working with the same database of items, and building the same brand, so we coordinate closely. We leverage each other's tools. More people means more coordination overhead, but we all like each other, so it’s not much burden. We are a community, after all.
A green arrow and a red arrow, each connected by many lines of communication, are filled about halfway up with work.

Now the store is doing well. The web site attracts more retail business, the neighboring antique stores want to advertise their items on our site, everything is succeeding and we add more people. A team for partnerships, which means we need externally-facing reports, which means we need a data pipeline.
A purple arrow and a blue arrow join the red and green ones. Lines crisscross between them, a snarly web. The arrows are filled only a little because of these coordination costs. The purple arrow is less connected, and a bit more full, but it's pointed thirty degrees to the left.

The same level of consensus and coordination isn’t practical anymore. Coordination costs weigh heavily. New people coming in don’t get to build a mental model of everyone who already works there. They don’t know what other people know, or which other people need to know something. If the partnerships team touches the database, it might break point of sale or the web site, so they are hamstrung. Everyone needs to check everything, so the slowest-to-release team sets the pace. The purple team here is spending less time on coordination, so the data pipeline is getting built, but without any ties to the green team, it’s going in a direction that won’t work for point of sale.

This mess scales up in the direction of mess. How do we scale forward progress instead?

Moving Forward


The other extreme is decoupling. Boundaries. A very clear API between the data pipeline, point of sale, and web. Separate databases, duplicating data when necessary. This is a different kind of overhead: more technical, less personal. Break the back-end coupling at the database; break the front-end (API) coupling with backwards compatibility. Teams operate on their own schedules, releases are not coordinated. This is represented by wider arrows, because backwards compatibility and graceful degradation are expensive. 
Four arrows, each wide and short. A few lines connect them. They're filled, but the work went to width (solidness) rather than height (forward progress).

These teams are getting about as far as the communication-burdened teams. The difference is: this does scale out. We can add more teams before coordination becomes a limitation again.

Amazon is an extreme example of this: backwards compatible all the things. Each team Moving Forward in full armor. Everything fully separate, so no team can predict what other teams depend on. This made the AWS products possible. However, this is a ton of technical overhead, and maybe also not the kindest culture to work in.

Google takes another extreme. Their monorepo allows more coupling between teams. Libraries are shared. They make up for this with extreme tooling. Tests, refactoring tools, custom version control and build systems — even whole programming languages. Thousands of engineers work on infrastructure at Google, so that they can Move Together using technical overhead.

Balance


For the rest of us, in companies with 7-1000 engineers, we can’t afford one extreme or the other. We have to ask: where is consensus important? and where is consensus holding us back?

Consensus is crucial in objectives and direction. We are building the same business. The business results we are aiming for had better be the same. We all need to agree on “Which way is up?"

Consensus is crippling at the back end. When we require any coordination of releases. When I can’t upgrade a library without impacting other teams in ways I can't predict. When my database change could break a system more production-critical than mine. This is when we are paralyzed. Don't make teams share databases or libraries.

What about leveraging shared tools and expertise? If every team runs its own database, those arrows get really wide really fast, unless they skimp on monitoring and redundancy — so they will skimp and the system will be fragile. We don't want to reinvent everything in every team.

The answer is to have a few wide arrows. Shared tools are great when they’re maintained as internal services, by teams with internal customers. Make the data pipeline serve the partnership and reporting teams. Make a database team to supply well-supported database instances to the other teams. (They’re still separate databases, but now we have shared tools to work with them, and hey look, a data pipeline for syncing between them.)


The green, red, and blue arrows are narrow and tall, and mostly full of work, with some lines connecting them. The purple arrow and a new black arrow are wide and short and full of work. The wide arrows (internal services) are connected to the tall arrows (product teams) through their tips.

Re-use helps only when there is a solid API, when there is no coupling of schedules, and when the providing team focuses on customer service.

Conclusions


Avoid shared code libraries, unless you’re Google and have perfect test coverage everywhere, or you’re Amazon and have whole teams supporting those libraries with backwards compatibility.
Avoid shared database instances, but build internal teams around supporting common database tools.

Encourage shared ideas. Random communication among people across an organization has huge potential. Find out what other teams are doing, and that can refine your own direction and speed your development — as long as everything you hear is information, not obligation.

Reach consensus on what we want to achieve, why we are doing it, and how (at a high level) we might achieve it. Point in the same direction, move independently.

Every organization is a distributed system, even when we sit right next to each other. Coordination makes joint activity possible, but not free. Be conscious of the tradeoffs as your organization grows, as consensus becomes less useful and more expensive. Recognize that new people entering the organization experience higher coordination costs than you do. Let teams move forward in their own way, as long as we move together in a common direction. Distributed systems are hard, and we can do it.







Bonus material 

Here is a picture of Jez in Budapest.



And here is a paper about coordination costs:
Common Ground and Coordination in Joint Activity

Saturday, April 16, 2016

Property Testing in Elm

Elm is perfectly suited to property testing, with its delightful data-in--data-out functions. Testing in Elm should be super easy.

The tooling isn't there yet, though. This post documents what was necessary today to get a property to run in Elm.

Step 1: elm-test

This includes an Elm library and a node module for a command-line runner. The library alone will let you create a web page of test results and look at it, but I want to run them in my build script and see results in my terminal.

Installation in an existing project:
elm package install deadfoxygrandpa/elm-test
npm install -g elm-test
The node module offers an "elm test init" functionality to put some test files in the current directory: TestRunner.elm (which is the Main module for test runs[1]) and Tests.elm, which holds actual tests. Personally, I found the following steps necessary as well.

  • create a test directory (I don't want tests in my project home), and move the TestRunner.elm and Tests.elm files there.
  • add that test directory to the source directories in elm-package.json

Step 2: elm-check


The first thing to know is: which elm-check to install. You need the one from NoRedInk:
elm package install NoRedInk/elm-check
The next thing is: what to import. Where do all those methods used in the README live?

Here is a full program that lets elm-test execute the properties from the elm-check readme.
TL;DR: You need to import stuff from Check and Check.Producer for all properties; and for the runner program, ElmTest and Check.Test and Signal, Console, and Task.

Name it test/Properties.elm and run it with
elm test test/Properties.elm
The output looks like
Successfully compiled test/Properties.elm
Running tests...
  1 suites run, containing 2 tests
  All tests passed
Here's the full text just in case.
module Main (..) where
import ElmTest
import Check exposing (Evidence, Claim, that, is, for)
import Check.Test
import Check.Producer as Producer
import List
import Signal exposing (Signal)
import Console exposing (IO)
import Task

console : IO ()
console =
  ElmTest.consoleRunner (Check.Test.evidenceToTest evidence)

port runner : Signal (Task.Task x ())
port runner =
  Console.run console

myClaims : Claim
myClaims =
  Check.suite
    "List Reverse"
    [ Check.claim
        "Reversing a list twice yields the original list"
        `that` (\list -> List.reverse (List.reverse list))
        `is` identity
        `for` Producer.list Producer.int
    , Check.claim
        "Reversing a list does not modify its length"
        `that` (\list -> List.length (List.reverse list))
        `is` (\list -> List.length list)
        `for` Producer.list Producer.int
    ]

evidence : Evidence
evidence =
  Check.quickCheck myClaims
How to write properties is a post for another day. For now, at least this will get something running.

See also: a helpful post for running elm-check in phantom.js


[1] How does that even work? I thought modules needed the same name as their file name. Apparently this is not true of Main. You must name the module Main. You do not have to have a 'main' function in there (as of this writing). The command-line runner needs the 'console' function instead.

Thursday, March 31, 2016

Scaling Intelligence

You can watch the full keynote from Scala eXchange 2015 (account creation required, but free). The talk includes examples and details; this post is a summary of one thread.

Scala is a scalable language, from small abstractions to large ones. This helps with the one scaling problem every software system has: scaling the feature set while still fitting it in our heads. Scaling our own intelligence.

Scala offers complicated powerful language features built from combinations of simpler language features. The aim is a staircase of learning: gradually learn features as you need them. The staircase starts in the green grass of toy programs, moves through the blue sky of useful business software, and finally into the outer space of abstract libraries and frameworks. (That dark blob is supposed to represent outer space.)


This is not how people experience the language.

The green grass is great: Odersky's Coursera courses, Atomic Scala. Next, we want to write something useful for work: the blue sky. It is time to use libraries and frameworks. I want a web app, so I bring in Spray. Suddenly I need to understand typeclasses and the magnet pattern. The magnet pattern? The docs link to a post on this. It's five thousand words long. I'm shooting into outer space -- I don't want to be an astronaut yet!

The middle of the staircase is missing.


Who can repair this? Not the astronauts, the compiler and library authors. They can write posts around programming language theory, defining one feature in terms of a bunch of other concepts I don't understand yet. I need explanations by people who share my objectives, people a little bit ahead of me in the blue sky, who recently learned how to use Spray themselves. I don't need research papers, I need StackOverflow. Blog posts, not textbooks.

This is where we need each other. As a community, we can fill this staircase. At a macro level, we scale intelligence with teaching.

Scala as a language is not enough. We don't work in languages, especially not in the blue sky. We work in language systems, including all the libraries and tooling and all the people. The resources we create, and the personal interactions in real life and online. When we teach each other, we scale our collective intelligence, we scale our community.

Scaling the community is important, because only a large, diverse group can answer two crucial questions. To make the language and libraries great, we need to know about each feature: is this useful? and to make this staircase solid, we need to know about each source and document: is this clear?

Useful isn't determined by the library author, but by its users. Clear isn't determined by the writer, but by the reader. If you read the explanation of Futures on the official Scala site and you don't get it, if you feel stupid, that is not your fault. When documentation is not clear to you, its maintainers fail. Teaching means entering the context of the learner, and starting there. It means reaching for the person a step or two down, and pulling them up to where you are.

Michael Bernstein described his three years of learning Haskell. "I tried over and over again to turn my self doubt into a pure functional program, and eventually, it clicked."
Ouch. Not everyone has this tenacity. Not everyone has three years to spend becoming an astronaut. Teaching makes the language accessible to more people. At the same time, it makes everyone's life easier -- what might Mr Bernstein have accomplished during those three years?

Scala, the language system, does not belong to Martin Odersky.  It belongs to everyone who makes Scala useful. We can each be part of this.

Ask and answer questions on StackOverflow. Blog about what you learned, especially about why it was useful.[1] Request more detail -- if something is not clear to you, then it is not clear. Speak at your local user group.[2] The less type theory you understand, the more people you can help!

Publish your useful Scala code. We need examples from the blue sky. If you do, tweet about it with #blueSkyScala.

It is up to all of us to teach each other, to scale our intelligence. Then we can make use of those abstractions that Scala builds up. Then it will be a scalable language.




[1] example: Remco Beckers's post on Option and Either and Try.
[2] example: Heather Miller's talk compensates for bad documentation around Scala Futures.