Tuesday, August 25, 2015

Functional principles come together in GraphQL at React Rally

Sometimes multiple pieces of the industry converge on an idea from
different directions. This is a sign the idea is important.

Yesterday I spoke at React Rally (video coming later) about the
confluence of React and Flux in the front end with functional
programming in the back end. Both embody principles of composition, declarative
style, isolation, and unidirectional flow of data.

In particular, multiple separate solutions focused on:

  •   components declare what data they need
  •   these small queries compose into one large query, one GET request
  •   the backend gathers everything and responds with data in the requested format

This process is followed by Falcor from Netflix (talk by Brian Hunt) and GraphQL from Facebook (talks by Lee Byron and Nick Schrock, videos later). Falcor adds caching on the client, with cache
invalidation defined by the server (smart; since the server owns
the data, it should own the caching policy). GraphQL adds an IDE for
queries, called GraphiQL (sounds like "graphical"), released as
open source for the occasion! The GraphQL server provides introspection
into the types supported by its query language. GraphiQL uses this to let the developer
work with live, dynamically fetched queries. This lets us explore the available
data. It kicks butt.

Here's an example of GraphQL in action. One React component in a GitHub client might specify that it needs
certain information about each event (syntax is approximate):
{
  event {
    type,
    datetime,
    actor {
      name
    }
  }
}
and another component might ask for different information:
{
  event {
    actor {
      image_url
    }
  }
}

The parent component assembles these and adds context, including
selection criteria:
{
  repository(owner:"org", name:"gameotron") {
    event(first: 30) {
      type,
      datetime,
      actor {
        name,
        image_url
      }
    }
  }
}
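The composition step can be sketched in a few lines of JavaScript. This is only an illustration, assuming each component's needs are written as a nested object of field names; that representation and the helper names `mergeSelections` and `toQueryString` are invented here, not part of GraphQL or React. Arguments such as `owner` and `first` are left out to keep it short.

```javascript
// Hypothetical representation: keys are field names, `true` marks a leaf
// field, and a nested object marks a subselection.
const eventSummaryNeeds = { event: { type: true, datetime: true, actor: { name: true } } };
const avatarNeeds = { event: { actor: { image_url: true } } };

// Deep-merge two selections without mutating either one.
function mergeSelections(a, b) {
  const merged = { ...a };
  for (const [field, sub] of Object.entries(b)) {
    const existing = merged[field];
    if (typeof existing === "object" && typeof sub === "object") {
      merged[field] = mergeSelections(existing, sub);
    } else if (existing === undefined || existing === true) {
      merged[field] = sub;
    }
    // Otherwise `existing` is already a subselection and `sub` adds nothing.
  }
  return merged;
}

// Render a selection as query text in the approximate syntax used above.
function toQueryString(selection) {
  const fields = Object.entries(selection).map(([field, sub]) =>
    sub === true ? field : `${field} ${toQueryString(sub)}`
  );
  return `{ ${fields.join(", ")} }`;
}

// The parent folds its children's needs into one selection.
const combined = [eventSummaryNeeds, avatarNeeds].reduce(mergeSelections, {});
const queryText = toQueryString(combined);
// queryText: "{ event { type, datetime, actor { name, image_url } } }"
```

Both functions are data-in, data-out, so the parent can combine any number of children without knowing what each one asked for.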
Behind the scenes, the server might make one call to retrieve the repository,
another to retrieve the events, and another to retrieve each actor's
data. Both GraphQL and Falcor see the query server as an abstraction
layer over existing code. GraphQL can stand in front of a REST
interface, for instance. Each piece of data can be
fetched with a separate call to a separate microservice, executed in
parallel and assembled into the structure the client wants. One GraphQL
server can support many versions of many applications, since the
structure of returned data is controlled by the client.
The GraphQL server assembles all the
results into a response that parallels the structure of the client's
query:
{
  "repository" : {
    "event" : [{
      "type" : "PushEvent",
      "datetime" : "2015-08-25T23:24:15Z",
      "actor" : {
        "name" : "jessitron",
        "image_url" : "https://some_cute_pic"
      }
    }
    ...]
  }
}
It's like this:

[Figure: The query is built as a composition of the queries from all the components. It goes to the server. The query server spreads out into as many other calls as needed to retrieve exactly the data requested.]

The query is composed like a fan-in of all the components'
desires. On the server this fans out to as many back-end calls as
needed. The response is isomorphic to the query. The client then spreads
the response back out to the components. This architecture supports
composition in the client and modularity on the server.

[Figure: The server takes responses from whatever other services it had to call, assembles that into the data structure specified in the query, and returns that to the client. The client disseminates the data through the component tree.]

This happens to minimize network traffic between the client and server.
That's nice, but what excites me are these lovely declarative queries that
compose, the data flowing from the parent component into all the
children, and the isolation of data requests to one place. The exchange
of data is clear. I also love the query server as an abstraction over
existing services; store the data bits in the way that's most convenient
for each part. Assembly sold separately.
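
The fan-out and assembly on the server might look roughly like the JavaScript sketch below. It is not how any real GraphQL or Falcor server is implemented: the three `fetch...` functions are hypothetical stubs standing in for existing services, the canned data (including the extra `email` field) is invented, and the resolver is hard-coded for this one query shape.

```javascript
// Hypothetical stand-ins for existing back-end services. Each returns a
// Promise, as a real HTTP or database call would.
const fetchRepository = async (owner, name) => ({ id: 1, owner, name });
const fetchEvents = async (repoId, first) =>
  [{ type: "PushEvent", datetime: "2015-08-25T23:24:15Z", actorId: 7 }].slice(0, first);
const fetchActor = async (actorId) => ({
  id: actorId,
  name: "jessitron",
  image_url: "https://some_cute_pic",
  email: "hidden@example.com", // never requested, so never returned
});

// Keep only the leaf fields named in the selection (true = leaf).
function pick(record, selection) {
  const out = {};
  for (const field of Object.keys(selection)) {
    if (selection[field] === true) out[field] = record[field];
  }
  return out;
}

async function resolveRepository(args, selection) {
  const repo = await fetchRepository(args.owner, args.name);
  const events = await fetchEvents(repo.id, args.first);
  // Fan out: one actor lookup per event, all in flight at the same time.
  const actors = await Promise.all(events.map((e) => fetchActor(e.actorId)));
  const eventSelection = selection.event;
  // Assemble the results in the shape the client asked for.
  return {
    repository: {
      event: events.map((e, i) => ({
        ...pick(e, eventSelection),
        actor: pick(actors[i], eventSelection.actor),
      })),
    },
  };
}

const responsePromise = resolveRepository(
  { owner: "org", name: "gameotron", first: 30 },
  { event: { type: true, datetime: true, actor: { name: true, image_url: true } } }
);
```

The stubs can each store and serve their data however is convenient; only `resolveRepository` knows how the pieces fit together.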

Seeing similar architecture in Falcor and GraphQL, as well as in
ClojureScript and Om[1] earlier in the year, demonstrates that this is
important in a general case. And it's totally compatible with
microservices! After React Rally, I'm excited about where front ends are
headed.


[1] David Nolen spoke about this process in ClojureScript at Craft Conf
earlier this year.

Sunday, August 16, 2015

An Opening Example of Elm: building HTML by parsing parameters

I never enjoyed front-end development, until I found Elm. JavaScript with its `undefined`, its untyped functions, its widely scoped mutable variables. It's like Play-Doh, it's so malleable. And when I try to make a sculpture, the arms fall off. It takes a lot of skill to make Play-Doh look good.

Then Richard talked me into trying Elm. Elm is more like Lego Technic. Fifteen years ago, I bought and built a Lego Technic space shuttle, and twelve years ago I gave up on getting that thing apart. It's still in my attic. Getting those pieces to fit together takes some work, but once you get there, they're solid. You'll never get "method not found on `undefined`" from your Elm code.


Elm is a front-end, typed functional language; it compiles to JavaScript for use in the browser. It's a young language (as of 2015), full of opportunity and surprises. My biggest surprise so far: I do like front-end programming!

To guarantee that you never get `undefined` and never call a method that doesn't exist, all Elm functions are Data in, Data out. All data is immutable. All calls to the outside world are isolated. Want to hit the server? Want to call a JavaScript library? That happens through a port. Ports are declared in the program's main module, so they can never hide deep in the bowels of components. Logic is in one place (Elm), interactions in another.

[Figure: One section (Elm) has business logic and is data-in, data-out. It has little ports to another section (JavaScript) that can read input, write files, draw UI. That section blurs into the whole world, including the user.]


This post describes a static Elm program with one tiny port to the outside world. It illustrates the structure of a static page in Elm. Code is here, and you can see the page in action here. The program parses the parameters in the URL's query string and displays them in an HTML table.[1]

All web pages start with the HTML source:
<html><head>
  <title>URL Parameters in Elm</title>
  <script src="elm.js" type="text/javascript"></script>
  <link href="http://yui.yahooapis.com/pure/0.6.0/pure-min.css" rel="stylesheet"></link>
</head>
<body></body>
<script type="text/javascript">
  var app = Elm.fullscreen(Elm.UrlParams,
                           { windowLocationSearch:
                               window.location.search
                           });
</script></html>

This brings in my compiled Elm program and some CSS. Then it calls Elm's function to start the app, giving it the name of my module which contains main, and extra parameters, using JavaScript's access to the URL search string.

Elm looks for the main function in my module. The output of this function can be a few different types, and this program uses the simplest one: Html. This type is Elm's representation of HTML output, its virtual DOM.

module UrlParams where

import ParameterTable exposing (view, init)
import Html exposing (Html)

main : Html
main = view (init windowLocationSearch)

port windowLocationSearch : String
The extra parameters passed from JavaScript arrive in the windowLocationSearch port. This is the simplest kind of port: input received once at startup. Its type is simply String. This program uses one custom Elm component, ParameterTable. The main function uses the component's view function to render, and passes it a model constructed by the component's init function.

Somewhere inside the JavaScript call to Elm.fullscreen, Elm calls the main function in UrlParams, converts the Html output into real DOM elements, and renders that in the browser. Since this is a static application, this happens once. More interesting Elm apps have different return types from main, but that's another post.

From here, the data flow of this Elm program looks like this:

[Figure: The three layers are: a main module, a component, and a library of functions.]

The main module has one input port for the params. That String is transformed by init into a Model, which is transformed by view into Html. The Html is returned by main and rendered in the browser. This is the smallest useful form of the Elm Architecture that I came up with.
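
For readers who don't read Elm yet, the same three-layer flow can be mimicked in JavaScript. This is a sketch of the shape only, not the program from this post: the parser is deliberately simplified (it does not treat `+` as a space and will throw on malformed percent-encoding), the model holds a plain array of pairs rather than the post's ParseResult type, and the "Html" is just a string.

```javascript
// Library layer (simplified): "?a=1&b=two" -> [["a", "1"], ["b", "two"]]
function parseSearchString(search) {
  const trimmed = search.startsWith("?") ? search.slice(1) : search;
  if (trimmed === "") return [];
  return trimmed.split("&").map((pair) => {
    const [key, ...rest] = pair.split("=");
    return [decodeURIComponent(key), decodeURIComponent(rest.join("="))];
  });
}

// Component layer: init builds the model, view renders it.
const init = (windowLocationSearch) => ({
  tableData: parseSearchString(windowLocationSearch),
});

const escapeHtml = (s) =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

const view = (model) =>
  "<table>" +
  model.tableData
    .map(([k, v]) => `<tr><td>${escapeHtml(k)}</td><td>${escapeHtml(v)}</td></tr>`)
    .join("") +
  "</table>";

// Main layer: the one input arrives from outside; everything after it is
// data-in, data-out.
const main = (windowLocationSearch) => view(init(windowLocationSearch));

const html = main("?name=elm&mood=happy");
// html: "<table><tr><td>name</td><td>elm</td></tr><tr><td>mood</td><td>happy</td></tr></table>"
```

Unlike Elm, nothing here stops a later edit from reaching for a global or returning `undefined`; the sketch only shows how the data moves.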

Here's a piece of the ParameterTable module:
module ParameterTable(view, init) where

import Html exposing (Html)
import UrlParameterParser exposing (ParseResult(..), parseSearchString)

--- MODEL
type alias Model = { tableData: ParseResult }

init: String -> Model
init windowLocationSearch =
  { tableData = parseSearchString windowLocationSearch }

--- VIEW
view: Model -> Html
view model =
  Html.div ...
The rest of the code has supporting functions and details of the view. These pieces (Model, init, and view) occur over and over in Elm. Often the Model of one component is composed from the Models of subcomponents, and the same with init and view functions.[2]

All the Elm files are transformed by elm-make into elm.js. Then index.html imports elm.js and calls its Elm.fullscreen function, passing UrlParams as the main module and window.location.search in the extra parameter. And so, a static (but not always the same) web page is created from data-in, data-out Elm functions. And I am a happy programmer.



[1] Apparently there's not a built-in thing in JavaScript for parsing these. Which is shocking. I refused to write such a thing in JavaScript (where by "write" I mean "copy from StackOverflow"), so I wrote it in Elm.

[2] Ditto with update and Action, but that's out of scope. This post is about a static page.





Monday, August 3, 2015

Data-in, Data-out

In functional programming, we try to keep our functions data-in, data-out: they take some data as parameters, return some data as output, and that's it. Nothing else. No dialog boxes pop up, no environment variables are read, no database rows are written, no files are accessed. No global state is read or written. The output of the function is entirely determined by the values of its input. The function is isolated from the world around it.

A data-in, data-out function is highly testable, without complicated mocking. The test provides input, looks at the output, and that's all that it needs for a complete test.[1]
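
As a small illustration in JavaScript (the function `countByType` and its sample events are made up for this example), here is a data-in, data-out function and everything its test needs:

```javascript
// Data-in, data-out: the result depends only on the argument, and nothing
// outside the function is read or changed.
function countByType(events) {
  const counts = {};
  for (const event of events) {
    counts[event.type] = (counts[event.type] || 0) + 1;
  }
  return counts;
}

// The whole test: provide input, look at the output. No mocks, no setup.
const counts = countByType([
  { type: "PushEvent" },
  { type: "ForkEvent" },
  { type: "PushEvent" },
]);
console.assert(counts.PushEvent === 2, "expected two PushEvents");
console.assert(counts.ForkEvent === 1, "expected one ForkEvent");
```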

A data-in, data-out function is pretty well documented by its declaration; its input types specify everything necessary for the function to work, and its output type specifies the entire result of calling it. Give the function a good name that describes its purpose, and you're probably good for docs.

It's faster to comprehend a data-in, data-out function because you know a lot of things it won't do. It won't go rooting around in a database. It won't interrupt the user's flow. It won't need any other program to be running on your computer. It won't write to a file[2]. All these are things I don't have to think about when calling a data-in, data-out function. That leaves more of my brain for what I care about.

If all of our code was data-in, data-out, then our programs would be useless. They wouldn't do anything observable. However, if 85% of our code is data-in, data-out, with some input-gathering and some output-writing and a bit of UI-updating -- then our program can be super useful, and most of it still maximally comprehensible. Restricting our code in this way when we're writing it provides more clarity when we're reading it and freedom when we're refactoring it.

Think about data-in, data-out while you're coding; make any dependencies on the environment and effects on the outside world explicit; and write most of your functions as transformations of data. This gets you many of the benefits of functional programming, no matter what language you write your code in.
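
One way to picture "make dependencies explicit" is the JavaScript sketch below. The greeting functions are invented for illustration; the point is only where the clock gets read.

```javascript
// Implicit dependency: the output changes with the clock, and nothing in
// the signature says so.
function greetingImplicit(name) {
  const hour = new Date().getHours();
  return (hour < 12 ? "Good morning, " : "Good afternoon, ") + name;
}

// Explicit dependency: the hour is a parameter, so the function is
// data-in, data-out.
function greeting(name, hour) {
  return (hour < 12 ? "Good morning, " : "Good afternoon, ") + name;
}

// The thin edge of the program: gather input, call the transformation,
// write output.
function run() {
  const hour = new Date().getHours();    // read from the environment
  console.log(greeting("reader", hour)); // effect on the outside world
}
```

Only `run` touches the outside world; `greeting` can be read, tested, and refactored without thinking about what time it is.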


[1] Because the output is fixed for a given input, it would be legit to substitute the return value for the function-call-with-that-input at any point. Like, one could cache the return values if that helped with performance, because it's impossible for them to be different next time, and it's impossible to notice that the function wasn't called because calling it has no externally-observable effect. Historically, this property is called referential transparency.

[2] We often make an exception for logging, especially logging that gets turned off in production.