Tuesday, July 23, 2013

Aqueductron - toying with dataflow in Ruby

I love playing with Ruby because it lets me express concepts clearly[1]. My aqueductron gem expresses two of them: processing data, and code modifying code, all without modifying anything[2].

The metaphor of Aqueductron is an aqueduct. Data is the water, taking the form of droplets flowing through the ducts. Each piece of duct might be a filter, or a map. At the end, there's a collector of some kind.

The interesting parts of dataflow are where the metaphor breaks down. For instance, take a duct with a split in it. With real water, each drop will take exactly one path. With data, each drop can go down all paths, or none.

In the real world, an aqueduct is constructed before the water ever flows, and the aqueduct stays the same forever. In aqueductron, a duct piece can change with every drop. The split can add paths as it encounters new information.


When the delta between drops is more interesting than the drops themselves, pieces can alter themselves based on what comes through. For added challenge and sanity, aqueductron does this without mutating state.
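Here is a minimal sketch of that idea in plain Ruby. It is not aqueductron's code: `DeltaPiece` is a made-up name, and the only assumption is that a piece answers each drop by returning a new piece rather than changing itself.

```ruby
# Hypothetical illustration, not one of aqueductron's real classes.
# The piece remembers the previous drop and records the delta to the
# next one. It never changes itself: drip returns a brand-new piece.
class DeltaPiece
  attr_reader :previous, :deltas

  def initialize(previous = nil, deltas = [])
    @previous = previous
    @deltas = deltas.freeze
    freeze # any later attempt to mutate this piece raises an error
  end

  def drip(value)
    # The first drop has nothing to compare against, so no delta yet.
    new_deltas = previous.nil? ? deltas : deltas + [value - previous]
    DeltaPiece.new(value, new_deltas)
  end
end

start = DeltaPiece.new
after = start.drip(2).drip(5).drip(9)
after.deltas # the differences between consecutive drops
start.deltas # still empty; the original piece is untouched
```

Because every `drip` hands back a fresh object, any earlier version of the piece stays available to look at, which is what makes it possible to inspect the aqueduct in between drops.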

Since the ducts can change as data flows, it is useful to see what the aqueduct looks like in between drops. The Ruby REPL is handy, and aqueductron is equipped with ASCII art.

Create new paths

For the Lambda Lounge code shootout recently, we implemented some problems from Rosalind.info. Here's the simplest one, generalized from "count nucleotides" to "count all the characters that you see." My code is explained in detail here.

Construct a pipe that expands a string into its characters, then creates a path for each unique letter. It is empty at first.

2.0.0> a = Duct.new.expand(->(s) {s.each_char}).
                    partition(->(a) {a.upcase},
                              ->(a) {Duct.new.count})
 => Duct:
--- / 
 ~ <  +?
--- \  

Send it one string, and the duct creates new paths.

2.0.0> b = a.drip("ACtT")
 => Duct:
       A  # (1)
--- / ---\
 ~ <   C  # (1)
--- \ ---/
       T  # (2)
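To show what the split is doing without relying on the gem, here is a rough stand-in in plain Ruby. `CountingSplit` is a hypothetical name and none of this is aqueductron's API. It only mirrors the behavior described above: expand the string into characters, use the upcased character as the path key, and count what arrives on each path.

```ruby
# Hypothetical stand-in for the partitioned duct: one counter per key,
# created the first time that key shows up. No aqueductron calls here.
class CountingSplit
  attr_reader :paths

  def initialize(paths = {})
    @paths = paths.freeze
    freeze
  end

  # Expand the string into characters, route each one to the path named
  # by its upcased form, and return a new split with the updated counts.
  def drip(string)
    new_paths = string.each_char.reduce(paths) do |acc, char|
      key = char.upcase
      acc.merge(key => acc.fetch(key, 0) + 1) # merge builds a new hash
    end
    CountingSplit.new(new_paths)
  end
end

empty = CountingSplit.new
full  = empty.drip("ACtT")
full.paths  # one entry per distinct upcased character
empty.paths # still has no paths
```

As with the duct `a` above, the empty split is unchanged after the drip. The paths exist only in the new object that `drip` returned.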

Modify existing paths

Another simple Rosalind problem describes rabbit populations as a modified Fibonacci sequence. A multiplier (k) is applied to the penultimate number as it's added to the last one to generate the next number in the sequence. In aqueductron, the duct can learn from the data coming through, changing as each drop flows in. Each drop is one generation's rabbit population. When it's time to make predictions, the pipe uses what it learned. In this case, the pieces are decorated with a description of the function inside them. Code details are here.

Build an empty pipe:

2.0.0> rabbits = Duct.new.custom(empty_fib_function).last
 => Duct:
 ~  last

Then drip information about the generations through, and the duct learns.

2.0.0> rabbits.drip(2)
 => Duct:
 2..  last (2)
2.0.0> rabbits.drip(2).drip(5)
 => Duct:
 2,5..  last (5)
2.0.0> rabbits.drip(2).drip(5).drip(9)
 => Duct:
 ..5,9.. starting k~2.0  last (9)

When asked to predict future generations, the duct uses what it has learned.

2.0.0> rabbits.drip(2).drip(5).drip(9).drip(:unknown)
 => Duct:
 ..9,19.. k=2.0  last (19)
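The arithmetic behind that transcript can be sketched in a few lines. This is my own illustration, not the function the duct is built with: `RabbitLearner` and its method names are hypothetical, and it assumes the model is next = last + k * penultimate, as described above.

```ruby
# Hypothetical sketch of the learning step; names are illustrative only.
# Model: next = last + k * penultimate.
class RabbitLearner
  attr_reader :penultimate, :last, :k

  def initialize(penultimate = nil, last = nil, k = nil)
    @penultimate = penultimate
    @last = last
    @k = k
    freeze
  end

  # A known population refines k; :unknown asks for a prediction.
  # Either way the result is a new learner, never a changed one.
  def drip(value)
    return predict if value == :unknown

    # With two earlier generations on hand, solve the model for k.
    learned_k = penultimate && last ? (value - last) / penultimate.to_f : k
    RabbitLearner.new(last, value, learned_k)
  end

  private

  def predict
    raise ArgumentError, "need three known generations first" if k.nil?

    RabbitLearner.new(last, last + k * penultimate, k)
  end
end

learned   = RabbitLearner.new.drip(2).drip(5).drip(9)
predicted = learned.drip(:unknown)
learned.k      # (9 - 5) / 2.0
predicted.last # 9 + k * 5
```

With the drops 2, 5, 9 this gives k = (9 - 5) / 2 = 2.0, and the prediction is 9 + 2.0 * 5 = 19, which matches the numbers in the transcript.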

Learning dataflow

The part where the flow changes with the data fascinates me. That it changes itself without mutating state fascinates me even more. These are the concepts explored in aqueductron. Look for more aqueductron on this blog, past and future.

[1] where my target audience is devs (like me) who are more comfortable with objects than Lisps.
[2] it's Ruby, so forcing immutability is a lot of work. Since I'm going for clarity, aqueductron is immutable by choice, not by compiler restriction.
