Saturday, August 31, 2013

Idempotence in math and computing

"Idempotent" is a big word. It's one of the four-dollar words thrown around by functional programmers, and it confuses people -- including functional programmers. This is because we don't use it consistently.

In math, idempotence describes only unary functions that you can call on their own output. Math-idempotence is, "If you take the absolute value of a number, and then you take the absolute value of that, the result doesn't change on the second (or subsequent) operations." Math.abs is math-idempotent. Math-idempotence only applies to functions of one parameter where the parameter type and return type are the same. Not so useful in programming.

Here's the real definition.
In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters.[1]
Programming-idempotence is about side effects. It's about stuff that happens to the outside world when you call a function. Idempotence says "If you've called me once, it doesn't matter whether you called me again."


Not idempotent: Pushing "Place order" creates a new order in the database.
Idempotent: Pushing "Place order" moves order 4567 from 'cart' to 'finalized' status.

Not idempotent: Fetch a row from the database and create an object to represent it
Idempotent: Fetch a row from Hibernate, which returns the previously allocated object for the same row on a second call.

Idempotent: Give me that pizza.
Not idempotent: Make me a pizza.

Idempotent: check a password against the hash in the database
Not idempotent: check a password and increment the bad-login-attempts if it's wrong

Not idempotent: Install MongoDB
Idempotent: Install MongoDB if it isn't already installed

Idempotent: perform a calculation and return the result
Not strictly idempotent: perform a calculation, write to the log file, and return the result

Keep it idempotent

Programming-idempotence, every function has it or it doesn't. Data-in, data-out functions[2] do not affect the outside world at all, so they are naturally idempotent. If your function does something to the outside world, consider what will happen when your function is called twice. Will it blow up? cause twice as much to happen? or have the same result as calling it once?

It's a good idea to make side-effecting functions idempotent. They make your forms immune to repeated-submission problems. They make your chef scripts re-runnable when a download fails in the first attempt. They avoid surprising people.

The word idempotence came from math, but its meaning in programming context is broader and more useful.[3]


[2] "data-in, data-out" is a more accessible term for "Referentially transparent and nullipotent." Nullipotent means that the outside world doesn't care whether you called it 0 or 1 or 100 times.

[3] If you want a mapping from the math-idempotency to programming-idempotency, think of it this way:

All functions (in a language other than Haskell) have an invisible parameter: the world around them.
When we write function application in our program, we supply the declared parameters. The world around
is supplied at runtime.
Similarly for output, our program receives an explicit return value, and there is an invisible return
value as well: the world after our function exits.
Therefore, consider every function application in code as a partial application, with the world as the single
unsupplied parameter. Looked at this way, every function + inputs is a unary operator on the single
parameter World, and it returns another World. Calling our function again with the same inputs, does it change
that World again?

f(x) = f(f(x))

f(a,b)(world).newWorld = f(a,b)(f(a,b)(world)).newWorld

where the method newWorld extracts the world-after-execution from the function's multiple return values (the explicit one and the implied world).

[4] World is one of the most awkward words in the English language.


  1. Arguably, "Pushing "Place order" moves order 4567 from 'cart' to 'finalized' status" is not idempotent.

    It might be a clearer distinction through the lens of distributed services than monolithic architecture where these finer points tend not to surface.

  2. Your Hibernate example makes no sense at all.

    1. Consider an object with a very large hierarchical structure, where Hibernate is instantiating a whole lot for one retrieval.
      The second call to retrieve the same object returns the one that's already created. The second call is fast and doesn't add to the memory usage of the application.

  3. One of my hobby horses is that idempotence is frequently mistaken for the more interesting property of a mathematical fixed point. What you are usually after is a predictable outcome (desired state) that will not be spoiled by later actions. Idempotence is not the right criterion here (it is insufficient), because doing something once only does not guarantee that you will end up doing it right, or in the right place.

    e.g. mkdir desired
    and mkdir /specific/desired

    are not the same. Both are idempotent, but only the latter is a relativity-free fixed point. I called the latter "convergence" because what you want is for any process, in whatever initial state, however many steps (a functional map) to have a desired outcome that is definable and predictable.

    Fixed points allow self-repairing equilibria, and get as close to a dynamic definition of determinism as we can get in information systems.

  4. Math-idempotence only applies to functions of one parameter where the parameter type and return type are the same.

    Matrices can be idempotent and they take multiple parameters.

    Math-idempotence is, "If you take the absolute value of a number, and then you take the absolute value of that, the result doesn't change on the second (or subsequent) operations."

    See idempotence needn't be a property of the operation; it can be of the binary relation between two elements. (The binary relation is up for grabs.)

  5. Effectively teaching elementary math to children aged 5 to 10 (Grade 1,Grade 2,Grade 3,Grade 4,Grade 5).Great for Homeschool kids! All math results are logged and graded and we show how they are improving through real-time hereGrade 1 elementary school math lessons and timed tests