Thursday, April 25, 2013

Property-based testing: what is it?

What is this property-based thing?

Property-based tests make statements about the output of your code based on the input, and these statements are verified for many different possible inputs.

A property-based testing framework runs the same test over and over with generated input. The canonical framework is QuickCheck in Haskell. My experience is with ScalaCheck.

This contrasts with example-based testing, which is most of the tests we write these days. Each unit test sets up one input scenario, runs the code under test, and then checks the output (and any other effects the code is supposed to have).

Instead, one property-based test runs hundreds of times with different inputs. The testing framework will try to get the test to fail by passing empty lists, negative values, all the possible edges cases. It'll pass in long lists and high numbers and strings with special characters.

The property-based tester has to think very carefully about the specification. What kind of input is supported? Encode that in the preconditions of the test. How can the input be generated? For custom types, this takes at least a bit of code. What statements can we make about the output? Encode all these in one test, or one test per statement. This is hard!

For example, say I'm writing a function that takes in a bunch of sets
[B,F] [A,B,C] [A,D,E,F] [C] [B,F] [R]
and chooses a number of these sets to return, with the goal of returning the minumum number of sets that still include all the elements in all of the input sets. (Really, I did this at work a while back.) So the optimal output here is:
[A,B,C] [A,D,E,F] [R]
For my property-based test, the input is any set of sets. I can use sets of integers, since anything with an equals method will do.
I must make the following statement about the output:
  1. Every element in the input is also in the output
I could state the following, to sanity-check my function:
  1. Every output set was in the input
  2. The quantity of output sets is less than or equal to the input
  3. The same set never appears more than once in the output
  4. No output set is a subset of any other output set
And then the statement I'd really like to make:
  1. For every other possible combination of input elements such that (1) is true, the number of sets included is never fewer than the number of sets output by my function.
I could implement something like this in ScalaCheck. It isn't trivial. The first problem is: letting it generate any old set of sets of integers runs the JVM out of memory super fast. I have to code in length-limits for the sets. The second problem is: statement (6) takes forever to execute for more than a few sets, because they're O(n!). Fourteen input sets, ten billion combinations. Oops. Maybe it's worth making this check if I limit the input size to five.

See how much thought goes into a property-based test? and this is a simple specification! I don't recommend writing these for all your code - only the really important stuff.

The value of this type of testing is that it forces you to think about the code. If any possible input is not supported, that has to be considered and then codified into the test. empty sets? disjoint sets? identical sets?

All kinds of stuff will be tested in one test case. The framework will make a point of testing edge cases, and then it'll randomly generate a hundred possibilities. This saves you from writing all those edge-case tests, which saves repetition in your test code. Extreme thoroughness without repetition.

Property-based tests are best combined with example-based tests. Examples help you start organizing your thoughts, and they're easier for future-you to read and understand when you come back to this code later. Humans think in examples. Programs don't extrapolate. Property-based thinking and property-based testing can bridge between us and the computer. Math, it's a tool.

Also, it's fun to write one test and see "100 assertions passed" in the output.

22 comments:

  1. Nat Pryce has been trying to bring property based testing into the TDD cycle for a while now. He's doing another session on it at XP2013 in Vienna, Austria in June 2013 - http://xp2013.org/program/workshops-and-tutorials/property-based-test-driven-development/

    ReplyDelete
  2. We do quite a bit of this, and find it useful. There are two main difficulties that we come across: 1. the tests can involve algorithms sufficiently complicated that they, in turn, need tests, and 2. they don't give you any insight into what shape the solution should take.

    It's really really hard to write a generator that generates all the edge cases for you. We find that being able to pass hand-crafted examples into the same property-based assertion is very valuable too.

    ReplyDelete
  3. Excellent timing! I've just been writing about testing and proofs (via dependent types) for the May issue of http://pragprog.com/magazines, and I'm mentioning your article as a good first reference and comparison.

    My article (or articles - it's a series) will cover the next step up, of starting to prove certain properties for code, and looks at how and whether techniques based on dependent types give us leverage in this area.

    Are you writing anything else? your tweet hinted at a reason for writing this...

    ReplyDelete
    Replies
    1. This post seems to have hit a gap and drawn some interest, so I might write more.

      Based on your last blog post http://www.free-variable.org/2013/04/mind-the-gap-please/ : it looks like with types, especially dependent types, and tests, especially property-based tests, we are all trying to get our full knowledge of the problem domain into the code. It's easy to capture the procedures of the business. Capturing the "why" and "when" behind the procedures is much harder, and is not expressed in simple instructions for the computer.

      Example tests are a minimal, and more useful to communicating to humans and driving a good interface.
      Property tests are a step toward specificity.
      Proofs are another step.

      The more important the code, the deeper toward proofs we should go... but certain coding styles make it almost free to go farther. Hmm.

      Thanks for your comment and for referencing my post!

      Delete
  4. This sounds like exactly the kind of testing I'd like to see side-by-side with the TDD cycle. It sounds like an ideal way for programmer and tester to work together in real time on the same task. It also sounds like a great technique for programmers to practise in order to ingrain a few extra useful testing heuristics, which might reduce their blindspots and improve their effectiveness when test-driving anything.

    ReplyDelete
  5. The "program by design" curriculum (mostly created at Northeastern with input from Brown, Northwestern, Utah, WPI and Tübingen) uses precisely this sequence for its introductory freshmen curriculum:

    1. first semester freshmen: use 'example tests' (among several ideas on design); we use Racket's teaching languages for this purpose
    2. second semester freshmen: use property tests to articulate properties of functions (we do it in ACL2 but could do it in Racket and Scala)
    3. second semester freshmen: end up using ACL2 to prove properties for which they can't find counter-examples

    Sadly coop positions tend to "pollute" their minds. When they come back, they have to be reminded of these ideas.

    ReplyDelete
  6. You can test property (6) in ScalaCheck/QuickCheck, if you rethink it a bit.

    What's you need is obviously correct but not always optimal solver. For this problem greedy algorithm is good: http://en.wikipedia.org/wiki/Set_cover_problem . Idea is that, for some permutation of input, greedy algorithm will return optimal solution.

    Then the ScalaCheck property would be:

    forAll { (s : Set[Set[Int]], seed : Integer) => yourfn(s).size <= greedy(permuteWithSeed(s, seed)).size }

    P.S. set cover problem is NP-Complete, how did you solve it?

    ReplyDelete
  7. If I were writing this for production I'd totally keep a stateful count and return itself after each message. But that's a mutable-state optimization, and this is an exercise in functional style. the venus factor does it work

    ReplyDelete
  8. The ego cannot think about being deceased. It has no way of it. But there is another aspect of interest that not only can understand it, but already knows about it. cheap joomla hosting

    ReplyDelete
  9. Programs don't extrapolate. Property-based thinking and property-based testing can bridge between us and the computer. Math, it's a tool. lehenga choli

    ReplyDelete
  10. As almost everyone’s expenses knowledgeable, many organizations had to cut back on their display participation and the wide variety of individuals they sent to be existing at activities was considerably reduced. Consequently, some reveals stopped to are available or became much smaller versions of their halcyon periods. truth about cellulite com

    ReplyDelete
  11. Robichaud also hired Bella Energy of Louisville, Colo., to install solar panels on Precision headquarters to help recharge the batteries for eight hours at night after technicians drive them 120 miles during the day. reviews of old school new body

    ReplyDelete
  12. More than I was prepared to handle — literally and emotionally. Individually, you guys are great. Collectively, I wanted to crawl under my desk and curl up in the fetal position. bitcoin hosting

    ReplyDelete
  13. Because I'm a beginner I first thought "I'll just buy the $[...] one with 10 stitch options because I don't know what all those stitches do anyway, and then if I really get into sewing that's cheap enough that I'll upgrade in a year or two" I'm so glad I didn't do that, I'm still just beginning but I can see how valuable having more stitch options is going to be and I've already started using some of them. There wasnt a major growth but my boobs were a little more plump and full. Disappointed that it did not include a miter guide. f4x method exercises

    ReplyDelete
  14. Clinton's management was followed by a unsuccessful Republican management of Henry Shrub Jr., with a Republican The legislature. important source

    ReplyDelete
  15. Your power expenses come down significantly as excellent windows keep the room’s heat range constant more time once it has been warmed or chilled. So it uses less gas to warm or cool the house. www.browneshealth.co.uk

    ReplyDelete
  16. It is our intention to drive the first Electric Van for approximately 90 days and discover all of the nuances that might come with an EV or if there are any design changes we would like to have included with future vans,” said Robichaud the venus factor review

    ReplyDelete
  17. I agree, all (or at least most) systems should be build without primitive types. But you don't need to switch to Scala for this, it can be done perfectly well using Java. Saran Wrap Weight Loss

    ReplyDelete
  18. However, breaking even is better than having to pay out of your wallet. Determine what the complete expenses will be for you as a seller and any fees you may owe with the selling and price your home to crack even. We Buy Any House

    ReplyDelete
  19. The big range of gambling marketplaces provided on soccer by yourself is a lot of to protect bookies incredibly stressful especially in the course of the World Cup selection. Online Terpercaya

    ReplyDelete
  20. Usually, such individuals achieve size and body weight beyond the regular range. After supplements of the HGH just isn't done properly, signs just like Acromegaly are triggered. Going Here

    ReplyDelete
  21. Lysol, or other anti-bacterial fumigations can be used for daily servicing. he has a good point

    ReplyDelete