Sunday, November 18, 2012

Causality: tougher than it looks, but we can take it on

We like to take a hunk of data, graph one factor against another, demonstrate correlation, and infer causality. This naive form of analysis is appealing in its simplicity, but it doesn't cut it in the real world. With Big Data, we can identify correlation out the wazoo, but it's time to get way more sophisticated in our causality analysis.

With data as big as we can get it today, the scientific method doesn't work anymore. (Don't take my word for it. Listen to Sandy Pentland.)

A correlation between two factors is judged statistically significant if there is less than a 5%, or 1%, or 0.5% probability that results this strong would show up by chance alone. Even at the strictest of those levels, about 1 in 200 false hypotheses will show up as true out of randomness. With tremendous data, we can test an effectively unlimited number of hypotheses. Plenty of them will look significant when they are not. As Sandy puts it, you can learn that people who drive Fords on Thursdays are more likely to get the flu. The correlation exists, but it's bullshit.

With big data, it's time to bring the word "significant" back to its regular-people meaning. We have to look for causality. We have to look for the micropatterns that lead to better health, smoother traffic, lower energy use. No more "this happened and this happened to the same people, so they must be related!" Causality marks the difference between a paper that is true and a paper that is merely publishable.
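
To put rough numbers on the 1-in-200 problem, here is a minimal Python sketch. It assumes every hypothesis tested is actually false and that the tests are independent of each other, which real data rarely satisfies. The function names are made up for illustration.

```python
def expected_false_positives(num_hypotheses: int, alpha: float) -> float:
    """Expected count of false hypotheses that pass the significance test anyway."""
    return num_hypotheses * alpha


def prob_at_least_one_false_positive(num_hypotheses: int, alpha: float) -> float:
    """Chance that at least one false hypothesis looks significant.

    Assumes the tests are independent, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** num_hypotheses


alpha = 0.005  # the strictest threshold mentioned above (0.5%)
for n in (1, 200, 10_000, 1_000_000):
    expected = expected_false_positives(n, alpha)
    at_least_one = prob_at_least_one_false_positive(n, alpha)
    print(f"{n:>9} hypotheses: ~{expected:.1f} spurious hits expected, "
          f"P(at least one) = {at_least_one:.3f}")
```

Under those assumptions, 200 false hypotheses at the 0.5% level yield about one spurious "finding" on average, and 10,000 yield about fifty. The threshold stays the same; the pile of nonsense just grows with the number of things you test.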

How can we find that causality? It is complex: many influences together trigger each event, and each of these factors is triggered by many influences, including each other. How are we to analyze this?

[Figure: A painfully simplified example: Jay's new web site]

Manufacturing has a tool that could be useful. Quality Function Deployment, and in particular the House of Quality tool, addresses the chains and webs of causality. As Chad Fowler explained yesterday at 1DevDayDetroit, the House of Quality starts with desired product characteristics. It identifies the relative importance of each characteristic; a list of measurable factors that influence the characteristics; and which factors influence which characteristics, how much, and in what direction. Simple multiplication formulas (each characteristic's importance times the strength of a factor's influence on it, added up per factor) then calculate which factors are the most important to the final product.
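
For the curious, one House of Quality can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the web-site characteristics, the factors, the importance numbers, and the 1/3/9 strength scale (a common QFD convention, not something from the talk). The sign of each strength records direction; the ranking uses only the magnitude.

```python
def factor_importance(characteristic_weights, relationships):
    """Score each factor as the sum of (characteristic importance x |strength|).

    characteristic_weights: {characteristic: importance}
    relationships: {(factor, characteristic): signed strength}
    A negative strength means "less of this factor is better"; it does not
    lower the factor's priority, so the absolute value is used for scoring.
    """
    scores = {}
    for (factor, characteristic), strength in relationships.items():
        weight = characteristic_weights[characteristic]
        scores[factor] = scores.get(factor, 0) + weight * abs(strength)
    return scores


# Hypothetical numbers for a web site, invented for this sketch.
characteristic_weights = {"feels fast": 5, "easy to navigate": 3}
relationships = {
    ("page load time", "feels fast"): -9,
    ("page load time", "easy to navigate"): -1,
    ("menu depth", "easy to navigate"): -9,
    ("search box present", "easy to navigate"): 3,
}

importance = factor_importance(characteristic_weights, relationships)
ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
for factor, score in ranked:
    print(f"{factor}: {score}")
```

With these made-up inputs, page load time comes out on top because it touches the most important characteristic most strongly. Change the weights and the ranking changes with them, which is the whole point of writing the relationships down.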

But don't stop there. Take the factors and turn them into the target characteristics in the next House of Quality. Find factors that influence this new, more detailed set of characteristics. Repeat the determination of what factors influence what characteristics and how much.

[Figure: The factors from Iteration 1 become the goals in Iteration 2.]

Iterate until you get down to factors specific enough that they can be controlled in a production facility. Actionable, measurable steps are then apparent, along with a priority for each based on how much it influences the highest-level product characteristics. Meanwhile, you have created a little network of causalities.
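
The iteration can be sketched the same way: the scores that come out of one house become the importance weights going into the next. This is a rough Python illustration, not a full QFD implementation. The factor names, the strengths, and the helper names `score_factors` and `cascade` are all hypothetical.

```python
def score_factors(goal_weights, relationships):
    """One house: sum of (goal weight x |strength|) for each factor."""
    scores = {}
    for (factor, goal), strength in relationships.items():
        scores[factor] = scores.get(factor, 0) + goal_weights[goal] * abs(strength)
    return scores


def cascade(top_level_weights, houses):
    """Run the houses in order; each house's factor scores feed the next house."""
    weights = dict(top_level_weights)
    for relationships in houses:
        weights = score_factors(weights, relationships)
    return weights


# Iteration 1: product characteristics -> measurable factors (invented numbers).
house_1 = {
    ("page load time", "feels fast"): -9,
    ("page load time", "easy to navigate"): -1,
    ("menu depth", "easy to navigate"): -9,
}
# Iteration 2: the factors above are now the goals (invented numbers).
house_2 = {
    ("image bytes per page", "page load time"): 9,
    ("third-party scripts", "page load time"): 3,
    ("top-level menu items", "menu depth"): 3,
}

top_level = {"feels fast": 5, "easy to navigate": 3}
detailed = cascade(top_level, [house_1, house_2])
for factor, score in sorted(detailed.items(), key=lambda item: item[1], reverse=True):
    print(f"{factor}: {score}")
```

The final scores rank controllable, low-level factors by how much they matter to the top-level characteristics. The dictionaries of relationships, taken together, are the little network of causalities.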

This kind of causality analysis is a lot of work. Creating this sad little example made my brain hurt. This analysis is no simple graph of heart attacks vs strawberry consumption across populations. On the upside, Big Data drastically expands our selection of measurable factors. If we can identify causality at a level this detailed, we can get a deeper level of information. We can get closer to truth.

