This is a bit more geeky than we like to get here at Smallbizlabs, but the rise of big data means growing numbers of people are being exposed to statistical data and jargon.
Spurious correlation is a statistical term that describes a relationship between two variables that seem to be related (correlated), but where the relationship happens just by chance or is due to an unseen third variable.
A good example of a spurious correlation between unrelated variables is skirt length theory. This is the belief that stock market trends follow the length of women's skirts. Followers believe that when skirts get shorter, the stock market goes up. When they get longer, it goes down.
Skirt length theory has been proven to be wrong, but since it works some years (it has a 25% chance of being right in any given year) it continues to have followers.
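To see how easily chance alone can produce a convincing-looking relationship, here is a minimal Python sketch. It is only an illustration, not a model of skirt lengths or of any real market: the "market" series and the 200 candidate indicators are pure random noise, and the names, the 10-year window and the seed are all assumptions made for the example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)   # fixed seed so the run is repeatable
YEARS = 10                # hypothetical: ten yearly observations
CANDIDATES = 200          # hypothetical: 200 unrelated "indicators"

# A made-up yearly "market return" series: pure noise.
market = [rng.gauss(0, 1) for _ in range(YEARS)]

# Correlate each random indicator with the market. By construction,
# none of them has any real connection to it.
correlations = [
    pearson([rng.gauss(0, 1) for _ in range(YEARS)], market)
    for _ in range(CANDIDATES)
]

# The strongest correlation found, positive or negative.
strongest = max(correlations, key=abs)
print(f"Strongest correlation among {CANDIDATES} random indicators: {strongest:.2f}")
```

With only a handful of data points and enough candidates to choose from, at least one random series will usually appear strongly related to the market. Picking that one after the fact and calling it a predictor is how theories like this get started.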
An example of a spurious correlation caused by a third variable is the fact that ice cream sales and accidental drownings are highly correlated - they tend to both move up and down in a consistent pattern.
In this case a third variable - temperature - is the driver. Hotter days result in higher ice cream sales and more people swimming and, unfortunately, more people drowning.
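The third-variable effect can be sketched with a small simulation. The Python below uses invented numbers, not real sales or drowning data: temperature drives both series, and neither series affects the other. It then computes a partial correlation, a standard way to ask how related two variables are once a third is held constant. All names and coefficients are assumptions chosen for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(7)    # fixed seed so the run is repeatable
DAYS = 1000               # hypothetical number of simulated days

# Made-up data: temperature drives both series, plus independent noise.
temperature = [rng.uniform(10, 35) for _ in range(DAYS)]
ice_cream = [20 * t + rng.gauss(0, 50) for t in temperature]
drownings = [0.2 * t + rng.gauss(0, 1) for t in temperature]

# Raw correlation between the two outcomes.
raw_r = pearson(ice_cream, drownings)

# Partial correlation: the same relationship with temperature held constant.
r_it = pearson(ice_cream, temperature)
r_dt = pearson(drownings, temperature)
partial_r = (raw_r - r_it * r_dt) / ((1 - r_it ** 2) * (1 - r_dt ** 2)) ** 0.5

print(f"Ice cream vs. drownings:       {raw_r:.2f}")
print(f"Controlling for temperature:   {partial_r:.2f}")
```

In this simulated data the raw correlation is strong, but it shrinks to roughly zero once temperature is accounted for, which is what you would expect when the third variable is doing all the work.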
Why does this matter?
If you make decisions based on spurious correlations or assume statistical significance where there is none, you will quickly get in trouble. Since there is no causal linkage, the relationship can break down at any point. Going back to skirt length theory, if you bet on the stock market using it you're going to get wiped out.
It's beyond the scope of this blog to fully cover this topic, but there are two new books well worth reading that do cover it:
Naked Statistics: Stripping the Dread from the Data does an excellent job of describing the basics around statistical causation and the limitations of statistical analysis in layman's terms.
The Signal and the Noise: Why So Many Predictions Fail - but Some Don't is from NY Times rock star statistician Nate Silver. He successfully called the last presidential election and his book nicely covers the strengths and weaknesses of statistical analysis.
In addition to these books, Hubspot's article A Marketer's Guide to Understanding Statistical Significance covers A/B testing, which is widely used (and misused) in online marketing.
Basic statistical skills and understanding are becoming a business requirement. This does not mean you have to be an expert in the field, but if you don't know a mean from a median it's time to learn.