One of the first lessons we learn as computer scientists is about the "big O" complexity of software. It's used to reason about how long a program takes to run as its input grows. "Big O" theory tells us that it's the order of growth of a system, like O(n^2) or O(n), that matters far more than constant factors. When we look for the parts of a program that dominate its running time, we know not to focus on the tiny parts of the system that are irrelevant to the overall behavior. Measuring the performance of a line of code to the fourth decimal place, when it runs one ten-thousandth as often as the code around it, is wasted effort.
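To make that concrete, here's a minimal Python sketch (illustrative only, not from the article): it times an O(n) pass and an O(n^2) pass over the same data, and the quadratic pass dominates the total no matter how much you polish the linear one.

```python
# Illustrative sketch: the O(n^2) pass dominates total runtime as n grows,
# so micro-optimizing the O(n) pass barely moves the needle.
import time

def linear_pass(values):
    # O(n): one addition per element.
    return sum(values)

def quadratic_pass(values):
    # O(n^2): compare every element with every other element.
    count = 0
    for x in values:
        for y in values:
            if x < y:
                count += 1
    return count

for n in (500, 1_000, 2_000):
    data = list(range(n))

    start = time.perf_counter()
    linear_pass(data)
    linear_time = time.perf_counter() - start

    start = time.perf_counter()
    quadratic_pass(data)
    quadratic_time = time.perf_counter() - start

    # Doubling n roughly doubles the linear time but quadruples the quadratic time.
    print(f"n={n}: linear={linear_time:.5f}s quadratic={quadratic_time:.5f}s")
```

Doubling n roughly doubles the linear pass but quadruples the quadratic one; the order of growth, not the constant, decides where the time goes.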
As introduced in Beyond Data Part 1, we need to apply the same kind of insight to analyzing data, whether it be for business intelligence, decision support, dashboards, or other systems.
Think of your company as a snowball rolling downhill. If you start five balls down a slope, they'll all roll for a while, but some will happen to hit stickier snow and get a little bigger than the others. Those heavier snowballs will have a bit more momentum, so they'll roll a bit farther and sweep up more of the snow, gaining more size in the process, which makes them heavier, which makes them roll faster. You get the idea: the fortunate few snowballs end up much bigger and much faster, leaving the smaller ones stuck near the top of the hill.
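You can watch the same dynamic in a few lines of Python (a toy model of my own, not anything from the article): five identical "snowballs" grow a little each step, with growth proportional to their size relative to the current leader, plus a bit of luck.

```python
# Illustrative toy model: small random advantages compound, so identical
# snowballs end up with very different sizes.
import random

random.seed(7)

sizes = [1.0] * 5  # five identical snowballs at the top of the hill
for step in range(50):
    biggest = max(sizes)
    for i in range(len(sizes)):
        # Growth scales with the ball's size relative to the current leader,
        # plus a little luck: bigger balls sweep up proportionally more snow.
        luck = random.uniform(0.0, 0.1)
        sizes[i] *= 1.0 + luck * sizes[i] / biggest

print([round(s, 2) for s in sorted(sizes, reverse=True)])
# Typically a couple of balls pull well ahead of the rest,
# even though all five started identical.
```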