Mine Data, Find Pattern, Get Rich?
Among financial academics, chartists tend to be regarded as quacks. But a lot of the Big Data people are exactly like them. They say, “We are just going to stare at the data and look for patterns, and then act on them when we find them.” In short, there is very little real science in what we call “data science,” and that’s a big problem.
While I believe that Big Data is a serious trend and will, over time, have a profound impact, I do think Prof. Fader has a point.
In conversations at conferences, I sometimes hear this viewpoint: get a bunch of data, “run data mining” on it and serve up the patterns that are found. There’s almost a mystical belief that if you can only unleash powerful-enough algorithms on the data, great business truths will be revealed.
Experienced practitioners know that this is a far cry from what actually happens. Data prep takes an unreasonably long time, the first several rounds of output from algorithms make very little sense, and very often the findings are either flukes or unactionable.
To get to something that’s actually useful, a considerable amount of business judgment, careful thinking about the context of the problem, and (of course) data science virtuosity are needed.
Does this mean that there’s no value to trawling for patterns in Big Data? Not quite.
If we consider the discovery of patterns as the starting point for analysis, and not the final stage, there’s some benefit. Once in a while, you may stumble on a pattern that’s real, actionable and previously unknown.
The catch is that you need to test the pattern in the real world (i.e., outside your analysis sandbox) to see if it is real. This may involve running an experiment to make sure you aren’t confusing correlation with causation, or that you aren’t fooled by randomness.
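To make “fooled by randomness” concrete, here is a minimal sketch of one way to check the result of such an experiment: a permutation test on the difference in means between a control group and a treated group. The function name, the data, and the choice of test are illustrative assumptions, not a prescribed method.

```python
import random


def permutation_p_value(control, treated, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the estimated probability of seeing a gap at least as large
    as the observed one if group labels were assigned purely by chance.
    """
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(treated) - mean(control))
    pooled = list(control) + list(treated)
    n_control = len(control)

    extreme = 0
    for _ in range(n_permutations):
        # Reassign the group labels at random and recompute the gap.
        rng.shuffle(pooled)
        gap = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if gap >= observed:
            extreme += 1

    # The add-one correction keeps the estimate from being exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical order values from a small experiment (made-up numbers).
control = [20, 22, 19, 24, 21, 23, 20, 22]
treated = [23, 25, 22, 27, 21, 26, 24, 25]
print(permutation_p_value(control, treated))
```

A small value suggests the gap is unlikely to be a fluke. It says nothing about causation unless customers were assigned to the two groups at random, which is the point of running an experiment in the first place.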
The other catch is that machine-learning algorithms can spew out a gazillion patterns before you can say “stochastic gradient”! Determining which patterns are worth testing in the real world requires judgment as well.
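A rough sketch of why this matters: if you screen enough candidate features, some will look related to the outcome even when everything is pure noise. The simulation below uses made-up parameters (row count, feature count, correlation threshold) and a simple holdout check. It is an illustration under those assumptions, not a recommended screening procedure.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def mine_noise(n_rows=200, n_features=500, threshold=0.2, seed=0):
    """Count spurious 'patterns' found in noise, and how many survive a holdout.

    Every feature and the target are independent random draws, so any
    correlation that clears the threshold is a fluke by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    half = n_rows // 2

    found = 0
    survived = 0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        # Discovery: look for a pattern in the first half of the rows.
        r_discovery = pearson(feature[:half], target[:half])
        if abs(r_discovery) <= threshold:
            continue
        found += 1
        # Validation: require the same direction and strength on unseen rows.
        r_holdout = pearson(feature[half:], target[half:])
        if abs(r_holdout) > threshold and r_holdout * r_discovery > 0:
            survived += 1
    return found, survived


found, survived = mine_noise()
print(f"patterns found in noise: {found}, surviving the holdout: {survived}")
```

With these settings the threshold sits at roughly two standard errors, so on the order of one noise feature in twenty should clear it in the discovery half, and very few of those should repeat on the held-out rows. A holdout check is cheap triage. It narrows the list, but it does not replace judgment about which surviving patterns deserve a real-world test.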
In short, patterns from Big Data have value, but it takes a lot of hard work, business judgment, and problem knowledge to translate those patterns into profit.