Beware The Hype Over Big Data Analytics

Seeking Alpha recently ran a piece titled "Beware The Hype Over Big Data Analytics" which attempted to present a counter-argument to the hype surrounding Big Data and analytics. It concludes with "I would advise investors to be very, very wary of buying companies that are big on the data analytics hype."

I'd like to take a look at the author's arguments and see if there is any merit to them:

1. Collecting and analyzing large quantities of data is not really a new phenomenon

This is true. However, the ability to store and analyze huge amounts of data at an economical price IS very new. Analyzing huge amounts of data used to require server farms that cost millions of dollars in upfront capital outlays for hardware and software. Open source software (like Hadoop and Cassandra) and on-demand computing power have removed this barrier.

In addition, not that many organizations actually HAD large quantities of data before. The amount of data being generated and stored by organizations has grown exponentially over the last several years, and will continue to do so. As a result, the number of organizations that NEED this capability is growing like wildfire.

When you combine plummeting costs with rocketing demand, the fact that there have been a handful of large companies doing this in the past becomes moot.

2. Organizations have been promised 'Moneyball' type results from data analytics, but have not gotten them.

The argument here goes that organizations that have been storing data for a long time have had a tough row to hoe extracting value from it. This is true.

However, this is exactly where the research is being done: making the interfaces easier to use, making the data more intuitive, and collecting context around the data to make it richer. This is clearly a work in progress, but progress is being made every day, startups are attacking it head-on, and there is definitely a light at the end of the tunnel. Recent advances in user interfaces and natural language processing, for example, are making it easier than ever to explore data and unlock the value it holds.

When you look at data as a natural resource, something that must be mined, processed, and refined before it can be made useful, this makes much more sense. You can't wear wool that you've just sheared from a sheep, and you can't get value by looking at raw database tables.

3. Companies can do it in-house

While I fail to see how this is a drawback, the author claims that analytics firms over-charge and under-deliver for skills that aren't especially unique and can easily be done in-house. As someone who has worked in this field for a long time, I agree and disagree.

There are many analytics problems which can be solved perfectly well by someone trained in the finer points of Microsoft Excel. It's a wonderful tool.

However, in an article about "Big Data" analytics, I find this argument disingenuous. The hard part is dealing with LOTS of data: the amounts of data that will make Excel crash and burn when it even sniffs them. Dealing with that much data IS very hard and requires a diverse skill set that an Excel jockey does not possess. Things like database indexing, storage and network optimization, multi-layer caching, the list goes on. If it weren't hard, the whole Big Data "fad" (if you want to call it that) would have come and gone long ago.

4. Effective analytics requires in-depth domain knowledge

Again, I fail to see how this has any bearing on whether Big Data Analytics itself is good or bad. 

That said, the author is correct about this. No company or product will ever be able to take your data and deliver the exact insights you're looking for. This is simply because much of the work in analytics is exploring for insights that help answer larger questions, or bringing interesting things to your attention that you didn't even know were interesting.

However, good tools will make this exploration and discovery process far easier and more enjoyable. They will also democratize the process so that business users can actively participate.

Farming out your analytics to an outside party is certainly a bad idea. Analytics in general, however, can revolutionize a company or even entire industries.

5. Insights derived from larger data sets are no more valuable than those derived from smaller data sets.

This is patently false, and this is where the author displays his gross misunderstanding of big data analytics.

The fact is, all else being equal, the more data you have to work with, the more accurate predictions tend to become. This is a core tenet of data mining and predictive analytics.
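
One narrow, textbook version of this can be checked with simple arithmetic: when you estimate an average from n independent samples, the standard error of that estimate is sigma / sqrt(n), so it shrinks as n grows. The sketch below only illustrates that one formula. The standard deviation of 10 is a made-up value, and real predictive models are messier than a sample mean.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean for n independent samples with std dev sigma."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


# Hypothetical spread of the quantity being estimated (an assumption for illustration).
SIGMA = 10.0

# Each 100x increase in data shrinks the uncertainty by 10x.
for n in (100, 10_000, 1_000_000):
    print(f"n={n:>9,}  standard error={standard_error(SIGMA, n):.3f}")
```

Note the diminishing returns: under this formula, cutting the error in half takes four times as much data, which is part of why cheap storage and compute matter so much.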

Not just that, but adding more different TYPES of data to the mix adds just TONS of value. That is because more data adds CONTEXT to your existing data. Data that exists in a vacuum can be not just uninteresting and uninformative but actually misleading.

If the data tells you that sales for a particular product are down, that's interesting. If you add more data and discover that sales for that product are down because it's constantly out of stock due to a bad supplier, that is something you can act on, and it can dramatically affect your company's bottom line.
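
To make that concrete, here is a minimal sketch of the sales example. Every number and name in it is invented for illustration (the `daily_sales` and `in_stock` records, the `average_sales` helper); the point is only that the same sales figures tell a different story once stock status is layered on top.

```python
# Hypothetical daily records for one product; all numbers are invented.
daily_sales = {"Mon": 40, "Tue": 0, "Wed": 0, "Thu": 38, "Fri": 0}
in_stock = {"Mon": True, "Tue": False, "Wed": False, "Thu": True, "Fri": False}


def average_sales(sales: dict[str, int], stock: dict[str, bool], available: bool) -> float:
    """Average units sold on days where the stock status equals `available`."""
    days = [day for day in sales if stock[day] == available]
    if not days:
        return 0.0
    return sum(sales[day] for day in days) / len(days)


# Sales data alone: the product looks weak.
overall = sum(daily_sales.values()) / len(daily_sales)

# Sales data plus inventory data: the product sells fine when it is available.
when_in_stock = average_sales(daily_sales, in_stock, available=True)
when_out_of_stock = average_sales(daily_sales, in_stock, available=False)

print(f"overall average: {overall:.1f}")
print(f"average when in stock: {when_in_stock:.1f}")
print(f"average when out of stock: {when_out_of_stock:.1f}")
```

Looked at alone, the overall average suggests weak demand. Split by stock status, the same data points to a supply problem instead.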

6. Cross-functional data is not useful when combined

Again, this seems like the author laboring to make a point that's not there.

There are some data sets that are just so universally useful that they can add value to just about any other data. For example, the US Census data set provides information about demographics, employment, income, education--the list goes on. This can be layered on top of just about any database that concerns people (which is a LOT of them) to add value instantly. When you intersect that with data sets that provide even more data on one of those topics--foreclosure data, for example--the information you have about the people in your database gets even richer. This is the entire premise of the Fusion Project.
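
As a rough sketch of what "layering" means in practice, the snippet below merges region-level attributes onto customer records by a shared key. The records, the region codes, the field names, and the `enrich` helper are all hypothetical, and none of the values come from the real Census. A real project would also have to deal with matching keys such as ZIP codes or census tracts.

```python
# Hypothetical customer records and region-level demographics; none of these
# values come from the real US Census.
customers = [
    {"name": "Customer A", "region": "R1"},
    {"name": "Customer B", "region": "R2"},
    {"name": "Customer C", "region": "R9"},  # no demographic data for this region
]
demographics = {
    "R1": {"median_income": 50_000, "foreclosure_rate": 0.02},
    "R2": {"median_income": 80_000, "foreclosure_rate": 0.01},
}


def enrich(records, lookup, key="region"):
    """Return copies of records with lookup fields merged in where the key matches."""
    enriched = []
    for record in records:
        # Records whose key is missing from the lookup are kept unchanged.
        extra = lookup.get(record[key], {})
        enriched.append({**record, **extra})
    return enriched


for row in enrich(customers, demographics):
    print(row)
```

The original records are left untouched, and customers in regions with no match simply pass through without the extra fields, so the enrichment can be added or removed without disturbing the underlying database.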

And again, more information about something equates to more context around the data you already have. This context not only makes predictions more accurate but sends them off in new directions that you may never have even thought of on your own.

Summary

It's only natural to want to push back against something that gets as much hype as big data analytics. However, sometimes where there's smoke there's fire, and there is definitely a reason why this area is so hot right now. Levelheaded analysis is always appreciated, but we need to be careful not to be so reactionary that we completely miss something very big.

Jason