Big Data: Why most businesses are sat on the sidelines
Big Data gets a lot of hype and attention, but at heart many of the issues around it are the latest iteration of large-scale business intelligence.
That's the argument put forward by Anurag Tandon, responsible for marketing around Big Data and analytics capabilities at BI champion MicroStrategy.
"Too often, we hear about Big Data in the context of large volumes of data," says Tandon. "Retailers sifting through millions and billions of point-of-sale records or banks looking at transactions in the millions and billions and whatnot. A lot of the time, those applications that are being discussed could be construed as just large-scale BI."
"You have data in your company," he adds. "There might be more sources of information available from the government as the governments open up more. There's data coming in from the financial sector. There's business and consumer studies and more and more of them, surveys and polls and things like that.
"All of that data is useful for business operations top and bottom line performance - just the traditional things that BI has been used for."
There are also newer sources: the so-called 'digital exhaust', data that people involuntarily leave behind in things like the online clickstream.
"So not only transaction information for ecommerce, but also what might be in these shoppers' shopping cart that's discarded or the items that people looked at in succession," explains Tandon. "All of these trends that are showing value, showing how the consumer thinks, are becoming more and more useful."
Web 2.0 phenomena also play their part, such as social media and the content generated there in posts, tweets, blogs, pictures, videos and so on.
Tandon describes this as: "content not involuntarily generated but voluntarily generated, voluntarily shared, that provides information on how the customer or consumer expresses themselves, how do they state their preferences, what words do they use, what sentiment do they express?
"Things like that are valuable for customer engagement, customer service, brand management and so on. When one shows behavioural data, the other one shows stated expression of preference."
The five Es
Despite the high profile of and hype around Big Data, most organisations have yet to get their Big Data strategies in place. "The reality becomes clear if you look at where the companies are in this journey of Big Data," says Tandon.
"If you look at five Es - the Evading, Envisioning, Evaluating, Executing, Expanding phases of a company - approximately 40% or so are still sitting on the sidelines. They're evaluating what's going on. What are the different use cases? Where should I invest? Where should I get skill in order to take on these Big Data challenges because a lot of the companies may not necessarily have the skill to take on that challenge."
A solution for many companies is to dump whatever data they collect into some form of data sink, such as a Hadoop system, and figure out later what questions to ask. That way there is no need to structure the data upfront and invest in ETL (extract, transform, load) processes, says Tandon.
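The 'store now, ask later' idea can be illustrated with a minimal Python sketch. It is not how any particular Hadoop deployment works: the event records, the field names and the `abandoned_cart_items` question are all hypothetical, and a plain list of JSON strings stands in for the data sink. The point it tries to show is that structure is applied only when a question is asked, not when the data is stored.

```python
import json
from collections import Counter

# Hypothetical raw events, kept exactly as they arrived (one JSON document each).
# No schema was agreed upfront, so records carry different fields.
RAW_SINK = [
    '{"user": "u1", "event": "view", "item": "kettle"}',
    '{"user": "u1", "event": "add_to_cart", "item": "kettle"}',
    '{"user": "u2", "event": "add_to_cart", "item": "toaster"}',
    '{"user": "u2", "event": "purchase", "item": "toaster", "price": 25.0}',
    '{"user": "u3", "event": "add_to_cart", "item": "kettle"}',
    'not valid json',
]


def read_events(lines):
    """Parse records at query time, skipping anything malformed."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            yield record


def abandoned_cart_items(lines):
    """Count items added to a cart but never purchased by the same user."""
    carted, purchased = set(), set()
    for event in read_events(lines):
        key = (event.get("user"), event.get("item"))
        if event.get("event") == "add_to_cart":
            carted.add(key)
        elif event.get("event") == "purchase":
            purchased.add(key)
    return Counter(item for _, item in carted - purchased)


print(abandoned_cart_items(RAW_SINK))
```

If the question changes in two weeks' time, only the query function changes; the stored records are untouched, which is the saving over an upfront ETL step that this approach is after.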
He cites the example of Netflix. "They do this for their streaming service where they load all of the data into a Hadoop system and then enable their analysts, their data scientists, to look at that data at will and at all fashions," he explains. "They use MicroStrategy on top, extract some of the data, do the analysis that might be short-lived analysis on how consumers are using their service, and are then able to tweak their service and then collect more data and then analyse that later.
"In this context, the reports and the analyses that they're conducting are very short-lived and changed from time-to-time. They don't necessarily want the expense of creating an ETL cycle from their Hadoop system to their warehouse and then go through that expensive process only to figure out that the analysis has changed in two weeks' time and then having to do that all over again. So they're finding a lot of value in going directly into their data sink, their Hadoop system, and then asking questions."