The five biggest Big Data myths debunked
What do we really understand about Big Data? The general consensus is that quality and value take precedence over quantity, but how well placed are most marketers to analyse large datasets successfully? And is the available technology really advanced enough?
Myth 1: Everyone else is ahead of the curve in adopting Big Data tech
“Interest in Big Data technologies and services is at a record high, with 73% of the organisations Gartner surveyed in 2014 investing or planning to invest in them. But most organisations are still in the very early stages of adoption — only 13% of those we surveyed had actually deployed these solutions.”
Figure: Gartner survey, Big Data adoption 2013/2014
“The biggest challenges that organisations face are to determine how to obtain value from Big Data, and how to decide where to start. Many organisations get stuck at the pilot stage because they don't tie the technology to business processes or concrete use cases.”
Myth 2: With so much data, you can ignore little data flaws
IT leaders believe that the huge volume of data that organisations now manage makes individual data quality flaws insignificant due to the "law of large numbers."
"In reality, although each individual flaw has a much smaller impact on the whole dataset than it did when there was less data, there are more flaws than before because there is more data," said Ted Friedman, vice president and distinguished analyst at Gartner.
"Therefore, the overall impact of poor-quality data on the whole dataset remains the same. In addition, much of the data that organisations use in a Big Data context comes from outside, or is of unknown structure and origin. This means that the likelihood of data quality issues is even higher than before. So data quality is actually more important in the world of Big Data."
Myth 3: Big Data tech eliminates the need for data integration
“The general view is that Big Data technology — specifically the potential to process information via a "schema on read" approach — will enable organisations to read the same sources using multiple data models. Many people believe this flexibility will enable end users to determine how to interpret any data asset on demand. It will also, they believe, provide data access tailored to individual users.
“In reality, most information users rely significantly on "schema on write" scenarios in which data is described, content is prescribed, and there is agreement about the integrity of data and how it relates to the scenarios.”
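The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration, not how any particular Big Data product works; the schema, field names and sample records are all hypothetical. With schema on write, records are checked against an agreed description before they are stored. With schema on read, the raw text is stored untouched and each reader supplies its own interpretation.

```python
import json

# Hypothetical agreed schema: field name -> required Python type.
AGREED_SCHEMA = {"customer_id": int, "spend": float}


def write_with_schema(store, record):
    """Schema on write: reject any record that does not match the agreed schema."""
    if set(record) != set(AGREED_SCHEMA):
        raise ValueError(f"unexpected fields: {sorted(record)}")
    for field, expected_type in AGREED_SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise ValueError(f"{field} must be {expected_type.__name__}")
    store.append(record)


def read_with_schema(raw_lines, interpret):
    """Schema on read: parse stored raw text and apply the reader's own interpretation."""
    return [interpret(json.loads(line)) for line in raw_lines]


# Raw data kept exactly as it arrived, inconsistencies included.
raw_lines = [
    '{"customer_id": "17", "spend": "12.50"}',
    '{"customer_id": 18, "spend": 3}',
]

# Two readers impose different models on the same raw source.
finance_view = read_with_schema(raw_lines, lambda r: float(r["spend"]))
marketing_view = read_with_schema(raw_lines, lambda r: str(r["customer_id"]))

# The curated store only accepts records that match the agreement.
curated = []
write_with_schema(curated, {"customer_id": 17, "spend": 12.5})
```

The flexible readers cope with the inconsistent raw records, but each one has to decide for itself what the data means. The curated store would reject the first raw record outright, which is the agreement about integrity that the quote says most users rely on.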
Myth 4: Don’t bother with data warehouses for advanced analytics
“Many information management (IM) leaders consider building a data warehouse to be a time-consuming and pointless exercise when advanced analytics use new types of data beyond the data warehouse.
“The reality is that many advanced analytics projects use a data warehouse during the analysis. In other cases, IM leaders must refine new data types that are part of Big Data to make them suitable for analysis. They have to decide which data is relevant, how to aggregate it, and the level of data quality necessary — and this data refinement can happen in places other than the data warehouse.”
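The three refinement decisions named above (which data is relevant, how to aggregate it, and what level of quality is necessary) can be made concrete with a small Python sketch. It is only an illustration: the event fields, the 90% completeness threshold and the function name are assumptions, not anything Gartner prescribes.

```python
from collections import defaultdict


def refine(events, relevant_types, min_complete_share=0.9):
    """Filter, quality-check and aggregate raw events into spend per customer."""
    # Decision 1: which data is relevant.
    relevant = [e for e in events if e.get("type") in relevant_types]

    # Decision 2: the level of data quality necessary (assumed threshold).
    complete = [
        e for e in relevant
        if e.get("customer_id") is not None and e.get("amount") is not None
    ]
    if relevant and len(complete) / len(relevant) < min_complete_share:
        raise ValueError("data quality below the agreed threshold")

    # Decision 3: how to aggregate it.
    totals = defaultdict(float)
    for e in complete:
        totals[e["customer_id"]] += e["amount"]
    return dict(totals)


# Hypothetical raw events mixing relevant and irrelevant types.
events = [
    {"type": "purchase", "customer_id": 1, "amount": 10.0},
    {"type": "purchase", "customer_id": 1, "amount": 5.0},
    {"type": "page_view", "customer_id": 2, "amount": None},
    {"type": "purchase", "customer_id": 2, "amount": 7.5},
]
spend_per_customer = refine(events, {"purchase"})
```

Nothing in this sketch requires a data warehouse, which is the point of the quote: the refinement can run wherever the data happens to sit, as long as someone has made those three decisions.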
Myth 5: Data lakes will replace data warehouses
“Vendors market data lakes as enterprise-wide data management platforms for analysing disparate sources of data in their native formats. In reality, it's misleading for vendors to position data lakes as replacements for data warehouses or as critical elements of customers' analytical infrastructure. A data lake's foundational technologies lack the maturity and breadth of the features found in established data warehouse technologies.”
Chris was an Editor at MyCustomer from 2014 to 2022. He is a practised editor, having worked as a copywriter for creative agency Stranger Collective from 2009 to 2011 and subsequently as a journalist covering technology, marketing and customer service from 2011 to 2014 as editor of Business Cloud News.