The Buzz on Big Data

Recently, the World Economic Forum declared Big Data an economic asset, in the manner of currency or gold. Less than a year ago, the McKinsey Global Institute published a report over one hundred pages long on Big Data. And in July, the U.S. National Bureau of Economic Research will hold a workshop on the topic. Suffice it to say, Big Data is here to stay.

Whether you’re a large enterprise in the manufacturing industry or a non-profit organization in the public health sector, one thing is clear: Big Data is going to transform the way you work, if it hasn’t already. But why is that? And why now? In this article, we examine the factors contributing to the creation of massive amounts of data, the possibilities for productive use, and the challenges involved.

It’s all relative

There are as many definitions of Big Data as there are articles on the topic. This is because the parameters of Big Data are constantly changing. What we think of as a huge amount of data today might seem entirely manageable in just a few years’ time as our technological ability to store data increases.

The McKinsey Global Institute report on Big Data therefore puts the definition in relative terms: Big Data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. This definition is intentionally subjective and does not give a minimum number of terabytes because, as technology advances over time, the size of datasets that qualify as Big Data will also increase. McKinsey also notes that this definition can vary by sector, depending on what kind of software tools and datasets are common in a particular industry.

So how does this rather abstract definition apply to the current state of data in the world? According to the research firm IDC, in 2007, the amount of digital data created in a year exceeded the world’s data storage capacity for the first time. The birth of Big Data on a global scale, if you will. In this study, IDC also found that data is growing at a much faster rate – 50 percent per year, or more than doubling every two years – than data storage capacity is expanding. In short, Big Data is only getting bigger. Let’s take a quick look at where this data is coming from.
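As a rough check on the growth figure cited above, here is a minimal Python sketch that compounds a 50 percent annual growth rate over two years. The function name and the baseline of 1.0 are illustrative assumptions and do not come from the IDC study.

```python
def compound_growth(start: float, annual_rate: float, years: int) -> float:
    """Return the value of `start` after `years` of compounding at `annual_rate`."""
    return start * (1.0 + annual_rate) ** years


# Relative data volume, taking today's volume as 1.0 (an arbitrary baseline).
two_year_factor = compound_growth(1.0, 0.50, 2)
print(two_year_factor)  # 2.25
```

At 50 percent per year, the volume after two years is 2.25 times the starting volume, which is why the rate can be described as more than doubling every two years.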

What’s in a name?

Businesses, governments, individuals, and even machines are all contributing to the Big Data phenomenon. Companies maintain vast amounts of transactional data, gathering information about their customers, suppliers, and operations. And the same is true for the public sector. Most countries in the world manage enormous datasets containing census data, health indicators, and tax and expenditure information. Furthermore, online and mobile financial transactions, social media traffic, and GPS coordinates – the kinds of activities that millions of people do on their smartphones every day – generate over 2.5 quintillion bytes of data daily.

But it isn’t always human hands behind Big Data. More and more, machine-to-machine communication – also called the Internet of Things – is responsible for the creation of data. Digital sensors are installed in shipping crates to track movement along a route and send the information to transportation companies. And sensors in electrical meters measure energy consumption at regular intervals and report the information to utility companies. Such scenarios are becoming increasingly common. There are more than 30 million networked sensor nodes present in the transportation, automotive, industrial, utilities, and retail sectors today, and the number of these sensors is increasing at a rate of more than 30% per year.
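To put the sensor figures in perspective, the following Python sketch estimates how quickly the installed base would double. It assumes a constant 30 percent annual growth rate and a starting count of exactly 30 million nodes. Both are simplifications, since the figures above are lower bounds ("more than"), and the function and variable names are hypothetical.

```python
import math


def doubling_time(annual_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


# Assumed inputs: the lower bounds quoted in the text, held constant.
sensor_nodes_today = 30_000_000
annual_growth = 0.30

years_to_double = doubling_time(annual_growth)
sensors_in_five_years = sensor_nodes_today * (1.0 + annual_growth) ** 5

print(f"Doubling time: {years_to_double:.1f} years")
print(f"Projected nodes in five years: {sensors_in_five_years:,.0f}")
```

Under these assumptions, the number of sensor nodes would double in a little over two and a half years and exceed 110 million within five years.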

The prevalence of Big Data, however, is irrelevant unless it can be used to gain new insights in business, increase efficiency in government, and improve the world we live in. In order to achieve these results, data must first be collected and analyzed. Innovations in cloud computing are making it possible to store ever-increasing amounts of data at lower costs. Advances in natural language processing, sentiment analysis, pattern recognition, and machine learning (as in Siri) have opened up new possibilities in the analysis of unstructured data. All of these factors – the generation of ever-increasing amounts of digital data, innovations in data storage, and more nuanced analytics tools – are converging today to make Big Data a valuable, lasting part of our economic landscape.

Big Data in use

While Big Data may just now be entering the mainstream consciousness, some organizations have been benefiting from the analysis of Big Data all along. Amazon is the classic example. For years, the online retailer has parsed customer information, purchase history, and other data to power its scarily accurate recommendation system. Online dating services analyze Big Data to determine potential matches among their users, and sports teams put digital information to use in their recruitment strategies (see our article, “On Mitnick and Moneyball”). On the whole, studies show that companies employing data-driven decision-making achieve productivity gains that are 5-6% higher than other factors can explain.

And in the public sector, Big Data creates an opportunity to better identify needs, provide services, and predict and prevent crises among low-income populations. A new initiative by the United Nations, for example, is using natural-language processing software to predict job losses and spending reductions in at-risk regions. The initiative can then help assistance programs deal with these issues in advance. During the cholera outbreak in Haiti in 2010 and 2011, researchers tracked the movement of people from the affected zones to new areas via data generated by SIM cards. This information helped aid organizations prepare for new outbreaks.

In its report, the McKinsey Global Institute focuses on five broad areas where Big Data can add value to organizations across diverse industries and markets. These are summarized below:

  • Creating transparency: Making Big Data more easily accessible to all relevant parties supports more efficient processes. In manufacturing, for example, integrating data from R&D, engineering, and manufacturing units can significantly cut time to market and improve quality.
  • Conducting what-if scenarios and further experimentation: Companies already create and store data on nearly every transaction. If they use this data to analyze variability in performance, such as product inventories or employee sick days, they can gain insights into the root causes of operational success.
  • Segmenting customer populations: Many companies in the marketing and retail sectors already tailor their offers and promotions to specific customer segments. Big Data can make that process even more nuanced.
  • Making data-based decisions: By basing decisions on huge amounts of data – far greater than individuals can analyze themselves – companies are able to minimize risks and unearth new insights. One example of data-based decision-making can be seen in Oversight Systems’ continuous analytics software, which identifies fraudulent or unauthorized behavior by monitoring and evaluating each and every transaction.
  • Innovating new business models: Together, Big Data and real-time analytics are enabling companies to develop entirely new business models. Insurance companies, for example, can use real-time location data to price property and casualty insurance policies based on where and how people drive their cars.

More analysts needed

Before Big Data can really transform the way we work and live, there are a few issues that must be dealt with. One of the most pressing problems is that there is a dearth of analysts with the necessary experience and training. The McKinsey Global Institute estimates that the U.S. alone needs 140,000 to 190,000 more workers with deep analytical expertise and 1.5 million more data-literate managers.

There is also a need for governments to create and enforce official policies on Big Data. At the top of the list are privacy and security. Personal data in health records, for example, has enormous potential for human benefit. Analysis of this data could lead to more user-centric insurance policies or new medical treatments. However, many individuals feel that this information is particularly sensitive. And what is to stop some companies from abusing the information? There is a clear need for official policies that take into account both personal privacy and the potential benefit. In addition, measures must be taken to protect sensitive customer data from security breaches.

Furthermore, Big Data is particularly problematic when it comes to intellectual property and liability. Since data can be easily copied and used simultaneously, how do you decide who owns a piece of data and what rights come with it? If a piece of data is used irresponsibly or inaccurately, who is liable?

At its best, Big Data is a powerful tool to combat poverty, crime, and pollution. It can lead to incredible business insights, cost savings, and new customer-focused products. At its worst, Big Data can be manipulated to support false discoveries that are unfair or discriminatory, or it can be used to abuse private information. And as Big Data becomes increasingly valuable for businesses, governments, and individuals, the need for official regulations and policies will only increase.

To read more about Big Data, see these reports: