May 24, 2012, 9 a.m.: The conference room at the Hilton City Hotel in Munich was brimming with representatives from a slew of well-known international companies – Google, SAP, Intel, Sky Communications, The Boston Consulting Group, Bosch, Springer, and more. Around 150 attendees in total had come to learn about the latest trending topic in the IT industry: big data. Even more enticing, the symposium, put on by Münchner Kreis, the supranational non-profit organization for communications research, promised to divulge “how to turn big data into new knowledge.”
Speakers at the conference included business professionals from the likes of IBM, Siemens, and BMW, as well as academic experts from some of Germany’s premier universities. With speakers from such diverse backgrounds, it’s no wonder that nearly every one of them began by defining big data.
The Four Vs
As McKinsey recognized in the exhaustive report it published on the subject last year, what constitutes big data is necessarily relative to the situation: What is a typical server’s storage capacity at this moment in time? How big is a typical dataset in that industry? The report defines big data simply as a dataset that is too large for typical database software tools to store and analyze. As technology advances, of course, the size of datasets that qualify as big data will also increase.
Event speaker Prof. Dr. Volker Markl, from the Technische Universität Berlin, agreed. He noted that what we call big data today will soon fit into a typical server’s main memory, thanks to innovations like SAP HANA. He and his colleagues are now talking about “medium data” – a large, but not unmanageable dataset. Transactional data is an example of what could be termed “medium data”.
Markl also pointed out that it’s not just the volume of data that makes it challenging to analyze. It’s also the velocity at which you need the analysis; the variability of the format (structured versus unstructured data); and the veracity of the data: is it accurate, and can you trust it? Volume, velocity, variability, veracity: the four Vs.
Given these challenges, why are so many companies and organizations so optimistic about big data?
The Competitive Advantage
Studies reveal that companies making data-driven decisions show productivity gains five to six percent higher than other factors can explain. As Christian Klezl, vice president and cloud leader at IBM, said, that is the difference between staying in the market and closing shop. Klezl also shared that one in three organization leaders say they make decisions based on information they don’t have or don’t trust, and 60% of CEOs agree they have more data than they can effectively use.
Big data has the potential to give companies a decisive competitive advantage over their peers, but most are still struggling to sort through the excess of information and capture that potential. The burden on big data analysis is to deliver the relevant information at the right time. Next-generation BI tools will have to follow through on that mandate. According to Markl, the future of BI is all about correlating data from different sources – not just data warehouses, but also web services, the corporate intranet, and the Internet. Let’s say you want to find out the average age of CEOs by country and industry. One source lists Bill Gates as the CEO of Microsoft; another source mentions that Bill Gates is 56 years old. The next-generation BI tools will be able to aggregate and correlate such information to give you the right answer.
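The CEO example can be sketched in a few lines of Python. This is only an illustration of joining two sources on a shared key and then aggregating, not a description of how any actual BI tool works. Apart from the Bill Gates entry taken from the example above, every record, name, and field (including the `average_ceo_age` helper and the country and industry labels) is a hypothetical placeholder.

```python
from collections import defaultdict
from statistics import mean

# Source A (hypothetical): who leads which company, plus country and industry.
company_records = [
    {"ceo": "Bill Gates", "company": "Microsoft", "country": "US", "industry": "Software"},
    {"ceo": "Example Person A", "company": "Example Soft Inc.", "country": "US", "industry": "Software"},
    {"ceo": "Example Person B", "company": "Example Motoren AG", "country": "DE", "industry": "Automotive"},
    {"ceo": "Example Person C", "company": "Example Bank", "country": "DE", "industry": "Finance"},
]

# Source B (hypothetical): ages keyed by person name. Only the first entry
# comes from the article; "Example Person C" is deliberately missing.
person_ages = {
    "Bill Gates": 56,
    "Example Person A": 44,
    "Example Person B": 61,
}


def average_ceo_age(companies, ages):
    """Join both sources on the CEO's name, then average age per (country, industry)."""
    grouped = defaultdict(list)
    for record in companies:
        age = ages.get(record["ceo"])
        if age is None:
            # No match in the second source: skip rather than guess an age.
            continue
        grouped[(record["country"], record["industry"])].append(age)
    return {key: mean(values) for key, values in grouped.items()}


result = average_ceo_age(company_records, person_ages)
for (country, industry), avg_age in sorted(result.items()):
    print(country, industry, avg_age)
```

Matching on a plain name string is the weakest part of this sketch. Real sources spell names differently and contain namesakes, which is exactly where the veracity concern raised earlier comes in.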
Now apply this example to questions like: What is your customer satisfaction by region? How has sentiment toward your product changed over the past year? This information could play an important part in achieving that five to six percent gain in productivity over the competition. And it’s not just large enterprises that stand to benefit. By storing data and carrying out these types of analyses in the cloud, SMEs can also tap into the big data phenomenon.
“It’s not a tech topic”
Big data has probably generated such widespread enthusiasm because it affects essentially every industry – from financial services to healthcare to alternative energy – as well as individuals and organizations outside the corporate world.
IBM, for example, recently collaborated on a project with the city of Dublin to optimize bus routes and schedules. As Klezl explained, the company gathered geolocation data from Dublin’s fleet of buses and studied commuter patterns to create more efficient schedules. They also installed screens at bus stops that show real-time updates of bus locations and arrivals. So riders are now able to make an informed decision on whether they want to wait fifteen minutes for the bus, or walk a mile to their destination.
Dr. Volker Rieger, of Detecon International GmbH in Bonn, Germany, gave other examples of innovative big data applications developed by SMEs and startups. Bid my Bill, a U.S.-based startup, has introduced a system in which telecommunications providers bid for a user’s phone contract based on an analysis of the individual’s phone usage. Another U.S. company, Glympse, provides an app that lets users share their location with friends and family in real time. You can literally “follow” your friends as they run errands, and easily organize a rendezvous.
Thanks to big data, the potential to improve existing processes and offer new services is vast in any number of industries. Here are just a few examples:
- Utilities: analyze data collected from smart meters to improve energy supply and demand forecasting
- Healthcare: optimize health insurance policies and improve authorization decisions
- Alternative energy: analyze weather data to optimize the placement of wind turbines
- Climate: predict future water demand in a region and correlate with expected rainfall to optimize water usage
- Crime: analyze a variety of data sources to identify problem areas in a city to better organize police presence
- Finance: use event processing to manage risk in trading
- Manufacturing: use sensors on vehicles to plan preventive maintenance
- Telecommunications: analyze the location and frequency of dropped calls in real time to reach out to and retain the affected customers
The law has to catch up
While the speakers were notably excited about the future of big data, one aspect gave them pause: data privacy and protection. Technology is advancing rapidly, and legislation is struggling to keep up. In terms of data, there are many open questions. Who exactly owns the data in a database? The company that collected the data, or the cloud service provider that is hosting it? Customers would probably like to say that they own the data. But unless the information can be tied back to an individual identity, it’s up for grabs. And what about policies that vary from country to country?
This part of the conference presented many questions and gave few answers. One thing is clear: As data becomes more and more valuable to companies, individuals, and even countries, the question of ownership is crucial. The next developments in the big data sphere will likely be around creating and enforcing legislation.