While Moneyball, and Brad Pitt’s good looks, became the face that launched a thousand Big Data blog posts, I’ve often thought about other examples that might appeal to those who prefer to pour a glass of wine rather than pore over box scores and Hadoop clusters.
This raises the question: Can Big Data help me find a good bottle of wine?
According to the book “Super Crunchers,” the answer is yes. Imagine trying to determine whether 2013 will be a good year for cabernet because you want to invest in wine futures or place an early order for a few cases of the good stuff from your wine merchant. The usual approach is to ask a wine connoisseur who has decades of experience and uses the “swish and spit” technique to expose complex wine flavors.
Orley Ashenfelter, an economist by day, decided to “run the numbers” instead, and found that decades of experience can be beaten by a simple linear equation:
Wine quality = 12.145 + 0.00117 × winter rainfall + 0.0614 × average growing season temperature – 0.00386 × harvest rainfall
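To see how little machinery this takes, here is a minimal sketch that reads the equation as a sum of weighted terms, the form given in “Super Crunchers.” The function name, the units (millimeters of rain, degrees Celsius) and the two sample vintages are my own assumptions for illustration, not real weather data.

```python
def wine_quality(winter_rainfall, growing_season_temp, harvest_rainfall):
    """Score a vintage with the linear formula quoted above.

    Units are an assumption: rainfall in millimeters, temperature in
    degrees Celsius. The coefficients come straight from the equation.
    """
    return (
        12.145
        + 0.00117 * winter_rainfall
        + 0.0614 * growing_season_temp
        - 0.00386 * harvest_rainfall
    )


# Hypothetical vintages, invented purely to show the direction of each term.
warm_dry_harvest = wine_quality(600, 17.5, 50)
cool_wet_harvest = wine_quality(600, 15.5, 250)

print(f"Warm year, dry harvest: {warm_dry_harvest:.3f}")
print(f"Cool year, wet harvest: {cool_wet_harvest:.3f}")
```

The signs tell the story: a wet winter and a warm growing season push the score up, while rain at harvest pulls it down.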
This mathematical approach correctly predicted the “Wines of the Century” in 1989 and 1990. The reaction of the traditional experts was the same as that of the old scouts around the table in Moneyball. The highly influential wine guru Robert Parker laughed off the approach: “I’d hate to be invited to his house to drink wine,” he said.
But Ashenfelter had the last laugh because he made a lot of money for his advocates in wine futures by betting “against the house.”
Uncork your “Dark Data”
Ashenfelter’s wine formula is one example showing that most of the data you need to run your business is actually external to your company. It’s akin to a time in the 1980s when physicists reworked the numbers and started to conclude that most matter was not the familiar atoms, but some other, ethereal substance, which they called dark matter. More equations later, it seems that roughly 95% of the universe is dark, once dark matter and dark energy are counted together. And we know little about it. Oops.
I see similarities in big data. Up until the dawn of business networks, companies assumed all of their data could be rationalized into a data warehouse. There were still things to do, for sure: reconcile master data, design reports, tidy up some schemas, and build reporting cubes. But once enterprises began to link up with business networks, we suddenly discovered that much of the important data we needed was outside the enterprise.
Dark matter is now an exciting field for physicists, and dark data is becoming a crucial area of activity for data scientists. We need to figure out how to get it, how to rationalize it, and how to use it. Balancing all of this is hard to do if you’re not tapping into business networks to access intelligence on market trends, supplier history, qualifications and risk factors, and to gain insights that help design and execute smarter strategies.
Just as social networks make it easy for consumers to manage their personal relationships and activities, business networks allow companies to connect and collaborate with their trading partners around the world anytime, anywhere, from any device. More than 1 million companies in 190 countries, for instance, use the Ariba Network to transact over $465 billion in commerce on an annual basis.
Big Data can be used to answer many questions that were previously the domain of the expert. It’s a new way of operating, but organizations that embrace it can ultimately transform their businesses.
Join me on Twitter @JamesMarland.
James Marland is vice president of Network Growth at Ariba, an SAP Company