SAP Data Hub: Framework for Big Data Scenarios

Many companies are currently collecting their data streams from diverse sources in so-called Big Data lakes. Yet this data only becomes valuable for companies once it interacts with the application landscape and its data.

SAP Data Hub is the central orchestration tool that minimizes the effort of implementing Big Data scenarios and creating new business models.

Virtually Any Company Can Maximize Existing Data

The situation is similar in many companies: In addition to the data from business processes such as CRM and ERP systems, CIOs are now being presented with new data deriving from sensors, social media platforms, and various cloud applications.

“It’s often difficult for companies to know precisely where to start,” explains Marc Hartz, project manager for SAP Data Hub. “Together with the raw data collected in the data lakes, datasets of numerous petabytes can be brought together year upon year.”

According to a current study on Big Data, 86% of participants believe they can extract more from their data, while 74% believe their data landscape is not sufficiently transparent and are concerned about the negative consequences for corporate flexibility.

Overcoming Intermediary Steps for Big Data Scenarios

The reason for this is that companies require specific skills, and it takes a lot of effort to ultimately derive added value from data. Data scientists are needed to correctly implement the large number of required (open source) tools. Many intermediary steps are also needed for productization, that is, the practical deployment of a Big Data scenario within the company.

Integrating data, checking the quality, processing it, setting filters, creating aggregates, anonymizing social media data: “This involves a lot of effort,” says Hartz.

Running several scenarios in one joint landscape is hardly achievable. Ultimately, it comes down to being able to acquire new insights and automatically integrate them into a new business model.

“Those who manage to combine Big Data scenarios with the existing application landscape can generate extra added value,” says Hartz. Automation processes for the interaction between Big Data and the existing data warehouses and ERP applications can now be realized, and existing processes will be improved and simplified.

How Electronic Home Appliance Companies Profit From Big Data Scenarios

Data experts from an electronic home appliance company had up to seven tools in operation to examine how regularly individual appliance functions are used on a daily basis. The company equipped every push button with a sensor that immediately produced a dataset when pressed. The aim was to better tailor the functions to suit customer needs.

Analyzing the sensor data helped to forecast a device’s remaining running time and to identify functions that had hardly been used in the past. “To find this out, the data lake needs a corresponding logic,” says Hartz. “We take the sales and production data and analyze the number of models sold, where they were sold, and what the top functionalities are.”
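As a rough illustration of the kind of logic described here, the following minimal sketch counts button-press events per function and flags functions that are rarely used. It is plain Python, not SAP Data Hub code; the event layout, the function names, and the threshold are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical button-press events: one record per press (assumed layout).
events = [
    {"device_id": "A1", "function": "eco_wash"},
    {"device_id": "A1", "function": "eco_wash"},
    {"device_id": "B7", "function": "eco_wash"},
    {"device_id": "A1", "function": "quick_wash"},
    {"device_id": "B7", "function": "quick_wash"},
    {"device_id": "B7", "function": "steam"},
]

# Every function the appliance offers, including ones never pressed (assumed).
all_functions = ["eco_wash", "quick_wash", "steam", "delay_start"]


def usage_by_function(events):
    """Count how often each appliance function was triggered."""
    return Counter(event["function"] for event in events)


def rarely_used(counts, functions, threshold):
    """Return the functions pressed fewer than `threshold` times, sorted by name."""
    return sorted(f for f in functions if counts.get(f, 0) < threshold)


counts = usage_by_function(events)
candidates = rarely_used(counts, all_functions, threshold=2)
print(counts.most_common(1))  # the most frequently used function
print(candidates)             # functions that are hardly used
```

In a real scenario, these counts would be joined with the sales and production data Hartz mentions in order to break usage down by model and region.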

By implementing the first Big Data scenario, the electronic home appliance company can now offer individual and regionally differentiated products.

SAP Data Hub: Quicker Implementation of Big Data Scenarios, Fewer Tools, Deeper Integration

Thanks to SAP Data Hub, additional scenarios will become a lot easier to implement. Big Data processing that previously had to be split across several tools or coded manually will be automated by SAP Data Hub. In the future, if customers are planning similar Big Data scenarios for additional products, they won’t need to write new code for every new scenario.

“Thirty years ago, we didn’t have ERP, no standards, and hardly any tools or applications that could process corporate data,” Hartz explains. “In the Big Data world we are currently in a similar position.”

SAP Data Hub is a key system component for digital innovation with SAP Leonardo and enables users to implement Big Data scenarios with greater ease. It acts as a control level for the entire data landscape, orchestrating and distributing data regardless of whether it comes from Hadoop or SAP BW systems, from on-premise, cloud, or hybrid system frameworks, or from data lakes. Whereas in the past the relevant data was extracted from various sources using ETL tools and centralized in a target database, SAP no longer takes this approach.

Thanks to a distributed landscape that spans SAP HANA, Hadoop, and Amazon or Google cloud systems, it is now easier to gain new insights without having to move data back and forth, explains Hartz: “SAP Data Hub enables customers to implement cross-landscape scenarios quickly, with less tools, and also offers deeper integration into the business processes, particularly for SAP S/4HANA.”

Coming to Grips with Unstructured Data

SAP Data Hub targets companies that currently struggle most with masses of unstructured data that provide no tangible added value. In defining SAP’s new fundamental approach, the biggest challenge was to implement three steps that sound straightforward but are in fact extremely difficult, explains Hartz: first, transferring sensor data into a data lake and being able to identify it; second, processing it; and third, integrating the data into an application, such as a data warehouse.
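The three steps can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not how SAP Data Hub implements them: the "data lake" is an in-memory list, the target application is an in-memory SQLite database standing in for a data warehouse, and all names (ingest, process, integrate, sensor_avg, washer-01) are hypothetical.

```python
import sqlite3
from statistics import mean


def ingest(raw_readings, source):
    """Step 1: land raw readings in the 'lake' and tag each with its source
    so that it remains identifiable later."""
    return [{"source": source, **reading} for reading in raw_readings]


def process(lake):
    """Step 2: drop readings without a value and average the rest per sensor."""
    by_sensor = {}
    for reading in lake:
        if reading["value"] is not None:
            by_sensor.setdefault(reading["sensor"], []).append(reading["value"])
    return {sensor: mean(values) for sensor, values in by_sensor.items()}


def integrate(aggregates, conn):
    """Step 3: write the aggregates into a table of the target application."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sensor_avg "
        "(sensor TEXT PRIMARY KEY, avg_value REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sensor_avg VALUES (?, ?)", aggregates.items()
    )
    conn.commit()


# Hypothetical raw sensor readings; one of them is missing its value.
raw = [
    {"sensor": "temp", "value": 20.0},
    {"sensor": "temp", "value": 22.0},
    {"sensor": "temp", "value": None},
    {"sensor": "spin", "value": 1200.0},
]

conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
lake = ingest(raw, source="washer-01")
aggregates = process(lake)
integrate(aggregates, conn)
rows = conn.execute(
    "SELECT sensor, avg_value FROM sensor_avg ORDER BY sensor"
).fetchall()
print(rows)
```

Even in this toy version, each step depends on conventions agreed in the previous one (field names, source tags, table layout), which hints at why the steps become difficult across a distributed landscape.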

It isn’t only the home appliance manufacturer that has successfully implemented these three steps. In a use case already presented by SAP, the sensor data collected from fitness trackers is analyzed globally and integrated into business processes, and social media efficiency analyses are generated for marketing campaigns. These examples are only the beginning of what Big Data will make possible in the future; its full potential cannot yet be foreseen.