Integrating SAP and Apache Hadoop

The new CIO Guides series aims to provide a comprehensive, end-to-end overview of major technology topics and to highlight interconnections between key use scenarios, reference architectures, and SAP products. Each guide provides a holistic overview of a selected topic and examines it from various perspectives.

First in the series: Integrating SAP and Apache Hadoop

The first guide in the series looks at the Apache Hadoop open-source platform, which is particularly suited to running data-intensive distributed applications on clusters of commodity servers. It explains how to integrate the framework into existing business analytics and data warehouse installations and outlines the added value that enterprises can extract from combining SAP applications with Hadoop. The first CIO Guide thus offers valuable background information and specific sample scenarios as a basis for investment and architecture decisions.

One of the most significant new technology trends is big data. Through the SAP Real-Time Data Platform, which combines SAP HANA with SAP Sybase IQ and other SAP and non-SAP technologies – notably Hadoop – SAP is addressing a very significant aspect of big data: fast access to, and real-time analytics of, very large datasets. The Apache Hadoop open-source software framework runs compute-intensive processes on large clusters of low-cost commodity hardware, which makes it well suited to distributed applications with high data volumes. The CIO Guide explains how to use Hadoop in conjunction with SAP solutions to extract maximum benefit.

According to Rohit Tripathi, vice president Product Architecture & Technology Strategy Group at SAP and co-author of the CIO Guide, Hadoop does not compete with SAP HANA – it complements it.

Understanding SAP and Hadoop: Benefits of the CIO Guide

The target audience for the first CIO Guide is SAP customers who run SAP business solutions, have (or are thinking of getting) SAP HANA, and are considering deploying Hadoop. The guide:

  • explains Hadoop and describes how it complements conventional databases and in-memory solutions such as SAP HANA;
  • sets out use cases and scenarios where leveraging Hadoop could benefit businesses;
  • provides a reference architecture showing how SAP HANA, Hadoop, and other SAP solutions – such as SAP Business Suite – can be interlinked;
  • demonstrates how Hadoop can fit into an SAP landscape and describes the benefits of SAP solutions for Hadoop implementations;
  • outlines the basics and best practices for implementing Hadoop and for project organization.

In particular, the CIO Guide aims to answer fundamental questions that customers ask:

  • When is Hadoop really the “best” solution to a business problem?
  • How should we use Hadoop alongside SAP solutions and technologies, including SAP HANA?
  • How will Hadoop evolve, and what plans does SAP have?

Next topics: Security, performance and orchestration

“The Guide isn’t a detailed technical ‘how to’ for integrating Hadoop and SAP HANA, but it will help you work out what your strategy and plans for SAP HANA/Hadoop solutions should be to deliver business value,” summarizes David Burdett, director Product Architecture & Technology Strategy Group at SAP and co-author of the guide.

More CIO Guides will be published over the course of the year, focusing on topics such as security, performance, and orchestration.

Read the author’s blog post about delivering business value using SAP HANA and Hadoop.