In 2010, SAP raised the bar on enterprise-class data management and analytics when we introduced SAP HANA, the first data platform to deliver real-time analytics on live transactions.
As an in-memory database, SAP HANA could store data in RAM instead of on hard disk, giving multicore CPUs faster access to it. Reads from memory were a million times faster than reads from disk in conventional databases. Transactional systems no longer had to be separated from analytics systems for operational reporting, so data was always real time and up to date.
Nine years later, SAP HANA is in use by major organizations around the world. Demand is unprecedented for SAP solutions overall. And now our commitment to market-leading innovation continues with the latest release of SAP HANA.
Broad Range of Innovative Features and Benefits
The latest release of SAP HANA is the next evolutionary step and adds a variety of innovative features. It simplifies and democratizes the in-memory computing environment so many more people in an organization can use it to get answers and insights. Numerous other features contribute to a lower total cost of ownership (TCO), improved scale and data reliability, an enhanced user experience, and much more.
- Data tiering enhancements for improved scale
- Persistent memory hardware availability
- Scale-out enhancements
- Data protection and privacy
- Smart data access and Hadoop integration
- Hexagon grid clustering for spatial data
- Data scientist and machine learning improvements
- Spatial performance enhancements and algorithms
- User experience optimizations throughout
Data Tiering Enhancements for Improved Scale
The new tiered storage feature lets you deploy a multi-temperature storage strategy with optimized data tiering capabilities so you can increase the data capacity of SAP HANA at a low TCO. This new data tiering option, SAP HANA native storage extension, provides simple, scalable data tiering for all data types and workloads. Administrators can tune data placement to achieve the cost-versus-performance ratio that best supports their business priorities. A data tiering advisor maintains statistics on database access patterns and recommends which database objects (such as tables, columns, and partitions) to tier to warm storage.
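The idea behind an advisor of this kind can be illustrated with a minimal Python sketch. It is not the SAP HANA implementation: the `ObjectStats` record, the `recommend_warm` function, the access-density threshold, and the sample object names are all hypothetical and chosen only to show how access statistics could drive a warm-storage recommendation.

```python
from dataclasses import dataclass


@dataclass
class ObjectStats:
    """Hypothetical access statistics for one database object."""
    name: str        # table, column, or partition identifier
    size_gb: float   # in-memory footprint in gigabytes
    accesses: int    # accesses observed during the statistics window


def recommend_warm(stats, max_accesses_per_gb=1.0):
    """Return the names of rarely accessed objects, largest first.

    An object qualifies when its accesses per gigabyte do not exceed the
    threshold. The threshold is an illustrative assumption, not a HANA default.
    """
    candidates = [
        s for s in stats
        if s.size_gb > 0 and s.accesses / s.size_gb <= max_accesses_per_gb
    ]
    # Moving the largest cold objects first frees the most memory.
    candidates.sort(key=lambda s: s.size_gb, reverse=True)
    return [s.name for s in candidates]


# Made-up sample statistics for illustration only.
sample_stats = [
    ObjectStats("SALES_2015", 40.0, 12),
    ObjectStats("SALES_2019", 35.0, 90000),
    ObjectStats("AUDIT_LOG", 80.0, 5),
]
recommended = recommend_warm(sample_stats)
```

In this sketch the frequently read `SALES_2019` object stays in memory, while the two rarely accessed objects are proposed for warm storage.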
Persistent Memory Hardware Availability
With the 2.3 release in early 2018, SAP HANA became the first major database platform optimized for the full potential of Intel® Optane™ DC persistent memory. On April 2, 2019, Intel announced general availability of this technology, which is a major advancement in the data storage hierarchy and will have major benefits for SAP HANA customers. We have continued our industry leadership with further improvements in our latest release. To adopt this technology, you’ll need the latest hardware and at least SAP HANA 2.3, but no application changes are required, so there is no disruption.
Intel® Optane™ DC persistent memory provides large, affordable capacity with data persistency in a dual in-line memory module (DIMM) form factor. Because SAP HANA is optimized for persistent memory, it delivers much higher memory capacity and persistent storage of data with near-DRAM performance. Data stays in memory when the system is powered off, which enables dramatically faster data loads at startup. With SAP HANA and persistent memory, we deliver a game-changing way to manage more data in real time, at a lower cost, and with improved business continuity.
With persistent memory, SAP HANA can manage larger data sizes at in-memory speed. In an internal benchmark that SAP performed with Intel, we saw a 12.5x improvement in SAP HANA startup time when using Intel persistent memory compared to a traditional configuration of DRAM and SSD.
Evonik Industries, one of the world’s leading specialty chemical companies and an early adopter of SAP HANA, saw a 17x improvement in data loads at startup during its proof-of-concept test run. SAP HANA with persistent memory was up and running at full capacity on 1.3 TB of data within one minute and 35 seconds.
Scale-Out Enhancements
A new table distribution capability provides administrators with the tooling to identify performance-optimized distribution of tables and partitions in an SAP HANA scale-out landscape. The administrator can define rules based on application semantics to control the data location in SAP HANA.
Combined with the enhanced data tiering functionality, these capabilities help ensure the best performance in SAP HANA scale-out clusters while balancing memory and CPU consumption and reducing network traffic between nodes.
Data Protection and Privacy
To help enforce data privacy, enhance governance, and support compliance with regulations, SAP HANA lets you analyze all data while keeping sensitive information private. New audit retention policies address compliance requirements and increase transparency for decision makers when they are choosing anonymization parameters. Delivered key performance indicators (KPIs) help you measure critical variables such as risk.
Smart Data Access and Hadoop Integration
Now you can access and democratize remote data with or without data movement for a full view of the enterprise. Integration with Hadoop clusters via a Spark controller supports high-availability mode and a DLM utility to expose SAP HANA cold data to a Hive table. A generic adapter framework lets enterprises connect to any ODBC or JDBC data source in real time. DDL support for remote sources enhances change management by centralizing change procedures in SAP HANA, so you can create objects remotely in the target system.
Hexagon Grid Clustering for Spatial Data
SAP HANA supports spatial clustering using rectangular grids, k-means clustering, and density-based spatial clustering of applications with noise (DBSCAN). And now it is the only database to support hexagon clustering. This method of clustering spatial data provides a better way to get insights from relational spatial data. It allows for curvature representation and is ideal for showing connectivity or movement paths. Hexagonal clustering is useful for spatial data processing in any industry that uses spatial capabilities.
Hexagon clustering is a technique to process geospatial data for identifying and visualizing things such as escape routes that avoid hazards in emergency situations or the optimal paths for utility pipelines that avoid restricted areas. Developers can use it easily in new applications and combine it with other SQL predicates. Only SAP offers this capability, and it is laborious to implement in other relational databases using the basic geospatial functions. With SAP HANA, developers can use a single SQL query to process millions of geo-coordinates in seconds and cluster them effectively. To achieve this result, we use the geospatial and graph capabilities of SAP HANA. Using hexagonal clustering provides an effective way to aggregate information by reducing sampling bias and returns results faster than machine learning clustering algorithms, which are iterative.
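To make the idea of hexagonal clustering concrete, here is a minimal Python sketch of hexagonal binning. It is a generic illustration rather than the SAP HANA implementation, and it assumes planar (projected) coordinates and pointy-top hexagons. The function names `hex_cell` and `hex_cluster`, the `size` parameter, and the sample points are hypothetical.

```python
import math
from collections import Counter


def _cube_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon index."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the component with the largest rounding error so that
    # the cube-coordinate constraint q + r + s == 0 still holds.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return (rq, rr)


def hex_cell(x, y, size):
    """Map a planar point to the axial (q, r) index of its hexagon.

    `size` is the distance from a hexagon's center to one of its corners.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _cube_round(q, r)


def hex_cluster(points, size):
    """Count the points falling into each hexagonal cell in a single pass."""
    return Counter(hex_cell(x, y, size) for x, y in points)


# Made-up sample coordinates for illustration only.
sample_points = [(0.1, 0.1), (-0.2, 0.05), (math.sqrt(3), 0.0), (1.7, 0.1)]
clusters = hex_cluster(sample_points, size=1.0)
```

Each point is assigned to a cell with a fixed amount of arithmetic, so the whole data set is clustered in one pass. Iterative algorithms, by contrast, revisit the data repeatedly until they converge.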
Data Scientist and Machine Learning Improvements
SAP HANA now provides native Python and R machine learning APIs. In the past, data scientists could only access data via a standard SQL interface to SAP HANA. This limited how they could access the data and the powerful capabilities within SAP HANA. Now, with the native machine learning APIs for Python or R, data scientists can be much more productive with easy access to data in SAP HANA and the powerful native machine learning algorithms from within their favorite Python or R environments, like Jupyter™ Notebook.
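As a rough illustration of what working through the Python API can look like, the sketch below uses the `hana_ml` package (the SAP HANA Python client API for machine learning) to inspect a table. It assumes `hana_ml` is installed and a reachable SAP HANA system exists. The helper name `preview_table` and all connection details are hypothetical placeholders.

```python
def preview_table(address, port, user, password, table, schema=None, rows=5):
    """Return the row count and the first few rows of an SAP HANA table.

    The sample is returned as a pandas DataFrame.
    """
    # Imported here so this module still loads where hana_ml is not installed.
    from hana_ml import dataframe

    conn = dataframe.ConnectionContext(address, port, user, password)
    try:
        df = conn.table(table, schema=schema)  # lazy handle, no rows fetched yet
        total = df.count()                     # counted inside the database
        sample = df.head(rows).collect()       # only `rows` rows reach the client
        return total, sample
    finally:
        conn.close()


# Example call with placeholder values (requires a live system):
# total, sample = preview_table("hana.example.com", 30015, "USER", "SECRET", "SALES")
```

The point of the sketch is that the `hana_ml` DataFrame represents a query rather than a local copy of the data. Work is pushed down to SAP HANA, and rows are transferred only when `collect()` is called.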
We have also added support for leveraging R algorithms directly from within SAP HANA, streaming analytics. This enables the reuse of what you’ve already built in R and expands the available algorithms for streaming data use cases. These enhancements, along with the continued addition of new native machine learning algorithms, will enable you to continue to build more intelligence into your applications to gain more insight from all your data.
Spatial Performance Enhancements and Algorithms
New methods for linear referencing and geometry editing support the collection, simplification, scaling, and editing of large spatial data sets. To accelerate innovation in the cloud, we added a new modeler and a labeling tool to our spatial services. The modeler lets customers build custom statistical models based on Earth observation data. This gives customers easy access to information and an easy way to derive new information, for example by creating a statistical model that analyzes the moisture stress index to monitor crop health. In the machine learning-powered labeling tool, customers can train classifiers by labeling different areas in satellite images, such as cities or agricultural land. One SAP customer is using this tool to automate imagery classification at a larger scale for land cover or land use classification.
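For readers unfamiliar with the moisture stress index, it is commonly defined as the ratio of shortwave-infrared to near-infrared reflectance, with higher values indicating greater water stress. The Python sketch below computes it per pixel under that assumption. It does not represent the modeler in SAP's spatial services, and the function names and reflectance values are made up for illustration.

```python
def moisture_stress_index(swir, nir):
    """Compute MSI = SWIR / NIR per pixel for two equally sized band grids.

    Pixels with zero NIR reflectance yield None instead of raising an error.
    """
    return [
        [(s / n if n else None) for s, n in zip(swir_row, nir_row)]
        for swir_row, nir_row in zip(swir, nir)
    ]


def mean_msi(msi):
    """Average the valid MSI values, for example over one field."""
    values = [v for row in msi for v in row if v is not None]
    return sum(values) / len(values) if values else None


# Made-up reflectance values for a 2 x 2 pixel patch.
swir_band = [[0.2, 0.3], [0.1, 0.4]]
nir_band = [[0.4, 0.3], [0.5, 0.0]]
msi_grid = moisture_stress_index(swir_band, nir_band)
field_mean = mean_msi(msi_grid)
```

A statistical model of the kind described above could then track a value such as `field_mean` over time to flag fields whose moisture stress is rising.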
User Experience Optimizations Throughout
SAP HANA includes an array of features to enhance and simplify the user experience. For administrators, workload management and data tiering provide built-in recommendations for further system optimization. For modeling and application development, we have simplified many activities with new dialogs and wizards and added support for the cloud application programming model, which lets developers focus on the business solution they are building instead of its technical aspects. These features help to increase the efficiency of database and data life-cycle management, modeling, and application development. They continue to help IT organizations reduce time to value and improve IT operational efficiency.
These features are just some of what you can expect from the latest SAP HANA release.
Dirk Basenach is senior vice president of SAP HANA Development at SAP.