Erik Marcadé joined SAP through the acquisition of KXEN in 2013 and is now vice president of Advanced Analytics. His team focuses on several predictive analytics and machine learning solutions, such as SAP Predictive Analytics and SAP Predictive service — part of SAP Leonardo — as well as the internally used SAP Predictive Analytics Integrator, which is planned to be opened up to customers and partners at the end of 2017.
Q: What is machine learning, what is predictive analytics, and what’s the difference?
A: In simple terms, there is no difference between predictive analytics and machine learning. At the end of the day, both terms cover mathematical techniques that use data to build/train (predictive) models, extracting links between input data (in the case of descriptive algorithms) or between input and target data (in the case of truly predictive algorithms). These two terms correspond to two "marketing" waves, and they do not reflect any mathematical or technical difference.
Younger generations may interpret the difference between these two terms as the "old way of doing things" versus the "new way of doing things," but neural networks were already associated with predictive technologies back in the 90s, at the time of my first startup in this domain.
Other young data scientists may confuse the generic term machine learning with the single technology called "deep learning." Both have seen significant advances because of the increased availability of data and computing power, especially in the domain of unstructured data: text and image applications such as translation systems, natural language processing, image recognition, and the like.
Is it fair to say that predictive analytics help us to understand what is likely to happen in the future based on what has happened in the past and that through machine learning we are able to constantly make our predictions more accurate?
Well, my previous answer negates this question in a way. There were numerous examples of incremental self-learning and self-adaptation techniques before the term machine learning became widespread. The ability to automatically retrain systems, on either the freshest version of the data or the accumulated data, already exists in many predictive automation tools, including SAP Predictive Analytics.
For example, recommender systems are usually deployed in changing environments and are continuously trained on data. You can even show that they need to forget older data in order to be efficient on new data. We provide techniques that use the previous version of a predictive model to create the next one, in order to track changes over time. This allows you to build analytics on analytics, in a way. And again, all these processes are already available in SAP Predictive Analytics.
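The idea of a recommender that forgets older data can be illustrated with a minimal, hypothetical sketch: a toy popularity-based recommender whose accumulated item scores decay exponentially on every new interaction, so recent behavior dominates the ranking. This is an illustration of the general technique, not the implementation used in SAP Predictive Analytics.

```python
from collections import defaultdict

class DecayingPopularityRecommender:
    """Toy popularity recommender that gradually forgets old interactions.

    On every observed interaction, all accumulated item scores are first
    multiplied by a decay factor < 1, then the interacted-with item's
    score is incremented. Older data therefore fades away geometrically.
    (Hypothetical illustration only.)
    """

    def __init__(self, decay=0.9):
        self.decay = decay
        self.scores = defaultdict(float)

    def observe(self, item):
        # Exponentially down-weight everything seen so far...
        for k in self.scores:
            self.scores[k] *= self.decay
        # ...then reward the item just interacted with.
        self.scores[item] += 1.0

    def top(self, n=3):
        # Items ranked by decayed score, highest first.
        return [item for item, _ in
                sorted(self.scores.items(), key=lambda kv: -kv[1])[:n]]


rec = DecayingPopularityRecommender(decay=0.8)
for item in ["A"] * 10:      # older trend: everyone interacts with A
    rec.observe(item)
for item in ["B"] * 5:       # recent trend: interactions shift to B
    rec.observe(item)
print(rec.top(2))            # the recent item B now outranks A
```

With a decay of 0.8, ten old "A" interactions end up worth less than five recent "B" interactions, which is exactly the "forget the older data to be efficient on the new data" behavior described above.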
Last year at SAP TechEd, Erik talked about massive predictive analytics. “I could have called it massive machine learning,” he says. He reminded the audience that predictive analytics and machine learning are already here: “A large number of organizations are currently investing in predictive analytics and machine learning. Both should not be considered as a lab or research center activity but really as a way for organizations to improve their margins and approach new business models.”
What is “massive predictive analytics”?
Massive predictive analytics is the message we have been sending to our customers to cover two aspects. The first is the automation of predictive, and thus machine learning, techniques in an environment called Predictive Factory. This allows a team of a few people to centrally manage thousands of predictive models used in operations.
The second aspect follows the line of democratization, not to say predictive for the masses. We are not yet at the point of true self-service systems, because automated data collection and data preparation remain a semi-open problem, while automated algorithms are a solved one. However, we can embed predictive scenarios within the tools and processes that most people use on a day-to-day basis. People using SAP S/4HANA Finance will leverage predictive and machine learning techniques every single day. The first processes, such as contract consumption forecasting, are already live. Sales team members can leverage predictive techniques for lead scoring if they use SAP tools today, and fraud managers are leveraging our techniques to build models that automatically detect probably fraudulent cases. All these systems are already live today.
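Lead scoring of the kind mentioned here typically boils down to a classifier that turns lead attributes into a conversion probability. The following is a minimal, hypothetical sketch using logistic regression trained by plain gradient descent on made-up data; the feature names, data, and numbers are all illustrative assumptions, not SAP APIs or real results.

```python
import math

# Hypothetical toy leads: (engagement_score, company_size), both scaled
# to [0, 1]; label 1 = the lead converted. Purely invented data.
leads  = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.6),
          (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
labels = [1, 1, 1, 0, 0, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit a logistic-regression lead scorer with stochastic gradient descent.
w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(2000):
    for x, y in zip(leads, labels):
        p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        err = p - y                 # gradient of the log loss w.r.t. z
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b    -= lr * err

def score(lead):
    """Estimated probability that a lead converts (higher = hotter)."""
    return sigmoid(w[0] * lead[0] + w[1] * lead[1] + b)

print(score((0.85, 0.9)))   # highly engaged lead at a large company
print(score((0.15, 0.1)))   # barely engaged lead
```

In a product, the same scoring step would run inside the business application so that a salesperson simply sees a ranked lead list, without ever touching the model.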
Erik holds an engineering degree from the French École nationale supérieure de l’aéronautique et de l’espace, where he specialized in process control, signal processing, computer science, and artificial intelligence. During his career, he headed projects on neural networks development and collaborated with Stanford University on the automatic landing and flare system for Boeing. He also worked on operational applications of neural networks such as forecasting inbound calls for global resource optimization for large call centers.
In September 1998, he co-founded KXEN as chief technical officer in charge of research and development and technical product management. "InfiniteInsight" was a predictive modeling suite developed by KXEN that helped analytic professionals and business executives extract information from data. In October 2013, SAP acquired KXEN.
How does SAP’s strategy in predictive and machine learning differ from our competitors?
I think it is fair to say that SAP has a clear focus on providing all the necessary tools, techniques, and processes that allow our customers to leverage predictive and machine learning techniques integrated within business processes. At the end of the day, all the teams working at SAP around these techniques are focused on data science in operations, as opposed to data science in the lab.
This goes through different functional and technical layers such as security and governance; reusing assets that have been previously generated, such as connectivity with SAP BW and now SAP BW/4HANA; building bridges to directly use our tools on top of SAP HANA and SAP Vora, with no data transfer; and technical components such as the SAP Predictive Analytics Integrator in ABAP environments, as well as SAP Cloud Platform for deep integration into the entire portfolio of SAP applications. We differ because we are deeply integrated within SAP.
In March of this year, SAP was identified as a leader in the Forrester Wave "Predictive Analytics and Machine Learning Solutions". According to Forrester, "SAP offers comprehensive data science tools to build models, but it is also the biggest enterprise application company on the planet. This puts SAP in a unique position to create tools that allow business users with no data science knowledge to use data-scientist-created models in applications. SAP's solution offers the data tools that enterprise data scientists expect, but it also offers distinguished automation tools to train models."
For Erik, this rating (with a focus on SAP Predictive Analytics) is a major achievement, as it shows "that the strategy of providing tools and techniques that allow integrating predictive models directly into business processes and workflows resonates deeply."
What is SAP’s unique selling proposition in that area?
SAP Predictive Analytics focuses on three key differentiators:
- Automation in order to empower citizen data scientists to build predictive models with built-in robustness and model quality checks to reduce user error. This applies not only to the algorithms but to the full predictive life cycle: data set production, modeling, scoring, and model management.
- Scalability to prepare data sets and to train and apply models in databases without data transfer; to mass-generate features and run data lineage analysis automatically; to train against wide data sets (Big Data); and to reuse data sets easily to quickly solve multiple related business questions.
- Integration, deep on a technical level with SAP HANA and Hadoop/Spark, but also through the SAP Predictive Analytics Integrator, which integrates into SAP applications both in ABAP and in cloud environments.
At the end of the day, we allow our users to optimize many different business processes, and our core value proposition is that they can do so very easily and at low cost, as opposed to focusing a vast amount of energy on optimizing just one process down to the last percentage point.
In one of his latest blogs, Erik writes “about some of the challenges which technology executives need to be aware of when making investment decisions.”
What about the opportunities? You’ve been in the analytics business for many, many years. Why is it so fascinating to work in the area of artificial intelligence at the moment?
Thanks for insisting on the "many"! Anyway, for old chaps like me who have known the different waves of data mining, neural networks, predictive analytics, machine learning, and now cognitive, it is fascinating to see that concepts such as convolutional neural networks, which have been pushed for years to reach some level of efficiency, can today beat human performance. And consider this: GPUs are only one step; wait for the wave of application-specific integrated circuits (ASICs) to push the frontier of what can be achieved even further. Not to forget quantum computing!
So one lesson is: persistence in defending good concepts pays off! But this is only the technical side of the world.
The second important transition is the overall acceptance of these systems by the coming generations: I do not think that my son will have a problem sitting in an autonomous car. I am always amazed to see conversations about what an autonomous car should do when facing the choice between hitting a truck, killing its passengers, and hitting a young boy on the side of the street. Fortunately, this question is not one asked during the driving exams that humans go through today.
I think that in the near future we will hit the tipping point where humans recognize that, in most situations, autonomous systems are safer than humans.
Which will require a lot of trust?
Correct. That's the third point, which is linked to trust and transparency, and which may require regulation and control. Some have become vocal about bias that can be introduced either unconsciously, through biased data, or consciously, through malware. But since we will hide the complexity of these systems beneath a simple-to-use cover, we will need ways, managed by independent third parties, to check the unbiased behavior of such systems.
This is the thinking behind the TransAlgo initiative in France, for example. Even an algorithm used to propose an itinerary can be biased in order to steer people toward certain shops. This control does not have to come from public regulators but, as mentioned, can certainly come through independent providers that check, compare, and benchmark complex systems. We need to find ways to verify these systems' integrity while protecting the proprietary IP developed within them.
Why is it so rewarding to bring in your vast expertise into SAP?
I have been fighting almost all my career in advanced analytics to defend the position that machine learning by itself is not enough if it is not integrated into business processes or systems: machine learning is just a very powerful technique for fulfilling higher-level missions and processes, hence the constant focus on integration. Being in a company with thousands of applications, I still feel like a young boy in a toy store, and seeing live integrations now being used by SAP customers is just what I have been looking for!