The computer screen smiles amiably. This is the first time the customer has been to an online store and she is initially a little apprehensive. But then she sees a woman smiling back at her from the screen, which immediately puts her at ease. The woman asks her what she would like and helps her place her order using the unfamiliar application. The customer is delighted with the service, and she quickly loses her inhibitions about buying things on the Internet. The kindly face may be artificially generated, but it makes interaction with the computer much easier as it communicates with her directly.
“Social agents” is the term used for these artificially generated human faces that mediate between human and machine. They combine the user interface technology of animated talking heads with the principles of social intelligence. Their task is to arouse users’ curiosity, motivate them, and support them in using the application. Social agents are one of the topics being studied by the Human Computer Interaction (HCI) research program, which SAP Research set up to devise methods of seamless interaction between software and user. With this in mind, a project team developed a prototype that uses third-party technology to automatically convert data from SAP applications into messages, which are then spoken by an animated agent. The Social Agents Dynamic System uses standard speech synthesis and animation components and can be integrated into any SAP solution.
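The conversion step can be pictured as a simple template pipeline: structured application data is rendered into a natural-language message, which a speech synthesizer and facial-animation component would then deliver. A minimal sketch, with invented event names and data fields (the actual Social Agents Dynamic System design is not documented here):

```python
# Minimal sketch (hypothetical names): turning structured application
# data into a natural-language message an animated agent could speak.
from string import Template

# One template per application event; the speech synthesis and
# animation components would consume the resulting text downstream.
TEMPLATES = {
    "order_confirmed": Template(
        "Thank you, $customer. Your order $order_id for $item has been placed."
    ),
    "item_unavailable": Template(
        "I'm sorry, $customer, but $item is currently out of stock."
    ),
}

def to_spoken_message(event: str, data: dict) -> str:
    """Convert an application event plus its data into agent speech."""
    return TEMPLATES[event].substitute(data)

msg = to_spoken_message(
    "order_confirmed",
    {"customer": "Ms. Weber", "order_id": "4711", "item": "a desk lamp"},
)
print(msg)
```

The point of the pattern is that the agent layer never touches raw application records; it only ever sees finished sentences.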
The sensitive computer
The great strengths of these stand-in humans lie in their ability to speak and to arouse and express emotions. SAP Research aims to expand these talents. In the future, the agents should also be able to identify the emotions in the face of the person they are interacting with, react to them, and communicate naturally, that is, they should be able to “improvise” conversations. To achieve this, SAP Research is working with a number of partners, including the Fraunhofer Zentrum für Graphische Datenverarbeitung (ZGDV), which has developed a system for interactive storytelling and conversational improvisation. The SAP researchers are also working with imedia, an international academy for new digital media, technologies, and applications in Providence, USA, to investigate the possibilities of using video technology to identify different emotional reactions in faces.
As well as online stores, other potential applications for social agents include Internet-based services, such as e-health for the electronic care of elderly, sick, or disabled people. HCI is therefore planning a prototype of an e-health application that uses agent technology.
SAP Research wants to use the HCI research program to extend the usability of SAP applications and to satisfy the whole range of demands people place on IT systems. People with a physical disability, for instance, require additional support. By the same token, classical graphical user interfaces fall short of adequately serving the specific requirements of laboratory staff or mobile technicians. These users routinely deal with complex instruments and tools, and having to read information off a computer screen distracts them from their actual tasks. The same applies to drivers. The benefits of voice-based interactive technology in such cases are self-evident.
SAP Research has developed a voice-driven interface specifically for tasks like these; it hooks into existing GUI applications without the underlying software having to be adapted. The Voice-Enabled Portal (VEP) lets users navigate and enter data and text using spoken commands. The prototype is based on SAP Enterprise Portal 5.0 and supports the Microsoft Speech Application Programming Interface (Microsoft SAPI) and IBM ViaVoice speech recognition interfaces. SAP Research hopes to partner with a local government agency to implement a pilot program; SAP believes the approach will appeal to agencies that want to provide better access for disabled employees and citizens. VEP not only improves the usability of SAP solutions for disabled users, it can also benefit the internationalization process, a central issue for SAP, as the interface is intended to support multiple languages.
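Conceptually, a layer like VEP sits between the speech recognizer and the existing GUI: each recognized utterance is matched against a small command grammar and translated into a navigation or data-entry action. The following sketch is purely illustrative (the command grammar and action names are invented, not the actual VEP design):

```python
# Illustrative sketch: mapping recognized spoken commands to GUI
# actions. Grammar and action names are hypothetical, not VEP's own.
import re

COMMANDS = [
    (re.compile(r"^go to (?P<page>\w+)$"),
     lambda m: ("navigate", m["page"])),
    (re.compile(r"^enter (?P<text>.+) into (?P<field>\w+)$"),
     lambda m: ("fill", m["field"], m["text"])),
    (re.compile(r"^submit$"),
     lambda m: ("submit",)),
]

def interpret(utterance: str):
    """Translate a recognized utterance into a GUI action tuple."""
    for pattern, action in COMMANDS:
        m = pattern.match(utterance.lower().strip())
        if m:
            return action(m)
    return ("unrecognized", utterance)

print(interpret("Go to inbox"))             # ('navigate', 'inbox')
print(interpret("Enter 42 into quantity"))  # ('fill', 'quantity', '42')
```

Because the mapping produces ordinary GUI actions, the application underneath need not know the input came from speech, which is what allows the approach to work without adapting the software.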
The Structured Audio Information Retrieval System (STAIRS), which is being developed by the Technical University of Darmstadt with funding from SAP Research, dispenses with a screen completely, enabling data to be input and output exclusively via sound. The audio interface supports blind users, for instance, or can provide important information for drivers or laboratory or warehouse workers who need their eyes and hands for other tasks.
Improved security via combined forms of access
The combination of different forms of data input and navigation is being investigated by the Multimodal Interfaces for Mobile Applications (MIMA) project. The project’s main objective is to improve access to mobile input devices with small screens, such as PDAs or cell phones. The approach combines audio interfaces with data input via a graphical user interface, a pen, RFID sensors, and barcode scanners. Here SAP Research is cooperating with external partners, such as Motorola and IBM, and various internal product divisions, such as Supply Chain Management (SCM), Retail, and Healthcare. The MIMA team and SAP Warehouse Management are thus effectively setting the scene for the development of multimodal SCM solutions with SAPConsole.
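The core idea behind such multimodal combinations is that every input channel, whether a spoken phrase, a barcode scan, or an RFID read, is normalized into the same kind of event before the application sees it. A minimal sketch under that assumption (all names are invented for illustration):

```python
# Illustrative sketch: normalizing input events from several modalities
# (speech, barcode scan, RFID read) into one uniform action stream.
from dataclasses import dataclass

@dataclass
class Action:
    kind: str      # e.g. "select_item"
    payload: str   # e.g. an item identifier
    modality: str  # which input channel produced the action

def from_speech(utterance: str) -> Action:
    # Naive parse: treat the last word of the utterance as the item ID.
    return Action("select_item", utterance.split()[-1], "speech")

def from_barcode(code: str) -> Action:
    return Action("select_item", code, "barcode")

def from_rfid(tag_id: str) -> Action:
    return Action("select_item", tag_id, "rfid")

# Downstream logic sees uniform Actions, regardless of the source.
events = [from_speech("select item 4711"),
          from_barcode("4711"),
          from_rfid("4711")]
assert all(a.payload == "4711" for a in events)
```

Keeping the modality tag on each action lets the application prefer one channel over another, for example trusting a scanner over a possibly misheard utterance, without duplicating business logic per input device.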
Further, multimodal access to applications is a starting point for improving the security of corporate software. Such software contains sensitive data that must be protected against unauthorized access. Standard security techniques, such as passwords, chip cards, and biometric identification via fingerprint or iris scan, all have their weaknesses – just think of lost or stolen passwords or the incorrect rejection of an authorized user by a biometric system. The HCI research program will look at ways in which a combination of different forms of access control can both improve data security and increase the speed of identification.
The development of multimodal applications requires standards, but these are still in their infancy. IBM and Opera have developed the XHTML+Voice (X+V) standard, which combines Extensible Hypertext Markup Language (XHTML) and Voice Extensible Markup Language (VoiceXML). The idea is that X+V will form the basis on which a special programming language for multimodal applications will be developed.
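Roughly, an X+V document carries the visual channel as ordinary XHTML and embeds VoiceXML dialogs for the spoken channel in the same file. The fragment below is a simplified illustration of that structure only; it omits the event wiring (XML Events) that a real X+V page uses to link the two channels:

```xml
<!-- Simplified illustration of the X+V idea: XHTML for the visual
     channel, VoiceXML elements (vxml: namespace) for the spoken one. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/vxml">
  <head>
    <vxml:form id="ask_city">
      <vxml:field name="city">
        <vxml:prompt>Which city would you like to fly to?</vxml:prompt>
      </vxml:field>
    </vxml:form>
  </head>
  <body>
    <p>Destination: <input type="text" id="city"/></p>
  </body>
</html>
```

The same field can thus be filled either by typing into the text input or by answering the spoken prompt, which is precisely the multimodal access MIMA is after.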
Intelligent interfaces that adapt proactively to the needs of the user are another area being researched by the HCI program. For instance, these interfaces suggest logical navigation steps via speech or text messages on the screen depending on the users’ requirements.
Proactive interfaces of this type include knowledge-based cognitive assistants. Most traditional user interfaces contain far more information than individual users actually need, and users often have a hard time navigating through all the detail. Cognitive assistants reduce this overload. They know what a user needs to do in the software and automatically provide only the information required for the current step in the business process. They can also make useful suggestions, remind the user about missing information, and explain which process steps are required.
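The filtering behavior described above can be sketched very simply: given the current process step, the assistant exposes only that step's fields and flags the required entries still missing. The process model and step names here are invented for illustration:

```python
# Hedged sketch of the cognitive-assistant idea: show only the fields
# the current business-process step needs, and remind the user which
# required entries are still missing. Step names are hypothetical.
PROCESS = {
    "create_order": ["customer", "item", "quantity"],
    "confirm_payment": ["payment_method"],
}

def assist(step: str, entered: dict):
    required = PROCESS[step]
    # Hide every field that is irrelevant to the current step.
    visible = {f: entered.get(f) for f in required}
    missing = [f for f in required if f not in entered]
    return visible, missing

visible, missing = assist("create_order",
                          {"customer": "ACME", "item": "lamp"})
print(missing)  # ['quantity']
```

A real assistant would additionally rank suggestions and explain the remaining steps, but the reduction of the interface to the current step's needs is the core of the overload argument.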
Whether by means of cognitive assistants, social agents, or voice-based data input, innovative approaches such as these make corporate solutions easier and more accessible for all users. The software gains an extra likeability factor, and companies can realize the full benefit of their investments in SAP applications. Further, these approaches provide viable answers to the increasing number of statutory requirements for “barrier-free” access to software (i.e., enabling access for the disabled).