Increasing Quality with Unicode

February 3, 2003 by admin

Unicode is a state-of-the-art technology that can store and process the characters of all scripts worldwide. From Release 6.10 onwards, the SAP Web Application Server (SAP Web AS) supports Unicode-enabled applications. The first Unicode-enabled SAP applications are now available in the form of SAP R/3 Enterprise, mySAP Supply Chain Management (mySAP SCM), mySAP Business Information Warehouse (SAP BW) 3.1 and mySAP Customer Relationship Management (mySAP CRM).
But what do customers have to do if they want to make their programs Unicode-enabled, even if they have decided on a non-Unicode system? First, every part of a program that relies, explicitly or implicitly, on the assumption that one character equals one byte must be adapted. This assumption is commonly found in string processing, in assignments between structures and in offset/length access to structures. In addition, the target code page must be declared explicitly when files are written and read. These adaptations can also be carried out in a non-Unicode system, since non-Unicode and Unicode applications run on the same code base and have essentially the same semantics.
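
The one-character-equals-one-byte assumption can be made concrete with a short sketch. It is written in Python rather than ABAP, purely as an illustration. The sample text, the function names and the choice of UTF-8 and UTF-16 are assumptions made for this example and do not come from SAP.

```python
import os
import tempfile

# Illustrative sample text (an assumption): "Gruesse Tokyo" written with
# two umlaut/sharp-s characters and two kanji, 8 characters in total.
TEXT = "Gr\u00fc\u00dfe \u6771\u4eac"


def byte_lengths(text: str) -> dict:
    """Return how many bytes the same text occupies in two encodings."""
    return {
        "characters": len(text),
        "utf-8": len(text.encode("utf-8")),
        "utf-16-le": len(text.encode("utf-16-le")),
    }


def naive_prefix(text: str, length: int, encoding: str) -> str:
    """Offset/length access on raw bytes, as if 1 character were 1 byte."""
    raw = text.encode(encoding)
    # Cutting at a byte offset can split a multi-byte character in half;
    # errors="replace" makes the damage visible instead of raising.
    return raw[:length].decode(encoding, errors="replace")


def file_round_trip(text: str, encoding: str) -> str:
    """Write and read a file with the code page declared explicitly."""
    path = os.path.join(tempfile.mkdtemp(), "unicode_demo.txt")
    with open(path, "w", encoding=encoding) as handle:
        handle.write(text)
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


if __name__ == "__main__":
    print(byte_lengths(TEXT))
    print(repr(naive_prefix(TEXT, 3, "utf-8")), "versus", repr(TEXT[:3]))
    print(file_round_trip(TEXT, "utf-8") == TEXT)
```

The character count stays the same while the byte count depends on the encoding, and a byte-based prefix of length 3 cuts the third character in half. This is the kind of offset/length access that has to be adapted.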

Identifying Problem Areas

Unicode Checks active

To help developers find program parts that do not support Unicode, SAP has introduced a new program attribute, “Unicode checks active”. It activates the more restrictive Unicode tests, which expose every statement that is problematic in Unicode as a syntax or runtime error.

UCCHECK

While these Unicode tests can be carried out program by program, things are made easier by using the UCCHECK transaction. This shows which syntax errors would occur for a freely definable set of programs after the Unicode tests are activated.

Making Easy Work of Investigating the Problem

There are many new rules in the Unicode environment. These are very detailed and complex in order to ensure the greatest possible compatibility with existing programs. However, a developer who wants to check an ABAP program for syntax errors does not need to learn a single one of these rules by heart. The cause of an error can be located quickly at any time thanks to the integration of the tests in the syntax check and runtime environment, the thorough documentation in the F1 help, and direct navigation from the error location in the editor to the keyword documentation.
In addition to the F1 help for the individual keywords, the system also offers an overview of Unicode. This can be found either through transaction UCCHECK -> “Doc for ABAP and Unicode” or directly in the ABAP keyword documentation (transaction ABAPHELP -> keyword “Unicode”).
Customers who already have an SAP system based on SAP Web Application Server 6.10 or 6.20 can, for example, follow the conversion of ABAP programs to Unicode using the demo programs TECHED_UNICODE_EXERCISE/SOLUTION*. The programs are available through the support package (see note 583546 for details).

Hidden Data Types as an Error Source

Apart from reminders about the technical distinction between character and byte processing, most of the errors identified by the new program tests stem from program parts in which the true data types are hidden. In the old ABAP syntax this was very easy to do. It happens, for example, whenever structured data is stored by means of an implicit cast in a large character-like field that serves as a “container”. There is a danger that the binary data hidden in the container will be destroyed during a string processing operation (e.g. a TO UPPER CASE conversion) or during a code page conversion in external communication. Even with purely character-like structures, hiding the field boundaries causes the implicit layout to be lost during external communication if the code page conversion changes the length depending on the data. This is the case for communication between Unicode and non-Unicode systems that involves Asian characters. Hiding data types in this way can also create problems if the database is converted with a code page change, for example from EBCDIC to ASCII or from ASCII to Unicode.
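
Both failure modes can be reproduced in a few lines. The sketch below is in Python rather than ABAP and is meant only as an illustration. The record layout, the function names and the use of Latin-1 and Shift JIS as example code pages are assumptions, not part of any SAP interface.

```python
import struct

# Hypothetical record layout (an assumption): 4-byte name + 2-byte integer,
# little-endian, no padding.
RECORD_FORMAT = "<4sH"


def upper_case_container(name: bytes, number: int) -> tuple:
    """Hide a binary record in a text container, upper-case it, unpack it."""
    raw = struct.pack(RECORD_FORMAT, name, number)
    container = raw.decode("latin-1")   # implicit cast to "characters"
    container = container.upper()       # harmless-looking string operation
    return struct.unpack(RECORD_FORMAT, container.encode("latin-1"))


def split_after_conversion(fields, width, source_encoding):
    """Convert a fixed-width byte record to Unicode, then slice at old offsets."""
    # Each field is padded with spaces to `width` bytes in the source code page.
    raw = b"".join(f.encode(source_encoding).ljust(width) for f in fields)
    text = raw.decode(source_encoding)
    # The old byte offsets are reused as character offsets, which is wrong
    # as soon as a character occupied more than one byte in the source.
    return [text[i * width:(i + 1) * width] for i in range(len(fields))]


if __name__ == "__main__":
    # The integer 0x6162 happens to consist of the bytes for "b" and "a".
    print(upper_case_container(b"abcd", 0x6162))
    # Two kanji occupy 4 bytes in Shift JIS but only 2 characters in Unicode.
    print(split_after_conversion(["\u6771\u4eac", "AB"], 4, "shift_jis"))
```

In the first call the upper-case conversion silently changes the hidden integer from 0x6162 to 0x4142. In the second, the two-kanji field shrinks from four bytes to two characters, so slicing at the old offsets moves "AB" into the first field and leaves the second one blank.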

New Unicode Tests Improve Software Quality

Faulty programming

The advantage of the new tests is immediately apparent when you look at the newly reported errors and the corresponding program parts: a considerable number of these program parts are correct according to the old syntax but semantically incorrect. The tests thus identify many real errors and questionable constructions.
More errors occur in ABAP programs from old SAP releases than in newer ones. This is primarily because the programming language has developed over the course of time and – as with the type concept for example – has improved considerably. Also, many new language constructs have been added that make it possible to write more reliable and robust programs at a more abstract level than before.
Since the restrictive Unicode tests can improve software quality, it also makes sense for customers who are not immediately interested in Unicode to run these tests on their programs. Anyone can try this out by logging on to an SAP Web AS-based system, running UCCHECK on their own programs and taking a detailed look at the messages that appear. There will almost certainly be a few valuable surprises.

Using the Correct Data Type

The restrictive Unicode tests also apply at runtime. Because ABAP allows untyped variables, statements that use them cannot be checked statically by the syntax check. UCCHECK can, however, display the program parts that cannot be checked statically. It is advisable to type variables properly wherever possible and, if necessary, to work with the new generic ABAP data types such as CSEQUENCE or CLIKE.
However, in practice it is often not possible to type all variables. There are also constructs that can only be checked at runtime in a Unicode system. For example, in ABAP lists (which have already been replaced by the ABAP List Viewer in many instances) the assumption that one character is also one column wide is no longer correct. This is why runtime tests must also be carried out after the Unicode tests have been activated. These tests should be extensive because, unlike the development of additional functionality, switching to the more restrictive syntax and runtime tests affects all programs. It is therefore not enough to concentrate on just a few application scenarios. From Release 6.10 onwards, the specially developed ABAP Coverage Analyzer (transaction SCOV) is available. It measures test coverage and thereby supports these runtime tests, and it is recommended for use when switching over to Unicode.
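
The point about list columns can be illustrated outside ABAP as well. The following Python sketch estimates how many fixed-width columns a string occupies, using the East Asian Width property from the Unicode Character Database. The function names are made up for this example, and the two-columns-for-wide-characters rule is a simplification that ignores combining characters and ambiguous-width characters.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate the number of fixed-width columns a string occupies."""
    width = 0
    for char in text:
        # "W" (wide) and "F" (fullwidth) characters take two columns.
        width += 2 if unicodedata.east_asian_width(char) in ("W", "F") else 1
    return width


def pad_to_columns(text: str, columns: int) -> str:
    """Pad with spaces based on display width, not on character count."""
    return text + " " * max(0, columns - display_width(text))


if __name__ == "__main__":
    for sample in ("ABAP", "\u6771\u4eac"):
        print(repr(sample), len(sample), display_width(sample))
```

A list layout that pads by character count would give the two-kanji string two columns too many of padding, which is exactly the kind of misalignment that only shows up in a runtime test.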

Program Switchover is Part of a Complete Project

If the entire system is converted to Unicode, changing over the ABAP programs is just one part of the overall project. The database and any RFC programs that users have written themselves need to be switched over too. These two tasks are covered in a second article.

Further information on switching ABAP programs over to Unicode, as well as guidelines and overviews, can be found on SAPNet under the alias Unicode@SAP in the “Media Library”.

Christian Hansen
