In 2001, more than a million developers around the world programmed in ABAP or in ABAP Objects, as the language has been known since its object-oriented enhancement under SAP Web Application Server (SAP Web AS), and that figure is still growing today. With SAP Web AS 6.30, ABAP Objects became a component of the SAP NetWeaver platform, SAP’s application and integration platform. Next to Java, ABAP remains the most important programming language for SAP developers.
In their daily work, many SAP developers must repeatedly select data from database tables, enrich that data with additional data, perform calculations, format the data, and display the data on the user interface. To support these kinds of tasks, SAP developed the language construct of internal tables for ABAP. ABAP provides the following types of internal tables:
- Standard tables
- Sorted tables
- Hash tables
Internal tables are best compared with associative, dynamic arrays in other programming languages like PHP or Perl. Internal tables help store data that might have been selected from one or more database tables. In this manner, large quantities of data can be stored and processed by row in structured form and in working memory. The individual components of a row correspond to the columns or fields of the internal table. Data can be specifically selected with index or key operations and addressed with field names.
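A minimal sketch of how the three table types might be declared and filled. The report name, the row type ty_document, and all field names and values are hypothetical and chosen only for illustration; note that ABAP itself uses the keyword HASHED TABLE for what this article calls a hash table.

```abap
REPORT zdemo_itab_types.

* Hypothetical row type: one document that belongs to a vendor.
TYPES: BEGIN OF ty_document,
         vendor_no TYPE c LENGTH 10,
         doc_no    TYPE c LENGTH 10,
         amount    TYPE p LENGTH 8 DECIMALS 2,
       END OF ty_document.

* Standard table: index access is possible, the key is non-unique.
DATA lt_standard TYPE STANDARD TABLE OF ty_document
                 WITH NON-UNIQUE KEY vendor_no doc_no.

* Sorted table: rows are always kept in key order.
DATA lt_sorted TYPE SORTED TABLE OF ty_document
               WITH UNIQUE KEY vendor_no doc_no.

* Hashed table: access by key only, the key must be unique.
DATA lt_hashed TYPE HASHED TABLE OF ty_document
               WITH UNIQUE KEY vendor_no doc_no.

* Work area with the same structure as one table row.
DATA ls_document TYPE ty_document.

* Components of a row are addressed by field name.
ls_document-vendor_no = '0000001000'.
ls_document-doc_no    = '0000000001'.
ls_document-amount    = '150.00'.

* INSERT ... INTO TABLE is available for all three table types.
INSERT ls_document INTO TABLE lt_standard.
INSERT ls_document INTO TABLE lt_sorted.
INSERT ls_document INTO TABLE lt_hashed.

* Index access (possible for standard and sorted tables).
READ TABLE lt_standard INTO ls_document INDEX 1.
IF sy-subrc = 0.
  WRITE: / ls_document-vendor_no, ls_document-doc_no, ls_document-amount.
ENDIF.
```

Memory for the rows is allocated by the runtime as entries are inserted, which is the dynamic behavior described above.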
Internal tables appear in almost every ABAP program. They not only help process large volumes of data, but also help set up complex structures. They also relieve programmers from routine tasks like dynamic memory management. With some justification, therefore, internal tables can be regarded as one of the central language constructs of ABAP. As the programming language of SAP solutions, ABAP is now more than 30 years old. But like the language itself, discussions about the use of the “correct” type of table are still very current, as seen in remarks on the ABAPforum.
So which table type is most appropriate for a given task? Does one type of table work in all cases? Of course, it’s impossible to offer a general answer to these questions. The answer depends on too many factors, such as the quantity of data, the key fields, and the required operations on the table (reading and sorting). Still, a few hints that make no claim to completeness can often help.
Standard tables offer options that other ABAP table types do not possess. Access via a remote function call (RFC) is a good example. Standard tables maintain a logical index, so the entries are kept in a logical sequence. Individual table rows are accessed with the index or with the key. The key of a standard table is always non-unique. A great deal of flexibility characterizes standard tables, which is surely one reason why they are the most frequently used table type in ABAP programs.
For example, if a program needs to search with various keys, first with the document number and later with the reference document number, standard tables are ideal. But standard tables offer only inefficient read access by key because they search through the table sequentially. Response time increases linearly with the number of table entries. Developers who want more efficient key access must first sort the table according to the key and then use the BINARY SEARCH addition of the READ TABLE statement. The table is then searched for the corresponding entry with a binary search.
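The two read variants can be sketched as follows. This is an illustration under assumptions, not a reference implementation: the report name, the row type, and the sample values are hypothetical.

```abap
REPORT zdemo_itab_binary_search.

* Hypothetical row type with two possible search fields.
TYPES: BEGIN OF ty_document,
         doc_no     TYPE c LENGTH 10,
         ref_doc_no TYPE c LENGTH 10,
       END OF ty_document.

DATA: lt_documents TYPE STANDARD TABLE OF ty_document
                   WITH NON-UNIQUE KEY doc_no,
      ls_document  TYPE ty_document.

* APPEND adds rows at the end, here deliberately out of order.
ls_document-doc_no     = '0000000002'.
ls_document-ref_doc_no = '0000000900'.
APPEND ls_document TO lt_documents.

ls_document-doc_no     = '0000000001'.
ls_document-ref_doc_no = '0000000800'.
APPEND ls_document TO lt_documents.

* Sequential search: any field can be used, no sorting is needed,
* but the runtime grows linearly with the number of rows.
READ TABLE lt_documents INTO ls_document
     WITH KEY ref_doc_no = '0000000800'.

* Binary search: the table must be sorted by the search field first.
SORT lt_documents BY doc_no.
READ TABLE lt_documents INTO ls_document
     WITH KEY doc_no = '0000000001' BINARY SEARCH.
IF sy-subrc = 0.
  WRITE: / 'Found at index', sy-tabix.
ENDIF.
```

If the table is not sorted by the field used in the search, BINARY SEARCH can miss rows that exist, so the SORT and the READ have to agree on the same fields.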
A sorted table keeps its entries ordered by the table key at all times. The key consists of one or more fields (a document number, a vendor number, or something similar) and is declared as either unique or non-unique. With a unique key, the table cannot store duplicates (entries with the same key). This table type offers efficient read access whose cost grows only logarithmically with the size of the table, because key access to a sorted table uses a binary search. Insert operations in sorted tables take somewhat longer because the sort sequence must be maintained.
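A short sketch of this behavior, assuming a sorted table with a unique key. The report name, the row type ty_vendor, and the sample data are hypothetical.

```abap
REPORT zdemo_itab_sorted.

TYPES: BEGIN OF ty_vendor,
         vendor_no TYPE c LENGTH 10,
         name      TYPE c LENGTH 30,
       END OF ty_vendor.

DATA: lt_vendors TYPE SORTED TABLE OF ty_vendor
                 WITH UNIQUE KEY vendor_no,
      ls_vendor  TYPE ty_vendor.

* Rows are inserted out of order; the table keeps them sorted by key.
ls_vendor-vendor_no = '0000002000'.
ls_vendor-name      = 'Vendor B'.
INSERT ls_vendor INTO TABLE lt_vendors.

ls_vendor-vendor_no = '0000001000'.
ls_vendor-name      = 'Vendor A'.
INSERT ls_vendor INTO TABLE lt_vendors.

* A second row with the same unique key is not inserted.
INSERT ls_vendor INTO TABLE lt_vendors.
IF sy-subrc <> 0.
  WRITE: / 'Duplicate key rejected'.
ENDIF.

* Key access: the runtime uses a binary search, no SORT is required.
READ TABLE lt_vendors INTO ls_vendor
     WITH TABLE KEY vendor_no = '0000002000'.
IF sy-subrc = 0.
  WRITE: / ls_vendor-vendor_no, ls_vendor-name.
ENDIF.
```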
Sorted tables offer optimization for nested query loops:
LOOP AT TAB1.
  LOOP AT TAB2 WHERE KEY = TAB1-KEY.
    " do something with the matching row of TAB2
  ENDLOOP.
ENDLOOP.
The example involves two sorted tables; the inner loop does not go through all the entries in the second table. It processes only the rows that meet the key condition. For example, table 1 contains a list of vendors. Table 2 contains a list of documents (like invoices) that belong to these vendors. The key of the first table would be the vendor number; the key of table 2 would be the vendor number followed by the document number. In this case, the vendor number is the leading part of the key of table 2, so the WHERE condition specifies a partial key. The loop through table 2 ends when the key no longer matches, that is, when documents of another vendor appear.
For comparison, note that using a standard table would mean always having to go through all the entries in table 2. This situation would require you to optimize the inner loop with a break condition; otherwise the operation would have to read all the entries. If the situation involves two tables, one with 100 entries and one with 1,000 entries, a standard table would require 100 × 1,000 = 100,000 passes through the inner loop. Depending on the key condition, using sorted tables drastically reduces that number.
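The vendor and document example can be sketched as a complete program. All names and values are hypothetical; the point of the sketch is the key layout, with the vendor number as the first key field of the document table so that the WHERE condition can use the sort order.

```abap
REPORT zdemo_itab_nested_loop.

TYPES: BEGIN OF ty_vendor,
         vendor_no TYPE c LENGTH 10,
       END OF ty_vendor,
       BEGIN OF ty_document,
         vendor_no TYPE c LENGTH 10,
         doc_no    TYPE c LENGTH 10,
       END OF ty_document.

DATA: lt_vendors   TYPE SORTED TABLE OF ty_vendor
                   WITH UNIQUE KEY vendor_no,
      lt_documents TYPE SORTED TABLE OF ty_document
                   WITH UNIQUE KEY vendor_no doc_no,
      ls_vendor    TYPE ty_vendor,
      ls_document  TYPE ty_document.

* Sample data: two vendors, three documents.
ls_vendor-vendor_no = '0000001000'.
INSERT ls_vendor INTO TABLE lt_vendors.
ls_vendor-vendor_no = '0000002000'.
INSERT ls_vendor INTO TABLE lt_vendors.

ls_document-vendor_no = '0000001000'.
ls_document-doc_no    = '0000000001'.
INSERT ls_document INTO TABLE lt_documents.
ls_document-doc_no    = '0000000002'.
INSERT ls_document INTO TABLE lt_documents.
ls_document-vendor_no = '0000002000'.
ls_document-doc_no    = '0000000003'.
INSERT ls_document INTO TABLE lt_documents.

* Outer loop over the vendors, inner loop only over the documents
* whose leading key field matches the current vendor.
LOOP AT lt_vendors INTO ls_vendor.
  LOOP AT lt_documents INTO ls_document
       WHERE vendor_no = ls_vendor-vendor_no.
    WRITE: / ls_vendor-vendor_no, ls_document-doc_no.
  ENDLOOP.
ENDLOOP.
```

If the WHERE condition named only doc_no, which is not the leading key field, the inner loop could no longer use the sort order and would have to read every row.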
In general, hash tables offer better read performance than standard tables. They should be used to access large quantities of data by key. The number of entries plays no role in access time because each access uses the complete unique key, one that consists of a document and partner number, for example. Access occurs with a hash algorithm. These algorithms compute an output value of fixed length from input values of any length, and that value determines where the entry is found, which keeps the effort for each access roughly constant.
The limitation? Access is efficient only when you specify the complete table key. Otherwise performance is similar to that of standard tables: it degrades linearly with the size of the table. Entries in a hash table cannot be addressed by index, although the table can be sorted. Hash tables are always a good choice when the data has an unambiguous key and you can guarantee that the key remains unambiguous. This approach is a very good way to mirror a database table in a program.
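A minimal sketch of the two kinds of access, assuming a key made up of a document number and a partner number as mentioned above. The report name, the row type ty_item, and the values are hypothetical.

```abap
REPORT zdemo_itab_hashed.

TYPES: BEGIN OF ty_item,
         doc_no     TYPE c LENGTH 10,
         partner_no TYPE c LENGTH 10,
         amount     TYPE p LENGTH 8 DECIMALS 2,
       END OF ty_item.

DATA: lt_items TYPE HASHED TABLE OF ty_item
               WITH UNIQUE KEY doc_no partner_no,
      ls_item  TYPE ty_item.

ls_item-doc_no     = '0000000001'.
ls_item-partner_no = '0000000500'.
ls_item-amount     = '99.50'.
INSERT ls_item INTO TABLE lt_items.

* Complete table key: access through the hash algorithm.
READ TABLE lt_items INTO ls_item
     WITH TABLE KEY doc_no     = '0000000001'
                    partner_no = '0000000500'.
IF sy-subrc = 0.
  WRITE: / 'Found with complete key:', ls_item-amount.
ENDIF.

* Only part of the key: the table is searched sequentially,
* so the runtime grows with the number of rows.
READ TABLE lt_items INTO ls_item
     WITH KEY partner_no = '0000000500'.
IF sy-subrc = 0.
  WRITE: / 'Found with partial key:', ls_item-doc_no.
ENDIF.
```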
The bottom line is that developers who stubbornly use only standard tables will soon find that their programs reach the limits of performance in some circumstances. To avoid poor performance, check each individual case to see which table type makes the most sense for the task at hand.