Special note: the AST/400 software has been verified to be Y2K compliant.
AST/400 is an automated software translation system designed to convert existing IBM AS/400 RPG, COBOL, CL, and DDS programs running under OS/400 to UNIX C and, in the case of COBOL programs, MicroFocus UNIX COBOL. It also provides a runtime support library for the translated programs that supports OS/400 features not directly mappable to UNIX.
No modifications or hand fixup are required in any of the generated code, and full support is provided for such features as subfiles, data areas, data and message queues, message files, batch jobs, overrides, library lists, etc.
A relational database such as Informix, Oracle, or Sybase is used on the UNIX system to replace the AS/400 native filesystem and database.
Any modern UNIX system can be used as the host, such as the HP 9000 family or the IBM RS/6000 family.
To date, several thousand programs and datafiles have been converted and are fully functional.
This document covers a variety of topics dealing with the technology, methodology, limitations, and features of the AST/400 translation system.
DDS, or Data Description Specifications, are central to the AS/400's application environment. These scripts specify the format of files and the records within these files, independently of any program's use of the files. Information in the DDS is used by all other parts of the translation system.
There are several variants of DDS on the AS/400, the ones of primary importance being Physical, Logical, Display, and Print DDS. Although basically similar, differences in detail exist. For instance, although all types of DDS describe record layouts, the operations that can be performed on a datafile differ from those that can be performed on a printer.
A description of the common features of DDS translation will be given first, followed by descriptions for each of the major DDS types.
The DDS translator is responsible for two broad areas. First, AS/400 DDS must be parsed and information from it extracted and placed in a repository for later use by the DDS translator itself, and for use by other areas of the AST/400 translation system. This repository is, among other things, a collection of information about the files described in the DDS, such as the record formats used by a particular file, the fields within a record format, etc.
Second, the specific actions needed for the particular variety of DDS being translated must be performed. This might be the generation of relational database schemas for physical files, or the generation of display routines for display files. These areas will be described in the individual descriptions, below.
The DDS translators are actually implemented as two separate programs, although in use a shell script hides this fact. The first program, ddstrans, performs lexical and syntactic analysis of the input DDS and produces an intermediate form which is the same for all varieties of input DDS. Since DDS consists of mixed fixed-format and free format items, using standard lex is somewhat difficult. This is resolved by having a lexical preprocessing phase that converts the fixed-format information into a form more readily handled by lex. A yacc parser then performs the syntactic analysis and emits the standard intermediate representation. No semantic analysis is performed during this initial phase.
The second program, ddscomp, is responsible for reading the intermediate form and performing the semantic analysis and code generation required. This program is written in ptml, a language designed for implementing code generators.
Ptml is a C-like language that has extensions for rule-based pattern matching of lists and sublists, and is particularly suited for parse tree manipulation. Its use allows new features or changes to be rapidly tracked, without having to rewrite the basic translator. The ptml compiler itself generates standard C code, so there is no necessity for a ptml compiler on any system other than our own development system.
The repository that is maintained as a part of the DDS translation process is a generalized object-like storage system that is used to contain information not just about the DDS, but about any other aspect of the applications being translated that is of interest to either the rest of the translation system or to programmers or documentation writers. For example, information can be extracted from the repository and used to load a CASE tool to simplify reengineering efforts.
AS/400 physical files are mapped to relational database tables. Processing of physical file DDS produces three results. First, the repository is updated with information from the DDS. Second, SQL scripts which define table schemas and indexes for relational database tables that implement the files are produced. Third, C routines are created to handle runtime access to the tables.
The table schemas produced correspond closely to the record layout specified in the DDS. Where possible, datatypes are mapped to exact representations under the relational database (character data to char(), 4 byte binary to integer, etc.). Other datatypes are mapped to close equivalents and conversions are handled at runtime (see below). Extra fields may be added to the schema as needed to allow non-relational modes of access (such as read-previous).
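The datatype mapping described above can be pictured with a small sketch. The function name dds_to_sql_type and the exact choice of SQL types are assumptions made for illustration; only the character-to-char() and 4-byte-binary-to-integer mappings come from the text, and the actual translator may map other types differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render an SQL column type for one DDS field.
 * dds_type is the DDS data type code: 'A' character, 'B' binary,
 * 'P' packed decimal, 'S' zoned decimal.
 * Returns 0 on success, -1 if this sketch does not handle the type. */
int dds_to_sql_type(char dds_type, int length, int decimals,
                    char *out, size_t outsz)
{
    assert(out != NULL && outsz > 0);
    switch (dds_type) {
    case 'A':
        snprintf(out, outsz, "char(%d)", length);
        return 0;
    case 'B':
        /* Assumption: up to 4 digits are held in 2 bytes,
         * 5 to 9 digits in 4 bytes. */
        snprintf(out, outsz, "%s", length <= 4 ? "smallint" : "integer");
        return 0;
    case 'P':
    case 'S':
        /* Assumption: decimal types map to a close SQL equivalent. */
        snprintf(out, outsz, "decimal(%d,%d)", length, decimals);
        return 0;
    default:
        out[0] = '\0';
        return -1;
    }
}
```

For a 7-digit packed field with 2 decimal places, this sketch would yield decimal(7,2); types it does not know are reported to the caller rather than guessed.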
Even though the DDS specifications are static on the AS/400, the actual file (table) accessed can be changed at runtime using the AS/400 override feature. Because of this, the runtime access routines use dynamic SQL commands to access each table. The routines fill in the table name at runtime, so that the specified table is used. When a file is opened, override processing is performed to determine the actual table desired. At this point, information about the table, such as its key fields, is retrieved from the repository.
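As a rough illustration of filling in the table name at runtime, the sketch below builds the text of a statement that positions at the first row whose key is greater than or equal to a given value, as SETLL does. The function name build_setll_sql, the statement shape, and the use of a '?' parameter marker are assumptions; preparing and executing the statement is database-specific and is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build dynamic SQL text for a table whose name is
 * only known after override processing. The key value itself is left as
 * a parameter marker to be bound when the statement is executed.
 * Returns 0 on success, -1 if the buffer is too small. */
int build_setll_sql(char *out, size_t outsz,
                    const char *table, const char *key_col)
{
    assert(out != NULL && table != NULL && key_col != NULL);
    int n = snprintf(out, outsz,
                     "SELECT * FROM %s WHERE %s >= ? ORDER BY %s",
                     table, key_col, key_col);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Because the table name is an ordinary string argument, the same generated routine can serve the compiled-in file or whatever table an override selects.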
For each physical file, open(), close(), setll(), setgt(), chain(), read(), reade(), readp(), write(), update(), and delete() functions are created. Table access is generally handled using database cursors, as appropriate. As mentioned above, data is stored in the relational database using datatypes that match as closely as possible the AS/400 datatypes, so that runtime access requires minimal conversion overhead. For datatypes that are not stored in exact form, data conversions are performed when records (rows) are fetched or written back to the database. The data buffer returned to a translated RPG or COBOL program contains the exact data representations that the program sees on the AS/400.
AS/400 logical files are mapped to relational database views. As with physical files, processing of logical file DDS produces three results. First, the repository is updated with information from the DDS. Second, SQL scripts which define views and indexes for relational database tables that implement the files are produced. Third, C routines are created to handle access to the views.
The view definitions produced correspond closely to the record layout specified in the DDS. Single format logicals are mapped to simple views and join logicals are mapped to relational joins in a straightforward manner. Multiple record formats are mapped to separate views which are then tied together by an access path table through which the logical file is traversed as one entity.
As with physical files, the file (view) accessed can be changed at runtime using the AS/400 override feature. Because of this, the runtime access routines use dynamic SQL commands to access each view. The routines fill in the view name at runtime, so that the specified view or table is used. When a file is opened, override processing is performed to determine the actual view or table desired. At this point, information about the view, such as its key fields, is retrieved from the repository.
For each logical file and for each record within the file, open(), close(), setll(), setgt(), chain(), read(), reade(), readp(), write(), update(), and delete() functions are created. The controlling program calls the routine for the logical file and the file routine determines which record is being accessed and calls the routine for that record. Access is generally handled using database cursors, as appropriate. As with physical files, the data buffer returned to a translated RPG or COBOL program contains the exact data representations that the program sees on the AS/400.
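The two-level call structure described above, a file routine that selects and calls a record routine, might look like the following sketch. All names here (record_fn, logical_read, order_file, and the ORDHDR and ORDDTL formats) are hypothetical, and the record routines are stubs that only report which format was reached.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical signature for a per-record read routine. */
typedef int (*record_fn)(char *buf, size_t bufsz);

struct record_format {
    const char *name;   /* record format name from the DDS */
    record_fn   read;   /* routine generated for that format */
};

struct logical_file {
    const struct record_format *formats;
    size_t nformats;
};

/* Stub record routines standing in for generated database access code. */
static int read_ordhdr(char *buf, size_t n) { snprintf(buf, n, "ORDHDR"); return 0; }
static int read_orddtl(char *buf, size_t n) { snprintf(buf, n, "ORDDTL"); return 0; }

static const struct record_format order_formats[] = {
    { "ORDHDR", read_ordhdr },
    { "ORDDTL", read_orddtl },
};

const struct logical_file order_file = { order_formats, 2 };

/* File-level routine: find the record format and call its routine.
 * Returns -1 if the format does not belong to this file. */
int logical_read(const struct logical_file *lf, const char *format,
                 char *buf, size_t bufsz)
{
    assert(lf != NULL && format != NULL && buf != NULL);
    for (size_t i = 0; i < lf->nformats; i++)
        if (strcmp(lf->formats[i].name, format) == 0)
            return lf->formats[i].read(buf, bufsz);
    return -1;
}
```

A table of function pointers keeps the calling program unaware of how many record formats the logical file contains.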
Processing of display DDS produces two results. First, the repository is updated with information from the DDS. Second, C routines are created to actually implement the display operations called for.
Even though the DDS specifications are static on the AS/400, many of the characteristics, such as video attributes, can be controlled at runtime via indicators. Because of this, the translator produces C functions that are passed the current indicator settings at runtime, so that the proper operations can be performed.
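To make the role of indicators concrete, here is a minimal sketch of a routine that derives a field's video attributes from the indicator settings passed in at runtime. The field, the indicator numbers 30 and 31, the attribute flags, and the function name are all invented for illustration; the code actually generated by the translator is not shown in this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute flags understood by a lower level display package. */
enum video_attr {
    ATTR_NONE      = 0,
    ATTR_UNDERLINE = 1,
    ATTR_BLINK     = 2,
    ATTR_REVERSE   = 4
};

/* Indicators are numbered 01 to 99; index 0 is unused in this sketch. */
#define NUM_INDICATORS 100

/* Attributes for one field whose DDS is assumed to say:
 *        DSPATR(UL)      always underline
 *   30   DSPATR(RI)      reverse image when indicator 30 is on
 *  N31   DSPATR(BL)      blink when indicator 31 is off */
unsigned custname_attributes(const bool ind[NUM_INDICATORS])
{
    assert(ind != NULL);
    unsigned attr = ATTR_UNDERLINE;
    if (ind[30])
        attr |= ATTR_REVERSE;
    if (!ind[31])
        attr |= ATTR_BLINK;
    return attr;
}
```

The DDS conditions are fixed at translation time, while the indicator array supplies the values that vary from one call to the next.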
For each display file, an open() and a close() function are created. Then, for each record within the file, read(), readc(), write(), and update() functions are created to support the operations that can be performed from RPG or COBOL programs. The generated functions vary in content, since the operations performed on subfiles differ from the operations performed on normal display records.
These generated routines act as an interface to a lower level display package. The lower level package actually handles the details of dealing with the display device. This lower level can be changed without having to retranslate the DDS, so the display device can be of any form suitable, such as a character terminal or an X display device.
One special feature of display files is subfiles. A subfile is effectively a hidden datafile that is associated with an area on the display. Data can be written to or read from the hidden datafile by the calling program, and can be modified by the user via the display. The number of records present can be much larger than the display area, and scrolling through the entire set can be done by the user, or under program control, or both. As a special case, a subfile can be associated with a message queue, and can show the messages that have been posted to that queue. The generated functions and the runtime library fully support all subfile features including page up, page down, fold, drop, assume, keep, shared, and message subfile operations.
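The relationship between a subfile's hidden records and the smaller display area can be sketched as a window that moves over the record set. The structure and function names below are hypothetical, and the sketch covers only page up and page down, not fold, drop, or message subfiles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view state: which slice of the subfile is on screen. */
struct subfile_view {
    int nrecords;   /* records currently held in the subfile */
    int page_size;  /* records that fit in the display area */
    int top;        /* zero-based index of the first visible record */
};

/* Move forward one page, staying put if there is no further page. */
void sfl_page_down(struct subfile_view *v)
{
    assert(v != NULL && v->page_size > 0);
    if (v->top + v->page_size < v->nrecords)
        v->top += v->page_size;
}

/* Move back one page, stopping at the first record. */
void sfl_page_up(struct subfile_view *v)
{
    assert(v != NULL && v->page_size > 0);
    v->top -= v->page_size;
    if (v->top < 0)
        v->top = 0;
}
```

With 25 records and a 10-line display area, paging down moves the top of the window from record 0 to 10 to 20 and then stays there.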
Full support is included for WINDOW and its related commands, COLOR if the user's display device is color-capable, and any other video attributes such as underline and blink that the display device can support. Of course, if the selected display cannot support a particular attribute, then that attribute cannot be provided.
The characteristics of the display devices used are not fixed; each user can have a different type of display and new display devices can be added without any retranslation, via the UNIX terminfo database.
The minimal display device suggested is a VT100 compatible terminal, or a VT100 emulator running on a PC.
Many other facilities are supported, such as help documents, as well as additional functionality to integrate into the UNIX environment more smoothly.
Processing of print DDS produces two results, just as for display DDS. First, the repository is updated with information from the DDS. Second, C routines are again created to perform the actual print operations.
Once again, for each print file an open() and close() function are created. Then, for each record read(), readc(), write(), and update() functions are created. Only the write() function actually does anything; the others are provided so that the interface from RPG or COBOL is uniform. During execution, output from the print routines is stored in a temporary file which is then passed to the print queue manager.
The program rpgtrans translates RPG into C. Each RPG program is translated into a single C program. The repository is consulted during this process, especially for definitions of externally described files. (RPG/400 allows mention of filenames without the necessity to include record layouts directly in programs.) After translation and compilation, the C code is linked with the runtime support library, which provides such facilities as packed-decimal arithmetic and datatype conversions, and dynamically linked with routines generated by AST/400 for the particular data, display, and print files used by the program being translated. This results in a UNIX executable.
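Packed-decimal support is one of the runtime library facilities mentioned above. As a minimal sketch of what such a conversion involves, the function below decodes a packed-decimal field into a C integer. The name packed_to_ll is an assumption and is not taken from the runtime library; the storage layout (two digits per byte, with the sign in the low nibble of the last byte) is the standard packed-decimal format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decode a packed-decimal field.
 * Each byte holds two decimal digits, except the last byte, which holds
 * one digit in its high nibble and the sign in its low nibble
 * (0xD or 0xB negative, anything else treated as positive here).
 * Decimal scaling is left to the caller. */
long long packed_to_ll(const unsigned char *p, size_t nbytes)
{
    assert(p != NULL && nbytes > 0);
    long long v = 0;
    for (size_t i = 0; i < nbytes; i++) {
        v = v * 10 + (p[i] >> 4);
        if (i + 1 < nbytes)
            v = v * 10 + (p[i] & 0x0F);
    }
    unsigned sign = p[nbytes - 1] & 0x0F;
    return (sign == 0x0D || sign == 0x0B) ? -v : v;
}
```

The three bytes 0x12 0x34 0x5D, for example, decode to -12345.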
Traditional RPG programs rely upon a standard pattern of execution, the so-called fixed logic. Although most modern RPG programs use "procedural" code, which is all executed in a single step of the fixed logic, it is still necessary to translate traditional logic. Consequently, the generated C programs include code to duplicate the fixed logic sequence in the original program.
RPG is a strictly fixed-format language (unlike DDS, which has some free-format items). This makes the normal UNIX tool, lex, difficult to use, so an Sii tool, fxgen, substitutes for lex. Fxgen is similar to lex, but is designed for handling fixed-format languages. An fxgen-generated scanner analyzes the input specifications and passes the recognized tokens to the parser. The parser, generated by yacc, is used to recognize multi-line constructs such as if-chains, if-then-else, or subroutine definitions; upon recognition, all the RPG code is transformed into an internal representation of the user program. The user program representation is now complete, except for certain cases where an RPG programmer can use something before it is defined.
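To show why fixed-format input calls for a different kind of scanner, the sketch below pulls fields out of an RPG calculation specification by column position rather than by delimiter. This is hand-written C, not fxgen output, and the names cspec and parse_cspec are hypothetical. The column ranges used are those of the RPG/400 calculation specification as the editor understands them and should be checked against the RPG reference.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token record for one calculation specification. */
struct cspec {
    char factor1[11];  /* columns 18-27 */
    char opcode[6];    /* columns 28-32 */
    char factor2[11];  /* columns 33-42 */
    char result[7];    /* columns 43-48 */
};

/* Copy columns first..last (1-based, inclusive) into out, treating a
 * short line as blank padded and trimming trailing blanks. */
static void copy_columns(const char *line, int first, int last, char *out)
{
    size_t len = strlen(line);
    int n = 0;
    for (int col = first; col <= last; col++)
        out[n++] = ((size_t)col <= len) ? line[col - 1] : ' ';
    while (n > 0 && out[n - 1] == ' ')
        n--;
    out[n] = '\0';
}

/* Returns 0 on success, -1 if column 6 does not mark a C specification. */
int parse_cspec(const char *line, struct cspec *cs)
{
    assert(line != NULL && cs != NULL);
    if (strlen(line) < 6 || line[5] != 'C')
        return -1;
    copy_columns(line, 18, 27, cs->factor1);
    copy_columns(line, 28, 32, cs->opcode);
    copy_columns(line, 33, 42, cs->factor2);
    copy_columns(line, 43, 48, cs->result);
    return 0;
}
```

A blank in a fixed-format line is data, not a separator, which is the property that makes a conventional lex specification awkward here.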
These cases are handled by a fixup pass, in which the unresolved entries are satisfied. A number of generation steps are performed which involve scanning the data part of the user program representation, e.g., to generate C language initializations for static data. The semantic analyzer, which is generated from a large number of operator definitions (statements in RPG each contain an operator), makes a single pass over the completed user program representation and generates the actual C code.
Several files of C code are generated for each RPG program, along with a Makefile tailored for the program being translated. This Makefile is used to compile and link the UNIX executable. There are utility shell scripts which can translate and build entire libraries of RPG programs and which manage error reports, while keeping complete logs of all translation activities.
Unlike RPG, AS/400 COBOL code is converted into MicroFocus COBOL using ANSI-85 syntax. The reasoning behind this is simple. There is a very good COBOL compiler widely available on UNIX systems; COBOL programmers can continue to use the language that they are familiar with; and support for some COBOL features is difficult in C.
The major complexities in the translation are the replacement of normal COBOL file access statements with calls on the routines used to access the relational database, and replacement of AS/400 specific statements with routines to properly map the operation into the UNIX environment.
For files that have been converted to SQL, which are any externally-described files, the FD information is removed and replaced with equivalent working storage structures. READ, WRITE, REWRITE, etc. statements are replaced with calls on the generated SQL access functions. The information to do this comes from the repository information gathered during the DDS translation phase. The DDS translators create additional COBOL-accessible code for any generated access functions used by COBOL programs.
Files directly defined within the COBOL program, that is, non-externally-defined files, are typically left as-is. However, the translator can be instructed to replace these files with SQL access routines if desired. In this case access to the file, now a database table, is through generated access functions, just as for externally-defined files.
The translator also recognizes the AS/400 COPYLIB convention for externally defined files, and generates the appropriate COBOL data definitions both in the file section and in working storage.
The AS/400 COBOL compiler is rather lax in many ways, some of which are definitely improper. For instance, the AS/400 allows a variable to be declared in the LINKAGE section without ever being specified in a USING clause, and then allows use of this variable. Another example is a PIC on a group-level data item. The translator will detect these problems and correct them, while at the same time giving a warning. Although the automatic corrections are sufficient, in the interest of good programming practice we recommend that the original code be updated also.
As for RPG, a Makefile is also generated to allow automatic compilation and linking of any or all of the COBOL programs.
The COBOL translator is implemented as a single program. It uses a standard lex/yacc lexical analyzer and parser. The results of the parsing are presented in parse tree form to the code generator function, which emits the original COBOL as modified for the new target. Original commentary is preserved whenever possible, and modified code is specially marked to allow easy identification of transformed statements. Warnings are provided for questionable syntax or semantics.
CL is the command language for the AS/400. CL programs are used to tie together portions of an application and to perform system operations that are not easily done from within RPG or COBOL programs.
The CL translator is a single program that utilizes lex and yacc to perform the lexical and syntactic analysis of the source. The results of this phase are passed to a code generator which emits a C program. Thus, converted programs are true executables; the CL is not interpreted.
CL translation is straightforward. Most of the difficulty associated with CL is encountered at runtime, since CL scripts make heavy use of AS/400 system operations such as overrides and data areas. The translated programs use the same runtime C library used by other parts of the translated application, such as the RPG programs, to handle these operations.
Of course, no application migration can be considered complete unless the associated data is also moved. Part of the translation system is a utility program that fetches and converts AS/400 datafiles into a form suitable for loading the SQL database on the UNIX system.
Typically, a network connection between the machines is used. In this case, the data converter automatically establishes an FTP link to the AS/400 and transfers the data. It then performs format conversions based upon the DDS for the file. Character data is converted from EBCDIC to ASCII, binary data is reordered to be compatible with the target machine's representation, etc. The target SQL database tables are then loaded with the data.
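The two conversions named above can be sketched in a few lines. The function names are hypothetical. The character table covers only blanks, digits, and unaccented letters, and assumes EBCDIC code page 37 on the AS/400 side and an ASCII host. The binary conversion assumes the AS/400 data is big-endian two's complement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: translate one EBCDIC character to ASCII.
 * Characters outside this partial table come back as '?'. */
unsigned char ebcdic_to_ascii(unsigned char c)
{
    if (c == 0x40)              return ' ';
    if (c >= 0xF0 && c <= 0xF9) return (unsigned char)('0' + (c - 0xF0));
    if (c >= 0xC1 && c <= 0xC9) return (unsigned char)('A' + (c - 0xC1));
    if (c >= 0xD1 && c <= 0xD9) return (unsigned char)('J' + (c - 0xD1));
    if (c >= 0xE2 && c <= 0xE9) return (unsigned char)('S' + (c - 0xE2));
    if (c >= 0x81 && c <= 0x89) return (unsigned char)('a' + (c - 0x81));
    if (c >= 0x91 && c <= 0x99) return (unsigned char)('j' + (c - 0x91));
    if (c >= 0xA2 && c <= 0xA9) return (unsigned char)('s' + (c - 0xA2));
    return '?';
}

/* Hypothetical helper: read a 4-byte big-endian binary field into the
 * host's native integer, whatever the host byte order is. */
int32_t be32_to_host(const unsigned char *b)
{
    assert(b != NULL);
    uint32_t u = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                 ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    /* Convert two's complement without relying on an
     * implementation-defined cast of out-of-range values. */
    return (u & 0x80000000u) ? -(int32_t)(~u) - 1 : (int32_t)u;
}
```

Which conversion applies to which bytes of a record depends on the field types, which is why the converter needs the DDS for the file.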
Even if networking is not available, conversion can still be done. In this case, the data converter can read tapes produced on the AS/400 that contain the data to migrate.
The runtime environment provided by the AS/400 is a rich and complex one. Unfortunately, unlike UNIX, most system commands do not operate on a range of different types of things, but rather there is one command for each variation of item, such as different types of files. There are over 1400 user-level commands in OS/400. Fortunately, many of these have no use in applications and generally are not called from a program. Many others have no meaning in a UNIX environment. However, any command that can be called must be processed. This means that the runtime support needed by the translated applications must also be rich and rather complex. While some AS/400 operations have direct analogs under UNIX, others do not.
Runtime support under UNIX is provided as both C library routines that are directly linked by the converted programs and as new UNIX commands that can be run directly or via requests from within the application programs.
Some of the major runtime areas addressed are job control, data areas, data queues, and overrides.
The fundamental unit of context in OS/400 is the job. Unlike UNIX, where a process is the fundamental unit, a job comprises all of the programs that are associated with a particular user session. Much information is available about the job and the programs that are running. Our runtime environment mimics this architecture by keeping all of the context information for a particular job in a temporary job directory which is preserved as long as the job is executing. This temporary directory contains some special data areas and queues, the equivalent of the QTEMP library, program status, and context information. It is automatically deleted on job termination.
A special type of job is a batch job. These jobs are not associated directly with an interactive user or terminal, and are started by the SBMJOB command on the AS/400. These are fully supported, and are handled as a special case of the general job facility. The actual execution of these jobs is scheduled and managed by the batch queue manager, below.
A data area on the AS/400 can be considered as a file that can contain only a single record. It can be shared between programs in a job, or between jobs. These map cleanly into UNIX files, and are so implemented.
A data queue or message queue on the AS/400 is the primary method of communication between various programs and jobs comprising an application. Messages or data records can be entered into queues for particular recipients, or for general access. These items can then be read in various orders and with various priorities by other programs. For example, every program that executes has its own queue automatically created. Typically, error or status information will be passed between programs using this facility. This functionality is provided by both UNIX programs and library routines which duplicate the operations supported by the AS/400. A data or message queue is implemented as a standard UNIX file.
The AS/400 override facility is fundamental to the operation of many AS/400 applications. This facility allows the file that a program opens to actually be a different file from that specified when the program was compiled, or to have characteristics other than those specified at compile time. An override can even construct a virtual file containing information from many other files and present this to a program as if the file actually exists. Necessarily, the override runtime support is complex. Overrides have an execution scope within which they are valid; outside of that scope they have no effect. The runtime library keeps track of overrides by maintaining special override lists and specifiers in the temporary job directory. The runtime library supports all override operations such as OVRDBF, OPNQRYF, OVRDSPF, etc. Support is also present for name overrides and attribute overrides (such as SHARED). Overrides are stored as UNIX files in the temporary directory created for a job.
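The idea of an override that is valid only within an execution scope can be illustrated with a small in-memory list. Everything below is a hypothetical sketch: the structure and function names are invented, the list lives in memory rather than in files in the job directory, and lookup simply returns the first match without modelling how OS/400 combines overrides issued at several call levels.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OVERRIDES 16

/* Hypothetical override record; AS/400 object names are at most
 * 10 characters long. */
struct override {
    char file[11];      /* name the program was compiled with */
    char to_file[11];   /* name that should really be opened */
    int  call_level;    /* call level that issued the override */
};

struct override_list {
    struct override item[MAX_OVERRIDES];
    size_t count;
};

int ovr_add(struct override_list *l, const char *file,
            const char *to_file, int call_level)
{
    assert(l != NULL && file != NULL && to_file != NULL);
    if (l->count == MAX_OVERRIDES ||
        strlen(file) > 10 || strlen(to_file) > 10)
        return -1;
    struct override *o = &l->item[l->count++];
    strcpy(o->file, file);
    strcpy(o->to_file, to_file);
    o->call_level = call_level;
    return 0;
}

/* Drop every override issued at call_level or deeper: outside the scope
 * that created them, they have no effect. */
void ovr_end_level(struct override_list *l, int call_level)
{
    assert(l != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < l->count; i++)
        if (l->item[i].call_level < call_level)
            l->item[kept++] = l->item[i];
    l->count = kept;
}

/* Name to open: the override target if one is active, else the file. */
const char *ovr_resolve(const struct override_list *l, const char *file)
{
    assert(l != NULL && file != NULL);
    for (size_t i = 0; i < l->count; i++)
        if (strcmp(l->item[i].file, file) == 0)
            return l->item[i].to_file;
    return file;
}
```

Once the issuing call level ends, the same open request resolves to the original file name again.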
There are some limitations on overrides. OVRDSPF can only be used to modify the SHARE state of a display file, not to specify a different display file via TOFILE or a different device via DEV. The TOFILE limitation also holds for OVRPRTF, but with a slight difference: a TOFILE override can specify a different print file, but this is used only to resolve such items as queue, lines-per-page, etc. The TOFILE override cannot be used to change the actual print record format.
As mentioned above, printer output is not directly passed to a particular output device by the running programs. Rather, a print queueing system is provided that allows multiple output queues and associates an arbitrary Unix command string with each queue. Files can be handled in any way desired, such as being copied to the user's home directory, or submitted to lp, or even emailed to another user. Alternatively, the native print spooler provided with the target Unix system can be used.
Similarly, a batch queue daemon is notified whenever a batch job is submitted. It will then schedule the job for execution according to the request information provided by the SBMJOB operation. The jobs in the execution queues can be deferred, rescheduled, canceled, etc., just as on the AS/400. If the target Unix system supports a generalized queue management system, it can be used in conjunction with the batch daemon to handle job scheduling. Otherwise, basic queue creation and deletion, queue priority, job priority, etc. are supported by the batch queue daemon. Subsystem management is also handled by this daemon, in conjunction with STRSBS and related commands.
Command parameter prompting is supported, and works much in the way the AS/400 does, although the prompt screen layout is somewhat different. The prompted form of an intrinsic command, '?RTVDTAARA ...', etc., can be used in any context that is also legal on the AS/400. The prompting information for each command is kept in a UNIX file as normal text and can be extended by the user.
The ALCOBJ, DLCOBJ, and related commands, as well as implicit allocation and deallocation, are fully supported. The allocation table is implemented as a Unix shared memory segment utilizing semaphores for process control and table integrity management. This table also has additional entries that are unique to the Unix implementation to support aspects of job management.
The relational database is used to store all physical files as tables, all logical files as views, and to hold temporary views that are created as a result of OPNQRYF. User-defined message files, but not queues, are also stored in the database as a table containing the message file name, the message id, and the message text.
Of course, there are many other runtime details to consider, but the above are most of the major areas of concern to application developers.
As a rule, Sii has two goals during any translation effort. First, the application must be moved in its entirety and must operate correctly and efficiently on the target UNIX system.
Second, the end customer should be able to repeat the translation as often as desired without any further involvement by Sii.
Accordingly, one of the deliverables is a copy of all of the translation tools configured for immediate use for retranslation of the AS/400 application. A directory hierarchy is delivered containing the translators and runtime libraries and utilities, and shell scripts are provided to fetch current sources from an AS/400 system, and retranslate, compile, and install the converted application. Typically, the user needs to give only a single UNIX command to perform all of the above. It is entirely feasible to have this process automatically scheduled via the cron or at commands.
If an existing customer base is being upgraded to the UNIX system, or the application developers do not have UNIX experience, several utility programs are provided to allow various operations to be performed as they were on the AS/400.
The principal aid is shell400, a command line processor that allows AS/400 commands to be executed using the same syntax as is used on the AS/400. For example, a data area can be created by entering:
    CRTDTAARA FOO *DEC (7 2) 100
instead of having to use UNIX commands. This aid can also be used to start up the converted application even in a delivered UNIX environment.
Although Sii supports many AS/400 capabilities, there are some that are not supported as a part of the standard toolset, or that operate differently because of the nature of the UNIX environment. This is the case when a particular feature depends upon some piece of hardware that is AS/400 specific and for which there is no UNIX equivalent. For example, CRTDEVFNC is not supported.
Generally, these exceptions have to do with communications facilities and devices. Unfortunately, standard implementations of various communications and device protocols are not a part of UNIX; each vendor has its own particular version. If specific device support is needed, it can be provided as a custom feature.
Also unsupported is use of AS/400 system menus or system applications, such as Information Assistant. Support is not provided because these items are directly tied to the AS/400 environment, and do not make sense in the UNIX environment. This includes most commands that produce file output using any of the AS/400 Qxxx system formats, such as DSPFD or other DSPxxx commands that can write to a datafile.
The GO command is not supported as an AS/400 equivalent. However, a GO command is provided that allows a UNIX program to be invoked instead of the original menu. The STRSEU command uses a UNIX text editor of the user's choice to perform the edit operation.
Many command equivalents ignore various parameters that have no meaning in the UNIX environment. These parameters do not affect the performance of the desired operation. An example is the TSEPOOL parameter to the CHGJOB command. This parameter is meaningless in UNIX and ignoring it does not affect the desired behavior.
Limited support is provided for the WRKxxx commands. Those that are supported are intended to interact with the Unix implementation of the system and may or may not have the same screen appearance and command set as the AS/400 version. They are provided merely as a convenience; programs that make use of these commands may not perform as expected.
No support is provided for any System 36 emulation commands or features. This includes any CL commands that have S36 as part of their names. There is no intention of providing any support for this in the future.
No support is currently provided for DDS ICF files, although work is underway to support ICF operations via networked connections between UNIX systems. Availability of this will depend upon customer demand.
Support for TRNTBL and ALTSEQ is limited to what is possible in the underlying relational database used. Most do not have the capability to support either of these features, and there is no workaround that performs in any reasonable time.
ASCII is the native UNIX character set; this means that program and database collating sequences and sort ordering will be that of ASCII. This can cause difficulties if a program moves all 9's to a key field and expects this to be the highest possible key. The program would have to be modified to move HIVAL to the field instead. The translator uses the correct ASCII equivalent for character HIVALs and LOWVALs.
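The all-9's problem can be demonstrated by comparing the same two keys under both collating sequences. This is an illustrative sketch only: the function names are invented, the EBCDIC values are those of code page 37 for digits and uppercase letters, and the ASCII comparison assumes the program runs on an ASCII host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* EBCDIC code point for a digit or uppercase letter; anything else is
 * treated as a blank (0x40) in this partial table. */
static unsigned char to_ebcdic(char c)
{
    if (c >= '0' && c <= '9') return (unsigned char)(0xF0 + (c - '0'));
    if (c >= 'A' && c <= 'I') return (unsigned char)(0xC1 + (c - 'A'));
    if (c >= 'J' && c <= 'R') return (unsigned char)(0xD1 + (c - 'J'));
    if (c >= 'S' && c <= 'Z') return (unsigned char)(0xE2 + (c - 'S'));
    return 0x40;
}

/* Compare two keys the way an ASCII host does: by native byte value. */
int compare_ascii(const char *a, const char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, n);
}

/* Compare two keys by the byte values they would have in EBCDIC. */
int compare_ebcdic(const char *a, const char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char ea = to_ebcdic(a[i]);
        unsigned char eb = to_ebcdic(b[i]);
        if (ea != eb)
            return ea < eb ? -1 : 1;
    }
    return 0;
}
```

Under EBCDIC the key 999 sorts after ZZZ, since digits have the highest code points; under ASCII it sorts before ZZZ, so a key of all 9's is no longer guaranteed to be the highest.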
The use of ASCII as the character set may also affect programs if they depend upon characters having a binary value which is that from the EBCDIC codeset. One area where this might be encountered is in the device-dependent I/O feedback area. For example, the attention indicator byte is set to one of several hexadecimal values. If the program checks for a literal character instead of the explicit hex value, e.g., a '1' instead of 0xF1, the proper result will not be obtained. The original sources would have to be modified to rectify this. Note that this would also produce a more correct and fully AS/400 compatible original program.
Another example would be overlaying an alpha field over a zoned decimal field in COBOL programs and then examining the sign byte. The actual character used for the sign byte is that of the Unix COBOL, which will not have the same value as the AS/400. This is not a problem in RPG programs, since we handle this during RPG translation to C. Unfortunately, this cannot be changed in the COBOL compiler.
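The sign byte issue is easier to see with the AS/400 representation spelled out. In EBCDIC zoned decimal each byte carries a digit in its low nibble, and the high nibble of the last byte carries the sign, so the last byte of a negative number is not a digit character at all. The decoder below is a sketch with a hypothetical name. It shows the AS/400 side only and makes no claim about the sign encoding used by any particular UNIX COBOL compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decode an EBCDIC zoned-decimal field.
 * Digits are 0xF0..0xF9; in the last byte the high nibble is the sign:
 * 0xD means negative, 0xF or 0xC means positive. */
long zoned_ebcdic_to_long(const unsigned char *z, size_t n)
{
    assert(z != NULL && n > 0);
    long v = 0;
    for (size_t i = 0; i < n; i++)
        v = v * 10 + (z[i] & 0x0F);
    return ((z[n - 1] >> 4) == 0x0D) ? -v : v;
}
```

The value -123 is stored as 0xF1 0xF2 0xD3. A program that overlays an alpha field and tests the last byte against a particular character is therefore testing an encoding detail, not the number's sign.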
AST/400 is a translation system, not an emulator. It is not intended for use as part of a running application. This means that it does not support on-the-fly compilation of programs generated by another program at runtime, or the processing of program-generated DDS. An example would be a program that creates a DDS text file and then calls CRTDSPF or CRTPRTF. While this situation is rare, some AS/400 applications do make use of such operations. These applications would have to be modified, or special arrangements made with Sii to support this.
However, use of CRTPF and CRTLF is allowed as long as the newly created file is identical in record layout to an already existing file. CRTPRTF may be used to create a new print file that differs only in printing characteristics, such as lines-per-page or output queue, but uses the same record format as an existing print file.
If an application needs access to unsupported features, Sii will work with the customer to provide a custom solution.
A successful translation requires valid sources. The translation software assumes that the code that is being converted is operational code that has actually been run successfully on an AS/400. The translator is not strict about validating all operations for legality; it assumes that the code that it is operating on is correct. Although many errors will be detected, the checking is not rigorous.
DDS source for all record formats used must be available. The translation system needs to know the record formats for any files that are to be processed in order to properly construct the access routines. If there is no DDS for a particular file, the translation system can sometimes deduce the format from the usage in a program. However, this is not 100% reliable and should not be depended upon. The translator can be told that the formats for a particular file are the same as another, similar to the REF feature in DDS. This is sometimes sufficient to handle missing DDS.
If applications use formats from AS/400 system files, the DDS source must still be provided. We have formats for some of the system files, but not all. Use of system formats frequently indicates usage of AS/400 facilities that are not directly supported. This can be addressed on a custom basis.
Any user-defined data areas must have CRTDTAARA commands to initially create them on the UNIX system.
If message files are used, it is best if the message ids and text are available in a file. The message information can be extracted from the message files on the AS/400 if it is not otherwise available.
A relational database must be used on the UNIX system. This requirement exists because the native AS/400 file system is relational. All of the major SQL databases are supported, and additional products can be easily accommodated.
If COBOL applications are to be translated, a MicroFocus COBOL compiler is required on the UNIX system.
In all cases, a C compiler and the make program must be present.
The target UNIX system can be any POSIX-compliant implementation. Ports have been made to IBM AIX, HP HP-UX, ATT/NCR GIS Unix, DEC OSF/1 and Dec/UNIX, as well as several other less well-known systems.