GOFF


The Generalized Object File Format (GOFF) specification was developed for IBM's MVS operating system to supersede the IBM OS/360 Object File Format and to compensate for weaknesses in the older format.

Background

The original IBM OS/360 Object File Format was developed in 1964 for the new IBM System/360 mainframe computer. The format was also used by makers of plug-compatible and workalike mainframes, including the Univac 90/60, 90/70 and 90/80 and Fujitsu B2800. The format was later expanded to add symbolic records and more information about modules, plus support for procedures and functions with names longer than 8 characters. While this helped, it did not provide the enhanced information needed by today's more complicated programming languages and by more advanced features such as objects, properties and methods, Unicode support, and virtual methods.
The GOFF object file format was developed by IBM around 1995 to overcome these problems. The earliest mention of this format was in the introductory information about the new High Level Assembler. Note that the OS/360 Object File Format was superseded by the GOFF format but not deprecated; it is still in use by assemblers and language compilers where the language can tolerate the limitations of the older format.

Conventions

This article uses the term "module" to refer to any name or equivalent symbol that identifies a piece of code or data external to the scope in which it is referenced. A module may refer to a subroutine, a function, Fortran Common or Block Data, an object or class, a method or property of an object or class, or any other named routine or identifier external to the scope referencing the external name.
For the purposes of this article, the terms "assembler" (a program that converts assembly language to machine code) and "compiler" (a program that does the same thing for high-level languages), as well as the processes of using them, should be considered interchangeable; thus where "compile" and "compiler" are used, substitute "assemble" and "assembler" as needed.
Numbers in this article are decimal unless specified as hexadecimal. Hexadecimal numbers are written in the standard mainframe assembler format: the capital letter X precedes the number, any hexadecimal letters are in upper case, and the number is enclosed in single quotes, e.g. the number 15deadbeef₁₆ would be expressed as X'15DEADBEEF'.
A "byte" as used in this article is 8 bits, and unless otherwise specified, a "byte" and a "character" are the same thing; characters in EBCDIC are also 8 bits. When multi-byte character sets are used in user programs, they will use two bytes.

Requirements and restrictions

The format is similar to the OS/360 Object File Format but adds additional information for use in building applications.
As in the older OS/360 format, object file records are divided into 6 record types (HDR, ESD, TXT, RLD, LEN and END); compared with the older format, some types were added, some deleted and some altered.
GOFF records may be fixed or variable length; the minimum length when using variable-length records is 56 characters, although most records will be longer than this. Except for module and class names, all characters are in the EBCDIC character set. Unix-based systems must use fixed-length records. Records in fixed-length files that are shorter than the fixed length should be zero-filled.
To distinguish GOFF records from the older OS/360 format or from commands that may be present in the file, the first byte of each GOFF record is always the binary value X'03', while commands must start with a character value of at least space (X'40' in EBCDIC). The next 2 bytes of a GOFF record indicate the record type, continuation and version of the file format. These first 3 bytes are known as the PTV field.

PTV

The PTV field represents the first 3 bytes of every GOFF record.
Byte  Bits  Value       Purpose
0     All   03          Indicates start of a GOFF record
1     0-3   0           ESD record
1     0-3   1           TXT record
1     0-3   2           RLD record
1     0-3   3           LEN record
1     0-3   4           END record
1     0-3   X'5'-X'E'   Reserved
1     0-3   X'F'        HDR record
1     4-5               Reserved
1     6-7   00          Initial record that is not continued on the next record. This should be the only value used for variable-length GOFF records
1     6-7   01          Initial record which is continued on the next record
1     6-7   10          Continuation record not continued on the next record
1     6-7   11          Continuation record which is continued on the next record
2     All   00          Version number of the object file format. All values except X'00' are reserved
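The PTV layout above can be decoded with a few bit operations. The following is a minimal sketch in Python, not a reference implementation; the names `PTV`, `RECORD_TYPES` and `parse_ptv` are hypothetical. It assumes IBM bit numbering, in which bit 0 is the most significant bit of a byte; this is consistent with the PTV value X'03F000' given for the HDR record.

```python
from typing import NamedTuple

# Record-type codes from bits 0-3 of PTV byte 1; X'5'-X'E' are reserved.
RECORD_TYPES = {0x0: "ESD", 0x1: "TXT", 0x2: "RLD", 0x3: "LEN", 0x4: "END", 0xF: "HDR"}


class PTV(NamedTuple):
    record_type: str       # one of the names in RECORD_TYPES
    is_continuation: bool  # bit 6: this record continues a previous record
    is_continued: bool     # bit 7: this record is continued on the next record
    version: int           # byte 2; only X'00' is currently defined


def parse_ptv(record: bytes) -> PTV:
    """Decode the 3-byte PTV prefix of a GOFF record."""
    if len(record) < 3:
        raise ValueError("record is too short to hold a PTV field")
    if record[0] != 0x03:
        raise ValueError("first byte is not X'03'; not a GOFF record")
    type_code = record[1] >> 4        # bits 0-3 (assumed to be the high nibble)
    if type_code not in RECORD_TYPES:
        raise ValueError(f"reserved record type X'{type_code:X}'")
    continuation = record[1] & 0b11   # bits 6-7 (the two low-order bits)
    return PTV(
        RECORD_TYPES[type_code],
        bool(continuation & 0b10),
        bool(continuation & 0b01),
        record[2],
    )
```

For example, `parse_ptv(bytes.fromhex("034100"))` reports an END record that is an initial record continued on the next record.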

HDR

The HDR record is required, and must be the first record.
Byte   Size  Field                   Value          Purpose
0-2    3     PTV                     X'03F000'      Only allowed value; HDR record currently cannot be continued
3-47   45                            0              Reserved
48-51  4     Architecture Level      Binary 0 or 1  GOFF Architecture level; all values except 0 and 1 are reserved
52-53  2     Module Properties Size  Binary         Length of Module Properties Field
54-59  6                             0              Reserved
60-    0+    Module Properties                      Module Properties List
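As an illustration of the HDR layout above, the following Python sketch extracts the architecture level and the module properties list. The function name `parse_hdr` is hypothetical, and the sketch assumes that binary fields are unsigned and big-endian, as is usual on IBM mainframes; the table itself does not state the byte order.

```python
import struct


def parse_hdr(record: bytes) -> dict:
    """Extract the architecture level and module properties from an HDR record."""
    if len(record) < 60:
        raise ValueError("HDR record needs at least 60 bytes")
    if record[:3] != bytes.fromhex("03F000"):
        raise ValueError("PTV is not X'03F000'; not an HDR record")
    # Bytes 48-51: architecture level; bytes 52-53: module properties size.
    # Assumption: both are unsigned big-endian integers.
    arch_level, props_size = struct.unpack_from(">IH", record, 48)
    if arch_level not in (0, 1):
        raise ValueError(f"reserved architecture level {arch_level}")
    return {
        "architecture_level": arch_level,
        "module_properties": record[60:60 + props_size],
    }
```

The module properties list is returned as raw bytes because its internal format is not described here.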

ESD

An ESD record gives the public name for a module, a main program, a subroutine, procedure, function, property or method in an object, Fortran Common or alternate entry point. An ESD record for a public name must be present in the file before any reference to that name is made by any other record.

Continuation

In the case of fixed-length records where the name requires continuation records, the following is used:

Behavior Attributes

ADATA records

ADATA records are used to provide additional symbol information about a module. They replaced the older SYM records in the 360 object file format. To create an ADATA record
ADATA records will be appended to the end of the class in the order they are declared.
Class names assigned to ADATA records are translated by IBM programs by converting the binary value to text and appending it to the name C_ADATA, so an item numbered X'0033' would become the text string C_ADATA0033.
Type  Description
Translator records.
Program Management records
Reserved
Reserved for compilers and assemblers not released by IBM.
Available for User Records. IBM will not use these values.
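The class-name translation described above can be sketched in Python as follows. The function name `adata_class_name` is hypothetical, and the sketch assumes that the binary value is always rendered as four upper-case hexadecimal digits; the only example given is X'0033'.

```python
def adata_class_name(record_type: int) -> str:
    """Build the class name for an ADATA record type given as a 16-bit value."""
    if not 0 <= record_type <= 0xFFFF:
        raise ValueError("ADATA record type must fit in two bytes")
    # Assumption: four upper-case hexadecimal digits, zero-padded on the left.
    return f"C_ADATA{record_type:04X}"
```

With this sketch, `adata_class_name(0x0033)` gives the string C_ADATA0033 from the example.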

TXT

TXT records specify the machine code instructions and data to be placed at a specific address location in the module. Note that wherever a "length" must be specified for this record, the length value must include any continuations to this record.

Continuation

Compression Table

A compression table is used if bytes 20-21 of the TXT record are nonzero. The R value gives the number of times to repeat the string; the L value gives the length of the text to be repeated R times. This can be used for pre-initializing tables or arrays to blanks or zeros, or for any other purpose where it is useful to express repeated data as a repeat count and a value.
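The repeat-count expansion can be illustrated with a short Python sketch. The byte layout of the compression table is not reproduced here, so the hypothetical function `expand_compressed_text` takes the R and L values as already-decoded integers and assumes that the text to be repeated is the first L bytes of the data that follows them.

```python
def expand_compressed_text(repeat_count: int, length: int, data: bytes) -> bytes:
    """Repeat the first `length` bytes of `data` `repeat_count` times."""
    if repeat_count < 0 or length < 0:
        raise ValueError("R and L must not be negative")
    if len(data) < length:
        raise ValueError("data is shorter than the L value")
    return data[:length] * repeat_count
```

For example, `expand_compressed_text(80, 1, b"\x40")` produces 80 EBCDIC blanks (X'40').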

IDR Data Table

IDR Format 1

Note that unlike most numeric values stored in a GOFF file, the "version", "release" and "trans_date" values are stored as text characters rather than as binary numbers.
Byte   Size  Field       Value       Purpose
0-9    10    Translator  Any text    This value is what the assembler or compiler identifies itself as; IBM calls this the "PID value" or "Program ID value" from IBM's catalog numbers of various programs, e.g. the Cobol Compiler for OS/VS1 is called "IKFCBL00"
10-11  2     Version     two digits  This is the version number of the assembler or compiler, 0 to 99
12-13  2     Release     two digits  This is the release number subpart of the version number above, also 0 to 99
14-18  5     Trans_Date  YYDDD       5 text characters indicating the 2-digit year and the 3-digit day of the year this module was compiled or assembled; years 01-65 are presumed to be in the 21st century, while year 00 or years greater than 65 are presumed to be in the 20th century, i.e. 2000 or 1966-1999. The three-digit day starts at 001 for January 1; 032 is February 1; 060 is March 1 in standard years and February 29 in leap years; and so on through 365 for December 31 in standard years and 366 in leap years
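The century rule and day-of-year encoding in the table above can be sketched in Python as follows. The function name `parse_idr_format1` is hypothetical. The sketch assumes the text is encoded in EBCDIC code page 037 (Python's `cp037` codec); the format only requires EBCDIC, and the digits and blank have the same code points in the common EBCDIC code pages.

```python
import datetime


def parse_idr_format1(entry: bytes) -> dict:
    """Decode one 19-byte IDR Format 1 entry into a dictionary."""
    if len(entry) < 19:
        raise ValueError("IDR Format 1 entry needs at least 19 bytes")
    text = entry[:19].decode("cp037")  # assumption: EBCDIC code page 037
    yy, ddd = int(text[14:16]), int(text[16:19])
    # Years 00-65 map to 2000-2065; years 66-99 map to 1966-1999.
    year = 2000 + yy if yy <= 65 else 1900 + yy
    date = datetime.date(year, 1, 1) + datetime.timedelta(days=ddd - 1)
    if ddd < 1 or date.year != year:
        raise ValueError(f"day of year {ddd} is out of range for {year}")
    return {
        "translator": text[0:10].rstrip(),
        "version": int(text[10:12]),
        "release": int(text[12:14]),
        "date": date,
    }
```

A Trans_Date of 24060, for instance, decodes to February 29, 2024, because 2024 is a leap year.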

IDR Format 2

Compilers and assemblers do not normally generate this record format; it is typically created by the binder.
Byte  Size  Field        Value                          Purpose
0-3   4     Date         Packed decimal form YYYYDDDF   Date module was assembled or compiled, with the year and day of the year
4-5   2     Data_Length  Binary value                   Actual length of next field, an unsigned, nonzero value
6-85  80    IDR_Data                                    Format of this data has not been disclosed
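The packed decimal Date field stores one decimal digit per 4-bit nibble, followed by a sign nibble. A minimal Python sketch of decoding it is shown below; the function name `parse_packed_date` is hypothetical, and the sketch assumes the final nibble is always X'F', as the YYYYDDDF pattern suggests.

```python
import datetime


def parse_packed_date(field: bytes) -> datetime.date:
    """Decode a 4-byte packed decimal YYYYDDDF date field."""
    if len(field) != 4:
        raise ValueError("packed date field must be exactly 4 bytes")
    nibbles = field.hex().upper()      # 8 hex digits, one per nibble
    digits, sign = nibbles[:7], nibbles[7]
    if not digits.isdecimal() or sign != "F":
        raise ValueError(f"not a valid YYYYDDDF value: X'{nibbles}'")
    year, day = int(digits[:4]), int(digits[4:])
    date = datetime.date(year, 1, 1) + datetime.timedelta(days=day - 1)
    if day < 1 or date.year != year:
        raise ValueError(f"day of year {day} is out of range for {year}")
    return date
```

For example, the bytes X'1995001F' decode to January 1, 1995.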

IDR Format 3

All data in this item is character data; no binary information is used.
Byte   Size  Field         Value      Purpose
0-9    10    Translator    Any text   Value the compiler or assembler writer wishes to use to identify itself
10-11  2     Version       00 to 99   Version number of the assembler or compiler
12-13  2     Release       00 to 99   Release number of above version
14-20  7     Compile_Date  YYYYDDD    Year and day of year the program was compiled or assembled
21-29  9     Compile_Time  HHMMSSTTT  Hour, minute, second and thousandth of a second at which the program was compiled or assembled
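Because every field in Format 3 is text, an entry can be decoded with string slicing alone. The Python sketch below combines Compile_Date and Compile_Time into a single timestamp; the function name `parse_idr_format3` and the translator name used in the usage note are hypothetical, and EBCDIC code page 037 is assumed for the text.

```python
import datetime


def parse_idr_format3(entry: bytes) -> dict:
    """Decode one 30-byte IDR Format 3 entry into a dictionary."""
    if len(entry) < 30:
        raise ValueError("IDR Format 3 entry needs at least 30 bytes")
    text = entry[:30].decode("cp037")  # assumption: EBCDIC code page 037
    year, day = int(text[14:18]), int(text[18:21])
    start = datetime.datetime(
        year, 1, 1,
        int(text[21:23]), int(text[23:25]), int(text[25:27]),
        int(text[27:30]) * 1000,       # thousandths of a second to microseconds
    )
    return {
        "translator": text[0:10].rstrip(),
        "version": int(text[10:12]),
        "release": int(text[12:14]),
        "compiled": start + datetime.timedelta(days=day - 1),
    }
```

An entry whose text reads "MYCOMP", padded to 10 characters and followed by "02", "07", "2024060" and "134501123", decodes to version 2, release 7, compiled on February 29, 2024 at 13:45:01.123.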

RLD

RLD records allow a module to show where it references an address that must be relocated, such as references to specific locations in itself, or to external modules.

Relocation Data

If R_Pointer is omitted this field starts 4 bytes lower, in bytes 8-11.

If R_Pointer or P_Pointer is omitted, this field starts 4 bytes lower. If both fields are omitted, this field starts 8 bytes lower.

If R_Pointer, P_Pointer, or Offset are omitted, this field starts 4 bytes lower. If any two of them are omitted, this field starts 8 bytes lower. If all of them are omitted, this field starts 12 bytes lower.
To clarify, if a module in a C program named "Basura" were to issue a call to the "exit" function to terminate itself, the R_Pointer address would be the ESDID of the routine "exit", while the P_Pointer would be the ESDID of "Basura". If the address were in the same module, R_Pointer and P_Pointer would be the same.

Flags

LEN

LEN records are used to declare the length of a module where it was not known at the time the ESD record was created, e.g. for one-pass compilers.
Field     Offset  Size  Description
PTV       0-2     3     Record Type X'033000'
          3-5     3     Reserved
Length    6-7     2     Length of items following this field; value must be non-zero
Elements  8-            Element length data; see Elements table below
REM                     Trailing data to end of record for fixed-length records, must contain binary zeroes; not present for variable-length records

Elements

A deferred-length element entry cannot be continued or split.
Field   Offset  Size  Description
ESDID   0-3     4     ESDID of element this value applies to
        4-7     4     Reserved
Length  8-11    4     Length of the item referenced
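Putting the LEN record and Elements tables together, a record can be walked as in the following Python sketch. The function name `parse_len_record` is hypothetical. The sketch assumes unsigned big-endian binary fields and, because each element entry is 12 bytes and cannot be split, that the Length field is a multiple of 12.

```python
import struct


def parse_len_record(record: bytes) -> list:
    """Return (ESDID, length) pairs from a LEN record."""
    if len(record) < 8:
        raise ValueError("LEN record is too short")
    if record[:3] != bytes.fromhex("033000"):
        raise ValueError("PTV is not X'033000'; not a LEN record")
    (data_length,) = struct.unpack_from(">H", record, 6)  # bytes 6-7
    # Assumption: the element data is a whole number of 12-byte entries.
    if data_length == 0 or data_length % 12 != 0:
        raise ValueError(f"unexpected element data length {data_length}")
    if len(record) < 8 + data_length:
        raise ValueError("record is shorter than its Length field implies")
    elements = record[8:8 + data_length]
    # Each entry: 4-byte ESDID, 4 reserved bytes, 4-byte length.
    return [
        (esdid, length)
        for esdid, _reserved, length in struct.iter_unpack(">III", elements)
    ]
```

Any trailing zero fill in a fixed-length record is ignored, since only the bytes covered by the Length field are read.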

END

END must be the last record for a module. An 'Entry Point' is used when an address other than the beginning of the module is to be used as the start point for its execution. This is used either because the program has non-executable data appearing before the start of the module, or because the module calls an external module first, such as a run-time library to initialize itself.
Field         Offset  Size    Bits  Description
PTV           0-2     3             X'034000' - Not continued
PTV           0-2     3             X'034100' - Continued on next record
              3       6 bits  0-5   Reserved
Flags         3       2 bits  6-7   Declarations regarding the presence or absence of an entry point
Flags         3       2 bits  6-7   00 - No entry point given; all other values in this record are invalid
Flags         3       2 bits  6-7   01 - Entry point specified by ESDID
Flags         3       2 bits  6-7   10 - Entry point specified by name
Flags         3       2 bits  6-7   11 - Reserved
AMODE         4       1             Addressing Mode value of entry point; the values are as specified in field 0 of the Behavior Attributes table in the ESD record
              5-7     3             Reserved
Record Count  8-11    4             Number of GOFF records in this module
ESDID         12-15   4             Value of ESDID if entry point is referenced by ESDID; binary zero if referenced by name
              16-19   4             Reserved
Offset        20-23   4             Address offset of module entry point; this cannot be specified for an external entry point
Name Length   24-25   2             Length of name; this must be zero if entry point was specified by ESDID
Name          26-                   The name of the external symbol used as the entry point for this module; binary zeros if entry point was specified by ESDID; if this record is continued, this is the initial 54 characters of the name. This is the only non-binary value in the record; it is a text field representing the public name for the entry point
REM                                 Trailer extending to the end of the record; should be binary zeros to end of record for fixed-length records; omitted for variable-length records
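The entry-point rules in the table above can be sketched in Python as follows. The function name `parse_end_entry_point` is hypothetical. The sketch handles only a non-continued END record, assumes unsigned big-endian binary fields, and returns the name as raw bytes because module names are not required to be EBCDIC.

```python
import struct


def parse_end_entry_point(record: bytes):
    """Return None, ("esdid", esdid, offset) or ("name", name_bytes)."""
    if len(record) < 26:
        raise ValueError("END record is too short")
    if record[:3] != bytes.fromhex("034000"):
        raise ValueError("expected a non-continued END record (PTV X'034000')")
    flags = record[3] & 0b11          # bits 6-7 of byte 3
    if flags == 0b00:
        return None                   # no entry point given
    if flags == 0b01:
        (esdid,) = struct.unpack_from(">I", record, 12)   # bytes 12-15
        (offset,) = struct.unpack_from(">I", record, 20)  # bytes 20-23
        return ("esdid", esdid, offset)
    if flags == 0b10:
        (name_length,) = struct.unpack_from(">H", record, 24)  # bytes 24-25
        name = record[26:26 + name_length]
        if len(name) != name_length:
            raise ValueError("record is shorter than its Name Length implies")
        return ("name", name)
    raise ValueError("flag value 11 is reserved")
```

A caller that needs to support names longer than 54 bytes would have to collect the continuation records first and pass the reassembled name through separately.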

Continuation

If an entry-point name specified on a fixed-length END record is longer than 54 bytes, the following continuation record is used.