Linker (computing)
In computing, a linker or link editor is a computer system program that takes one or more object files and combines them into a single executable file, library file, or another 'object' file.
A simpler version that writes its output directly to memory is called the loader, though loading is typically considered a separate process.
Overview
Computer programs typically are composed of several parts or modules. These modules need not all be contained within a single object file; in such cases they refer to each other by means of symbols that stand for addresses in other modules, which are mapped into memory addresses when linked for execution. Typically, an object file can contain three kinds of symbols:
- defined "external" symbols, sometimes called "public" or "entry" symbols, which allow it to be called by other modules,
- undefined "external" symbols, which reference other modules where these symbols are defined, and
- local symbols, used internally within the object file to facilitate relocation.
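The three kinds of symbols can be illustrated with a minimal sketch. The `ObjectFile` class, the `resolve` function, and the file and symbol names are hypothetical and chosen only for illustration; real object formats store considerably more information per symbol.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Toy stand-in for an object file's symbol table (hypothetical layout)."""
    name: str
    defined: dict = field(default_factory=dict)   # exported symbol -> offset in this object
    undefined: set = field(default_factory=set)   # symbols expected from other objects
    local: dict = field(default_factory=dict)     # private symbol -> offset, never exported


def resolve(objects):
    """Map each (object, undefined symbol) to the (object, offset) defining it."""
    exports = {}
    for obj in objects:
        for sym, offset in obj.defined.items():
            if sym in exports:
                raise ValueError(f"duplicate definition of {sym!r}")
            exports[sym] = (obj.name, offset)
    resolved = {}
    for obj in objects:
        for sym in obj.undefined:
            if sym not in exports:
                raise ValueError(f"undefined reference to {sym!r} in {obj.name}")
            resolved[(obj.name, sym)] = exports[sym]
    return resolved


# Both objects use a local symbol named ".L1"; local symbols never clash
# because they are not visible outside their own object file.
main_o = ObjectFile("main.o", defined={"main": 0}, undefined={"helper"}, local={".L1": 8})
util_o = ObjectFile("util.o", defined={"helper": 0}, local={".L1": 4})
table = resolve([main_o, util_o])
print(table)
```

In this sketch, the undefined symbol `helper` in `main.o` is matched to the definition exported by `util.o`, while a reference with no matching definition raises an error.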
Linkers can take objects from a collection called a library or runtime library. Most linkers do not include the whole library in the output; they include only the files that are referenced by other object files or libraries. Library linking may thus be an iterative process, with some referenced modules requiring additional modules to be linked, and so on. Libraries exist for diverse purposes, and one or more system libraries are usually linked in by default.
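The iterative nature of library linking can be sketched as a worklist over unresolved symbols. This is a simplified model, not the algorithm of any particular linker; `select_members` and all file and symbol names are hypothetical.

```python
def select_members(objects, library):
    """Return the library members pulled in, in the order they were selected.

    `objects` and `library` map a file name to a pair of sets:
    (symbols the file defines, symbols the file leaves undefined).
    """
    # Index: which library member defines each symbol.
    provider = {sym: member for member, (defs, _) in library.items() for sym in defs}

    defined = set()
    pending = []
    for defs, undefs in objects.values():
        defined |= defs
        pending.extend(sorted(undefs))

    chosen = []
    while pending:
        sym = pending.pop()
        if sym in defined:
            continue
        member = provider.get(sym)
        if member is None:
            raise ValueError(f"undefined reference to {sym!r}")
        chosen.append(member)
        member_defs, member_undefs = library[member]
        defined |= member_defs
        pending.extend(sorted(member_undefs))  # new references may require more members
    return chosen


objects = {"main.o": ({"main"}, {"print_report"})}
library = {
    "report.o": ({"print_report"}, {"format_number"}),
    "format.o": ({"format_number"}, set()),
    "unused.o": ({"never_called"}, set()),
}
members = select_members(objects, library)
print(members)
```

Here `report.o` is selected because `main.o` references it, `format.o` is selected only because `report.o` needs it, and `unused.o` is left out of the output.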
The linker also takes care of arranging the objects in a program's address space. This may involve relocating code that assumes a specific base address into another base. Since a compiler seldom knows where an object will reside, it often assumes a fixed base location. Relocating machine code may involve re-targeting of absolute jumps, loads and stores.
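As a rough illustration of relocation, the sketch below models machine code as a list of words and a relocation table as the list of positions that hold absolute addresses. The word values, the bases, and the `relocate` function are assumptions made for the example and do not correspond to any real instruction set or object format.

```python
def relocate(words, reloc_offsets, old_base, new_base):
    """Return a copy of `words` with every listed absolute address rebased."""
    delta = new_base - old_base
    patched = list(words)
    for i in reloc_offsets:
        patched[i] += delta
    return patched


# Hypothetical image assembled for base 0x1000. Words 1 and 3 hold absolute
# addresses (for example, the target of a jump and the operand of a load);
# the other words are placeholder values that must not change.
image = [0xE9, 0x1010, 0x8B, 0x1020, 0x2A]
moved = relocate(image, reloc_offsets=[1, 3], old_base=0x1000, new_base=0x4000)
print([hex(w) for w in moved])
```

Only the words named in the relocation table are adjusted, which is why the compiler or assembler must record where such addresses occur.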
The executable output by the linker may need another relocation pass when it is finally loaded into memory. This pass is usually omitted on hardware offering virtual memory: every program is put into its own address space, so there is no conflict even if all programs load at the same base address. This pass may also be omitted if the executable is a position-independent executable.
On some operating systems, such as SINTRAN III, the process performed by a linker was called loading. Additionally, in some operating systems, the same program handles both the jobs of linking and loading a program.
Dynamic linking
Many operating system environments allow dynamic linking, deferring the resolution of some undefined symbols until a program is run. That means that the executable code still contains undefined symbols, plus a list of objects or libraries that will provide definitions for these. Loading the program will load these objects/libraries as well, and perform a final linking.
This approach offers two advantages:
- Often-used libraries need to be stored in only one location, not duplicated in every single executable file, thus saving limited memory and disk space.
- If a bug in a library function is corrected by replacing the library, all programs using it dynamically will benefit from the correction after restarting them. Programs that included this function by static linking would have to be re-linked first.
There are also disadvantages:
- Known on the Windows platform as "DLL hell", an incompatible updated library will break executables that depended on the behavior of the previous version of the library if the newer version is not properly backward compatible.
- A program, together with the libraries it uses, might be certified as a package, but not if components can be replaced.
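The trade-off described in this section can be shown with a toy model in which a program stores only the names of the libraries it needs and looks up symbols each time it is loaded. The `Program` class, the `installed` table, and the `libshape` library are hypothetical; real dynamic loaders work on binary images rather than Python dictionaries.

```python
class Program:
    """Toy executable: keeps a list of needed libraries, not their code."""

    def __init__(self, needed, entry):
        self.needed = needed   # library names, resolved only at load time
        self.entry = entry     # function that receives the resolved symbol table

    def load_and_run(self, installed):
        # "Final linking" at load time: find each needed library by name and
        # merge its exported symbols into one table.
        symbols = {}
        for lib_name in self.needed:
            if lib_name not in installed:
                raise RuntimeError(f"cannot load {lib_name}")
            symbols.update(installed[lib_name])
        return self.entry(symbols)


# Hypothetical system library whose `area` function contains a bug.
installed = {"libshape": {"area": lambda w, h: w + h}}
prog = Program(needed=["libshape"], entry=lambda syms: syms["area"](3, 4))
before = prog.load_and_run(installed)

# Replacing the library corrects the bug; the program itself is untouched.
installed["libshape"] = {"area": lambda w, h: w * h}
after = prog.load_and_run(installed)
print(before, after)
```

The same mechanism explains the disadvantages: if the replacement library removed `area` or changed its behavior incompatibly, the unchanged program would fail or misbehave the next time it was loaded.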
Static linking
Relocation
Relaxation
As the compiler has no information on the layout of objects in the final output, it cannot take advantage of shorter or more efficient instructions that place a requirement on the address of another object. For example, a jump instruction can reference an absolute address or an offset from the current location, and the offset could be expressed with different lengths depending on the distance to the target. By first generating the most conservative instruction and adding relaxation hints, it is possible to substitute shorter or more efficient instructions during the final link. In regard to jump optimizations this is also called automatic jump-sizing. This step can be performed only after all input objects have been read and assigned temporary addresses; the linker relaxation pass subsequently reassigns addresses, which may in turn allow more potential relaxations to occur. In general, the substituted sequences are shorter, which allows this process to always converge on the best solution given a fixed order of objects; if this is not the case, relaxations can conflict, and the linker needs to weigh the advantages of either option.
While instruction relaxation typically occurs at link-time, inner-module relaxation can already take place as part of the optimizing process at compile-time. In some cases, relaxation can also occur at load-time as part of the relocation process or combined with dynamic dead-code elimination techniques.
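Automatic jump-sizing can be sketched as a loop that repeats until no further jump can be shortened. The encoding sizes (a 5-byte long jump and a 2-byte short jump with a signed 8-bit displacement) and the `relax` function are assumptions chosen for illustration, not a description of any specific linker or architecture.

```python
LONG, SHORT = 5, 2   # assumed encodings: 5-byte long jump, 2-byte short jump


def relax(items):
    """Shrink jumps to the short form where the target is close enough.

    `items` is a list of ("code", size_in_bytes) or ("jmp", target_index)
    entries, where target_index names the item the jump lands on.
    Returns the final size of every item.
    """
    # Start conservatively: every jump uses the long form.
    sizes = [n if kind == "code" else LONG for kind, n in items]
    changed = True
    while changed:
        changed = False
        # Assign temporary addresses from the current sizes.
        addrs = [0]
        for s in sizes:
            addrs.append(addrs[-1] + s)
        for i, (kind, target) in enumerate(items):
            if kind != "jmp" or sizes[i] == SHORT:
                continue
            # Displacement from the end of the jump at its current size.
            # This is exact for forward jumps and slightly pessimistic for
            # backward ones, so a jump is never shortened incorrectly.
            disp = addrs[target] - (addrs[i] + sizes[i])
            if -128 <= disp <= 127:
                sizes[i] = SHORT
                changed = True   # addresses moved; other jumps may now fit
    return sizes


program = [("jmp", 3), ("code", 124), ("jmp", 3), ("code", 10)]
print(relax(program))
```

In this example the first jump is initially 129 bytes from its target and cannot be shortened. Once the second jump shrinks by 3 bytes, the distance drops to 126 and the first jump fits the short form on the next pass. Because sizes only ever decrease, the loop terminates.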
Linkage editor
In IBM System/360 mainframe environments such as OS/360, including z/OS for the z/Architecture mainframes, this type of program is known as a linkage editor. As the name implies, a linkage editor has the additional capability of allowing the addition, replacement, and/or deletion of individual program sections. Operating systems such as OS/360 have a format for executable load modules that contains supplementary data about the component sections of a program, so that an individual program section can be replaced and the other parts of the program updated, with relocatable addresses and other references corrected by the linkage editor as part of the process.
One advantage of this is that it allows a program to be maintained without having to keep all of the intermediate object files, or without having to recompile program sections that have not changed. It also permits program updates to be distributed in the form of small files containing only the object module to be replaced. In such systems, object code is in the form and format of 80-byte punched-card images, so that updates can be introduced into a system using that medium. In later releases of OS/360 and in subsequent systems, load modules contain additional data about versions of component modules, to create a traceable record of updates. The linkage editor also allows one to add, change, or remove an overlay structure from an already linked load module.
The term "linkage editor" should not be construed as implying that the program operates in a user-interactive mode like a text editor. It is intended for batch-mode execution, with the editing commands being supplied by the user in sequentially organized files, such as punched cards, DASD, or magnetic tape. Tapes were often used during the initial installation of the OS.
Linkage editing, also called consolidation or collection, refers to the linkage editor's or consolidator's act of combining the various pieces into a relocatable binary, whereas the loading and relocation into an absolute binary at the target address is normally considered a separate step.
GNU linker
The GNU linker is the GNU Project's implementation of the Unix command ld. GNU ld runs the linker, which creates an executable file from object files created during compilation of a software project. A linker script may be passed to GNU ld to exercise greater control over the linking process. The GNU linker is part of the GNU Binary Utilities. Two versions of ld are provided in binutils: the traditional GNU ld based on bfd, and an ELF-only version called gold.
Possible origins of the name "ld" are "LoaD" and "Link eDitor".
The GNU linker is free software, distributed under the terms of the GNU General Public License.