Inline assembler


In computer programming, an inline assembler is a feature of some compilers that allows low-level code written in assembly language to be embedded within a program, among code that otherwise has been compiled from a higher-level language such as C or Ada.

Motivation and alternatives

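The embedding of assembly language code is usually done for one of three reasons: to optimize the most performance-sensitive parts of a program by hand, to access processor-specific instructions that the compiler does not otherwise generate, or to make system calls, which high-level languages rarely provide a direct facility for.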
On the other hand, inline assembler poses a direct problem for the compiler itself, as it complicates the analysis of what is done to each variable, a key part of register allocation. This means that performance might actually decrease. Inline assembler also complicates future porting and maintenance of a program.
Alternative facilities are often provided as a way to simplify the work for both the compiler and the programmer. Intrinsic functions for special instructions are provided by most compilers and C-function wrappers for arbitrary system calls are available on every Unix platform.
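As a rough illustration of these alternatives, the sketch below uses a compiler intrinsic and a library wrapper in place of inline assembly. It assumes GCC or Clang (for __builtin_popcount) and a Linux system whose C library provides syscall() and SYS_getpid; the function names count_set_bits and pid_via_syscall_wrapper are invented for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* assumption: glibc, where this exposes syscall() */
#endif
#include <sys/syscall.h>     /* SYS_getpid */
#include <unistd.h>          /* syscall(), getpid() */

/* Intrinsic: the compiler expands this builtin to a population-count
 * instruction where the target has one, or to ordinary code otherwise.
 * The compiler still sees exactly which values are read and written. */
int count_set_bits(unsigned int x)
{
    return __builtin_popcount(x);
}

/* Wrapper: the C library's generic syscall() function issues the request
 * to the kernel, so no hand-written trap instruction is needed here. */
long pid_via_syscall_wrapper(void)
{
    return syscall(SYS_getpid);
}
```

Calling count_set_bits(0xF0u) yields the number of set bits in the argument, and pid_via_syscall_wrapper() should agree with getpid(); neither function contains any assembly text.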

Syntax

In language standards

The ISO C++ standard and ISO C standards specify a conditionally supported syntax for inline assembler:

An asm declaration has the form
asm-definition:
asm ( string-literal ) ;
The asm declaration is conditionally-supported; its meaning is implementation-defined.

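This definition, however, is rarely used in actual C, as it is simultaneously too liberal (its interpretation is left entirely to the implementation) and too restricted (it allows only a single string literal).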
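A minimal sketch of this single-string form follows. It assumes GCC or Clang, which also accept the spelling __asm__ (usable even in strict ISO modes), and a target on which "nop" is a valid instruction; the function name run_nop is invented for the example.

```c
/* The whole assembly text is one string literal. The compiler does not
 * interpret it; it is handed to the assembler as written. */
int run_nop(void)
{
    __asm__("nop");   /* assumption: "nop" exists on the target architecture */
    return 0;
}
```

Nothing in this form lets the string refer to a C variable, which is the limitation the compiler-specific extensions address.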

In actual compilers

In practical use, inline assembly operating on values is rarely standalone as free-floating code. Since the programmer cannot predict what register a variable is assigned to, compilers typically provide a way to substitute them in as an extension.
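There are, in general, two types of inline assembly supported by C/C++ compilers:
asm (or __asm__) in GCC. The assembly code template is written in string literals, with outputs, inputs, and clobbered registers specified after the template, separated by colons. C variables are named directly in these operand lists.
__asm in Microsoft Visual C++ (MSVC). Assembly is written inside a block without conforming to C syntax, and C variables can be referenced by name within the assembly code.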
The two families of extensions represent different understandings of the division of labor in processing inline assembly. The GCC form preserves the overall syntax of the language and compartmentalizes what the compiler needs to know: what is needed and what is changed. It does not explicitly require the compiler to understand instruction names, as the compiler only needs to substitute in its register assignments, plus a few operations to handle the input requirements. The MSVC form of an embedded domain-specific language provides some ease of writing, but it requires the compiler itself to know about opcode names and their clobbering properties, demanding extra attention in maintenance and porting.
The Rust language has a proposal to abstract away inline assembly options further than the LLVM version. It provides enough information to allow transforming the block into an externally-assembled function if the backend cannot handle embedded assembly.
GNAT, LLVM, and the Rust programming language use a syntax similar to the GCC syntax. The D programming language officially uses a DSL similar to the MSVC extension for x86_64, but the LLVM-based LDC also provides the GCC-style syntax on every architecture.
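The GCC-style division of labor can be sketched as follows: the template is opaque text, and the operand lists state what is needed and what is changed. The example assumes GCC or Clang targeting x86 and falls back to plain C on other targets so that it still compiles; the function name add_with_asm is invented for the example.

```c
/* Add b to a using an x86 ADD instruction in AT&T syntax. */
int add_with_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__("addl %1, %0"   /* %0 and %1 are replaced with the chosen registers */
            : "+r"(a)       /* output: a is both read and written, any register */
            : "r"(b)        /* input: b, any register */
            : "cc");        /* clobber: the instruction changes the flags */
#else
    a += b;                 /* fallback for non-x86 targets */
#endif
    return a;
}
```

The compiler never parses "addl"; it only performs the register substitution and trusts the constraint lists, which is why the same mechanism works unchanged for instructions the compiler has never heard of.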

Examples

A system call in GCC

Calling an operating system directly is generally not possible under a system using protected memory. The OS runs at a more privileged level (kernel mode) than the user program (user mode); a software interrupt is used to make requests to the operating system. This is rarely a feature in a higher-level language, and so wrapper functions for system calls are written using inline assembler.
The following C code example shows an x86 system call wrapper in AT&T assembler syntax, using the GNU Assembler. Such calls are normally written with the aid of macros; the full code is included for clarity. In this particular case, the wrapper performs a system call of a number given by the caller with three operands, returning the result.
To recap, GCC supports both basic and extended assembly. The former simply passes text verbatim to the assembler, while the latter performs some substitutions for register locations.

extern int errno;
int syscall3
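A wrapper of the kind described above might look roughly like the following sketch. It is not the original listing: the signature (long rather than int, so that pointers fit on 64-bit targets), the x86-64 branch using the syscall instruction, the fallback to the C library's syscall() on other targets, and the -4095 error threshold used by Linux are all assumptions made for this example. It also includes <errno.h> rather than declaring errno by hand, since current C libraries define errno as a macro.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* assumption: glibc, for the syscall() fallback */
#endif
#include <errno.h>         /* errno */
#include <sys/syscall.h>   /* SYS_* call numbers for callers */
#include <unistd.h>        /* syscall(), used only by the fallback branch */

/* Perform system call `num` with three operands and return its result,
 * or -1 with errno set on failure. */
long syscall3(long num, long arg1, long arg2, long arg3)
{
#if defined(__linux__) && (defined(__x86_64__) || defined(__i386__))
    long res;
#if defined(__x86_64__)
    __asm__ volatile(
        "syscall"                       /* 64-bit entry into the kernel */
        : "=a"(res)                     /* result is returned in rax */
        : "0"(num),                     /* call number in rax */
          "D"(arg1), "S"(arg2), "d"(arg3) /* operands in rdi, rsi, rdx */
        : "rcx", "r11", "memory", "cc"); /* the instruction overwrites rcx and r11 */
#else
    __asm__ volatile(
        "int $0x80"                     /* software interrupt, as in the text */
        : "=a"(res)                     /* result is returned in eax */
        : "0"(num),                     /* call number in eax */
          "b"(arg1), "c"(arg2), "d"(arg3) /* operands in ebx, ecx, edx */
        : "memory", "cc");
#endif
    /* The kernel reports failure as a small negative number; wrappers
     * conventionally return -1 and store the positive code in errno. */
    if (res < 0 && res >= -4095) {
        errno = (int)-res;
        res = -1;
    }
    return res;
#else
    return syscall(num, arg1, arg2, arg3); /* non-x86 targets: use the C library */
#endif
}
```

For instance, syscall3(SYS_getpid, 0, 0, 0) should return the same value as getpid(), and syscall3(SYS_close, -1, 0, 0) should return -1 with errno set to EBADF.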

Processor-specific instruction in D

This example of inline assembly from the D programming language shows code that computes the tangent of x using the x86's FPU instructions.

// Compute the tangent of x
real tan

For readers unfamiliar with x87 programming, the fstsw-sahf followed by conditional jump idiom is used to access the x87 FPU status word bits C0 and C2. fstsw stores the status word in a general-purpose register (AX); sahf sets the FLAGS register to the higher 8 bits of that register (AH); and the jump then tests whichever flag bit happens to correspond to the FPU status bit.
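The status-word idiom can also be sketched in C with GCC-style inline assembly. The example below assumes GCC or Clang on x86 with x87 support, uses setc in place of the conditional jump so that the flag can be returned as a value, and falls back to plain C elsewhere; the function name x87_is_nan_or_inf is invented for the example.

```c
/* Return 1 if x is a NaN or an infinity, as reported by status bit C0
 * after fxam, and 0 otherwise. */
int x87_is_nan_or_inf(long double x)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char c0;
    __asm__("fxam\n\t"          /* classify ST(0) into status bits C3, C2, C0 */
            "fstsw %%ax\n\t"    /* copy the FPU status word into AX */
            "sahf\n\t"          /* load AH into FLAGS: C0 -> CF, C2 -> PF, C3 -> ZF */
            "setc %0"           /* record CF (that is, C0) instead of jumping on it */
            : "=q"(c0)          /* output: any byte-addressable register */
            : "t"(x)            /* input: x on top of the x87 register stack */
            : "ax", "cc");      /* AX and the flags are overwritten */
    return c0;
#else
    /* Fallback: x - x is NaN for infinities and NaNs, and 0 otherwise. */
    return x != x || (x - x) != 0;
#endif
}
```

A conditional jump on the carry flag at the point of setc would correspond to the C0 test described above, and a jump on the parity flag would test C2 in the same way.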