AES3


AES3 is a standard for the exchange of digital audio signals between professional audio devices. An AES3 signal can carry two channels of PCM audio over several transmission media including balanced lines, unbalanced lines, and optical fiber.
AES3 was jointly developed by the Audio Engineering Society and the European Broadcasting Union. The standard was first published in 1985 and was revised in 1992 and 2003. AES3 has been incorporated into the International Electrotechnical Commission's standard IEC 60958, and is available in a consumer-grade variant known as S/PDIF.

History and development

The development of standards for digital audio interconnect for both professional and domestic audio equipment began in the late 1970s in a joint effort between the Audio Engineering Society and the European Broadcasting Union, and culminated in the publication of AES3 in 1985. The AES3 standard was revised in 1992 and 2003 and is published in AES and EBU versions. Early on, the standard was frequently known as AES/EBU.
Variants using different physical connections are specified in IEC 60958. These are essentially consumer versions of AES3 for use within the domestic high fidelity environment using connectors more commonly found in the consumer market. These variants are commonly known as S/PDIF.

Hardware connections

The AES3 standard parallels part 4 of the international standard IEC 60958. Of the physical interconnection types defined by IEC 60958, two are in common use.

IEC 60958 type I

Type I connections use balanced, 3-conductor, 110-ohm twisted pair cabling with XLR connectors. Type I connections are most often used in professional installations and are considered the standard connector for AES3. The hardware interface is usually implemented using RS-422 line drivers and receivers.
Port      Cable end          Device end
Input     XLR male plug      XLR female jack
Output    XLR female plug    XLR male jack

IEC 60958 type II

IEC 60958 Type II defines an unbalanced electrical or optical interface for consumer electronics applications. The precursor of the IEC 60958 Type II specification was the Sony/Philips Digital Interface, or S/PDIF. Both were based on the original AES/EBU work. S/PDIF and AES3 are interchangeable at the protocol level, but at the physical level, they specify different electrical signalling levels and impedances, which may be significant in some applications.

BNC Connector

AES/EBU can also be run using unbalanced BNC connectors with a 75-ohm coaxial cable. The unbalanced version has a maximum transmission distance of 1000 meters, as opposed to the 100 meters maximum for the balanced version. The AES-3id standard defines a 75-ohm BNC electrical variant of AES3. This uses the same cabling, patching and infrastructure as analogue or digital video, and is thus common in the broadcast industry.

Other formats

The AES3 digital audio format can also be carried over an Asynchronous Transfer Mode (ATM) network. The standard for packing AES3 frames into ATM cells is AES47. AES67 describes digital audio transport over an IP network but does not include any of the ancillary data associated with AES3. A mechanism for carrying this additional data is described in SMPTE 2110-31.
For information on the synchronization of digital audio structures, see the AES11 standard. The ability to insert unique identifiers into an AES3 bit stream is covered by the AES52 standard.

Protocol

AES3 was designed primarily to support stereo PCM-encoded audio in either DAT format at 48 kHz or CD format at 44.1 kHz. No attempt was made to use a carrier able to support both rates; instead, AES3 allows the data to be run at any rate, and encodes the clock and the data together using biphase mark code (BMC).
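As a rough illustration of how biphase mark coding combines clock and data, the following Python sketch encodes a bit sequence into half-bit line levels and decodes it again. The function names `bmc_encode` and `bmc_decode`, and the representation of the line as a list of 0/1 levels, are assumptions made for this example and are not taken from the standard.

```python
def bmc_encode(bits, level=0):
    """Encode data bits as biphase mark code.

    Returns two half-bit line levels (0 or 1) per data bit.
    `level` is the line level just before the first bit.
    """
    out = []
    for bit in bits:
        level ^= 1          # the line always toggles at the start of a bit
        out.append(level)
        if bit:
            level ^= 1      # a 1 adds a second toggle in the middle of the bit
        out.append(level)
    return out


def bmc_decode(levels):
    """Recover data bits: a bit is 1 when its two half-cells differ."""
    return [int(levels[i] != levels[i + 1]) for i in range(0, len(levels), 2)]


if __name__ == "__main__":
    print(bmc_encode([0, 0, 1, 1]))
    print(bmc_decode(bmc_encode([0, 0, 1, 1])))
```

Because every bit starts with a transition, a receiver can recover the clock from the signal itself, and because the data lies in the transitions rather than in the absolute level, inverting the line does not change the decoded bits.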
Each bit occupies one time slot. Each audio sample is combined with four flag bits and a synchronisation preamble which is four time slots long to make a subframe of 32 time slots. The 32 time slots of each subframe are assigned as follows:
Time slot        Name              Description
0–3              Preamble          A synchronisation preamble for audio blocks, frames, and subframes.
4–7              Auxiliary sample  A low-quality auxiliary channel used as specified in the channel status word, notably for producer talkback or recording studio-to-studio communication.
8–27, or 4–27    Audio sample      One sample stored with most significant bit last. If the auxiliary sample is used, bits 4–7 are not included. Data with smaller sample bit depths always have MSB at bit 27 and are zero-extended towards the least significant bit.
28               Validity          Unset if the audio data are correct and suitable for D/A conversion. During the presence of defective samples, the receiving equipment may be instructed to mute its output. It is used by most CD players to indicate that concealment rather than error correction is taking place.
29               User data         Forms a serial data stream for each channel, with a format specified in the channel status word.
30               Channel status    Bits from each frame of an audio block are collated giving a 192-bit channel status word. Its structure depends on whether AES3 or S/PDIF is used.
31               Parity            Even parity bit for detection of errors in data transmission. Excludes preamble; bits 4–31 have an even number of ones.
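The slot assignment above can be sketched in Python as follows. The function `build_subframe` is hypothetical and covers only time slots 4–31; the preamble is omitted because it is not made of ordinary data bits. Treating the sample as an unsigned bit pattern, and sending the auxiliary sample least significant bit first, are assumptions of this sketch.

```python
def build_subframe(sample, validity=0, user=0, status=0, aux=None):
    """Return the 28 bits of time slots 4-31 as a list (index 0 is slot 4)."""
    width = 24 if aux is None else 20
    if not 0 <= sample < (1 << width):
        raise ValueError(f"sample does not fit in {width} bits")
    # Least significant bit first, so the MSB lands in time slot 27.
    audio = [(sample >> i) & 1 for i in range(width)]
    if aux is None:
        payload = audio                      # audio fills slots 4-27
    else:
        if not 0 <= aux < 16:
            raise ValueError("auxiliary sample does not fit in 4 bits")
        # Assumed bit order: auxiliary bits are also sent LSB first.
        payload = [(aux >> i) & 1 for i in range(4)] + audio
    bits = payload + [validity, user, status]
    parity = sum(bits) & 1                   # makes the ones in slots 4-31 even
    return bits + [parity]


if __name__ == "__main__":
    # A 16-bit sample is shifted up so that its MSB sits in slot 27.
    subframe = build_subframe(0x1234 << 8)
    print(len(subframe), sum(subframe) % 2)
```

Shifting a shorter sample left before packing, as in the usage lines, keeps its most significant bit in slot 27 and leaves zeros in the unused low-order slots.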

Two subframes make a frame. Frames contain 64 bit periods and are produced once per audio sample period. At the highest level, each 192 consecutive frames are grouped into an audio block. While samples repeat each frame time, metadata is only transmitted once per audio block. At 48 kHz sample rate, there are 250 audio blocks per second, and 3,072,000 time slots per second supported by a 6.144 MHz biphase clock.
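The figures quoted for 48 kHz follow directly from the frame structure, and the same arithmetic applies to any other sample rate. The short Python sketch below reproduces it; the function name `aes3_rates` and the dictionary keys are chosen for this example only.

```python
SLOTS_PER_SUBFRAME = 32
SUBFRAMES_PER_FRAME = 2
FRAMES_PER_BLOCK = 192


def aes3_rates(sample_rate_hz):
    """Derive block, time-slot and biphase clock rates from the sample rate."""
    slots_per_frame = SLOTS_PER_SUBFRAME * SUBFRAMES_PER_FRAME   # 64
    slots_per_second = sample_rate_hz * slots_per_frame          # one frame per sample period
    return {
        "blocks_per_second": sample_rate_hz / FRAMES_PER_BLOCK,
        "time_slots_per_second": slots_per_second,
        # Biphase mark code uses two half-cells per time slot.
        "biphase_clock_hz": 2 * slots_per_second,
    }


if __name__ == "__main__":
    print(aes3_rates(48_000))
    print(aes3_rates(44_100))
```

At 44.1 kHz the block rate is no longer a whole number per second, which is one reason the interface is described in terms of frames and blocks rather than a fixed carrier frequency.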

Synchronisation preamble

The synchronisation preamble is a specially coded preamble that identifies the subframe and its position within the audio block. Preambles are not normal BMC-encoded data bits, although they do still have zero DC bias.
Three preambles are possible. They are called X, Y, Z in the AES3 standard, and M, W, B in IEC 958.
The 8-bit preambles are transmitted in the time allocated to the first four time slots of each subframe. Any of the three marks the beginning of a subframe. X or Z marks the beginning of a frame, and Z marks the beginning of an audio block.

| 0 | 1 | 2 | 3 |  | 0 | 1 | 2 | 3 | Time slots
 _____       _            _____   _
/     \_____/ \_/  \_____/     \_/ \ Preamble X
 _____     _              ___   ___
/     \___/ \___/  \_____/   \_/   \ Preamble Y
 _____   _                _   _____
/     \_/ \_____/  \_____/ \_/     \ Preamble Z
 ___     ___            ___     ___
/   \___/   \___/  \___/   \___/   \ All 0 bits BMC encoded
 _   _   _   _        _   _   _   _
/ \_/ \_/ \_/ \_/  \_/ \_/ \_/ \_/ \ All 1 bits BMC encoded
| 0 | 1 | 2 | 3 |  | 0 | 1 | 2 | 3 | Time slots
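The two properties claimed for the preambles, zero DC bias and not being valid BMC data, can be checked with a small Python sketch. The half-cell patterns in `PREAMBLES` are read off the left-hand waveforms of the diagram (line low before the preamble), with 1 for a high half-cell and 0 for a low one; the right-hand waveforms are their inverses. All names in the code are assumptions of this example.

```python
from itertools import product

# Half-cell patterns read from the left-hand column of the diagram.
PREAMBLES = {"X": "11100010", "Y": "11100100", "Z": "11101000"}


def bmc_cells(bits, level):
    """BMC-encode data bits into a string of half-cell levels."""
    cells = ""
    for bit in bits:
        level ^= 1              # toggle at every bit boundary
        cells += str(level)
        level ^= bit            # extra mid-bit toggle for a 1
        cells += str(level)
    return cells


def invert(cells):
    """Swap high and low half-cells."""
    return "".join("1" if c == "0" else "0" for c in cells)


# Every waveform that four ordinary data bits can produce, for either
# starting level of the line.
VALID_BMC = {
    bmc_cells(bits, level)
    for bits in product((0, 1), repeat=4)
    for level in (0, 1)
}


def check_preamble(cells):
    """Report whether a pattern is DC-free and whether BMC data could produce it."""
    return {
        "zero_dc": cells.count("1") == cells.count("0"),
        "is_valid_bmc": cells in VALID_BMC,
    }


if __name__ == "__main__":
    for name, cells in PREAMBLES.items():
        print(name, check_preamble(cells), check_preamble(invert(cells)))
```

Each preamble holds the line at one level for three half-cells in a row, which ordinary BMC data never does, so a receiver can recognise it without ambiguity.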

In two-channel AES3, the preambles form a pattern of ZYXYXYXY…, but it is straightforward to extend this structure to additional channels, each with a Y preamble, as is done in the MADI protocol.
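The preamble pattern for two-channel AES3 can be expressed as a small selection rule. The Python sketch below is illustrative only; the function name `preamble_for` and the zero-based frame and subframe indices are assumptions of this example.

```python
FRAMES_PER_BLOCK = 192


def preamble_for(frame_index, subframe_index):
    """Pick the preamble for a subframe in two-channel AES3.

    frame_index counts frames from the start of the stream and
    subframe_index is 0 for the first subframe of a frame, 1 for the second.
    """
    if subframe_index == 1:
        return "Y"                      # second subframe of every frame
    if frame_index % FRAMES_PER_BLOCK == 0:
        return "Z"                      # first subframe of an audio block
    return "X"                          # first subframe of any other frame


if __name__ == "__main__":
    sequence = "".join(
        preamble_for(frame, sub) for frame in range(4) for sub in range(2)
    )
    print(sequence)
```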

Channel status word

There is one channel status bit in each subframe, a total of 192 bits or 24 bytes for each channel in each block. Between the AES3 and S/PDIF standards, the contents of the 192-bit channel status word differ significantly, although they agree that the first channel status bit distinguishes between the two. In the case of AES3, the standard describes, in detail, the function of each bit.
Timecode data can be embedded within AES3 digital audio signals. It can be used for synchronization and for logging and identifying audio content. According to John Ratcliff's Timecode: A user's guide, it is embedded as a 32-bit binary word in bytes 18 to 21 of the channel status data.
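A receiver might collate the channel status bits of one block and pick out bytes 18 to 21 roughly as in the Python sketch below. The function names are hypothetical, and two details are assumptions not stated in the text above: that the first bit received becomes the least significant bit of each byte, and that the four bytes form a little-endian 32-bit word. Check both against the standard before relying on them.

```python
FRAMES_PER_BLOCK = 192


def collate_channel_status(status_bits):
    """Pack the 192 channel status bits of one block into 24 bytes.

    status_bits holds one bit per frame for a single channel, in the
    order received, starting at the frame that begins the audio block.
    """
    if len(status_bits) != FRAMES_PER_BLOCK:
        raise ValueError("expected exactly 192 channel status bits")
    word = bytearray(FRAMES_PER_BLOCK // 8)
    for i, bit in enumerate(status_bits):
        if bit:
            # Assumption: the first bit received is bit 0 of its byte.
            word[i // 8] |= 1 << (i % 8)
    return bytes(word)


def embedded_word(channel_status):
    """Return bytes 18-21 of the channel status data as one 32-bit integer."""
    # Assumption: byte 18 is the least significant byte of the word.
    return int.from_bytes(channel_status[18:22], "little")


if __name__ == "__main__":
    bits = [0] * FRAMES_PER_BLOCK
    bits[18 * 8] = 1                    # lowest bit of byte 18
    status = collate_channel_status(bits)
    print(len(status), embedded_word(status))
```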