IBM z196


The z196 microprocessor is a chip made by IBM for their zEnterprise 196 and zEnterprise 114 mainframe computers, announced on July 22, 2010. The processor was developed over a three-year time span by IBM engineers from Poughkeepsie, New York; Austin, Texas; and Böblingen, Germany at a cost of US$1.5 billion. Manufactured at IBM's Fishkill, New York fabrication plant, the processor began shipping on September 10, 2010. IBM stated that it was the world's fastest microprocessor at the time.

Description

The chip measures 512.3 mm2 and consists of 1.4 billion transistors fabricated in IBM's 45 nm CMOS silicon on insulator fabrication process, supporting speeds of 5.2 GHz: at the time, the highest clock speed CPU ever produced for commercial sale.
The processor implements the CISC z/Architecture with a new superscalar, out-of-order pipeline and 100 new instructions. The instruction pipeline has 15 to 17 stages; the instruction queue can hold 40 instructions; and up to 72 instructions can be "in flight". It has four cores, each with a private 64 KB L1 instruction cache, a private 128 KB L1 data cache and a private 1.5 MB L2 cache. In addition, there is a 24 MB shared L3 cache implemented in eDRAM and controlled by two on-chip L3 cache controllers. There's also an additional shared L1 cache used for compression and cryptography operations.
Each core has six RISC-like execution units, including two integer units, two load-store units, one binary floating point unit and one decimal floating point unit. The z196 chip can decode three instructions and execute five operations in a single clock cycle.
The z196 chip has on board DDR3 RAM memory controller supporting a RAID like configuration to recover from memory faults. The z196 also includes a GX bus controller for accessing host channel adapters and peripherals. Additionally, each chip includes co-processors for cryptographic and compression functionality.

Shared Cache

Even though the z196 processor has on-die facilities for symmetric multiprocessing, there are 2 dedicated companion chips called the Shared Cache that each adds 96 MB off-die L4 cache for a total of 192 MB L4 cache. L4 cache is shared by all processors in the book. The SC chip consists of 1.5 billion transistors and measures 478.8 mm2, manufactured with the same 45 nm process as the z196 chip.
Each chip also has 24 MB L3 cache shared by the 4 cores on the chip.

Multi-chip module

The zEnterprise System z196 uses multi-chip modules which allows for six z196 chips to be on a single module. Each MCM has two shared cache chips allowing processors on the MCM to be connected with 40 GB/s links.
The different models of the zEnterprise System have a different number of active cores. To accomplish this, some processors in each MCM may have its fourth core disabled.

z114

The zEnterprise System z114 does use z196 processors but does not use MCMs so the processors are packaged on single chip modules instead. Two SCMs and one Shared Cache chip is mounted together in a processor drawer. These processors also run at reduced speed, at 3.8 GHz.