SSSE3


Supplemental Streaming SIMD Extensions 3 (SSSE3) is a SIMD instruction set created by Intel. It is the fourth iteration of the SSE technology.

History

SSSE3 was first introduced on June 26, 2006, with the "Woodcrest" Xeons, Intel processors based on the Core microarchitecture.
SSSE3 has also been referred to by the codenames Tejas New Instructions (TNI) and Merom New Instructions (MNI), after the first processor designs intended to support it.

Functionality

SSSE3 contains 16 new discrete instructions. Each instruction can act on 64-bit MMX or 128-bit XMM registers, which is why Intel's materials refer to 32 new instructions.
In the table below, satsw(X) ("saturate to signed word") takes a signed integer X and converts it to −32768 if it is less than −32768, to +32767 if it is greater than 32767, and leaves it unchanged otherwise. As is normal for the Intel architecture, bytes are 8 bits, words 16 bits, and dwords 32 bits; "register" refers to an MMX or XMM vector register.
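| Mnemonic | Name | Description |
| --- | --- | --- |
| PSIGNB, PSIGNW, PSIGND | Packed Sign | Negate the elements of a register of bytes, words or dwords if the corresponding element of another register is negative, and set them to zero if that element is zero. |
| PABSB, PABSW, PABSD | Packed Absolute Value | Fill the elements of a register of bytes, words or dwords with the absolute values of the elements of another register. |
| PALIGNR | Packed Align Right | Take two registers, concatenate their values, and pull out a register-length section from a byte offset given by an immediate value encoded in the instruction. |
| PSHUFB | Packed Shuffle Bytes | Takes registers of bytes A = (a0, a1, a2, ...) and B = (b0, b1, b2, ...) and replaces A with (a[b0], a[b1], a[b2], ...), except that it replaces the ith entry with 0 if the top bit of bi is set. Only the low bits of each bi are used as the index. |
| PMULHRSW | Packed Multiply High with Round and Scale | Treat the 16-bit words in registers A and B as signed 16-bit fixed-point numbers between −1.00000000 and +0.99996948..., and multiply them together with correct rounding. |
| PMADDUBSW | Multiply and Add Packed Signed and Unsigned Bytes | Take the bytes in registers A (treated as unsigned) and B (treated as signed), multiply them together, add pairs, signed-saturate and store. I.e. (a0, a1, a2, ...) pmaddubsw (b0, b1, b2, ...) = (satsw(a0·b0 + a1·b1), satsw(a2·b2 + a3·b3), ...). |
| PHSUBW, PHSUBD | Packed Horizontal Subtract | Takes registers A = (a0, a1, a2, ...) and B = (b0, b1, b2, ...) and outputs (a0 − a1, a2 − a3, ..., b0 − b1, b2 − b3, ...). |
| PHSUBSW | Packed Horizontal Subtract and Saturate Words | Like PHSUBW, but outputs (satsw(a0 − a1), satsw(a2 − a3), ..., satsw(b0 − b1), satsw(b2 − b3), ...). |
| PHADDW, PHADDD | Packed Horizontal Add | Takes registers A = (a0, a1, a2, ...) and B = (b0, b1, b2, ...) and outputs (a0 + a1, a2 + a3, ..., b0 + b1, b2 + b3, ...). |
| PHADDSW | Packed Horizontal Add and Saturate Words | Like PHADDW, but outputs (satsw(a0 + a1), satsw(a2 + a3), ..., satsw(b0 + b1), satsw(b2 + b3), ...). |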
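
To make the PSIGN and PABS entries concrete, the following is a minimal scalar sketch in Python. It models registers as lists of signed integers and is not how the hardware or any library implements the instructions. The function names `psign`, `pabs` and `_wrap` are hypothetical, and the handling of the most negative value (it negates to itself, and its absolute value is read back as unsigned) reflects Intel's documented behaviour as understood here.

```python
def _wrap(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def psign(a, b, bits=8):
    """Model PSIGNB/W/D: negate a[i] where b[i] < 0, zero it where b[i] == 0."""
    out = []
    for x, y in zip(a, b):
        if y < 0:
            out.append(_wrap(-x, bits))  # e.g. -(-128) wraps back to -128 for bytes
        elif y == 0:
            out.append(0)
        else:
            out.append(x)
    return out


def pabs(a, bits=8):
    """Model PABSB/W/D: absolute value, with the result read as unsigned."""
    return [abs(x) & ((1 << bits) - 1) for x in a]
```

Passing `bits=16` or `bits=32` selects the word and dword forms in this model.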
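
The concatenate-and-extract behaviour of PALIGNR can be sketched in the same scalar style. This illustration assumes byte lists with index 0 as the least significant byte and follows Intel's operand order, in which the first (destination) operand supplies the high half of the concatenation. The name `palignr` and its parameter names are hypothetical.

```python
def palignr(dst, src, imm8):
    """Model PALIGNR on byte lists (index 0 = least significant byte)."""
    width = len(dst)  # 8 for MMX, 16 for XMM
    # src forms the low half and dst the high half of the concatenation;
    # zeros are appended to model the zero bytes shifted in from the top.
    combined = list(src) + list(dst) + [0] * (2 * width)
    start = min(imm8, 2 * width)  # offsets past both registers yield all zeros
    return combined[start:start + width]
```

With `imm8` equal to 0 the result is simply `src`, and with `imm8` equal to the register width it is `dst`; values in between yield a window that straddles the two.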
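
PSHUFB is essentially a table lookup, which a short sketch may make clearer. The hypothetical function `pshufb` below takes the data bytes `a` and the control bytes `b` (as unsigned values 0 to 255). It assumes that only the low 3 bits (MMX) or low 4 bits (XMM) of each control byte select the source element.

```python
def pshufb(a, b):
    """Model PSHUFB on byte lists of length 8 (MMX) or 16 (XMM)."""
    index_mask = len(a) - 1  # 0x07 for MMX, 0x0F for XMM
    # A control byte with its top bit set produces 0; otherwise its low
    # bits pick which byte of `a` is copied to this position.
    return [0 if (bi & 0x80) else a[bi & index_mask] for bi in b]
```

For example, a control vector of 15, 14, ..., 0 reverses the bytes of a 16-byte register in this model.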
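
The two multiply instructions are easier to follow as arithmetic. The sketch below is a scalar model with hypothetical function names. For PMULHRSW it assumes the rounding rule from Intel's documentation, ((a·b >> 14) + 1) >> 1 with the low 16 bits kept. For PMADDUBSW it assumes the first operand holds the unsigned bytes and the second the signed bytes.

```python
def satsw(x):
    """Saturate an integer to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def _to_int16(x):
    """Keep the low 16 bits of x and read them as a signed word."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pmulhrsw(a, b):
    """Model PMULHRSW: Q15 fixed-point multiply with rounding."""
    # Python's >> on negative integers is an arithmetic shift, as required.
    return [_to_int16((((x * y) >> 14) + 1) >> 1) for x, y in zip(a, b)]


def pmaddubsw(a, b):
    """Model PMADDUBSW: a holds unsigned bytes, b signed bytes; returns words."""
    return [satsw(a[i] * b[i] + a[i + 1] * b[i + 1]) for i in range(0, len(a), 2)]
```

In the fixed-point reading, the word 16384 stands for 0.5, so `pmulhrsw([16384], [16384])` gives 8192, i.e. 0.25. The one product that does not fit, (−1.0)·(−1.0), wraps back to −32768 in this model.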
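
The horizontal add and subtract entries differ only in the pairwise operation and in whether the result wraps or saturates. A compact sketch of the word forms, using hypothetical helper names and lists of signed 16-bit values, might look like this:

```python
import operator


def satsw(x):
    """Saturate an integer to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def wrap16(x):
    """Wrap an integer to a signed 16-bit value (plain two's-complement overflow)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def _horizontal(a, b, combine, finish):
    """Combine adjacent pairs of a, then adjacent pairs of b, into one result."""
    def pairs(v):
        return [finish(combine(v[i], v[i + 1])) for i in range(0, len(v), 2)]
    return pairs(a) + pairs(b)


def phaddw(a, b):
    return _horizontal(a, b, operator.add, wrap16)


def phaddsw(a, b):
    return _horizontal(a, b, operator.add, satsw)


def phsubw(a, b):
    return _horizontal(a, b, operator.sub, wrap16)


def phsubsw(a, b):
    return _horizontal(a, b, operator.sub, satsw)
```

The dword forms PHADDD and PHSUBD follow the same pattern with 32-bit wrapping; they are left out here to keep the sketch short.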