XCORE-200


The xCORE-200 is a 32-bit processor designed by XMOS, with support for two tiles of up to eight concurrent threads each. It was launched in March 2015 and has been available since June 2015, running at 500 MHz. Each thread can run at up to 100 MHz, and a thread may be able to execute two instructions in a clock cycle. Five threads follow each other through the pipeline, giving a top speed of 2000 MIPS and a speed of at least 1000 MIPS. The issue slots of each tile are distributed equally over all active threads on that tile, which allows extra threads to be used to hide latency. The xCORE-200 processor is used for, among other things, voice interfaces and audio connectivity.
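The figures above can be reproduced with a little arithmetic. The following is a minimal sketch, not taken from XMOS documentation: it assumes that a thread's issue rate is the tile clock divided by the larger of the pipeline depth (five) and the number of active threads, and that the quoted MIPS figures are totals over two tiles. All function names are hypothetical.

```python
PIPELINE_DEPTH = 5  # threads that follow each other through the pipeline


def per_thread_issue_rate_mhz(core_clock_mhz, active_threads):
    """Issue slots per microsecond for one thread (assumed model).

    A thread gets at most one slot every PIPELINE_DEPTH cycles; with more
    active threads, the slots are shared equally between them.
    """
    if active_threads < 1:
        raise ValueError("at least one active thread is required")
    return core_clock_mhz / max(PIPELINE_DEPTH, active_threads)


def tile_mips(core_clock_mhz, active_threads, instructions_per_issue=1):
    """Aggregate instruction rate of one tile, in MIPS."""
    rate = per_thread_issue_rate_mhz(core_clock_mhz, active_threads)
    return rate * active_threads * instructions_per_issue


def device_mips(core_clock_mhz, tiles, active_threads, instructions_per_issue=1):
    """Aggregate instruction rate of a device with identical tiles."""
    return tiles * tile_mips(core_clock_mhz, active_threads, instructions_per_issue)
```

Under these assumptions, a 500 MHz tile gives each of five threads 100 MHz and each of eight threads 62.5 MHz. Two tiles in which every thread issues two instructions per slot reach 2000 MIPS, and the same tiles issuing one instruction per slot reach 1000 MIPS.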

Description

An xCORE-200 node comprises two physical cores and a switch. Each execution core has a data path, a memory, and register banks for eight threads. The switches of two or more xCORE-200 nodes can be connected using links, so that threads on all of the cores can communicate with each other by exchanging messages through the switches. The switching mechanism is abstracted by means of a channel, a virtual connection between two threads.
The switch has eight external links, permitting a maximum throughput of 3.2 Gbit/s to other cores.
An xCORE-200 node also has a USB PHY and an RGMII interface. The former enables a direct connection to a USB connector; the latter enables a connection to a gigabit Ethernet PHY.
xCORE-200 devices with up to 16 threads comprise a single xCORE-200 node; devices with 24 or 32 threads comprise two xCORE-200 nodes connected by four links. The multithreaded nature of the xCORE-200 enables programs to run deterministically, so that it can emulate functions that would otherwise be implemented in hardware.

Instruction set architecture

xCORE-200 processors implement the XS2 architecture. Each thread has access to 12 general-purpose registers and is programmed using a standard 3-operand instruction set. The instruction set is densely encoded: most instructions fit in 16 bits, of which 11 bits specify the three operands and 5 bits encode the opcode. Less frequently used instructions are encoded in 32 bits.
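Three operands drawn from 12 registers give 12 × 12 × 12 = 1728 combinations, which is why 11 bits (2048 values) are enough even though three separate 4-bit fields would need 12 bits. The sketch below shows one way such a packing can work: the upper part of each register number (0 to 2) is combined into a single base-3 value that fits in 5 bits, and the low two bits of each register number are stored directly. The bit positions and function names here are assumptions chosen for illustration and should not be read as the actual XS2 instruction layout.

```python
NUM_REGISTERS = 12  # general-purpose registers per thread (from the text)


def pack_operands(a, b, c):
    """Pack three register numbers (0..11) into an 11-bit field."""
    for reg in (a, b, c):
        if not 0 <= reg < NUM_REGISTERS:
            raise ValueError(f"register {reg} out of range")
    # Upper parts are each 0..2, so together they form a base-3 number 0..26.
    high = (a >> 2) + 3 * (b >> 2) + 9 * (c >> 2)
    # Low two bits of each operand are stored side by side (6 bits).
    low = ((a & 3) << 4) | ((b & 3) << 2) | (c & 3)
    return (high << 6) | low


def unpack_operands(field):
    """Recover the three register numbers from an 11-bit field."""
    high, low = field >> 6, field & 0x3F
    if high >= 27:
        raise ValueError("not a valid 3-operand encoding")
    a = ((high % 3) << 2) | ((low >> 4) & 3)
    b = (((high // 3) % 3) << 2) | ((low >> 2) & 3)
    c = ((high // 9) << 2) | (low & 3)
    return a, b, c


def encode(opcode, a, b, c):
    """Build a 16-bit word: 5-bit opcode followed by the 11-bit operand field."""
    if not 0 <= opcode < 32:
        raise ValueError("opcode must fit in 5 bits")
    return (opcode << 11) | pack_operands(a, b, c)


def decode(word):
    """Split a 16-bit word into its opcode and operand triple."""
    return word >> 11, unpack_operands(word & 0x7FF)
```

For example, `decode(encode(5, 1, 2, 3))` returns the opcode 5 together with the operand triple `(1, 2, 3)`, and every valid operand triple produces a field below 2048.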
It is a load-store instruction set.
All instructions execute in a single cycle. An instruction that does not need data from memory prefetches a word of instructions instead. This acts like a very small instruction cache, but its behaviour can be predicted at compile time, making timing behaviour as predictable as functional behaviour.
The XS2 is an event-driven processor: it can stop a thread and restart it when an event is ready. In addition, a thread may be interrupted in order to deal with some external events.
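The event-driven model can be illustrated in software. The following is a loose analogy using Python threads and a queue, not XS2 code: a worker blocks until an event is ready, handles it, and then blocks again. The event sources ("port", "timer"), the "stop" marker, and all function names are hypothetical.

```python
import queue
import threading


def event_loop(events, handlers, log):
    """Wait for events and dispatch each one to its handler."""
    while True:
        # The thread blocks here until an event is available.
        source, payload = events.get()
        if source == "stop":
            break
        log.append(handlers[source](payload))


def run_demo():
    """Post two events to a waiting thread and return what it handled."""
    events = queue.Queue()
    log = []
    handlers = {
        "port": lambda value: f"port value {value}",
        "timer": lambda time: f"timer fired at {time}",
    }
    worker = threading.Thread(target=event_loop, args=(events, handlers, log))
    worker.start()
    events.put(("port", 1))
    events.put(("timer", 100))
    events.put(("stop", None))
    worker.join()
    return log
```

Calling `run_demo()` returns the handled events in the order they were posted. On the real processor the waiting and dispatch are done by hardware rather than by a software queue, so this sketch only conveys the control flow.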