Register-transfer level


In digital circuit design, register-transfer level is a design abstraction which models a synchronous digital circuit in terms of the flow of digital signals between hardware registers, and the logical operations performed on those signals.
Register-transfer-level abstraction is used in hardware description languages like Verilog and VHDL to create high-level representations of a circuit, from which lower-level representations and ultimately actual wiring can be derived. Design at the RTL level is typical practice in modern digital design.
Unlike in software compiler design when register-transfer level intermediate representation is the lowest level, RTL level is the usual input that circuit designers operate on and there are many more levels than it. In fact, in circuit synthesis, an intermediate language between the input register transfer level representation and the target netlist is sometimes used. Unlike in netlist, constructs such as cells, functions, and multi-bit registers are available. Examples include FIRRTL and RTLIL.

RTL description

A synchronous circuit consists of two kinds of elements: registers and combinational logic. Registers synchronize the circuit's operation to the edges of the clock signal, and are the only elements in the circuit that have memory properties. Combinational logic performs all the logical functions in the circuit and it typically consists of logic gates.
For example, a very simple synchronous circuit is shown in the figure. The inverter is connected from the output, Q, of a register to the register's input, D, to create a circuit that changes its state on each rising edge of the clock, clk. In this circuit, the combinational logic consists of the inverter.
When designing digital integrated circuits with a hardware description language, the designs are usually engineered at a higher level of abstraction than transistor level or logic gate level. In HDLs the designer declares the registers, and describes the combinational logic by using constructs that are familiar from programming languages such as if-then-else and arithmetic operations. This level is called register-transfer level. The term refers to the fact that RTL focuses on describing the flow of signals between registers.
As an example, the circuit mentioned above can be described in VHDL as follows:

D <= not Q;
process
begin
if rising_edge then
Q <= D;
end if;
end process;

Using an EDA tool for synthesis, this description can usually be directly translated to an equivalent hardware implementation file for an ASIC or an FPGA. The synthesis tool also performs logic optimization.
At the register-transfer level, some types of circuits can be recognized. If there is a cyclic path of logic from a register's output to its input, the circuit is called a state machine or can be said to be sequential logic. If there are logic paths from a register to another without a cycle, it is called a pipeline.

RTL in the circuit design cycle

RTL is used in the logic design phase of the integrated circuit design cycle.
An RTL description is usually converted to a gate-level description of the circuit by a logic synthesis tool. The synthesis results are then used by placement and routing tools to create a physical layout.
Logic simulation tools may use a design's RTL description to verify its correctness.

Power estimation techniques for RTL

The most accurate power analysis tools are available for the circuit level but unfortunately, even with switch- rather than device-level modelling, tools at the circuit level have disadvantages like they are either too slow or require too much memory thus inhibiting large chip handling. The majority of these are simulators like SPICE and have been used by the designers for many years as performance analysis tools. Due to these disadvantages, gate-level power estimation tools have begun to gain some acceptance where faster, probabilistic techniques have begun to gain a foothold. But it also has its trade off as speedup is achieved on the cost of accuracy, especially in the presence of correlated signals. Over the years it has been realized that biggest wins in low power design cannot come from circuit- and gate-level optimizations whereas architecture, system, and algorithm optimizations tend to have the largest impact on power consumption. Therefore, there has been a shift in the incline of the tool developers towards high-level analysis and optimization tools for power.

Motivation

It is well known that more significant power reductions are possible if optimizations are made on levels of abstraction, like the architectural and algorithmic level, which are higher than the circuit or gate level This provides the required motivation for the developers to focus on the development of new architectural level power analysis tools. This in no way implies that lower level tools are unimportant. Instead, each layer of tools provides a foundation upon which the next level can be built. The abstractions of the estimation techniques at a lower level can be used on a higher level with slight modifications.

Advantages of doing power estimation at RTL or architectural level

It is a technique based on the concept of gate equivalents. The complexity of a chip architecture can be described approximately in terms of gate equivalents where gate equivalent count specifies the average number of reference gates that are required to implement the particular function. The total power required for the particular function is estimated by multiplying the approximated number of gate equivalents with the average power consumed per gate. The reference gate can be any gate e.g. 2-input NAND gate.

Examples of Gate Equivalent technique




Precharacterized Cell Libraries

This technique further customizes the power estimation of various functional blocks by having separate power model for logic, memory, and interconnect suggesting a Power Factor Approximation method for individually characterizing an entire library of functional blocks such as multipliers, adders, etc. instead of a single gate-equivalent model for “logic” blocks.

The power over the entire chip is approximated by the expression:


Where Ki is PFA proportionality constant that characterizes the ith functional element,Gi is the measure of hardware complexity, and fi denotes the activation frequency.

Example

Gi denoting the hardware complexity of the multiplier is related to the square of the input word length i.e. N2 where N is the word length. The activation frequency is the rate at which multiplies are performed by the algorithm denoted by fmult and the PFA constant, Kmult, is extracted empirically from past multiplier designs and shown to be about 15 fW/bit2-Hz for a 1.2 µm technology at 5V.
The resulting power model for the multiplier on the basis of the above assumptions is:



Advantages:
Weakness:
The estimation error for a 16x16 multiplier is experimented and it is observed that when the dynamic range of the inputs doesn’t fully occupy the word length of the multiplier, the UWN model becomes extremely inaccurate. Granted, good designers attempt to maximize word length utilization. Still, errors in the range of 50-100% are not uncommon. The figure clearly suggests a flaw in the UWN model.

Power estimation