Highway network


In machine learning, a highway network is a neural network architecture designed to make very deep networks easier to optimize. Highway networks use learned gating mechanisms to regulate information flow, inspired by Long Short-Term Memory (LSTM) recurrent neural networks. The gates give the network paths along which information can pass across several layers.
Highway networks have been used as part of text sequence labeling and speech recognition tasks.

Model

The model uses two gates alongside the transform function H: the transform gate T and the carry gate C. Both gates are non-linear transfer functions, while H can be any desired transfer function.
The carry gate is defined as
C = 1 - T, while the transform gate is simply a gate with a sigmoid transfer function.
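
Written out, the two gates could take the following form. This is a sketch that assumes the transform gate is a sigmoid applied to an affine map of the input; the weight matrix W_T and bias b_T are symbols introduced here for illustration and are not defined in the text above.

```latex
T(x) = \sigma(W_T x + b_T), \qquad
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
C(x) = 1 - T(x)
```

Because the sigmoid takes values in (0, 1), both T(x) and C(x) lie in (0, 1) and always sum to 1 elementwise.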

Structure

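The structure of a hidden layer follows the equation:

$$y = H(x, W_H) \cdot T(x, W_T) + x \cdot C(x, W_C)$$

where x is the layer input, y is the layer output, and W_H, W_T and W_C are the parameters of H, T and C respectively.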
The advantage of a highway network over common deep neural networks is that it partially solves or prevents the vanishing gradient problem, which makes the network easier to optimize.
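
The following is a minimal sketch of a single highway layer in plain Python, computing y = H(x) * T(x) + x * C(x) elementwise with C = 1 - T. It assumes that H is a tanh applied to an affine map and that T is a sigmoid applied to a separate affine map. The names highway_layer, affine, random_matrix, w_h, b_h, w_t and b_t are hypothetical and chosen only for illustration, and the gate biases of -20 and 20 are illustrative extremes rather than recommended settings.

```python
import math
import random


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Compute W x + b for a list-of-rows matrix W and vectors b, x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer: y = H(x) * T(x) + x * (1 - T(x)), elementwise.

    Assumptions (illustrative, not prescribed by the text):
    H is tanh of an affine map, T is a sigmoid of a separate affine map.
    Input and output share the same dimension so x can be carried through.
    """
    h = [math.tanh(a) for a in affine(w_h, b_h, x)]
    t = [sigmoid(a) for a in affine(w_t, b_t, x)]
    return [hi * ti + xi * (1.0 - ti) for hi, ti, xi in zip(h, t, x)]


def random_matrix(n, rng):
    """Square matrix with small random entries (hypothetical helper)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 4
    x = [0.5, -1.0, 2.0, 0.0]
    w_h, w_t = random_matrix(n, rng), random_matrix(n, rng)
    b_h = [0.0] * n

    # Strongly negative gate bias: T is close to 0, so the layer mostly copies x.
    print(highway_layer(x, w_h, b_h, w_t, [-20.0] * n))

    # Strongly positive gate bias: T is close to 1, so the layer mostly outputs H(x).
    print(highway_layer(x, w_h, b_h, w_t, [20.0] * n))
```

When the transform gate is nearly closed, the layer behaves almost like the identity, so a signal can pass through it nearly unchanged; when the gate is nearly open, the layer behaves like an ordinary non-linear layer.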