Arrow (computer science)


In computer science, arrows or bolts are a type class used in programming to describe computations in a pure and declarative fashion. First proposed by computer scientist John Hughes as a generalization of monads, arrows provide a referentially transparent way of expressing relationships between logical steps in a computation. Unlike monads, arrows do not limit steps to having one and only one input. As a result, they have found use in functional reactive programming, point-free programming, and parsers, among other applications.

Motivation and history

While arrows were in use before being recognized as a distinct class, it was not until 2000 that John Hughes first published research focusing on them. Until then, monads had proven sufficient for most problems requiring the combination of program logic in pure code. However, some useful libraries, such as the Fudgets library for graphical user interfaces and certain efficient parsers, defied rewriting in a monadic form.
The formal concept of arrows was developed to explain these exceptions to monadic code, and in the process, monads themselves turned out to be a subset of arrows. Since then, arrows have been an active area of research. Their underlying laws and operations have been refined several times, with recent formulations such as arrow calculus requiring only five laws.

Relation to category theory

In category theory, the Kleisli categories of all monads form a proper subset of Hughes arrows. While Freyd categories were believed to be equivalent to arrows for a time, it has since been proven that arrows are even more general. In fact, arrows are not merely equivalent, but directly equal to enriched Freyd categories.
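
The inclusion of monads among arrows can be made concrete in Haskell, where the Control.Arrow module provides a Kleisli wrapper that turns a monadic function into an arrow. The following is a minimal sketch using the Maybe monad; the names safeRecip, safeSqrt, recipThenSqrt and runRecipThenSqrt are hypothetical and chosen only for illustration.

```haskell
import Control.Arrow (Kleisli (..), (>>>))

-- Hypothetical partial operations, modelled with the Maybe monad.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Wrapping each function in Kleisli yields an arrow, so the two steps
-- compose with the ordinary arrow composition operator (>>>).
recipThenSqrt :: Kleisli Maybe Double Double
recipThenSqrt = Kleisli safeRecip >>> Kleisli safeSqrt

-- Unwrap the arrow to run it on an input.
runRecipThenSqrt :: Double -> Maybe Double
runRecipThenSqrt = runKleisli recipThenSqrt
```

With these definitions, runRecipThenSqrt 4 evaluates to Just 0.5, while an input of 0 produces Nothing because the first step fails.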

Definition

Like all type classes, arrows can be thought of as a set of qualities that can be applied to any data type. In the Haskell programming language, arrows allow functions (written in Haskell with the -> symbol) to combine in a reified form. However, the actual term "arrow" may also come from the fact that some arrows correspond to the morphisms of different Kleisli categories. Because the concept is relatively new, there is no single standard definition, but all formulations are logically equivalent, feature some required methods, and strictly obey certain mathematical laws.

Functions

The description currently used by the Haskell standard libraries requires only two basic operations:

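arr lifts any function from a type s to a type t into an arrow A between those two types:

arr (s -> t) -> A s t

first converts an arrow from s to t into an arrow between pairs. It transforms the first component and passes the second component, of some third type u, through unchanged:

first (A s t) -> A (s,u) (t,u)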

Although only these two procedures are strictly necessary to define an arrow, other methods can be derived to make arrows easier to work with in practice and theory. As all arrows are categories, they can inherit a third operation from the class of categories:
A s t >>> A t u -> A s u

One more helpful method can be derived from a combination of the previous three:

A s t *** A u v -> A (s,u) (t,v)
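
The operations above can be sketched in Haskell by defining a small arrow by hand. The type Fn and the values double, describe and pipeline are hypothetical names introduced for this illustration; the classes Category and Arrow are those of the Haskell base library. Only arr and first are written explicitly, while composition comes from the Category instance and the merging operator is derived by the Arrow class.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow (Arrow (..), (>>>))

-- Hypothetical arrow type: a plain function wrapped in a newtype.
newtype Fn s t = Fn { runFn :: s -> t }

-- Every arrow is a category, which supplies identity and composition.
instance Category Fn where
  id = Fn (\x -> x)
  Fn g . Fn f = Fn (\x -> g (f x))

-- Only arr and first are written by hand.
instance Arrow Fn where
  arr = Fn
  first (Fn f) = Fn (\(s, u) -> (f s, u))

double :: Fn Int Int
double = arr (* 2)

describe :: Fn Int String
describe = arr (\n -> "n=" ++ show n)

-- (>>>) comes from Category; (***) is derived from the methods above.
pipeline :: Fn (Int, Int) (Int, String)
pipeline = first double >>> (double *** describe)
```

Running runFn pipeline (3, 4) doubles the first component twice and formats the second, giving (12, "n=4").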

Arrow laws

In addition to having some well-defined procedures, arrows must obey certain rules for any types they may be applied to:

arr id == id

arr (f >>> g) == arr f >>> arr g
first (f >>> g) == first f >>> first g

arr (first f) == first (arr f)

The remaining laws restrict how the piping method behaves when the order of a composition is reversed, also allowing for simplifying expressions:

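arr (id *** g) >>> first f == first f >>> arr (id *** g)

first f >>> arr ((s,t) -> s) == arr ((s,t) -> s) >>> f

first (first f) >>> arr ( ((s,t),u) -> (s,(t,u)) ) == arr ( ((s,t),u) -> (s,(t,u)) ) >>> first f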
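
As an illustration, the laws can be spot-checked for ordinary Haskell functions, which form an arrow in the Control.Arrow module. The sketch below evaluates both sides of each law at one sample input; the functions f, g and assoc and the list lawChecks are hypothetical names, and agreement at a sample point is of course not a proof that the laws hold.

```haskell
import Control.Arrow (arr, first, (***), (>>>))

-- Hypothetical sample functions; for plain functions arr is the identity.
f :: Int -> Int
f = (+ 1)

g :: Int -> Int
g = (* 2)

-- Regroups a left-nested pair, as in the last law.
assoc :: ((a, b), c) -> (a, (b, c))
assoc ((a, b), c) = (a, (b, c))

-- Each entry evaluates both sides of one law at a sample input.
lawChecks :: [Bool]
lawChecks =
  [ arr id (5 :: Int) == id 5
  , arr (f >>> g) 5 == (arr f >>> arr g) 5
  , first (f >>> g) (5, 'x') == (first f >>> first g) (5, 'x')
  , arr (first f) (5, 'x') == first (arr f) (5, 'x')
  , (arr (id *** g) >>> first f) (5, 7) == (first f >>> arr (id *** g)) (5, 7)
  , (first f >>> arr fst) (5, 'x') == (arr fst >>> f) (5, 'x')
  , (first (first f) >>> arr assoc) ((5, 'x'), True)
      == (arr assoc >>> first f) ((5, 'x'), True)
  ]
```

Evaluating and lawChecks yields True for these sample values. A property-based testing library could be used to check many inputs instead of one.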

Applications

Arrows may be extended to fit specific situations by defining additional operations and restrictions. Commonly used versions include arrows with choice, which allow a computation to make conditional decisions, and arrows with feedback, which allow a step to take its own outputs as inputs. Another set of arrows, known as arrows with application, is rarely used in practice because these arrows are actually equivalent to monads.
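
In Haskell, choice and feedback correspond to the ArrowChoice and ArrowLoop classes of the Control.Arrow module. The following sketch uses the instances for ordinary functions; classify, absolute, factorial and step are hypothetical names chosen for the example, and the feedback example depends on lazy evaluation.

```haskell
import Control.Arrow (arr, loop, (>>>), (|||))

-- Hypothetical classifier: Left for negative inputs, Right otherwise.
classify :: Int -> Either Int Int
classify n
  | n < 0     = Left n
  | otherwise = Right n

-- Choice: (|||) runs the left arrow on Left values and the right arrow
-- on Right values, so the pipeline branches on the tag.
absolute :: Int -> Int
absolute = arr classify >>> (negate ||| id)

-- Feedback: loop feeds the second output component back in as the second
-- input component. Here that component is the function itself, which
-- relies on lazy evaluation.
factorial :: Integer -> Integer
factorial = loop step
  where
    step (n, self) = (self n, \k -> if k <= 1 then 1 else k * self (k - 1))
```

Here absolute (-3) evaluates to 3 and factorial 5 evaluates to 120.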

Utility

Arrows have several benefits, mostly stemming from their ability to make program logic explicit yet concise. Besides avoiding side effects, purely functional programming creates more opportunities for static code analysis. This in turn can theoretically lead to better compiler optimizations, easier debugging, and features like syntactic sugar.
Although no program strictly requires arrows, they generalize away much of the dense function passing that pure, declarative code would otherwise require. They can also encourage code reuse by giving common linkages between program steps their own class definitions. The ability to apply to types generically also contributes to reusability and keeps interfaces simple.
Arrows do have some disadvantages, including the initial effort of defining an arrow that satisfies the arrow laws. Because monads are usually easier to implement, and the extra features of arrows may be unnecessary, it is often preferable to use a monad. Another issue, which applies to many functional programming constructs, is efficiently compiling code with arrows into the imperative style used by computer instruction sets.

Limitations

Because every arrow must define an arr function to lift pure functions, the applicability of arrows is limited. For example, bidirectional transformations cannot be arrows, because one would need to provide not only a pure function, but also its inverse, when using arr. This also limits the use of arrows to describe push-based reactive frameworks that stop unnecessary propagation. Similarly, the use of pairs to tuple values together results in a difficult coding style that requires additional combinators to regroup values, and raises fundamental questions about the equivalence of arrows grouped in different ways. These limitations remain an open problem, which extensions such as Generalized Arrows and N-ary FRP explore.
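
The regrouping burden mentioned above can be seen in a small Haskell sketch. The names assocR and step are hypothetical; the example uses the function arrow from Control.Arrow and shows how three values must be nested into pairs and then regrouped before the last one can be reached.

```haskell
import Control.Arrow (arr, first, second, (>>>))

-- Hypothetical regrouping combinator; it carries no logic of its own.
assocR :: ((a, b), c) -> (a, (b, c))
assocR ((a, b), c) = (a, (b, c))

-- Goal: increment the first of three values and show the last one.
-- The values are nested into pairs and regrouped along the way.
step :: ((Int, String), Bool) -> (Int, (String, String))
step =
  first (first (arr (+ 1)))          -- reach the innermost left component
    >>> arr assocR                   -- regroup so the Bool becomes reachable
    >>> second (second (arr show))   -- reach the innermost right component
```

Applying step to ((1, "a"), True) gives (2, ("a", "True")). The same transformation written as an ordinary lambda over a triple would need no plumbing, which is the stylistic cost this section describes.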