Regular tree grammar


In theoretical computer science and formal language theory, a regular tree grammar is a formal grammar that describes a set of directed trees, or terms. A regular word grammar can be seen as a special kind of regular tree grammar, describing a set of single-path trees.

Definition

A regular tree grammar G is defined by the tuple
G =,
where
The grammar G implicitly defines a set of trees: any tree that can be derived from Z using the rule set P is said to be described by G.
This set of trees is known as the language of G.
More formally, the relation ⇒G on the set TΣ is defined as follows:
A tree t1TΣ can be derived in a single step into a tree t2TΣ
, if there is a context S and a production ∈ P such that:
Here, a context means a tree with exactly one hole in it; if S is such a context, S denotes the result of filling the tree t into the hole of S.
The tree language generated by G is the language L =.
Here, TΣ denotes the set of all trees composed from symbols of Σ, while ⇒G* denotes successive applications of ⇒G.
A language generated by some regular tree grammar is called a regular tree language.

Examples

Let G1 =, where
An example derivation from the grammar G1 is
BList
cons
cons
cons.
The image shows the corresponding derivation tree; it is a tree of trees, whereas a derivation tree in word grammars is a tree of strings.
The tree language generated by G1 is the set of all finite lists of boolean values, that is, L happens to equal TΣ1.
The grammar G1 corresponds to the algebraic data type declarations :

datatype Bool
= false
| true
datatype BList
= nil
| cons of Bool * BList

Every member of L corresponds to a Standard-ML value of type BList.
For another example, let G2 =, using the nonterminal set and the alphabet from above, but extending the production set by P2, consisting of the following productions:
The language L is the set of all finite lists of boolean values that contain true at least once. The set L has no datatype counterpart in Standard ML, nor in any other functional language.
It is a proper subset of L.
The above example term happens to be in L, too, as the following derivation shows:
BList1
cons
cons
cons.

Language properties

If L1, L2 both are regular tree languages, then the tree sets L1L2, L1L2, and L1 \ L2 are also regular tree languages, and it is decidable whether L1L2, and whether L1 = L2.

Alternative characterizations and relation to other formal languages

Applications of regular tree grammars include: