Tree traversal


In computer science, tree traversal is a form of graph traversal and refers to the process of visiting each node in a tree data structure, exactly once. Such traversals are classified by the order in which the nodes are visited. The following algorithms are described for a binary tree, but they may be generalized to other trees as well.

Types

Unlike linked lists, one-dimensional arrays and other linear data structures, which are canonically traversed in linear order, trees may be traversed in multiple ways. They may be traversed in depth-first or breadth-first order. There are three common ways to traverse them in depth-first order: in-order, pre-order and post-order. Beyond these basic traversals, various more complex or hybrid schemes are possible, such as depth-limited searches like iterative deepening depth-first search. The latter, as well as breadth-first search, can also be used to traverse infinite trees (see Infinite trees below).

Data structures for tree traversal

Traversing a tree involves iterating over all nodes in some manner. Because a given node can have more than one possible next node, sequential computation must defer some nodes, that is, store them in some way for later visiting. This is often done via a stack or queue. As a tree is a self-referential data structure, traversal can be defined by recursion or, more subtly, corecursion, in a natural and clear fashion; in these cases the deferred nodes are stored implicitly in the call stack.
Depth-first search is easily implemented via a stack, including recursively, while breadth-first search is easily implemented via a queue, including corecursively.
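The role of the deferred-node container can be sketched with a single loop in which only the removal end changes. This is an illustrative sketch, not a canonical implementation: the `Node` class, the `traverse` function, its `use_queue` flag and the five-node sample tree are all assumptions introduced for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def traverse(root, use_queue):
    """Visit every node once, deferring children in a queue or a stack."""
    order = []
    deferred = deque() if root is None else deque([root])
    while deferred:
        # FIFO removal gives breadth-first order, LIFO removal gives depth-first.
        node = deferred.popleft() if use_queue else deferred.pop()
        order.append(node.value)
        children = [node.left, node.right]
        if not use_queue:
            children.reverse()  # push right first so that left is popped first
        deferred.extend(c for c in children if c is not None)
    return order


# Hypothetical sample tree: A has children B and C; B has children D and E.
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(traverse(tree, use_queue=False))  # ['A', 'B', 'D', 'E', 'C']
print(traverse(tree, use_queue=True))   # ['A', 'B', 'C', 'D', 'E']
```

With the stack the result is a pre-order depth-first traversal; with the queue it is a level-order traversal.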

Depth-first search of binary tree

These searches are referred to as depth-first search, since the search tree is deepened as much as possible on each child before going to the next sibling. For a binary tree, they are defined as access operations at each node, starting with the current node.
The general recursive pattern for traversing a binary tree applies three operations to each non-empty node, in an order that depends on the traversal: process the current node itself (N), recursively traverse its left subtree (L), and recursively traverse its right subtree (R).
In the examples, L is mostly performed before R. But R before L is also possible, see reverse in-order (RNL).

Pre-order (NLR)

  1. Access the data part of the current node.
  2. Traverse the left subtree by recursively calling the pre-order function.
  3. Traverse the right subtree by recursively calling the pre-order function.
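
The three steps above can be sketched as a short recursive function. The `Node` class, the `visit` callback and the sample tree are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node, visit):
    """Pre-order (NLR): node first, then left subtree, then right subtree."""
    if node is None:
        return
    visit(node.value)            # N: access the data part of the current node
    preorder(node.left, visit)   # L: traverse the left subtree
    preorder(node.right, visit)  # R: traverse the right subtree


# Hypothetical sample tree: A has children B and C; B has children D and E.
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
visited = []
preorder(tree, visited.append)
print(visited)  # ['A', 'B', 'D', 'E', 'C']
```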

In-order (LNR)

  1. Traverse the left subtree by recursively calling the in-order function.
  2. Access the data part of the current node.
  3. Traverse the right subtree by recursively calling the in-order function.
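
One possible way to express the in-order steps is as a recursive generator, which yields values lazily instead of calling a visitor. The `Node` class, the `inorder` name and the sample tree are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(node):
    """In-order (LNR): left subtree, then the node, then right subtree."""
    if node is not None:
        yield from inorder(node.left)   # L
        yield node.value                # N
        yield from inorder(node.right)  # R


# Hypothetical sample tree: A has children B and C; B has children D and E.
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(list(inorder(tree)))  # ['D', 'B', 'E', 'A', 'C']
```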

Reverse in-order (RNL)

  1. Traverse the right subtree by recursively calling the reverse in-order function.
  2. Access the data part of the current node.
  3. Traverse the left subtree by recursively calling the reverse in-order function.

Post-order (LRN)

  1. Traverse the left subtree by recursively calling the post-order function.
  2. Traverse the right subtree by recursively calling the post-order function.
  3. Access the data part of the current node.
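
A minimal sketch of the post-order steps follows, here returning the visited values as a list. As before, `Node`, `postorder` and the sample tree are assumptions introduced only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def postorder(node):
    """Post-order (LRN): both subtrees are finished before the node itself."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]


# Hypothetical sample tree: A has children B and C; B has children D and E.
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(postorder(tree))  # ['D', 'E', 'B', 'C', 'A']
```
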
The trace of a traversal is called a sequentialisation of the tree. The traversal trace is a list of each visited root. No one sequentialisation according to pre-, in- or post-order describes the underlying tree uniquely. Given a tree with distinct elements, either pre-order or post-order paired with in-order is sufficient to describe the tree uniquely. However, pre-order with post-order leaves some ambiguity in the tree structure.
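The uniqueness claim can be illustrated by rebuilding a tree from its pre-order and in-order traces, and the ambiguity claim by two different trees that share both pre-order and post-order traces. This is a sketch under the assumption that trees are represented as `(value, left, right)` tuples with distinct values; the helper names `build`, `pre` and `post` are hypothetical.

```python
def build(preorder, inorder):
    """Rebuild a tree from its pre-order and in-order traces (distinct values)."""
    if not preorder:
        return None
    root = preorder[0]          # pre-order lists the root first
    k = inorder.index(root)     # values left of the root form the left subtree
    left = build(preorder[1:k + 1], inorder[:k])
    right = build(preorder[k + 1:], inorder[k + 1:])
    return (root, left, right)


def pre(t):
    return [] if t is None else [t[0]] + pre(t[1]) + pre(t[2])


def post(t):
    return [] if t is None else post(t[1]) + post(t[2]) + [t[0]]


rebuilt = build(list("ABDEC"), list("DBEAC"))
print(rebuilt)
# ('A', ('B', ('D', None, None), ('E', None, None)), ('C', None, None))

# Two different trees with identical pre-order and post-order traces.
left_only = ("A", ("B", None, None), None)
right_only = ("A", None, ("B", None, None))
print(pre(left_only) == pre(right_only), post(left_only) == post(right_only))  # True True
```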

Generic tree

To traverse any tree with depth-first search, perform the following operations recursively at each node:
  1. Perform pre-order operation.
  2. For each i from 1 to the number of children do:
     1. Visit the i-th child, if present.
     2. Perform in-order operation.
  3. Perform post-order operation.
Depending on the problem at hand, the pre-order, in-order or post-order operations may be void, or you may only want to visit a specific child, so these operations are optional. Also, in practice more than one of pre-order, in-order and post-order operations may be required. For example, when inserting into a ternary tree, a pre-order operation is performed by comparing items. A post-order operation may be needed afterwards to re-balance the tree.
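The generic procedure can be sketched with three optional callbacks, any of which may be left out. The `TreeNode` class, the `dfs` signature and the small sample tree are assumptions for illustration; following the steps as listed, the in-order operation runs after each child is visited.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    value: str
    children: list = field(default_factory=list)


def dfs(node, pre=None, inorder=None, post=None):
    """Depth-first traversal of a tree with any number of children per node."""
    if pre is not None:
        pre(node)
    for child in node.children:
        dfs(child, pre, inorder, post)
        if inorder is not None:
            inorder(node)
    if post is not None:
        post(node)


# Hypothetical sample tree: R has children X and Y; X has one child Z.
root = TreeNode("R", [TreeNode("X", [TreeNode("Z")]), TreeNode("Y")])
log = []
dfs(root,
    pre=lambda n: log.append("pre " + n.value),
    inorder=lambda n: log.append("in " + n.value),
    post=lambda n: log.append("post " + n.value))
print(log)
```

In this run the "pre" entries appear in the order R, X, Z, Y and the "post" entries in the order Z, X, Y, R.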

Breadth-first search / level order

Trees can also be traversed in level-order, where we visit every node on a level before going to a lower level. This search is referred to as breadth-first search, as the search tree is broadened as much as possible on each depth before going to the next depth.

Other types

There are also tree traversal algorithms that classify as neither depth-first search nor breadth-first search. One such algorithm is Monte Carlo tree search, which concentrates on analyzing the most promising moves, basing the expansion of the search tree on random sampling of the search space.

Applications

Pre-order traversal can be used to make a prefix expression (Polish notation) from an expression tree. For example, traversing the depicted arithmetic expression in pre-order yields "+ * A - B C + D E".
Post-order traversal can generate a postfix representation of a binary tree. Traversing the depicted arithmetic expression in post-order yields "A B C - * D E + +"; the latter can easily be transformed into machine code to evaluate the expression by a stack machine.
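Both notations can be reproduced with a small sketch. The tree below is built to match the prefix string "+ * A - B C + D E" quoted above; the `Node` class, the helper names and the numeric values bound to A through E are assumptions chosen for illustration.

```python
import operator
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def prefix(n):
    return [] if n is None else [n.value] + prefix(n.left) + prefix(n.right)


def postfix(n):
    return [] if n is None else postfix(n.left) + postfix(n.right) + [n.value]


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_postfix(tokens, env):
    """Evaluate a postfix token list the way a simple stack machine would."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(OPS[tok](lhs, rhs))
        else:
            stack.append(env[tok])
    return stack.pop()


# Expression tree for A * (B - C) + (D + E).
expr = Node("+",
            Node("*", Node("A"), Node("-", Node("B"), Node("C"))),
            Node("+", Node("D"), Node("E")))
print(" ".join(prefix(expr)))   # + * A - B C + D E
print(" ".join(postfix(expr)))  # A B C - * D E + +
# Hypothetical variable bindings: 2 * (5 - 3) + (4 + 1)
print(evaluate_postfix(postfix(expr), {"A": 2, "B": 5, "C": 3, "D": 4, "E": 1}))  # 9
```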
In-order traversal is very commonly used on binary search trees because it returns values from the underlying set in order, according to the comparator that set up the binary search tree.
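As a small illustration, inserting keys into an unbalanced binary search tree and then traversing it in-order returns the keys sorted. The `BSTNode` class, the `insert` helper and the key list are assumptions for the example, and the comparator here is simply the built-in `<` on integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BSTNode:
    key: int
    left: Optional["BSTNode"] = None
    right: Optional["BSTNode"] = None


def insert(node, key):
    """Insert key into the subtree rooted at node and return the subtree root."""
    if node is None:
        return BSTNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def inorder(node):
    if node is not None:
        yield from inorder(node.left)
        yield node.key
        yield from inorder(node.right)


root = None
for k in [5, 2, 8, 1, 9, 3]:
    root = insert(root, k)
print(list(inorder(root)))  # [1, 2, 3, 5, 8, 9]
```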
Post-order traversal while deleting or freeing nodes and values can delete or free an entire binary tree. Thereby the node is freed after freeing its children.
Also the duplication of a binary tree yields a post-order sequence of actions, because the pointer to the copy of a node is assigned to the corresponding child field within the copy of the parent only after the recursive call that copies that node has returned. This means that the parent cannot be finished before all children are finished.
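The post-order nature of duplication can be made visible by recording the moment each copy is completed. This is a sketch only: `Node`, `copy_tree` and the `finished` log are names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def copy_tree(node, finished):
    """Return a deep copy of the tree, logging nodes as their copies complete."""
    if node is None:
        return None
    left_copy = copy_tree(node.left, finished)
    right_copy = copy_tree(node.right, finished)
    # The parent's copy can only be assembled once both child copies exist.
    clone = Node(node.value, left_copy, right_copy)
    finished.append(node.value)
    return clone


# Hypothetical sample tree: A has children B and C; B has children D and E.
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
finished = []
duplicate = copy_tree(tree, finished)
print(finished)  # ['D', 'E', 'B', 'C', 'A']
```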

Implementations

Depth-first search

Pre-order
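
A minimal iterative sketch of pre-order traversal with an explicit stack is given here. The `Node` class, the function name and the sample tree are assumptions for illustration, not a reference listing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder_iterative(root):
    """Pre-order traversal that replaces the call stack with an explicit stack."""
    order = []
    stack = [] if root is None else [root]
    while stack:
        node = stack.pop()
        order.append(node.value)
        # Push the right child first so the left child is processed first.
        if node.right is not None:
            stack.append(node.right)
        if node.left is not None:
            stack.append(node.left)
    return order


tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(preorder_iterative(tree))  # ['A', 'B', 'D', 'E', 'C']
```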

In-order
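
One common iterative formulation of in-order traversal keeps the chain of ancestors whose left subtrees are still being explored on an explicit stack. The names below (`Node`, `inorder_iterative`, the sample tree) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder_iterative(root):
    """In-order traversal using an explicit stack of pending ancestors."""
    order, stack = [], []
    node = root
    while stack or node is not None:
        while node is not None:   # descend as far left as possible
            stack.append(node)
            node = node.left
        node = stack.pop()        # leftmost unvisited node
        order.append(node.value)
        node = node.right         # then continue in its right subtree
    return order


tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(inorder_iterative(tree))  # ['D', 'B', 'E', 'A', 'C']
```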

Post-order
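
Iterative post-order is slightly more involved, because a node may be emitted only after its right subtree is finished. The sketch below tracks the most recently visited node to detect that condition; `Node`, `postorder_iterative` and the sample tree are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def postorder_iterative(root):
    """Post-order traversal with an explicit stack and a last-visited marker."""
    order, stack = [], []
    node, last = root, None
    while stack or node is not None:
        if node is not None:
            stack.append(node)
            node = node.left
        else:
            peek = stack[-1]
            # Enter the right subtree unless it is absent or was just finished.
            if peek.right is not None and last is not peek.right:
                node = peek.right
            else:
                order.append(peek.value)
                last = stack.pop()
    return order


tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(postorder_iterative(tree))  # ['D', 'E', 'B', 'C', 'A']
```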

All the above implementations require stack space proportional to the height of the tree: a call stack for the recursive implementations and a parent stack for the iterative ones. In a poorly balanced tree, this can be considerable. With the iterative implementations, the stack requirement can be removed by maintaining parent pointers in each node, or by threading the tree (next section).

Morris in-order traversal using threading

A binary tree is threaded by making every left child pointer that would otherwise be null point to the in-order predecessor of the node (if it exists) and every right child pointer that would otherwise be null point to the in-order successor of the node (if it exists).
Advantages:
  1. Avoids recursion, which uses a call stack and consumes memory and time.
  2. The node keeps a record of its parent.
Disadvantages:
  1. The tree is more complex.
  2. We can make only one traversal at a time.
  3. It is more prone to errors when both the children are not present and both values of nodes point to their ancestors.
Morris traversal is an implementation of in-order traversal that uses threading:
  1. Create links to the in-order successor.
  2. Print the data using these links.
  3. Revert the changes to restore original tree.
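
The three steps above can be sketched as follows. This version collects values into a list rather than printing them, and the `Node` class, function name and sample tree are assumptions for illustration. Temporary threads are stored in otherwise-null right pointers and removed again, so the tree is unchanged once the traversal finishes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def morris_inorder(root):
    """In-order traversal without a stack, using temporary threads."""
    order = []
    node = root
    while node is not None:
        if node.left is None:
            order.append(node.value)
            node = node.right  # follows a real edge or a thread
        else:
            # Find the in-order predecessor: rightmost node of the left subtree.
            pred = node.left
            while pred.right is not None and pred.right is not node:
                pred = pred.right
            if pred.right is None:
                pred.right = node  # step 1: create the thread to the successor
                node = node.left
            else:
                pred.right = None  # step 3: remove the thread again
                order.append(node.value)  # step 2: visit via the thread
                node = node.right
    return order


tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(morris_inorder(tree))  # ['D', 'B', 'E', 'A', 'C']
```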

Breadth-first search

Listed below is pseudocode for a simple queue-based level-order traversal. It requires space proportional to the maximum number of nodes at a given depth, which can be as much as half the total number of nodes. A more space-efficient approach for this type of traversal can be implemented using an iterative deepening depth-first search.
    procedure levelorder(node)
        q ← empty queue
        q.enqueue(node)
        while not q.isEmpty() do
            node ← q.dequeue()
            visit(node)
            if node.left ≠ null then
                q.enqueue(node.left)
            if node.right ≠ null then
                q.enqueue(node.right)
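
A runnable counterpart of the pseudocode might look like the sketch below, which additionally groups the values by depth so that the queue's peak size (the widest level) is easy to see. The `Node` class, the `levels` function and the sample tree are assumptions for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def levels(root):
    """Level-order traversal that returns one list of values per depth."""
    result = []
    queue = deque() if root is None else deque([root])
    while queue:
        level = []
        # At this point the queue holds exactly the nodes of one level.
        for _ in range(len(queue)):
            node = queue.popleft()
            level.append(node.value)
            if node.left is not None:
                queue.append(node.left)
            if node.right is not None:
                queue.append(node.right)
        result.append(level)
    return result


tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
print(levels(tree))  # [['A'], ['B', 'C'], ['D', 'E']]
```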

Infinite trees

While traversal is usually done for trees with a finite number of nodes, it can also be done for infinite trees. This is of particular interest in functional programming, as infinite data structures can often be easily defined and worked with, though they are not strictly evaluated, as this would take infinite time. Some finite trees are too large to represent explicitly, such as the game tree for chess or go, and so it is useful to analyze them as if they were infinite.
A basic requirement for traversal is to visit every node eventually. For infinite trees, simple algorithms often fail this. For example, given a binary tree of infinite depth, a depth-first search will go down one side of the tree, never visiting the rest, and indeed an in-order or post-order traversal will never visit any nodes, as it has not reached a leaf. By contrast, a breadth-first traversal will traverse a binary tree of infinite depth without problem, and indeed will traverse any tree with bounded branching factor.
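The contrast can be sketched on an implicitly defined infinite binary tree. Here each node is assumed to be an integer n with children 2n and 2n + 1 (a numbering chosen only for this example), and both traversals are written as generators so that a finite prefix can be inspected.

```python
from collections import deque
from itertools import islice


def children(n):
    """Children of node n in a hypothetical infinite binary tree."""
    return (2 * n, 2 * n + 1)


def bfs(root):
    queue = deque([root])
    while True:
        n = queue.popleft()
        yield n
        queue.extend(children(n))


def dfs_preorder(root):
    stack = [root]
    while True:
        n = stack.pop()
        yield n
        left, right = children(n)
        stack.extend((right, left))  # left child is popped next


print(list(islice(bfs(1), 7)))           # [1, 2, 3, 4, 5, 6, 7]
print(list(islice(dfs_preorder(1), 5)))  # [1, 2, 4, 8, 16]
```

The breadth-first generator eventually reaches any given node, while the depth-first one only ever follows the leftmost path.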
On the other hand, given a tree of depth 2, where the root has infinitely many children, and each of these children has two children, a depth-first search will visit all nodes, as once it exhausts the grandchildren under one child of the root, it will move on to the next child (assuming it is not post-order, in which case it never reaches the root). By contrast, a breadth-first search will never reach the grandchildren, as it seeks to exhaust the children first.
A more sophisticated analysis of running time can be given via infinite ordinal numbers; for example, the breadth-first search of the depth 2 tree above will take ω·2 steps: ω for the first level, and then another ω for the second level.
Thus, simple depth-first or breadth-first searches do not traverse every infinite tree, and are not efficient on very large trees. However, hybrid methods can traverse any infinite tree, essentially via a diagonal argument.
Concretely, given the infinitely branching tree of infinite depth, label the root (), the children of the root (1), (2), …, the grandchildren (1, 1), (1, 2), …, (2, 1), (2, 2), …, and so on. The nodes are thus in a one-to-one correspondence with finite sequences of positive numbers, which are countable and can be placed in order first by sum of entries, and then by lexicographic order within a given sum, which gives a traversal. Explicitly:
0: ()
1: (1)
2: (1, 1) (2)
3: (1, 1, 1) (1, 2) (2, 1) (3)
4: (1, 1, 1, 1) (1, 1, 2) (1, 2, 1) (1, 3) (2, 1, 1) (2, 2) (3, 1) (4)
etc.
This can be interpreted as mapping the infinite depth binary tree onto this tree and then applying breadth-first search: replace the "down" edges connecting a parent node to its second and later children with "right" edges from the first child to the second child, from the second child to the third child, etc. Thus at each step one can either go down (append a 1 to the end of the sequence) or go right (add one to the last number), except at the root, which is extra and can only go down. This shows the correspondence between the infinite binary tree and the above numbering; the sum of the entries minus one corresponds to the distance from the root, which agrees with the 2^(n−1) nodes at depth n − 1 in the infinite binary tree.
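The ordering described above, first by sum of entries and then lexicographically, can be generated lazily. The sketch below assumes nodes are represented as tuples of positive integers; the function names `compositions` and `diagonal_traversal` are hypothetical.

```python
from itertools import islice


def compositions(total):
    """Yield all tuples of positive integers summing to total, lexicographically."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest


def diagonal_traversal():
    """Enumerate every node label of the infinitely branching, infinitely deep tree."""
    total = 0
    while True:
        yield from compositions(total)
        total += 1


print(list(islice(diagonal_traversal(), 8)))
# [(), (1,), (1, 1), (2,), (1, 1, 1), (1, 2), (2, 1), (3,)]
print(len(list(compositions(4))))  # 8, i.e. 2^(4-1)
```

Because only finitely many sequences share a given sum, every label is reached after a finite number of steps.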