Fagin's theorem


Fagin's theorem is the oldest result of descriptive complexity theory, a branch of computational complexity theory that characterizes complexity classes in terms of logic-based descriptions of their problems rather than by the behavior of algorithms for solving those problems.
The theorem states that the set of all properties expressible in existential second-order logic is precisely the complexity class NP.
It was proven by Ronald Fagin in 1973 in his doctoral thesis, and appears in his 1974 paper. The arity required by the second-order formula was improved in Lynch's theorem, and several results of Grandjean have provided tighter bounds on nondeterministic random-access machines.
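As an illustration of what "expressible in existential second-order logic" means, the following sentence is a standard textbook example (not taken from Fagin's paper) that defines 3-colorability of a graph with edge relation E. The relation symbols R, G and B are names chosen here for the three color classes; the second-order quantifiers guess the classes, and the first-order part checks that they form a proper coloring.

```latex
\[
\exists R\,\exists G\,\exists B\;\Bigl[
  \forall x\,\bigl(R(x)\lor G(x)\lor B(x)\bigr)
  \;\land\;
  \forall x\,\forall y\,\Bigl(E(x,y)\rightarrow
     \neg\bigl(R(x)\land R(y)\bigr)\land
     \neg\bigl(G(x)\land G(y)\bigr)\land
     \neg\bigl(B(x)\land B(y)\bigr)\Bigr)
\Bigr]
\]
```

Since 3-colorability is in NP, the existence of such a sentence is consistent with the theorem; the theorem says that every NP property of finite structures admits a sentence of this shape, and conversely.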

Proof

In addition to Fagin's 1974 paper, Immerman 1999 provides a detailed proof of the theorem. One direction is straightforward: every property defined by an existential second-order formula can be recognized in NP by nondeterministically guessing values for all existentially quantified relation variables and then checking the remaining first-order formula, which takes polynomial time. The main part of the proof is the converse, showing that every language in NP can be described by an existential second-order formula. To do so, one can use second-order existential quantifiers to guess a computation tableau. In more detail, for every timestep of an execution trace of a nondeterministic Turing machine, this tableau encodes the state of the machine, the position of its head on the tape, the contents of every tape cell, and which nondeterministic choice the machine makes at that step. A first-order formula can then constrain this encoded information so that it describes a valid execution trace, one in which the tape contents, machine state, and head position at each timestep follow from those of the previous timestep.
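The easy direction can be made concrete with a small sketch. The code below is an illustration only, under stated assumptions: it fixes one particular existential second-order sentence (three unary relations R, G, B that cover every vertex and contain no edge, which expresses 3-colorability) and replaces the nondeterministic guess by exhaustive search over all candidate relations. The function names `first_order_part_holds` and `satisfies_eso_sentence` are hypothetical and do not come from the cited proofs.

```python
from itertools import product


def first_order_part_holds(vertices, edges, red, green, blue):
    """Check the first-order part for one guessed triple of unary relations."""
    # Every vertex belongs to at least one of the three guessed relations.
    covered = all(v in red or v in green or v in blue for v in vertices)
    # No edge has both endpoints inside the same guessed relation.
    proper = all(
        not (u in colour and v in colour)
        for (u, v) in edges
        for colour in (red, green, blue)
    )
    return covered and proper


def satisfies_eso_sentence(vertices, edges):
    """Deterministic stand-in for the nondeterministic guess of R, G, B.

    Each vertex gets a 3-bit mask saying which of the three relations it
    belongs to, so all 8**n candidate triples are tried in turn. An NP
    machine would guess one triple and run only the polynomial-time check.
    """
    vertices = list(vertices)
    for masks in product(range(8), repeat=len(vertices)):
        red = {v for v, m in zip(vertices, masks) if m & 1}
        green = {v for v, m in zip(vertices, masks) if m & 2}
        blue = {v for v, m in zip(vertices, masks) if m & 4}
        if first_order_part_holds(vertices, edges, red, green, blue):
            return True
    return False
```

For example, `satisfies_eso_sentence(range(3), [(0, 1), (1, 2), (0, 2)])` accepts a triangle, while the complete graph on four vertices is rejected. The exponential loop is only an artifact of simulating the guess deterministically; the verification step is the part that runs in polynomial time.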
A key lemma used in the proof is that it is possible to encode a linear order of length n^k as a 2k-ary relation R on a universe A of size n. One way to achieve this is to choose a linear ordering L of A and then define R to be the lexicographical ordering of k-tuples from A with respect to L.
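The construction in the lemma can be sketched directly. The code below is a minimal illustration, assuming the linear ordering L is given by the order in which the elements of the universe are listed; the function name `lex_order_relation` is hypothetical. It builds R as a set of 2k-tuples, where the first k components form one k-tuple and the last k components form another that is greater than or equal to it in the lexicographical ordering.

```python
from itertools import product


def lex_order_relation(universe, k):
    """Build the 2k-ary relation R encoding the lexicographical order on k-tuples.

    The linear ordering L of the universe is taken to be its listing order.
    A 2k-tuple (a_1, ..., a_k, b_1, ..., b_k) is in R exactly when
    (a_1, ..., a_k) <= (b_1, ..., b_k) lexicographically with respect to L.
    """
    universe = list(universe)
    rank = {element: i for i, element in enumerate(universe)}  # position in L
    relation = set()
    for a in product(universe, repeat=k):
        key_a = tuple(rank[x] for x in a)
        for b in product(universe, repeat=k):
            key_b = tuple(rank[y] for y in b)
            # Python compares tuples of integers lexicographically.
            if key_a <= key_b:
                relation.add(a + b)
    return relation
```

With a universe of size n = 3 and k = 2, the relation orders all 3^2 = 9 pairs in a single chain, so it contains 9 · 10 / 2 = 45 tuples, each of length 2k = 4. This non-strict version includes the reflexive pairs; a strict order would use `<` instead of `<=`.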