Ambiguous grammar

In computer science, an ambiguous grammar is a context-free grammar for which there exists a string that can have more than one leftmost derivation or parse tree, while an unambiguous grammar is a context-free grammar for which every valid string has a unique leftmost derivation or parse tree. Many languages admit both ambiguous and unambiguous grammars, while some languages admit only ambiguous grammars. Any non-empty language admits an ambiguous grammar: take any grammar for it and introduce a duplicate rule or synonym. A language that only admits ambiguous grammars is called an inherently ambiguous language, and there are inherently ambiguous context-free languages. Deterministic context-free grammars are always unambiguous, and are an important subclass of unambiguous grammars; there are non-deterministic unambiguous grammars, however.
For computer programming languages, the reference grammar is often ambiguous, due to issues such as the dangling else problem. If present, these ambiguities are generally resolved by adding precedence rules or other context-sensitive parsing rules, so the overall phrase grammar is unambiguous. Some parsing algorithms can generate sets of parse trees from strings that are syntactically ambiguous.

Examples

Trivial language

The simplest example is the following ambiguous grammar for the trivial language, which consists of only the empty string:
A → A | ε
…meaning that the nonterminal A can be rewritten either as itself or as the empty string. Thus the empty string has leftmost derivations of length 1, 2, 3, and indeed of any length, depending on how many times the rule A → A is used.
This language also has an unambiguous grammar, consisting of a single production rule:
A → ε
…meaning that the unique production can only produce the empty string, which is the unique string in the language.
In the same way, any grammar for a non-empty language can be made ambiguous by adding duplicates.

Unary string

The regular language of unary strings of a given character, say 'a', has the unambiguous grammar:
A → aA | ε
…but also has the ambiguous grammar:
A → aA | Aa | ε
The first grammar produces only right-associative trees, while the second allows both left and right association. This is elaborated below.

Addition and subtraction

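The context-free grammar
A → A + A | A − A | a
is ambiguous since there are two leftmost derivations for the string a + a + a:
A ⇒ A + A ⇒ a + A ⇒ a + A + A ⇒ a + a + A ⇒ a + a + a
and
A ⇒ A + A ⇒ A + A + A ⇒ a + A + A ⇒ a + a + A ⇒ a + a + a
As another example, the grammar is ambiguous since there are two parse trees for the string a + a − a: one groups it as (a + a) − a, the other as a + (a − a).
The language that it generates, however, is not inherently ambiguous; the following is an unambiguous grammar generating the same language:
A → A + a | A − a | a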
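The difference between the two kinds of grammar can be checked by counting parse trees. The following is a minimal sketch, not a general parser: it assumes the ambiguous grammar is A → A + A | A − A | a and the unambiguous one is A → A + a | A − a | a, writes the minus sign as an ASCII "-", and only handles grammars without ε-rules and without unit rules. The names `count_parse_trees`, `AMBIGUOUS`, and `UNAMBIGUOUS` are hypothetical and chosen for this illustration.

```python
from functools import lru_cache

# Each nonterminal maps to a list of right-hand sides (tuples of symbols).
# Any symbol that is not a key of the dictionary is treated as a terminal.
AMBIGUOUS = {"A": [("A", "+", "A"), ("A", "-", "A"), ("a",)]}
UNAMBIGUOUS = {"A": [("A", "+", "a"), ("A", "-", "a"), ("a",)]}


def count_parse_trees(grammar, start, tokens):
    """Count the parse trees of `tokens` rooted at `start`.

    Assumption: the grammar has no epsilon rules and no unit rules, so every
    symbol covers at least one token and the recursion always terminates.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(symbol, i, j):
        # Number of trees in which `symbol` derives tokens[i:j].
        if symbol not in grammar:  # terminal symbol
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(count_sequence(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_sequence(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if not rhs:
            return 1 if i == j else 0
        first, rest = rhs[0], rhs[1:]
        total = 0
        # Leave at least one token for each of the remaining symbols.
        for k in range(i + 1, j - len(rest) + 1):
            left = count_symbol(first, i, k)
            if left:
                total += left * count_sequence(rest, k, j)
        return total

    return count_symbol(start, 0, len(tokens))


if __name__ == "__main__":
    for text in ("a + a + a", "a + a - a"):
        print(text,
              count_parse_trees(AMBIGUOUS, "A", text.split()),
              count_parse_trees(UNAMBIGUOUS, "A", text.split()))
```

Under these assumptions, a string is ambiguous for a grammar exactly when its count is greater than one, since parse trees and leftmost derivations correspond one to one.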

Dangling else

A common example of ambiguity in computer programming languages is the dangling else problem. In many languages, the else in an if-then statement is optional, which results in nested conditionals having multiple ways of being recognized in terms of the context-free grammar.
Concretely, in many languages one may write conditionals in two valid forms: the if-then form, and the if-then-else form. In effect, this makes the else clause optional.
In a grammar containing the rules
Statement → if Condition then Statement |
if Condition then Statement else Statement |
...
Condition →...
some ambiguous phrase structures can appear. The expression
if a then if b then s else s2
can be parsed as either
if a then begin if b then s end else s2
or as
if a then begin if b then s else s2 end
depending on whether the else is associated with the first if or second if.
This is resolved in various ways in different languages. Sometimes the grammar is modified so that it is unambiguous, such as by requiring an endif statement or making else mandatory. In other cases the grammar is left ambiguous, but the ambiguity is resolved by making the overall phrase grammar context-sensitive, such as by associating an else with the nearest if. In this latter case the overall phrase grammar is unambiguous, but the underlying context-free grammar is ambiguous.
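The "nearest if" convention falls out naturally from a greedy recursive-descent parser. The sketch below is illustrative only and assumes a toy token language in which conditions and simple statements are single tokens; `parse_statement` is a hypothetical name, not part of any particular compiler.

```python
def parse_statement(tokens, pos=0):
    """Parse one statement starting at tokens[pos].

    Returns (tree, next_pos). An if-statement becomes a tuple
    ("if", condition, then_branch) or ("if", condition, then_branch, else_branch);
    any other token is treated as a simple statement.
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")

    if tokens[pos] == "if":
        if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
            raise SyntaxError("expected 'if <condition> then'")
        condition = tokens[pos + 1]
        then_branch, pos = parse_statement(tokens, pos + 3)
        # Greedy choice: the innermost open 'if' is the first to see a
        # following 'else', so the 'else' binds to the nearest 'if'.
        if pos < len(tokens) and tokens[pos] == "else":
            else_branch, pos = parse_statement(tokens, pos + 1)
            return ("if", condition, then_branch, else_branch), pos
        return ("if", condition, then_branch), pos

    return tokens[pos], pos + 1


if __name__ == "__main__":
    tree, _ = parse_statement("if a then if b then s else s2".split())
    print(tree)
```

For the input if a then if b then s else s2, this parser attaches the else branch to the inner if, which corresponds to the second of the two readings given above.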

An unambiguous grammar with multiple derivations

The existence of multiple derivations of the same string does not suffice to indicate that the grammar is ambiguous; only multiple leftmost derivations indicate ambiguity.
For example, the simple grammar
S → A + A
A → 0 | 1
is an unambiguous grammar for the language { 0 + 0, 0 + 1, 1 + 0, 1 + 1 }. While each of these four strings has only one leftmost derivation, it has two different derivations, for example
S ⇒ A + A ⇒ 0 + A ⇒ 0 + 0
and
S ⇒ A + A ⇒ A + 0 ⇒ 0 + 0
Only the former derivation is a leftmost one.
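The distinction can be verified by brute force. The following sketch enumerates derivations for the grammar S → A + A, A → 0 | 1, either expanding any nonterminal at each step or only the leftmost one. It assumes a non-recursive grammar, so that every derivation is finite; `count_derivations` and `GRAMMAR` are names introduced here for illustration.

```python
# Nonterminals are the dictionary keys; every other symbol is a terminal.
GRAMMAR = {"S": [("A", "+", "A")], "A": [("0",), ("1",)]}


def count_derivations(form, target, leftmost_only):
    """Count derivations that turn the sentential form `form` into `target`.

    Assumption: the grammar is non-recursive, so the search always terminates.
    """
    if form == target:
        return 1
    # Positions of nonterminals that may be expanded at this step.
    positions = [i for i, symbol in enumerate(form) if symbol in GRAMMAR]
    if leftmost_only:
        positions = positions[:1]
    total = 0
    for i in positions:
        for rhs in GRAMMAR[form[i]]:
            new_form = form[:i] + rhs + form[i + 1:]
            total += count_derivations(new_form, target, leftmost_only)
    return total


if __name__ == "__main__":
    target = ("0", "+", "0")
    print(count_derivations(("S",), target, leftmost_only=False))
    print(count_derivations(("S",), target, leftmost_only=True))
```

For each of the four strings, the unrestricted count is two and the leftmost count is one, which matches the claim that the grammar is unambiguous despite having several derivations per string.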

Recognizing ambiguous grammars

The decision problem of whether an arbitrary grammar is ambiguous is undecidable because it can be shown that it is equivalent to the Post correspondence problem. Nevertheless, there are tools that implement semi-decision procedures for detecting ambiguity of context-free grammars.
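One simple way to search for ambiguity, in the spirit of a semi-decision procedure, is to enumerate leftmost derivations up to some bound and look for a string that is reached twice. The sketch below is a naive illustration under stated assumptions, not the method used by any specific tool: `find_ambiguous_string`, `max_steps`, and `max_len` are hypothetical names, and a result of `None` only means that no witness was found within the bounds, not that the grammar is unambiguous.

```python
from collections import Counter


def find_ambiguous_string(grammar, start, max_steps, max_len):
    """Search for a string with at least two distinct leftmost derivations.

    Only derivations of at most `max_steps` steps and strings of at most
    `max_len` terminals are explored. Returns a tuple of terminals, or None
    if no witness is found within these bounds.
    """
    # frontier[form] = number of leftmost derivations reaching `form`.
    frontier = Counter({(start,): 1})
    derivations = Counter()  # counts per fully terminal string
    for _ in range(max_steps):
        next_frontier = Counter()
        for form, count in frontier.items():
            # Expand the leftmost nonterminal (frontier forms always have one).
            idx = next(i for i, s in enumerate(form) if s in grammar)
            for rhs in grammar[form[idx]]:
                new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
                terminals = sum(1 for s in new_form if s not in grammar)
                if terminals > max_len:
                    continue  # terminals never disappear, so prune here
                if terminals == len(new_form):
                    derivations[new_form] += count
                else:
                    next_frontier[new_form] += count
        frontier = next_frontier
    witnesses = [w for w, n in derivations.items() if n >= 2]
    return min(witnesses, key=lambda w: (len(w), w)) if witnesses else None


if __name__ == "__main__":
    expr = {"A": [("A", "+", "A"), ("A", "-", "A"), ("a",)]}
    print(find_ambiguous_string(expr, "A", max_steps=5, max_len=5))
```

Because the empty tuple is a valid witness (the empty string), callers should compare the result against `None` rather than test its truthiness.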
The efficiency of context-free grammar parsing is determined by the automaton that accepts it. Deterministic context-free grammars are accepted by deterministic pushdown automata and can be parsed in linear time, for example by an LR parser. They form a subset of the context-free grammars, which are accepted by pushdown automata and can be parsed in polynomial time, for example by the CYK algorithm. Unambiguous context-free grammars can be nondeterministic.
For example, the language of even-length palindromes on the alphabet of 0 and 1 has the unambiguous context-free grammar S → 0S0 | 1S1 | ε. An arbitrary string of this language cannot be parsed without reading all its letters first, which means that a pushdown automaton has to try alternative state transitions to accommodate the different possible lengths of a semi-parsed string. Nevertheless, removing grammar ambiguity may produce a deterministic context-free grammar and thus allow for more efficient parsing. Compiler generators such as YACC include features for resolving some kinds of ambiguity, such as precedence and associativity constraints.

Inherently ambiguous languages

The existence of inherently ambiguous languages was proven with Parikh's theorem in 1961 by Rohit Parikh in an MIT research report.
While some context-free languages have both ambiguous and unambiguous grammars, there exist context-free languages for which no unambiguous context-free grammar can exist. An example of an inherently ambiguous language is the union of { a^n b^m c^m d^n | n, m > 0 } with { a^n b^n c^m d^m | n, m > 0 }. This set is context-free, since the union of two context-free languages is always context-free. However, it can be proven that there is no way to unambiguously parse strings in the common subset { a^n b^n c^n d^n | n > 0 }, which is itself not context-free.
