Dangling else
The dangling else is a problem in computer programming in which an optional else clause in an if–then statement results in nested conditionals being ambiguous. Formally, the reference context-free grammar of the language is ambiguous, meaning there is more than one correct parse tree.
In many programming languages one may write conditionally executed code in two forms: the if-then form, and the if-then-else form – the else clause is optional:
if a then s
if b then s1 else s2
This gives rise to an ambiguity in interpretation when there are nested statements, specifically whenever an if-then form appears as
s1
in an if-then-else form:if a then if b then s else s2
In this example,
s
is unambiguously executed when a
is true and b
is true, but one may interpret s2
as being executed when a
is false or when a
is true and b
is false. In other words, one may see the previous statement as either of the following expressions:if a then else s2
if a then
The dangling else problem dates to ALGOL 60, and has been resolved in various ways in subsequent languages. In LR parsers, the dangling else is the archetypal example of a shift-reduce conflict.
Avoiding ambiguity while keeping the syntax
This is a problem that often comes up in compiler construction, especially scannerless parsing. The convention when dealing with the dangling else is to attach the else to the nearby if statement, allowing for unambiguous context-free grammars, in particular. Programming languages like Pascal, C and Java follow this convention, so there is no ambiguity in the semantics of the language, though the use of a parser generator may lead to ambiguous grammars. In these cases alternative grouping is accomplished by explicit blocks, such asbegin...end
in Pascal and
in C.Depending on the compiler construction approach, one may take different corrective actions to avoid ambiguity:
- If the parser is produced by an SLR, LR or LALR LR parser generator, the programmer will often rely on the generated parser feature of preferring shift over reduce whenever there is a conflict. Alternatively, the grammar can be rewritten to remove the conflict, at the expense of an increase in grammar size.
- If the parser is hand written, the programmer may use a non-ambiguous context-free grammar. Alternatively, one may rely on a non-context-free grammar or a parsing expression grammar.
Avoiding ambiguity by changing the syntax
Possible solutions are:
- Having an "end if" statement. Examples of such languages are ALGOL 68, Ada, Eiffel, PL/SQL and Visual Basic
- Disallowing the statement following a "then" to be an "if" itself. This approach is followed by ALGOL 60.
- Requiring braces when an "else" follows an "if". This is effectively true in Python as its indentation rules delimit every block, not just those in "if" statements.
- Requiring every "if" to be paired with an "else". To avoid a similar problem concerning semantics rather than syntax, Racket deviates from Scheme by considering an
if
without a fallback clause to be an error, effectively distinguishing conditional expressions from conditional statements. - Using different keywords for the one-alternative and two-alternative "if" statements. S-algol uses
if e do s
for the one-alternative case andif e1 then e2 else e3
for the general case. - Require braces unconditionally, like Swift and Modula-2. To reduce the resulting clutter, Modula-2 does away with the block opener on non function levels.
Examples
C
In C, the grammar reads, in part:
statement =...
| selection-statement
selection-statement =...
| IF statement
| IF statement ELSE statement
Thus, without further rules, the statement
if if s; else s2;
could ambiguously be parsed as if it were either:
if
or:
if
else
s2;
In practice in C the first tree is chosen, by associating the
else
with the nearest if
.Avoiding the conflict in LR parsers
The above example could be rewritten in the following way to remove the ambiguity :
statement: open_statement
| closed_statement
;
open_statement: IF statement
| IF closed_statement ELSE open_statement
;
closed_statement: non_if_statement
| IF closed_statement ELSE closed_statement
;
non_if_statement:...
;
Any other statement-related grammar rules may also have to be duplicated in this way if they may directly or indirectly end with a
statement
or selection-statement
non-terminal.However, we give grammar that includes both of if and while statements.
statement: open_statement
| closed_statement
;
open_statement: IF statement
| IF closed_statement ELSE open_statement
| WHILE open_statement
;
closed_statement: simple_statement
| IF closed_statement ELSE closed_statement
| WHILE closed_statement
;
simple_statement:...
;
Finally, we give the grammar that forbids ambiguous IF statements.
statement: open_statement
| closed_statement
;
open_statement: IF simple_statement
| IF open_statement
| IF closed_statement ELSE open_statement
| WHILE open_statement
;
closed_statement: simple_statement
| IF closed_statement ELSE closed_statement
| WHILE closed_statement
;
simple_statement:...
;
With this grammar parsing of "if if c else d" fails:
statement
open_statement
IF closed_statement ELSE open_statement
'if' closed_statement 'else' 'd'
and then the parsing fails trying to match
closed_statement
to "if c". An attempt with closed_statement
fails in the same way.