Pumping lemma for regular languages

In the theory of formal languages, the pumping lemma for regular languages is a lemma that describes an essential property of all regular languages. Informally, it says that all sufficiently long words in a regular language may be pumped—that is, have a middle section of the word repeated an arbitrary number of times—to produce a new word that also lies within the same language.
Specifically, the pumping lemma says that for any regular language L there exists a constant p such that any word w in L with length at least p can be split into three substrings x, y and z, with w = xyz, where the middle portion y must not be empty, such that the words xz, xyz, xyyz, xyyyz, ... constructed by repeating y zero or more times are still in L. This process of repetition is known as "pumping". Moreover, the pumping lemma guarantees that the length of xy will be at most p, imposing a limit on the ways in which w may be split. Finite languages vacuously satisfy the pumping lemma by having p equal to the maximum string length in L plus one.
The pumping lemma is useful for disproving the regularity of a specific language in question. It was first proven by Michael Rabin and Dana Scott in 1959, and rediscovered shortly after by Yehoshua Bar-Hillel, Micha A. Perles, and Eli Shamir in 1961, as a simplification of their pumping lemma for context-free languages.

Formal statement

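Let L be a regular language. Then there exists an integer p ≥ 1 depending only on L such that every string w in L of length at least p (p is called the "pumping length") can be written as w = xyz (that is, w can be divided into three substrings), satisfying the following conditions:

1. |y| ≥ 1
2. |xy| ≤ p
3. xy^n z ∈ L for all n ≥ 0

y is the substring that can be pumped (removed or repeated any number of times, with the resulting string always in L). Condition 1 means the loop y to be pumped must be of length at least one; condition 2 means the loop must occur within the first p characters. |x| must be smaller than p (a consequence of conditions 1 and 2), but apart from that, there is no restriction on x and z.
In simple words, for any regular language L, any sufficiently long word w in L can be split into 3 parts, i.e. w = xyz, such that all the strings xy^n z for n ≥ 0 are also in L.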
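The three conditions can be explored with a small brute-force search. The following Python sketch is only an illustration: the function name `find_pumpable_split`, the bound `max_n` and the example language (ab)* are assumptions introduced here, and checking n only up to `max_n` is a finite approximation of condition 3, not a proof.

```python
import re

def find_pumpable_split(w, p, in_language, max_n=4):
    """Return the first (x, y, z) with w == x + y + z, |y| >= 1, |xy| <= p,
    and x + y*n + z in the language for n = 0..max_n, or None if none exists."""
    limit = min(p, len(w))
    for i in range(limit + 1):              # i = |x|
        for j in range(i + 1, limit + 1):   # j = |xy|, so |y| = j - i >= 1
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_language(x + y * n + z) for n in range(max_n + 1)):
                return x, y, z
    return None

# Hypothetical example language (ab)*, chosen only for illustration.
def in_ab_star(s):
    return re.fullmatch(r"(ab)*", s) is not None

print(find_pumpable_split("ababab", 2, in_ab_star))  # ('', 'ab', 'abab')
```

Here the split x = "" (empty), y = "ab", z = "abab" survives pumping, while the candidate y = "a" is rejected because removing it leaves "babab", which is not in (ab)*.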
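Below is a formal expression of the pumping lemma:
∀L ⊆ Σ* (regular(L) ⇒ ∃p ≥ 1 ∀w ∈ L (|w| ≥ p ⇒ ∃x, y, z ∈ Σ* (w = xyz ∧ |y| ≥ 1 ∧ |xy| ≤ p ∧ ∀n ≥ 0 (xy^n z ∈ L))))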

Use of the lemma

The pumping lemma is often used to prove that a particular language is non-regular: a proof by contradiction may consist of exhibiting a word in the language that lacks the property outlined in the pumping lemma.
For example, the language L = {a^n b^n : n ≥ 0} over the alphabet Σ = {a, b} can be shown to be non-regular as follows:
Let w, x, y, z and p be as used in the formal statement of the pumping lemma above. Assume, for contradiction, that L is regular, so that some constant p exists as required by the lemma. Let w in L be given by w = a^p b^p, which is a string longer than p. By the pumping lemma, there must be some decomposition w = xyz with |xy| ≤ p and |y| ≥ 1 such that xy^i z is in L for every i ≥ 0. Using |xy| ≤ p, we know y only consists of instances of a. Moreover, because |y| ≥ 1, it contains at least one instance of the letter a. We now pump y up: xy^2 z has more instances of the letter a than the letter b, since we have added some instances of a without adding instances of b. Therefore, xy^2 z is not in L. We have reached a contradiction. Therefore, the assumption that L is regular must be incorrect. Hence L is not regular.
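The argument can be sanity-checked by brute force for small p. The sketch below assumes the language {a^n b^n : n ≥ 0}; the helper names `in_anbn` and `surviving_splits` are hypothetical. It lists every split of a^p b^p with |xy| ≤ p and |y| ≥ 1 whose pumped word xy^2 z is still in the language, and the proof predicts that this list is always empty. Such a check for a few values of p illustrates the proof but does not replace it.

```python
def in_anbn(s):
    """Membership test for {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def surviving_splits(w, p, in_language):
    """All (x, y, z) with w = xyz, |xy| <= p, |y| >= 1 and x + y*2 + z in the language."""
    found = []
    limit = min(p, len(w))
    for i in range(limit + 1):              # i = |x|
        for j in range(i + 1, limit + 1):   # j = |xy|
            x, y, z = w[:i], w[i:j], w[j:]
            if in_language(x + y * 2 + z):
                found.append((x, y, z))
    return found

# For each small p, no admissible split of a^p b^p survives pumping up once.
for p in range(1, 8):
    assert surviving_splits("a" * p + "b" * p, p, in_anbn) == []
```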
The proof that the language of balanced parentheses is not regular follows the same idea. Given p, there is a string of balanced parentheses that begins with more than p left parentheses, so that y will consist entirely of left parentheses. By repeating y, we can produce a string that does not contain the same number of left and right parentheses, and so they cannot be balanced.

Proof of the pumping lemma

For every regular language there is a finite state automaton (FSA) that accepts the language. The number of states in such an FSA is counted, and that count is used as the pumping length p. For a string of length at least p, let q_0 be the start state and let q_1, ..., q_p be the sequence of the next p states visited as the string is read. Because the FSA has only p states, within this sequence of p + 1 visited states there must be at least one state that is repeated. Write q_s for such a state. The transitions that take the machine from the first encounter of state q_s to the second encounter of state q_s match some string. This string is called y in the lemma, and since the machine will match a string without the y portion, or with the string y repeated any number of times, the conditions of the lemma are satisfied.
For example, consider an FSA with four states that accepts the string abcd. Since this string has a length at least as large as the number of states, which is four, the pigeonhole principle indicates that there must be at least one repeated state among the start state and the next four visited states. In this example, only q_1 is a repeated state. Since the substring bc takes the machine through transitions that start at state q_1 and end at state q_1, that portion could be repeated and the FSA would still accept, giving the string abcbcd. Alternatively, the bc portion could be removed and the FSA would still accept, giving the string ad. In terms of the pumping lemma, the string abcd is broken into an x portion a, a y portion bc and a z portion d.
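The pigeonhole step can be sketched in Python. The transition table below is an assumption reconstructed from the description of the example (a leads from the start state to the repeated state, bc loops back to it, d leads to an accepting state); the state names and the helpers `pumping_split` and `accepts` are hypothetical.

```python
def pumping_split(delta, start, w):
    """Split w into (x, y, z) at the first repeated state of the run, or None.
    Assumes every transition needed to read w is defined in delta."""
    state = start
    first_seen = {state: 0}          # state -> number of symbols read at first visit
    for k, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in first_seen:      # second visit: the symbols in between form y
            i = first_seen[state]
            return w[:i], w[i:k], w[k:]
        first_seen[state] = k
    return None

def accepts(delta, start, accepting, w):
    """Run the (partial) automaton; a missing transition means rejection."""
    state = start
    for symbol in w:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in accepting

# Assumed four-state automaton matching the example in the text.
DELTA = {("q0", "a"): "q1", ("q1", "b"): "q2", ("q2", "c"): "q1", ("q1", "d"): "q3"}

x, y, z = pumping_split(DELTA, "q0", "abcd")
print(x, y, z)  # a bc d
```

With this table, `accepts(DELTA, "q0", {"q3"}, x + y * n + z)` holds for n = 0 (ad), n = 1 (abcd) and n = 2 (abcbcd), matching the strings named in the example.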

General version of pumping lemma for regular languages

If a language L is regular, then there exists a number p ≥ 1 (the pumping length) such that every string uwv in L with |w| ≥ p can be written in the form
uwv = uxyzv
with strings x, y and z such that |xy| ≤ p, |y| ≥ 1 and
uxy^i zv is in L for every integer i ≥ 0.
From this, the above standard version follows as a special case, with both u and v being the empty string.
Since the general version imposes stricter requirements on the language, it can be used to prove the non-regularity of many more languages.

Converse of lemma not true

While the pumping lemma states that all regular languages satisfy the conditions described above, the converse of this statement is not true: a language that satisfies these conditions may still be non-regular. In other words, both the original and the general version of the pumping lemma give a necessary but not sufficient condition for a language to be regular.
For example, consider the following language:
L = {uvwxy : u, y ∈ {0,1,2,3}*; v, w, x ∈ {0,1,2,3} ∧ (v = w ∨ v = x ∨ x = w)} ∪ {w : w ∈ {0,1,2,3}* ∧ precisely 1/7 of the characters in w are 3's}
In other words, L contains all strings over the alphabet {0,1,2,3} with a substring of length 3 including a duplicate character, as well as all strings over this alphabet where precisely 1/7 of the string's characters are 3's. This language is not regular but can still be "pumped" with p = 5. Suppose some string s has length at least 5. Then, since the alphabet has only four characters, at least two of the first five characters in the string must be duplicates. They are separated by at most three characters.

If the duplicate characters are separated by 0 characters, or 1, pump one of the other two characters in the string, which will not affect the substring containing the duplicates.
If the duplicate characters are separated by 2 or 3 characters, pump 2 of the characters separating them. Pumping either down or up results in the creation of a substring of size 3 that contains 2 duplicate characters.
The second condition of L ensures that L is not regular: Consider the string (013)^(3m)(012)^i. This string is in L exactly when i = 4m, and thus L is not regular by the Myhill–Nerode theorem.
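Both halves of this counterexample can be checked mechanically on small inputs. The Python sketch below assumes the alphabet {0,1,2,3}, the pumping length 5 and the family of strings "013" repeated 3m times followed by "012" repeated i times; all function names are hypothetical. Pumping is tested only for n = 0..3 and only for words up to a fixed length, so this is a finite check and not a proof.

```python
from itertools import product

ALPHABET = "0123"

def in_L(s):
    """s has a length-3 window with a repeated character, or exactly 1/7 of s is '3'."""
    has_dup_window = any(len(set(s[k:k + 3])) < 3 for k in range(len(s) - 2))
    one_seventh_threes = 7 * s.count("3") == len(s)
    return has_dup_window or one_seventh_threes

def has_pumpable_split(s, p=5, max_n=3):
    """True if some split with |xy| <= p, |y| >= 1 stays in L for n = 0..max_n."""
    limit = min(p, len(s))
    for i in range(limit + 1):
        for j in range(i + 1, limit + 1):
            x, y, z = s[:i], s[i:j], s[j:]
            if all(in_L(x + y * n + z) for n in range(max_n + 1)):
                return True
    return False

def all_long_words_pump(max_len=7):
    """Check every word of L with length 5..max_len for a pumpable split."""
    for length in range(5, max_len + 1):
        for chars in product(ALPHABET, repeat=length):
            s = "".join(chars)
            if in_L(s) and not has_pumpable_split(s):
                return False
    return True

def witness_matches(max_m=3, max_i=15):
    """(013)^(3m)(012)^i should be in L exactly when i == 4m."""
    return all(in_L("013" * (3 * m) + "012" * i) == (i == 4 * m)
               for m in range(1, max_m + 1) for i in range(max_i + 1))
```

Calling `all_long_words_pump()` exercises the pumping argument, and `witness_matches()` exercises the membership claim used for the Myhill–Nerode argument. The length bound 7 is chosen so that some words belonging to L only through the 1/7 condition are included.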

The Myhill–Nerode theorem provides a test that exactly characterizes regular languages. The typical method for proving that a language is regular is to construct either a finite state machine or a regular expression for the language.
