Kolakoski sequence

In mathematics, the Kolakoski sequence, sometimes also known as the Oldenburger-Kolakoski sequence, is an infinite sequence of symbols that is the sequence of run lengths in its own run-length encoding, and the prototype for an infinite family of related sequences. It is named after the recreational mathematician William Kolakoski who described it in 1965, but subsequent research has shown that the sequence was previously discussed by Rufus Oldenburger in 1939.

Definition of the classic Kolakoski sequence

The initial terms of the Kolakoski sequence are:
Each symbol occurs in a "run" of either one or two consecutive terms, and writing down the lengths of these runs gives exactly the same sequence:
The description of the Kolakoski sequence is therefore reversible. If K stands for "the Kolakoski sequence", description #1 logically implies description #2 :
Accordingly, one can say that each term of the Kolakoski sequence generates a run of one or two future terms. The first 1 of the sequence generates a run of "1", i.e. itself; the first 2 generates a run of "22", which includes itself; the second 2 generates a run of "11"; and so on. Each number in the sequence is the length of the next run to be generated, and the element to be generated alternates between 1 and 2:
As can be seen, the length of the sequence at each stage is equal to the sum of terms in the previous stage. This animation illustrates the process:
These self-generating properties, which remain if the sequence is written without the initial 1, mean that the Kolakoski sequence can be described as a fractal, or mathematical object that encodes its own representation on other scales. Bertran Steinsky has created a recursive formula for the i-th term of the sequence but the sequence is believed to be aperiodic, that is, its terms do not have a general repeating pattern.

Other self-generating Kolakoski sequences

From finite integer alphabets

The Kolakoski sequence is the prototype for an infinite family of other sequences that are each their own run-length encodings. Each sequence is based on what is formally called an alphabet of integers. For example, the classic Kolakoski sequence described above has the alphabet. Some of the additional Kolakoski sequences listed at the OEIS are:
Like the Kolakoski -sequence, writing the run-lengths returns the same sequence. More generally, any alphabet of integers,, can generate a Kolakoski sequence if the same integer does not occur 1) twice or more in a row; 2) at the beginning and end of the alphabet. For example, the alphabet generates:
And the alphabet generates:
Again, writing the run-lengths returns the same sequence.

From infinite integer alphabets

Kolakoski sequences can also be created from infinite alphabets of integers, such as :
The infinite alphabet generates the Golomb sequence:
A Kolakoski sequence can also be created from integers chosen at random from a finite alphabet, with the restriction that the same number cannot be chosen twice in a row. If the finite alphabet is, one possible sequence is this:
In effect, the sequence is based on the infinite alphabet, which contains a random sequence of 1s, 2s and 3s from which repeats have been removed.

Chain sequences

Whereas the classic Kolakoski -sequence generates itself, these two sequences generate each other:
In other words, if you write the run-lengths of the first sequence, you generate the second; if you write the run-lengths of the second, you generate the first. In the following chain of three sequences, the run-lengths of each generate the next in the order 1 → 2 → 3 → 1:
The sequences use the integer alphabet, but each starts at a different point in the alphabet. The following five sequences form a similar chain using the alphabet :
However, to create a sequence-chain of length l, it is not necessary to have distinct integer alphabets of size l. For example, the alphabet-series,,, and is sufficient for a five-link chain:
Each sequence is unique and the run-lengths of each generate the terms of the next sequence in the chain. The integer alphabets used to generate a chain can also be of different sizes. A Kolakoski mirror can be created from the alphabets and :

Research into the classic sequence

Density of the sequence

It seems plausible that the density of 1s in the Kolakoski -sequence is 1/2, but this conjecture remains unproved. Václav Chvátal has proved that the upper density of 1s is less than 0.50084. Nilsson has used the same method with far greater computational power to obtain the bound 0.500080.
Although calculations of the first 3×10⁸ values of the sequence appeared to show its density converging to a value slightly different from 1/2, later calculations that extended the sequence to its first 10¹³ values show the deviation from a density of 1/2 growing smaller, as one would expect if the limiting density actually is 1/2.

Connection with tag systems

describes the Kolakoski sequence in connection with the history of cyclic tag systems.

Uniqueness of the sequence

Some discussions of the classic Kolakoski sequence make the claim that, written with or without the initial 1, it is the "only sequence" that is its own run-length encoding or the only such sequence that begins with 1. As can be seen above, this is untrue: an infinite number of additional sequences possess these properties. However, the Kolakoski - and -sequences are the only such sequences using solely the integers 1 and 2.

Anti-Kolakoski sequence

In the anti-Kolakoski sequence, the run-lengths of 1s and 2s never coincide with the terms of the original sequence:
As can be seen, the run-lengths of the anti-Kolakoski sequence return the Kolakoski -sequence, which means that the former can be created from the latter by simple subtraction. If k is the i-th term of the Kolakoski -sequence and ak is the i-th term of the anti-Kolakoski sequence, then ak = 3-k, just as k = 3-ak. Accordingly, like the Kolakoski sequence, the anti-Kolakoski sequence retains its defining property when written without its initial term, i.e. 2.

Kolakoski Constant

The so-called Kolakoski constant is created by subtracting 1 from each term of the Kolakoski -sequence and treating the result as a binary fraction.

Algorithms

Algorithm for the Kolakoski {1,2}-sequence

The Kolakoski -sequence may be generated by an algorithm that, in the i-th iteration, reads the value x_i that has already been output as the i-th value of the sequence. Then, if i is odd, it outputs x_i copies of the number 1, while if i is even, it outputs x_i copies of the number 2.
Thus, the first few steps of the algorithm are:

The first value has not yet been output, so set x₁ = 1, and output 1 copy of the number 1
The second value has not yet been output, so set x₂ = 2, and output 2 copies of the number 2
The third value x₃ was output as 2 in the second step, so output 2 copies of the number 1.
The fourth value x₄ was output as 1 in the third step, so output 1 copy of the number 2. Etc.

This algorithm takes linear time, but because it needs to refer back to earlier positions in the sequence it needs to store the whole sequence, taking linear space. An alternative algorithm that generates multiple copies of the sequence at different speeds, with each copy of the sequence using the output of the previous copy to determine what to do at each step, can be used to generate the sequence in linear time and only logarithmic space.

General algorithm for Kolakoski sequences

In general, a Kolakoski sequence for any integer alphabet may be generated by an algorithm that, in the i-th iteration, reads the value x_i that has already been output as the i-th value of the sequence. At each step, the output n_i is adjusted according to the size of the alphabet, reverting to n₁ when the final position in the alphabet is exceeded. The first few steps of the algorithm for the alphabet are:

The first value has not yet been output, so set x₁ = 1 = n₁, and output 1 copy of the number 1
The second value has not yet been output, so set x₂ = 2 = n₂, and output 2 copies of the number 2
The third value x₃ was output as 2 in the second step, so output 2 copies of 3 = n₃.
The fourth value x₄ was output as 3 in the third step, so output 3 copies of 4 = n₄.
The fifth value x₅ was output as 3 in the third step, so output 3 copies of the number 1 = n_1=adjusted.
The sixth value x₆ was output as 4 in the fourth step, so output 4 copies of the number 2 = n_2=adjusted. Etc.

The resultant sequence is:

Algorithm for Kolakoski-chains

Kolakoski chains of any desired length can be generated with a simple algorithm. Suppose that one wishes to generate a chain with 3 sequences where the terms of seq are generated by the run-lengths of seq and the alphabet is. Begin by setting the first term of seq, the initial sequence in the chain, to the value of 2. The next sequence in the chain, seq, whose run-lengths generate the terms of seq, must therefore have the terms. Therefore seq, whose run-lengths generate seq =, must have the runs. Here is the first stage of the algorithm:
Now, note that the run-lengths of seq generate the terms of seq, which means that the terms of seq generate the runs of seq. Because seq = after stage 1 of the algorithm, seq must equal in the next stage. From this extended seq, one can generate further runs of seq, then further runs of seq:
Now use the terms of seq in stage 2 to generate further runs of seq in stage 3:
The sequences can now be re-arranged so that the run-lengths of seq generate the terms of seq = seq):
If the chain has 5 sequences, the algorithm yields these stages:
Finally, the sequences are re-arranged so that the run-lengths of seq generate the terms of seq:

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...