Collatz conjecture


The Collatz conjecture is a conjecture in mathematics that concerns a sequence defined as follows: start with any positive integer $n$. Each subsequent term is obtained from the previous term as follows: if the previous term is even, the next term is one half of the previous term; if the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that no matter what value of $n$ is chosen, the sequence will always reach 1.
The conjecture is named after Lothar Collatz, who introduced the idea in 1937, two years after receiving his doctorate. It is also known as the $3n + 1$ problem, the $3n + 1$ conjecture, the Ulam conjecture, Kakutani's problem, the Thwaites conjecture, Hasse's algorithm, or the Syracuse problem.
The sequence of numbers involved is sometimes referred to as the hailstone sequence or hailstone numbers, or as wondrous numbers.
Paul Erdős said about the Collatz conjecture: "Mathematics may not be ready for such problems." He also offered US$500 for its solution. Jeffrey Lagarias in 2010 claimed that based only on known information about this problem, "this is an extraordinarily difficult problem, completely out of reach of present day mathematics."

Statement of the problem

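Consider the following operation on an arbitrary positive integer: if the number is even, divide it by two; if the number is odd, triple it and add one.
In modular arithmetic notation, define the function $f$ as follows:
$$f(n) = \begin{cases} n/2 & \text{if } n \equiv 0 \pmod{2}, \\ 3n + 1 & \text{if } n \equiv 1 \pmod{2}. \end{cases}$$
Now form a sequence by performing this operation repeatedly, beginning with any positive integer, and taking the result at each step as the input at the next.
In notation:
$$a_i = \begin{cases} n & \text{for } i = 0, \\ f(a_{i-1}) & \text{for } i > 0, \end{cases}$$
that is, $a_i$ is the value of $f$ applied to $n$ recursively $i$ times: $a_i = f^i(n)$.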
The Collatz conjecture is: This process will eventually reach the number 1, regardless of which positive integer is chosen initially.
The smallest $i$ such that $a_i = 1$ is called the total stopping time of $n$. The conjecture asserts that every $n$ has a well-defined total stopping time. If, for some $n$, such an $i$ does not exist, we say that $n$ has infinite total stopping time and the conjecture is false.
If the conjecture is false, it can only be because there is some starting number which gives rise to a sequence that does not contain 1. Such a sequence would either enter a repeating cycle that excludes 1, or increase without bound. No such sequence has been found.
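The process described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the names `collatz_step`, `collatz_sequence` and `total_stopping_time` are chosen here and do not come from the literature, and the loop would never terminate for a starting value that is a counterexample.

```python
def collatz_step(n: int) -> int:
    """One application of the operation: halve even n, map odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_sequence(n: int) -> list[int]:
    """Return the terms starting at n, up to and including the first 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    seq = [n]
    while seq[-1] != 1:
        seq.append(collatz_step(seq[-1]))
    return seq


def total_stopping_time(n: int) -> int:
    """Number of steps needed to reach 1 from n."""
    return len(collatz_sequence(n)) - 1
```

For example, `collatz_sequence(12)` returns `[12, 6, 3, 10, 5, 16, 8, 4, 2, 1]`, so `total_stopping_time(12)` is 9.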

Examples

For instance, starting with $n = 12$, one gets the sequence 12, 6, 3, 10, 5, 16, 8, 4, 2, 1.
The number $n = 19$, for example, takes longer to reach 1: 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.
The sequence for $n = 27$ takes 111 steps, climbing to a high of 9232 before descending to 1.
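Numbers with a total stopping time longer than that of any smaller starting value form a sequence beginning with: 1, 2, 3, 6, 7, 9, 18, 25, 27, 54, 73, 97, 129, 171, 231, 313, 327, 649, 703, 871, ...
The starting values whose maximum trajectory point is greater than that of any smaller starting value are as follows: 1, 2, 3, 7, 15, 27, 255, 447, 639, 703, ...
The numbers of steps for $n$ to reach 1, for $n = 1, 2, 3, \ldots$, are: 0, 1, 7, 2, 5, 8, 16, 3, 19, 6, 14, 9, 9, 17, 17, 4, 12, 20, 20, 7, ...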
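The longest progression for any initial starting number
less than 10 is 9, which has 19 steps;
less than 100 is 97, which has 118 steps;
less than 1000 is 871, which has 178 steps;
less than $10^4$ is 6171, which has 261 steps;
less than $10^5$ is 77031, which has 350 steps;
less than $10^6$ is 837799, which has 524 steps;
less than $10^7$ is 8400511, which has 685 steps;
less than $10^8$ is 63728127, which has 949 steps;
less than $10^9$ is 670617279, which has 986 steps;
less than $10^{10}$ is 9780657630, which has 1132 steps.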
These numbers are the lowest ones with the indicated step count, but not necessarily the only ones below the given limit. As an example, 9780657631 has 1132 steps, as does 9780657630.
The powers of two converge to one quickly because $2^n$ is halved $n$ times to reach 1, and is never increased.
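The record-setting starting values discussed in this section can be recomputed by brute force. The following is a minimal sketch, assuming the standard (non-shortcut) map; the helper names `steps_and_peak` and `record_holders` are hypothetical and chosen only for this illustration.

```python
def steps_and_peak(n: int) -> tuple[int, int]:
    """Return (number of steps to reach 1, largest value on the trajectory)."""
    steps, peak = 0, n
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        peak = max(peak, n)
        steps += 1
    return steps, peak


def record_holders(limit: int) -> tuple[list[int], list[int]]:
    """Starting values up to limit that set a new record for steps or peak."""
    best_steps, best_peak = -1, 0
    step_records, peak_records = [], []
    for n in range(1, limit + 1):
        steps, peak = steps_and_peak(n)
        if steps > best_steps:       # strictly more steps than any smaller n
            best_steps = steps
            step_records.append(n)
        if peak > best_peak:         # strictly higher peak than any smaller n
            best_peak = peak
            peak_records.append(n)
    return step_records, peak_records
```

Calling `record_holders(100)` reproduces the start of both record lists; the search is exponential in nothing but is still slow in pure Python for limits much beyond a few million.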

Cycles

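In this part, consider the "shortcut" form of the Collatz function
$$f(n) = \begin{cases} n/2 & \text{if } n \equiv 0 \pmod{2}, \\ (3n + 1)/2 & \text{if } n \equiv 1 \pmod{2}. \end{cases}$$
A cycle is a sequence $(a_0, a_1, \ldots, a_q)$ of distinct positive integers where $f(a_0) = a_1$, $f(a_1) = a_2$, ..., and $f(a_q) = a_0$.
The only known cycle is $(1, 2)$ of length 2, called the trivial cycle.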

Cycle length

The length of a non-trivial cycle is known to be at least 17087915. In fact, Eliahou proved that the period $p$ of any non-trivial cycle is of the form
$$p = 301994\,a + 17087915\,b + 85137581\,c,$$
where $a$, $b$ and $c$ are non-negative integers, $b \geq 1$ and $ac = 0$. This result is based on the continued fraction expansion of $\ln 3 / \ln 2$.
A similar reasoning that accounts for the recent verification of the conjecture up to $2^{68}$ leads to the improved lower bound 114208327604. This lower bound is consistent with the above result, since $114208327604 = 17087915 \times 361 + 85137581 \times 1269$.

k-cycles

A $k$-cycle is a cycle that can be partitioned into $k$ contiguous subsequences, each consisting of an increasing sequence of odd numbers followed by a decreasing sequence of even numbers. For instance, if the cycle consists of a single increasing sequence of odd numbers followed by a decreasing sequence of even numbers, it is called a 1-cycle.
Steiner proved that there is no 1-cycle other than the trivial cycle $(1, 2)$. Simons used Steiner's method to prove that there is no 2-cycle. Simons & de Weger extended this proof up to 68-cycles: there is no $k$-cycle up to $k = 68$. Beyond 68, this method gives upper bounds for the elements in such a cycle: for example, if there is a 75-cycle, then at least one element of the cycle is less than $2385 \times 2^{50}$. Therefore, as exhaustive computer searches continue, larger cycles may be ruled out. To state the argument more intuitively: we need not look for cycles that have at most 68 trajectories, where each trajectory consists of consecutive ups followed by consecutive downs.

Supporting arguments

Although the conjecture has not been proven, most mathematicians who have looked into the problem think the conjecture is true because experimental evidence and heuristic arguments support it.

Experimental evidence

As of 2020, the conjecture has been checked by computer for all starting values up to $2^{68}$. All initial values tested so far eventually end in the repeating cycle (4; 2; 1), which has only three terms. From this lower bound on the starting value, a lower bound can also be obtained for the number of terms a repeating cycle other than (4; 2; 1) must have. When this relationship was established in 1981, the formula gave a lower bound of 35400 terms.
This computer evidence is not a proof that the conjecture is true. As shown in the cases of the Pólya conjecture, the Mertens conjecture, and Skewes' number, sometimes a conjecture's only counterexamples are found when using very large numbers.

A probabilistic heuristic

If one considers only the odd numbers in the sequence generated by the Collatz process, then each odd number is on average 3/4 of the previous one. This yields a heuristic argument that every hailstone sequence should decrease in the long run, although this is not evidence against other cycles, only against divergence. The argument is not a proof because it assumes that hailstone sequences are assembled from uncorrelated probabilistic events.
And even if the probabilistic reasoning were rigorous, this would still imply only that the conjecture is almost surely true for any given integer, which does not necessarily imply that it is true for all integers.
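The average factor of 3/4 between consecutive odd terms can be sketched as follows. The derivation rests on the heuristic assumption, which is not established rigorously, that for an odd term $n$ the exponent $k$ of the largest power of 2 dividing $3n + 1$ behaves like a random variable with $P(k = j) = 2^{-j}$ for $j \geq 1$, and it ignores the "+1":

$$\mathbb{E}[k] = \sum_{j \geq 1} j \, 2^{-j} = 2, \qquad \exp\left(\mathbb{E}\left[\ln \frac{3}{2^{k}}\right]\right) = 3 \cdot 2^{-\mathbb{E}[k]} = \frac{3}{4}.$$

Under this assumption the geometric mean of the ratio between one odd term and the next is 3/4, which is less than 1.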
Terence Tao proved, using probabilistic methods, that almost all Collatz orbits are bounded by any function that diverges to infinity. Responding to this work, Quanta Magazine wrote that Tao "came away with one of the most significant results on the Collatz conjecture in decades."

Rigorous bounds

In a computer-aided proof, Krasikov and Lagarias showed that the number of integers in the interval $[1, x]$ that eventually reach 1 is at least equal to $x^{0.84}$ for all sufficiently large $x$.

Other formulations of the conjecture

In reverse

Another approach to proving the conjecture considers the bottom-up method of growing the so-called Collatz graph. The Collatz graph is a graph defined by the inverse relation
$$R(n) = \begin{cases} \{2n\} & \text{if } n \equiv 0, 1, 2, 3, 5 \pmod{6}, \\ \{2n, (n-1)/3\} & \text{if } n \equiv 4 \pmod{6}. \end{cases}$$
So, instead of proving that all positive integers eventually lead to 1, we can try to prove that 1 leads backwards to all positive integers. For any integer $n$, $n \equiv 1 \pmod{2}$ if and only if $3n + 1 \equiv 4 \pmod{6}$. Equivalently, $(n-1)/3 \equiv 1 \pmod{2}$ if and only if $n \equiv 4 \pmod{6}$. Conjecturally, this inverse relation forms a tree except for the 1–2–4 loop.
When the relation $3n + 1$ of the function $f$ is replaced by the common substitute "shortcut" relation $(3n + 1)/2$, the Collatz graph is defined by the inverse relation
$$R(n) = \begin{cases} \{2n\} & \text{if } n \equiv 0, 1 \pmod{3}, \\ \{2n, (2n-1)/3\} & \text{if } n \equiv 2 \pmod{3}. \end{cases}$$
For any integer $n$, $n \equiv 1 \pmod{2}$ if and only if $(3n + 1)/2 \equiv 2 \pmod{3}$. Equivalently, $(2n-1)/3 \equiv 1 \pmod{2}$ if and only if $n \equiv 2 \pmod{3}$. Conjecturally, this inverse relation forms a tree except for a 1–2 loop.
Alternatively, replace the $3n + 1$ with $n'/H(n')$, where $n' = 3n + 1$ and $H(n')$ is the highest power of 2 that divides $n'$. The resulting function $f$ maps from odd numbers to odd numbers. Now suppose that for some odd number $n$, applying this operation $k$ times yields the number 1, that is, $f^k(n) = 1$. Then in binary, the number $n$ can be written as the concatenation of strings $w_k w_{k-1} \ldots w_1$, where each $w_h$ is a finite and contiguous extract from the representation of $1/3^h$. The representation of $n$ therefore holds the repetends of $1/3^h$, where each repetend is optionally rotated and then replicated up to a finite number of bits. It is only in binary that this occurs. Conjecturally, every binary string $s$ that ends with a '1' can be reached by a representation of this form.

As an abstract machine that computes in base two

Repeated applications of the Collatz function can be represented as an abstract machine that handles strings of bits. The machine will perform the following three steps on any odd number until only one "1" remains:
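  1. Append 1 to the (right) end of the number in binary (giving $2n + 1$);
  2. Add this to the original number by binary addition (giving $2n + 1 + n = 3n + 1$);
  3. Remove all trailing "0"s (that is, repeatedly divide by 2 until the result is odd).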

    Example

The starting number 7 is written in base two as 111. The resulting Collatz sequence is:

111
1111
10110
10111
100010
100011
110100
11011
101000
1011
10000
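The three steps of the machine can be imitated directly on bit strings. The sketch below is illustrative only; the function name `binary_machine` is hypothetical, and Python integers are used for the binary addition in step 2 rather than a hand-written adder.

```python
def binary_machine(n: int) -> list[str]:
    """Trace the bit-string machine from an odd starting value n.

    The returned list holds the starting string followed by the
    results of steps 1 and 2 of every round.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    bits = format(n, "b")
    trace = [bits]
    while bits != "1":
        appended = bits + "1"                                  # step 1: 2n + 1
        total = format(int(bits, 2) + int(appended, 2), "b")   # step 2: 3n + 1
        bits = total.rstrip("0")                               # step 3: strip trailing zeros
        trace += [appended, total]
    return trace
```

Running `binary_machine(7)` yields the eleven strings listed in the example, from `111` to `10000`.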

As a parity sequence

For this section, consider the Collatz function in the slightly modified form
$$f(n) = \begin{cases} n/2 & \text{if } n \equiv 0 \pmod{2}, \\ (3n + 1)/2 & \text{if } n \equiv 1 \pmod{2}. \end{cases}$$
This can be done because when $n$ is odd, $3n + 1$ is always even.
If $P(\cdot)$ is the parity of a number, that is $P(2n) = 0$ and $P(2n + 1) = 1$, then we can define the Collatz parity sequence for a number $n$ as $p_i = P(a_i)$, where $a_0 = n$, and $a_{i+1} = f(a_i)$.
Which operation is performed, $(3n + 1)/2$ or $n/2$, depends on the parity. The parity sequence is the same as the sequence of operations.
Using this form for $f(n)$, it can be shown that the parity sequences for two numbers $m$ and $n$ will agree in the first $k$ terms if and only if $m$ and $n$ are equivalent modulo $2^k$. This implies that every number is uniquely identified by its parity sequence, and moreover that if there are multiple hailstone cycles, then their corresponding parity cycles must be different.
Applying the $f$ function $k$ times to the number $n = 2^k a + b$ will give the result $3^c a + d$, where $d$ is the result of applying the $f$ function $k$ times to $b$, and $c$ is how many increases were encountered during that sequence. When $b$ is $2^k - 1$ then there will be $k$ rises and the result will be $3^k a + 3^k - 1$. The power of 3 multiplying $a$ is independent of the value of $a$; it depends only on the behavior of $b$. This allows one to predict that certain forms of numbers will always lead to a smaller number after a certain number of iterations: for example, $4a + 1$ becomes $3a + 1$ after two applications of $f$, and $16a + 3$ becomes $9a + 2$ after four applications of $f$. Whether those smaller numbers continue to 1, however, depends on the value of $a$.
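The statements in this section lend themselves to a quick numerical check. The following sketch assumes the shortcut form of the map; the names `shortcut`, `iterate` and `parity_prefix` are chosen for this illustration only.

```python
def shortcut(n: int) -> int:
    """Shortcut Collatz map: n/2 for even n, (3n + 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def iterate(n: int, k: int) -> int:
    """Apply the shortcut map k times to n."""
    for _ in range(k):
        n = shortcut(n)
    return n


def parity_prefix(n: int, k: int) -> tuple[int, ...]:
    """First k terms of the parity sequence of n."""
    prefix = []
    for _ in range(k):
        prefix.append(n % 2)
        n = shortcut(n)
    return tuple(prefix)
```

For instance, `parity_prefix(3, 4)` and `parity_prefix(19, 4)` agree because 3 and 19 are congruent modulo 16, and `iterate(16 * a + 3, 4)` equals `9 * a + 2` for every non-negative `a`.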

As a tag system

For the Collatz function in the form
$$f(n) = \begin{cases} n/2 & \text{if } n \equiv 0 \pmod{2}, \\ (3n + 1)/2 & \text{if } n \equiv 1 \pmod{2}, \end{cases}$$
hailstone sequences can be computed by the extremely simple 2-tag system with production rules
$$a \to bc, \qquad b \to a, \qquad c \to aaa.$$
In this system, the positive integer $n$ is represented by a string of $n$ copies of $a$, and iteration of the tag operation halts on any word of length less than 2.
The Collatz conjecture equivalently states that this tag system, with an arbitrary finite string of $a$ as the initial word, eventually halts.
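A 2-tag system reads the first symbol of the current word, appends that symbol's production to the end, and then deletes the first two symbols. The sketch below simulates the system with the rules a → bc, b → a, c → aaa; the names `RULES` and `tag_run` are hypothetical, and the function records the length of every intermediate word that consists only of the symbol a.

```python
RULES = {"a": "bc", "b": "a", "c": "aaa"}


def tag_run(n: int) -> list[int]:
    """Run the 2-tag system from n copies of 'a'.

    Returns the lengths of the all-'a' words encountered, which are
    the terms of the shortcut hailstone sequence starting at n.
    """
    word = "a" * n
    seen = [n]
    while len(word) >= 2:
        # Append the production of the first symbol, drop two symbols.
        word = word[2:] + RULES[word[0]]
        if set(word) == {"a"}:
            seen.append(len(word))
    return seen
```

Starting from three copies of a, `tag_run(3)` returns `[3, 5, 8, 4, 2, 1]`, matching the shortcut sequence for 3.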

Extensions to larger domains

Iterating on all integers

An extension to the Collatz conjecture is to include all integers, not just positive integers. Leaving aside the cycle 0 → 0, which cannot be entered from outside, there are a total of four known cycles, which all nonzero integers seem to eventually fall into under iteration of $f$. These cycles are listed here, starting with the well-known cycle for positive $n$.
Each cycle is listed with its member of least absolute value first.
Cycle | Odd-value cycle length | Full cycle length
1 → 4 → 2 → 1 ... | 1 | 3
−1 → −2 → −1 ... | 1 | 2
−5 → −14 → −7 → −20 → −10 → −5 ... | 2 | 5
−17 → −50 → −25 → −74 → −37 → −110 → −55 → −164 → −82 → −41 → −122 → −61 → −182 → −91 → −272 → −136 → −68 → −34 → −17 ... | 7 | 18

The generalized Collatz conjecture is the assertion that every integer, under iteration by $f$, eventually falls into one of the four cycles above or the cycle 0 → 0. The 0 → 0 cycle is often regarded as "trivial", as it is included only for the sake of completeness.
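The known cycles can be rediscovered by iterating the map over a range of integers and recording where each trajectory ends up. This is a sketch under the assumption that every trajectory in the chosen range eventually repeats (otherwise the loop would not terminate); the names `find_cycle` and `cycles_reached` are hypothetical.

```python
def f(n: int) -> int:
    """Collatz map extended to all integers."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def find_cycle(n: int) -> tuple[int, ...]:
    """Iterate from n until a value repeats and return the cycle entered,
    rotated so that its member of least absolute value comes first."""
    position = {}
    path = []
    while n not in position:
        position[n] = len(path)
        path.append(n)
        n = f(n)
    cycle = path[position[n]:]
    start = min(range(len(cycle)), key=lambda i: abs(cycle[i]))
    return tuple(cycle[start:] + cycle[:start])


def cycles_reached(lo: int, hi: int) -> set[tuple[int, ...]]:
    """Set of distinct cycles reached from the integers lo..hi."""
    return {find_cycle(n) for n in range(lo, hi + 1)}
```

For the range −100 to 100, `cycles_reached` returns exactly five cycles: the 0 → 0 cycle and the four cycles in the table.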

Iterating on rationals with odd denominators

The Collatz map can be extended to rational numbers which have odd denominators when written in lowest terms.
The number is taken to be 'odd' or 'even' according to whether its numerator is odd or even. Then the formula for the map is exactly the same as when the domain is the integers: an 'even' such rational is divided by 2; an 'odd' such rational is multiplied by 3 and then 1 is added. A closely related fact is that the Collatz map extends to the ring of 2-adic integers, which contains the ring of rationals with odd denominators as a subring.
When using the "shortcut" definition of the Collatz map, it is known that any periodic parity sequence is generated by exactly one rational. Conversely, it is conjectured that every rational with an odd denominator has an eventually cyclic parity sequence.
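If a parity cycle has length $n$ and includes odd numbers exactly $m$ times at indices $k_0 < \cdots < k_{m-1}$, then the unique rational which generates immediately and periodically this parity cycle is
$$\frac{3^{m-1} 2^{k_0} + \cdots + 3^{0} 2^{k_{m-1}}}{2^n - 3^m}.$$
For example, the parity cycle (1 0 1 1 0 0 1) has length 7 and four odd terms at indices 0, 2, 3, and 6. It is repeatedly generated by the fraction
$$\frac{3^3 2^0 + 3^2 2^2 + 3^1 2^3 + 3^0 2^6}{2^7 - 3^4} = \frac{151}{47},$$
as the latter leads to the rational cycle
$$\frac{151}{47} \to \frac{250}{47} \to \frac{125}{47} \to \frac{211}{47} \to \frac{340}{47} \to \frac{170}{47} \to \frac{85}{47} \to \frac{151}{47}.$$
Any cyclic permutation of (1 0 1 1 0 0 1) is associated to one of the above fractions. For instance, the cycle (0 1 1 0 0 1 1) is produced by the fraction
$$\frac{3^3 2^1 + 3^2 2^2 + 3^1 2^5 + 3^0 2^6}{2^7 - 3^4} = \frac{250}{47}.$$
For a one-to-one correspondence, a parity cycle should be irreducible, i.e., not partitionable into identical sub-cycles. As an illustration of this, the parity cycle (1 1 0 0 1 1 0 0) and its sub-cycle (1 1 0 0) are associated to the same fraction 5/7 when reduced to lowest terms.
In this context, assuming the validity of the Collatz conjecture implies that (1 0) and (0 1) are the only parity cycles generated by positive whole numbers (1 and 2, respectively).
If the odd denominator $d$ of a rational is not a multiple of 3, then all the iterates have the same denominator and the sequence of numerators can be obtained by applying the "$3n + d$" generalization of the Collatz function
$$T_d(x) = \begin{cases} x/2 & \text{if } x \equiv 0 \pmod{2}, \\ (3x + d)/2 & \text{if } x \equiv 1 \pmod{2}. \end{cases}$$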
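The correspondence between parity cycles and rationals can be checked with exact arithmetic. The sketch below uses Python's `fractions.Fraction` and the shortcut map; the function names `rational_for_parity_cycle`, `shortcut_rational` and `orbit` are hypothetical.

```python
from fractions import Fraction


def rational_for_parity_cycle(parity: list[int]) -> Fraction:
    """Rational generating the given parity cycle (1 = odd, 0 = even)."""
    n = len(parity)
    odd_indices = [i for i, p in enumerate(parity) if p == 1]
    m = len(odd_indices)
    numerator = sum(3 ** (m - 1 - j) * 2 ** k for j, k in enumerate(odd_indices))
    return Fraction(numerator, 2 ** n - 3 ** m)


def shortcut_rational(x: Fraction) -> Fraction:
    """Shortcut Collatz map on rationals with odd denominator."""
    if x.denominator % 2 == 0:
        raise ValueError("denominator must be odd")
    return x / 2 if x.numerator % 2 == 0 else (3 * x + 1) / 2


def orbit(x: Fraction, steps: int) -> list[Fraction]:
    """The first `steps` terms of the trajectory starting at x."""
    terms = []
    for _ in range(steps):
        terms.append(x)
        x = shortcut_rational(x)
    return terms
```

With these helpers, `rational_for_parity_cycle([1, 0, 1, 1, 0, 0, 1])` gives 151/47, and seven applications of `shortcut_rational` return to that value.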

2-adic extension

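The function
$$T(x) = \begin{cases} x/2 & \text{if } x \equiv 0 \pmod{2}, \\ (3x + 1)/2 & \text{if } x \equiv 1 \pmod{2} \end{cases}$$
is well-defined on the ring $\mathbb{Z}_2$ of 2-adic integers, where it is continuous and measure-preserving with respect to the 2-adic measure. Moreover, its dynamics is known to be ergodic.
Define the parity vector function $Q$ acting on $\mathbb{Z}_2$ as
$$Q(x) = \sum_{k=0}^{\infty} \left(T^k(x) \bmod 2\right) 2^k.$$
The function $Q$ is a 2-adic isometry. Consequently, every infinite parity sequence occurs for exactly one 2-adic integer, so that almost all trajectories are acyclic in $\mathbb{Z}_2$.
An equivalent formulation of the Collatz conjecture is that
$$Q(\mathbb{Z}^{+}) \subset \tfrac{1}{3}\mathbb{Z}.$$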

Iterating on real or complex numbers

The Collatz map can be viewed as the restriction to the integers of the smooth real and complex map
$$f(z) = \tfrac{1}{2} z \cos^2\left(\tfrac{\pi}{2} z\right) + (3z + 1) \sin^2\left(\tfrac{\pi}{2} z\right).$$
If the standard Collatz map defined above is optimized by replacing the relation $3n + 1$ with the common substitute "shortcut" relation $(3n + 1)/2$, it can be viewed as the restriction to the integers of the smooth real and complex map
$$f(z) = \tfrac{1}{2} z \cos^2\left(\tfrac{\pi}{2} z\right) + \tfrac{1}{2}(3z + 1) \sin^2\left(\tfrac{\pi}{2} z\right)$$
in a neighbourhood of the real line.
The point of view of iteration on the real line was investigated by Chamberland (1996).
He showed that the conjecture does not hold for real numbers since there are infinitely many fixed points, as well as orbits escaping monotonically to infinity. He also showed that there is at least one other attracting cycle: 1.1925 → 2.1386.
On the complex plane, it was investigated by Letherman, Schleicher, and Wood (1999).
Most points of the plane diverge to infinity. The boundary between diverging and non-diverging regions shows a fractal pattern called the "Collatz fractal".

Optimizations

Time–space tradeoff

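The section As a parity sequence above gives a way to speed up simulation of the sequence. To jump ahead $k$ steps on each iteration (using the $f$ function from that section), break up the current number into two parts, $b$ (the $k$ least significant bits, interpreted as an integer), and $a$ (the rest of the bits as an integer). The result of jumping ahead $k$ steps can be found as:
$$f^k(2^k a + b) = 3^{c(b, k)} a + d(b, k).$$
The $c$ and $d$ arrays are precalculated for all possible $k$-bit numbers $b$, where $d(b, k)$ is the result of applying the $f$ function $k$ times to $b$, and $c(b, k)$ is the number of odd numbers encountered on the way. For example, if $k = 5$, one can jump ahead 5 steps on each iteration by separating out the 5 least significant bits of a number and using:
c(0...31, 5) = { 0, 3, 2, 2, 2, 2, 2, 4, 1, 4, 1, 3, 2, 2, 3, 4, 1, 2, 3, 3, 1, 1, 3, 3, 2, 3, 2, 4, 3, 3, 4, 5 },
d(0...31, 5) = { 0, 2, 1, 1, 2, 2, 2, 20, 1, 26, 1, 10, 4, 4, 13, 40, 2, 5, 17, 17, 2, 2, 20, 20, 8, 22, 8, 71, 26, 26, 80, 242 }.
This requires $2^k$ precomputation and storage to speed up the resulting calculation by a factor of $k$, a space–time tradeoff.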
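The lookup-table idea can be sketched as follows. The table builder and the `jump` helper are illustrative names, not taken from any particular implementation, and the shortcut form of the map is assumed throughout.

```python
def shortcut(n: int) -> int:
    """Shortcut Collatz map: n/2 for even n, (3n + 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def build_tables(k: int) -> tuple[list[int], list[int]]:
    """Precompute c[b] (odd terms met) and d[b] (result) for all k-bit b."""
    c, d = [], []
    for b in range(2 ** k):
        odd_count, value = 0, b
        for _ in range(k):
            odd_count += value % 2
            value = shortcut(value)
        c.append(odd_count)
        d.append(value)
    return c, d


def jump(n: int, k: int, c: list[int], d: list[int]) -> int:
    """Advance n by k shortcut steps using the precomputed tables."""
    a, b = n >> k, n & ((1 << k) - 1)   # split off the k low bits
    return 3 ** c[b] * a + d[b]
```

A practical implementation would typically store the powers of 3 directly instead of recomputing `3 ** c[b]` on every jump; that refinement is omitted here to keep the sketch short.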

Modular restrictions

For the special purpose of searching for a counterexample to the Collatz conjecture, this precomputation leads to an even more important acceleration, used by Tomás Oliveira e Silva in his computational confirmations of the Collatz conjecture up to large values of $n$. If, for some given $b$ and $k$, the inequality
$$f^k(2^k a + b) = 3^{c(b)} a + d(b) < 2^k a + b$$
holds for all $a$, then the first counterexample, if it exists, cannot be $b$ modulo $2^k$. For instance, the first counterexample must be odd because $f(2n) = n$, smaller than $2n$; and it must be 3 mod 4 because $f^2(4n + 1) = 3n + 1$, smaller than $4n + 1$. For each starting value $a$ which is not a counterexample to the Collatz conjecture, there is a $k$ for which such an inequality holds, so checking the Collatz conjecture for one starting value is as good as checking an entire congruence class. As $k$ increases, the search only needs to check those residues $b$ that are not eliminated by lower values of $k$. Only an exponentially small fraction of the residues survive. For example, the only surviving residues mod 32 are 7, 15, 27, and 31.
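The sieve can be sketched by testing the inequality for every residue and every level up to $k$. The sketch below makes one simplifying assumption that is not spelled out in the text: the inequality is only required for $a \geq 1$, since $a = 0$ corresponds to starting values below $2^k$, which are taken to be already verified. Under that assumption it is enough to check that $3^c < 2^j$ and that the inequality holds at $a = 1$, because the gap between the two sides then grows with $a$. The names `c_and_d` and `surviving_residues` are hypothetical.

```python
def shortcut(n: int) -> int:
    """Shortcut Collatz map: n/2 for even n, (3n + 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def c_and_d(b: int, k: int) -> tuple[int, int]:
    """Number of odd terms met and the result after k shortcut steps from b."""
    c, value = 0, b
    for _ in range(k):
        c += value % 2
        value = shortcut(value)
    return c, value


def surviving_residues(k: int) -> list[int]:
    """Residues mod 2**k not eliminated at any level j <= k."""
    survivors = []
    for b in range(2 ** k):
        eliminated = False
        for j in range(1, k + 1):
            r = b % 2 ** j
            c, d = c_and_d(r, j)
            # 3^c * a + d < 2^j * a + r for all a >= 1 (assumption: a = 0 is excluded)
            if 3 ** c < 2 ** j and 3 ** c + d < 2 ** j + r:
                eliminated = True
                break
        if not eliminated:
            survivors.append(b)
    return survivors
```

For $k = 5$ this returns `[7, 15, 27, 31]`, in agreement with the residues quoted for modulus 32.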

Syracuse function

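If $k$ is an odd integer, then $3k + 1$ is even, so $3k + 1 = 2^a k'$ with $k'$ odd and $a \geq 1$. The Syracuse function is the function $f$ from the set $I$ of positive odd integers into itself, for which $f(k) = k'$.
Some properties of the Syracuse function are:
For all $k \in I$, $f(4k + 1) = f(k)$, because $3(4k + 1) + 1 = 12k + 4 = 4(3k + 1)$.
More generally, for all $p \geq 1$ and odd $h$, $f^{p-1}(2^p h - 1) = 2 \times 3^{p-1} h - 1$.
For all odd $h$, $f(2h - 1) \leq (3h - 1)/2$.
The Collatz conjecture is equivalent to the statement that, for all $k$ in $I$, there exists an integer $n \geq 1$ such that $f^n(k) = 1$.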
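The Syracuse function is straightforward to sketch in code: compute $3k + 1$ and strip every factor of 2. The names `syracuse` and `syracuse_steps` are chosen for this illustration, and the second function loops forever on a starting value that never reaches 1.

```python
def syracuse(k: int) -> int:
    """Map a positive odd k to the odd part of 3k + 1."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    n = 3 * k + 1
    while n % 2 == 0:
        n //= 2
    return n


def syracuse_steps(k: int) -> int:
    """Number of applications of the Syracuse function needed to reach 1
    (counting at least one application, so the value for k = 1 is 1)."""
    steps = 1
    k = syracuse(k)
    while k != 1:
        k = syracuse(k)
        steps += 1
    return steps
```

For example, `syracuse(7)` is 11, and `syracuse(4 * k + 1) == syracuse(k)` holds for every positive odd `k`.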

Undecidable generalizations

In 1972, John Horton Conway proved that a natural generalization of the Collatz problem is algorithmically undecidable.
Specifically, he considered functions of the form
$$g(n) = a_i n + b_i \quad \text{when } n \equiv i \pmod{P},$$
where $a_0, b_0, \ldots, a_{P-1}, b_{P-1}$ are rational numbers which are so chosen that $g(n)$ is always an integer.
The standard Collatz function is given by $P = 2$, $a_0 = 1/2$, $b_0 = 0$, $a_1 = 3$, $b_1 = 1$. Conway proved that the problem:
Given $g$ and $n$, does the sequence of iterates $g^k(n)$ reach 1?
is undecidable, by representing the halting problem in this way.
Closer to the Collatz problem is the following universally quantified problem:
Given $g$, does the sequence of iterates $g^k(n)$ reach 1, for all $n > 0$?
Modifying the condition in this way can make a problem either harder or easier to solve. Kurtz and Simon proved that the above problem is, in fact, undecidable and even higher in the arithmetical hierarchy, specifically $\Pi_2^0$-complete. This hardness result holds even if one restricts the class of functions $g$ by fixing the modulus $P$ to 6480.

In popular culture

In the movie Incendies, a graduate student in pure mathematics explains the Collatz conjecture to a group of undergraduates. She puts her studies on hold for a time to address some unresolved questions about her family's past. Late in the movie, the Collatz conjecture turns out to have foreshadowed a disturbing and difficult discovery that she makes about her family.
