Normal number
In mathematics, a real number is said to be simply normal in an integer base b if its infinite sequence of digits is distributed uniformly in the sense that each of the b digit values has the same natural density 1/b. A number is said to be normal in base b if, for every positive integer n, all possible strings n digits long have density b−n.
Intuitively, a number being simply normal means that no digit occurs more frequently than any other. If a number is normal, no finite combination of digits of a given length occurs more frequently than any other combination of the same length. A normal number can be thought of as an infinite sequence of coin flips or rolls of a die. Even though there will be sequences such as 10, 100, or more consecutive tails or fives or even 10, 100, or more repetitions of a sequence such as tail-head or 6-1, there will also be equally many of any other sequence of equal length. No digit or sequence is "favored".
A number is said to be absolutely normal if it is normal in all integer bases greater than or equal to 2.
While a general proof can be given that almost all real numbers are normal, this proof is not constructive, and only a few specific numbers have been shown to be normal. For example, Chaitin's constant is normal. It is widely believed that the numbers square root of 2|, pi|, and e are normal, but a proof remains elusive.
Definitions
Let Σ be a finite alphabet of b-digits, and Σ∞ the set of all sequences that may be drawn from that alphabet. Let S ∈ Σ∞ be such a sequence. For each a in Σ let NS denote the number of times the digit a appears in the first n digits of the sequence S. We say that S is simply normal if the limitfor each a. Now let w be any finite string in Σ∗ and let NS be the number of times the string w appears as a substring in the first n digits of the sequence S. S is normal if, for all finite strings w ∈ Σ∗,
where | w | denotes the length of the string w.
In other words, S is normal if all strings of equal length occur with equal asymptotic frequency. For example, in a normal binary sequence, 0 and 1 each occur with frequency 1⁄2; 00, 01, 10, and 11 each occur with frequency 1⁄4; 000, 001, 010, 011, 100, 101, 110, and 111 each occur with frequency 1⁄8, etc. Roughly speaking, the probability of finding the string w in any given position in S is precisely that expected if the sequence had been produced at random.
Suppose now that b is an integer greater than 1 and x is a real number. Consider the infinite digit sequence expansion Sx, b of x in the base b positional number system. We say that x is simply normal in base b if the sequence Sx, b is simply normal and that x is normal in base b if the sequence Sx, b is normal. The number x is called a normal number if it is normal in base b for every integer b greater than 1.
A given infinite sequence is either normal or not normal, whereas a real number, having a different base-b expansion for each integer b ≥ 2, may be normal in one base but not in another. For bases r and s with log r / log s rational every number normal in base r is normal in base s. For bases r and s with log r / log s irrational, there are uncountably many numbers normal in each base but not the other.
A disjunctive sequence is a sequence in which every finite string appears. A normal sequence is disjunctive, but a disjunctive sequence need not be normal. A rich number in base b is one whose expansion in base b is disjunctive: one that is disjunctive to every base is called absolutely disjunctive or is said to be a lexicon. A number normal in base b is rich in base b, but not necessarily conversely. The real number x is rich in base b if and only if the set is dense in the unit interval.
We defined a number to be simply normal in base b if each individual digit appears with frequency 1/b. For a given base b, a number can be simply normal, b-dense, normal, or none of these. A number is absolutely non-normal or absolutely abnormal if it is not simply normal in any base.
Properties and examples
For any base, while rational numbers can be simply normal in a particular base, every normal number is irrational.The concept of a normal number was introduced by. Using the Borel–Cantelli lemma, he proved that almost all real numbers are normal, establishing the existence of normal numbers. showed that it is possible to specify a particular such number. proved that there is a computable absolutely normal number.
Although this construction does not directly give the digits of the numbers constructed, it shows that it is possible in principle to enumerate all the digits of a particular normal number.
The set of non-normal numbers, despite being "large" in the sense of being uncountable, is also a null set. Also, the non-normal numbers are dense in the reals: the set of non-normal numbers between two distinct real numbers is non-empty since it contains [|every rational number]. For instance, there are uncountably many numbers whose decimal expansions do not contain the digit 1, and none of these numbers is normal.
Champernowne's constant
obtained by concatenating the decimal representations of the natural numbers in order, is normal in base 10. Likewise, the different variants of Champernowne's constant are normal in their respective bases, but they have not been proven to be normal in other bases.
The Copeland–Erdős constant
obtained by concatenating the prime numbers in base 10, is normal in base 10, as proved by. More generally, the latter authors proved that the real number represented in base b by the concatenation
where f is the nth prime expressed in base b, is normal in base b. proved that the number represented by the same expression, with f = n2,
obtained by concatenating the square numbers in base 10, is normal in base 10. proved that the number represented by the same expression, with f being any non-constant polynomial whose values on the positive integers are positive integers, expressed in base 10, is normal in base 10.
proved that if f is any non-constant polynomial with real coefficients such that f > 0 for all x > 0, then the real number represented by the concatenation
where is the integer part of f expressed in base b, is normal in base b. The authors also show that the same result holds even more generally when f is any function of the form
where the αs and βs are real numbers with β > β1 > β2 >... > βd ≥ 0, and f > 0 for all x > 0.
show an explicit uncountably infinite class of b-normal numbers by perturbing Stoneham numbers.
It has been an elusive goal to prove the normality of numbers that are not artificially constructed. While square root of 2|, π, ln, and e are strongly conjectured to be normal, it is still not known whether they are normal or not. It has not even been proven that all digits actually occur infinitely many times in the decimal expansions of those constants. It has also been conjectured that every irrational algebraic number is absolutely normal, and no counterexamples are known in any base. However, no irrational algebraic number has been proven to be normal in any base.
Non-normal numbers
No rational number is normal in any base, since the digit sequences of rational numbers are eventually periodic. However, a rational number can be simply normal in a particular base. For example, is simply normal in base 10.has given a simple example of an irrational number that is absolutely abnormal. Let f = 4 and
Then α is a Liouville number and is absolutely abnormal.
Properties
Additional properties of normal numbers include:- Every non-zero real number can be written as the product of two normal numbers. For instance, if a is any non-zero real number, and x is a non-zero real number that is chosen uniformly at random from any finite interval, then almost surely x and a/x are both normal.
- If x is normal in base b and a ≠ 0 is a rational number, then is also normal in base b.
- If is dense and are the base-b expansions of the elements of A, then the number, formed by concatenating the elements of A, is normal in base b. From this it follows that Champernowne's number is normal in base 10 and that the Copeland–Erdős constant is normal in base 10.
- A sequence is normal if and only if every block of equal length appears with equal frequency. This was implicit in the work of and made explicit in the work of.
- A number is normal in base b if and only if it is simply normal in base bk for all. This follows from the previous block characterization of normality: Since the nth block of length k in its base b expansion corresponds to the nth digit in its base bk expansion, a number is simply normal in base bk if and only if blocks of length k appear in its base b expansion with equal frequency.
- A number is normal if and only if it is simply normal in every base. This follows from the previous characterization of base b normality.
- A number is b-normal if and only if there exists a set of positive integers where the number is simply normal in bases bm for all No finite set suffices to show that the number is b-normal.
- All normal sequences are closed under finite variations: adding, removing, or changing a finite number of digits in any normal sequence leaves it normal. Similarly, if a finite number of digits are added to, removed from, or changed in any simply normal sequence, the new sequence is still simply normal.
Connection to finite-state machines
A deeper connection exists with finite-state gamblers and information lossless finite-state compressors.
- A finite-state gambler is a finite-state machine over a finite alphabet, each of whose states is labelled with percentages of money to bet on each digit in. For instance, for an FSG over the binary alphabet, the current state q bets some percentage of the gambler's money on the bit 0, and the remaining fraction of the gambler's money on the bit 1. The money bet on the digit that comes next in the input is multiplied by, and the rest of the money is lost. After the bit is read, the FSG transitions to the next state according to the input it received. A FSG d succeeds on an infinite sequence S if, starting from $1, it makes unbounded money betting on the sequence; i.e., if
- A finite-state compressor is a finite-state machine with output strings labelling its state transitions, including possibly the empty string.. An information lossless finite-state compressor is a finite-state compressor whose input can be uniquely recovered from its output and final state. In other words, for a finite-state compressor C with state set Q, C is information lossless if the function, mapping the input string of C to the output string and final state of C, is 1–1. Compression techniques such as Huffman coding or Shannon–Fano coding can be implemented with ILFSCs. An ILFSC C compresses an infinite sequence S if
Ziv and Lempel showed:
. Since the LZ compression algorithm compresses asymptotically as well as any ILFSC, this means that the LZ compression algorithm can compress any non-normal sequence.
These characterizations of normal sequences can be interpreted to mean that "normal" = "finite-state random"; i.e., the normal sequences are precisely those that appear random to any finite-state machine. Compare this with the algorithmically random sequences, which are those infinite sequences that appear random to any algorithm.
Connection to equidistributed sequences
A number x is normal in base b if and only if the sequence is equidistributed modulo 1, or equivalently, using Weyl's criterion, if and only ifThis connection leads to the terminology that x is normal in base β for any real number β if the sequence is equidistributed modulo 1.