Dixon's factorization method


In number theory, Dixon's factorization method is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike other factor base methods, its run-time bound comes with a rigorous proof that does not rely on conjectures about the smoothness properties of the values taken by a polynomial.
The algorithm was designed by John D. Dixon, a mathematician at Carleton University, and was published in 1981.

Basic idea

Dixon's method is based on finding a congruence of squares modulo the integer N to be factored. Fermat's factorization method finds such a congruence by selecting random or pseudo-random x values and hoping that the integer $x^2 \bmod N$ is a perfect square (in the integers):
$$x^2 \equiv y^2 \pmod{N}, \qquad x \not\equiv \pm y \pmod{N}.$$
For example, if $N = 84923$, then $505^2 \bmod 84923$ is 256, the square of 16. So $(505 - 16)(505 + 16) \equiv 0 \pmod{84923}$. Computing the greatest common divisor of $505 - 16$ and N using Euclid's algorithm gives 163, which is a factor of N.
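The example with x = 505 and N = 84923 can be checked in a few lines. This is a minimal Python sketch using only the standard library; the variable names are illustrative.

```python
from math import gcd, isqrt

N = 84923
x = 505

r = x * x % N        # residue of x^2 modulo N
y = isqrt(r)         # integer square root of the residue
assert y * y == r    # the residue is a perfect square, so x^2 ≡ y^2 (mod N)

factor = gcd(x - y, N)      # Euclid's algorithm on x - y and N
print(factor, N // factor)  # 163 521
```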
In practice, selecting random x values will take an impractically long time to find a congruence of squares, since there are only about $\sqrt{N}$ squares less than N.
Dixon's method replaces the condition "is the square of an integer" with the much weaker one "has only small prime factors"; for example, there are 292 squares smaller than 84923; 662 numbers smaller than 84923 whose prime factors are only 2, 3, 5 or 7; and 4767 whose prime factors are all less than 30. A number whose prime factors are all at most B is called B-smooth.
If there are many numbers $a_1, \ldots, a_n$ whose squares can be factorized as $a_i^2 \bmod N = \prod_{j=1}^{m} b_j^{e_{ij}}$ for a fixed set $b_1, \ldots, b_m$ of small primes, linear algebra modulo 2 on the matrix $(e_{ij})$ will give a subset of the $a_i$ whose squares combine to a product of small primes to an even power. That is, it gives a subset of the $a_i$ whose squares multiply to the square of a (hopefully different) number mod N.
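The linear-algebra step can be sketched as Gaussian elimination over GF(2), with each exponent vector reduced to a bitmask of its odd entries. The function name `find_even_subset` and the bitmask representation are choices made for this illustration. Practical implementations use sparse methods instead.

```python
def find_even_subset(exponent_rows):
    """Return indices of a non-empty subset of rows whose column sums are all
    even, or None if the rows are linearly independent modulo 2."""
    basis = {}  # pivot bit -> (reduced parity vector, bitmask of rows combined)
    for i, row in enumerate(exponent_rows):
        # Keep only the parity of each exponent, packed into an integer.
        vec = 0
        for j, e in enumerate(row):
            if e % 2:
                vec |= 1 << j
        combo = 1 << i  # records which original rows XOR to the current vec
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec
            combo ^= bcombo
        else:
            # vec was reduced to zero: the rows recorded in combo sum to an
            # all-even vector.
            return [k for k in range(len(exponent_rows)) if combo >> k & 1]
    return None


# Exponents of 2, 3, 5, 7 in 8400 and 33600 (both vectors have the same parity).
print(find_even_subset([(4, 1, 2, 1), (6, 1, 2, 1)]))  # [0, 1]
```

With m primes in the factor base, any m + 1 rows are guaranteed to contain such a subset, because the parity vectors live in an m-dimensional space over GF(2).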

Method

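Suppose the composite number N is being factored. A bound B is chosen, and the factor base P is identified: the set of all primes less than or equal to B. Next, positive integers z are sought such that $z^2 \bmod N$ is B-smooth. Each such value can therefore be written, for suitable exponents $a_i$, as
$$z^2 \bmod N = \prod_{p_i \in P} p_i^{a_i}.$$
When enough of these relations have been generated (a few more than the size of P is generally sufficient), the methods of linear algebra, such as Gaussian elimination, can be used to multiply together some of these relations in such a way that the exponents of the primes on the right-hand side are all even:
$$z_1^2 z_2^2 \cdots z_k^2 \equiv \prod_{p_i \in P} p_i^{a_{i,1} + a_{i,2} + \cdots + a_{i,k}} \pmod{N}, \qquad a_{i,1} + a_{i,2} + \cdots + a_{i,k} \equiv 0 \pmod{2}.$$
This yields a congruence of squares of the form $x^2 \equiv y^2 \pmod{N}$, which can be turned into a factorization of N, $N = \gcd(x + y, N) \times (N / \gcd(x + y, N))$. This factorization might turn out to be trivial (that is, $N = N \times 1$), which can only happen if $x \equiv \pm y \pmod{N}$, in which case another try has to be made with a different combination of relations. If a nontrivial pair of factors of N is reached, the algorithm terminates.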

Pseudocode

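input: positive integer $N$
output: non-trivial factor of $N$

Choose bound $B$
Let $P := \{p_1, p_2, \ldots, p_k\}$ be all primes $\le B$
repeat
    for $i = 1$ to $k + 1$ do
        Choose $0 < z_i < N$ such that $z_i^2 \bmod N$ is $B$-smooth
        Let $a_i := (a_{i1}, a_{i2}, \ldots, a_{ik})$ such that $z_i^2 \bmod N = \prod_{p_j \in P} p_j^{a_{ij}}$
    end for
    Find non-empty $T \subseteq \{1, 2, \ldots, k + 1\}$ such that $\sum_{i \in T} a_i \equiv \vec{0} \pmod{2}$
    Let $x := \left(\prod_{i \in T} z_i\right) \bmod N$
        $y := \left(\prod_{p_j \in P} p_j^{\left(\sum_{i \in T} a_{ij}\right)/2}\right) \bmod N$
while $x \equiv \pm y \pmod{N}$
return $\gcd(x + y, N)$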
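The following Python code is a runnable sketch of the procedure, not a reference implementation. It makes several assumptions of its own: candidates z are drawn at random with a fixed seed, smoothness is tested by trial division, and the dependency search is done incrementally as each relation arrives rather than after collecting k + 1 relations. The names `primes_up_to`, `smooth_exponents`, `dixon`, `seed` and `max_tries` are illustrative.

```python
import random
from math import gcd, isqrt


def primes_up_to(bound):
    """All primes <= bound, by trial division (adequate for small bounds)."""
    return [p for p in range(2, bound + 1)
            if all(p % q for q in range(2, isqrt(p) + 1))]


def smooth_exponents(m, primes):
    """Exponent vector of m > 0 over primes, or None if m is not smooth."""
    exps = []
    for p in primes:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    return exps if m == 1 else None


def dixon(n, bound, seed=0, max_tries=100_000):
    """Try to return a non-trivial factor of an odd composite n, else None."""
    rng = random.Random(seed)
    primes = primes_up_to(bound)
    relations = []  # list of (z, exponent vector of z^2 mod n)
    basis = {}      # pivot bit -> (parity vector, bitmask of relations used)
    for _ in range(max_tries):
        z = rng.randrange(2, n)
        g = gcd(z, n)
        if g > 1:
            return g  # lucky hit: z already shares a factor with n
        exps = smooth_exponents(z * z % n, primes)
        if exps is None:
            continue
        relations.append((z, exps))
        vec = sum(1 << j for j, e in enumerate(exps) if e % 2)
        combo = 1 << (len(relations) - 1)
        while vec:  # reduce the new parity vector against the basis
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            vec, combo = vec ^ basis[pivot][0], combo ^ basis[pivot][1]
        else:
            # Dependency found: relations in combo multiply to a square.
            x, total = 1, [0] * len(primes)
            for i, (zi, ei) in enumerate(relations):
                if combo >> i & 1:
                    x = x * zi % n
                    total = [t + e for t, e in zip(total, ei)]
            y = 1
            for p, t in zip(primes, total):
                y = y * pow(p, t // 2, n) % n
            if x != y and x != n - y:
                return gcd(x + y, n)
            relations.pop()  # trivial congruence: discard and keep searching
    return None


print(dixon(84923, 7))
```

The call prints one of the two prime factors of 84923. Which one it finds, and whether it comes from a congruence of squares or from a lucky common factor, depends on the seed.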

Example

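This example will try to factor N = 84923 using bound B = 7. The factor base is then P = {2, 3, 5, 7}. A search can be made for integers between $\lceil\sqrt{N}\rceil = 292$ and N whose squares mod N are B-smooth. Suppose that two of the numbers found are 513 and 537:
$$513^2 \bmod 84923 = 8400 = 2^4 \cdot 3 \cdot 5^2 \cdot 7$$
$$537^2 \bmod 84923 = 33600 = 2^6 \cdot 3 \cdot 5^2 \cdot 7$$
So
$$(513 \cdot 537)^2 \bmod 84923 = 2^{10} \cdot 3^2 \cdot 5^4 \cdot 7^2 \bmod 84923.$$
Then, since $513 \cdot 537 = 275481 = 3 \cdot 84923 + 20712$,
$$(513 \cdot 537)^2 \bmod 84923 = 275481^2 \bmod 84923 = 20712^2 \bmod 84923.$$
That is,
$$20712^2 \bmod 84923 = (2^5 \cdot 3 \cdot 5^2 \cdot 7)^2 \bmod 84923 = 16800^2 \bmod 84923.$$
The resulting factorization is 84923 = gcd(20712 − 16800, 84923) × gcd(20712 + 16800, 84923) = 163 × 521.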
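The arithmetic of this example can be replayed directly. This short Python sketch uses only the numbers from the example (N = 84923, the values 513 and 537, and the factor base 2, 3, 5, 7); the variable names are illustrative.

```python
from math import gcd

N = 84923

r1 = 513**2 % N   # 8400  = 2^4 * 3 * 5^2 * 7
r2 = 537**2 % N   # 33600 = 2^6 * 3 * 5^2 * 7

# Multiply the two relations: exponents (4,1,2,1) + (6,1,2,1) = (10,2,4,2),
# all even, so the right-hand side is the square of 2^5 * 3 * 5^2 * 7.
x = 513 * 537 % N          # 20712
y = 2**5 * 3 * 5**2 * 7    # 16800

assert (x * x - y * y) % N == 0      # congruence of squares
print(gcd(x - y, N), gcd(x + y, N))  # 163 521
```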

Optimizations

The quadratic sieve is an optimization of Dixon's method. It selects values of x close to the square root of N such that $x^2$ modulo N is small, thereby greatly increasing the chance of obtaining a smooth number.
Other ways to optimize Dixon's method include using a better algorithm to solve the matrix equation, taking advantage of the sparsity of the matrix: a number z cannot have more than $\log_2 z$ prime factors, so each row of the matrix is almost all zeros. In practice, the block Lanczos algorithm is often used. Also, the size of the factor base must be chosen carefully: if it is too small, it will be difficult to find numbers that factorize completely over it, and if it is too large, more relations will have to be collected.
A more sophisticated analysis, using the approximation that a number has all its prime factors less than $N^{1/a}$ with probability about $a^{-a}$ (an approximation to the Dickman-de Bruijn function), indicates that choosing too small a factor base is much worse than too large, and that the ideal factor base size is some power of $\exp\left(\sqrt{\log N \log\log N}\right)$.
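The optimal complexity of Dixon's method is
$$O\left(\exp\left(2\sqrt{2}\,\sqrt{\log n \log\log n}\right)\right)$$
in big-O notation, or
$$L_n\left[1/2,\ 2\sqrt{2}\right]$$
in L-notation.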
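To get a feel for how this bound grows, the L-notation expression $L_n[\alpha, c] = \exp\left((c + o(1)) (\ln n)^{\alpha} (\ln\ln n)^{1-\alpha}\right)$ can be evaluated numerically with $\alpha = 1/2$ and $c = 2\sqrt{2}$. The sketch below drops the $o(1)$ term, so it gives only a rough growth estimate, not an operation count. The function names are illustrative.

```python
from math import exp, log, sqrt


def l_notation(n, alpha, c):
    """exp(c * (ln n)^alpha * (ln ln n)^(1 - alpha)), with the o(1) term dropped."""
    return exp(c * log(n) ** alpha * log(log(n)) ** (1 - alpha))


def dixon_cost(n):
    """Rough growth estimate L_n[1/2, 2*sqrt(2)] for Dixon's method."""
    return l_notation(n, 0.5, 2 * sqrt(2))


for digits in (20, 40, 60):
    print(digits, f"{dixon_cost(10**digits):.3e}")
```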