In this post I’d like to discuss how finite fields (and more generally finite rings) can be used to study two fundamental computational problems in number theory: computing square roots modulo a prime and proving primality for Mersenne numbers. The corresponding algorithms (Cipolla’s algorithm and the Lucas-Lehmer test) are often omitted from undergraduate-level number theory courses because they appear, at first glance, to be relatively sophisticated. This is a shame because these algorithms are really marvelous — not to mention useful — and I think they’re more accessible than most people realize. I’ll attempt to describe these two algorithms assuming only a standard first-semester course in elementary number theory and familiarity with complex numbers, without assuming any background in abstract algebra. These algorithms could also be used in a first-semester abstract algebra course to help motivate the practical utility of concepts like rings and fields.
Anecdotal Prelude
In researching this blog post on Wikipedia, I learned a couple of interesting historical tidbits that I’d like to share before getting on with the math.
The French mathematician Édouard Lucas (of Lucas sequence fame) showed in 1876 that the 39-digit Mersenne number $2^{127} - 1 = 170141183460469231731687303715884105727$
is prime. This stood for 75 years as the largest known prime. Lucas also invented (or at least popularized) the Tower of Hanoi puzzle and published the first description of the Dots and Boxes game. He died in quite an unusual way: apparently a waiter at a banquet dropped what he was carrying and a piece of broken plate cut Lucas on the cheek. Lucas developed a severe skin inflammation and died a few days later at age 47.
Derrick Henry Lehmer was a long-time professor of mathematics at U.C. Berkeley. In 1950, Lehmer was one of 31 University of California faculty members fired for refusing to sign a loyalty oath during the McCarthy era. In 1952, the California Supreme Court declared the oath unconstitutional, and Lehmer returned to Berkeley shortly thereafter. Lehmer also built a number of fascinating mechanical sieve machines designed to factor large numbers.
Rings and fields
Let $n \geq 2$ be an integer and let $\mathbf{Z}/n\mathbf{Z}$ denote the set $\{ 0, 1, 2, \ldots, n-1 \}$ together with the operations of addition and multiplication modulo $n$. Such a structure — a set together with two binary operations $+$ and $\cdot$ satisfying various axioms such as the associative, commutative, and distributive laws — is called a (commutative) ring. (Click here for a more detailed and formal discussion.) One of the axioms for a ring is that we can subtract any two elements, giving an inverse to the addition operation. If we can also divide by any nonzero element, giving an inverse to multiplication, then we call the ring a field. In this post we will be particularly interested in studying finite rings and fields.
The prototypical example of a finite field is $\mathbf{F}_p = \mathbf{Z}/p\mathbf{Z}$ for $p$ a prime number. We can also construct fields with $p^2$ elements by analogy with the complex numbers. Choose a non-square $c$ in $\mathbf{Z}/p\mathbf{Z}$ (that is, a quadratic non-residue modulo $p$) and define $\mathbf{F}_{p^2} = (\mathbf{Z}/p\mathbf{Z})[\sqrt{c}]$ to be the set of all formal expressions of the form $a + b\sqrt{c}$ with $a, b \in \mathbf{Z}/p\mathbf{Z}$, together with the addition and multiplication laws
$$(a + b\sqrt{c}) + (a' + b'\sqrt{c}) = (a + a') + (b + b')\sqrt{c}$$
and
$$(a + b\sqrt{c}) \cdot (a' + b'\sqrt{c}) = (aa' + cbb') + (ab' + a'b)\sqrt{c}.$$
One can easily check that $\mathbf{F}_{p^2}$ is a field, with the reciprocal of a nonzero element $a + b\sqrt{c}$ given by $\frac{a - b\sqrt{c}}{a^2 - cb^2}$, where the denominator $a^2 - cb^2$ is nonzero because $c$ is not a square modulo $p$. We can identify $\mathbf{Z}/p\mathbf{Z}$ with the subfield $\{ a + 0\sqrt{c} \}$ of $\mathbf{F}_{p^2}$.
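As a quick concreteness check (this snippet is my own illustration, not code from the post), the following Python code verifies for the example values $p = 7$ and $c = 3$ that the multiplication law and the reciprocal formula above really do make $\mathbf{F}_{49}$ a field; it represents $a + b\sqrt{c}$ by the pair $(a, b)$, an identification the post returns to later.

```python
# Example values (my choice): p = 7 and the non-square c = 3, so this is F_49.
p, c = 7, 3

def mul(x, y):
    """(a + b*sqrt(c)) * (a' + b'*sqrt(c)) as pairs (a, b) mod p."""
    return ((x[0] * y[0] + c * x[1] * y[1]) % p,
            (x[0] * y[1] + x[1] * y[0]) % p)

def reciprocal(x):
    """Reciprocal of a nonzero a + b*sqrt(c): (a - b*sqrt(c)) / (a^2 - c*b^2)."""
    a, b = x
    d = (a * a - c * b * b) % p      # nonzero because c is not a square mod p
    d_inv = pow(d, p - 2, p)         # inverse of d in Z/pZ, by Fermat's Little Theorem
    return ((a * d_inv) % p, (-b * d_inv) % p)

for a in range(p):
    for b in range(p):
        if (a, b) != (0, 0):
            assert mul((a, b), reciprocal((a, b))) == (1, 0)
print("every nonzero element of F_49 has the claimed reciprocal")
```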
Here are two basic properties of fields, both of which are usually known as “Lagrange’s Theorem”:
Lagrange #1: A polynomial of degree $d$ with coefficients in a field $F$ can have at most $d$ roots in $F$. (This is completely false if we replace "field" by "ring": for example, $x^2 - 1$ has the four roots $\pm 1, \pm 3$ in $\mathbf{Z}/8\mathbf{Z}$.) See here for a proof in the special case $F = \mathbf{Z}/p\mathbf{Z}$, which generalizes immediately to any field. We just need the following special case: the only roots of $x^2 - 1$ in $F$ are $\pm 1$. This follows from the fact that $xy = 0$ in $F$ implies that either $x = 0$ or $y = 0$.
Lagrange #2: Let $F$ be a finite field with $q$ elements and let $F^*$ denote the nonzero elements of $F$. Then the order $k$ of any $x \in F^*$ (defined as the smallest positive integer $k$ such that $x^k = 1$) divides $q - 1$. (This can be viewed as an extension of Fermat’s Little Theorem, and the proof is basically the same: multiplication by $x$ permutes the elements of $F^*$, and thus
$$\prod_{y \in F^*} y = \prod_{y \in F^*} (xy) = x^{q-1} \prod_{y \in F^*} y.$$
Hence $x^{q-1} = 1$. Writing $q - 1 = mk + r$ with $m, r$ integers such that $0 \leq r < k$, it follows that $x^r = 1$ and hence $r = 0$.)
The following property of the fields $\mathbf{F}_{p^2} = (\mathbf{Z}/p\mathbf{Z})[\sqrt{c}]$ will be crucial in what follows.

Key Lemma: For $\alpha = a + b\sqrt{c} \in \mathbf{F}_{p^2}$, we have $\alpha^p = a - b\sqrt{c}$.

Proof: Since $c$ is a quadratic non-residue modulo $p$, Euler’s Criterion tells us that $c^{(p-1)/2} = -1$ in $\mathbf{Z}/p\mathbf{Z}$. Thus $(\sqrt{c})^p = c^{(p-1)/2} \sqrt{c} = -\sqrt{c}$ in $\mathbf{F}_{p^2}$. From the binomial theorem, the fact that $\binom{p}{i} = 0$ in $\mathbf{Z}/p\mathbf{Z}$ for $0 < i < p$, and Fermat’s Little Theorem, it follows that
$$\alpha^p = (a + b\sqrt{c})^p = a^p + b^p (\sqrt{c})^p = a - b\sqrt{c}$$
as desired.
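Here is a small Python sanity check of the Key Lemma (again my own illustration, not from the post), using the same ordered-pair representation and the example values $p = 7$, $c = 3$:

```python
# Brute-force check of the Key Lemma: (a + b*sqrt(c))^p == a - b*sqrt(c) in F_{p^2}.
# Pairs (a, b) represent a + b*sqrt(c); example values p = 7, c = 3.
p, c = 7, 3

def mul(x, y):
    """Multiplication rule (a + b*sqrt(c))(a' + b'*sqrt(c)) = (aa' + cbb') + (ab' + a'b)*sqrt(c)."""
    return ((x[0] * y[0] + c * x[1] * y[1]) % p,
            (x[0] * y[1] + x[1] * y[0]) % p)

def power(x, e):
    """x^e in F_{p^2} by repeated squaring."""
    result = (1, 0)
    while e:
        if e & 1:
            result = mul(result, x)
        x = mul(x, x)
        e >>= 1
    return result

assert all(power((a, b), p) == (a, (-b) % p) for a in range(p) for b in range(p))
print("Key Lemma verified for p = 7, c = 3")
```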
Computing modular square roots
We now turn to Cipolla’s algorithm for finding a solution to $x^2 \equiv a \pmod{p}$ when $a$ is a nonzero square (quadratic residue) modulo the odd prime $p$. If $p \equiv 3 \pmod{4}$, one checks easily that the two solutions are $x = \pm a^{(p+1)/4}$, which can be efficiently computed via repeated squaring. However, the case $p \equiv 1 \pmod{4}$ cannot be treated in a similarly elementary and explicit way.

The first step in Cipolla’s algorithm is to find an integer $t$ with $0 \leq t \leq p-1$ such that $c = t^2 - a$ is not a square modulo $p$. The only known method to do this efficiently is probabilistic: for a random value of $t$, the number $t^2 - a$ will be a non-square modulo $p$ with probability $\frac{p-1}{2p} \approx \frac{1}{2}$. Thus, if $t_1, \ldots, t_k$ are chosen randomly, the probability that none of the $t_i$ works will be roughly $2^{-k}$. So in practice we will always very quickly be able to find a suitable value of $t$, since for any particular $t$ we can efficiently decide whether or not $t^2 - a$ is a square mod $p$ (using, for example, the Law of Quadratic Reciprocity for the Jacobi symbol).
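To make the search for $t$ concrete, here is a short Python sketch (my illustration; the function names are arbitrary) that tests random candidates using Euler’s criterion, which says that $c$ is a non-residue mod $p$ exactly when $c^{(p-1)/2} \equiv -1 \pmod{p}$; a Jacobi-symbol computation would work just as well:

```python
import random

def is_nonresidue(c, p):
    """Euler's criterion: c is a quadratic non-residue mod the odd prime p
    iff c^((p-1)/2) is congruent to -1 mod p."""
    return pow(c, (p - 1) // 2, p) == p - 1

def find_t(a, p):
    """Random search for t with t^2 - a a non-residue mod p (about 2 tries expected)."""
    while True:
        t = random.randrange(p)
        c = (t * t - a) % p
        if c != 0 and is_nonresidue(c, p):
            return t, c

print(find_t(10, 13))   # example: p = 13, a = 10 (a quadratic residue, 10 = 6^2)
```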
Theorem 1: The two solutions in $\mathbf{F}_{p^2} = (\mathbf{Z}/p\mathbf{Z})[\sqrt{c}]$ to $x^2 = a$ (i.e., the two solutions to $x^2 \equiv a \pmod{p}$) are $x = \pm (t + \sqrt{c})^{(p+1)/2}$.

Proof: Let $\alpha = t + \sqrt{c}$. Then $\alpha^{p+1} = \alpha^p \cdot \alpha$, which by the Key Lemma equals
$$(t - \sqrt{c})(t + \sqrt{c}) = t^2 - c = a.$$
Thus $\pm \alpha^{(p+1)/2}$ are square roots of $a$ in $\mathbf{F}_{p^2}$. Moreover, $\alpha^{(p+1)/2} \in \mathbf{Z}/p\mathbf{Z}$, since by assumption the polynomial $x^2 - a$ has two roots in $\mathbf{Z}/p\mathbf{Z}$, and it follows from Lagrange #1 that these are the only roots in the larger field $\mathbf{F}_{p^2}$.
We thus obtain:
Cipolla’s Algorithm: Let $a$ be a non-zero quadratic residue modulo $p$. Compute $(t + \sqrt{c})^{(p+1)/2}$ using repeated squaring in the finite field $\mathbf{F}_{p^2} = (\mathbf{Z}/p\mathbf{Z})[\sqrt{c}]$, where $c = t^2 - a$ is a quadratic non-residue mod $p$ for some integer $t$. The result is an element of $\mathbf{Z}/p\mathbf{Z}$ whose square is equal to $a$.
For computer implementation of this algorithm, it is useful to identify an element $a + b\sqrt{c}$ with the ordered pair $(a, b)$, just as a complex number $x + yi$ is often identified with the point $(x, y)$ in the Euclidean plane. The addition law on ordered pairs is given by the rule
$$(a, b) + (a', b') = (a + a', b + b')$$
and the multiplication law is given by
$$(a, b) \cdot (a', b') = (aa' + cbb', ab' + a'b).$$
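Putting the pieces together, here is a minimal Python sketch of Cipolla’s algorithm in this ordered-pair representation (my own illustration, not code from the post):

```python
import random

def cipolla(a, p):
    """Return x with x*x % p == a, assuming p is an odd prime and a is a
    nonzero quadratic residue mod p."""
    # Step 1: find t such that c = t^2 - a is a quadratic non-residue mod p
    # (checked with Euler's criterion, as in the earlier snippet).
    while True:
        t = random.randrange(p)
        c = (t * t - a) % p
        if c != 0 and pow(c, (p - 1) // 2, p) == p - 1:
            break

    def mul(x, y):
        """Multiply pairs (u, v) representing u + v*sqrt(c) in F_{p^2}."""
        return ((x[0] * y[0] + c * x[1] * y[1]) % p,
                (x[0] * y[1] + x[1] * y[0]) % p)

    # Step 2: compute (t + sqrt(c))^((p+1)/2) by repeated squaring.
    result, base, e = (1, 0), (t, 1), (p + 1) // 2
    while e:
        if e & 1:
            result = mul(result, base)
        base = mul(base, base)
        e >>= 1

    # By Theorem 1 the sqrt(c)-coordinate is 0; the other coordinate squares to a.
    return result[0]

r = cipolla(10, 13)      # example: 10 = 6^2 is a quadratic residue mod 13
print(r, (r * r) % 13)   # prints 6 or 7, then 10
```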
We note, for use in the next section, that these addition and multiplication laws make sense on ordered pairs of integers modulo $n$ even if $n$ is not prime, or if $c$ is a square modulo $n$. Abusing terminology somewhat, we denote the corresponding ring by $(\mathbf{Z}/n\mathbf{Z})[\sqrt{c}]$.
Primality proving for Mersenne numbers
We now turn to the problem of determining whether a large positive integer is prime. Though there are general algorithms for doing this (e.g. the celebrated polynomial-time test due to Agrawal, Kayal, and Saxena), such algorithms are much slower than special-purpose algorithms designed for numbers of a specific form. To illustrate the point, all of the largest known prime numbers are Mersenne numbers, meaning that they have the form $n = 2^q - 1$ for some odd positive integer $q$. More precisely, the largest number proven to be prime (as of November 2013) is the 17,425,170-digit Mersenne prime $2^{57885161} - 1$, which was discovered by the GIMPS project in January 2013. Mersenne numbers can be tested for primality in a particularly efficient way because of the Lucas-Lehmer test, which we now describe.
We begin by observing that if $n = 2^q - 1$ is a Mersenne prime (with $q \geq 3$ odd), then $2$ is a quadratic residue modulo $n$ and $3$ is a quadratic non-residue modulo $n$. This is a straightforward consequence of the Law of Quadratic Reciprocity, since one shows easily that $n \equiv 7 \pmod{8}$ and $n \equiv 1 \pmod{3}$.
Theorem 2: Let $n = 2^q - 1$ (with $q \geq 3$ odd) be a Mersenne number. Then $n$ is prime if and only if $(2 + \sqrt{3})^{(n+1)/2} = -1$ in the ring $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$.

Proof: Suppose first that $n$ is prime. Let $\tau$ be a square root of $2$ in $\mathbf{Z}/n\mathbf{Z}$ (which exists since $2$ is a quadratic residue mod $n$) and let $\omega = (1 + \sqrt{3})/\tau$ in the field $\mathbf{F}_{n^2} = (\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$. Then $\omega^2 = (1 + \sqrt{3})^2 / 2 = 2 + \sqrt{3}$ and, by the Key Lemma and Fermat’s Little Theorem,
$$(2 + \sqrt{3})^{(n+1)/2} = \omega^{n+1} = \frac{(1 + \sqrt{3})^n (1 + \sqrt{3})}{\tau^n \cdot \tau} = \frac{(1 - \sqrt{3})(1 + \sqrt{3})}{2} = \frac{-2}{2} = -1.$$
Conversely, suppose $(2 + \sqrt{3})^{(n+1)/2} = -1$ and let $\ell$ be a prime divisor of $n$. Let $F = \mathbf{Z}/\ell\mathbf{Z}$ if $3$ is a square mod $\ell$, and let $F = (\mathbf{Z}/\ell\mathbf{Z})[\sqrt{3}]$ otherwise. Since $(2 + \sqrt{3})^{(n+1)/2} = -1$ in $F$ but $(2 + \sqrt{3})^{n+1} = 1$ in $F$, it follows that the order of $2 + \sqrt{3}$ in $F^*$ is $n + 1 = 2^q$ (the order divides $2^q$ but not $2^{q-1}$). By Lagrange #2, $n + 1$ divides the order of $F^*$, which is either $\ell - 1$ or $\ell^2 - 1$. In either case $\ell^2 > n$, i.e., $\ell > \sqrt{n}$. Since this is true for all primes $\ell$ dividing $n$, it follows that $n$ is prime.
Since one can compute $(2 + \sqrt{3})^{(n+1)/2}$ in the ring $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$ via repeated squaring, the theorem gives a practical method for testing Mersenne numbers for primality. However, there is a reformulation of the theorem which is more elementary and better suited to computer implementation; we describe this now. (We first note that the theorem remains true, with the same proof, if we replace $2 + \sqrt{3}$ by its conjugate $2 - \sqrt{3}$.)
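To illustrate the “practical method” just mentioned, here is a small Python sketch (mine, not from the post) that applies Theorem 2 directly: it computes $(2 + \sqrt{3})^{(n+1)/2}$ in $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$ by repeated squaring on ordered pairs and checks whether the result is $-1$, i.e. the pair $(n-1, 0)$:

```python
def mersenne_prime_via_thm2(q):
    """Theorem 2: n = 2^q - 1 (q >= 3 odd) is prime iff
    (2 + sqrt(3))^((n+1)/2) == -1 in (Z/nZ)[sqrt(3)]."""
    n = (1 << q) - 1

    def mul(x, y):
        # pairs (a, b) represent a + b*sqrt(3), reduced mod n
        return ((x[0] * y[0] + 3 * x[1] * y[1]) % n,
                (x[0] * y[1] + x[1] * y[0]) % n)

    result, base, e = (1, 0), (2, 1), (n + 1) // 2
    while e:
        if e & 1:
            result = mul(result, base)
        base = mul(base, base)
        e >>= 1
    return result == (n - 1, 0)   # the pair (n-1, 0) represents -1

# q = 3, 5, 7, 13 give Mersenne primes; q = 9, 11 do not
print([q for q in (3, 5, 7, 9, 11, 13) if mersenne_prime_via_thm2(q)])
```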
Let $T(x) = x^2 - 2$, and recursively define a sequence $s_0, s_1, s_2, \ldots$ of integers modulo $n$ by the rule $s_0 = 4$ and $s_{i+1} = T(s_i) = s_i^2 - 2$ for $i \geq 0$. (In other words, $s_i = T^{(i)}(4)$, where $T^{(i)}$ is the $i$-th iterate of $T$.)

Lucas-Lehmer Test: The Mersenne number $n = 2^q - 1$ with $q \geq 3$ odd is prime if and only if $s_{q-2} \equiv 0 \pmod{n}$.
Proof: We first observe that $s_i = \omega^{2^i} + \omega^{-2^i}$ in $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$ for all $i \geq 0$, where $\omega = 2 + \sqrt{3}$ and $\omega^{-1} = 2 - \sqrt{3}$ (note that $(2 + \sqrt{3})(2 - \sqrt{3}) = 1$): for $i = 0$ this is clear, and a simple calculation shows that
$$s_{i+1} = s_i^2 - 2 = \left( \omega^{2^i} + \omega^{-2^i} \right)^2 - 2 = \omega^{2^{i+1}} + \omega^{-2^{i+1}}.$$
If $n$ is prime, then by Theorem 2
$$\omega^{2^{q-1}} = \omega^{(n+1)/2} = -1.$$
Thus $\omega^{2^{q-2}} \cdot s_{q-2} = \omega^{2^{q-1}} + 1 = 0$ in $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$, which implies that $s_{q-2} \equiv 0 \pmod{n}$, since $\omega^{2^{q-2}}$ is invertible in $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$ (indeed, this ring is a field because $n$ is prime).

Conversely, suppose $s_{q-2} \equiv 0 \pmod{n}$. Then $\omega^{2^{q-2}} + \omega^{-2^{q-2}} = 0$ in $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$, so multiplying by $\omega^{2^{q-2}}$ gives $\omega^{2^{q-1}} = -1$. On the other hand, $n + 1 = 2^q$, so that $(n+1)/2 = 2^{q-1}$ and hence $(2 + \sqrt{3})^{(n+1)/2} = \omega^{2^{q-1}} = -1$ in $(\mathbf{Z}/n\mathbf{Z})[\sqrt{3}]$. By our theorem, $n$ is prime.
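For completeness, here is the resulting test in a few lines of Python (my illustration; the post itself contains no code). Only squaring and reduction modulo $n$ are needed:

```python
def lucas_lehmer(q):
    """True iff n = 2^q - 1 (q >= 3 odd) is prime,
    using s_0 = 4 and s_{i+1} = s_i^2 - 2 (mod n)."""
    n = (1 << q) - 1
    s = 4
    for _ in range(q - 2):
        s = (s * s - 2) % n
    return s == 0

# exponents q <= 31 for which 2^q - 1 is prime: 3, 5, 7, 13, 17, 19, 31
print([q for q in range(3, 32, 2) if lucas_lehmer(q)])
```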
Concluding observations:
1. The following is a variant of Cipolla’s algorithm based on Lucas sequences (see Exercise 2.31 in the book “Prime Numbers: A Computational Perspective” by Crandall and Pomerance). Choose $t$ so that $t^2 - 4a$ is a quadratic non-residue mod $p$. Define a Lucas sequence $(V_k)$ by $V_0 = 2$, $V_1 = t$, and $V_{k+1} = t V_k - a V_{k-1}$ for $k \geq 1$. Then $\frac{1}{2} V_{(p+1)/2}$ is a square root of $a$ modulo $p$. (A short code sketch of this variant appears after these observations.)
2. The proof of the Lucas-Lehmer test presented above is due to Michael Rosen, with simplifications by J.W. Bruce. Our exposition was influenced by this blog post written by Terry Tao, which the interested reader should definitely consult.
3. Our “Key Lemma” can also be proved using Galois theory: the Galois group of $\mathbf{F}_{p^2}$ over $\mathbf{F}_p$ has order 2, and both $\alpha \mapsto \alpha^p$ and $a + b\sqrt{c} \mapsto a - b\sqrt{c}$ are non-trivial automorphisms, so they must coincide.
4. The complexity of Cipolla’s Algorithm is $O(\log^3 p)$ bit operations, which is asymptotically better than the worst case of the Tonelli-Shanks Algorithm. There is no known deterministic polynomial-time algorithm for computing square roots modulo a prime. However, assuming the generalized Riemann hypothesis, it is known that there is a quadratic non-residue less than $2(\log p)^2$, which makes it possible to determine suitable values of $t$ and $c$ in Cipolla’s algorithm in polynomial time.
5. Our statement and proof of the main theorem used in the Lucas-Lehmer test is reminiscent of another classical result attributed to Lucas: if $a$ and $n$ are relatively prime positive integers with $a^{n-1} \equiv 1 \pmod{n}$ and $a^{(n-1)/\ell} \not\equiv 1 \pmod{n}$ for all prime divisors $\ell$ of $n - 1$, then $n$ is prime (and $a$ is a primitive root mod $n$). Lucas viewed this as the correct “converse” to Fermat’s Little Theorem.
6. The Chebyshev polynomial $T(x) = x^2 - 2$ which one iterates in the Lucas-Lehmer test is very special from the point of view of algebraic dynamics (for example, one can explicitly write down its preperiodic points, and its Julia set is the real segment $[-2, 2]$). I don’t know if this connection has any significance…
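Finally, here is the code sketch promised in observation 1 (my own Python illustration of the Crandall–Pomerance exercise, not code from the post). For clarity it computes $V_{(p+1)/2}$ with the naive linear recurrence, which takes about $p/2$ steps; in practice one would use the standard doubling identities for Lucas sequences to get there in $O(\log p)$ steps:

```python
import random

def lucas_sqrt(a, p):
    """Square root of the quadratic residue a modulo the odd prime p, via the
    Lucas sequence V_0 = 2, V_1 = t, V_{k+1} = t*V_k - a*V_{k-1} (mod p)."""
    # choose t with t^2 - 4a a quadratic non-residue mod p
    while True:
        t = random.randrange(p)
        d = (t * t - 4 * a) % p
        if d != 0 and pow(d, (p - 1) // 2, p) == p - 1:
            break
    v_prev, v = 2, t                      # V_0, V_1
    for _ in range((p + 1) // 2 - 1):     # step up to V_{(p+1)/2}
        v_prev, v = v, (t * v - a * v_prev) % p
    return (v * pow(2, p - 2, p)) % p     # (1/2) * V_{(p+1)/2}

r = lucas_sqrt(10, 13)   # example: 10 is a square mod 13
print(r, (r * r) % 13)   # prints 6 or 7, then 10
```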
I updated this post based on helpful comments and corrections by Keith Conrad. Keith pointed out that in the proof of Theorem 2, a little more should be said about the compatibility of the calculation of (2+sqrt(3))^((n+1)/2) in (Z/n)[sqrt(3)], as the theorem requires, and its calculation in the field F, which is carried out only in the proof. I will clarify this in my next post (coming soon!). Keith also noted: “In the 4th observation, this use of GRH is for Dirichlet L-functions. If Dirichlet L-functions were proved to be zero free for Re(s) > 1 – epsilon then the least quadratic nonresidue mod p would be O((log p)^(1/epsilon)), which would make the algorithm polynomial-time as long as there were any vertical strip to the left of Re(s) = 1 that is zero-free for all Dirichlet L-functions.”