Everyone who studies elementary number theory learns two different versions of Fermat’s Little Theorem:
Fermat’s Little Theorem, Version 1: If is prime and is an integer not divisible by , then .
Fermat’s Little Theorem, Version 2: If is prime and is any integer, then .
as well as the following extension of Version 1 of Fermat’s Little Theorem to arbitrary positive integers :
Euler’s Theorem: If is a positive integer and , then , where is Euler’s totient function.
My first goal in this post is to explain a generalization of Version 2 of Fermat’s Little Theorem to arbitrary . I’ll then explain an extension of this result to integer matrices, along with a slick proof of all of these results (and more) via “combinatorial zeta functions”.
The Möbius function and the Necklace Theorem
First of all, recall the definition of the Möbius function from elementary number theory:
Definition: For a positive integer , define if is not square-free. Otherwise, write as a product of distinct primes and define .
The importance of the Möbius function is captured by the following result (analogous to the Fourier inversion formula):
Theorem (Möbius Inversion): If is multiplicative, meaning that whenever , and for all , then for all .
The following result, which generalizes Version 2 of Fermat’s Little Theorem, was proved by Gauss when is prime. A proof of the general case was published independently between 1880 and 1883 by Kantor, Weyr, Lucas, and Pellet. For lack of a better name (it’s usually called “Gauss’s congruence”, but let’s be honest, Gauss has enough named for him already) I’ll call this the
Necklace Theorem: If is a positive integer and is any integer, then .
We will give three different proofs of this result. (For an explanation of the name “Necklace Theorem”, see this Wikipedia page or keep reading.)
Algebraic Proof of the Necklace Theorem
When is a prime power, the Necklace Theorem says that . This can be proved by induction on , with the base case being Fermat’s Little Theorem. For the inductive step, if then for some integer . Raising both sides to the power and using the fact that for , the result follows.
The general case follows from the prime-power case using the Chinese Remainder Theorem. For example, when the Necklace Theorem says that By the case the left-hand side is , but the left-hand side is also . We leave a formal proof of the general case along these lines as an exercise to the reader.
For future reference, we make the following definition:
Definition: A sequence of integers for satisfies the necklace congruences if for all . Equivalently, by the Chinese Remainder Theorem, this means that whenever for some prime .
With this terminology, the Necklace Theorem is the assertion that for every integer , the sequence satisfies the necklace congruences.
Dynamical Proof of the Necklace Theorem
If is a set and is any function, let denote the iterate ( times) and let denote the number of fixed points of . We assume:
() has only finitely many fixed points for all positive integers .
Condition () is automatic, of course, if is a finite set.
An -cycle for is an orbit of size for (acting on by iteration).
Lemma: The number of -cycles for is equal to
Proof: Let denote the number of -cycles for . A point of is fixed by if and only if it belongs to a -cycle for some , and each such -cycle has elements. Thus The result now follows by Möbius inversion.
Corollary: The sequence satisfies the necklace congruences for every dynamical system satisfying ().
Proof: Since is equal to the number of -cycles for , it must in particular be an integer. Therefore for all .
Using the Corollary, we can give a “dynamical proof” of the Necklace Theorem in the special case where is a positive integer (which immediately implies the general case if we take the characterization of the necklace congruences). It suffices to construct a dynamical system for which the number of fixed points of is equal to for all .
Let and let be the set of all functions , where . Let be the shift map which takes to . (We can think of elements of as circular necklaces comprised of labelled beads, each of which is colored using possible colors, and of as a single-bead rotation.) For , a necklace is fixed by iff for all , which means that the beads must all have the same color for . Since there are such necklaces, the result follows.
Interlude: Varieties over Finite Fields
(If you don’t know any algebraic geometry, feel free to skip to the next section now!)
Any number theorist worth his/her salt who sees a sequence of the form will immediately think of varieties over finite fields. If is a smooth projective algebraic variety over a finite field (identified, for simplicity, with its points over an algebraic closure of ) and is the Frobenius map, we have and thus , which is the subject of the Weil Conjectures (now theorems!).
The Weil Conjectures concern the Hasse-Weil zeta function of , which is defined as a formal power series by the formula Among the deep facts comprising the Weil Conjectures is the assertion (first proved by Dwork) that is a rational function. If we write , then as observed by Weil the coefficients are integers. In fact, is the number of effective divisors of degree on which are defined over . As we will see in a moment, the fact that for all is intimately related to the fact (established above) that the sequence satisfies the necklace congruences.
Combinatorial zeta functions
Definition: Let be a sequence of integers. The combinatorial zeta function of is the formal power series defined by
The definition of the combinatorial zeta function is motivated by the Girard-Newton identities relating the elementary symmetric functions of to the powers sums . More precisely, if we write and let then each is a polynomial in the ‘s with rational coefficients (and vice-versa). The combinatorial zeta function gives a simple way to see this (and derive the explicit formulas).
Indeed, we can write with . The right-hand side can be rewritten as Expanding and equating coefficients of on both sides of gives a rational polynomial expression for the ‘s in terms of the ‘s.
This argument shows, in particular:
Lemma: The combinatorial zeta function of the sequence is in fact a polynomial (the reciprocal polynomial of ).
One of the most remarkable facts about the combinatorial zeta function is the following result:
Theorem (Carlitz): We have for all if and only if satisfies the necklace congruences.
Proof: For any formal power series , with rational coefficients, we can recursively define a unique sequence of rational numbers such that has the “Euler product” expansion Indeed, if we write then Therefore and by Möbius inversion
Applying this to , we have and therefore which is an integer for all iff satisfies the necklace congruences. But from we easily deduce that iff for all , and the result follows.
Combinatorial Proof of the Necklace Theorem
Using Carlitz’s result, the Necklace Theorem follows in a completely formal way. Indeed, if then is equal to , which has integer coefficients.
Exercise: Use Carlitz’s theorem to show that if is prime then then we have the polynomial congruence
Exercise (Harder): Show that the Legendre polynomials , whose coefficients are rational numbers with only powers of 2 in their denominators, satisfy the congruences for every odd prime .
A Second Interlude
Any number theorist worth his/her salt who sees a sequence of the form should also think of the Grothendieck-Lefschetz Trace Formula, which computes the number of fixed points of certain maps in terms of the traces of certain linear operators. From this point of view, it’s natural to ask (in light of the above discussion) whether the traces of satisfy the necklace congruences when is a square integer matrix. The answer, as we are about to see, is yes.
Fermat’s Little Theorem for Matrices
Theorem (Jänichen): If is an integer matrix, the sequence satisfies the necklace congruences. In particular, for all prime numbers .
This result (which reduces to Fermat’s Little Theorem when and is prime) was first proved by Jänichen in 1921, and the theorem was independently rediscovered by many different authors, including Schur in 1937. An algebraic proof similar to the one given above was published by Mazur and Petrenko in 2010, and a dynamical proof using generalized necklaces was given by C. J. Smyth in 1986. For a survey of the history of all of these different proofs, see this paper by Zarelua.
Here is a simple combinatorial proof using Carlitz’s theorem and the matrix identity :
Proof: Let Then is equal to Since the right-hand side is equal to , where are the algebraic integer eigenvalues of , the theory of symmetric functions shows that it’s a power series with integer coefficients.
Exercise: Use Jänichen’s theorem to show that the Lucas sequence defined by for satisfies the necklace congruences.
Exercise: [American Math Monthly Problem E2992 (1986), proposed by Michael Larsen as a graduate student at Harvard] If are complex numbers, show that is an integer for all if and only if are the roots of a monic integer polynomial of degree .
(1) As discussed in Section 4.7 of Wilf’s book “generatingfunctionology”, another application of combinatorial zeta functions is that if are indeterminates, the coefficient of in the zeta function , where , is the probability that a random permutation on letters has cycle type . Using this, one can prove wild-looking results such as:
Theorem (cf. Section 4.8 of Wilf): The probability that a random permutation in has a square root in is the coefficient of in
(2) When is the number of isolated fixed points of for some continuous endomorphism of a topological space , the corresponding combinatorial zeta function is known as the Artin-Mazur zeta function of . In their paper, Artin and Mazur proved that if is a compact differentiable manifold, then for every there is a dense subset of the space of -diffeomorphisms of such that if then the formal power series has a nonzero radius of convergence in . (This is equivalent to saying that the number of isolated periodic points of of period grows at most exponentially in .)
(3) Dwork’s proof of the rationality of the zeta function for varieties over finite fields is based on the necklace congruences, an infinite-dimensional analogue of Jänichen’s Theorem, and the rationality criterion for integer power series discussed in this blog post.
(4) Dick Lipton has some interesting speculations relating Jänichen’s Theorem to the computational hardness of integer factorization in this blog post.
(5) The polynomials in the variable which appear in the definition of necklace congruences are called necklace polynomials, and they appear in many different mathematical guises. For example, is the number of monic irreducible polynomials of degree over the finite field .
(6) Some references for the material in this post include this Math Overflow post, Chapter 7 of Alain Robert’s book “A Course in p-adic Analysis”, and Richard Stanley’s “Enumerative Combinatorics” books (especially Exercise 1.158 in Vol. 1 and Exercise 5.2 in Vol. 2). Carlitz’s theorem was originally published here.