Generalizations of Fermat’s Little Theorem and combinatorial zeta functions

Everyone who studies elementary number theory learns two different versions of Fermat’s Little Theorem:

Fermat’s Little Theorem, Version 1: If p is prime and a is an integer not divisible by p, then a^{p-1} \equiv 1 \pmod{p}.

Fermat’s Little Theorem, Version 2: If p is prime and a is any integer, then a^{p} \equiv a \pmod{p}.

as well as the following extension of Version 1 of Fermat’s Little Theorem to arbitrary positive integers n:

Euler’s Theorem: If n is a positive integer and (a,n)=1, then a^{\phi(n)} \equiv 1 \pmod{n}, where \phi is Euler’s totient function.

My first goal in this post is to explain a generalization of Version 2 of Fermat’s Little Theorem to arbitrary n. I’ll then explain an extension of this result to m \times m integer matrices, along with a slick proof of all of these results (and more) via “combinatorial zeta functions”.

The Möbius function and the Necklace Theorem

First of all, recall the definition of the Möbius function from elementary number theory:

Definition: For a positive integer n, define \mu(n) = 0 if n is not square-free. Otherwise, write n=p_1\cdots p_k as a product of distinct primes and define \mu(n) = (-1)^k.

The importance of the Möbius function is captured by the following result (analogous to the Fourier inversion formula):

Theorem (Möbius Inversion): If f : {\mathbb N} \to {\mathbb C} is multiplicative, meaning that f(ab)=f(a)f(b) whenever (a,b)=1, and g(n) = \sum_{d \mid n} f(d) for all n, then f(n) = \sum_{d \mid n} \mu(n/d) g(d) for all n.

The following result, which generalizes Version 2 of Fermat’s Little Theorem, was proved by Gauss when a is prime. A proof of the general case was published independently between 1880 and 1883 by Kantor, Weyr, Lucas, and Pellet. For lack of a better name (it’s usually called “Gauss’s congruence”, but let’s be honest, Gauss has enough named for him already) I’ll call this the

Necklace Theorem: If n is a positive integer and a is any integer, then \sum_{d \mid n} \mu(n/d) a^d \equiv 0 \pmod{n}.

We will give three different proofs of this result. (For an explanation of the name “Necklace Theorem”, see this Wikipedia page or keep reading.)

Algebraic Proof of the Necklace Theorem

When n=p^k is a prime power, the Necklace Theorem says that a^{p^k} \equiv a^{p^{k-1}} \pmod{p^k}. This can be proved by induction on k, with the base case k=1 being Fermat’s Little Theorem. For the inductive step, if a^{p^k} \equiv a^{p^{k-1}} \pmod{p^k} then a^{p^k} = a^{p^{k-1}} + p^k m for some integer m. Raising both sides to the p^{\rm th} power and using the fact that p \mid \binom{p}{i} for 0 < i < p, the result follows.

The general case follows from the prime-power case using the Chinese Remainder Theorem. For example, when n = 45 the Necklace Theorem says that a^{45} - a^{15} - a^9 + a^3 \equiv 0 \pmod{45}. By the n=9 case the left-hand side is ((a^5)^9 - (a^5)^3)+(a^9-a^3) \equiv 0 \pmod{9}, but the left-hand side is also ((a^9)^5 - a^9))+((a^3)^5-a^3) \equiv 0 \pmod{5}. We leave a formal proof of the general case along these lines as an exercise to the reader.

For future reference, we make the following definition:

Definition: A sequence of integers a_n for n=1,2,\ldots satisfies the necklace congruences if \sum_{d \mid n} \mu(n/d) a_d \equiv 0 \pmod{n} for all n. Equivalently, by the Chinese Remainder Theorem, this means that a_n \equiv a_{n/p} \pmod{n} whenever n=p^k for some prime p.

With this terminology, the Necklace Theorem is the assertion that for every integer a, the sequence a_n = a^n satisfies the necklace congruences.

Dynamical Proof of the Necklace Theorem

If X is a set and f : X \to X is any function, let f^d denote the d^{\rm th} iterate f \circ f \circ \cdots \circ f (d times) and let {\rm Fix}(f) denote the number of fixed points of f. We assume:

(\star) f^d has only finitely many fixed points for all positive integers d.

Condition (\star) is automatic, of course, if X is a finite set.

An n-cycle for f is an orbit of size n for f (acting on X by iteration).

Lemma: The number of n-cycles for f is equal to \frac{1}{n} \sum_{d\mid n} \mu(n/d) |{\rm Fix}(f^d)|.

Proof: Let C_d(f) denote the number of d-cycles for f. A point of X is fixed by f^n if and only if it belongs to a d-cycle for some d \mid n, and each such d-cycle has d elements. Thus |{\rm Fix}(f^n)| = \sum_{d\mid n} d C_d(f). The result now follows by Möbius inversion.

Corollary: The sequence a_n =|{\rm Fix}(f^n)| satisfies the necklace congruences for every dynamical system f : X \to X satisfying (\star).

Proof: Since \frac{1}{n} \sum_{d\mid n} \mu(n/d) |{\rm Fix}(f^d)| is equal to the number of n-cycles for f, it must in particular be an integer. Therefore \sum_{d\mid n} \mu(n/d) a_d \equiv 0 \pmod{n} for all n.

Using the Corollary, we can give a “dynamical proof” of the Necklace Theorem in the special case where a is a positive integer (which immediately implies the general case if we take the a_n \equiv a_{n/p} \pmod{n} characterization of the necklace congruences). It suffices to construct a dynamical system for which the number of fixed points of f^n is equal to a^n for all n\geq 1.

Let N = {\rm LCM}(1,\ldots,n) and let X be the set of all functions \nu : {\mathbb Z} / N{\mathbb Z} \to [a], where [a] = \{ 1,\ldots,a \}. Let f be the shift map which takes \nu(x) to \nu(x+1). (We can think of elements of X as circular necklaces comprised of N labelled beads, each of which is colored using a possible colors, and of f as a single-bead rotation.) For d \mid n, a necklace \nu is fixed by f^d iff \nu(i) = \nu(i+d) for all i \in {\mathbb Z} / N{\mathbb Z}, which means that the beads \{ i, i+d, i+2d,\ldots \} must all have the same color for i=0,1\ldots,d-1. Since there are a^d such necklaces, the result follows.

Image: https://mathlesstraveled.com/2017/12/12/fermats-little-theorem-proof-by-necklaces/

Interlude: Varieties over Finite Fields

(If you don’t know any algebraic geometry, feel free to skip to the next section now!)

Any number theorist worth his/her salt who sees a sequence of the form a_n =|{\rm Fix}(f^n)| will immediately think of varieties over finite fields. If X is a smooth projective algebraic variety over a finite field {\mathbb F}_q (identified, for simplicity, with its points over an algebraic closure of {\mathbb F}_q) and f : X \to X is the Frobenius map, we have {\rm Fix}(f^n) = X({\mathbb F}_{q^n}) and thus a_n = |X({\mathbb F}_{q^n})|, which is the subject of the Weil Conjectures (now theorems!).

The Weil Conjectures concern the Hasse-Weil zeta function of X, which is defined as a formal power series by the formula \zeta_X(t) = {\rm exp}(\sum_{n\geq 1} a_n t^n / n). Among the deep facts comprising the Weil Conjectures is the assertion (first proved by Dwork) that \zeta_X(t) is a rational function. If we write \zeta_X(t)=1 + \sum_{n \geq 1} c_n t^n, then as observed by Weil the coefficients c_n are integers. In fact, c_n = |{\rm Sym}^n(X)({\mathbb F}_q)| is the number of effective divisors of degree n on X which are defined over {\mathbb F}_q. As we will see in a moment, the fact that c_n \in {\mathbb Z} for all n is intimately related to the fact (established above) that the sequence a_n satisfies the necklace congruences.

Combinatorial zeta functions

Definition: Let a = \{ a_n \} be a sequence of integers. The combinatorial zeta function of a is the formal power series Z_a(t)=1 + \sum_{n \geq 1} c_n t^n defined by \zeta_a(t) = {\rm exp}(\sum_{n\geq 1} a_n t^n / n).

The definition of the combinatorial zeta function is motivated by the Girard-Newton identities relating the elementary symmetric functions of a_1,\ldots,a_d to the powers sums a_1^n + \cdots + a_d^n. More precisely, if we write h(x) = \prod_{i=1}^d (x - a_i) = x^d + \sum_{i=1}^d (-1)^i e_i x^{d-i} and let p_n = \sum_{i=1}^n a_i^n then each e_i is a polynomial in the p_n‘s with rational coefficients (and vice-versa). The combinatorial zeta function gives a simple way to see this (and derive the explicit formulas).

Indeed, we can write h(x) = x^d f(1/x) with f(x) = 1+\sum_{i=1}^d (-1)^i e_i x^i = \prod_{i=1}^n (1-a_i x). The right-hand side can be rewritten as {\rm exp}(\log \prod_{i=1}^n (1-a_i x))= \sum_{j=0}^\infty (-\sum_{i=1}^\infty \frac{p_i}{i} x^i)^j/j! Expanding and equating coefficients of x^n on both sides of 1+\sum_{i=1}^d (-1)^i e_i x^i = \sum_{j=0}^\infty (-\sum_{i=1}^\infty \frac{p_i}{i} x^i)^j/j! gives a rational polynomial expression for the e_i‘s in terms of the p_i‘s.

This argument shows, in particular:

Lemma: The combinatorial zeta function of the sequence -p_n is in fact a polynomial (the reciprocal polynomial of h(x)).

Carlitz’s Theorem

One of the most remarkable facts about the combinatorial zeta function is the following result:

Theorem (Carlitz): We have c_n \in {\mathbb Z} for all n if and only if a_n satisfies the necklace congruences.

Proof: For any formal power series f(t) = 1 + \sum_{n \geq 1} c_n t^n, with rational coefficients, we can recursively define a unique sequence of rational numbers b_n such that f(t) has the “Euler product” expansion f(t) = \prod_{n \geq 1} (1-t^n)^{-b_n}. Indeed, if we write g(t) =  \log f(t)= \sum_{n \geq 1} g_n t^n then g(t) = \sum_{i \geq 1} \sum_{j \geq 1} \frac{b_i t^{ij}}{j} = \sum_{n \geq 1} \frac{t^n}{n} \sum_{d \mid n} d b_d. Therefore n g_n = \sum_{d \mid n} d b_d and by Möbius inversion b_n = \frac{1}{n} \sum_{d \mid n} d g_d \mu(n/d).

Applying this to f(t) = \zeta_a(t), we have g_n = a_n / n and therefore b_n = \frac{1}{n} \sum_{d \mid n} a_d \mu(n/d), which is an integer for all n iff a_n satisfies the necklace congruences. But from \sum_{n \geq 1} c_n t^n = \prod_{n \geq 1} (1-t^n)^{-b_n} we easily deduce that c_n \in {\mathbb Z} iff b_n \in {\mathbb Z} for all n, and the result follows.

Combinatorial Proof of the Necklace Theorem

Using Carlitz’s result, the Necklace Theorem follows in a completely formal way. Indeed, if a_n = a^n then \zeta_a(t) = {\rm exp}(\sum_{n\geq 1} a^n t^n / n) is equal to {\rm exp}(-\log (1 - at)) = \frac{1}{1-at} = 1 + \sum_{n\geq 1} a^n t^n, which has integer coefficients.

Exercise: Use Carlitz’s theorem to show that if p \mid m is prime then then we have the polynomial congruence (1+x)^m \equiv (1+x^p)^{m/p} \pmod{p^{{\rm ord}_p(m)}}.

Exercise (Harder): Show that the Legendre polynomials P_n(t), whose coefficients are rational numbers with only powers of 2 in their denominators, satisfy the congruences P_m(t) \equiv P_{m/p}(t^p) \pmod{p^{{\rm ord}_p(m)}} for every odd prime p.

A Second Interlude

Any number theorist worth his/her salt who sees a sequence of the form a_n =|{\rm Fix}(f^n)| should also think of the Grothendieck-Lefschetz Trace Formula, which computes the number of fixed points of certain maps in terms of the traces of certain linear operators. From this point of view, it’s natural to ask (in light of the above discussion) whether the traces of A^n satisfy the necklace congruences when A is a square integer matrix. The answer, as we are about to see, is yes.

Fermat’s Little Theorem for Matrices

Theorem (Jänichen): If A is an m \times m integer matrix, the sequence a_n = {\rm trace}(A^n) satisfies the necklace congruences. In particular, {\rm trace}(A^p) \equiv {\rm trace}(A) \pmod{p} for all prime numbers p.

This result (which reduces to Fermat’s Little Theorem when m=1 and n = p is prime) was first proved by Jänichen in 1921, and the theorem was independently rediscovered by many different authors, including Schur in 1937. An algebraic proof similar to the one given above was published by Mazur and Petrenko in 2010, and a dynamical proof using generalized necklaces was given by C. J. Smyth in 1986. For a survey of the history of all of these different proofs, see this paper by Zarelua.

Here is a simple combinatorial proof using Carlitz’s theorem and the matrix identity {\rm exp}({\rm trace}(A)) = {\rm det}({\rm exp}(A)):

Proof: Let a_n = {\rm trace}(A^n). Then \zeta_a(t) = {\rm exp}(\sum_{n\geq 1} {\rm trace}(A^n)t^n / n) is equal to {\rm exp}(- {\rm trace}(\log (1 - At)) = 1/{\rm det}(1-At). Since the right-hand side is equal to \frac{1}{\prod (1-\lambda_i t)}, where \lambda_1,\ldots,\lambda_m are the algebraic integer eigenvalues of A, the theory of symmetric functions shows that it’s a power series with integer coefficients.

Exercise: Use Jänichen’s theorem to show that the Lucas sequence defined by L_0=2, L_1=1, L_n = L_{n-1}+L_{n-2} for n \geq 2 satisfies the necklace congruences.

Exercise: [American Math Monthly Problem E2992 (1986), proposed by Michael Larsen as a graduate student at Harvard] If \alpha_1,\ldots,\alpha_d are complex numbers, show that \alpha_1^n + \cdots + \alpha_d^n is an integer for all n \geq 1 if and only if \alpha_1,\ldots,\alpha_d are the roots of a monic integer polynomial of degree d.

Concluding remarks

(1) As discussed in Section 4.7 of Wilf’s book “generatingfunctionology”, another application of combinatorial zeta functions is that if x_1,x_2,\ldots are indeterminates, the coefficient of x_1^{k_1} \cdots x_m^{k_m} t^n in the zeta function {\rm exp}(\sum_{n\geq 1} x_n t^n / n), where k_1 + \cdots + k_m = n, is the probability that a random permutation on n letters has cycle type (k_1,\ldots,k_m). Using this, one can prove wild-looking results such as:

Theorem (cf. Section 4.8 of Wilf): The probability that a random permutation in S_n has a square root in S_n is the coefficient of t^n in \sqrt{\frac{1+t}{1-t}} \prod_{m \geq 1} {\rm cosh}(\frac{t^{2m}}{2m}).

(2) When a_n is the number of isolated fixed points of f : X \to X for some continuous endomorphism of a topological space X, the corresponding combinatorial zeta function is known as the Artin-Mazur zeta function \zeta_f(t) of f. In their paper, Artin and Mazur proved that if X is a compact differentiable manifold, then for every k there is a dense subset E_k of the space of C^k-diffeomorphisms of X such that if f \in E_k then the formal power series \zeta_f(t) has a nonzero radius of convergence in {\mathbb C}. (This is equivalent to saying that the number of isolated periodic points of f of period n grows at most exponentially in n.)

(3) Dwork’s proof of the rationality of the zeta function for varieties over finite fields is based on the necklace congruences, an infinite-dimensional analogue of Jänichen’s Theorem, and the rationality criterion for integer power series discussed in this blog post.

(4) Dick Lipton has some interesting speculations relating Jänichen’s Theorem to the computational hardness of integer factorization in this blog post.

(5) The polynomials M(\alpha,n)= \frac{1}{n} \sum_{d \mid n} \mu_{n/d} \alpha^d in the variable \alpha which appear in the definition of necklace congruences are called necklace polynomials, and they appear in many different mathematical guises. For example, M(q,n) is the number of monic irreducible polynomials of degree n over the finite field {\mathbb F}_q.

(6) Some references for the material in this post include this Math Overflow post, Chapter 7 of Alain Robert’s book “A Course in p-adic Analysis”, and Richard Stanley’s “Enumerative Combinatorics” books (especially Exercise 1.158 in Vol. 1 and Exercise 5.2 in Vol. 2). Carlitz’s theorem was originally published here.

10 thoughts on “Generalizations of Fermat’s Little Theorem and combinatorial zeta functions

      • I am glad you like it! There are many interesting facts I didn’t know about in this posting, so thanks so much for writing this up. This blog is so beautiful 🙂

        (Apologies if the following turns out to be nonsense or too trivial.) When you have a time to take a look at the poster or the preprint, I would like to ask you if you think that it is reasonable to look for some dynamics version of the Polya enumearation. That is, can we somehow relate the number of fixed points of an endomorphism f on X (say in the setting of Artin-Mazur) to the number of fixed points of the induced map on X^n/G for a subgroup G of S_n? This should be (in some sense) immediate if we have some version of Lefschetz trace formula for f, but it sounds like Artin-Mazur was dealing with much wilder f’s. For the special case G = S_n, this seems to be related to the Artin-Mazur zeta function, but I am not 100% sure.

  1. In the definition of necklace congruences, in the second sentence, I think you mean a_n \equiv a_{n/p} \pmod n for primes powers $n$? As it is written, it’s not true even for n = 6, a_n = a^n unless I made some calculation mistakes.

    Reply
  2. Sorry for the multiple comments…

    One way to view Mobius inversion is that if you have two Dirichlet series with coefficients a_n, b_n, then the product Dirichlet series has coefficients the convolution \sum_{d|n}a_db_{n/d}. Then, Mobius inversion just corresponds to dividing by the Riemann zeta function.

    This suggests that the Dirichlet series \sum a^nn^{-s} should be interesting, which is like a q-deformation of the Riemann zeta function (a = q). Unfortunately, this is always divergent for a > 1.

    Reply
  3. Pingback: Deformations of the Riemann zeta function – Math Solution

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s