Generalizations of Fermat’s Little Theorem and combinatorial zeta functions

Everyone who studies elementary number theory learns two different versions of Fermat’s Little Theorem:

Fermat’s Little Theorem, Version 1: If $p$ is prime and $a$ is an integer not divisible by $p$ , then $a^{p-1} \equiv 1 \pmod{p}$ .

Fermat’s Little Theorem, Version 2: If $p$ is prime and $a$ is any integer, then $a^{p} \equiv a \pmod{p}$ .

as well as the following extension of Version 1 of Fermat’s Little Theorem to arbitrary positive integers $n$ :

Euler’s Theorem: If $n$ is a positive integer and $(a,n)=1$ , then $a^{\phi(n)} \equiv 1 \pmod{n}$ , where $\phi$ is Euler’s totient function.

My first goal in this post is to explain a generalization of Version 2 of Fermat’s Little Theorem to arbitrary $n$ . I’ll then explain an extension of this result to $m \times m$ integer matrices, along with a slick proof of all of these results (and more) via “combinatorial zeta functions”.

The Möbius function and the Necklace Theorem

First of all, recall the definition of the Möbius function from elementary number theory:

Definition: For a positive integer $n$ , define $\mu(n) = 0$ if $n$ is not square-free. Otherwise, write $n=p_1\cdots p_k$ as a product of distinct primes and define $\mu(n) = (-1)^k$ .

The importance of the Möbius function is captured by the following result (analogous to the Fourier inversion formula):

Theorem (Möbius Inversion): If $f : {\mathbb N} \to {\mathbb C}$ is multiplicative, meaning that $f(ab)=f(a)f(b)$ whenever $(a,b)=1$ , and $g(n) = \sum_{d \mid n} f(d)$ for all $n$ , then $f(n) = \sum_{d \mid n} \mu(n/d) g(d)$ for all $n$ .

The following result, which generalizes Version 2 of Fermat’s Little Theorem, was proved by Gauss when $a$ is prime. A proof of the general case was published independently between 1880 and 1883 by Kantor, Weyr, Lucas, and Pellet. For lack of a better name (it’s usually called “Gauss’s congruence”, but let’s be honest, Gauss has enough named for him already) I’ll call this the

Necklace Theorem: If $n$ is a positive integer and $a$ is any integer, then $\sum_{d \mid n} \mu(n/d) a^d \equiv 0 \pmod{n}$ .

We will give three different proofs of this result. (For an explanation of the name “Necklace Theorem”, see this Wikipedia page or keep reading.)

Algebraic Proof of the Necklace Theorem

When $n=p^k$ is a prime power, the Necklace Theorem says that $a^{p^k} \equiv a^{p^{k-1}} \pmod{p^k}$ . This can be proved by induction on $k$ , with the base case $k=1$ being Fermat’s Little Theorem. For the inductive step, if $a^{p^k} \equiv a^{p^{k-1}} \pmod{p^k}$ then $a^{p^k} = a^{p^{k-1}} + p^k m$ for some integer $m$ . Raising both sides to the $p^{\rm th}$ power and using the fact that $p \mid \binom{p}{i}$ for $0 < i < p$ , the result follows.

The general case follows from the prime-power case using the Chinese Remainder Theorem. For example, when $n = 45$ the Necklace Theorem says that $a^{45} - a^{15} - a^9 + a^3 \equiv 0 \pmod{45}.$ By the $n=9$ case the left-hand side is $((a^5)^9 - (a^5)^3)+(a^9-a^3) \equiv 0 \pmod{9}$ , but the left-hand side is also $((a^9)^5 - a^9))+((a^3)^5-a^3) \equiv 0 \pmod{5}$ . We leave a formal proof of the general case along these lines as an exercise to the reader.

For future reference, we make the following definition:

Definition: A sequence of integers $a_n$ for $n=1,2,\ldots$ satisfies the necklace congruences if $\sum_{d \mid n} \mu(n/d) a_d \equiv 0 \pmod{n}$ for all $n$ . Equivalently, by the Chinese Remainder Theorem, this means that $a_n \equiv a_{n/p} \pmod{n}$ whenever $n=p^k$ for some prime $p$ .

With this terminology, the Necklace Theorem is the assertion that for every integer $a$ , the sequence $a_n = a^n$ satisfies the necklace congruences.

Dynamical Proof of the Necklace Theorem

If $X$ is a set and $f : X \to X$ is any function, let $f^d$ denote the $d^{\rm th}$ iterate $f \circ f \circ \cdots \circ f$ ( $d$ times) and let ${\rm Fix}(f)$ denote the number of fixed points of $f$ . We assume:

( $\star$ ) $f^d$ has only finitely many fixed points for all positive integers $d$ .

Condition ( $\star$ ) is automatic, of course, if $X$ is a finite set.

An $n$ -cycle for $f$ is an orbit of size $n$ for $f$ (acting on $X$ by iteration).

Lemma: The number of $n$ -cycles for $f$ is equal to $\frac{1}{n} \sum_{d\mid n} \mu(n/d) |{\rm Fix}(f^d)|.$

Proof: Let $C_d(f)$ denote the number of $d$ -cycles for $f$ . A point of $X$ is fixed by $f^n$ if and only if it belongs to a $d$ -cycle for some $d \mid n$ , and each such $d$ -cycle has $d$ elements. Thus $|{\rm Fix}(f^n)| = \sum_{d\mid n} d C_d(f).$ The result now follows by Möbius inversion.

Corollary: The sequence $a_n =|{\rm Fix}(f^n)|$ satisfies the necklace congruences for every dynamical system $f : X \to X$ satisfying ( $\star$ ).

Proof: Since $\frac{1}{n} \sum_{d\mid n} \mu(n/d) |{\rm Fix}(f^d)|$ is equal to the number of $n$ -cycles for $f$ , it must in particular be an integer. Therefore $\sum_{d\mid n} \mu(n/d) a_d \equiv 0 \pmod{n}$ for all $n$ .

Using the Corollary, we can give a “dynamical proof” of the Necklace Theorem in the special case where $a$ is a positive integer (which immediately implies the general case if we take the $a_n \equiv a_{n/p} \pmod{n}$ characterization of the necklace congruences). It suffices to construct a dynamical system for which the number of fixed points of $f^n$ is equal to $a^n$ for all $n\geq 1$ .

Let $N = {\rm LCM}(1,\ldots,n)$ and let $X$ be the set of all functions $\nu : {\mathbb Z} / N{\mathbb Z} \to [a]$ , where $[a] = \{ 1,\ldots,a \}$ . Let $f$ be the shift map which takes $\nu(x)$ to $\nu(x+1)$ . (We can think of elements of $X$ as circular necklaces comprised of $N$ labelled beads, each of which is colored using $a$ possible colors, and of $f$ as a single-bead rotation.) For $d \mid n$ , a necklace $\nu$ is fixed by $f^d$ iff $\nu(i) = \nu(i+d)$ for all $i \in {\mathbb Z} / N{\mathbb Z}$ , which means that the beads $\{ i, i+d, i+2d,\ldots \}$ must all have the same color for $i=0,1\ldots,d-1$ . Since there are $a^d$ such necklaces, the result follows.

Image: https://mathlesstraveled.com/2017/12/12/fermats-little-theorem-proof-by-necklaces/

Interlude: Varieties over Finite Fields

(If you don’t know any algebraic geometry, feel free to skip to the next section now!)

Any number theorist worth his/her salt who sees a sequence of the form $a_n =|{\rm Fix}(f^n)|$ will immediately think of varieties over finite fields. If $X$ is a smooth projective algebraic variety over a finite field ${\mathbb F}_q$ (identified, for simplicity, with its points over an algebraic closure of ${\mathbb F}_q$ ) and $f : X \to X$ is the Frobenius map, we have ${\rm Fix}(f^n) = X({\mathbb F}_{q^n})$ and thus $a_n = |X({\mathbb F}_{q^n})|$ , which is the subject of the Weil Conjectures (now theorems!).

The Weil Conjectures concern the Hasse-Weil zeta function of $X$ , which is defined as a formal power series by the formula $\zeta_X(t) = {\rm exp}(\sum_{n\geq 1} a_n t^n / n).$ Among the deep facts comprising the Weil Conjectures is the assertion (first proved by Dwork) that $\zeta_X(t)$ is a rational function. If we write $\zeta_X(t)=1 + \sum_{n \geq 1} c_n t^n$ , then as observed by Weil the coefficients $c_n$ are integers. In fact, $c_n = |{\rm Sym}^n(X)({\mathbb F}_q)|$ is the number of effective divisors of degree $n$ on $X$ which are defined over ${\mathbb F}_q$ . As we will see in a moment, the fact that $c_n \in {\mathbb Z}$ for all $n$ is intimately related to the fact (established above) that the sequence $a_n$ satisfies the necklace congruences.

Combinatorial zeta functions

Definition: Let $a = \{ a_n \}$ be a sequence of integers. The combinatorial zeta function of $a$ is the formal power series $Z_a(t)=1 + \sum_{n \geq 1} c_n t^n$ defined by $\zeta_a(t) = {\rm exp}(\sum_{n\geq 1} a_n t^n / n).$

The definition of the combinatorial zeta function is motivated by the Girard-Newton identities relating the elementary symmetric functions of $a_1,\ldots,a_d$ to the powers sums $a_1^n + \cdots + a_d^n$ . More precisely, if we write $h(x) = \prod_{i=1}^d (x - a_i) = x^d + \sum_{i=1}^d (-1)^i e_i x^{d-i}$ and let $p_n = \sum_{i=1}^d a_i^n$ then each $e_i$ is a polynomial in the $p_n$ ‘s with rational coefficients (and vice-versa). The combinatorial zeta function gives a simple way to see this (and derive the explicit formulas).

Indeed, we can write $h(x) = x^d f(1/x)$ with $f(x) = 1+\sum_{i=1}^d (-1)^i e_i x^i = \prod_{i=1}^d (1-a_i x)$ . The right-hand side can be rewritten as ${\rm exp}(\log \prod_{i=1}^d (1-a_i x))= \sum_{j=0}^\infty (-\sum_{i=1}^\infty \frac{p_i}{i} x^i)^j/j!$ Expanding and equating coefficients of $x^n$ on both sides of $1+\sum_{i=1}^d (-1)^i e_i x^i = \sum_{j=0}^\infty (-\sum_{i=1}^\infty \frac{p_i}{i} x^i)^j/j!$ gives a rational polynomial expression for the $e_i$ ‘s in terms of the $p_i$ ‘s.

This argument shows, in particular:

Lemma: The combinatorial zeta function of the sequence $-p_n$ is in fact a polynomial (the reciprocal polynomial of $h(x)$ ).

Carlitz’s Theorem

One of the most remarkable facts about the combinatorial zeta function is the following result:

Theorem (Carlitz): We have $c_n \in {\mathbb Z}$ for all $n$ if and only if $a_n$ satisfies the necklace congruences.

Proof: For any formal power series $f(t) = 1 + \sum_{n \geq 1} c_n t^n$ , with rational coefficients, we can recursively define a unique sequence of rational numbers $b_n$ such that $f(t)$ has the “Euler product” expansion $f(t) = \prod_{n \geq 1} (1-t^n)^{-b_n}.$ Indeed, if we write $g(t) = \log f(t)= \sum_{n \geq 1} g_n t^n$ then $g(t) = \sum_{i \geq 1} \sum_{j \geq 1} \frac{b_i t^{ij}}{j} = \sum_{n \geq 1} \frac{t^n}{n} \sum_{d \mid n} d b_d.$ Therefore $n g_n = \sum_{d \mid n} d b_d$ and by Möbius inversion $b_n = \frac{1}{n} \sum_{d \mid n} d g_d \mu(n/d).$

Applying this to $f(t) = \zeta_a(t)$ , we have $g_n = a_n / n$ and therefore $b_n = \frac{1}{n} \sum_{d \mid n} a_d \mu(n/d),$ which is an integer for all $n$ iff $a_n$ satisfies the necklace congruences. But from $\sum_{n \geq 1} c_n t^n = \prod_{n \geq 1} (1-t^n)^{-b_n}$ we easily deduce that $c_n \in {\mathbb Z}$ iff $b_n \in {\mathbb Z}$ for all $n$ , and the result follows.

Combinatorial Proof of the Necklace Theorem

Using Carlitz’s result, the Necklace Theorem follows in a completely formal way. Indeed, if $a_n = a^n$ then $\zeta_a(t) = {\rm exp}(\sum_{n\geq 1} a^n t^n / n)$ is equal to ${\rm exp}(-\log (1 - at)) = \frac{1}{1-at} = 1 + \sum_{n\geq 1} a^n t^n$ , which has integer coefficients.

Exercise: Use Carlitz’s theorem to show that if $p \mid m$ is prime then then we have the polynomial congruence $(1+x)^m \equiv (1+x^p)^{m/p} \pmod{p^{{\rm ord}_p(m)}}.$

Exercise (Harder): Show that the Legendre polynomials $P_n(t)$ , whose coefficients are rational numbers with only powers of 2 in their denominators, satisfy the congruences $P_m(t) \equiv P_{m/p}(t^p) \pmod{p^{{\rm ord}_p(m)}}$ for every odd prime $p$ .

A Second Interlude

Any number theorist worth his/her salt who sees a sequence of the form $a_n =|{\rm Fix}(f^n)|$ should also think of the Grothendieck-Lefschetz Trace Formula, which computes the number of fixed points of certain maps in terms of the traces of certain linear operators. From this point of view, it’s natural to ask (in light of the above discussion) whether the traces of $A^n$ satisfy the necklace congruences when $A$ is a square integer matrix. The answer, as we are about to see, is yes.

Fermat’s Little Theorem for Matrices

Theorem (Jänichen): If $A$ is an $m \times m$ integer matrix, the sequence $a_n = {\rm trace}(A^n)$ satisfies the necklace congruences. In particular, ${\rm trace}(A^p) \equiv {\rm trace}(A) \pmod{p}$ for all prime numbers $p$ .

This result (which reduces to Fermat’s Little Theorem when $m=1$ and $n = p$ is prime) was first proved by Jänichen in 1921, and the theorem was independently rediscovered by many different authors, including Schur in 1937. An algebraic proof similar to the one given above was published by Mazur and Petrenko in 2010, and a dynamical proof using generalized necklaces was given by C. J. Smyth in 1986. For a survey of the history of all of these different proofs, see this paper by Zarelua.

Here is a simple combinatorial proof using Carlitz’s theorem and the matrix identity ${\rm exp}({\rm trace}(A)) = {\rm det}({\rm exp}(A))$ :

Proof: Let $a_n = {\rm trace}(A^n).$ Then $\zeta_a(t) = {\rm exp}(\sum_{n\geq 1} {\rm trace}(A^n)t^n / n)$ is equal to ${\rm exp}(- {\rm trace}(\log (1 - At)) = 1/{\rm det}(1-At).$ Since the right-hand side is equal to $\frac{1}{\prod (1-\lambda_i t)}$ , where $\lambda_1,\ldots,\lambda_m$ are the algebraic integer eigenvalues of $A$ , the theory of symmetric functions shows that it’s a power series with integer coefficients.

Exercise: Use Jänichen’s theorem to show that the Lucas sequence defined by $L_0=2, L_1=1, L_n = L_{n-1}+L_{n-2}$ for $n \geq 2$ satisfies the necklace congruences.

Exercise: [American Math Monthly Problem E2992 (1986), proposed by Michael Larsen as a graduate student at Harvard] If $\alpha_1,\ldots,\alpha_d$ are complex numbers, show that $\alpha_1^n + \cdots + \alpha_d^n$ is an integer for all $n \geq 1$ if and only if $\alpha_1,\ldots,\alpha_d$ are the roots of a monic integer polynomial of degree $d$ .

Concluding remarks

(1) As discussed in Section 4.7 of Wilf’s book “generatingfunctionology”, another application of combinatorial zeta functions is that if $x_1,x_2,\ldots$ are indeterminates, the coefficient of $x_1^{k_1} \cdots x_m^{k_m} t^n$ in the zeta function ${\rm exp}(\sum_{n\geq 1} x_n t^n / n)$ , where $k_1 + \cdots + k_m = n$ , is the probability that a random permutation on $n$ letters has cycle type $(k_1,\ldots,k_m)$ . Using this, one can prove wild-looking results such as:

Theorem (cf. Section 4.8 of Wilf): The probability that a random permutation in $S_n$ has a square root in $S_n$ is the coefficient of $t^n$ in $\sqrt{\frac{1+t}{1-t}} \prod_{m \geq 1} {\rm cosh}(\frac{t^{2m}}{2m}).$

(2) When $a_n$ is the number of isolated fixed points of $f : X \to X$ for some continuous endomorphism of a topological space $X$ , the corresponding combinatorial zeta function is known as the Artin-Mazur zeta function $\zeta_f(t)$ of $f$ . In their paper, Artin and Mazur proved that if $X$ is a compact differentiable manifold, then for every $k$ there is a dense subset $E_k$ of the space of $C^k$ -diffeomorphisms of $X$ such that if $f \in E_k$ then the formal power series $\zeta_f(t)$ has a nonzero radius of convergence in ${\mathbb C}$ . (This is equivalent to saying that the number of isolated periodic points of $f$ of period $n$ grows at most exponentially in $n$ .)

(3) Dwork’s proof of the rationality of the zeta function for varieties over finite fields is based on the necklace congruences, an infinite-dimensional analogue of Jänichen’s Theorem, and the rationality criterion for integer power series discussed in this blog post.

(4) Dick Lipton has some interesting speculations relating Jänichen’s Theorem to the computational hardness of integer factorization in this blog post.

(5) The polynomials $M(\alpha,n)= \frac{1}{n} \sum_{d \mid n} \mu_{n/d} \alpha^d$ in the variable $\alpha$ which appear in the definition of necklace congruences are called necklace polynomials, and they appear in many different mathematical guises. For example, $M(q,n)$ is the number of monic irreducible polynomials of degree $n$ over the finite field ${\mathbb F}_q$ .

(6) Some references for the material in this post include this Math Overflow post, Chapter 7 of Alain Robert’s book “A Course in p-adic Analysis”, and Richard Stanley’s “Enumerative Combinatorics” books (especially Exercise 1.158 in Vol. 1 and Exercise 5.2 in Vol. 2). Carlitz’s theorem was originally published here.

10 thoughts on “Generalizations of Fermat’s Little Theorem and combinatorial zeta functions”

Gilyoung Cheong says:

May 26, 2020 at 6:48 pm

I have been obsessed with combinatorial ways to think about zeta series as well! The following is the link to a poster that explains some parts of my latest preprint regarding this aspect, building up on various people’s works (most importantly, Polya, Macdonald, and Vakil):

Click to access Cheong%20–%20Polya%20poster.pdf

- Matt Baker says:
  
  May 26, 2020 at 6:59 pm
  
  Thanks for the link! That looks really interesting.
  
  - Gilyoung Cheong says:
    
    May 26, 2020 at 8:18 pm
    
    I am glad you like it! There are many interesting facts I didn’t know about in this posting, so thanks so much for writing this up. This blog is so beautiful 🙂
    
    (Apologies if the following turns out to be nonsense or too trivial.) When you have a time to take a look at the poster or the preprint, I would like to ask you if you think that it is reasonable to look for some dynamics version of the Polya enumearation. That is, can we somehow relate the number of fixed points of an endomorphism f on X (say in the setting of Artin-Mazur) to the number of fixed points of the induced map on X^n/G for a subgroup G of S_n? This should be (in some sense) immediate if we have some version of Lefschetz trace formula for f, but it sounds like Artin-Mazur was dealing with much wilder f’s. For the special case G = S_n, this seems to be related to the Artin-Mazur zeta function, but I am not 100% sure.
Thomás says:

May 26, 2020 at 8:49 pm

There are also some q-versions of the necklace congruences: https://arxiv.org/pdf/1805.01254.pdf

- Matt Baker says:
  
  May 26, 2020 at 9:00 pm
  
  Aha, I didn’t know about that – thanks for the reference!
  
Asvin G. says:

May 27, 2020 at 7:36 am

In the definition of necklace congruences, in the second sentence, I think you mean $a_n \equiv a_{n/p} \pmod n$ for primes powers $n$? As it is written, it’s not true even for $n = 6, a_n = a^n$ unless I made some calculation mistakes.

- Matt Baker says:
  
  May 27, 2020 at 11:13 am
  
  Oops, you’re totally right. That’s what I meant to say… fixed now. Thanks!
  
Asvin G. says:

May 27, 2020 at 7:50 am

Here’s another dynamical proof: Let $f_a$ be multiplication by $a: \mathbb Q/\mathbb Z \to \mathbb Q/\mathbb Z$ . Then $f_a^d$ has exactly $a^d$ fixed points.

Asvin G. says:

May 27, 2020 at 8:45 am

Sorry for the multiple comments…

One way to view Mobius inversion is that if you have two Dirichlet series with coefficients $a_n, b_n$ , then the product Dirichlet series has coefficients the convolution $\sum_{d|n}a_db_{n/d}$ . Then, Mobius inversion just corresponds to dividing by the Riemann zeta function.

This suggests that the Dirichlet series $\sum a^nn^{-s}$ should be interesting, which is like a q-deformation of the Riemann zeta function (a = q). Unfortunately, this is always divergent for $a > 1$ .

Pingback: Deformations of the Riemann zeta function – Math Solution