# A p-adic proof that pi is transcendental

Ferdinand von Lindemann

In my last blog post, I discussed a simple proof of the fact that pi is irrational.  That pi is in fact transcendental was first proved in 1882 by Ferdinand von Lindemann, who showed that if $\alpha$ is a nonzero complex number and $e^\alpha$ is algebraic, then $\alpha$ must be transcendental.  Since $e^{i \pi} = -1$ is algebraic, this suffices to establish the transcendence of $\pi$ (and setting $\alpha = 1$ it shows that $e$ is transcendental as well).  Karl Weierstrass proved an important generalization of Lindemann’s theorem in 1885.

The proof by Lindemann that pi is transcendental is one of the crowning achievements of 19th century mathematics.  In this post, I would like to explain a remarkable 20th century proof of the Lindemann-Weierstrass theorem due to Bezivin and Robba [Annals of Mathematics Vol. 129, No. 1 (Jan. 1989), pp. 151-160], which uses p-adic analysis in a key way.  Their original argument was made substantially more elementary by Beukers in this paper; we refer the reader to [American Mathematical Monthly Vol. 97 Issue 3 (Mar. 1990), pp. 193-197] for a lovely exposition of the resulting proof, which rivals any of the usual approaches in its simplicity.  But I’d like to focus here on the original Bezivin-Robba proof, which deserves to be much better known than it is.  In the concluding remarks, we will briefly discuss a 21st century theorem of Bost and Chambert-Loir that situates the Bezivin-Robba approach within a much broader mathematical framework.

An equivalent assertion

Let $\overline{{\mathbb Q}}$ be the subfield of ${\mathbb C}$ consisting of all complex numbers which are algebraic (over ${\mathbb Q}$).  The Lindemann-Weierstrass theorem is the following statement:

(L-W) Let $\alpha_1,\ldots,\alpha_m \in \overline{{\mathbb Q}}$ be distinct algebraic numbers.  Then $e^{\alpha_1},\ldots,e^{\alpha_m}$ are linearly independent over $\overline{{\mathbb Q}}$.

A relatively simple argument shows that (L-W) is equivalent to a rather different-looking assertion about formal power series which are represented by rational functions.

It will be convenient to work with power series expansions around infinity rather than zero.  Recall that a function $f : {\mathbb C} \to {\mathbb C}$ is analytic at $\infty$ if the function $g(w) = f(1/w)$ is analytic at $w=0$.  If $g(w)=b_0 + b_1 w + b_2 w^2 + \cdots$ is the power series expansion for $g(w)$ around $0$, we call

$f(z) = b_0 + b_1 \frac{1}{z} + b_2 \frac{1}{z^2} + \cdots$

the power series expansion for $f(z)$ around $z=\infty$.  We will be particularly interested in functions $f(z)$ for which $b_0 = 0$ (i.e., which vanish at infinity).

We say that a formal power series $v(x) \in {\mathbb C}[[\frac{1}{x}]]$ is analytic at $\infty$ if the power series $u(w) := v(1/w)$ has a nonzero radius of convergence around $w=0$.  And by abuse of terminology, we say that $v(x)$ is a rational function if there are polynomials $P(x),Q(x)$ with $Q(x)$ not identically zero such that the power series expansion of $f(x)=\frac{P(x)}{Q(x)}$ around $x=\infty$ is equal to $v(x)$.  A rational function vanishes at infinity if and only if ${\rm deg}(P) < {\rm deg}(Q)$.

Let $K$ be a field, and let ${\mathcal F}_K := \frac{1}{x}K[[\frac{1}{x}]]$ be the ring of formal power series over $K$ in $\frac{1}{x}$ which vanish at infinity.  Let ${\mathcal D} : {\mathcal F}_{\mathbb C} \to {\mathcal F}_{\mathbb C}$ be the differential operator ${\mathcal D}(v) = v + v'$.  We will show that (L-W) is equivalent to the following statement:

(B-R) If $v \in {\mathcal F}_{\mathbb Q}$ is analytic at infinity and ${\mathcal D}(v)$ is a rational function, then $v$ is also a rational function.

Note that the conclusion of (B-R) can fail for functions with an essential singularity at infinity; for example, ${\mathcal D}(e^{-x}) = 0$ but $e^{-x}$ is not a rational function.

Proof of the equivalence

The proof that (L-W) and (B-R) are equivalent is based on properties of the Laplace transform.  Define the formal Laplace transform ${\mathcal L} : {\mathbb C}[[z]] \to {\mathcal F}_{\mathbb C}$ by

${\mathcal L}(\sum_{n=0}^\infty a_n z^n) = \sum_{n=0}^\infty \frac{n! a_n}{x^{n+1}}.$

(This is just the extension of the usual Laplace transform to the setting of formal power series.)  The map ${\mathcal L} : {\mathbb C}[[z]] \to {\mathcal F}_{\mathbb C}$ is clearly a bijection.
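For readers who like to experiment, here is a minimal sketch of the formal Laplace transform acting on truncated coefficient lists. The helper names (`laplace`, `times_z`, `ddx`, `exp_series`) and the truncation length are illustrative choices, not anything from the Bezivin-Robba paper. It checks two standard facts on an example: the transform of $e^{\alpha z}$ has coefficients $\alpha^n$ (the expansion of $\frac{1}{x-\alpha}$ around infinity), and multiplication by $z$ corresponds to $-\frac{d}{dx}$ on the transform side.

```python
from fractions import Fraction
from math import factorial

def laplace(a):
    """Formal Laplace transform on truncated coefficient lists.

    Input  a[n] = coefficient of z^n.
    Output c[n] = coefficient of 1/x^(n+1), namely n! * a[n].
    """
    return [factorial(n) * Fraction(a_n) for n, a_n in enumerate(a)]

def times_z(a):
    """Coefficients of z*f(z), truncated to the same length as a."""
    return [Fraction(0)] + [Fraction(a_n) for a_n in a[:-1]]

def ddx(c):
    """d/dx of sum c[n]/x^(n+1): the coefficient of 1/x^(n+1) is -n*c[n-1]."""
    return [Fraction(0)] + [-n * c[n - 1] for n in range(1, len(c))]

def exp_series(alpha, terms):
    """Taylor coefficients alpha^n / n! of e^(alpha z)."""
    return [Fraction(alpha) ** n / factorial(n) for n in range(terms)]

TERMS = 10                 # truncation length (arbitrary)
alpha = Fraction(3, 2)     # sample exponent (arbitrary)
f = exp_series(alpha, TERMS)

# Expansion of 1/(x - alpha) at infinity: sum alpha^n / x^(n+1).
geometric = [alpha ** n for n in range(TERMS)]
# Negative of the derivative of L(f), for comparison with L(z f).
minus_derivative = [-c for c in ddx(laplace(f))]

print(laplace(f) == geometric)
print(laplace(times_z(f)) == minus_derivative)
```

Exact rational arithmetic is used so that the comparisons are genuine equalities of truncated series rather than floating-point approximations.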

We will make use of the following standard facts from complex analysis:

(L1) $f(z) \in {\mathbb C}[[z]]$ defines an entire function of exponential growth (i.e. $|f(z)| \leq C_1 e^{C_2 |z|}$ for some $C_1, C_2$) if and only if ${\mathcal L}(f)$ is analytic at infinity.

(L2) $f(z) \in {\mathbb C}[[z]]$ is the power series expansion around $z=0$ of an exponential polynomial $p_1(z)e^{a_1 z} + \cdots + p_n(z)e^{a_n z}$ if and only if ${\mathcal L}(f)$ is a rational function.  This gives a bijection between exponential polynomials and rational functions vanishing at infinity.

The proof of (L2), which is based on the partial fractions decomposition of rational functions and the fact that ${\mathcal L}(e^{az}) = \frac{1}{x-a}$, shows that $p_i(z) \in \overline{{\mathbb Q}}[z]$ and $a_i \in \overline{{\mathbb Q}}$ for all $i$ if and only if ${\mathcal L}(f) \in \overline{{\mathbb Q}}(x).$

We will also need the following lemma, whose proof we leave as an exercise:

Lemma: Define $\delta : {\mathbb C}[[z]] \to {\mathbb C}[[z]]$ by $\delta(f(z)) = (1-z)f(z)$, and let ${\mathcal D} : {\mathcal F}_{\mathbb C} \to {\mathcal F}_{\mathbb C}$ be as above.  Then $\delta$ and ${\mathcal D}$ are bijections, and ${\mathcal D}({\mathcal L}(f))= {\mathcal L}(\delta(f)).$

To see that (L-W) implies (B-R), suppose $v \in {\mathcal F}_{\mathbb Q}$ is analytic at infinity and ${\mathcal D}(v)$ is a rational function.  By (L2), there is an exponential polynomial $f(z)= \sum p_i(z) e^{\alpha_i z}$ with the $\alpha_i$ distinct algebraic numbers and $p_i(z) \in \overline{{\mathbb Q}}[z]$ such that ${\mathcal L}(f) = {\mathcal D}(v)$.  The function $g(z) := \frac{f(z)}{1-z}$ satisfies $\delta(g(z)) = f(z)$, so by the Lemma we have ${\mathcal D}({\mathcal L}(g)) = {\mathcal L}(f) = {\mathcal D}(v)$, and hence ${\mathcal L}(g)=v$ because ${\mathcal D}$ is injective.  As $v$ is analytic at infinity, we know by (L1) that $g$ is entire, and hence $f(1)=0$.  By (L-W), we must have $p_i(1)=0$, i.e. $(z-1) \mid p_i(z)$, for all $i$.  Thus $g(z)$ is also an exponential polynomial, which implies by (L2) that $v(x)$ is a rational function.

To see that (B-R) implies (L-W), assume for the sake of contradiction that $f(1)=0$, where $f(z) := \sum_{i=1}^m \beta_i e^{\alpha_i z}$, the $\alpha_i$ are distinct and algebraic, and the $\beta_i$ are algebraic and nonzero. Replacing $f(z)$ by the product of its Galois conjugates $\sum_{i=1}^m \sigma(\beta_i) e^{\sigma(\alpha_i) z}$, we may assume without loss of generality that the power series expansion of $f(z)$ lies in ${\mathbb Q}[[z]]$.  (This is a standard reduction which appears in many proofs of (L-W).)  The Laplace transform of $f(z)$ is

${\mathcal L}(f) = \sum_{i=1}^m \frac{\beta_i}{x-\alpha_i},$

which has only simple poles. Moreover, since the $\alpha_i$ are distinct and $f(1)=0$ we must have $m \geq 2$ and some $\alpha_i$ is non-zero; thus ${\mathcal L}(f)$ has at least one simple pole. On the other hand, since $f(1)=0$, the function $g(z) := \frac{f(z)}{1-z}$ is entire and of exponential growth, and its power series expansion lies in ${\mathbb Q}[[z]]$, so by (L1) $v := {\mathcal L}(g) \in {\mathcal F}_{\mathbb Q}$ is analytic at infinity.  The Lemma tells us that ${\mathcal L}(f) = {\mathcal D}(v)$, so ${\mathcal D}(v)$ has only simple poles.  However, it is easy to see that if $u$ is a rational function then ${\mathcal D}(u) = u + u'$ can never have a simple pole: at a pole of $u$ of order $k \geq 1$, the derivative $u'$ has a pole of order $k+1$.  Thus $v$ is not a rational function, contradicting (B-R).

Rationality of formal power series

In order to prove (B-R), we need to show that if $v \in {\mathcal F}_{\mathbb Q}$ is analytic at infinity and ${\mathcal D}(v)$ is a rational function, then $v$ is also a rational function.  For this, we need some kind of robust criterion for determining whether a formal power series with coefficients in ${\mathbb Q}$ represents a rational function.  There is a long history of such results culminating in what one might call the Borel-Polya-Dwork-Bertrandias criterion, which will turn out to be exactly what we need.  We interrupt our regularly scheduled proof to give a brief history of these developments.

Borel

Around 1894, Emile Borel noticed that if $f(z)=\sum_{n=0}^\infty a_n z^n$ is a power series with integer coefficients defining an analytic function on a closed disc of radius $R > 1$ in ${\mathbb C}$, then $f(z)$ must in fact be a polynomial.  This is a simple consequence of Cauchy’s integral formula, which shows that if $|f| \leq M$ on the disc then $|a_n| \leq \frac{M}{R^n}$ for all $n$.  Since $R > 1$, the right-hand side tends to zero, and since the $a_n$ are assumed to be integers, the inequality implies that $a_n = 0$ for all sufficiently large $n$.

Borel extended this argument to show:

Theorem (Borel): If $f(z)=\sum_{n=0}^\infty a_n z^n$ is the power series expansion around $z=0$ of a meromorphic function on a closed disc of radius $R > 1$ in ${\mathbb C}$, and the coefficients $a_n$ are all integers, then $f(z)$ is a rational function.

The proof is based on the following well-known characterization of rational functions, whose proof we omit (see Lemma 9 in this blog post by Terry Tao):

Lemma (Kronecker): Let ${\mathbf a} = \{ a_n \}_{n \geq 0}$ be a sequence of complex numbers.  Then the following are equivalent:

(R1) $f(z) =\sum_{n=0}^\infty a_n z^n$ represents a rational function.

(R2) The Kronecker-Hankel determinant

$K_N({\mathbf a}) = \begin{vmatrix} a_0 & a_1 & a_2 & \dots & a_N \\ a_1 & a_2 & a_3 & \dots & a_{N+1} \\ \hdotsfor{5} \\ a_N & a_{N+1} & a_{N+2} & \dots & a_{2N} \end{vmatrix}$

is zero for $N$ sufficiently large.

The idea behind the proof of the more general result of Borel is to use the above Cauchy estimate (applied to the product of $f(z)$ with some polynomial), together with standard facts about determinants, to show that if $f$ is meromorphic on a closed disc of radius $R > 1$ then $K_N({\mathbf a}) \to 0$ as $N \to \infty$.  If the $a_n$ are all integers, this forces $K_N({\mathbf a}) = 0$ for $N$ sufficiently large.
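To make Kronecker's criterion concrete, here is a small sketch that computes Kronecker-Hankel determinants exactly for two integer sequences. The function names and the choice of examples are mine: the Fibonacci numbers (coefficients of the rational function $\frac{z}{1-z-z^2}$) and the central binomial coefficients (coefficients of $(1-4z)^{-1/2}$, which is algebraic but not rational). The determinant is computed by Gaussian elimination over the rationals, which is just one convenient way to get exact values.

```python
from fractions import Fraction
from math import comb

def hankel_det(a, N):
    """Exact determinant of the (N+1) x (N+1) matrix (a[i+j]) for 0 <= i, j <= N."""
    m = [[Fraction(a[i + j]) for j in range(N + 1)] for i in range(N + 1)]
    det = Fraction(1)
    for col in range(N + 1):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, N + 1) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, N + 1):
            factor = m[r][col] / m[col][col]
            for c in range(col, N + 1):
                m[r][c] -= factor * m[col][c]
    return det

def fibonacci(count):
    """First `count` Fibonacci numbers, starting 0, 1, 1, 2, ..."""
    a, b, out = 0, 1, []
    for _ in range(count):
        out.append(a)
        a, b = b, a + b
    return out

fib = fibonacci(20)                            # coefficients of z / (1 - z - z^2)
central = [comb(2 * n, n) for n in range(20)]  # coefficients of (1 - 4z)^(-1/2)

fib_dets = [hankel_det(fib, N) for N in range(6)]
central_dets = [hankel_det(central, N) for N in range(6)]
print(fib_dets)
print(central_dets)
```

For the Fibonacci sequence the determinants vanish from $N = 2$ onward, because each row of the matrix is the sum of the two preceding rows; for the central binomial coefficients the computed determinants stay nonzero in the range tested.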

Polya

Around 1916, George Polya realized that the proof of Borel’s theorem via Kronecker-Hankel determinants could be generalized by replacing the radius of convergence with the transfinite diameter of the region of convergence.

The transfinite diameter is a measure of the size of a set which generalizes the radius of a disc.  It has many uses in complex analysis and potential theory (as well as in number theory).  The diameter of a bounded set $A$ in some metric space $X$ is the maximum distance between two points of $A$, and one can generalize this to the $N^{\rm th}$ diameter $\delta_N(A)$, which by definition is the supremum over all $N$-tuples $(z_1,\ldots,z_N) \in A^N$ of the geometric mean of the pairwise distances between the $z_i$:

$\delta_N(A) = \sup_{z_1,\ldots,z_N \in A} \left( \prod_{i \neq j} |z_i - z_j| \right)^{\frac{1}{N(N-1)}}.$

It turns out that $\{ \delta_N \}_{N \geq 2}$ forms a monotonically decreasing sequence and thus one can define the transfinite diameter

$\delta_\infty(A) := \lim_{N \to \infty} \delta_N(A).$

The transfinite diameter of a disc in any algebraically closed normed field (e.g. ${\mathbb C}$) is its radius, and the transfinite diameter of a real line segment is one-quarter of its length.
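As a numerical illustration (a sketch, not a computation of the supremum itself), one can evaluate the geometric mean of pairwise distances for the $N$-th roots of unity on the unit circle. For this configuration the product $\prod_{i \neq j} |z_i - z_j|$ equals $N^N$, so the geometric mean is $N^{1/(N-1)}$, which decreases to the radius $1$ as $N$ grows. That the roots of unity are extremal for the closed unit disc is a classical fact that I am taking for granted here; the code only evaluates the formula at these points, and the function names are illustrative.

```python
import cmath
import math

def geometric_mean_distance(points):
    """Geometric mean of |z_i - z_j| over ordered pairs i != j, computed via logarithms."""
    n = len(points)
    log_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                log_sum += math.log(abs(points[i] - points[j]))
    return math.exp(log_sum / (n * (n - 1)))

def roots_of_unity(n):
    """The n-th roots of unity as complex numbers."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

values = {n: geometric_mean_distance(roots_of_unity(n)) for n in (2, 5, 20, 80)}
for n, value in values.items():
    print(n, value, n ** (1 / (n - 1)))
```

Working with logarithms avoids overflow or underflow in the product when $N$ is large.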

It will be convenient for the statement of Polya’s theorem, and for our application to the Lindemann-Weierstrass theorem, to work with $g(z) = \frac{1}{z} f(\frac{1}{z})$ instead of $f(z)$ in Borel’s theorem, and to study the transfinite diameter of the complement of the region of convergence.

Theorem (Polya): If $g(z)=\sum_{n=0}^\infty \frac{a_n}{z^{n+1}}$ is a power series with integer coefficients which can be continued to a meromorphic function on the complement of a bounded set $A \subset {\mathbb C}$ containing $0$ with $\delta_\infty(A) < 1$, then $g(z)$ is a rational function.

The condition $\delta_\infty(A) < 1$ in Polya’s theorem is sharp: the series $g(z) = \sum_{n=0}^\infty \binom{2n}{n} \frac{1}{z^{n+1}}$ has integer coefficients and can be extended to the analytic function $\frac{1}{z}\left(1 - \frac{4}{z}\right)^{-1/2}$ on the complement of the real segment $[0,4]$, which has transfinite diameter equal to 1.  However, this function is not a rational function.

Dwork

Bernard Dwork noticed around 1960 that Borel’s theorem has a $p$-adic analogue, and this observation is a key ingredient in Dwork’s famous proof of Weil’s conjecture that the zeta function of an algebraic variety over a finite field is a rational function.  Dwork realized, in fact, that one could deduce both Borel’s theorem and its $p$-adic analogue from the following global result.  (For the statement, we let ${\mathbb C}_v$ denote the completion of an algebraic closure of the $v$-adic completion of ${\mathbb Q}$.  For $v = \infty$ this is just ${\mathbb C}$; for $v$ corresponding to a prime number $p$ it is a p-adic analogue of the complex numbers.)

Theorem (Dwork): Suppose $f(z)=\sum_{n=0}^\infty a_n z^n$ is a power series with rational coefficients. Let $S$ be a finite set of places of ${\mathbb Q}$, containing the infinite place, such that:

(D1) For $p \not\in S$, $|a_n|_p\leq 1$ for all $n \geq 0$ (i.e., $a_n$ is a $p$-adic integer).

(D2) For $v \in S$, $f(z)$ extends to a meromorphic function on a disc $D_v$ of radius $R_v$ in ${\mathbb C}_v$ and $\prod_{v \in S} R_v > 1$.

Then $f(z)$ is a rational function.

The proof of Dwork’s theorem in the special case where $f$ is analytic (rather than just meromorphic) in each $D_v$ is not difficult.  In this case, for $v \in S$ corresponding to a prime number $p$, the $p$-adic convergence of $f$ on $D_p$ means that $|a_n|_p R_p^n \to 0$ as $n \to \infty$.  This implies that there is a constant $M_p$ such that $|a_n|_p \leq \frac{M_p}{R_p^{n+1}}$ for all $n$.  And as above, the Cauchy estimate implies that $|a_n|_\infty \leq \frac{M_\infty}{R_\infty^{n+1}}$ for some constant $M_\infty$.  Thus (setting $M = \prod_{v \in S} M_v$ and $R = \prod_{v \in S} R_v$)

$\prod_{v \in S} |a_n|_v \leq \frac{M}{R^{n+1}} \to 0$

as $n \to \infty$.  On the other hand, the product formula shows that if $a_n \neq 0$ then

$\prod_{v \in S} |a_n|_v \geq \prod_{{\rm all \;} v} |a_n|_v = 1.$

It follows that $a_n = 0$ for $n$ sufficiently large, and $f$ is a polynomial.
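The product formula used in this argument is easy to check by machine for any particular nonzero rational number. The following sketch (with hypothetical helper names, and naive trial division that is only meant for small inputs) computes $|a|_p$ exactly and multiplies the absolute values over the Archimedean place and all primes dividing the numerator or denominator; at every other prime the absolute value is $1$.

```python
from fractions import Fraction

def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def p_adic_abs(a, p):
    """|a|_p = p^(-ord_p(a)) for a NONZERO rational a, as an exact Fraction."""
    a = Fraction(a)
    num, den, order = abs(a.numerator), a.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return Fraction(p) ** (-order)

def product_over_all_places(a):
    """|a|_infinity times the product of |a|_p over the primes p dividing a (a nonzero)."""
    a = Fraction(a)
    result = abs(a)
    for p in prime_factors(abs(a.numerator)) | prime_factors(a.denominator):
        result *= p_adic_abs(a, p)
    return result

sample = Fraction(-360, 49)
print(p_adic_abs(sample, 2), p_adic_abs(sample, 7), product_over_all_places(sample))
```

In the proof above, the point is that dropping the places outside $S$ (where $|a_n|_v \leq 1$) can only increase the product, so the product over $S$ alone is at least $1$ whenever $a_n \neq 0$.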

Bertrandias

The transfinite diameter makes sense in any metric space, and in particular we can define it for subsets of the “p-adic complex numbers” ${\mathbb C}_p$.  Bertrandias put several of the above ingredients together and proved the following common generalization of the theorems of Borel, Polya, and Dwork around 1963.

Theorem (Bertrandias): Let $g(z)=\sum_{n=0}^\infty \frac{a_n}{z^{n+1}}$ with $a_n \in {\mathbb Q}$ for all $n \geq 0$.  Let $S$ be a finite set of places of ${\mathbb Q}$, containing the infinite place, such that:

(B1) For $p \not\in S$, $|a_n|_p \leq 1$ for all $n \geq 0$ (i.e., $a_n$ is a $p$-adic integer).

(B2) For $v \in S$, $g(z)$ extends to a meromorphic function on the complement of a bounded set $K_v \subset {\mathbb C}_v$ (which is assumed to be a finite union of discs if $v$ is non-Archimedean) and $\prod_{v \in S} \delta_\infty(K_v) < 1$.

Then $g(z)$ is a rational function.

The proof is based on Kronecker-Hankel determinants and the product formula, like the proof of Dwork’s theorem above.  For simplicity we have assumed that the $a_n$ lie in ${\mathbb Q}$, but the statement and proof of Bertrandias’s theorem generalize easily to any number field $K$.  We will only use the special case of the theorem of Bertrandias in which each extension of $g(z)$ is assumed to be analytic.

The proof of assertion (B-R)

We are finally ready to explain Bezivin and Robba’s proof of assertion (B-R), which as we have seen implies the Lindemann-Weierstrass theorem (and hence the transcendence of $\pi$).  Perhaps the most interesting aspect of the proof is that it is the p-adic places which will be used to verify the hypotheses of Bertrandias’s theorem.

Let $\omega(x)= v(x) + v'(x)$, which by assumption is a rational function, and let

$\omega(x) = \sum_{i,j} \frac{c_{ij}}{(x - \gamma_i)^j}$

be the partial fraction expansion for $\omega$, where $\gamma_1,\ldots,\gamma_m$ are distinct algebraic numbers. Using the formal inverse $(I + \frac{d}{dx})^{-1} = \sum_{k \geq 0} (-1)^k \frac{d^k}{dx^k}$ for ${\mathcal D}$, one verifies easily that $v$ has the following explicit partial fraction expansion:

(*) $v(x) = \sum_{i,j} c_{ij} \sum_{k \geq 0} \binom{k+j-1}{j-1} \frac{k!}{(x - \gamma_i)^{k+j}}.$
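Formula (*) can be sanity-checked on truncated expansions at infinity. The sketch below takes a single term $\omega(x) = \frac{1}{(x-\gamma)^j}$ (so one coefficient $c_{ij} = 1$; the values of $\gamma$, $j$ and the truncation length are arbitrary choices of mine), builds the truncation of $v$ from (*), and verifies that $v + v'$ agrees with $\omega$ coefficient by coefficient. It uses the expansion $\frac{1}{(x-\gamma)^m} = \sum_{k \geq 0} \binom{k+m-1}{m-1} \frac{\gamma^k}{x^{k+m}}$ and the fact that ${\mathcal D}$ sends $\sum a_n x^{-(n+1)}$ to $\sum (a_n - n a_{n-1}) x^{-(n+1)}$.

```python
from fractions import Fraction
from math import comb, factorial

def pole_expansion(gamma, m, terms):
    """c[n] = coefficient of 1/x^(n+1) in the expansion of 1/(x - gamma)^m at infinity."""
    c = [Fraction(0)] * terms
    for n in range(m - 1, terms):
        k = n - (m - 1)  # the term gamma^k / x^(k+m) sits at index n = k + m - 1
        c[n] = comb(n, m - 1) * Fraction(gamma) ** k
    return c

def v_from_formula(gamma, j, terms):
    """Truncation of (*) for omega = 1/(x - gamma)^j, i.e. a single term with c_ij = 1."""
    v = [Fraction(0)] * terms
    # Only poles of order k + j <= terms can contribute to the first `terms` coefficients.
    for k in range(terms - j + 1):
        weight = comb(k + j - 1, j - 1) * factorial(k)
        for n, c in enumerate(pole_expansion(gamma, k + j, terms)):
            v[n] += weight * c
    return v

def apply_D(a):
    """Coefficients of v + v' for v = sum a[n]/x^(n+1): a[n] - n*a[n-1]."""
    return [a[n] - (n * a[n - 1] if n > 0 else 0) for n in range(len(a))]

gamma, j, TERMS = Fraction(2, 3), 2, 12
lhs = apply_D(v_from_formula(gamma, j, TERMS))
rhs = pole_expansion(gamma, j, TERMS)
print(lhs == rhs)
```

Because each pole of order $k+j$ only affects coefficients of $x^{-(n+1)}$ with $n \geq k+j-1$, the truncated sums are exact, and the comparison is an honest check of the first `TERMS` coefficients.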

Let $S_1$ be a finite set of places of ${\mathbb Q}$ containing the Archimedean place such that for $p \not\in S_1$, all of the nonzero $c_{i,j}$ and $\gamma_i$ have p-adic absolute value 1, and such that $|\gamma_i - \gamma_j|_p = 1$ for all $i \neq j$.  The explicit formula (*) shows that for $p \not\in S_1$ the coefficients $a_n$ of $v(x) = \sum_{n \geq 0} \frac{a_n}{x^{n+1}}$ are $p$-adic integers.  Thus $v(x)$ satisfies hypothesis (B1) for any set of places $S$ containing $S_1$.

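For $v \in S_1$ (here the subscript $v$ denotes a place, not the power series), the series defining $v(x)$ converges outside a disc $K_v \subset {\mathbb C}_v$ of some positive radius $R_v$: at the Archimedean place this is the hypothesis that $v(x)$ is analytic at infinity, and at a finite place it follows from formula (*).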

For $p \not\in S_1$, formula (*) shows that the series defining $v(x)$ converges in the complement of a set $K_p \subset {\mathbb C}_p$ which is a union of discs $D_i$ centered at the various $\gamma_i$.  Since the series $\sum_{k=0}^\infty k! x^k$ has p-adic radius of convergence equal to $p^{\frac{1}{p-1}}$, we can take the radii of the discs $D_i$ to be $p^{-\frac{1}{p-1}}$.  By our assumptions on $S_1$, the discs $D_1,\ldots,D_m$ are pairwise disjoint (their centers are at distance $1$ from one another, which exceeds the common radius), and it is a simple exercise using the non-Archimedean triangle inequality to prove that

$\delta_\infty \left( \bigcup_{i=1}^m D_i \right) = p^{-\frac{1}{m(p-1)}}.$
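The radius of convergence $p^{\frac{1}{p-1}}$ quoted above comes from the size of $|k!|_p$. By Legendre's formula, ${\rm ord}_p(k!) = \sum_{i \geq 1} \lfloor k/p^i \rfloor = \frac{k - s_p(k)}{p-1}$, where $s_p(k)$ is the sum of the base-$p$ digits of $k$, so $|k!|_p^{1/k} \to p^{-\frac{1}{p-1}}$. Here is a short sketch (function names are mine) that checks the two expressions against each other and shows the ratio ${\rm ord}_p(k!)/k$ approaching $\frac{1}{p-1}$.

```python
def vp_factorial(k, p):
    """ord_p(k!) by Legendre's formula: sum of floor(k / p^i) over i >= 1."""
    total, power = 0, p
    while power <= k:
        total += k // power
        power *= p
    return total

def digit_sum(k, p):
    """Sum of the base-p digits of k."""
    s = 0
    while k:
        s += k % p
        k //= p
    return s

p = 5
ratios = [vp_factorial(k, p) / k for k in (10, 100, 10_000, 1_000_000)]
print(ratios, 1 / (p - 1))
```

Since $s_p(k)$ grows only logarithmically in $k$, the ratio converges to $\frac{1}{p-1}$, which is exactly the exponent appearing in the radii of the discs $D_i$.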

Since $\log \delta_\infty(K_p) = -\frac{\log p}{m(p-1)}$ and the series $\sum_{{\rm primes \;} p} \frac{\log p}{p-1}$ diverges, the infinite product $\prod_{p \not\in S_1} \delta_\infty(K_p)$ diverges to zero.  Thus there exists a finite set of places $S$ containing $S_1$ such that

$\prod_{v \in S} \delta_\infty(K_v) < 1.$

For this choice of $S$, $v(x)$ satisfies both (B1) and (B2) and thus $v(x)$ is a rational function.  Q.E.D.
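To get a feeling for how slowly the product of the $p$-adic transfinite diameters tends to zero, one can tabulate the logarithm of the partial products, which by the formula for $\delta_\infty(K_p)$ equals $-\frac{1}{m} \sum_{p \leq X} \frac{\log p}{p-1}$. The sketch below is purely illustrative: the value $m = 3$ is a hypothetical number of poles, and no primes are excluded, whereas in the proof the primes in $S_1$ would be left out (passed here via the `excluded` argument).

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            # Cross out the multiples of n starting from n*n.
            sieve[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n in range(2, limit + 1) if sieve[n]]

def log_partial_product(limit, m, excluded=frozenset()):
    """log of the product of p^(-1/(m(p-1))) over primes p <= limit outside `excluded`."""
    return -sum(math.log(p) / (m * (p - 1))
                for p in primes_up_to(limit) if p not in excluded)

m = 3  # hypothetical number of distinct poles gamma_i
logs = [log_partial_product(10 ** e, m) for e in (2, 4, 6)]
print(logs)
```

The logarithms decrease roughly like $-\frac{1}{m}\log X$, so the product does go to zero, but one may need many primes in $S$ to beat the contribution of the places in $S_1$.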

Concluding remarks

1. My formulation of (B-R), and the accompanying exposition of the proof that (L-W) and (B-R) are equivalent, differs a bit from Bezivin and Robba’s.   They work with power series $u(x) \in {\mathbb C}[[x]]$ and the differential operator ${\mathcal D}'(u) = x^2 u' + (x-1) u$ instead, which amounts to the same thing via the transformation $v(x) = \frac{1}{x} u(\frac{1}{x})$.  (I thank Xander Flood for helping me with the details of how to translate smoothly between the two settings.)

2. In their paper, Bezivin and Robba generalize assertion (B-R) to an arbitrary linear differential operator ${\mathcal D}$ with polynomial coefficients for which $\infty$ is a totally irregular singular point.  In the special case ${\mathcal D}(v) = v' + v$, the proof is significantly simpler than the general case because one has an explicit inverse operator.  In the general case, one needs to use techniques from the theory of $p$-adic differential equations to establish the properties (B1) and (B2).

3. The converses of the theorems of Borel, Polya, Dwork, and Bertrandias are clearly true as well, so these results give a precise characterization of rational functions among formal power series of a certain type.

4. A proof of the theorem of Bertrandias appears in Chapter 5 of Amice’s unfortunately out-of-print book Les Nombres p-adiques.

5. For a deeper understanding of p-adic transfinite diameters, it is very useful to work with Berkovich spaces.  See for example my book with Robert Rumely, in which we prove (as in the classical case) that the transfinite diameter of a compact set $K \subset {\mathbf A}^1_{\rm Berk}$ coincides with its capacity, defined in terms of a probability measure of minimum energy supported on $K$.

6. The theorem of Bost and Chambert-Loir mentioned in the introduction is a generalization of the theorem of Bertrandias giving a criterion for a formal meromorphic function on an algebraic curve to be the germ of a rational function.  The proof uses Arakelov geometry.  Bost and Chambert-Loir view their theorem, and its proof, as an arithmetic counterpart of the following theorem from algebraic geometry:

Theorem (Hartshorne): Let $X$ be a complex projective surface and $H$ an ample effective divisor on $X$.  Then any formal meromorphic function along $H$ is the restriction of a rational function on $X$.

For more details and background related to the theorem of Bost and Chambert-Loir, see http://www.math.u-psud.fr/~chambert/publications/pdf/toronto2008.pdf