On Pi Day 2016, I wrote in this post about the remarkable fact, discovered by Euler, that the probability that two randomly chosen integers have no prime factors in common is . The proof makes use of the famous identity
, often referred to as the “Basel problem”, which is also due to Euler. In the 2016 post I presented Euler’s original solution to the Basel problem using the Taylor series expansion for
.
In honor of Pi Day 2018, I’d like to explain a simple and intuitive solution to the Basel problem due to Johan Wästlund. (Wästlund’s paper is here; see also this YouTube video, which is where I first heard about this approach – thanks to Francis Su for sharing it on Facebook!) Wästlund’s approach is motivated by physical considerations (the inverse-square law which governs the apparent brightness of a light source) and uses only basic Euclidean geometry and trigonometry.
Outline of the proof
A brief outline of Wästlund’s argument is as follows:
Step 1: Through some simple algebraic manipulations, it suffices to prove the equivalent formula . This, in turn, follows (setting
) from the following more general fact:
Theorem: For every real number
which is not an integer, we have
Step 2: Let be even, and think of
(which without loss of generality we may suppose satisfies
) as a point
on the real number line. Place
stars of equal brightness on the number line, with one star at each integer (i.e., “lattice”) point of the half-open interval
. Then by the inverse square law, we can interpret the partial sum
as the total apparent brightness at
of the
-star system.
Step 3: We may approximate (to any desired precision) the lattice points of
by
equally spaced points along the perimeter a circle of very large radius which is tangent to the real number line at
. We can therefore replace our system of
stars on a line by a system of
stars lying on some arc centered at
inside a large circle (like the blue dots in Figure 1).
Step 4: Suppose we place equally spaced stars of equal brightness all the way around the perimeter of a circle of circumference
, and measure the total apparent brightness
at some point
on the circle having distance
(measured along the circle) from the nearest star (see Figure 2).
Then the inverse square law implies that for , “most” (in a precise quantitative sense) of this brightness comes from the
stars closest to
.
Step 5: Iterating this observation, let denote the total apparent brightness of the
closest stars to
when we place
equally spaced stars around the perimeter of a circle of circumference
, with the closest star at arc-distance
from
(see Figure 1 again). Then
.
Step 6: By an elegant geometric argument related to the “inverse Pythagoran theorem” (see Figure 3), it turns out that for every we have
. In other words, we can replace a system of
equally spaced stars along a circle of circumference
, tangent to the real line at
, by a system of
equally spaced stars along a circle of circumference
, also tangent to the real line at
, in such a way that the total apparent brightness at
is unchanged.
This implies, by induction, that for all natural numbers
. Combining this with the previous step, we obtain
.
Step 7: In particular, if is itself a large power of 2, then
is approximately
for all
. When
is also large,
is approximately
(where as before
). It follows that
.
Step 8: By elementary trigonometry, we have , which proves the Theorem.
Some Euclidean geometry
The crucial, and most innovative, part of the argument is the fact from Step 6 that . This is most easily explained for
, though the proof in the general case is essentially the same. So let’s examine how Wästlund proves that
.
The argument is based on the “Inverse Pythagorean Theorem”, which is the assertion that in the setting of Figure 4 (where ACB is a right angle), we have .
It is an elementary exercise to deduce this from the usual Pythagorean Theorem.
Given a single star (represented by the red point R in Figure 5) on a circle of radius 1, tangent to the real line at , we can replace it by two equally spaced stars (the blue points
and
) on a circle of radius 2, also tangent to the real line at
, in such a way that the apparent brightness of the red star at
equals the sum of the apparent brightnesses of the two blue stars at
.
The construction of and
from
goes as follows. Let
be the center of the smaller circle, and let
be the center of the larger circle. Then
and
are the two points where the line
intersects the larger circle.
Since is a diameter of the smaller circle,
is a right angle. The formula
expressing the equality between the apparent brightness at
in the red and blue star systems, will follow immediately from the Inverse Pythagorean theorem once we show that the (counterclockwise) arc-distance from
to
equals the (counterclockwise) arc-distance from
to
.
To see this, first note that times the arc distance from
to
is equal to the measure (in radians) of the central angle
. And
times the arc distance from
to
is equal to 2 (the circumference of the larger circle) times the measure of the central angle
. So it suffices to show that
. This follows from the fact that
, which intercepts the same arc of the small circle as the central angle
.
By a similar argument, replacing each red star by two blue stars as in Figure 3 above, it follows that for all
.
The base case (Step 8)
In the base case , the quantity
is just
where
is a circle of circumference 1 (and hence radius
),
are points on
which are at distance
as measured along the circumference of
, and
denotes the Euclidean (chordal) distance between
and
. It is an elementary trigonometry exercise to show that
(see Figure 6).
By induction on , we find:
Proposition 1:
whenever
is a power of 2.
The remaining technical details
We now show that when is large,
is approximately equal to
.
Consider the -star system along a circle of circumference
(and radius
). The total brightness at
is, by definition,
. Now remove the
stars furthest from
, and consider the total brightness
of the remaining
stars. Since each of the
deleted stars has distance at least
from
, it follows that
.
By a similar argument, if we begin with the -star system on a circle of radius
and remove all but the closest
stars to
, and denote by
the total brightness of the remaining
stars, we have
On the other hand, it’s geometrically clear (since the radii of the circles approach infinity) that
By the triangle inequality, the difference between and
is bounded by
Letting gives, for any fixed
, that
Taking to be an arbitrarily large power of 2 and applying Proposition 1 now yields Theorem 1 (in the special case
, but the general case follows easily from this).
Concluding Remarks
- The above estimates can be used to prove a posteriori that
for all positive integers
, not just powers of 2. This is reminiscent of Cauchy’s inductive proof of the inequality between the arithmetic mean and geometric mean which first establishes the result for powers of 2.
- To get from Theorem 1 to Euler’s theorem that
is equal to
, we can proceed as follows. First, setting
in Theorem 1 gives
. Multiplying both sides of this equality by
yields
But
and thus
as desired.
- Alternatively, as pointed out to me by Keith Conrad, one can deduce
from Theorem 1 as follows. Subtracting
from both sides of the formula in Theorem 1 yields
The Taylor series of the right-hand side around
is
Setting
gives
and thus
. And differentiating both sides of
twice and then setting
gives
and thus
One gets, in a similar way, an explicit formula for
for all positive integers
.
- It should hopefully be clear that the argument we’ve presented uses “physics” only for intuition; it is a rigorous mathematical proof.
Pingback: Counting with martingales | Matt Baker's Math Blog
Pingback: Math: The Borel Zero-One Law – Condensed Matter Corner