On Pi Day 2016, I wrote in this post about the remarkable fact, discovered by Euler, that the probability that two randomly chosen integers have no prime factors in common is . The proof makes use of the famous identity , often referred to as the “Basel problem”, which is also due to Euler. In the 2016 post I presented Euler’s original solution to the Basel problem using the Taylor series expansion for .
In honor of Pi Day 2018, I’d like to explain a simple and intuitive solution to the Basel problem due to Johan Wästlund. (Wästlund’s paper is here; see also this YouTube video, which is where I first heard about this approach – thanks to Francis Su for sharing it on Facebook!) Wästlund’s approach is motivated by physical considerations (the inverse-square law which governs the apparent brightness of a light source) and uses only basic Euclidean geometry and trigonometry.
Outline of the proof
A brief outline of Wästlund’s argument is as follows:
Step 1: Through some simple algebraic manipulations, it suffices to prove the equivalent formula . This, in turn, follows (setting ) from the following more general fact:
Theorem: For every real number which is not an integer, we have
Step 2: Let be even, and think of (which without loss of generality we may suppose satisfies ) as a point on the real number line. Place stars of equal brightness on the number line, with one star at each integer (i.e., “lattice”) point of the half-open interval . Then by the inverse square law, we can interpret the partial sum as the total apparent brightness at of the -star system.
Step 3: We may approximate (to any desired precision) the lattice points of by equally spaced points along the perimeter a circle of very large radius which is tangent to the real number line at . We can therefore replace our system of stars on a line by a system of stars lying on some arc centered at inside a large circle (like the blue dots in Figure 1).
Step 4: Suppose we place equally spaced stars of equal brightness all the way around the perimeter of a circle of circumference , and measure the total apparent brightness at some point on the circle having distance (measured along the circle) from the nearest star (see Figure 2).
Then the inverse square law implies that for , “most” (in a precise quantitative sense) of this brightness comes from the stars closest to .
Step 5: Iterating this observation, let denote the total apparent brightness of the closest stars to when we place equally spaced stars around the perimeter of a circle of circumference , with the closest star at arc-distance from (see Figure 1 again). Then .
Step 6: By an elegant geometric argument related to the “inverse Pythagoran theorem” (see Figure 3), it turns out that for every we have . In other words, we can replace a system of equally spaced stars along a circle of circumference , tangent to the real line at , by a system of equally spaced stars along a circle of circumference , also tangent to the real line at , in such a way that the total apparent brightness at is unchanged.
This implies, by induction, that for all natural numbers . Combining this with the previous step, we obtain .
Step 7: In particular, if is itself a large power of 2, then is approximately for all . When is also large, is approximately (where as before ). It follows that .
Step 8: By elementary trigonometry, we have , which proves the Theorem.
Some Euclidean geometry
The crucial, and most innovative, part of the argument is the fact from Step 6 that . This is most easily explained for , though the proof in the general case is essentially the same. So let’s examine how Wästlund proves that .
The argument is based on the “Inverse Pythagorean Theorem”, which is the assertion that in the setting of Figure 4 (where ACB is a right angle), we have .
It is an elementary exercise to deduce this from the usual Pythagorean Theorem.
Given a single star (represented by the red point R in Figure 5) on a circle of radius 1, tangent to the real line at , we can replace it by two equally spaced stars (the blue points and ) on a circle of radius 2, also tangent to the real line at , in such a way that the apparent brightness of the red star at equals the sum of the apparent brightnesses of the two blue stars at .
The construction of and from goes as follows. Let be the center of the smaller circle, and let be the center of the larger circle. Then and are the two points where the line intersects the larger circle.
Since is a diameter of the smaller circle, is a right angle. The formula expressing the equality between the apparent brightness at in the red and blue star systems, will follow immediately from the Inverse Pythagorean theorem once we show that the (counterclockwise) arc-distance from to equals the (counterclockwise) arc-distance from to .
To see this, first note that times the arc distance from to is equal to the measure (in radians) of the central angle . And times the arc distance from to is equal to 2 (the circumference of the larger circle) times the measure of the central angle . So it suffices to show that . This follows from the fact that , which intercepts the same arc of the small circle as the central angle .
By a similar argument, replacing each red star by two blue stars as in Figure 3 above, it follows that for all .
The base case (Step 8)
In the base case , the quantity is just where is a circle of circumference 1 (and hence radius ), are points on which are at distance as measured along the circumference of , and denotes the Euclidean (chordal) distance between and . It is an elementary trigonometry exercise to show that (see Figure 6).
By induction on , we find:
Proposition 1: whenever is a power of 2.
The remaining technical details
We now show that when is large, is approximately equal to .
Consider the -star system along a circle of circumference (and radius ). The total brightness at is, by definition, . Now remove the stars furthest from , and consider the total brightness of the remaining stars. Since each of the deleted stars has distance at least from , it follows that .
By a similar argument, if we begin with the -star system on a circle of radius and remove all but the closest stars to , and denote by the total brightness of the remaining stars, we have
On the other hand, it’s geometrically clear (since the radii of the circles approach infinity) that
By the triangle inequality, the difference between and is bounded by
Letting gives, for any fixed , that
Taking to be an arbitrarily large power of 2 and applying Proposition 1 now yields Theorem 1 (in the special case , but the general case follows easily from this).
- The above estimates can be used to prove a posteriori that for all positive integers , not just powers of 2. This is reminiscent of Cauchy’s inductive proof of the inequality between the arithmetic mean and geometric mean which first establishes the result for powers of 2.
- To get from Theorem 1 to Euler’s theorem that is equal to , we can proceed as follows. First, setting in Theorem 1 gives . Multiplying both sides of this equality by yields But and thus as desired.
- Alternatively, as pointed out to me by Keith Conrad, one can deduce from Theorem 1 as follows. Subtracting from both sides of the formula in Theorem 1 yields The Taylor series of the right-hand side around is Setting gives and thus . And differentiating both sides of twice and then setting gives and thus One gets, in a similar way, an explicit formula for for all positive integers .
- It should hopefully be clear that the argument we’ve presented uses “physics” only for intuition; it is a rigorous mathematical proof.