Spooky inference and the axiom of choice

A large crowd had gathered in Harvard Square, and I was curious what all the cheering and gasping was about.  Working my way through the crowd, I saw a street performer who (according to the handwritten red letters on his cardboard sign) went by the name “Zorn the Magnificent”.  He displayed a large orange, borrowed an extremely sharp knife from his assistant, and proceeded to chop the orange into five exotic-looking pieces while standing on one hand.  Working with almost unfathomably deft precision, he rearranged the pieces into two oranges, each the same size as the original one.  The oranges were given out for inspection and the crowd cheered wildly.  I clapped as well — even though I was familiar with the old Banach-Tarski paradox — since it was nevertheless an impressive display of skill and I had never seen it done one-handed before.  I heard a man with a long white beard whisper to the woman next to him “He hides it well, but I know that he’s secretly using the Axiom of Choice.”

Zorn’s next trick, though, left me short of breath and feeling as if I had been struck in the head with a baseball bat.  I put on my white hat (or was it my black one? I can hardly remember) and staggered away utterly astounded.  I will try to describe exactly what I saw.

The magician asked a volunteer to repeatedly flip an (apparently fair) coin and secretly record the results.  The spectator did this (a countably infinite number of times) and wrote down the resulting sequence:

1. H

2. H

3. T

4. H

5. T

… etc.

The magician announced with some bravura that he would do the following:

(1) Look at the flip results corresponding to a proper subset A of the natural numbers and then announce a natural number M not in A whose corresponding flip outcome he would try to guess.

(2) Look at all of the remaining results except flip number M and then announce a guess for whether flip M was heads or tails.

He did this, and guessed correctly.  Nobody was impressed, of course — after all, he had a 50-50 chance of being right.  However, he offered to repeat the experiment with another volunteer, and again Zorn the Magnificent guessed correctly.  He then did it again, and again, and again — 20 times in a row!  And every single time he guessed right.

How did he do it, I wondered?  It seemed completely mind-boggling — surely the result of each flip was independent of all the others, so how could he have better than a 50-50 chance each time he made a guess?  But with those odds, it seemed inconceivable that he could have succeeded 20 times in a row — especially with such a confident look on his face!  Having seen his orange trick, I figured that he was probably using the Axiom of Choice, but it still didn’t make sense to me.  What confused me even more was that the sets A he chose were not bizarre-looking, like the orange cuts he made — they were just finite unions of arithmetic progressions.  The demonstration had shattered some of my deepest-held intuition about probability and independence.

I couldn’t sleep that night, tossing around in my bed trying to figure out how the trick was done.  And then — at 4am — it finally hit me!  My God, I thought, that fellow is clever.

Spoiler Alert: Dear reader, if you would like to try figuring this out for yourself, please stop reading now. I am about to give away the secret behind this devilish feat…

Here, I realized, is what he did.  Consider the equivalence relation ~ on the set S of all binary sequences defined by $s \sim s'$ if and only if $s$ and $s'$ differ in just finitely many places.  Before arriving in Harvard Square, Zorn the Magnificent secretly chose one representative from each equivalence class.  (To do this, of course, he used the Axiom of Choice.) For convenience, I will denote by $[s]$ the chosen representative for the equivalence class containing a given sequence $s$. By definition, there is some integer $m(s)$ such that $s$ and $[s]$ agree at all places starting with $m(s).$

Zorn mentally partitioned the spectator’s binary sequence $s$ into subsequences $s^1,\ldots,s^{1000}$ corresponding to the different arithmetic progressions with common difference 1000.  He then chose a random integer $k$ between 1 and 1000 and defined $A$ to be the union of all the arithmetic progressions except for the $k^{\rm th}$ one.  As announced, he then looked at the coin flips corresponding to each index in $A$.  From this observation, he was able to determine $m(s^j)$ for each $j \neq k$, and he let $m$ be the maximum of all these integers.  The $m^{\rm th}$ element of the $k^{\rm th}$ arithmetic progression corresponded to some integer $M = (k-1) + 1000m$, which he announced.  Zorn then looked at all the coin flip results except for the one corresponding to $M$, and from this he determined $[s^k]$.  His guess for the $M^{\rm th}$ coin flip was the $m^{\rm th}$ element of the sequence $[s^k]$.

Why does this work?  Well, Zorn would guess incorrectly only if $m(s^k) > m$, in other worlds, only if $m(s^k)$ was strictly larger than $m(s^j)$ for all $j \neq k$.  Since k was chosen randomly, the probability of this happening is at most $1/1000$.  Thus Zorn was guaranteed to be right at least 99.9% of the time!  Repeating this procedure 20 times, there was thus less than a 2% chance that Zorn would fail to guess correctly every time (since $1 - .999^{20} < .02$).

What a sneaky man.

Concluding remarks

1. I learned about this “paradox” (in a slightly different form than I’ve presented it here) from Peter Winkler.  He was not sure of the precise origin of the puzzle, but it seems to be a spinoff of the “Hats and Infinity” puzzle which can be found on p.91 of Winkler’s book “Mathematical Mind-Benders”, in this expository paper, and also in this Math Overflow discussion.  It is very closely related to the puzzle discussed in this Stack Exchange post.  There is in fact an entire book by Hardin and Taylor inspired by infinite hat puzzles.

2. Zorn the Magnificent can use similar techniques to predict the future.  Indeed, the Axiom of Choice implies in a similar way that there is a strategy for predicting the values of an arbitrary function $f : {\mathbf R} \to {\mathbf R}$, based on its previous values, which is almost always correct.  More precisely, given the values of a function on an interval $(-\infty,t),$ the strategy produces a guess for the values of the function on $[t,\infty)$ such that for all but countably many $t$, there is an $\epsilon > 0$ for which the prediction is valid on $[t,t+\epsilon)$.  In other words, the strategy provides an “$\epsilon$-glimpse” of the future at almost all later moments.  This is explained in detail in this provocative paper by Hardin and Taylor.  The phrase “spooky inference” is my own; I think it nicely encapsulates what I find disturbing about such consequences of the Axiom of Choice.

3. It seems hard to escape the conclusion that if we assume the Axiom of Choice, then the bits of a uniformly random element $s = (b_1,b_2,\ldots)$ of $\{ 0, 1 \}^{\mathbf N}$ are not jointly independent.  This can be seen as an illustration of the well-known difficulty of developing a rigorous theory of conditional probability when we wish to condition on an infinite collection of events.  For this reason, among others, one usually regards random variables $X_1, X_2, \ldots$ as independent if each is independent from any finite collection of the others.  To my mind, this remark (due to Pete Winkler) is enlightening but nevertheless does not really “resolve” the paradox; it is somewhat analogous to explaining the counter-intuitiveness of the Banach-Tarski paradox by simply saying that we should restrict ourselves to only considering measurable sets.  Terry Tao gives an insightful analogy between Banach-Tarski and the counterintuitive conclusions of infinite hat puzzles in the comments section of this blog post.  Among other things, Tao points out that one of the issues here is that the direct limit as $n$ tends to infinity of $\{ 0, 1 \}^{\mathbf n}$ is not locally compact, and therefore does not support a normalized Haar measure.

4. I personally find Zorn the Magnificent’s trick to be a provocative example suggesting that the Axiom of Choice is too strong, especially when we want to use probabilistic intuition.  The Banach-Tarski paradox has served a similar historical role with respect to geometric intuition, but most mathematicians are now conditioned to ignore the feeling of discomfort we all felt when we first heard about it.  Maybe in 20 years we will all be similarly conditioned with respect to hat puzzles and inference and will continue to believe in the Axiom of Choice.  Or maybe there will be a shift toward alternative axioms such as the Axiom of Projective Determinacy

5 thoughts on “Spooky inference and the axiom of choice”

1. Agni Kumar says:

Dr. Baker, this is so interesting–both the nature of the problem and the way have written about it make this such a great read! Thank you for writing this and your other wonderful posts, which I would love to read through (and attempt to digest) in my schoolwork-less-time. 🙂