Spheres and equators

Posted on 22 Mar 2015
Tags: math, probability

Twitter user @isomorphismes, who might be an individual or a collective but in any case has great taste in mathematical band names, poses the following question:

The setup brings to mind an example in Rick Durrett’s Probability: Theory and Examples, in the section on conditional expectation:

Borel’s paradox. Let \(X\) be a randomly chosen point on the earth, let \(\theta\) be its longitude, and \(\varphi\) be its latitude. It is customary to take \(\theta \in [0, 2\pi)\) and \(\varphi \in (-\pi/2, \pi/2]\) but we can equally well take \(\theta \in [0, \pi)\) and \(\varphi \in (-\pi, \pi]\). In words, the new longitude specifies the great circle on which the point lies and then \(\varphi\) gives the angle.

At first glance it might seem that if \(X\) is uniform on the globe then \(\theta\) and the angle \(\varphi\) on the great circle should both be uniform over their possible values. \(\theta\) is uniform but \(\varphi\) is not. The paradox completely evaporates once we realize that in the new or in the traditional formulation \(\varphi\) is independent of \(\theta\), so the conditional distribution is the unconditional one, which is not uniform since there is more land near the equator than near the North Pole.

In fact, observing that the volume of a sphere under a region is proportional to the area of that region, @isomorphismes’ paradox takes place in exactly this setting. The “high-dimensional” aspect seems to be a red herring, since the concentration phenomenon is already happening for the 2-dimensional case (in the family of cases with a uniform distribution over the surface of a (geometer’s) \(n\)-sphere).
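As a rough numerical check of that claim, the sketch below estimates the fraction of uniformly distributed points that fall within 30° of the equator, for spheres sitting in ambient spaces of a few different dimensions. The function name `equatorial_band_fraction`, the sample size, and the Gaussian-normalization sampler are illustrative choices made here, not anything from the original discussion. Note that `dim` counts ambient coordinates, so the ordinary sphere (the geometer’s 2-sphere) is `dim=3`.

```python
import math
import random


def equatorial_band_fraction(dim, half_width, n=20_000, seed=0):
    """Estimate the share of uniform points on the unit sphere in R^dim
    whose latitude (angle above the hyperplane x_dim = 0) is below
    half_width in absolute value. Illustrative helper, not from the post."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        # A normalized standard Gaussian vector is uniform on the sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        r = math.sqrt(sum(c * c for c in v))
        # Clamp to guard against rounding just outside [-1, 1].
        s = max(-1.0, min(1.0, v[-1] / r))
        if abs(math.asin(s)) < half_width:
            hits += 1
    return hits / n


if __name__ == "__main__":
    for dim in (3, 10, 30):
        print(dim, equatorial_band_fraction(dim, math.pi / 6))
```

For `dim=3` the exact value is \(\sin(\pi/6) = 1/2\), even though the band covers only a third of the latitude range, and the estimates grow with `dim`.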

(Exercise for the reader: What does the \(n = 2\) case look like? Is the distribution over angles uniform? Do the “customary” coordinates or Durrett’s “new” coordinates work better in this case?)

Durrett’s presentation is so straightforward that it’s hard to even recall what about the situation seems so paradoxical. Fortunately, E. T. Jaynes addresses the paradox in Probability Theory: The Logic of Science:

Given a uniform probability density over the surface area, what is the corresponding conditional density on any great circle? Intuitively, everyone says immediately that, from geometrical symmetry, it must be uniform also. But if we specify points by latitude (\(-\pi/2 \leq \theta \leq \pi/2\)) and longitude (\(-\pi < \varphi \leq \pi\)), we do not seem to get this result. If that great circle is the equator, defined by \(|\theta| < \epsilon\) as \(\epsilon \to 0\), we have the expected uniform distribution \(p(\varphi) = (2\pi)^{-1}\) (\(-\pi < \varphi \leq \pi\)). But if it is the meridian of Greenwich defined by \(|\varphi| < \epsilon\) as \(\epsilon \to 0\), we have \(p(\theta) = (1/2) \cos(\theta)\) (\(-\pi/2 \leq \theta \leq \pi/2\)), with the density reaching a maximum on the equator and zero at the poles.

Many quite futile arguments have raged — between otherwise competent probabilists — over which of these results is ‘correct’. The writer has witnessed this more than once at professional meetings of scientists and statisticians. Nearly everybody feels that he knows perfectly well what a great circle is; so it is difficult to get people to see that the term ‘great circle’ is ambiguous until we specify what limiting operation is to produce it. The intuitive symmetry argument presupposes unconsciously the equatorial limit; yet one eating slices of an orange might presuppose the other.
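Jaynes’ two limiting operations can be checked numerically. The following is a minimal Monte Carlo sketch, assuming points are drawn uniformly by normalizing Gaussian vectors; the function name `conditional_fractions` and the values of `n` and `eps` are hypothetical choices made here for illustration. It conditions once on a thin equatorial band and once on a thin wedge around the Greenwich meridian, and in each case measures how much mass sits on the central third of the arc.

```python
import math
import random


def conditional_fractions(n=300_000, eps=0.05, seed=0):
    """Compare two ways of "conditioning on a great circle" by Monte Carlo.

    Returns (equator_frac, meridian_frac):
      equator_frac  - among points with |latitude| < eps, the share whose
                      longitude lies within pi/3 of zero;
      meridian_frac - among points with |longitude| < eps, the share whose
                      latitude lies within pi/6 of zero.
    A uniform density along either arc would make both shares 1/3.
    """
    rng = random.Random(seed)
    band_total = band_hits = 0
    wedge_total = wedge_hits = 0
    for _ in range(n):
        # A normalized standard Gaussian vector is uniform on the sphere.
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        r = math.sqrt(x * x + y * y + z * z)
        lat = math.asin(max(-1.0, min(1.0, z / r)))
        lon = math.atan2(y, x)
        if abs(lat) < eps:  # thin band around the equator
            band_total += 1
            band_hits += abs(lon) < math.pi / 3
        if abs(lon) < eps:  # thin wedge around the Greenwich meridian
            wedge_total += 1
            wedge_hits += abs(lat) < math.pi / 6
    return band_hits / band_total, wedge_hits / wedge_total


if __name__ == "__main__":
    equator_frac, meridian_frac = conditional_fractions()
    print(f"equatorial band: {equator_frac:.3f}")
    print(f"meridian wedge:  {meridian_frac:.3f}")
```

The equatorial share should land near \(1/3\), as a uniform density predicts, while the meridian share should land near \(\int_{-\pi/6}^{\pi/6} (1/2)\cos\theta \, d\theta = 1/2\), in line with the two densities Jaynes quotes.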

So in this spirit, we can make a minimal statement of the paradox. In a problem with rotational symmetry, the equatorial great circle can be distinguished from all others by its uniform conditional density. Meditating on Jaynes’ passage for a few moments, we can also state a minimal resolution. The symmetry has stealthily been broken by the conditioning operation, which is specified only up to the choice of which great circle receives a uniform conditional density.

This is almost disappointing. Instead of being an unintuitive fact about the seemingly simple circle, Borel’s paradox is just one more reminder to be careful about conditioning on events with zero measure.

Returning to the Durrett excerpt, we can see a great act of subtlety. Let’s suppose that the great circles through the poles were to have the uniform conditional distribution we might wish them to have. Combining this with the uniform distribution along the equator would define a joint distribution on \((\theta, \varphi)\). From this, the longitude \(\theta\) could be integrated out to give the marginal distribution on the latitude \(\varphi\) (in Durrett’s customary coordinates). The resulting marginal distribution on \(\varphi\) doesn’t match the one it should (which can be determined without using troublesome conditioning), so we must give up on our wish.
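To spell that argument out, here is a sketch of the computation in Durrett’s customary coordinates, with longitude \(\theta \in [0, 2\pi)\) and latitude \(\varphi \in (-\pi/2, \pi/2]\). The subscript “wish” is just a label introduced here for the hypothetical joint density. The surface area element of the unit sphere is \(\cos\varphi \, d\theta \, d\varphi\) and the total area is \(4\pi\), so the true joint density and its latitude marginal are

\[
p(\theta, \varphi) = \frac{\cos\varphi}{4\pi},
\qquad
p(\varphi) = \int_0^{2\pi} \frac{\cos\varphi}{4\pi} \, d\theta = \frac{1}{2}\cos\varphi .
\]

Uniform conditionals along both the equator and the meridians would instead force

\[
p_{\text{wish}}(\theta, \varphi) = \frac{1}{2\pi} \cdot \frac{1}{\pi},
\qquad
p_{\text{wish}}(\varphi) = \int_0^{2\pi} \frac{1}{2\pi^2} \, d\theta = \frac{1}{\pi} \neq \frac{1}{2}\cos\varphi .
\]

The true marginal is obtained directly from the area element, with no conditioning on a measure-zero set involved.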

And returning to @isomorphismes’ problem, we are left with the knowledge that using conditioning as a tool to look into slices of high-dimensional spaces may also lead us astray. To paraphrase Jaden Smith, how can conditional distributions be radially symmetric if our conditioning isn’t radially symmetric?