GLEASON'S THEOREM HAS A CONSTRUCTIVE PROOF

1 July 1999

Fred Richman
Florida Atlantic University

Two recent papers in this journal have dealt with the possibility of a constructive proof of Gleason's theorem [5]. In the first [6], Geoffrey Hellman claims to give an example showing that this is impossible even in R3. In the second [1], Helen Billinge suggests that some reformulation of Gleason's theorem in R3 may have a constructive proof. In [3] it is noted that Hellman's example leaves open the problem of finding a constructive substitute-a theorem with a constructive proof that is easily seen to be classically equivalent to Gleason's theorem.

It turns out that Gleason's formulation admits a constructive proof as it stands [7]. In this paper we discuss this seemingly anomalous situation. Gleason's theorem itself is somewhat peripheral to the discussion. What is interesting is the relationship of classical mathematics to constructive mathematics that is highlighted by this misunderstanding.

1  Reformulations of classical theorems

It may be somewhat of an accident that the theorem as Gleason stated it has a constructive proof. Changes in wording that are insignificant from a classical point of view, in that the reworded formulation is easily proved equivalent to the original, can change a statement sufficiently so that the two versions are not constructively equivalent. A simple example of this is provided by the following two statements about a real number x.

The second statement is clumsy, but would read reasonably well as ``If x is not greater than 0, then x £ 0.'' The two statements are classically equivalent by their form alone, while only the second admits a constructive proof for all x.

An example with a little more substance is the statement that the positive integers are well ordered.

The first statement implies the second in anybody's book, whatever ``detachable'' might mean. In fact, a subset S of a set X is detachable if, for every x in X, either x is in S or x is not in S. Thus detachability is a classically trivial condition-every subset satisfies it-so the second implies the first classically. Not so constructively. The first statement is constructively equivalent to the law of excluded middle, while the second is provable.

The least upper bound principle is a continuous version of the previous example.

A set S of real numbers is located if for each pair of rational number r < s, either there exists x in S with r < x, or x < s for all x in S. Every set of real numbers is located from a classical point of view. Again, the first statement is constructively equivalent to the law of excluded middle, while the second is provable.

The example that is most relevant to our discussion is the principal axes theorem. Let B be a symmetric real bilinear form (or a quadratic form).

  1. There exists an orthonormal coordinate system for which the matrix of B is diagonal.

  2. For each n there exists an orthonormal coordinate system for which the off-diagonal entries of the matrix of B have absolute value less than 1/n.

Clearly Statement 1 implies Statement 2. The converse takes a little work, but follows from a standard application of the Bolzano-Weierstrass theorem. The set of orthonormal coordinate systems is compact. Construct a sequence Cn of them so that the off-diagonal entries of the matrix of B with respect to Cn have absolute value less than 1/n . The matrix of B with respect to any limit point of this sequence is diagonal.

Of course no classical mathematician would state the principal axes in the second form, but this form contains all the general computational content of the theorem. One can say more if the characteristic polynomial of B, which is the same in any coordinate system, has simple roots. Each simple root determines a principal axis. The problem comes from the possibility of multiple roots.

What precludes a constructive proof of Statement 1 is that determining the principal axes is, in general, an ill-posed problem : arbitrarily small changes in the bilinear form B can cause large changes in the principal axes. The worst case is when a matrix for B looks like an identity matrix, because that gives no information about what the principal axes might be. If the matrix is the identity matrix, then there is no problem-any coordinate system will do-but if it may just be very close to the identity matrix, there is a real problem, unless you are content with a matrix that is almost diagonal. Note that there is no problem writing down the desired matrix because the diagonal elements are the roots of the characteristic polynomial, and these can be computed. The problem is finding the coordinate system that diagonalizes the matrix.

In [2] it was shown that the principal axes theorem implies Bishop's omniscience principal LLPO, one of whose equivalent forms is that for any real number r, either r ³ 0 or r £ 0. This is also the content of Hellman's Brouwerian counterexample [6] to Gleason's theorem. Andre Scedrov showed [8] that, conversely, you can derive the principal axes theorem from LLPO. In an interesting reinterpretation of what that constructive proof shows, he establishes a classical result regarding diagonalizing matrices over rings of continuous functions on certain kinds of topological spaces.

The principal axes theorem is related to the problem of finding a point in a compact set at which a given uniformly continuous function achieves its maximum. Indeed, if you can solve this second problem, then you can find principal axes. The point on the unit sphere that maximizes the quadratic form f(x) = B(x,x) determines a principal axis, and the others can be constructed similarly by induction. Without the law of excluded middle, you can find x so that f(x) is close to the supremum, and so get an almost diagonal matrix for B.

2  Proving Gleason's theorem

Gleason's theorem for R3 says that if f is a nonnegative function on the unit sphere with the property that f(x)+f(y)+f(z) is a fixed constant, the weight of f, for each triple x,y,z of mutually orthogonal unit vectors, then f is a quadratic form. That is

f(x) = a11x12+a22x22+a33x32+2a12x1x2+2a13x1x3+2a23x2x3.
This is Theorem 2.8 of [5]: Every nonnegative frame function on the unit sphere S in R3 is regular. The principal axes theorem says that we can then find an orthonormal coordinate system in which f has the form
f(x) = a11x12+a22x22+a33x32
but Gleason did not express the theorem that way-that was done in and imitated in [6].

Gleason first proves that f is uniformly continuous, a highly nontrivial task. Then he appeals to the representation theory of the orthogonal group to finish the proof. To avoid this latter step, Cooke, Keane and Moran developed an "elementary proof" of the theorem [4].

In the proof of uniform continuity, Gleason appeals twice to the least upper bound principle. Once he assumes that he can find a point p on the sphere where f(p) is arbitrarily close to the infimum of f on the sphere. That would be no problem if f were uniformly continuous, but that is precisely what we are trying to prove. By reorganizing and rephrasing his proof, we can isolate the use of the point p to a statement of the form

Note that the hypothesis, f(p) < f(z)+d for all z, is a way of saying that f(p) is within d of the infimum of f without reference to the infimum. The conclusion, |f(x)-f(y)| £ e, is constructively equivalent to its own double negation. Therefore, if we can prove the statement, then we can derive the conclusion from the double negation of the hypothesis. But the double negation of the hypothesis is constructively provable: the assumption that no such point p exists leads to a contradiction. So the conclusion itself is constructively provable with no hypothesis.

The second application of the least upper bound principle cannot be circumvented by a logical trick. Gleason uses the infimum in an essential way to compute the modulus of continuity of f. To do this we must use a different method entirely. In fact, the two applications are nested, so we must eliminate the second before we can use the trick on the first.

3  Hellman's formulation of Gleason's theorem

In [6], Hellman suggests that constructive mathematics is inadequate for quantum mechanics; in particular, he claims that Gleason's theorem for R3 does not admit a constructive proof. How can this claim be reconciled with the constructive proof given in [7]? Part of the explanation lies in the fact that any statement has a classical reformulation that does not admit a constructive proof.

Hellman, following [4], chose to incorporate the principal axes theorem into his formulation of Gleason's theorem. The constructive problem is with the former, not with the latter. Quantum mechanics and Gleason's theorem are red herrings. The question Hellman really raised was whether you can do physics without the principal axes theorem.

As noted above, the difficulty with the principal axes theorem is that the axes change drastically with small changes in the data; that is, the problem of finding the axes is ill posed (in a technical sense). This is the reason why you can't prove it constructively. Similarly, a principle that asserts the existence of solutions to ill-posed problems would seem to be a dubious principle for your physics to rely upon. Physics is ultimately accountable to experiment, and experiment is always subject to inaccuracies. Thus, a prediction based on the solution to an ill-posed problem is unlikely to be verifiable, and if it were verifiable, it probably could be explained more naturally by some other analysis.

Another part of the explanation is the failure in [6] to keep straight which assertions are being made in a constructive context and which only in a classical context. Here is Hellman's formulation, following [4], of Gleason's theorem for R3:

(*)
Let f be a bounded frame function on S, and define g º w-M-m. There exists a frame (p,q,r)  such that, for any s Î S, if the coordinates of s with respect to (p,q,r) and (x,y,z), then
f(s) = Mx2+gy2+mz2.

Here S is the unit sphere in R3 and w is the weight of the frame function f. What are M and m?

Prior to stating (*), Hellman writes, "For f a frame function, if f is bounded, we denote its supremum, sup(f), by M, and its infimum, inf(f), by m." The tacit assumption here is that the supremum and the infimum of f exist. That's true in a classical context, where any bounded function has a supremum and an infimum, but it doesn't necessarily hold in a constructive context. So while (*) has a straightforward sense classically, it is far from clear what it means constructively. Is the existence of supf and inff part of the hypothesis or part of the conclusion? Or is there yet another interpretation?

We can see this phenomenon more clearly in a simpler case. Let an be a sequence of 0's and 1's. If the supremum of the an's is less than 1 , then each an is 0 and so is the supremum. If the supremum is greater than 0, then, by definition, there exists n such that an > 0, hence an = 1, so the supremum is 1 and is attained. We have just given a constructive proof of the theorem

If M is the supremum of the an, then there exists n such that an = M.

Now consider the theorem

There exists n such that an = M

where, just prior to stating the theorem, we say "let M be the supremum of the an". This could be construed as saying the same as the previous theorem, that is, if supmam exists, then there is n such that an is equal to it. We can prove that. Alternatively, it could be construed as saying

There exists n such that an ³ am for all m

because the statement "an = supmam" means precisely that an ³ am for all m. We certainly cannot prove that-it is the countable form of the law of excluded middle that Bishop called "the limited principle of omniscience" (LPO). The moral is that some care must be exercised when handling expressions which may not be defined, particularly if we are accustomed to their being defined when in a classical context.

4  Billinge's alternate formulation

In [1], Billinge gives three possible formulations of Gleason's theorem for R3, two of which she discards. The third is:

This is the right idea: the proof in [7] that f is a quadratic form proceeds by showing that f is approximable by diagonal quadratic forms. However, this formulation suffers from the same ambiguity as Hellman's (*) on which it is based.

References

[1]
Billinge, Helen, A constructive formulation of Gleason's theorem, J. Philosophical Logic, 1997.

[2]
Bridges, Douglas, A constructive look at positive linear functionals on L(H), Pacific J. Math. 95(1981), 11-25.

[3]
Bridges, Douglas, Constructive mathematics and unbounded operators - a reply to Hellman, J. Philosophical Logic, 24(1995), 549-561.

[4]
Cooke, Roger, Michael Keane and William Moran, An elementary proof of Gleason's theorem, Math. Proc. Camb. Phil. Soc. 98(1985), 117-128.

[5]
Gleason, Andrew M., Measures on the closed subspaces of a Hilbert space, J. Math. Mech. 6(1957), 885-893.

[6]
Hellman, Geoffrey, Gleason's theorem is not constructively provable, J. Philosophical Logic 22(1993), 193-203.

[7]
Richman, Fred, and Douglas Bridges, A constructive proof of Gleason's theorem, Journal of Functional Analysis, 162(1999) 287-312.

[8]
Scedrov Andre, Diagonalization of continuous matrices as a representation of the intuitionistic reals, Ann. Pure Appl. Logic 30(1986), 201-206.


File translated from TEX by TTH, version 2.27.
On 1 Jul 1999, 16:08.