Joseph’s conjectures on commuting probability, an ultrafinitary perspective

The commuting probability {\Pr(G)} of a finite group {G} is the proportion of pairs {(x,y)\in G^2} which commute. In 1977 Keith Joseph made three conjectures about the set

\displaystyle \mathcal{P} = \{\Pr(G) : G \text{ a finite group}\},

namely the following.

Conjecture 1 (Joseph’s conjectures) {\,}

  1. All limit points of {\mathcal{P}} are rational.
  2. {\mathcal{P}} is well ordered by {>}.
  3. {\{0\}\cup\mathcal{P}} is closed.

Earlier this month I uploaded a preprint to the arxiv which proves the first two of these conjectures, and yesterday I gave a talk at the algebra seminar in Oxford about the proof. While preparing the talk I noticed that some aspects of the proof are simpler from an ultrafinitary perspective, basically because ultrafilters can be used to streamline epsilon management, and I gave one indication of this perspective during the talk. In this post I wish to lay out the ultrafinitary approach in greater detail.

Throughout this post we fix a nonprincipal ultrafilter {u\in\beta\mathbf{N}\setminus\mathbf{N}}, and we let {\mathbf{R}^*} be the ultrapower {\mathbf{R}^\mathbf{N}/u}, where two elements of {\mathbf{R}^\mathbf{N}} are considered equivalent iff they are equal in {u}-almost-every coordinate. The elements of {\mathbf{R}^*} are called “nonstandard reals” or “hyperreals”. There is a principle at work in nonstandard analysis, possibly called Los’s theorem, which asserts, without going into the finer details, that all “first-order” things that you do with reals carry over in a natural way to the field of hyperreals, and everything more-or-less works just how you’d like it to. For instance, if {r} and {s} are hyperreals then {r<s} naturally means that the inequality holds in {u}-almost-every coordinate, and similarly the field operations of {\mathbf{R}} extend naturally, and with these definitions {\mathbf{R}^*} becomes a totally ordered field. We will seldom spell out so explicitly how to naturally extend first-order properties in this way.

An ultrafinite group {G} is an ultraproduct {\prod_u G_i = \prod_{i=1}^\infty G_i/u} of finite groups. Its order {|G| = (|G_i|)/u} is a nonstandard natural number, and its commuting probability {\Pr(G) = (\Pr(G_i))/u} is a nonstandard real in the interval {[0,1]}. Joseph's first two conjectures can now be stated together in the following way.

Theorem 2 The commuting probability of every ultrafinite group {G} has the form {q+\epsilon}, where {q} is a standard rational and {\epsilon} is a nonnegative infinitesimal.

Somewhat similarly, Joseph’s third conjecture can be stated in an ultrafinitary way as follows: For every ultrafinite group {G} there is a finite group {H} such that {\text{st}\Pr(G)=\Pr(H)}. Here {\text{st}} is the standard part operation, which maps a finite hyperreal to the nearest real. When phrased in this way it resembles a known result about compact groups. Every compact group {G} has a unique normalised Haar measure, so we have a naturally defined notion of commuting probability {\Pr(G)}. However every compact group {G} with {\Pr(G)>0} has a finite-index abelian subgroup, and with a little more work one can actually find a finite group {H} with {\Pr(G)=\Pr(H)}. This is a theorem of Hofmann and Russo. Nevertheless, I find Joseph’s third conjecture rather hard to believe.

For us the most important theorem about commuting probability is a theorem of Peter Neumann, which states that if {\epsilon>0} then every finite group {G} such that {\Pr(G)\geq\epsilon} has a normal subgroup {H} such that {|G/H|} and {|[H,H]|} are both bounded in terms of {\epsilon}. To prove the above theorem we need the following “amplified” version:

Theorem 3 (Neumann’s theorem, amplified, ultrafinitary version) Every ultrafinite group {G} has an internal normal subgroup {H} such that {[H,H]} is finite and such that almost every pair {(x,y)\in G^2} such that {[x,y]\in[H,H]} is contained in {H^2}.

Here if {G = \prod_u G_i} we say that {S\subset G} is internal if {S} is itself an ultraproduct {\prod_u S_i} of subsets {S_i\subset G_i}, and “almost every” needs little clarification because the set of pairs in question is an internal subset of {G^2}. (Otherwise we would need to introduce Loeb measure.)

Proof: If {\text{st}\Pr(G)=0} the theorem holds with {H=1}, so we may assume {\text{st}\Pr(G)>0}. By Neumann’s theorem {G} has an internal normal subgroup {K_0} of finite index such that {[K_0,K_0]} is finite. Since {G/K_0} is finite there are only finitely many normal subgroups {K\leq G} containing {K_0} and each of them is internal, so we may find normal subgroups {K,L\leq G} containing {K_0} such that {[K,L]} is finite, and which are maximal subject to these conditions.

Suppose that a positive proportion of pairs {(x,y)\in G^2} outside of {K\times L} satisfied {[x,y]\in[K,L]}. Then we could find {(x,y)\in G^2\setminus (K\times L)}, say with {x\notin K}, such that for a positive proportion of {(k,l)\in K\times L} we have {[xk,yl]\in[K,L]}. After a little commutator algebra one can show then that for a positive proportion of {l\in L} we have {[x,l]\in[K,L]}, or in other words that the centraliser

\displaystyle N_0 = C_{L/[K,L]}(x) = C_{L/[K,L]}(\langle K,x\rangle)

of {x} in {L/[K,L]} has finite index. But this implies that the largest normal subgroup contained in {N_0}, namely

\displaystyle N = C_{L/[K,L]}(K'),

where {K'} is the normal subgroup of {G} generated by {K} and {x}, also has finite index. Since certainly {K\leq C_{K'/[K,L]}(L)} a classical theorem of Baer implies that

\displaystyle [K'/[K,L], L/[K,L]] = [K',L]/[K,L]

is finite, and hence that {[K',L]} is finite, but this contradicts the maximality of {K} and {L}.

Hence almost every pair {(x,y)\in G^2} such that {[x,y]\in[K,L]} is contained in {K\times L}, and thus also in {L\times K}, so the theorem holds for {H=K\cap L}. \Box

Now let {G} be any ultrafinite group {G} and let {H} be as in the theorem. Then

\displaystyle \Pr(G) = \frac1{|G/H|^2} \Pr(H) + \epsilon,

where {\epsilon} is nonnegative and infinitesimal. Thus it suffices to show that {\Pr(H)} has the form

\displaystyle (\text{standard rational}) + (\text{nonnegative infinitesimal})

whenever {[H,H]} is finite. Note in this case that Hall’s theorem implies that the second centre

\displaystyle Z_2(H) = \{h\in H : [h,H]\subset Z(H)\}

has finite index. One can complete the proof using a little duality theory of abelian groups, but the ultrafinite perspective adds little here so I refer the reader to my paper.

The other thing I noticed while preparing my talk is that the best lower bound I knew for the order type of {\mathcal{P}}, {\omega^2}, is easy to improve to {\omega^\omega}, just by remembering that {\mathcal{P}} is a subsemigroup of {(0,1]}. In fact the order type of a well ordered subsemigroup of {(0,1]} is heavily restricted: it’s either {0}, {1}, or {\omega^{\omega^\alpha}} for some ordinal {\alpha}. This observation reduces the possibilities for the order type of {\mathcal{P}} to {\{\omega^\omega,\omega^{\omega^2}\}}. I have no idea which it is!

The idempotent theorem

Let {G} be a locally compact abelian group and let {M(G)} be the Banach algebra of regular complex Borel measures on {G}. Given {\mu\in M(G)} its Fourier transform

\displaystyle \hat{\mu}(\gamma) = \int \overline{\gamma}\,d\mu,

is a continuous function defined on the Pontryagin dual {\hat{G}} of {G}. If the measure {\mu} is “nice” in some way then we expect some amount of regularity from the function {\hat{\mu}}. For instance if {\mu} is actually an element of the subspace {L^1(G)\subset M(G)} of measures absolutely continuous with respect to the Haar measure of {G} then the Riemann-Lebesgue lemma asserts {\hat{\mu}\in C_0(\hat{G})}.

The idempotent theorem of Cohen, Helson, and Rudin describes the structure of measures {\mu} whose Fourier transform {\hat{\mu}} takes a discrete set of values, or equivalently, since {\|\hat{\mu}\|_\infty\leq\|\mu\|}, a finite set of values. To describe the theorem, note that we can define {P(\mu)} for any polynomial {P} by taking appropriate linear combinations of convolution powers of {\mu}, and moreover we have the relation {\widehat{P(\mu)} = P(\hat{\mu})}, where on the right hand side we apply {P} pointwise. Thus if {\hat{\mu}} takes only the values {a_1,\dots,a_n} then by setting

\displaystyle P_i(x) = \prod_{j\neq i} (x-a_j)/(a_i-a_j)

we obtain a decomposition {\mu = a_1\mu_1 + \cdots + a_n\mu_n} of {\mu} into a linear combination of measures {\mu_i=P_i(\mu)} whose Fourier transforms {\hat{\mu_i} = P_i(\hat{\mu})} take only values {0} and {1}. Such measures are called idempotent, because they are equivalently defined by {\mu\ast\mu=\mu}. By the argument just given it suffices to characterise idempotent measures: this explains the name of the theorem.

The most obvious example of an idempotent measure is the Haar measure {m_H} of a compact subgroup {H\leq G}. Moreover we can multiply any idempotent measure {\mu} by a character {\gamma\in\hat{G}} to obtain a measure {\gamma\mu} defined by

\displaystyle \int f \,d(\gamma\mu) = \int f\gamma\,d\mu.

This measure {\gamma\mu} will again be idempotent, as

\displaystyle  \begin{array}{rcl}  \int f\,d(\gamma \mu\ast\gamma \mu) &=& \int\int f(x+y)\gamma(x)\gamma(y)\,d\mu(x)d\mu(y) \\ &=& \int\int f(x+y) \gamma(x+y)\,d\mu(x)d\mu(y) \\ &=& \int f\gamma\,d\mu. \end{array}

If we add or subtract two idempotent measures then though we may not have again an idempotent measure we certainly have a measure whose Fourier transform takes integer values. On reflection, it feels more natural in the setting of harmonic analysis to require that {\hat{\mu}} takes values in a certain discrete subgroup than to require that it take values in {\{0,1\}}, so we relax our restriction so. The idempotent theorem states that we have already accounted for all those {\mu} such that {\hat{\mu}} is integer-valued.

Theorem 1 (The idempotent theorem) For every {\mu\in M(G)} such that {\hat{\mu}} is integer-valued there is a finite collection of compact subgroups {G_1,\dots,G_k\leq G}, characters {\gamma_1,\dots,\gamma_k\in\hat{G}}, and integers {n_1,\dots,n_k\in\mathbf{Z}} such that

\displaystyle \mu = n_1\gamma_1 m_{G_1} + \cdots + n_k\gamma_k m_{G_k}.

As a consequence we deduce a structure theorem for {\mu} with {\hat{\mu}} taking finitely many values, as we originally wanted: for every such {\mu} there is a finite collection of compact subgroups {G_1,\dots,G_k\leq G}, characters {\gamma_1,\dots,\gamma_k\in\hat{G}}, and complex numbers {a_1,\dots,a_k\in\mathbf{C}} such that

\displaystyle \mu = a_1\gamma_1 m_{G_1} + \cdots + a_k \gamma_k m_{G_k}.

The theorem was first proved in the case of {G=\mathbf{R}/\mathbf{Z}} by Helson in 1953: in this case the theorem states simply that if {\hat{\mu}} is integer-valued then it differs from some periodic function in finitely many places. In 1959 Rudin gave the theorem its present form and proved it for {(\mathbf{R}/\mathbf{Z})^d}. Finally in 1960 Cohen proved the general case, in the same paper in which he made the first substantial progress on the Littlewood problem. The proof was subsequently simplified a good deal, particularly by Amemiya and Ito in 1964. We reproduce their proof here.

First note that if {\hat{\mu}} is integer-valued then {\mu} is supported on a compact subgroup. Indeed by inner regularity there is a compact set {K} such that {|\mu|(K^c)<0.1}, the set {U} of all {\gamma\in\hat{G}} such that {|1-\gamma|<0.1/\|\mu\|} on {K} is then open, and if {\gamma\in U} then

\displaystyle \|\gamma\mu-\mu\| = \int_G |\gamma-1|\,d|\mu| \leq \int_K + \int_{K^c} < 0.1 + 0.1 < 1.

But if {\gamma\mu\neq\mu} then

\displaystyle \|\gamma\mu-\mu\|\geq \|\widehat{\gamma\mu}-\hat{\mu}\|_\infty \geq 1,

so {\gamma\mu=\mu} for all {\gamma\in U}. Thus {\Gamma=\{\gamma\in\hat{G}: \gamma\mu=\mu\}} is an open subgroup of {\hat{G}}, so by Pontryagin duality its preannihilator {\Gamma^\perp = \{g\in G: \gamma(g)=1 \text{ for all }\gamma\in\Gamma\}} is a compact subgroup of {G}. Clearly {\mu} is supported on {\Gamma^\perp}. Thus from now on we assume {G} is compact.

Fix a measure {\mu\in M(G)} and let {A=\{\gamma\mu: \gamma\in\hat{G}\}}.

Lemma 2 If {\nu} is a weak* limit point of {A} then {\|\nu\|<\|\mu\|}.

Proof: Fix {\epsilon>0} and suppose we could find {f\in C(G)} such that {\|f\|_\infty\leq 1} and {\int f\,d\nu > (1-\epsilon)\|\mu\|}. Let {\gamma\mu} be close enough to {\nu} that {\Re\int f\gamma\,d\mu > (1-\epsilon)\|\mu\|}. Write {\mu = \theta|\mu|} and {f\gamma\theta = g + ih}. Then if {Z} is the complex number {Z = \int (g+i|h|)\,d|\mu|}, then {|Z|\leq\|\mu\|} and

\displaystyle \Re Z = \int g \,d|\mu| = \Re\int f\gamma\,d\mu > (1-\epsilon)\|\mu\|,

so we must have

\displaystyle \Im Z = \int |h|\,d|\mu| \leq (1-(1-\epsilon)^2)^{1/2}\|\mu\| \leq 2\epsilon^{1/2}\|\mu\|.

Thus also

\displaystyle \int |1 - f\gamma\theta| \,d|\mu| \leq \int |1 - g|\,d|\mu| + \int |h|\,d|\mu| \leq 3\epsilon^{1/2}\|\mu\|.

But if this holds for both {\gamma_1\mu} and {\gamma_2\mu}, say with {\gamma_1\mu\neq\gamma_2\mu}, then we have

\displaystyle  1\leq \|\gamma_1\mu-\gamma_2\mu\| \leq \int |\gamma_1 - f\gamma_1\gamma_2\theta|\,d|\mu| + \int |\gamma_2 - f\gamma_1\gamma_2\theta|\,d|\mu| \leq 6\epsilon^{1/2}\|\mu\|,

so {\epsilon \geq 1/(36\|\mu\|^2)}, so

\displaystyle \|\nu\| \leq \|\mu\| - \frac{1}{36\|\mu\|}.

This proves the lemma. \Box

Lemma 3 If {\nu} is a weak* limit point of {A} then {\nu} is singular with respect to the Haar measure {m_G} of {G}.

Proof: By the Radon-Nikodym theorem we have a decomposition {\mu = f m_G + \mu_s} for some {f\in L^1(G)} and some {\mu_s} singular with respect to {m_G}. By the Riemann-Lebesgue lemma then {\nu} is a limit point of {\{\gamma\mu_s:\gamma\in\hat{G}\}}. Thus for any open set {U} and {f\in C(G)} such that {\|f\|_\infty\leq 1} and {f=0} outside of {U} we have

\displaystyle \left|\int f\,d\nu\right| \leq \sup_\gamma \left|\int f\gamma \,d\mu_s\right| \leq |\mu_s|(U),

so {|\nu|(U)\leq |\mu_s|(U)}. This inequality extends to Borel sets in the usual way, so {\nu} is singular. \Box

The theorem follows relatively painlessly from the two lemmas. Fix {\mu\in M(G)} with {\hat{\mu}} integer-valued and let {A = \{\gamma\mu: \int\gamma\,d\mu\neq 0\}}. Then {\overline{A}} is weak* compact, so because {\|\cdot\|} is lower semicontinuous in the weak* topology there is some {\nu\in\overline{A}} of minimal norm. Since {\int d\nu} is an integer different from {0} we must have {\nu\neq 0}. Thus by Lemma~2 the set {\{\gamma\nu: \int\gamma\,d\nu\neq 0\}} is finite. But this implies that

\displaystyle  \nu = (n_1 \gamma_1 + \cdots + n_k \gamma_k) m_H \ \ \ \ \ (1)

for some {n_1,\dots,n_k\in\mathbf{Z}}, {\gamma_1,\dots,\gamma_k\in\hat{G}}, and {H=\{\gamma:\gamma\nu=\nu\}^\perp} the support group of {\nu}. In particular {\nu} is absolutely continuous with respect to {m_H}, so because {\nu|_H} is in the weak* closure of {\{\gamma\mu|_H:\gamma\in\hat{G}\}} we deduce from Lemma 2 that {\nu|_H = \gamma\mu|_H} for some {\gamma}. Thus {\mu|_H} is a nonzero measure of the form (1) and we have an obvious mutually singular decomposition

\displaystyle \mu = \mu|_H + (\mu-\mu|_H).

Since {\|\mu-\mu|_H\| = \|\mu\| - \|\mu|_H\|\leq\|\mu\|-1} the theorem follows by induction.