17

Let $w$ be a word of length $2\ell$ chosen uniformly at random over the alphabet $\{x_1,x_1^{-1},x_2,x_2^{-1},\dotsc,x_k,x_k^{-1}\}$. By the reduction $\rho(w)$ I mean what you obtain by deleting substrings of the form $x_i x_i^{-1}$ or $x_i^{-1} x_i$ (and repeating, and repeating, until you cannot do it further).

What is the probability that the length of $\rho(w)$ is at most $2 s$, where $s\leq \ell$ is given?

(In the special case $s=0$, there is a good (indeed optimal in the limit $\ell\to \infty$) bound due to Kesten ("Symmetric random walks on groups", 1959).)
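For concreteness, the reduction can be computed with a stack in one pass, and the probability in question can be estimated by simulation. Here is a small Python sketch (the function names `reduce_word` and `prob_reduced_length_at_most` are mine, and letters are encoded as nonzero integers with $-i$ standing for $x_i^{-1}$):

```python
import random

def reduce_word(w):
    """Free reduction: repeatedly delete adjacent pairs x_i x_i^{-1}.
    Letters are nonzero ints; -i is the inverse of i."""
    stack = []
    for letter in w:
        if stack and stack[-1] == -letter:
            stack.pop()          # cancellation x_i x_i^{-1} or x_i^{-1} x_i
        else:
            stack.append(letter)
    return stack

def prob_reduced_length_at_most(k, ell, s, trials=100_000, seed=0):
    """Monte Carlo estimate of P(|rho(w)| <= 2s) for a uniform word
    of length 2*ell over {x_1^{±1}, ..., x_k^{±1}}."""
    rng = random.Random(seed)
    alphabet = list(range(1, k + 1)) + [-i for i in range(1, k + 1)]
    hits = 0
    for _ in range(trials):
        w = [rng.choice(alphabet) for _ in range(2 * ell)]
        if len(reduce_word(w)) <= 2 * s:
            hits += 1
    return hits / trials
```

A single stack pass suffices because free reduction is confluent: the reduced word does not depend on the order in which cancellations are performed.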

Wolfgang
  • 13,193
H A Helfgott
  • 19,290
  • Sorry if this is a silly question, but why are you only considering words of even length? Doesn't the problem still make sense if you replace $2\ell$ by $\ell$ and $2s$ by $s$? – Michael Albanese Jul 17 '12 at 13:25
  • 1
    @Michael: if you replace $2l$ by $l$ and $2s$ by $s$, you would have to assume that $l$ and $s$ are of the same parity. You cannot get a word of even length by cancellation from a word of odd length. With $2l$ and $2s$ you avoid this problem. –  Jul 17 '12 at 13:32
  • 1
    @Mark: I know, but couldn't you also ask the question for $2\ell +1$ and $2s + 1$? I was wondering if there was any reason for considering only the even length words if the odd length case also makes sense. – Michael Albanese Jul 17 '12 at 13:50
  • 6
    Completely off topic, but using characters from different fonts (script $\ell$, italic $s$) for the same purpose (indices) makes my heart sink. Anyone who says this is to avoid confusion between a $1$ and an $l$ needs glasses. – J.J. Green Jul 17 '12 at 14:59
  • 10
    @J.J. Green: In many fonts, the difference between some of 1, l, I, i, | is hard to tell. Even with glasses, which I have worn since childhood. Worse, indices tend to occur in subscripts, in smaller font. Classic example: l and I in Computer Modern, the default (La)TeX font. I find $C_I$ and $C_l$ difficult to distinguish there (and on this website, too). Yes, I can distinguish them when focusing, but a single mixup when reading can cause lots of confusion. Hence many people use \ell instead of l. If you have a better solution (note that you usually can't choose fonts in journals), please tell! – Max Horn Jul 18 '12 at 11:16
  • @Max: I think if you must use script indices then you should use them consistently -- MathTime Professional 2 (commercial) has a lower-case script, and I believe that the July release of STIX will have them too. If you can't specify the font and don't like the letter $l$ in the font you are forced to use, then just don't use it; there are several other letters available. – J.J. Green Jul 18 '12 at 16:36

3 Answers

11

Adding a random generator or inverse to the end of a nonempty word has a $\frac{2k-1}{2k}$ chance of increasing the reduced length by $1$ and a $\frac{1}{2k}$ chance of decreasing it by $1$. Appending to the empty word always increases the reduced length by $1$.

The walks from $0$ to $2t$ of length $2n$ on the nonnegative integers can be counted by the reflection principle as ${2n \choose n-t} - {2n \choose n-t-1}.$
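The reflection-principle count can be checked numerically against a brute-force enumeration of walks; the following sketch (function names are mine) compares the two for small parameters:

```python
from itertools import product
from math import comb

def nonneg_walks(n, t):
    """Brute-force count of ±1 walks of length 2n from 0 to 2t
    that never go below 0."""
    count = 0
    for steps in product((1, -1), repeat=2 * n):
        pos, ok = 0, True
        for s in steps:
            pos += s
            if pos < 0:
                ok = False
                break
        if ok and pos == 2 * t:
            count += 1
    return count

def reflection_count(n, t):
    """The reflection-principle formula C(2n, n-t) - C(2n, n-t-1)."""
    second = comb(2 * n, n - t - 1) if n - t - 1 >= 0 else 0
    return comb(2 * n, n - t) - second
```

For example, $n=2$, $t=1$ gives $\binom{4}{1}-\binom{4}{0}=3$, matching the three admissible walks $UUUD$, $UUDU$, $UDUU$.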

The probability that the reductions of the prefixes of a word follow a particular walk $W$ is $$\frac {(2k-1)^{n+t}}{(2k)^{2n}} \bigg(\frac{2k}{2k-1}\bigg)^{a(W)}$$

where $a(W)$ is the number of times the walk visits $0$ before the endpoint. $a(W)$ is between $1$ and $\min(n-t+1,n)$. Walks with a fixed value of $a(W)$ can be enumerated, but even without doing so, we get the following estimates for the probability that the length of the reduced word is $2t$:

$$1 \le \frac{2k}{2k-1}\le\frac{\operatorname{Prob}(2t)}{\left({2n \choose n-t} - {2n \choose n-t-1}\right) \frac {(2k-1)^{n+t}}{(2k)^{2n}}}\le \left(\frac{2k}{2k-1}\right)^{n-t+1} \le \exp\left(\frac{n-t+1}{2k-1}\right).$$
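This sandwich bound can be verified exactly for small parameters: compute the distribution of the reduced length by dynamic programming on the Markov chain described above (forced up at $0$, up with probability $\frac{2k-1}{2k}$ otherwise) and compare against the reflection-principle base term. A sketch, with rational arithmetic to avoid rounding (the helper names are mine):

```python
from fractions import Fraction
from math import comb

def reduced_length_dist(k, n):
    """Exact distribution of the reduced length after 2n uniform letters,
    via the Markov chain on the current reduced length r."""
    p = {0: Fraction(1)}
    up, down = Fraction(2 * k - 1, 2 * k), Fraction(1, 2 * k)
    for _ in range(2 * n):
        q = {}
        for r, pr in p.items():
            if r == 0:  # the empty word is always lengthened
                q[1] = q.get(1, Fraction(0)) + pr
            else:
                q[r + 1] = q.get(r + 1, Fraction(0)) + pr * up
                q[r - 1] = q.get(r - 1, Fraction(0)) + pr * down
        p = q
    return p

def check_bounds(k, n):
    """Check 2k/(2k-1) <= Prob(2t)/base <= (2k/(2k-1))^(n-t+1) for all t."""
    dist = reduced_length_dist(k, n)
    ratio = Fraction(2 * k, 2 * k - 1)
    for t in range(0, n + 1):
        second = comb(2 * n, n - t - 1) if n - t - 1 >= 0 else 0
        walks = comb(2 * n, n - t) - second
        base = walks * Fraction((2 * k - 1) ** (n + t), (2 * k) ** (2 * n))
        prob = dist.get(2 * t, Fraction(0))
        assert base * ratio <= prob <= base * ratio ** (n - t + 1)
    return True
```

At $t=n$ the walk is monotone, $a(W)=1$, and the lower bound is attained with equality.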

Douglas Zare
  • 27,806
9

(Edit : added something for big values of $s$).

An easy upper bound that generalizes Kesten's bound is $(\sqrt{2k-1}/k)^{2\ell} \sqrt{N_s}$, where $N_s$ is the number of reduced words of length at most $2s$. As noted in the comments, this is only interesting if $s$ is not too close to $\ell$.

Proof: Let $A$ denote the generator of the random walk on $F_k$, i.e. $A = \frac{1}{2k} \sum_{i=1}^k \left(\lambda(g_i) + \lambda(g_i^{-1})\right) \in B(\ell^2 F_k)$, where $\lambda$ is the left regular representation. To avoid confusion I denote by $|\cdot|$ the norm on $\ell^2 F_k$ and by $\|\cdot\|$ the operator norm on $B(\ell^2 F_k)$. Then Kesten's theorem says that $\|A\|=\sqrt{2k-1}/k$. The probability you are looking for is $\langle A^{2\ell} \delta_0,\sum_{|\omega|\leq 2s} \delta_\omega\rangle$, which is at most $\|A^{2\ell}\| \, |\delta_0| \, |\sum_{|\omega|\leq 2s} \delta_\omega| = (\sqrt{2k-1}/k)^{2\ell} \sqrt{N_s}$.
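As a numerical sanity check (not part of the original proof), one can compare this spectral bound with the exact probability, using $N_s = 1 + \sum_{j=1}^{2s} 2k(2k-1)^{j-1}$ for the number of reduced words of length at most $2s$ in $F_k$. The function names below are mine:

```python
from fractions import Fraction
from math import sqrt

def spectral_upper_bound(k, ell, s):
    """The bound (sqrt(2k-1)/k)^(2*ell) * sqrt(N_s), where N_s counts
    reduced words of length <= 2s: there are 2k(2k-1)^(j-1) of length j."""
    N_s = 1 + sum(2 * k * (2 * k - 1) ** (j - 1) for j in range(1, 2 * s + 1))
    return ((2 * k - 1) / k ** 2) ** ell * sqrt(N_s)

def exact_prob(k, ell, s):
    """Exact P(|rho(w)| <= 2s) by dynamic programming on the reduced
    length of the prefixes (forced up at 0, up w.p. (2k-1)/2k otherwise)."""
    p = {0: Fraction(1)}
    up, down = Fraction(2 * k - 1, 2 * k), Fraction(1, 2 * k)
    for _ in range(2 * ell):
        q = {}
        for r, pr in p.items():
            if r == 0:
                q[1] = q.get(1, Fraction(0)) + pr
            else:
                q[r + 1] = q.get(r + 1, Fraction(0)) + pr * up
                q[r - 1] = q.get(r - 1, Fraction(0)) + pr * down
        p = q
    return sum(pr for r, pr in p.items() if r <= 2 * s)
```

For $s=0$ this recovers Kesten's bound on the return probability; for larger $s$ the factor $\sqrt{N_s} \approx (2k-1)^{s}$ eats into the gain, which is why the bound degrades as $s$ approaches $\ell$.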


For large values of $s$, you can get asymptotic results by combining the law of large numbers and the central limit theorem.

Indeed, there is a natural coupling for different values of $\ell$ (consider the uniform probability on infinite sequences of elements of the alphabet, and for each $\ell$ only remember the first $2\ell$ letters). If $d_\ell$ is the random variable denoting half the length of the reduction of a word of length $2\ell$, then $d_{\ell+1} - d_\ell$ is equal to $-1$, $0$ or $1$ with probability $1/4k^2$, $(2k-1)/2k^2$ and $(2k-1)^2/4k^2$ respectively, unless $d_\ell = 0$. But Kesten's bound shows that we can forget this "unless" and work in the ideal random walk model on $\mathbb R$, and apply the law of large numbers and the central limit theorem. Namely, the probability that $d_\ell$ is less than $((k-1)/k) \ell + C \sqrt \ell$ has an explicitly computable limit for every $C$. In particular, the probability that $d_\ell<\alpha \ell$ goes to $0$ if $\alpha<(k-1)/k$, to $1/2$ if $\alpha=(k-1)/k$, and to $1$ otherwise.
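The drift $(k-1)/k$ follows from the step distribution above: $\mathbb E[d_{\ell+1}-d_\ell] = \frac{(2k-1)^2 - 1}{4k^2} = \frac{k-1}{k}$. A quick numerical check of this (not part of the original answer; helper names are mine):

```python
import random

def step_moments(k):
    """Mean and variance of d_{l+1} - d_l away from 0: values -1, 0, +1
    with probabilities 1/(4k^2), (2k-1)/(2k^2), (2k-1)^2/(4k^2)."""
    p_down = 1 / (4 * k * k)
    p_up = (2 * k - 1) ** 2 / (4 * k * k)
    mean = p_up - p_down          # equals (k-1)/k
    var = p_up + p_down - mean ** 2
    return mean, var

def simulated_drift(k, ell, trials=2000, seed=0):
    """Average of d_ell / ell over simulated words of length 2*ell,
    including the boundary behaviour at reduced length 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        r = 0  # current reduced length, in letters
        for _ in range(2 * ell):
            if r == 0 or rng.random() < (2 * k - 1) / (2 * k):
                r += 1
            else:
                r -= 1
        total += (r // 2) / ell
    return total / trials
```

The simulation includes the "unless $d_\ell = 0$" boundary, and for moderate $\ell$ its effect on the mean is already negligible, consistent with the claim that the ideal walk model captures the asymptotics.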

If you want more precise results (e.g. if $s/\ell \to \alpha < (k-1)/k$), a naive guess would be to apply large deviation techniques. But one should be careful and take into account the "unless $d_\ell = 0$".

1

This is studied in much greater generality in the paper of Cartwright and Woess (2004) [they walk on buildings, while this question is about trees], where a central limit theorem is proved. In this case it means that for $s \in [\frac{3}{4} \ell \pm O(\sqrt{\ell})]$ you have the central limit theorem estimate, where you can compute the variance explicitly. For $s$ much smaller than that, you should have a large deviation estimate, which I am not sure they derive.

Igor Rivin
  • 95,560