
I need to implement an approximation to the inverse of $x^x$, i.e. the square super-root (ssrt) function. For example, $\mathrm{ssrt}(2) \approx 1.56$ means that $1.56^{1.56} \approx 2$. I'm not as interested in any particular accuracy/bit-depth as I am in understanding what my options are in contrast to more straightforward approaches using power series.

Wolfram Alpha gives a nice symbolic solution in terms of the Lambert W function (i.e. $\ln(x)/W(\ln(x))$). Wikipedia gives the same formula, as well as the equivalent $e^{W(\ln(x))}$. Given that there's a reasonable amount of information on computing $W(x)$ [1] [2], that's technically everything needed to implement something for a variety of requirements. I know of at least two books that go into extensive detail about approximating $\ln(x)$ [3] [4], so there's even plenty of room to optimize from that direction.
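For concreteness, here's a minimal sketch of that closed-form route, using SciPy's lambertw as a stand-in reference implementation of $W$ (the function name ssrt and the test value are my own, not from any of the references):

```python
import numpy as np
from scipy.special import lambertw

def ssrt(x):
    """Square super-root: the y with y**y == x.

    Uses the equivalent form e**W(ln x), which avoids the 0/0 that
    ln(x)/W(ln(x)) hits at x == 1. Real-valued for x >= e**(-1/e),
    using the principal branch W_0.
    """
    # lambertw returns a complex value; the imaginary part is ~0 on this domain
    return np.exp(np.real(lambertw(np.log(x))))

y = ssrt(2.0)
print(y, y ** y)  # ~1.559610..., ~2.0
```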

However, I have two questions:

  1. Have approximation techniques specific to this function been published anywhere?
  2. Does it go by another name besides "square super-root" that would make searching for references a little easier?

Wikipedia and Google have turned up some references dedicated to more general "tetration" functions that include $\mathrm{ssrt}(x)$ as a special case, but most of them seem geared more toward exploring/defining the general cases.

--

  1. Corless, R.; Gonnet, G.; Hare, D.; Jeffrey, D.; Knuth, D. E. (1996), "On the Lambert W function", Advances in Computational Mathematics 5, 329–359. http://www.apmaths.uwo.ca/~djeffrey/Offprints/W-adv-cm.pdf
  2. Digital Library of Mathematical Functions, §4.13 (Lambert W-function). http://dlmf.nist.gov/4.13
  3. Crenshaw, J. W. (2000), Math Toolkit for Real-Time Programming.
  4. Hart, J. F. (1978), Computer Approximations.
  5. Chapeau-Blondeau, F.; Monir, A. (2002), "Numerical evaluation of the Lambert W function and application to generation of generalized Gaussian noise with exponent 1/2", IEEE Transactions on Signal Processing 50, 2160–2165. http://www.istia.univ-angers.fr/~chapeau/papers/lambertw.pdf
  6. Minero, P., "Fast Approximate Lambert W". http://www.machinedlearnings.com/2011/07/fast-approximate-lambert-w.html

--

Update

After doing some more research over the past few days, I still haven't found the kind of hands-on "Crenshaw style" [3] treatment of $\mathrm{ssrt}(x)$ I was hoping for, but I did find a new reference worth documenting here. On page three of [5], there's a section titled "Fast Approximation" that goes into great detail about approximating $W(x)$ in the context of noise generation. As an interesting aside, the probability density of "Gaussian noise with exponent 1/2" [in the paper] looks strikingly similar to the histogram in Kellenjb's answer to this question about detecting signal clipping.

In addition, the link given by rwong in the comments [6] is a great resource for actually implementing $W(x)$, and it even links to the author's BSD-licensed project called fastapprox, which includes the implementation described.

  • http://www.machinedlearnings.com/2011/07/fast-approximate-lambert-w.html – rwong Aug 18 '11 at 07:07
  • I asked about this on Meta, since the comments field isn't meant for extended discussions. Please suggest how we should handle these questions here: Are questions on numerical analysis on-topic? – Aug 18 '11 at 07:11
  • @datageist - The initial conclusion from the meta question was that if you want to use this numerical analysis to process DSP data, then it's on-topic. If not, then not. How does this relate to DSP? – Kevin Vermeer Aug 19 '11 at 18:01
  • @Kevin It came up in the context of developing an audio effect. – datageist Aug 19 '11 at 22:15
  • Over what range of values are you trying to approximate? $x^x$ is not monotonic, so an inverse mapping does not really exist; e.g. $0^0 = 1$ and $1^1 = 1$. You could limit it to values of $x > 1/e$, above which it is monotonic. – Mark Borgerding Aug 28 '11 at 03:44
  • @Mark, I'm just interested in "the values that make sense", most likely real values in the range $[e^{-1/e},\ e^{-1/e} + 2]$ (cf. http://bit.ly/paiCIr ). It's easy enough to come up with something; I'm just hoping someone out there already spent a month or so thinking about it and published something cooler than the straightforward approach. – datageist Sep 14 '11 at 18:30
  • Whenever I need to write a routine for the Lambert function, I usually use the approximations given in this paper and then polish up with Newton-Raphson, Halley, or any other iterative method. You could adapt this approach for inverting $x^x$... – Sep 15 '11 at 09:20
  • @datageist I am curious, what signal processing application are/were you working with that needed you to compute a super square root? – Spacey Aug 03 '12 at 22:43

1 Answer


Some numerical stabs in the dark yielded the following for an iterative approach:

We're looking for the solution $y = f(x)$ where $y^y = x$. Taking the logarithm of both sides:

$$y \ln y = \ln x$$

Dividing by $y$ and exponentiating gives a fixed-point form:

$$y = g(x,y) = e^{\frac{\ln x}{y}}$$

The desired $y$ is a fixed point of the above equation. Empirically, iterating it converges for some values of $x$, but for larger values of $x$ it oscillates or diverges.
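A quick numerical sketch of that behavior (the helper names, iteration count, and test values are mine; $x = 20$ happens to land in the divergent regime):

```python
import math

def g(x, y):
    # One step of the naive fixed-point iteration: y <- exp(ln(x)/y)
    return math.exp(math.log(x) / y)

def ssrt_fixed_point(x, y0=1.5, iters=50):
    y = y0
    for _ in range(iters):
        y = g(x, y)
    return y

print(ssrt_fixed_point(2.0))   # converges: ~1.559610
print(ssrt_fixed_point(20.0))  # bounces between ~1.50 and ~7.37, never settling
```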

Then I tried an approach similar to Newton's iterative square-root:

$$y = \frac{y_{previous} + y_{*}}{2} = \frac{y + e^{\frac{\ln x}{y}}}{2}$$

where $y_*$ stands for a non-converging but "optimistic" update: one that is exact if the current guess happens to be exact (for the square root $y^2 = x$, it's $y_* = x/y$).

This appears to converge, but very slowly at the low end of $x$ (near $x_{\min} = (\tfrac{1}{e})^{\frac{1}{e}} = e^{-1/e} \approx 0.69$).

It also looks like a good initial guess is $y_0 = \ln(x)+1$.

So I figured maybe there's a better-converging solution:

$y = (1-a) \times y + a \times g(x,y)$ for some value of $a$ that is a function of $x$.
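Here's a sketch of that relaxed iteration (the function name and iteration count are mine, and $a$ is left as a fixed knob rather than a function of $x$; $a = \tfrac{1}{2}$ reproduces the averaging scheme above):

```python
import math

def ssrt_damped(x, a=0.5, iters=60):
    # Relaxed fixed-point iteration: y <- (1-a)*y + a*exp(ln(x)/y),
    # started from the initial guess y0 = ln(x) + 1 suggested above
    # (which stays positive for all x >= e**(-1/e)).
    y = math.log(x) + 1.0
    for _ in range(iters):
        y = (1.0 - a) * y + a * math.exp(math.log(x) / y)
    return y

print(ssrt_damped(2.0))   # ~1.559610
print(ssrt_damped(20.0))  # ~2.855308 -- converges where the naive iteration cycled
```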

Then I found something interesting.

If I take a converged answer $y$ from the above approach for $y^y = x$ and then compute $y_2 = g(x, y+\epsilon) = e^{\frac{\ln(x)}{y+\epsilon}}$, it appears that $y_2 - y \approx \epsilon \times (-\ln(y))$. In other words: if we had a guess $y_1 = y + \epsilon$ with some unknown error $\epsilon$ and computed $y_2 = g(x, y_1)$, then $(y_2 - y) \approx \epsilon \times (-\ln(y)) = (y_1 - y) \times (-\ln(y))$. (Just to clarify, I have no analysis to verify this; the numbers just popped out of some numerical evaluation I performed.)
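For what it's worth, that factor is exactly what linearizing $g$ about the fixed point predicts (this step is an editorial addition, not part of the original observation). The one-step error gets multiplied by $g'$, and at the fixed point $y_\star$, where $g(x, y_\star) = y_\star$ and $\ln x = y_\star \ln y_\star$:

$$g'(y) = e^{\frac{\ln x}{y}} \cdot \left(-\frac{\ln x}{y^2}\right) \quad\Longrightarrow\quad g'(y_\star) = -\frac{y_\star \ln x}{y_\star^2} = -\frac{\ln x}{y_\star} = -\ln y_\star$$

so $(y_2 - y) \approx (y_1 - y) \times (-\ln y)$ is the expected first-order behavior. (This also explains why the naive iteration diverges once $|\ln y| > 1$.)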

Solving that relation for $y$ (keeping only the terms linear in the error) gives $y = \dfrac{y_2 + \ln(y) \times y_1}{1 + \ln(y)}$; using $\ln(y_1)$ in place of the unknown $\ln(y)$ yields this iterative approximation:

$$\begin{align*} y[n+1] &= \frac{g(x,y[n]) + \ln(y[n])\times y[n]}{1+\ln(y[n])}\\ &= \frac{e^{\frac{\ln(x)}{y[n]}} + \ln(y[n])\times y[n]}{1+\ln(y[n])} \end{align*}$$

This appears to work very well with the initial guess $y_0 = 1+\ln(x)$, typically converging within 4 or 5 iterations.
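A minimal sketch of that recurrence (the function name, stopping tolerance, iteration cap, and test values are mine):

```python
import math

def ssrt_newtonish(x, tol=1e-12, max_iter=20):
    # y[n+1] = (exp(ln(x)/y[n]) + ln(y[n])*y[n]) / (1 + ln(y[n])),
    # started from the initial guess y0 = 1 + ln(x).
    y = 1.0 + math.log(x)
    for _ in range(max_iter):
        lny = math.log(y)
        y_next = (math.exp(math.log(x) / y) + lny * y) / (1.0 + lny)
        if abs(y_next - y) <= tol * abs(y_next):
            return y_next
        y = y_next
    return y

for x in (0.8, 2.0, 100.0):
    y = ssrt_newtonish(x)
    print(f"x = {x:6}: y = {y:.12f}, y**y = {y ** y:.12f}")
```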

(Someone could probably show that this is equivalent to Newton-Raphson in some way, but I think it's beyond my capability.)

Jason S