7

I originally posted this question here:
https://math.stackexchange.com/questions/1296199/combinatorial-formula-for-the-number-of-different-words :

I am interested in the asymptotic behaviour of the following quantity:

Suppose we have $m$ distinct letters and we are allowed to use each letter at most $d$ times. What is the number of distinct words of length $k$ that can be formed?

Indeed, one can find a recurrence formula, but I do not quite see how one can find a uniform asymptotic for all $m,d,k.$

Edit: After discussion in the comments, I can reduce my problem to the range, $m\ge k$ and $d\ll m.$

sergey
  • 221
  • I misread words in the title as worlds. Too bad. – Włodzimierz Holsztyński May 24 '15 at 21:47
  • @Włodzimierz Holsztyński: I am interested in your proposed formulation too:) – sergey May 24 '15 at 22:02
  • You will probably need to make more specific assumptions on how $m,d,k$ are (asymptotically) related. – Christian Remling May 24 '15 at 22:17
  • Mine would be not so much from another world as off the wall. Your q. is harder than I thought at first. Thus I up-voted it. Now I expect the specialists to answer your question, possibly using some Bernoulli numbers or similar--let me see (let them sweat :-). – Włodzimierz Holsztyński May 24 '15 at 22:18
  • Well, it looked easy to me at first sight too... @Christian Remling: i can probably assume that d is small compared to k and m. – sergey May 24 '15 at 22:21
  • 2
    For $d$ slightly greater than $k/m$ it is natural to use a multidimensional normal approximation. I don't know whether that is progress for the ranges of values you care about. – Douglas Zare May 24 '15 at 22:25
  • Since $a_{d,m}(k)=k![z^k] \left(\sum_{i=0}^d {z^i \over i!}\right)^m$ the asymptotics is given given by the ``large powers'' case of the saddlepoint method, see e.g. Prop. VIII 7 and Thm. VIII 8 of Flajolet and Sedgewick, Analytic Combinatorics – esg Jun 01 '15 at 19:01

1 Answers1

3

After rescaling by the number of unrestricted words $m^k$, this asks for the probability that a multinomial distribution with equal probabilities will have largest count at most $d$. This has been studied before.

In this question about tail bounds I gave some coarse estimates, but you might find tergi's answer more helpful, with the references to

Algorithm AS 145: Exact Distribution of the Largest Multinomial Frequency P. R. Freeman Journal of the Royal Statistical Society. Series C (Applied Statistics) Vol. 28, No. 3 (1979), pp. 333-336

Bruce Levin, 1983, "On Calculations Involving the Maximum Cell Frequency."

See also

Robert E. Greenwood and Mark O. Glasgow. Distribution of Maximum and Minimum Frequencies in a Sample Drawn from a Multinomial Distribution Ann. Math. Statist. Volume 21, Number 3 (1950), 416-424.

Charles J. Corrado The exact distribution of the maximum, minimum and the range of Multinomial/Dirichlet and Multivariate Hypergeometric frequencies Statistics and Computing July 2011, Volume 21, Issue 3, pp 349-359.

Douglas Zare
  • 27,806