
Random models for the primes (such as Cramér's model) have been used extensively to give informal justifications of various conjectures involving primes. It is therefore important to understand in what sense the sequence of primes 2, 3, 5, 7, ..., or, equivalently, the sequence 01101010001..., where $a_n=1$ if $n$ is prime and $a_n=0$ otherwise, can be considered a "random sequence". In fact, even defining what exactly a "random sequence" is is a non-trivial problem.

The first notable definition is due to Richard von Mises, and states that an infinite binary sequence $x=x_1x_2...$ is random if (a) the limiting density of 1s is $1/2$, and (b) the limiting density of 1s is $1/2$ in every admissible subsequence of $x$. Here, an "admissible" subsequence is defined informally via a selection rule which, given $x_1, x_2, ..., x_{n-1}$, decides whether to include $x_n$ in the subsequence. For example, we can include every $x_n$ that immediately follows $001$. However, if we allowed any subsequence to be admissible, we could choose one consisting of $1$s only, and property (b) would fail. In a sense, choosing such a subsequence would be "cheating". Von Mises, however, never gave a fully formal definition of "admissible". Church later suggested that "rule" should mean "Turing machine", so that "admissible subsequence" just means "computable subsequence". This looked like a good definition until Ville showed that there are sequences, random in the Mises-Church sense, such that for every $n$ the number of 1s among $x_1, ..., x_n$ is not less than the number of $0$s. In a sense, the Mises-Church definition of a random sequence captures only the law of large numbers, but fails other statistical tests. Stronger definitions were later suggested which "pass all statistical tests at once".
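To illustrate the kind of selection rule von Mises had in mind, here is a minimal Python sketch (my addition, not part of the original argument): it selects every bit that immediately follows the pattern $001$ in a fair coin-flip sequence, and checks that the density of 1s among the selected bits stays near $1/2$. The pattern and the sequence length are arbitrary choices.

```python
import random

def select_after_pattern(bits, pattern="001"):
    """Selection rule: keep x_n whenever the preceding bits end with `pattern`."""
    k = len(pattern)
    return [bits[i] for i in range(k, len(bits))
            if "".join(map(str, bits[i - k:i])) == pattern]

random.seed(0)
bits = [random.randint(0, 1) for _ in range(200_000)]  # fair coin flips
sub = select_after_pattern(bits)
density = sum(sub) / len(sub)
print(round(density, 2))  # stays close to 0.5, as property (b) demands
```

A rule that instead selected $x_n$ by peeking at $x_n$ itself (the "cheating" above) could trivially achieve density 1, which is exactly what the informal restriction rules out.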

Interestingly, the primes also fail "other statistical tests". For example, Chebyshev's bias is the phenomenon that for most $N$ there are more primes less than $N$ of the form $4k+3$ than of the form $4k+1$. So the most we can hope for is that the primes pass "the law of large numbers only", that is, that they are random in a sense similar to von Mises's.
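A quick numeric illustration of the bias (my sketch, not from the post): count primes below a cutoff in each residue class mod 4. The cutoff $10^4$ is an arbitrary choice; the $4k+3$ class leads there, as it does for most cutoffs.

```python
def primes_below(n):
    """Primes below n via a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n, p)))
    return [i for i, b in enumerate(sieve) if b]

ps = primes_below(10_000)
c1 = sum(1 for p in ps if p % 4 == 1)  # primes of the form 4k+1
c3 = sum(1 for p in ps if p % 4 == 3)  # primes of the form 4k+3
print(c1, c3)  # the 4k+3 class leads at this cutoff
```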

Conjecture 1 (informal). The prime sequence $01101010001...$ is "random" in the sense that (a) $\lim\limits_{n\to\infty}\frac{P(n)}{n/\log n}=1$, where $P(n)$ is the number of primes less than $n$, and (b) the same holds in any "admissible" subsequence of it.

Part (a) is just the prime number theorem, so the main content is in (b). The key question here is what an "admissible" subsequence should mean in the case of primes. Informally, it should be a "non-cheating" one, that is, there should be no obvious reason for a different density of primes within it. In particular, the sequence of all even numbers is "cheating" (no primes), and so is the sequence of all odd numbers (more primes). In fact, we expect about equal proportions of odd and even numbers in an "admissible" sequence, and, more generally
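As a numeric sanity check of part (a) (my sketch, not part of the question), the ratio $P(n)/(n/\log n)$ drifts toward 1, though quite slowly; it is still about $1.08$ at $n=10^6$.

```python
import math

def prime_count(n):
    """Number of primes below n, via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n, p)))
    return sum(sieve)

for n in (10_000, 100_000, 1_000_000):
    print(n, prime_count(n) / (n / math.log(n)))  # ratios slowly decrease toward 1
```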

(*) for every fixed set of primes $p_1,...,p_n$, the limiting density of numbers not divisible by any of $p_1,...,p_n$ in an admissible subsequence should be $(1-\frac{1}{p_1})(1-\frac{1}{p_2})...(1-\frac{1}{p_n})$.
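For the trivial subsequence $A=\{1,2,\dots,N\}$, condition (*) can be verified directly; a small sketch of mine, where the primes $2, 3$ and the cutoff $N$ are arbitrary choices ($N$ is taken to be a multiple of $6$ so the density is exact):

```python
# Density of n <= N divisible by neither 2 nor 3;
# condition (*) predicts (1 - 1/2)(1 - 1/3) = 1/3.
N = 300_000  # arbitrary cutoff, a multiple of 6
count = sum(1 for n in range(1, N + 1) if n % 2 and n % 3)
print(count / N)  # exactly 1/3: two residues (1 and 5) out of every 6
```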

Obviously, there are other ways of "cheating", and it is possible to construct a sequence consisting entirely of composite numbers that satisfies (*). However, it seems that doing this requires at least the operation of multiplication, which (conjecturally) cannot be performed in linear time. So, in parallel with Church's idea, why not define "admissible" to mean "computable in linear time"?

Conjecture 2 (formal). Let $T$ be a Turing machine which, given a natural number $n$ (in binary) and an oracle which for any $k<n$ returns whether $k$ is prime, decides in linear time whether to accept $n$. Let $A$ be the set of numbers accepted by $T$. Assume that $A$ is infinite and satisfies (*). Then $\lim\limits_{n\to\infty}\frac{P_A(n)}{\sum\limits_{m\in A,\, m \leq n} (\log m)^{-1}}=1$, where $P_A(n)$ is the number of primes among the elements of $A$ less than $n$.
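For the trivial choice $A = \{2, 3, \dots\}$ (accept everything), the normalization in Conjecture 2 reduces to the classical approximation $P(n) \approx \sum_{m\le n} 1/\log m$, which can be checked numerically (my sketch; the cutoff is an arbitrary choice):

```python
import math

def prime_count(n):
    """Number of primes below n, via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n, p)))
    return sum(sieve)

n = 100_000
predicted = sum(1 / math.log(m) for m in range(2, n + 1))
print(prime_count(n), round(predicted))  # agree to within about half a percent
```

Note that this normalization is noticeably better than $n/\log n$, which is exactly the point raised in the comments below.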

Conjecture 2, if correct, would mean that the primes are (almost) random in the von Mises sense, and would imply the truth of many conjectures about primes, including, for example, the twin prime conjecture (take $A$ to consist of the numbers of the form $p+2$ for $p$ prime, together with some carefully selected composite numbers so that (*) holds).
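To make the reduction concrete, here is a sketch (my addition) counting how many elements of $\{p+2 : p \text{ prime}\}$ below a cutoff are themselves prime, i.e. counting twin prime pairs; Conjecture 2 would force this count to grow without bound.

```python
def primes_below(n):
    """Primes below n via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n, p)))
    return [i for i, b in enumerate(sieve) if b]

ps = set(primes_below(100_000))
twin_pairs = sum(1 for p in ps if p + 2 in ps)  # primes p with p + 2 also prime
print(twin_pairs)
```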

My questions are: is Conjecture 2 wrong, with an obvious counterexample? If not, is it well known? If it is indeed wrong, are there other reasonable ways to formalize the concept of an "admissible subsequence" in Conjecture 1?

Bogdan Grechuk
  • Perfect squares are almost a counterexample. They can be detected in time $b\log b\cdot 2^{O(\log^*b)}$ where $b=\log n$ is the length of the input, and there is not really any reason they could not be testable in linear time (though it does not seem all that likely). Perhaps some approximate notion of “square” can be checked in linear time unconditionally. – Emil Jeřábek Mar 16 '16 at 18:45
  • Oh, now I see that you even mention multiplication algorithms in the post. So, IMHO basing any conjecture on the tiny (if any) difference between the running time of multiplication and linear time is asking for trouble. – Emil Jeřábek Mar 16 '16 at 18:49
  • See http://mathoverflow.net/questions/57543/walsh-fourier-transform-of-the-m%C3%B6bius-function for a similar idea that actually works. Note that the complexity class there is much more restricted ($\mathrm{AC}^0$). – Emil Jeřábek Mar 16 '16 at 18:53
  • Possible counterexample: numbers of the form $a2^b+1$ or $a2^b+2$, where (say) $a\le b$. These can be tested in linear time, and satisfy $(*)$, but I’d assume they include less primes than what the conjecture asks for. – Emil Jeřábek Mar 16 '16 at 21:39
  • Dear Emil, thank you for comments. I strongly believe that multiplication is not in linear time, and the "tiny difference" implies that the conjecture is almost the best possible we can hope. Thank you for the reference, but the complexity class there is indeed too restricted. It is very interesting why you think that the numbers $a2^b+1$ contain less primes than random odd numbers, my intuition is completely opposite. – Bogdan Grechuk Mar 17 '16 at 11:03
  • First, even if these numbers were as likely as any other to be prime, their expected number would be less than conjectured. The model of random primes would suggest that $PA(n)$ should be around $\sum_{m\in A,m\le n}(\log m)^{-1}$. For typical $A$, this is roughly $A(n)/\log n$, noticeably less than $A(n)/\log A(n)$; but even that approximation is wrong if the sequence is too sparse or irregular. Second, irrespective of that, I’m not a number theorist, but yes, I’d expect these numbers less likely to be prime. For example, consider the case $a=1$ (i.e. Fermat numbers). ... – Emil Jeřábek Mar 17 '16 at 14:54
  • ... Blind application of the random model would predict there to be roughly $\log\log n$ Fermat primes below $n$, but this ignores the more structure they have (e.g., $2^n+1$ can only be prime if $n$ is a power of $2$!), and they are actually conjectured to be only finitely many. Likewise, there are many constants $a$ such that there are no primes of the form $a2^b+1$ (Sierpiński numbers), contra the $\log\log n$ prediction. – Emil Jeřábek Mar 17 '16 at 14:59
  • Thank you for explaining your intuition. The formula in Conjecture 2 has been corrected to reflect the prediction of random models. Concerning values of $a$ such that $a=1$ or $a$ is a Sierpiński number, condition (*) is very far from holding for them (for $a=78557$, all such numbers have a factor in the covering set $\{3, 5, 7, 13, 19, 37, 73\}$). When we add other values of $a$ sufficient for (*) to hold, there is hope that the prime density returns to the conjectured level. – Bogdan Grechuk Mar 18 '16 at 10:05

0 Answers