
Let $s = (s_1,...,s_n) \in (0,1)^n$. Suppose that $x$ is a real number and that $x \in (\min_j s_j, \max_j s_j)$. It's not difficult to show, by a direct construction, that

There exists a probability vector $p = (p_1,...,p_n) \in (0,1)^n$ such that $s \cdot p = x$,

where $s \cdot p$ denotes the inner product $\sum_j s_j p_j$, and $p$ is a probability vector if $\sum_j p_j = 1$.
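For concreteness, here is one such direct construction sketched in Python (the function name and the mixing parameter `eps` are illustrative, and this is only one of several possible constructions): solve the two-point problem supported on the argmin/argmax of $s$, then mix in a little uniform mass so every entry is strictly positive.

```python
import numpy as np

def construct_p(s, x, eps=1e-3):
    """Build a probability vector p in (0,1)^n with s . p = x.

    Mixes a two-point solution supported on the argmin/argmax of s
    with the uniform vector so that every entry is strictly positive.
    eps must be small enough that the adjusted target stays feasible.
    """
    s = np.asarray(s, dtype=float)
    n = len(s)
    i, j = np.argmin(s), np.argmax(s)
    # Target for the two-point part, after reserving mass eps for the
    # uniform part: (1 - eps) * x_adj + eps * mean(s) = x.
    x_adj = (x - eps * s.mean()) / (1 - eps)
    assert s[i] < x_adj < s[j], "eps too large for this x; shrink eps"
    t = (s[j] - x_adj) / (s[j] - s[i])   # weight on the min coordinate
    p = np.full(n, eps / n)
    p[i] += (1 - eps) * t
    p[j] += (1 - eps) * (1 - t)
    return p

s = [0.2, 0.5, 0.9, 0.4]
p = construct_p(s, x=0.6)
print(np.dot(s, p), p.sum(), (0 < p).all() and (p < 1).all())
```

By linearity, $s \cdot p = \varepsilon \cdot \bar s + (1-\varepsilon) x_{\text{adj}} = x$ exactly, and the assertion guarantees $t \in (0,1)$, so every coordinate lies in $(0,1)$.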

I'm wondering if this result can be demonstrated using a separating hyperplane argument. Here's the idea.

Suppose for contradiction that no such $p \in (0,1)^n$ exists. Then, with $X = \{p \in (0,1)^n: p \ \text{a probability vector} \}$ and $Y = \{p: s \cdot p = x\}$, we have $X \cap Y = \emptyset$. Both sets $X$ and $Y$ are convex. So, by the separation theorem, there is a vector $q \in \mathbb{R}^n$ and $b \in \mathbb{R}$ such that $q \cdot p \leq b$ for all $p \in X$ and $q \cdot p \geq b$ for all $p \in Y$. I don't see a way to derive a contradiction from this, however, so any help here would be appreciated.

I would also like to generalize this argument, if it can be made to work, to the countably infinite case, i.e. replacing $n$ with $\mathbb{N}$. Any suggestions about how to achieve that would also be appreciated.

2 Answers


Lemma: $q$ and $s$ are parallel. Geometrically this is somewhat obvious(?): $Y$ is an $(n-1)$-dimensional hyperplane in $n$-space with normal vector $s$, so any hyperplane that keeps $Y$ wholly on one side must be parallel to $Y$, i.e. have a normal parallel to the same $s$. The proof is a little tedious and is at the end.

Assuming the Lemma, WLOG we can take $q=s$, which implies $b \le x$ in order to satisfy $\forall p \in Y: q \cdot p \ge b$ (since $q \cdot p = s \cdot p = x$ for every $p \in Y$).

Now construct $p$ to be the probability vector with $1-\delta$ at an $\arg\max_j s_j$ position and ${\delta \over n-1}$ at every other position, for some small $\delta > 0$. This $p \in X$ (i.e. $p_j \in (0,1)$ and $\sum_j p_j = 1$), and $s \cdot p > (1-\delta) \max_j s_j$, the inequality being strict because all the other terms of the dot product are strictly positive, since ${\delta \over n-1} > 0$ and $s_j > 0$.

Meanwhile, $x$ is a given constant, and $x < \max_j s_j$ implies we can choose $\delta > 0$ s.t. $(1 - \delta) \max_j s_j > x$. Thus, $s \cdot p > (1 - \delta) \max_j s_j > x \ge b$. This contradicts the separating hyperplane property of $\forall p \in X: s \cdot p \le b$.
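The construction can be checked numerically (a quick sketch; the name `near_vertex_p` and the particular choice of $\delta$ are mine):

```python
import numpy as np

def near_vertex_p(s, x):
    """Probability vector with mass 1 - delta at the argmax of s and
    delta/(n-1) on every other coordinate, with delta chosen so that
    (1 - delta) * max(s) > x.  Then s . p > x, contradicting the
    separating-hyperplane bound s . p <= b <= x."""
    s = np.asarray(s, dtype=float)
    n, j = len(s), np.argmax(s)
    delta = 0.5 * (1 - x / s[j])          # any delta in (0, 1 - x/max(s)) works
    p = np.full(n, delta / (n - 1))
    p[j] = 1 - delta
    return p, delta

s, x = [0.2, 0.5, 0.9, 0.4], 0.6
p, delta = near_vertex_p(s, x)
print(np.dot(s, p) > (1 - delta) * max(s) > x)   # True
```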

Comments: (1) Sorry this still required a construction. :) (2) I think you needed to define $X$ not simply as $(0,1)^n,$ but rather as $X=\{p: p \in (0,1)^n \text{ and } p \text{ is a probability vector}\}$.

Proof of Lemma: Decompose the hyperplane's normal $q$ into components parallel and orthogonal to $s$, i.e. $q = as + t$ for some real $a$ and some $t \perp s$ (i.e. $s \cdot t = 0$). By definition of $Y$, for any real $c$, we have $y = {x \over |s|^2} s + c t \in Y$, since $s \cdot y = x + 0 = x$. Meanwhile, $q \cdot y = ax+c|t|^2$. Since $a, x, b$ are constants while $c$ ranges over all reals, the only way we can have $q \cdot y = ax+c|t|^2 \ge b$ for every $c$ (in particular as $c \to -\infty$) is if $|t|^2=0$, i.e. $t=0$, i.e. $q = as$, i.e. $q \parallel s$. QED
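A numerical illustration of the Lemma's proof: for any $q$ not parallel to $s$, the term $c|t|^2$ makes $q \cdot y$ unbounded below over $Y$, so no bound $q \cdot y \ge b$ can hold. (Sketch; the particular $q$ below is an arbitrary non-parallel choice.)

```python
import numpy as np

# If q is NOT parallel to s, then q . y is unbounded over the
# hyperplane Y = {y : s . y = x}, so it cannot stay >= b on all of Y.
s = np.array([0.2, 0.5, 0.9, 0.4])
q = np.array([1.0, 0.0, 0.0, 0.0])       # deliberately not parallel to s
x = 0.6

a = np.dot(q, s) / np.dot(s, s)          # coefficient of the parallel part
t = q - a * s                            # orthogonal part; s . t = 0, t != 0
for c in [0.0, -10.0, -1000.0]:
    y = (x / np.dot(s, s)) * s + c * t   # a point of Y for every c
    print(np.dot(s, y), np.dot(q, y))    # s . y stays at x; q . y decreases
```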

antkam
  • Thanks. You're right about $X$, and I will edit. I'll need some time to study this, and, in particular, to see whether it can be generalized to the infinite dimensional case. – user435571 Aug 10 '18 at 16:50
  • I am by no means an expert in the infinite dim case, but one problem we run into immediately is that $\max_j s_j$ might not exist, e.g. when $s_j = 1 - (1/2)^j$. You'd need to bring in limits and suprema etc. Your statement is probably still true, since you're careful to specify $x \in (\min, \max)$ and not $x \in [\min, \max]$. Still, my proof would not work directly because you cannot find $\arg\max_j s_j$ to construct my $p$. – antkam Aug 10 '18 at 17:04
  • BTW my first proof incorrectly used a boundary point for the construction of $p$ -- such a boundary $p \notin X$. This has now been fixed to use an interior $p$. – antkam Aug 10 '18 at 18:01

I provide a proof for the general case using a connectedness argument, and then comment further on the separating hyperplane approach.

The proof is based on a generalized intermediate value theorem, using the fact that the inner product is continuous.


Proof 1:

Consider the function: $$ f(u)={\langle s,u\rangle}, $$ defined on the set $$ \mathcal D=\{u\in(0,1)^{\mathcal U}:\|u\|_1=1\}. $$ Some examples of $\mathcal U$ are $\mathbb N$ and $\{1,\dots,n\}$. For $\mathcal U=\mathbb N$, $\mathcal D$ is a subset of the $\ell_1$ sequence space, i.e., the space of all absolutely summable sequences.

We use the idea behind the intermediate value theorem, namely that continuous maps preserve connectedness. If the set $\mathcal D$ is connected and $f$ is continuous, then $f(\mathcal D)$ is a connected subset of $\mathbb R$, i.e., an interval; since $f(\mathcal D)$ contains values arbitrarily close to both $s_{\inf}$ and $s_{\sup}$, it covers the whole interval $(s_{\inf},s_{\sup})$. Therefore we need to prove the following

  • $\mathcal D$ is a connected subset (of a topological space).
  • $f(u)$ is continuous.

These are trivially verified for finite $\mathcal U$.

Let's consider the case $\mathcal U=\mathbb N$. $\mathcal D$ is a convex set and hence (path-)connected. Moreover, $\mathcal D$ is a subset of the $\ell_1$ sequence space, which is a Banach space.

It is enough then to prove the continuity of $f(u)$ in $\ell_1$ over $\mathcal D$.

First of all, one should check that the inner product is well defined; it is, since $s_j\in(0,1)$ gives $\|s\|_\infty\leq 1$ and $u\in\ell_1$. Moreover, by Hölder's inequality, $$ |\langle s,u \rangle-\langle s,v\rangle|=|\langle s,u-v\rangle|\leq \|s\|_\infty\|u-v\|_1\leq \|u-v\|_1. $$ So the function is indeed $1$-Lipschitz and therefore continuous on the $\ell_1$-space. Hence, the result holds for countably infinite sequences.
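The $1$-Lipschitz bound $|\langle s,u\rangle - \langle s,v\rangle| \le \|u-v\|_1$ can be spot-checked numerically, using long truncations as a finite stand-in for $\ell_1$ sequences (a sketch; the sampling scheme is an arbitrary choice):

```python
import numpy as np

# Spot-check |<s,u> - <s,v>| <= ||u - v||_1 for s in (0,1)^N,
# sampling random probability vectors u, v of large dimension n.
rng = np.random.default_rng(0)
n = 10_000
s = rng.uniform(0, 1, n)                 # entries in (0,1), so ||s||_inf <= 1
for _ in range(5):
    u = rng.exponential(size=n); u /= u.sum()   # random points of D
    v = rng.exponential(size=n); v /= v.sum()
    lhs = abs(np.dot(s, u) - np.dot(s, v))
    rhs = np.abs(u - v).sum()
    assert lhs <= rhs + 1e-12
print("bound holds on all samples")
```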


Separating Hyperplane:

Let's consider the proof based on separating hyperplane. The lemma provided in the other answer can be adapted to the general case as well.

Lemma. Let $f$ and $g$ be two non-zero linear functions from a space $\mathcal X$ to $\mathbb R$. Suppose that for all $u$ inside the hyperplane $L$ defined as $f(u)=c_1$, the function $g(u)$ is bounded from above (or below) by a constant $b$, i.e., $g(u)\leq b$ (or $g(u)\geq b$). Then there is a constant $c_2$ such that for all $u\in \mathcal X$ $$ g(u)=c_2 f(u). $$

Proof. It is enough to prove that $g$ and $f$ have the same null space, since two linear functionals with the same null space are proportional.

To prove this we show that $g(u)$ is also constant over the hyperplane $L$, which implies that $L-u_1$, for any $u_1\in L$, is the common null space of both $f$ and $g$.

Suppose that $g(u)$ is not constant over $L$, i.e. over the set where $f(u)=c_1$. Then there must be some $u_1$ and $u_2$ such that $f(u_1)=f(u_2)=c_1$ and $g(u_1)\neq g(u_2)$. Without loss of generality assume that $g(u_1)< g(u_2)$.

Note that $f(tu_1+(1-t)u_2)=c_1$ for all $t\in\mathbb R$. We have: $$ g(tu_1+(1-t)u_2)=g(u_2)+t(g(u_1)-g(u_2)). $$ But this can be made arbitrarily negative by taking $t$ large (or arbitrarily large by taking $t\to-\infty$), violating the boundedness condition. Therefore $g$ must be constant over the hyperplane $L$. $\blacksquare$

For the next step, since we are working in the infinite-dimensional $\ell_1$-space, we use the Hahn–Banach separation theorem for the two sets $X$ and $Y$. Both sets are convex and non-empty.

Problem: The set $X$ is neither closed nor open, and therefore the theorem cannot be applied here.

But if one could find a linear function $h$ and a constant $b$ such that $h(u)<b$ for all $u\in X$ and $h(u)\geq b$ for all $u\in Y$, then the rest of the proof would follow accordingly. Since $Y$ is given by $\langle s,u\rangle=x$, the Lemma above implies that $h(u)=c\langle s,u\rangle$ for a real number $c$, which we may normalize to $1$. Therefore $x\geq b$ and $\langle s,u\rangle<x$ for all $u\in X$. But this is impossible, since $$ \sup_{u\in X}\langle s,u\rangle=s_{\sup}>x. $$

So the open issue at the moment is to circumvent this problem. Note that changing $(0,1)^{\mathbb N}$ to $[0,1]^{\mathbb N}$ does not work here, because although the set is then closed, it is not compact.

Arash