$\DeclareMathOperator{\dist}{dist}
\DeclareMathOperator{\conv}{conv}
\DeclareMathOperator{\aff}{aff}$
There is a unique point $y_j \in \conv(X_j)$ such that
$\|z-y_j\| = \dist(z, \conv(X_j))$ because $\conv(X_j)$ is a compact and convex set.
As you mention, compactness means that the function $d(x) = \|x-z\|$ attains its minimum on $\conv(X_j)$. If two distinct points $y_1$ and $y_2$ were at the same minimal distance from $z$, then their midpoint $\frac{y_1 + y_2}{2}$ would be strictly closer to $z$ (consider the isosceles triangle formed by $z$, $y_1$, and $y_2$). But the line segment between any two points of a convex set lies in the set, so the midpoint would also belong to $\conv(X_j)$, contradicting minimality. Hence the minimum of $d$ is attained at a unique point.
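The midpoint argument is easy to check numerically; here is a minimal sketch (the points $z$, $y_1$, $y_2$ are arbitrary illustrative choices):

```python
import numpy as np

# Two distinct points equidistant from z; their midpoint is strictly
# closer to z (the isosceles-triangle argument from the text).
z = np.array([0.0, 2.0])
y1 = np.array([-1.0, 0.0])
y2 = np.array([1.0, 0.0])

d1 = np.linalg.norm(z - y1)
d2 = np.linalg.norm(z - y2)
mid = (y1 + y2) / 2                  # lies in any convex set containing y1, y2
dmid = np.linalg.norm(z - mid)

assert np.isclose(d1, d2)            # equidistant
assert dmid < d1                     # midpoint strictly closer
```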
Since each point $y_j$ is in $\conv(X_j)$, for any point $x$, $\|x - y_j\| \geq \dist(x, \conv(X_j))$ (because $\dist(x, \conv(X_j))$ is the minimum distance of $x$ to any point of $\conv(X_j)$). So if
$$
f(x) = \sum_{j=1}^r \dist^2(x,\conv(X_j))
$$
and
$$
g(x) = \sum_{j=1}^r \|x - y_j\|^2
$$
then $g(x) \geq f(x)$ for all $x$.
On the other hand, at $z$ we have $\|z - y_j\| = \dist(z, \conv(X_j))$ for every $j$ (by the definition of the points $y_j$), so $g(z) = f(z)$. Since $z$ minimizes $f$ and $g(x) \geq f(x) \geq f(z) = g(z)$ for all $x$, this common value is the minimum of both functions.
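A quick numerical check of the sandwich $g \geq f$ with equality at $z$: the helper `nearest_in_hull` below is written just for this sketch (it minimizes over convex-combination weights; it is not part of the original argument), and the point sets and $z$ are random.

```python
import numpy as np
from scipy.optimize import minimize

def nearest_in_hull(X, z):
    """Nearest point of conv(X) to z: minimize ||w @ X - z||^2 over the
    probability simplex (w >= 0, sum w = 1)."""
    n = len(X)
    res = minimize(lambda w: np.sum((w @ X - z) ** 2),
                   np.full(n, 1.0 / n),
                   bounds=[(0.0, 1.0)] * n,
                   constraints={'type': 'eq', 'fun': lambda w: w.sum() - 1.0})
    return res.x @ X

rng = np.random.default_rng(0)
d, r = 3, 4
parts = [rng.normal(size=(5, d)) + 3.0 * rng.normal(size=d) for _ in range(r)]
z = rng.normal(size=d)
ys = [nearest_in_hull(X, z) for X in parts]       # the points y_j

def f(x):   # sum of squared distances to the hulls
    return sum(np.sum((x - nearest_in_hull(X, x)) ** 2) for X in parts)

def g(x):   # sum of squared distances to the fixed points y_j
    return sum(np.sum((x - y) ** 2) for y in ys)

assert np.isclose(f(z), g(z), atol=1e-6)          # equality at z
for _ in range(10):
    x = rng.normal(size=d)
    assert f(x) <= g(x) + 1e-6                    # g dominates f everywhere
```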
- An important condition that was somewhat glossed over is that each $Y_j \subseteq X_j$ is chosen to be a minimal set such that $y_j$ lies in the relative interior of $\conv(Y_j)$.
Consider each flat $\aff(Y_j)$, and let $\omega_j$ be the point of the flat closest to $z$; the vector from $z$ to $\omega_j$ is orthogonal to the flat, i.e., $(\omega_j - z) \perp \aff(Y_j)$. If $\omega_j$ lies in $\conv(Y_j) \subset \aff(Y_j)$, then it must equal $y_j$. If it does not, then $y_j$ would have to lie on the part of the boundary of $\conv(Y_j)$ closest to $\omega_j$, contradicting the fact that $y_j$ is in the relative interior of $\conv(Y_j)$. (In that case $Y_j$ would have been chosen smaller, so that $\conv(Y_j)$ is exactly the boundary face in question and $\aff(Y_j)$ has smaller dimension.)
Thus, the vector $y_j - z$ is orthogonal to $\aff(Y_j)$ for every $j$, and for any $v \in \aff(Y_j)$ the scalar product $\langle v - z, y_j - z\rangle = \langle v - y_j, y_j - z\rangle + \|y_j - z\|^2 = \|y_j - z\|^2$ is positive (provided $z \neq y_j$).
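This orthogonality computation can be illustrated numerically: project $z$ onto a flat spanned by a few points and check that $\langle v - z, y - z\rangle = \|y - z\|^2$ for every $v$ in the flat (random data; `lstsq` does the projection):

```python
import numpy as np

rng = np.random.default_rng(1)

# A flat aff(Y) spanned by 3 affinely independent points in R^5, and an
# outside point z; y is the orthogonal projection of z onto the flat.
Y = rng.normal(size=(3, 5))
z = rng.normal(size=5)

base, A = Y[0], (Y[1:] - Y[0]).T        # flat = base + column space of A
coef, *_ = np.linalg.lstsq(A, z - base, rcond=None)
y = base + A @ coef                     # orthogonal projection of z

# y - z is orthogonal to every direction lying in the flat...
assert np.allclose(A.T @ (y - z), 0.0)
# ...so <v - z, y - z> = ||y - z||^2 > 0 for every v in aff(Y).
for _ in range(5):
    v = base + A @ rng.normal(size=2)   # arbitrary point of the flat
    assert np.isclose(np.dot(v - z, y - z), np.dot(y - z, y - z))
    assert np.dot(v - z, y - z) > 0
```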
- I guess the general position assumption includes the idea that a hyperplane spanned by one subset of $X$ is never parallel to another hyperplane spanned by a disjoint subset of $X$, nor to an intersection of such other hyperplanes, because otherwise the claim is false.
If you have points in this kind of general position, and the intersection of some affine subspaces spanned by subsets of those points is empty, then the sum of the codimensions of the affine subspaces must be at least $d + 1$ (see dimension of intersection of hyperplanes).
Since each $Y_j$ is a minimal set with $y_j$ in its relative interior, $\dim \aff(Y_j) = |Y_j| - 1$ (if $Y_j$ were affinely dependent, a proper subset would already contain $y_j$ in the relative interior of its hull). So
$$
\begin{align*}
d + 1 &\leq \sum_{j=1}^r \bigl(d - \dim \aff(Y_j)\bigr) \\
&= \sum_{j=1}^r (d + 1 - |Y_j|) \\
&= r(d+1) - \sum_{j=1}^r |Y_j|,
\end{align*}
$$
which rearranges to
$$
\sum_{j=1}^r |Y_j| \leq (r-1)(d+1).
$$
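The codimension count can be sanity-checked with generic linear algebra: two flats in $\mathbb{R}^4$ whose codimensions sum to $d = 4$ still meet in a point, whereas total codimension $d+1$ generically gives an overdetermined system with no solution. (Random generic data; this only illustrates the dimension-of-intersection fact, not the sets $Y_j$ themselves.)

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

# Each flat is the solution set of a linear system; its codimension is
# the number of independent equations. Two codimension-2 flats: total 4 = d.
A1, b1 = rng.normal(size=(2, d)), rng.normal(size=2)
A2, b2 = rng.normal(size=(2, d)), rng.normal(size=2)

A = np.vstack([A1, A2])
b = np.concatenate([b1, b2])
x = np.linalg.solve(A, b)               # 4 generic equations, 4 unknowns
assert np.allclose(A @ x, b)            # the two flats intersect

# One more generic equation: total codimension d + 1 = 5.  The best
# least-squares point no longer satisfies all equations, i.e. the
# intersection is empty.
A3, b3 = rng.normal(size=(1, d)), rng.normal(size=1)
Abig = np.vstack([A, A3])
bbig = np.concatenate([b, b3])
xls, *_ = np.linalg.lstsq(Abig, bbig, rcond=None)
assert not np.allclose(Abig @ xls, bbig)
```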
The inner product $\langle x - y_j, z - y_j\rangle$ being positive means that the vector from $y_j$ to $x$ points in the same "direction" as the vector from $y_j$ to $z$ (they lie in the same half-space), so points on the line segment from $y_j$ toward $x$ get closer to $z$ than $y_j$ is. Then moving $x$ from whatever part $X_i$ it currently belongs to into the part $X_j$ results in a smaller value of $\mu$. Indeed, there is a point of $\conv(Y_j \cup \{x\})$ (a subset of the convex hull of the new $X_j$) that is closer to $z$ than $y_j$ was, while removing $x$ from $X_i$ does not increase $\dist(z, \conv(X_i))$: the closest point $y_i$ of $\conv(X_i)$ lies in $\conv(Y_i)$, which does not involve $x$.
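The "gets closer" claim is a one-line calculus fact: the derivative of $t \mapsto \|y_j + t(x - y_j) - z\|^2$ at $t = 0$ equals $-2\langle x - y_j, z - y_j\rangle < 0$. A numeric sketch (random points, retried until the inner product is positive):

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4

# Find a random instance with <x - y, z - y> > 0.
while True:
    y, x, z = rng.normal(size=(3, d))
    ip = np.dot(x - y, z - y)
    if ip > 0:
        break

# The squared distance to z along the segment y -> x is a parabola in t
# with minimum at t* = ip / ||x - y||^2; any t in (0, min(1, 2 t*))
# strictly decreases the distance.  Step to the (clipped) minimizer:
t = min(1.0, ip / np.dot(x - y, x - y))
closer = y + t * (x - y)

assert np.linalg.norm(closer - z) < np.linalg.norm(y - z)
```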
- As I mentioned, the needed assumption is not what is usually called general position, but rather that the intersections of any hyperplanes spanned by points in $X$ are not parallel to each other. One way of interpreting such parallel subspaces is that they meet "at infinity". In Roudneff's paper, from which your survey takes the proof, it is posed as follows (I've slightly modified it to remove references to bases of positive cones, which are not relevant to us; for the classical Tverberg theorem, every set $L_i$ in Roudneff's paper is the empty set).
> We consider $\mathbb{R}^d$ as the affine space $\mathbb{P}^d \setminus H_\infty$, where $\mathbb{P}^d$ denotes the projective space of dimension $d$ and $H_\infty$ the hyperplane at infinity. We first observe that the convex dependences of our configuration of points $X$ are not modified by slightly moving $H_\infty$, i.e., the fact that a given point belongs to the convex hull of a subset $X_i$ of $X$ is preserved. In other words, the oriented matroid of affine dependences defined by $X$ remains the same (see Oriented Matroids for this notion). Thus, we may assume that $H_\infty$ contains no vertex of the projective arrangement of hyperplanes defined by $X$.
I take this to mean that applying a projective transformation can put the points in "general position" in the needed sense.