
I think I am misunderstanding the notion of mutual information of continuous variables. Could anyone help me clear up the following?

Let $X \sim N(0, \sigma^2)$ and $Y \sim N(0, \sigma^2)$ be jointly Gaussian random variables. If $X$ and $Y$ are correlated with correlation coefficient $\rho$, then the mutual information between $X$ and $Y$ is given by (reference: https://en.wikipedia.org/wiki/Mutual_information):

\begin{equation} I(X; Y) = -\frac{1}{2} \log (1-\rho^2). \end{equation}
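For reference, this expression can be obtained from the differential entropies of the marginal and joint Gaussians (the standard derivation, sketched here for completeness):

\begin{align} I(X; Y) &= H(X) + H(Y) - H(X, Y) \\ &= \frac{1}{2} \log (2 \pi e \sigma^2) + \frac{1}{2} \log (2 \pi e \sigma^2) - \frac{1}{2} \log \big( (2 \pi e)^2 \sigma^4 (1 - \rho^2) \big) \\ &= -\frac{1}{2} \log (1 - \rho^2), \end{align}

where $H(X, Y) = \frac{1}{2} \log \big( (2 \pi e)^2 \det \Sigma \big)$ and $\det \Sigma = \sigma^4 (1 - \rho^2)$ for the joint covariance matrix $\Sigma$.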

From this formula, I thought $I(X; Y) \rightarrow \infty$ as $\rho \rightarrow 1$ (and for $X = Y$ we have $\rho = 1$). However, I also considered this another way.

Suppose $Y = X$. In this case, assuming $H(Y|X) = 0$ (as in the discrete case), I would obtain $I(X; Y) = H(X) - H(Y|X) = H(X)$.

For the Gaussian random variable $X$, $H(X)$ is bounded as follows (reference: https://en.wikipedia.org/wiki/Differential_entropy): \begin{equation} H(X) \leq \frac{1}{2} \log ( 2 \pi e \sigma^2). \end{equation}

Thus, $ I (X; Y) \leq \frac{1}{2} \log ( 2 \pi e \sigma^2)$.
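For completeness, the Gaussian differential entropy used above comes from a direct computation (a standard result, written out here as a sanity check; for a Gaussian the bound holds with equality):

\begin{equation} H(X) = -\int_{-\infty}^{\infty} f(x) \log f(x) \, dx = \frac{1}{2} \log (2 \pi \sigma^2) + \frac{\mathbb{E}[X^2]}{2 \sigma^2} = \frac{1}{2} \log (2 \pi e \sigma^2), \end{equation}

where $f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-x^2 / (2 \sigma^2)}$ is the density of $X$.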

Here is my question: I obtained two different results for $I(X; Y)$ when $X = Y$. Where is the mistake in my understanding?

Thank you in advance.

  • One usually writes $h(X)$ for the differential entropy, not $H(X)$, and with good reason: it reminds you that it is not a "true" entropy, so that you don't fall into the trap of assuming $h(X|X) = 0$ (as with the true entropy). Actually, $h(X|X) = -\infty$. See e.g. my answer here – leonbloy Nov 30 '18 at 23:37

1 Answer


Differential entropy can actually be negative, so the upper bound on your mutual information is not correct. Indeed, if $X$ and $Y$ are the same random variable on a continuous domain, you would expect the mutual information between them to be infinite (and if they are the same Gaussian, that is indeed the case).

EDIT: I guess I should have clarified: in the differential-entropy sense, $H(Y|X)$ is not $0$; it is negative infinity when $X = Y$. A singular distribution (for example, a point mass) has negatively infinite uncertainty relative to any quantized uniform distribution.
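One concrete way to see the divergence (a small check using the formula from the question; the perturbed variable $Y_\epsilon = X + \epsilon Z$, with $Z \sim N(0, 1)$ independent of $X$, is introduced here only for illustration): since $1 - \rho^2 = \epsilon^2 / (\sigma^2 + \epsilon^2)$ for this pair,

\begin{equation} I(X; Y_\epsilon) = -\frac{1}{2} \log \left( \frac{\epsilon^2}{\sigma^2 + \epsilon^2} \right) = \frac{1}{2} \log \left( 1 + \frac{\sigma^2}{\epsilon^2} \right) \rightarrow \infty \quad \text{as } \epsilon \rightarrow 0, \end{equation}

which is consistent with $I(X; X) = \infty$ for a continuous $X$.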

E-A
  • Thank you for the comment. However, the upper bound ($H(X) \leq \frac{1}{2} \log ( 2 \pi e \sigma^2 )$) is given in https://en.wikipedia.org/wiki/Differential_entropy. Also, $-\frac{1}{2} \log ( 1 - \rho^2 )$ is based on this formula. – Inkyu Bang Jun 06 '18 at 14:29
  • See edit above; I realized I did not actually write the negative quantity. – E-A Jun 06 '18 at 19:45
  • Okay, I got your point. So my assumption that $H(Y|X) = 0$ only holds when $X$ and $Y$ are discrete and $Y = X$. For continuous random variables, $H(Y|X) = -\infty$ when $Y = X$. Thus, $I(X; Y) = H(X) - (-\infty) = \infty$ for $X = Y$. Thank you! – Inkyu Bang Jun 07 '18 at 01:46
  • @InkyuBang Your assumption that $H(X|X) = 0$ only holds for the true (Shannon) entropy. It does not apply to differential entropy, which is a different beast. The (true/Shannon) entropy of a (non-degenerate) continuous variable is $+\infty$, because (among other things) you need (on average) an infinite number of bits to represent its value. – leonbloy Nov 30 '18 at 23:40