
We often hear about one vector of data being independent of, or uncorrelated with, another vector of data, and while it is easy to come across the math behind those two concepts, I want to tie them to real-life examples and also find ways to measure those relationships.

From this standpoint, I am looking for examples of pairs of signals in each of the following combinations (I will start with some):

  • Two signals that are independent AND (necessarily) uncorrelated:

    • The noise from a car engine (call it $v_1[n]$) and your voice ($v_2[n]$) as you are talking.
    • A daily recording of humidity ($v_1[n]$) and the Dow Jones index ($v_2[n]$).

Q1) How would you measure/prove that they are independent with those two vectors in hand? We know that independence means that the product of their PDFs equals their joint PDF, and that's great, but with only those two vectors in hand, how does one show their independence?

  • Two signals that are NOT independent, but still uncorrelated:

Q2) I can't think of any examples here... what would some examples be? I know we can measure correlatedness by taking the cross-correlation of two such vectors, but how would we show that they are also NOT independent?

  • Two signals that are correlated:
    • A vector measuring an opera singer's voice in the main hall, $v_1[n]$, while someone records her voice from somewhere inside the building, say in the rehearsal room ($v_2[n]$).
    • If you continuously measured your heart rate in your car, ($v_1[n]$), and also measured the intensity of blue lights impinging on your rear windshield ($v_2[n]$)... I am guessing those would be very correlated... :-)

Q3) Related to Q2, but in the case of measuring cross-correlation from this empirical standpoint, is it enough to look at the dot product of those vectors (since that is the value of their cross-correlation at zero lag)? Why would we care about the other values in the cross-correlation function?

Thanks again; the more examples given, the better for building intuition!

Spacey
  • @DilipSarwate Thanks Dilip, I will take a look at it. For now some examples would be good though. – Spacey Jan 27 '12 at 06:36
  • You can't "prove" that they are independent in the same way that even a well-constructed poll can't "prove" how everyone is going to vote- and for the same reasons. – Jim Clay Feb 21 '12 at 18:15
  • @JimClay Feel free to relax the criterion 'prove' - what I am trying to get at are ways to measure/quantify independence. We often hear about so and so being independent, well, how do they know that? What measuring tape is being used? – Spacey Feb 21 '12 at 18:23
  • I would like to know whether cross-correlation can be used for analysis purposes on two analog signals, one of high resolution and the other of low resolution. –  Nov 25 '12 at 04:42
  • If we have some random variable $x$ and construct two signals $a = f_1(x)$ and $b = f_2(x)$, with $f_1$ and $f_2$ orthogonal and $x = a + b$, would this imply that such signals are independent? Does this require some additional conditions? This property would be interesting because it avoids constructing the joint pdf of $a$ and $b$. – Mladen Nov 05 '13 at 16:55

2 Answers


A few elements... (I know that this is not exhaustive; a more complete answer should probably mention moments.)

Q1

To check whether two signals are independent, you need to measure how similar their joint distribution $p(x, y)$ is to the product of their marginal distributions $p(x) \times p(y)$. For this purpose, you can use any distance between distributions. If you use the Kullback-Leibler divergence to compare them, you will consider the quantity:

$\int_x \int_y p(x, y) \log \frac{p(x, y)}{p(x) p(y)} dx dy$

And you will have recognized... the Mutual Information! The lower it is, the closer the variables are to being independent; it is exactly zero if and only if they are independent.

More practically, to compute this quantity from your observations, you can either estimate the densities $p(x)$, $p(y)$, $p(x, y)$ from your data using a kernel density estimator and do a numerical integration on a fine grid; or just quantize your data into $N$ bins and use the expression of the Mutual Information for discrete distributions.
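For concreteness, here is a minimal Python/NumPy sketch of the binning approach (the function name, number of bins, and test signals are arbitrary choices):

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Estimate the mutual information (in nats) between two sample
    vectors by quantizing them onto a 2-D histogram grid."""
    # Joint histogram: count how many pairs (x[n], y[n]) fall in each cell.
    joint_counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint_counts / joint_counts.sum()   # estimated joint pmf
    p_x = p_xy.sum(axis=1)                     # marginal of x (collapse over y)
    p_y = p_xy.sum(axis=0)                     # marginal of y (collapse over x)
    p_x_p_y = np.outer(p_x, p_y)               # "independent" reference: p(x) p(y)
    mask = p_xy > 0                            # skip empty cells to avoid log(0)
    return np.sum(p_xy[mask] * np.log(p_xy[mask] / p_x_p_y[mask]))

# Sanity check: independent noise gives a value near 0,
# a deterministic relationship gives a clearly larger value.
rng = np.random.default_rng(0)
v1 = rng.standard_normal(100_000)
v2 = rng.standard_normal(100_000)
print(mutual_information(v1, v2))       # close to 0
print(mutual_information(v1, 2 * v1))   # large
```

Collapsing the joint histogram along each axis gives the marginals, and their outer product is the reference you compare against; this is the same comparison discussed in the comments below.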

Q2

From the Wikipedia page on statistical independence and correlation:

[Figure from Wikipedia: scatter plots of several 2-D distributions.]

With the exception of the last example, each of these 2-D distributions $p(x, y)$ has uncorrelated components (diagonal covariance matrix), even though $x$ and $y$ are not independent: $p(x, y) \neq p(x)\,p(y)$.
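For a concrete, measurable example of this situation, here is a small sketch using the classic $y = x^2$ construction (a standard textbook pair, not one of the plots above): the sample correlation is essentially zero, yet $y$ is completely determined by $x$.

```python
import numpy as np

# Classic uncorrelated-but-dependent pair: x symmetric around 0, y = x^2.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x ** 2                       # fully determined by x, hence not independent

print(np.corrcoef(x, y)[0, 1])   # sample correlation: close to 0
# Yet p(x, y) is far from p(x) p(y): the mutual_information sketch
# above returns a clearly non-zero value for this pair.
```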

Q3

There are indeed situations in which you might look at all the values of the cross-correlation function. They arise, for example, in audio signal processing. Consider two microphones capturing the same source, but placed a few meters apart. The cross-correlation of the two signals will have a strong peak at the lag corresponding to the distance between the microphones divided by the speed of sound. If you just look at the cross-correlation at lag 0, you won't see that one signal is a time-shifted version of the other one!
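Here is a toy NumPy sketch of this effect, with a synthetic delayed copy standing in for the second microphone (the 25-sample delay and the noise level are arbitrary):

```python
import numpy as np

# v2 is v1 delayed by 25 samples, plus a little independent noise.
rng = np.random.default_rng(0)
v1 = rng.standard_normal(1000)
delay = 25
v2 = np.concatenate([np.zeros(delay), v1[:-delay]]) + 0.1 * rng.standard_normal(1000)

# Full cross-correlation over all lags, not just the dot product at lag 0.
xcorr = np.correlate(v2, v1, mode="full")
lags = np.arange(-len(v1) + 1, len(v1))
print(lags[np.argmax(xcorr)])    # ~25: the delay shows up as the peak lag
print(xcorr[lags == 0][0])       # the lag-0 value alone would miss it
```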

pichenettes
  • Thank you pichenettes: 1) Can you please elaborate on your first point - I am really having a hard time understanding just how, from two data vectors, x[n] and y[n], I can possibly come up with their JOINT PDF, $p(x,y)$. I can understand how taking a histogram of x[n] will give me the pdf of X ($p(x)$), and the same with Y, but how on earth does one come up with a joint pdf given two vectors?? I am asking concretely - an exact, concrete mapping from observed samples to a PDF. This is what is confusing me the most. (contd) – Spacey Feb 21 '12 at 19:50
  • (contd) 2) So to summarize: if the covariance matrix of x and y is diagonal, then they are uncorrelated, but NOT necessarily independent, correct? How to test for independence was the issue in follow-up question (1). However, if we show they are independent, then of course their covariance matrix HAS to be diagonal. Have I understood right? What is an example of 2 physical signals I can measure in real life that would be dependent, but not correlated? Thanks again. – Spacey Feb 21 '12 at 19:52
  • Let's say you have two signals $x_n$ and $y_n$ represented as vectors of $N$ elements. You can get an estimate of $p(x, y)$ using, for example, a kernel density estimator: $\hat{p}(x, y) = \frac{1}{N}\sum_i K(x - x_i, y - y_i)$, where $K$ is a kernel function. Or you can use the same technique as for building a histogram, but in 2D. Build a rectangular grid, count how many pairs $(x_n, y_n)$ fall in each cell of the grid, and use $\hat{p}(x, y) = \frac{C}{N}$, where $N$ is the size of your signals and $C$ is the number of elements in the cell associated with point $(x, y)$. – pichenettes Feb 21 '12 at 20:08
  • "2 physical signals that would be dependent, but not correlated": Let's say we hack the GPS of a NY cab to record a (latitude, longitude) history of its position. There's a good chance the lat., long. data will be uncorrelated - there's no privileged "orientation" of the point cloud. But it'll hardly be independent, since, if you were asked to guess the latitude of the cab, you would provide a much better guess if you knew the longitude (you could then look at a map and rule out the [lat, long] pairs occupied by buildings). – pichenettes Feb 21 '12 at 21:12
  • Another example: two sine waves at integer multiples of the same frequency. Null correlation (the Fourier basis is orthonormal); but if you know the value of one, there is only a finite set of values that the other one can take (think of a Lissajous plot). – pichenettes Feb 21 '12 at 21:18
  • Ah! Thanks for that. 1) Regarding the kernel method - what kernel does one use for this? (I only learnt about kernels a month ago.) Are there particular ones that one uses for this? 2) Regarding the joint-pdf histogram method: perhaps one way to test for independence would be the following: compute the 2-D histogram as described. Then compute the histograms of x and y separately, and take the outer product of those two histogram vectors (to make a grid). Compare this grid to the 2-D grid. If x and y are indep, they should be the same. Agree? – Spacey Feb 21 '12 at 23:26
  • A kernel typically used to estimate a continuous density from iid samples is the Gaussian kernel (also called RBF). Triangular kernels are sometimes also used. 2) This approach is correct. If you use the Kullback-Leibler divergence to compare the output of your 2D histogramming to the outer product of your 1D histograms, you are effectively computing the Mutual Information between discretized versions of your two distributions. – pichenettes Feb 21 '12 at 23:37
  • I see, radial basis functions - I will research those more. 2) I also think that if we indeed have the 2-D histogram (joint pdf), then I can just sum down the rows to get the marginal pdf of x, sum down the columns to get the marginal pdf of y, and THEN take those two vectors' outer product. Should this make a difference in this regard? I do not think so myself... – Spacey Feb 21 '12 at 23:44
  • Yes, if you count pairs you can "collapse" the counts onto both axes to get the histograms of the marginals. – pichenettes Feb 22 '12 at 01:23
  • @Mohammad By any chance, did you manage to compute the 2D histogram? I know this would be sort of estimating the joint pdf, but I have no idea how to do this! – Rachel Mar 07 '12 at 20:32
  • @Rachel I have not been able to find anything to do that yet, but in my explorations I have found something called a Kernel Density Estimator. (Scroll to the example and download the .m file; it has very nice example code.) Anyway, apparently this can help in generating a good, smooth 1-D PDF that might be more useful and robust than a histogram approach. I will let you know if I find anything for 2-D. – Spacey Mar 08 '12 at 16:40
  • @pichenettes In Matlab, you can compute the 1D histograms (let's call them N1 and N2) using hist, and the 2D histogram using hist3 (let's call this Z). Now (assuming you use 10 bins - the default), N1 and N2 are 10-element vectors, and Z is a 10x10 matrix. How do you compare the product of the vectors to the matrix? Should the outer product be calculated? And in this case, do you do N1' * N2, or N1 * N2'? – Rachel Mar 09 '12 at 17:40
  • @pichenettes I have tried to calculate the Mutual Information, as suggested, and used Kernel Density Estimators for estimating the required PDFs. Unfortunately it seems I have a bug in my code =/ You could have a look at it over here, if you like; perhaps you have a couple of useful suggestions. – Rachel Mar 11 '12 at 22:41