15

I want to know how I can determine whether a series of data is periodic or not.

I want to use Fourier transform/series. My data looks either aperiodic

[111100001111000110010101010000101]

or periodic

[11001100110011001100]

and I need to decide which it is automatically. What types of analyses or calculations can I perform to determine if a signal is periodic or not?

James Mertz
  • 109
  • 7
safzam
  • 153
  • 1
  • 1
  • 4

2 Answers2

15

I would do a normalized autocorrelation to determine periodicity. If it is periodic with period $P$ you should see peaks at every $P$ samples in the result. A normalized result of "1" implies perfect periodicity, "0" implies no periodicity at all at that period, and values in between imply imperfect periodicity. Subtract the data sequence's mean from the data sequence before doing the autocorrelation because it will bias the results.

The peaks will tend to diminish the farther away from the center they get simply because of having fewer overlapping samples. You can mitigate that effect by multiplying the results by the inverse of the percentage of overlapping samples.

$$ U(n) = A(n) * \frac{N}{\lvert N-n\rvert} $$ where $U(n)$ is the un-biased autocorrelation, $A(n)$ is the normalized autocorrelation, $n$ is the offset, and $N$ is the number of samples in the data sequence that you are checking for periodicity.

EDIT: This is an example of how to tell if the sequences are periodic. The following is Matlab code.

s1 = [1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 1 1 0 0 1 0 1 0 1 0 1 0 0 0 0 1 0 1];
s1n = s1 - mean(s1);
plot(xcorr(s1n, 'unbiased'))

The "unbiased" parameter to the xcorr function tells it to do the scaling described in my equation above. The auto-correlation is not normalized, though, which is why the peak at the center is around 0.25 instead of 1. That doesn't matter, though, as long as we keep in mind that the center peak is perfect correlation. We see that there are no other corresponding peaks except at the outermost edges. Those do not matter because there there is only one sample overlapping, so that is not meaningful.

Non-periodic

s2 = [1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0];
s2n = s2 - mean(s2);
plot(xcorr(s2n, 'unbiased'))

Here we see that the sequence is periodic because there are multiple unbiased autocorrelation peaks with the same magnitude as the center peak.

Periodic

Jim Clay
  • 12,101
  • 31
  • 55
  • 3
    +1: You might also want to mean-correct the data before forming $A(n)$, otherwise any DC-offset might get in the way of finding the periodic peaks. – Peter K. Apr 24 '13 at 13:51
  • 1
    @PeterK Good point. – Jim Clay Apr 24 '13 at 14:13
  • Hey Jim, thanks..I am little bit confused how to start programming this, because where ever I search about autocorrelation I find complex formulas, I dont really get the idea where to start from and how to detect peak with period P in the code. With me I have a list of values V[]={110011001100..} now how to put them in autocorrelation formulas and determine wether its periodic or not...Can you please give me a little easy start...Thanks a lot – safzam Apr 26 '13 at 10:00
  • @safzam If you are using Matlab or Python (numpy) they have autocorrelation functions already. If you need something in C/C++/Java/whatever, then try here- http://www.dsprelated.com/showmessage/59527/1.php – Jim Clay Apr 26 '13 at 18:16
  • For example I used following two signals s1 ans s2: s1 = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1] s2 = [1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1] r1 = numpy.correlate(s1, s1, mode='full') r2 = numpy.correlate(s2, s2, mode='full') I used these four lines in a python code. I got r1 = [1 2 1 2 4 2 3 6 3 4 8 4 3 6 3 2 4 2 1 2 1] and r2 = [1 0 1 1 2 0 3 2 3 2 6 2 3 2 3 0 2 1 1 0 1] both r1 and r2 gives a same rainbow curve like shape.. How can i determine in the code that one signal is peroidc or nearly periodic or not periodic at all, thanks – safzam Apr 29 '13 at 15:48
  • You did not subtract the means before doing the auto-correlation. – Jim Clay Apr 29 '13 at 16:30
  • so now I get r1=[-0.24793388 0.50413223 -0.2892562 0.00826446 0.21487603 -0.48760331 0.71900826 -0.98347107 1.2231405 -2.02479339 2.72727273 -2.02479339 1.2231405 -0.98347107 0.71900826 -0.48760331 0.21487603 0.00826446 -0.2892562 0.50413223 -0.24793388] and r2= [-0.23140496 0.17355372 0.30578512 0.07438017 0.20661157 -0.66115702 -0.52892562 -0.76033058 0.37190083 -0.2231405 2.54545455 -0.2231405 0.37190083 -0.76033058 -0.52892562 -0.66115702 0.20661157 0.07438017 0.30578512 0.17355372 -0.23140496].how r1 and r2 tell that s1 is periodic and s2 not? thanks – safzam Apr 30 '13 at 08:31
  • @safzam I have added a section to my answer that demonstrates how to do it. – Jim Clay Apr 30 '13 at 12:23
  • Jim, good answer! One thing I wondered was whether it's possibly to do automatic threshold selection? For example, suppose the center (zero lag) is X, then at a non-zero-lag level above 0.8*X we decide that the signal is periodic? It seems to me you could define a statistical test around this. – Peter K. Apr 30 '13 at 13:50
  • @PeterK. Yes, I think that someone smart could come up with statistical information or confidence intervals or something, but I'm not that guy. If I were doing it I would probably go with a threshold like 0.8 like you did, and look for repeated peaks at the same period. – Jim Clay Apr 30 '13 at 15:07
  • hello Jim, thanks for the above valuable info but I am still trying to solve my problem. – safzam May 15 '13 at 13:03
4

Jim's answer sent me to think about how to test this statistically. This led me to the Durbin-Watson autocorrelation test.

The generalization of it is to form:

$$ DW(\tau) = \frac{\displaystyle \sum_{n=\tau}^{N-1} \left [ U(n) - U(n-\tau) \right ]^2 }{\displaystyle\sum_{n=0}^{N-1} U(n)^2} $$

and my attempt at implementing this in scilab is:

// http://en.wikipedia.org/wiki/Durbin%E2%80%93Watson_statistic
s1 = [1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 1 1 0 0 1 0 1 0 1 0 1 0 0 0 0 1 0 1];
s1n = s1 - mean(s1);
xs1 = xcorr(s1n,"unbiased");
N1 = length(xs1);

s2 = [1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0];
s2n = s2 - mean(s2);
xs2 = xcorr(s2n,"unbiased");
N2 = length(xs2);

dwstat1 = [];
dwstat2 = [];

for lag = 1:15,

    dxs1 = xs1((lag+1):N1) - xs1(1:(N1-lag));
    dxs2 = xs2((lag+1):N2) - xs2(1:(N2-lag));


    dwstat1 = [dwstat1 sum(dxs1.^2) / sum(xs1.^2)];
    dwstat2 = [dwstat2 sum(dxs2.^2) / sum(xs2.^2)];

end;

The upshot is that if $DW(\tau)$ is close to zero or close to 4 for a given value of $\tau$, then there is a periodicity (autocorrelation) there.

If I plot the result for our two example sequences:

enter image description here

Then it's clear that the second sequence exhibits correlation at lags of 4, 8, etc. and anti-correlation at lags of 2, 6, etc.

What I'm still not sure about is how to say $DW(\tau)$ is "close to" 0 or 4. I'll dig a little further and see what I come up with.

Peter K.
  • 25,714
  • 9
  • 46
  • 91
  • thank you for this info. Infact I am making a program in python where I get a lot of lists of 0s and 1s. I want to separate periodic, random, burst type of series. I am trying the above logic in python but "xcorr" function is not in python, then I used numpy.correlate(lst, lst, mode='full') function. Also the lists contains round 70,000 list of 0s and 1s.. I just want to determine if this list is periodic or not... if there is a little un periodicity I can avoid it. any further hint plz. thanks in advance. – safzam May 15 '13 at 13:02