Alright, sorry for the delayed response. My earlier comments on AIC and BIC, while valid if applied correctly, didn't lead to a conclusive answer, which is what I try to give when I comment or answer. I'm not an expert on those information criteria, but they are used fairly often, which is why I suggested them. Hopefully the following discussion is more practical, since it is geared specifically towards AR models.
AR models are all-pole generative models driven by white noise, which means that the current value of the time series can be predicted as a linear combination of its previous $p$ values plus a stochastic input term. This is described mathematically as
\begin{equation} x_{n} = a_{1}x_{n-1} + a_{2}x_{n-2} + \dots + a_{p}x_{n-p} + w_{n} \end{equation}
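For intuition, here is a minimal sketch of that generative recursion in Matlab (the coefficients are arbitrary, chosen only so the poles stay inside the unit circle):
% Sketch: synthesize an AR(2) realization by driving the all-pole
% filter 1/A(z) with white noise
a = [1 -1.5 0.8]; % A(z) = 1 - 1.5z^-1 + 0.8z^-2, poles inside unit circle
w = randn(1,1000); % white noise input w_n
x = filter(1,a,w); % x_n = 1.5*x_{n-1} - 0.8*x_{n-2} + w_n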
In order to delve into the basics of model order selection, we need to understand the autocorrelation function (ACF) and partial autocorrelation function (PACF), specifically for AR models. The autocorrelation function is well known; in practice it is found via a time-averaged estimate of
\begin{equation}E[x_{n}x^{*}_{n-m}]\end{equation}
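In Matlab, a quick sketch of that estimate is a single xcorr call (assuming x holds your measured series):
% Sketch: biased time-averaged autocorrelation estimate of x
[r,lags] = xcorr(x,'biased'); % r approximates E[x_n x*_{n-m}] at each lag m
figure;
stem(lags,real(r));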
The autocorrelation function of an AR process typically exhibits oscillatory behavior that slowly tapers towards zero as you move away from lag zero. Conceptually this makes sense: the current value in an AR series depends on the previous $p$ values, but each of those previous values depends on its own respective set of $p$ previous values! So the autocorrelation function doesn't settle quickly.
On the other hand, the PACF measures the correlation between the current value $x_{n}$ and the lagged value $x_{n-k}$ with the effects of the intervening lags removed. That is to say, it measures the direct effect of each lagged value on the current value. This is important because an AR($p$) model only involves the previous $p$ values, so the PACF should theoretically be zero for all lags greater than $p$.
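To make the "direct effect" idea concrete: the PACF at lag $k$ is the last coefficient of the best-fitting AR($k$) model, so a brute-force sketch is to fit successively higher orders (assuming x holds your measured series; any AR estimator would do in place of arburg):
% Sketch: PACF at lag k from the last coefficient of an AR(k) fit
K = 50; % maximum lag to examine
pacf = zeros(1,K);
for k = 1:K
ak = arburg(x,k); % ak = [1 a_1 ... a_k] in Matlab's sign convention
pacf(k) = -ak(end); % negate to match the model equation above
end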
What this means is that we should expect the PACF to be zero (to within a confidence interval) after a certain number of lags, and the number of lags whose magnitude exceeds this confidence interval should be the model order. I believe most, if not all, of Matlab's AR parameter estimation functions can return reflection coefficients, and the PACF is the negation of those. You can use this link to see how to perform this in Matlab.
Now, for this specific case, i.e. multiple sinusoids (potentially in noise) that form a line spectrum, you will likely see that both the ACF and PACF have decaying oscillatory behavior. This is because the signal model exhibits strong periodicity (I believe time series people refer to this as seasonality). This means that, in order to get really accurate extrapolation, you will likely need an ARMA model of some sort. But Burg's method should at least give you a reasonable estimate. You'll want to take the first lag that dips below the confidence interval as "optimal", which gave roughly an order of $p=23$ in my testing.
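If you'd rather not eyeball the stem plot, the same "first dip" rule can be automated with something like this (a sketch, reusing the pacf and conf variables defined in the code in the edit below):
% Sketch: first lag whose PACF magnitude falls inside the 95% bound;
% the lag before it is taken as the model order
idx = find(abs(pacf) < conf, 1, 'first');
p_opt = idx - 1; % e.g. first dip at lag 24 -> order 23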
Hopefully this helps!
**EDIT #1:**
Per request in the comments, here is how I came to the conclusion of what model order to use.
*(figure: PACF stem plot with its 95% confidence bounds; the first lag inside the bounds is highlighted in red)*
As can be seen in the picture, the PACF dips below the confidence interval at the value highlighted in red. This is the 24th value in the sequence, so we use 23 as our model order. The code I used is shown below.
N = 150; % Order of LPC auto-regressive model (unused below; the order is chosen from the PACF instead)
P = 500; % Number of samples in the extrapolated time series
M = 200; % Point at which to start predicting
t = 1:P;
x1 = 5*sin(t/3.7+.3)+3*sin(t/1.3+.1)+2*sin(t/34.7+.7); % This is the measured signal
x0 = x1;
x1(M+1:end) = 0;
[~,~,k] = arburg(x0(1:M),M-1); % reflection coefficients up to the highest order the data allows
pacf = -k; % the PACF is the negation of the reflection coefficients
figure;
stem(pacf,'filled');
xlim([1 50]);
conf = sqrt(2)*erfinv(0.95)/sqrt(M); % two-sided 95% confidence bound, ~1.96/sqrt(M)
hold on;
plot(xlim,[1 1]'*[-conf conf],'r');
hold off;
p = input("Enter the first x-axis value where the PACF magnitude drops to within the confidence interval: ");
a = arburg(x0(1:M),p-1);
y1 = zeros(1, P);
% fill in the known part of the time series
y1(1:M) = x1(1:M);
% in reality, you would use filter instead of the for-loop
% (see the sketch after the output plot below)
for ii=M+1:P
y1(ii) = -sum(a(2:end) .* y1(ii-1:-1:ii-length(a)+1));
end
figure;
plot(t, x0,'b', t, x1,'or', t, y1,':xk','LineWidth',1);
l = line(M*[1 1], get(gca, 'ylim'));
set(l, 'color', [0,0,0]);
legend('expected signal prediction', 'actual signal', 'extrapolated signal', 'start of extrapolation');
Running this with a value of $p=24$ entered at the prompt (i.e., a model order of 23) produces the following output:
*(figure: the measured signal, its truncated version, and the extrapolated signal, with a vertical line marking the start of extrapolation at $n=M$)*
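For completeness, here is what the filter-based replacement for the for-loop might look like (a sketch: filtic seeds the all-pole filter $1/A(z)$ with the last known samples, so running it on zero input continues the recursion where the data ends):
% Sketch: run the all-pole filter on zero input, seeded with the
% most recent known samples via filtic
p_ord = length(a)-1; % AR model order
zi = filtic(1,a,y1(M:-1:M-p_ord+1)); % past outputs, newest first
y1(M+1:P) = filter(1,a,zeros(1,P-M),zi); % continue the recursion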