Ankur Goel

On hunt of awesomeness!

Levinson-Durbin and Autocovariance

As discussed, I am implementing estimation methods for the phi parameters in AR modelling. We covered yule_walker earlier; I'll write a post about that separately. After its implementation, we move on to another estimation method - Levinson-Durbin.

Levinson-Durbin requires the time series to be demeaned (series = series - series.mean) and its autocovariance.

The autocovariance of a series at lag k is the sum of products of the series with itself at that lag, i.e. the summation of (x_i * x_{i+lag}) over the series. It is also directly related to the acf of the series, as acf(k) = acvf(k) / acvf(0). Its code can now be found in Statsample::TimeSeries's acvf method.
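
As a rough illustration of that definition, here is a minimal plain-Ruby sketch of the same computation (a sketch only, not the gem's acvf implementation; the series below is just a placeholder):

def autocovariance(series, max_lag)
  n = series.size
  mean = series.inject(:+) / n.to_f
  demeaned = series.map { |x| x - mean }
  (0..max_lag).map do |lag|
    sum = 0.0
    (0...(n - lag)).each { |i| sum += demeaned[i] * demeaned[i + lag] }
    sum / n
  end
end

series = [2.1, 2.9, 4.2, 3.8, 5.0, 4.7, 6.1]
acvf = autocovariance(series, 3)
acf  = acvf.map { |v| v / acvf.first }   # acf(k) = acvf(k) / acvf(0)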

Now, with the help of autocovariance series, our levinson_durbin function recursively computes the following parameters:

  • sigma_v : estimate of the error variance
  • arcoefs : AR phi values for the time series
  • pac : unbiased Levinson pacf estimate
  • sigma : sigma for AR.

L-D performs recursive matrix and vector multiplications to populate its Toeplitz matrix. Here is some code depicting those manipulations:

order = nlags
phi = Matrix.zero(nlags + 1)
sig = Array.new(nlags + 1)

# setting the initial point for the recursion:
phi[1, 1] = series[1] / series[0]
sig[1] = series[0] - phi[1, 1] * series[1]

2.upto(order).each do |k|
  phi[k, k] = (series[k] - (Statsample::Vector.new(phi[1...k, k-1]) * series[1...k].reverse.to_ts).sum) / sig[k-1]
  # some serious refinement needed above for the matrix manipulation. Will do today.
  1.upto(k-1).each do |j|
    phi[j, k] = phi[j, k-1] - phi[k, k] * phi[k-j, k-1]
  end
  sig[k] = sig[k-1] * (1 - phi[k, k] ** 2)
end

Implementation can be found here.
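
To make the recursion above easier to follow outside the gem, here is a self-contained sketch on plain Ruby arrays (a sketch only; it assumes acvf holds the autocovariances of the demeaned series, and the real implementation is in the linked code):

def levinson_durbin_sketch(acvf, nlags)
  phi = Array.new(nlags + 1) { Array.new(nlags + 1, 0.0) }
  sig = Array.new(nlags + 1, 0.0)

  # initial point for the recursion
  phi[1][1] = acvf[1] / acvf[0]
  sig[1]    = acvf[0] - phi[1][1] * acvf[1]

  2.upto(nlags) do |k|
    num = acvf[k]
    1.upto(k - 1) { |j| num -= phi[j][k - 1] * acvf[k - j] }
    phi[k][k] = num / sig[k - 1]

    1.upto(k - 1) { |j| phi[j][k] = phi[j][k - 1] - phi[k][k] * phi[k - j][k - 1] }
    sig[k] = sig[k - 1] * (1 - phi[k][k] ** 2)
  end

  arcoefs = (1..nlags).map { |j| phi[j][nlags] }       # AR phi values
  pacf    = [1.0] + (1..nlags).map { |k| phi[k][k] }   # partial autocorrelations
  [sig[nlags], arcoefs, pacf]                          # sigma_v, arcoefs, pac
end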

This week, I will integrate this into AR modelling and run some tests to verify the estimation. I will start on the next estimation method soon. :)

Cheers,
Ankur Goel

AR/MA, ARMA Acf - Pacf Visualizations

As mentioned in the previous post, I have been working with Autoregressive and Moving Average simulations.
To test the correctness of the estimations from our simulations, we put acf (autocorrelation) and pacf (partial autocorrelation) to use. For different orders of AR and MA, we get varying visualizations, such as:

  • Exponential decreasing curves.
  • Damped sine waves.
  • Positive and negative spikes, etc.

While analyzing and writing tests for the same, I also took some time to visualize that data on line and bar charts to get a clearer picture:

AR(1) process

AR(1) process is the autoregressive simulation with order p = 1, i.e., with one value of phi.
An ideal AR(p) process is represented by:
x(t) = phi_1 * x(t-1) + phi_2 * x(t-2) + ... + phi_p * x(t-p) + e(t)
(Courtesy: Wikipedia)
To simulate this, install statsample-timeseries from here.

require 'statsample-timeseries'
include Statsample::ARIMA
series = ARIMA.new
ar_1 = series.ar_sim(1500, [0.9], 2)

Here, the number of observations is n = 1500 (a larger value is preferable for a better fit), p = 1, with phi = [0.9] and sigma = 2.

ACF

To generate its autocorrelation:

acf = ar_1.to_ts.acf
p acf

For an AR(1) process, the acf must decay exponentially if phi > 0, or alternate in sign if phi < 0 [Ref]. Go through the analysis above. It can be visualized as:
When phi > 0, acf decreases exponentially:
AR(1) positive phi acf AR(1) positive phi acf line chart
When phi < 0, you get the alternate acf lags:
AR(1) negative phi acf AR(1) negative phi acf line chart
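
A quick way to see both patterns without any plotting: the theoretical AR(1) autocorrelation is rho(k) = phi^k (a standard result, not gem code), so a couple of lines of Ruby show the decay and the sign alternation:

[0.9, -0.9].each do |phi|
  rho = (0..5).map { |k| (phi ** k).round(3) }
  puts "phi = #{phi}: #{rho.inspect}"
end
# phi = 0.9  => [1.0, 0.9, 0.81, 0.729, 0.656, 0.59]    (exponential decay)
# phi = -0.9 => [1.0, -0.9, 0.81, -0.729, 0.656, -0.59] (alternating signs)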

PACF

To generate its partial autocorrelation:

pacf = ar_1.to_ts.pacf
p pacf

For an AR(1) process, the pacf must have a spike at lag 1 and then drop to zero. That spike must be positive if phi > 0, and negative otherwise. Have a look at the pacf series generated above. On visualizing the data:
When phi > 0, there is a positive spike at lag 1 (lag 0 always contains 1.0):
AR(1) positive phi pacf AR(1) positive phi pacf line chart
When phi < 0, there is a negative spike at lag 1:
AR(1) negative phi pacf bar chart

Here is the representation of ideal acf-vs-pacf for positive phi in AR(1):
AR(1) acf/pacf

AR(p) process

Simulation of an AR(p) process is similar to AR(1).

series = ARIMA.new
phi_params = [0.5, 0.3]
ar_p = series.ar_sim(1500, phi_params, 2)
#acf
acf = ar_p.to_ts.acf
pacf = ar_p.to_ts.pacf

ACF

For AR(p), the acf must give a damped sine wave. The pattern depends greatly on the value and sign of the phi parameters.
When the phi coefficients are predominantly positive, you get a sine wave starting from the positive side; otherwise, the sine wave starts from the negative side.
Notice the damped sine wave starting from the positive side here:
AR(p) positive
and from the negative side here:
AR(p) negative

PACF

The pacf gives a spike at lag 0 (value = 1.0, by default) and at lags 1 through k. The example above features an AR(2) process, so we must get spikes at lags 1 and 2, as:
AR(p) pacf spikes

MA(1) process

MA(1) process is the moving average simulation with order q = 1, i.e., with one value of theta.
To simulate this, use the ma_sim method from Statsample::ARIMA::ARIMA.

require 'statsample-timeseries'
include Statsample::ARIMA
series = ARIMA.new
ma_1 = series.ma_sim(1500, [0.9], 2)
acf = ma_1.to_ts.acf
pacf = ma_1.to_ts.pacf

For theta > 0 in MA(1), we must get a positive spike at lag 1, as:
MA(1) positive acf For theta < 0, the spike at lag 1 must be in the negative direction, as:
MA(1) negative acf
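
This matches the standard MA(1) result (a textbook formula, not something computed by the gem): the only nonzero autocorrelation is at lag 1, with rho(1) = theta / (1 + theta^2), so the spike takes the sign of theta. For example:

[0.9, -0.9].each do |theta|
  rho_1 = theta / (1 + theta ** 2)
  puts "theta = #{theta}: rho(1) = #{rho_1.round(3)}"   # rho(k) = 0 for k > 1
end
# theta = 0.9  => rho(1) ~  0.497
# theta = -0.9 => rho(1) ~ -0.497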

When I put these two visualizations side by side, they match quite well:
MA(1) positive/negative

MA(q) process

An MA(q) process has order q, i.e., q theta coefficients.
An ideal MA(q) process is represented by:
x(t) = mu + e(t) + theta_1 * e(t-1) + theta_2 * e(t-2) + ... + theta_q * e(t-q)
(Courtesy: Wiki)

ACF

Similar to the MA(1) simulation, the acf will have spikes from lag 1 to lag q, as:
MA(1) acf

PACF

In the pacf of an MA(q) simulation, we observe an exponentially decaying/damped sine wave.
MA(q) pacf

ARMA(p, q) process

ARMA(p, q) is a combination of the autoregressive and moving average simulations.
When q = 0, the process is a pure autoregressive process; when p = 0, the process is purely moving average.
The ARMA simulator can be found as arma_sim in Statsample::ARIMA::ARIMA.
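
A usage sketch (assuming arma_sim mirrors ar_sim and ma_sim, taking the number of observations followed by the phi and theta coefficient arrays; check the linked implementation for the exact signature):

require 'statsample-timeseries'
include Statsample::ARIMA

series = ARIMA.new
# hypothetical ARMA(1, 1) call: 1500 observations, phi = [0.5], theta = [0.3]
arma_11 = series.arma_sim(1500, [0.5], [0.3])
acf  = arma_11.to_ts.acf
pacf = arma_11.to_ts.pacf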
For an ARMA(1, 1) process, here is a comparison of the visualizations from R and from this code, which just made my day :)

R ARMA(1, 1) Statsample ARMA(1,1)

Quite Fit!

Cheers,
- Ankur Goel

AR and MA Simulations

I have been reading about and coding the basics of the Autoregressive (AR) and Moving Average (MA) models.
First, I am very grateful to Claudio Bustos for all his patience in helping me understand this. There is still a lot to learn in these models, but I will keep bugging him. :D

We wrote the simulations for AR(p) and MA(q). The idea behind creating them is to first simulate the processes with known coefficients, and then move on to write an ARMA process which can also estimate the coefficients.

Autoregressive Model

This model is represented by AR(p), where p is the order of the model. For a pure autoregressive model, we consider the order of the moving average part to be 0.

AR(p) is represented by the following equation:

x(t) = phi_1 * x(t-1) + phi_2 * x(t-2) + ... + phi_p * x(t-p) + e(t)

(Courtesy: Wikipedia)

Here, phi_1, ..., phi_p are the parameters of the model, and e(t) is the error noise.
To realize this model, we keep track of the previous x(t) values and compute the current value as the phi-weighted sum of those values plus the error noise.

Here is the code fragment of the general AR(p) simulator:

  • It creates a buffer initially to prevent 'broken values', and trims that buffer before returning the result.
  • It then builds the backshifts vector. The backshift vector depends on the current iteration from (1..n).
  • After creating the backshifts vector, it extracts the required phi values from the stack.
  • Then it performs the vector multiplication backshifts * parameters and sums the result.
  • The current x(t) value is then obtained by adding white noise to that sum.
# Depends on the `distribution` gem for the Gaussian noise generator;
# create_vector is a small helper in the gem that wraps an Array as a vector.
def ar_sim(n, phi, sigma)
  err_nor = Distribution::Normal.rng(0, sigma)
  # creating a buffer (10 copies of one random draw) to warm up the recursion
  buffer = Array.new(10, err_nor.call())

  x = buffer + Array.new(n, 0)

  # For now "phi" are the known model parameters;
  # later we will obtain them by Yule-Walker/Burg.

  buffer.size.upto(n + buffer.size - 1) do |i|
    if i <= phi.size
      # dependent on the previous accumulation of x
      backshifts = create_vector(x[0...i].reverse)
    else
      # dependent on the phi size/order
      backshifts = create_vector(x[(i - phi.size)...i].reverse)
    end
    parameters = create_vector(phi[0...backshifts.size])

    # phi-weighted sum of the lagged values, plus fresh white noise
    summation = (backshifts * parameters).inject(:+)
    x[i] = summation + err_nor.call()
  end
  # trim the warm-up buffer before returning the n simulated points
  x - buffer
end


There are certain tests which will now be performed in the context of acf and pacf, which I coded earlier. They form the basis for checking the correctness of our autoregressive estimation. :) Expect the next post to be about those tests.

Moving Average Model

This model is represented by MA(q), where q is the order of the model. Again, for a pure moving-average model, we consider p = 0.

This is represented by the following equation:

x(t) = mu + e(t) + theta_1 * e(t-1) + theta_2 * e(t-2) + ... + theta_q * e(t-q)

Unlike the autoregressive model, this model was somewhat harder to obtain. It needs to observe the previous error-noise terms instead of the previous x(t) values, and the series depends largely upon the order of the model.
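
As a rough illustration of that idea, here is a plain-Ruby sketch along the same lines (a sketch only, not the gem's ma_sim): each value is the current white noise plus a theta-weighted sum of the previous q noise terms.

require 'distribution'

def ma_sim_sketch(n, theta, sigma)
  rng = Distribution::Normal.rng(0, sigma)
  # generate the noise series up front, with a few extra leading terms
  # so that every simulated value has q previous noise terms to look at
  noise = Array.new(n + theta.size) { rng.call }
  (theta.size...(n + theta.size)).map do |t|
    lagged = theta.each_with_index.map { |th, j| th * noise[t - 1 - j] }
    noise[t] + lagged.inject(0, :+)
  end
end

ma_2 = ma_sim_sketch(1500, [0.6, 0.3], 2)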

Its code can also be found on my github branch.

I am currently working on the test analysis I mentioned, for various combinations of AR(p) and MA(q). Once that is done, I will move on to implementing a few algorithms like Yule-Walker and Burg's for estimation of the coefficients.

Cheers,
-Ankur Goel