Problem Set 4: VAR models

Exercise 1

Consider the following VAR

\[\begin{split}\begin{align} y_t &= (1 + \beta) y_{t-1} - \beta \alpha x_{t-1} + \epsilon_{1t}\\ x_t &= \gamma y_{t-1} + (1 - \gamma\alpha) x_{t-1} + \epsilon_{2t} \end{align}\end{split}\]

Show that this VAR is non-stationary.

To show this, we first rewrite the system of equations in matrix notation.

\[\begin{split}\begin{align} \begin{pmatrix} y_t \\ x_t \end{pmatrix} = \begin{pmatrix}1 + \beta & -\alpha\beta\\\gamma & 1 - \alpha\gamma\end{pmatrix} \begin{pmatrix}y_{t-1}\\x_{t-1}\end{pmatrix} + \begin{pmatrix}\epsilon_{1t}\\\epsilon_{2t}\end{pmatrix} \end{align}\end{split}\]

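For stationarity, all roots of the characteristic polynomial have to lie outside the unit circle, that is

\[\det(A(z)) = \det(I_2 - A_1 z) \neq 0 \quad \text{for all } |z| \leq 1\]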

Inserting the values from the previous formula yields

\[\begin{split}\begin{align} \det(A(z)) &= \det(I_2 - A_1 z)\\ &= \det\begin{pmatrix} 1 - (1 + \beta)z & \alpha\beta z\\ -\gamma z & 1 - (1 - \alpha\gamma)z \end{pmatrix}\\ &= (1 - (1 + \beta)z)(1 - (1 - \alpha\gamma)z) + \alpha\beta\gamma z^2\\ &= 1 - (1 - \alpha\gamma) z - (1 + \beta)z + (1 + \beta)(1 - \alpha\gamma)z^2 + \alpha\beta\gamma z^2\\ &= 1 - z + \alpha\gamma z - z - \beta z + (1 - \alpha\gamma + \beta - \alpha\beta\gamma)z^2 + \alpha\beta\gamma z^2\\ &= 1 + (\alpha\gamma - 2 - \beta)z + (1 - \alpha\gamma + \beta)z^2 \end{align}\end{split}\]

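The polynomial factors as

\[\det(A(z)) = (1 - z)\bigl(1 - (1 + \beta - \alpha\gamma)z\bigr)\]

so \(z = 1\) is a root for any values of \(\alpha\), \(\beta\), and \(\gamma\). Indeed, inserting \(z = 1\) into the last line of the derivation gives \(1 + (\alpha\gamma - 2 - \beta) + (1 - \alpha\gamma + \beta) = 0\). This violates the requirement that \(\det(A(z)) \neq 0\) for \(|z| \leq 1\), so the VAR has a unit root and is non-stationary.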
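As a numerical sanity check of the characteristic polynomial above, the following sketch builds \(A_1\) for one set of parameter values and confirms that \(\det(I_2 - A_1 z)\) vanishes at \(z = 1\), or equivalently that \(A_1\) has an eigenvalue equal to one. The values of α, β, and γ and the helper `charpoly` are assumptions chosen only for illustration; they are not part of the exercise.

```julia
using LinearAlgebra

# Hypothetical parameter values, chosen only for illustration
α, β, γ = 0.5, -0.3, 0.4

# Entries of A_1 as given in the matrix form of the VAR
a11 = 1 + β
a12 = -α * β
a21 = γ
a22 = 1 - α * γ
A1 = [a11 a12; a21 a22]

# Characteristic polynomial det(I - A_1 z) evaluated at a given z
charpoly(z) = det(I - A1 * z)

# Eigenvalues of A_1 are the reciprocals of the roots of det(I - A_1 z)
eigs = eigvals(A1)

println("det(I - A1) at z = 1: ", charpoly(1.0))
println("eigenvalues of A1:    ", eigs)
```

The second eigenvalue equals \(1 + \beta - \alpha\gamma\), which matches the second factor of the polynomial.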

Exercise 2

Consider a stationary vector-autoregressive process \(A(L)X_t = \epsilon_t\) and its corresponding moving average representation \(X_t = C(L)\epsilon_t\), where \(C(L) = \sum^\infty_{i=0} C_i L^i\).

Find the moving average coefficients for a \(VAR(1)\) process.

Given a \(VAR(1)\) with \(X_t = \nu + A_1 X_{t-1} + \epsilon_t\), we can rewrite the process the following way:

\[\begin{split}\begin{align} X_t &= \nu + A_1 X_{t-1} + \epsilon_t\\ (I_K - A_1 L) X_t &= \nu + \epsilon_t\\ \end{align}\end{split}\]

Then, for the Wold representation we need that \(C(L)(I_K - A_1 L) = I_K\) with \(C(L) = \sum^\infty_{i=0} C_i L^i\). Expanding the product yields:

\[\begin{split}\begin{align} I_K &= \sum^\infty_{i=0} C_i L^i (I_K - A_1 L)\\ &= C_0 + C_1 L + C_2 L^2 + \dots\\ &- C_0 A_1 L - C_1 A_1 L^2 - \dots\\ \end{align}\end{split}\]

Matching the coefficients on equal powers of the lag operator yields the following relations

\[\begin{split}\begin{align} L^0: && C_0 &= I_K\\ L^1: && C_1 &= C_0 A_1 = A_1\\ L^2: && C_2 &= C_1 A_1 = A^2_1\\ \vdots && \vdots\\ L^i: && C_i &= A^i_1 \end{align}\end{split}\]
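The VAR(1) result can be checked numerically. The sketch below runs the recursion \(C_0 = I_K\), \(C_i = C_{i-1} A_1\) and compares each coefficient with the matrix power \(A_1^i\). The matrix `A1` and the truncation order `n` are hypothetical values chosen for illustration.

```julia
using LinearAlgebra

# Hypothetical VAR(1) coefficient matrix with eigenvalues inside the unit circle
A1 = [0.5 0.1; 0.2 0.3]
K = size(A1, 1)
n = 6  # highest moving average order to compute

# Recursion from matching powers of L: C_0 = I_K, C_i = C_{i-1} * A_1
C = [Matrix{Float64}(I, K, K)]
for i in 1:n
    push!(C, C[end] * A1)
end

# C[i+1] stores C_i; each one should equal the matrix power A_1^i
matches = all(C[i+1] ≈ A1^i for i in 0:n)
println("recursion agrees with matrix powers: ", matches)
```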

Show that the moving average coefficients for a \(VAR(2)\) can be found recursively by \(C_0 = I\), \(C_1 = A_1\), and \(C_i = A_1 C_{i-1} + A_2 C_{i-2}\) for \(i = 2, 3, \dots\).

First, let us define a \(VAR(2)\) process as \(X_t = \nu + A_1 X_{t-1} + A_2 X_{t-2} + \epsilon_t\). Then, rewrite the process with the lag operator and apply the Wold representation.

\[\begin{split}\begin{align} X_t &= \nu + A_1 X_{t-1} + A_2 X_{t-2} + \epsilon_t\\ (I_K - A_1 L - A_2 L^2) X_t &= \nu + \epsilon_t\\ \end{align}\end{split}\]

The Wold representation requires that \(C(L)(I_K - A_1 L - A_2 L^2) = I_K\).

\[\begin{split}\begin{align} I_K &= \sum^\infty_{i=0} C_i L^i (I_K - A_1 L - A_2 L^2)\\ &= C_0 + C_1 L + C_2 L^2 + C_3 L^3 + \dots\\ &- C_0 A_1 L - C_1 A_1 L^2 - C_2 A_1 L^3 - \dots\\ &- C_0 A_2 L^2 - C_1 A_2 L^3 - \dots \end{align}\end{split}\]

Matching the coefficients on equal powers of the lag operator yields the following relations

\[\begin{split}\begin{align} L^0: && C_0 &= I_K\\ L^1: && C_1 &= C_0 A_1 = A_1\\ L^2: && C_2 &= C_1 A_1 + C_0 A_2\\ L^3: && C_3 &= C_2 A_1 + C_1 A_2\\ \vdots && \vdots\\ L^i: && C_i &= C_{i-1} A_1 + C_{i-2} A_2 \end{align}\end{split}\]

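Remark: the recursion derived here has \(A_j\) to the right of \(C_{i-j}\), while the exercise states it with \(A_j\) to the left. Both forms are valid. Substituting \(X_t = C(L)\epsilon_t\) into \(A(L)X_t = \epsilon_t\) gives \(A(L)C(L)\epsilon_t = \epsilon_t\), that is \(A(L)C(L) = I_K\), and matching powers of \(L\) in this product yields \(C_i = A_1 C_{i-1} + A_2 C_{i-2}\) as stated in the exercise. Because \(C(L)\) is the inverse of \(A(L)\), it is both a left and a right inverse, so \(C(L)A(L) = I_K\) holds as well and the two recursions generate the same matrices \(C_i\). For example, both give \(C_2 = A_1^2 + A_2\) and \(C_3 = A_1^3 + A_1 A_2 + A_2 A_1\).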
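A quick numerical check that placing \(A_j\) to the left or to the right of the \(C\) matrices in the VAR(2) recursion produces the same coefficients. The matrices `A1` and `A2`, the helper `ma_coefficients`, and its keyword `left` are hypothetical names and values introduced only for this sketch.

```julia
using LinearAlgebra

# Hypothetical VAR(2) coefficient matrices, chosen only for illustration
A1 = [0.5 0.1; 0.4 0.5]
A2 = [0.0 0.0; 0.25 0.0]

# Compute C_0, ..., C_n (requires n >= 1).
# left = true  uses C_i = A_1 C_{i-1} + A_2 C_{i-2}
# left = false uses C_i = C_{i-1} A_1 + C_{i-2} A_2
function ma_coefficients(A1, A2, n; left=true)
    K = size(A1, 1)
    C = [Matrix{Float64}(I, K, K), Matrix{Float64}(A1)]  # C_0 and C_1
    for i in 2:n
        # C[i] stores C_{i-1} and C[i-1] stores C_{i-2}
        Cnew = left ? A1 * C[i] + A2 * C[i-1] : C[i] * A1 + C[i-1] * A2
        push!(C, Cnew)
    end
    return C
end

C_left = ma_coefficients(A1, A2, 10; left=true)
C_right = ma_coefficients(A1, A2, 10; left=false)

same = all(C_left[i] ≈ C_right[i] for i in eachindex(C_left))
println("left and right recursions agree: ", same)
```

Note that \(A_1\) and \(A_2\) in this example do not commute, so the agreement is not an artifact of a special case.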

Exercise 3

Data Collection

Get the data from the data folder or go to https://fred.stlouisfed.org and download the following quarterly data series (code in parentheses) for the years 1947Q1-2015Q4:

- Real GDP (gdpc96)
- GDP Deflator (GDPDEF)
- Government consumption (GCE)
- Population (B230RC0Q173SBEA)
- Personal Consumption: Nondurable Goods (PCND)
- Personal Consumption: Services (PCESV)

Construct the three time series: real per capita GDP Y, real per capita government consumption G, and real per capita private consumption C (consisting of nondurables and services consumption).

In [1]:

using LinearAlgebra, XLSX

In [2]:

xf = XLSX.readxlsx("problem_set_4_data/us_data.xlsx")

gdp_real = xf["Tabelle1!B8:B283"]
gdp_deflator = xf["Tabelle1!D8:D283"]
nom_gov_cons_inv = xf["Tabelle1!F8:F283"]
nom_priv_cons_ndg = xf["Tabelle1!J8:J283"]
nom_priv_cons_services = xf["Tabelle1!L8:L283"]
nipa_pop = xf["Tabelle1!N8:N283"]

close(xf)

In [3]:

Y = gdp_real ./ nipa_pop
G = nom_gov_cons_inv ./ nipa_pop ./ gdp_deflator
C = (nom_priv_cons_ndg + nom_priv_cons_services) ./ nipa_pop ./ gdp_deflator

timeline = (1947:0.25:2015.75);

VAR Estimation

Use the following function to estimate a \(VAR(4)\) on the vector of observables \(x_t = [log(G_t), log(Y_t), log(C_t)]\) via OLS. Also include a constant and a linear time trend.

In [4]:

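function olsvar(y, p, trend_dummy)
    # y: t×K data matrix (one row per period), p: lag order,
    # trend_dummy: 1 adds a linear time trend, 0 includes only a constant

    t, K = size(y)
    y = y'    # work with a K×t matrix, one column per period

    # create lags: the first block holds lag 1, the following blocks lags 2, ..., p
    Z = y[:, p:t-1]
    for i = 1:p-1
        Z = [Z; y[:, p-i:t-i-1]]
    end

    # prepend the deterministic terms (constant and, optionally, trend)
    if trend_dummy == 1
        Z = [ones(1, t-p); (1:t-p)'; Z]
    else
        Z = [ones(1, t-p); Z]
    end

    Y = y[:, p+1:end]    # left-hand side: observations p+1, ..., t
    # Lecture 8, Equations 2 and following
    A = (Y * Z') / (Z * Z')    # OLS estimate Y Z' (Z Z')^{-1}
    U = Y - A * Z              # residuals
    Σ = U * U' / (t - p - p*K - 1)    # residual covariance with degrees-of-freedom correction
    V = A[:, 1:1+trend_dummy]         # coefficients on the constant (and trend)
    A = A[:, 2+trend_dummy:K*p+1+trend_dummy]    # lag coefficients [A_1 ... A_p]

    np = length(A[:])    # number of lag coefficients, K^2 * p
    Σ_ML = (t - p - p*K - 1) / (t - p) * Σ    # rescale to U U' / (t - p)
    AIC = logdet(Σ_ML) + 2 * np / (t - p)
    BIC = logdet(Σ_ML) + log(t - p) * np / (t - p)
    HQ  = logdet(Σ_ML) + 2 * log(log(t - p)) * np / (t - p)

    return A, Σ, V, U, Y, Z, AIC, BIC, HQ
end;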

The function for the OLS estimation of the \(VAR(4)\) requires three inputs: y is the matrix containing the log series of the three variables (government spending, GDP, and private consumption), p is the number of lags (here 4), and trend_dummy indicates whether the model should include a linear time trend.
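To see the estimator used inside olsvar at work on data where the answer is known, the following sketch simulates a bivariate VAR(1) and recovers its coefficients with the same formula \(Y Z' (Z Z')^{-1}\). The parameter values `A_true` and `ν_true`, the sample size, and the seed are all assumptions made for this illustration and have nothing to do with the US data.

```julia
using LinearAlgebra, Random

Random.seed!(1234)  # fixed seed so that the run is reproducible

# Hypothetical "true" parameters of a stationary bivariate VAR(1)
A_true = [0.5 0.1; 0.2 0.3]
ν_true = [1.0, -0.5]
T = 5_000

# Simulate X_t = ν + A X_{t-1} + ε_t with standard normal shocks
X = zeros(2, T)
for s in 2:T
    X[:, s] = ν_true + A_true * X[:, s-1] + randn(2)
end

# Regressors: a constant and the first lag, stacked row-wise as in olsvar
Z = [ones(1, T - 1); X[:, 1:T-1]]
Y = X[:, 2:T]

# OLS estimate; the first column belongs to the constant
B = (Y * Z') / (Z * Z')
ν_hat = B[:, 1]
A_hat = B[:, 2:end]

println("estimated A: ", round.(A_hat, digits=2))
```

With a sample this long, the estimate should lie close to `A_true`; the exact numbers depend on the random draws.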

In [5]:

x = [log.(G) log.(Y) log.(C)]
lags = 4
trend = 1;

In [6]:

A, Σ, V, U, Y, Z, AIC, BIC, HQ = olsvar(x, lags, trend);

In [7]:

lag_maximum = 8

aic_array = Vector{Float64}(undef, lag_maximum)
bic_array = Vector{Float64}(undef, lag_maximum)
hq_array = Vector{Float64}(undef, lag_maximum)

for i = 1 : lag_maximum
    _, _, _, _, _, _, aic_array[i], bic_array[i], hq_array[i] = olsvar(x[lag_maximum + 1 - i:end, :], i, 1)
end

In [8]:

argmin(aic_array), argmin(bic_array), argmin(hq_array)

Out[8]:

(2, 2, 2)
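For reference, the three criteria computed inside olsvar can be written as below. The notation is introduced here for convenience and is not taken from the lecture: \(T = t - p\) is the effective sample size, \(\tilde\Sigma(p) = UU'/T\) is the residual covariance matrix without degrees-of-freedom correction (Σ_ML in the code), and \(K^2 p\) is the number of lag coefficients (np in the code).

\[\begin{split}\begin{align} AIC(p) &= \ln\det\tilde\Sigma(p) + \frac{2}{T} K^2 p\\ BIC(p) &= \ln\det\tilde\Sigma(p) + \frac{\ln T}{T} K^2 p\\ HQ(p) &= \ln\det\tilde\Sigma(p) + \frac{2 \ln\ln T}{T} K^2 p \end{align}\end{split}\]

The loop in In [7] drops the first lag_maximum - i observations before estimating the model with i lags, so every lag order is fitted on the same effective sample and the criteria are comparable across lag orders. In Out[8] all three criteria select two lags.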