{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem Set 1: Hodrick-Prescott Filter [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/tobiasraabe/time_series/master?filepath=docs%2Fproblem_sets%2Fproblem_set_1.ipynb)\n", "\n", "## Introduction\n", "\n", "A commonly used approach to decompose a time series into a permanent component $y^p$ and a transitory (cyclical) component $y^c$ goes back to Hodrick and Prescott (1997). Suppose you have a sequence of data, $y_t$ with $t = 1,...,T$ (i.e. there are $T$ total observations). Our objective is to find a trend $\\{y^p_t\\}^T_{t=1}$ that minimizes the following objective function:\n", "\n", "$$\n", "f = \\underset{y^p_t}{\\min} \\sum^T_{t=1} (y_t - y^p_t)^2 + \\lambda \\sum^{T-1}_{t=2} [(y^p_{t+1} - y^p_t) - (y^p_t - y^p_{t-1})]^2\n", "$$\n", "\n", "where the parameter $\\lambda \\geq 0$." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 1\n", "\n", "**Question**: Provide a verbal interpretation of this objective function. What are the trade-offs?\n", "\n", "**Answer**: The first term is the sum of squared deviations of the trend component $y^p_t$ from the realization $y_t$. If the objective consisted of this term alone, the minimization problem would yield $y^p_t = y_t$. The second term penalizes the squared second differences of the trend, that is, changes in its slope between adjacent periods, so the trend cannot bend freely to follow the data. The weighting parameter $\\lambda$ controls the trade-off between fit and smoothness. A higher $\\lambda$ leads to a smoother trend, but it might also wash out genuine changes in the trend and attribute them to the cyclical component." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 2\n", "\n", "**Question**: Prove that if $\\lambda = 0$, the solution is $y^p_t = y_t \\forall t$, i.e. 
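the trend and the actual series are identical.\n", "\n", "**Answer**: If $\\lambda = 0$, the minimization problem simplifies to\n", "\n", "$$\n", "\\underset{y^p_t}{\\min} \\sum^T_{t=1} (y_t - y^p_t)^2\n", "$$\n", "\n", "Setting the first derivative with respect to $y^p_t$ to zero gives\n", "\n", "$$\n", "\\begin{align}\n", "\\frac{d f}{d y^p_t} = -2 (y_t - y^p_t) &= 0\\\\\n", "y_t - y^p_t &= 0\\\\\n", "y^p_t &= y_t\n", "\\end{align}\n", "$$\n", "\n", "Since the objective is a sum of squares and therefore convex in each $y^p_t$, this condition identifies the minimum, so $y^p_t = y_t$ for all $t$." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 3\n", "\n", "**Question**: Prove that as $\\lambda \\to \\infty$, the filtered series $\\{y^p_t\\}^T_{t=1}$ is linear (i.e. $y^p_t = \\beta t$ for some $\\beta$)\n", "\n", "**Answer**: Dividing the objective by $\\lambda$ does not change the minimizer. As $\\lambda \\to \\infty$, the first term, $\\frac{1}{\\lambda} \\sum^T_{t=1} (y_t - y^p_t)^2$, approaches 0, so only the penalty term remains. That term is non-negative and equals zero exactly when every second difference vanishes, i.e. $y^p_{t+1} - y^p_t = y^p_t - y^p_{t-1}$ for all $t = 2, \\dots, T-1$. The trend therefore has a constant slope $\\beta$ and is a straight line, $y^p_t = \\alpha + \\beta t$." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 4\n", "\n", "**Question**: For the more general case in which $0 < \\lambda < \\infty$, derive analytical conditions which implicitly define the trend. 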
To do this, take the derivative with respect to $y^p_t$ for each $t = 1, \\dots, T$ and set them equal to zero, yielding $T$ first order conditions.\n", "\n", "**Answer**: The first order conditions take five different forms: one each for $t = 1$, $t = 2$, $t = T-1$ and $t = T$, and one for the interior periods. Each condition below is the derivative divided by 2.\n", "\n", "For $y^p_1$:\n", "$$\n", "\\begin{align}\n", " - (y_1 - y^p_1) + \\lambda (y^p_1 - 2 y^p_2 + y^p_3) &= 0\\\\\n", " (1 + \\lambda) y^p_1 - 2 \\lambda y^p_2 + \\lambda y^p_3 &= y_1\n", "\\end{align}\n", "$$\n", "\n", "For $y^p_2$:\n", "$$\n", "\\begin{align}\n", " - (y_2 - y^p_2) + \\lambda [-2(y^p_1 - 2 y^p_2 + y^p_3) + (y^p_2 - 2 y^p_3 + y^p_4)] &= 0\\\\\n", " -2 \\lambda y^p_1 + (1 + 5\\lambda) y^p_2 - 4 \\lambda y^p_3 + \\lambda y^p_4 &= y_2\n", "\\end{align}\n", "$$\n", "\n", "For $y^p_t$ for $3 \\leq t \\leq T-2$:\n", "$$\n", "\\begin{align}\n", " - (y_t - y^p_t) + \\lambda [(y^p_t - 2y^p_{t-1} + y^p_{t-2}) - 2(y^p_{t+1} - 2 y^p_t + y^p_{t-1}) + (y^p_{t+2} - 2y^p_{t+1} + y^p_{t})] &= 0\\\\\n", " \\lambda y^p_{t-2} - 4 \\lambda y^p_{t-1} + (1 + 6 \\lambda) y^p_t - 4 \\lambda y^p_{t+1} + \\lambda y^p_{t+2} &= y_t\n", "\\end{align}\n", "$$\n", "\n", "For $y^p_{T-1}$:\n", "$$\n", "\\begin{align}\n", " - (y_{T-1} - y^p_{T-1}) + \\lambda [(y^p_{T-3} - 2y^p_{T-2} + y^p_{T-1}) - 2(y^p_T - 2y^p_{T-1} + y^p_{T-2})] &= 0\\\\\n", " \\lambda y^p_{T-3} - 4 \\lambda y^p_{T-2} + (1 + 5\\lambda) y^p_{T-1} - 2\\lambda y^p_T &= y_{T-1}\n", "\\end{align}\n", "$$\n", "\n", "For $y^p_{T}$:\n", "$$\n", "\\begin{align}\n", " - (y_T - y^p_T) + \\lambda(y^p_T - 2y^p_{T-1} + y^p_{T-2}) &= 0\\\\\n", " \\lambda y^p_{T-2} - 2\\lambda y^p_{T-1} + (1 + \\lambda) y^p_T &= y_T\n", "\\end{align}\n", "$$" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 5\n", "\n", "**Question**: Write a *Julia* function to find the trend taking the actual data series $y_t$ and parameter $\\lambda$ as inputs. 
To do this, express the first order conditions in Exercise 4 in matrix form as follows:\n", "\n", "$$\n", "\\begin{align}\n", "\\Lambda Y^p &= Y\\\\\n", "Y^p &= \\Lambda^{-1}Y\n", "\\end{align}\n", "$$\n", "\n", "where $\\Lambda$ is a $T \\times T$ matrix whose elements are functions of $\\lambda$, $Y^p$ is a $T \\times 1$ vector equal to $[y^p_1, y^p_2, \\dots, y^p_T]'$ and $Y$ is a $T \\times 1$ vector equal to $[y_1, y_2, \\dots, y_T]'$." ] },
{ "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [
"# Solve the T first order conditions from Exercise 4 for the trend.\n",
"# Assumes T >= 4 so that the boundary rows fit into the matrix.\n",
"function hodrick_prescott_filter(y, lambda)\n",
"    T = size(y, 1)\n",
"    matrix = zeros(T, T)\n",
"\n",
"    # Boundary rows for t = 1 and t = 2\n",
"    matrix[1, 1:3] = [1 + lambda, -2 * lambda, lambda]\n",
"    matrix[2, 1:4] = [-2 * lambda, 1 + 5 * lambda, -4 * lambda, lambda]\n",
"\n",
"    # Interior rows for 3 <= t <= T - 2\n",
"    for i = 3 : T - 2\n",
"        matrix[i, i-2 : i+2] = [lambda, -4*lambda, 1 + 6 * lambda, -4 * lambda, lambda]\n",
"    end\n",
"\n",
"    # Boundary rows for t = T - 1 and t = T\n",
"    matrix[T-1, T-3:T] = [lambda, -4 * lambda, 1 + 5 * lambda, -2 * lambda]\n",
"    matrix[T, T-2:T] = [lambda, -2 * lambda, 1 + lambda]\n",
"\n",
"    trend = matrix \\ y\n",
"    cycle = y - trend\n",
"\n",
"    return trend, cycle\n",
"end;" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 6" ] },
{ "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "using DataFrames, Plots, SparseArrays, XLSX" ] },
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "y = float.(XLSX.readdata(\"problem_set_1_data/us_real_gdp.xlsx\", \"FRED Graph\", \"B12:B295\"));" ] },
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "trend, cycle = hodrick_prescott_filter(y, 1600);" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "timeline = 1947:0.25:2017.75;" ] },
{ "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Plot: two stacked panels sharing the time axis (1950 to 2010); top panel: Cycle, bottom panel: Trend]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [
"plot(\n",
"    plot(timeline, cycle, label=\"Cycle\", color=:blue),\n",
"    plot(timeline, trend, label=\"Trend\", color=:red),\n",
"    link=:x, layout=(2, 1),\n",
"    legend=:topleft,\n",
")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "To validate the method, we compare it with another implementation of the HP filter from http://www.econforge.org/posts/2014/juil./28/cef2014-julia/ (note the API change introduced by https://github.com/JuliaLang/julia/pull/23757 and the new form explained in https://github.com/JuliaLang/julia/pull/23757/files#diff-7904f4ddd9158030529e0ed5ee8707eeR1771)."
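, "\n",
"\n",
"As a further cross-check, here is a minimal sketch (not taken from either implementation in this notebook) that relies on the observation that the first order conditions from Exercise 4 can be written as $(I + \\lambda D'D) Y^p = Y$, where $Y^p$ stacks the trend values and $D$ is the $(T-2) \\times T$ second-difference matrix. The helper name `hp_trend_via_differences` and the synthetic series are hypothetical and serve only as illustration. A series that is exactly linear has zero second differences, so the filter should return it unchanged for any $\\lambda$:\n",
"\n",
"```julia\n",
"using LinearAlgebra\n",
"\n",
"# Hypothetical helper, not part of the original notebook.\n",
"function hp_trend_via_differences(y, lambda)\n",
"    T = length(y)\n",
"    # Row i of D maps the trend to its second difference at period i + 1.\n",
"    D = zeros(T - 2, T)\n",
"    for i in 1:T-2\n",
"        D[i, i:i+2] = [1.0, -2.0, 1.0]\n",
"    end\n",
"    # Same linear system as the banded matrix built in Exercise 5.\n",
"    return (I + lambda * (D' * D)) \\ y\n",
"end\n",
"\n",
"y_linear = collect(2.0 .+ 0.5 .* (1:20))  # synthetic, exactly linear series\n",
"trend_linear = hp_trend_via_differences(y_linear, 1600)\n",
"@assert isapprox(trend_linear, y_linear)\n",
"```\n",
"\n",
"Applied to the GDP series, this trend should agree with both implementations in this notebook up to floating-point error."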
] },
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [
"function hp_filter(y, lambda)\n",
"    n = length(y)\n",
"    @assert n >= 4\n",
"\n",
"    diag2 = lambda*ones(n-2)\n",
"    diag1 = [ -2lambda; -4lambda*ones(n-3); -2lambda ]\n",
"    diag0 = [ 1+lambda; 1+5lambda; (1+6lambda)*ones(n-4); 1+5lambda; 1+lambda ]\n",
"\n",
"    D = spdiagm(-2 => diag2, -1 => diag1, 0 => diag0, 1 => diag1, 2 => diag2)\n",
"\n",
"    trend = D \\ y\n",
"    cycle = y - trend\n",
"\n",
"    return trend, cycle\n",
"end;" ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "trend, cycle = hp_filter(y, 1600);" ] },
{ "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Plot: two stacked panels sharing the time axis (1950 to 2010); top panel: Cycle, bottom panel: Trend]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [
"plot(\n",
"    plot(timeline, cycle, label=\"Cycle\", color=:blue),\n",
"    plot(timeline, trend, label=\"Trend\", color=:red),\n",
"    link=:x, layout=(2, 1), legend=:topleft,\n",
")" ] } ],
"metadata": { "kernelspec": { "display_name": "Julia 1.0.0", "language": "julia", "name": "julia-1.0" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.0.0" } }, "nbformat": 4, "nbformat_minor": 2 }