Vector autoregression

What Is Vector Autoregression?

Vector autoregression (VAR) is a statistical model used in econometrics and quantitative finance to capture the linear interdependencies among multiple time series variables. Unlike simpler statistical modeling techniques that focus on a single variable, VAR models treat all variables in the system as endogenous variables, meaning they mutually influence each other. This approach allows for the analysis of dynamic relationships without imposing restrictive assumptions about the causal structure between variables. Each variable in a vector autoregression model is expressed as a linear function of its own lagged values and the lagged values of all other variables in the system.

History and Origin

The concept of vector autoregression gained prominence in macroeconomics primarily through the work of American economist Christopher A. Sims. In a seminal 1980 paper titled "Macroeconomics and Reality," Sims challenged the prevailing large-scale macroeconomic models that relied on numerous identifying assumptions. He proposed VAR models as a more data-driven alternative to estimate economic relationships. Sims was awarded the Nobel Memorial Prize in Economic Sciences in 2011, shared with Thomas J. Sargent, for their empirical research on cause and effect in the macroeconomy, with Sims specifically recognized for his development and application of vector autoregression.¹⁶ His advocacy for VAR methods provided a new framework for economists to analyze how economic variables respond to temporary changes in economic policy and other factors.¹⁵

Key Takeaways

Vector autoregression (VAR) models analyze the dynamic interdependencies among multiple time series variables.
VAR treats all variables as endogenous, meaning each variable is modeled as a function of its own past and the past values of all other variables in the system.
Introduced by Christopher Sims, VAR became a pivotal tool for forecasting and policy analysis in macroeconomics.
Key outputs include impulse response functions, which illustrate how variables respond to shocks, and variance decomposition, which shows how much of a variable's forecast error variance is explained by shocks to each variable in the system.
While powerful, VAR models can suffer from overfitting with too many variables or lags and may not be suitable for direct structural policy analysis without additional identification assumptions.

Formula and Calculation

A basic vector autoregression model of order $p$, denoted VAR($p$), for $k$ variables can be expressed in matrix form as:

$Y_t = c + A_1 Y_{t-1} + A_2 Y_{t-2} + \dots + A_p Y_{t-p} + \epsilon_t$

Where:

$Y_t$ is a $k \times 1$ vector of observed variables at time $t$.
$c$ is a $k \times 1$ vector of constants (intercepts).
$A_i$ for $i = 1, \dots, p$ are $k \times k$ coefficient matrices that capture the linear relationships between the variables at different lags.
$\epsilon_t$ is a $k \times 1$ vector of error terms (disturbances), assumed to be white noise and possibly correlated across equations.

Each equation within the VAR system is essentially a linear regression where a variable is regressed on its own past values and the past values of all other variables in the system. The estimation typically uses Ordinary Least Squares (OLS) for each equation separately.
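The equation-by-equation OLS estimation described above can be sketched with NumPy. This is a minimal illustration, not a production estimator: the helper names `simulate_var1` and `fit_var_ols`, the noise scale, and the simulated sample are all assumptions introduced here, and the "true" coefficients are simply borrowed from the article's hypothetical example.

```python
import numpy as np

def simulate_var1(c, A, n_obs, rng, noise_scale=0.1):
    """Simulate n_obs observations from Y_t = c + A Y_{t-1} + eps_t."""
    k = len(c)
    Y = np.zeros((n_obs, k))
    for t in range(1, n_obs):
        Y[t] = c + A @ Y[t - 1] + noise_scale * rng.standard_normal(k)
    return Y

def fit_var_ols(Y, p):
    """Estimate a VAR(p) by OLS.

    Returns (c_hat, A_hat), where A_hat[i] is the k x k matrix on lag i + 1.
    """
    n_obs, k = Y.shape
    # Regressors: a column of ones, then lag 1, ..., lag p of every variable.
    lags = [Y[p - i : n_obs - i] for i in range(1, p + 1)]
    X = np.hstack([np.ones((n_obs - p, 1))] + lags)
    target = Y[p:]
    # lstsq solves all k equations at once; every equation shares the same
    # regressors, so this matches running OLS on each equation separately.
    B, *_ = np.linalg.lstsq(X, target, rcond=None)
    c_hat = B[0]
    A_hat = [B[1 + i * k : 1 + (i + 1) * k].T for i in range(p)]
    return c_hat, A_hat

rng = np.random.default_rng(0)
c_true = np.array([0.1, 0.5])                    # assumed intercepts
A_true = np.array([[0.8, 0.05], [-0.1, 0.7]])    # assumed lag-1 matrix
Y = simulate_var1(c_true, A_true, n_obs=5000, rng=rng)
c_hat, A_hat = fit_var_ols(Y, p=1)
print(np.round(A_hat[0], 2))  # should land close to A_true
```

Stacking all lags into one regressor matrix keeps the sketch short; with a long simulated sample the estimates should sit near the coefficients used to generate the data.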

Interpreting the Vector Autoregression

Interpreting the raw coefficients from a vector autoregression model can be complex due to the interconnected nature of the variables. Since each variable is regressed on the past values of all other variables, a change in one coefficient affects the entire system. Therefore, economists and analysts often rely on derived statistics such as impulse response functions (IRFs) and variance decomposition to understand the model's implications.

Impulse response functions trace the effect of a one-standard-deviation shock to one variable on the current and future values of all variables in the system. This provides a dynamic view of how shocks propagate through the economy or financial system. Variance decomposition, on the other hand, quantifies the proportion of the forecast error variance of each variable that can be attributed to shocks in each variable in the VAR system, including its own. These tools help in understanding the relationships and influences between different endogenous variables over time.
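For a VAR(1), both quantities can be computed directly from the coefficient matrix and the error covariance. The sketch below is illustrative only: it assumes a Cholesky (recursive) ordering to orthogonalize the shocks, which is one common choice among several, and the covariance matrix `sigma` and the function names are hypothetical.

```python
import numpy as np

def impulse_responses(A, horizons):
    """Responses of a VAR(1) to unit shocks: Phi_h = A^h for h = 0..horizons."""
    k = A.shape[0]
    phis = [np.eye(k)]
    for _ in range(horizons):
        phis.append(A @ phis[-1])
    return np.array(phis)  # shape (horizons + 1, k, k)

def orthogonalized_irf(A, sigma, horizons):
    """Responses to one-standard-deviation orthogonalized shocks (Cholesky)."""
    P = np.linalg.cholesky(sigma)  # lower triangular, sigma = P P'
    return impulse_responses(A, horizons) @ P

def fevd(A, sigma, horizons):
    """Share of each variable's forecast error variance due to each shock.

    Entry [h - 1, i, j] is the share of variable i's h-step variance
    attributed to shock j.
    """
    theta = orthogonalized_irf(A, sigma, horizons - 1)
    contrib = np.cumsum(theta ** 2, axis=0)  # accumulate over horizons
    return contrib / contrib.sum(axis=2, keepdims=True)

A = np.array([[0.8, 0.05], [-0.1, 0.7]])         # illustrative lag-1 matrix
sigma = np.array([[0.04, 0.01], [0.01, 0.25]])   # hypothetical error covariance
theta = orthogonalized_irf(A, sigma, horizons=12)
shares = fevd(A, sigma, horizons=12)
print(np.round(theta[1], 3))    # responses one period after each shock
print(np.round(shares[-1], 3))  # 12-step variance shares; each row sums to 1
```

Under the Cholesky ordering, the variable listed first does not respond within the period to a shock in the second, so the results depend on the order in which the variables are arranged.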

Hypothetical Example

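Consider a hypothetical two-variable vector autoregression model examining the relationship between corporate bond yields (CBY) and the stock market volatility index (VIX). We might use monthly data to estimate the model and forecast future values. The 12 months of data assumed in this toy illustration would be far too few observations for reliable estimation in practice.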

A VAR(1) model (using one lag) for these two variables could be expressed as:

$\begin{align*} \text{CBY}_t &= c_1 + a_{11} \text{CBY}_{t-1} + a_{12} \text{VIX}_{t-1} + \epsilon_{1t} \\ \text{VIX}_t &= c_2 + a_{21} \text{CBY}_{t-1} + a_{22} \text{VIX}_{t-1} + \epsilon_{2t} \end{align*}$

Suppose, after estimation, we find the following simplified coefficients:

$\begin{align*} \text{CBY}_t &= 0.1 + 0.8 \text{CBY}_{t-1} + 0.05 \text{VIX}_{t-1} + \epsilon_{1t} \\ \text{VIX}_t &= 0.5 - 0.1 \text{CBY}_{t-1} + 0.7 \text{VIX}_{t-1} + \epsilon_{2t} \end{align*}$

If in the previous month ($t-1$), CBY was 3% and VIX was 18:

Forecast for CBY at time $t$: $0.1 + (0.8 \times 3) + (0.05 \times 18) = 0.1 + 2.4 + 0.9 = 3.4\%$
Forecast for VIX at time $t$: $0.5 - (0.1 \times 3) + (0.7 \times 18) = 0.5 - 0.3 + 12.6 = 12.8$

This simplified example demonstrates how the past values of both variables jointly contribute to the forecasting of their future values within the vector autoregression framework.
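The arithmetic above can be reproduced in a few lines of Python, and iterating the same step gives multi-period forecasts. This is a sketch using the hypothetical coefficients from the example; the `forecast` helper is an illustrative name, and the error terms are set to their expected value of zero.

```python
import numpy as np

c = np.array([0.1, 0.5])                    # intercepts for CBY and VIX
A = np.array([[0.8, 0.05], [-0.1, 0.7]])    # lag-1 coefficient matrix
y_prev = np.array([3.0, 18.0])              # CBY (%) and VIX at t-1

def forecast(c, A, y_last, steps):
    """Iterate Y_t = c + A Y_{t-1}, feeding each forecast into the next step."""
    path = []
    y = y_last
    for _ in range(steps):
        y = c + A @ y
        path.append(y)
    return np.array(path)

path = forecast(c, A, y_prev, steps=3)
print(np.round(path, 2))  # first row is approximately [3.4, 12.8]
```

Each later row reuses the previous forecast as its input, which is how a VAR produces forecasts beyond one step ahead.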

Practical Applications

Vector autoregression models are widely applied across various fields, particularly in finance and economics, due to their ability to capture complex interdependencies in time series data.

Some key applications include:

Macroeconomic Forecasting and Policy Analysis: Central banks, such as the Federal Reserve and the European Central Bank, frequently use VAR models to forecast key macroeconomic indicators like GDP, inflation, and unemployment. They are also employed to assess the impact of monetary policy decisions (e.g., changes in interest rates) on the broader economy.¹³, ¹⁴
Financial Market Analysis: VAR models can be used to analyze relationships between financial assets, such as stock prices, exchange rates, and commodity prices. This can aid in asset allocation strategies, understanding market dynamics, and portfolio management.¹²
Granger Causality Testing: VAR models provide a natural framework for testing for Granger causality, which examines whether past values of one variable can help predict another, beyond its own past values.
Risk Management: Financial institutions may use VAR models in stress testing to simulate adverse economic scenarios and evaluate the resilience of their portfolios to various shocks.¹¹
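As an illustration of the Granger causality application listed above, the test can be framed as an F-test comparing a regression of a variable on its own lags with one that also includes lags of the other variable. The sketch below computes only the F statistic; the function name and the simulated data are assumptions made for illustration, and a p-value would additionally require the F distribution with (p, dof) degrees of freedom.

```python
import numpy as np

def granger_f_stat(y, x, p):
    """F statistic for H0: lags 1..p of x do not help predict y."""
    n = len(y)
    y_lags = np.column_stack([y[p - i : n - i] for i in range(1, p + 1)])
    x_lags = np.column_stack([x[p - i : n - i] for i in range(1, p + 1)])
    const = np.ones((n - p, 1))
    target = y[p:]

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        return resid @ resid

    rss_restricted = rss(np.hstack([const, y_lags]))            # own lags only
    rss_unrestricted = rss(np.hstack([const, y_lags, x_lags]))  # plus lags of x
    dof = (n - p) - (2 * p + 1)
    return ((rss_restricted - rss_unrestricted) / p) / (rss_unrestricted / dof)

# Simulated data in which x leads y but y does not feed back into x.
rng = np.random.default_rng(1)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

f_x_to_y = granger_f_stat(y, x, p=1)
f_y_to_x = granger_f_stat(x, y, p=1)
print(f_x_to_y, f_y_to_x)  # the first should be much larger than the second
```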

Limitations and Criticisms

Despite their widespread use, vector autoregression models have several limitations and have faced criticism:

Curse of Dimensionality: The number of parameters to be estimated in a VAR model grows quadratically with the number of variables $k$ and linearly with the number of lags $p$: there are $k^2 p$ lag coefficients, plus $k$ intercepts. Even a moderately sized system can therefore require an excessive number of parameters, making the model less parsimonious and prone to overfitting, especially with limited data.⁸, ⁹, ¹⁰
Interpretation of Coefficients: The direct interpretation of individual coefficients in a VAR model can be challenging because all variables are interdependent. Analysts often rely on impulse response functions or variance decomposition for meaningful insights.⁶, ⁷
Stationarity Requirement: Standard VAR models assume that the underlying time series are stationary (i.e., their statistical properties like mean and variance do not change over time). Non-stationary data often require transformation (e.g., differencing) or the use of more advanced techniques like vector error correction models (VECM) if the variables are cointegrated.⁴, ⁵
Exogeneity Assumption: While VAR models treat all included variables as endogenous, they implicitly assume that any variables not included in the model are exogenous variables and do not influence the system, or that their influence is captured by the error terms. If relevant variables are omitted, the model's results may be biased or incomplete.
Policy Analysis Limitations: Simple VAR models are not always well-suited for direct fiscal policy or monetary policy counterfactual analysis without further "identifying restrictions" to distinguish between structural shocks. This has been a subject of ongoing debate in econometrics.³
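Two of the limitations above lend themselves to quick checks: the parameter count, and whether a fitted VAR is stable (a condition closely tied to stationarity). The sketch below is a rough illustration with hypothetical helper names. It uses the standard companion-matrix condition that all eigenvalues lie strictly inside the unit circle, applied to the coefficient matrix from the article's hypothetical example.

```python
import numpy as np

def var_param_count(k, p, intercept=True):
    """Lag coefficients (k^2 * p) plus optional intercepts (k) in a VAR(p)."""
    return k * k * p + (k if intercept else 0)

def is_stable(A_list):
    """True if all eigenvalues of the companion matrix lie inside the unit circle."""
    k = A_list[0].shape[0]
    p = len(A_list)
    companion = np.zeros((k * p, k * p))
    companion[:k, :] = np.hstack(A_list)              # top block row: A_1 ... A_p
    if p > 1:
        companion[k:, :-k] = np.eye(k * (p - 1))      # identity below the top row
    return bool(np.all(np.abs(np.linalg.eigvals(companion)) < 1))

print(var_param_count(2, 1))   # 6 parameters for the two-variable VAR(1)
print(var_param_count(6, 4))   # 150 parameters for six variables and four lags

A = np.array([[0.8, 0.05], [-0.1, 0.7]])
print(is_stable([A]))                     # True for the example coefficients
print(is_stable([np.array([[1.0]])]))     # False: a random walk has a unit root
```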

Vector Autoregression vs. Autoregressive Integrated Moving Average (ARIMA)

Vector autoregression (VAR) and Autoregressive Integrated Moving Average (ARIMA) are both widely used models in time series forecasting, but they differ fundamentally in their scope and application.

The primary distinction lies in the number of variables they handle:

ARIMA models are designed for univariate time series analysis. They focus on modeling and forecasting a single variable based on its own past values (autoregressive, AR), past forecast errors (moving average, MA), and differencing (integrated, I) to achieve stationarity.
Vector Autoregression (VAR) models are built for multivariate time series analysis. They simultaneously model multiple interdependent variables, where each variable's current value is a function of its own lagged values and the lagged values of all other variables in the system.

In essence, if the goal is to predict a single series and external factors are either not considered or treated as exogenous variables in an ARIMAX extension, ARIMA is often suitable. However, if the analysis requires understanding the dynamic interrelationships and feedback loops among several variables, the vector autoregression framework is the appropriate choice. VAR models do not require prior assumptions about which variables are causes or effects, making them flexible for systems with bidirectional relationships.¹, ²

FAQs

What is the main purpose of vector autoregression?

The main purpose of vector autoregression is to model and analyze the dynamic relationships among multiple time series variables. It is often used for forecasting future values of these variables and understanding how shocks to one variable propagate through the system.

How is a VAR model different from a simple autoregressive (AR) model?

A simple autoregressive model forecasts a single variable based only on its own past values. A vector autoregression (VAR) model, conversely, forecasts multiple variables simultaneously, where each variable depends on its own past values and the past values of all other variables in the system. This allows VAR to capture complex interdependencies.

What are impulse response functions in the context of VAR?

Impulse response functions are a key tool used to interpret vector autoregression models. They illustrate the estimated dynamic response of all variables in the system to a one-time, unexpected shock (or "impulse") in one of the variables, tracing its effects over time. This helps understand the direction and persistence of influence among variables.

Can VAR models be used for causal inference?

While VAR models can show statistical relationships (like Granger causality), they do not inherently establish true structural causation in an economic sense without imposing additional identifying assumptions. They primarily describe correlations and dynamic responses within the system.

What data is typically used for vector autoregression?

Vector autoregression is applied to time series data, which consists of observations collected sequentially over time. This data is commonly found in macroeconomics (e.g., GDP, inflation, interest rates) and finance (e.g., stock prices, exchange rates, commodity prices). The data must be stationary or transformed to achieve stationarity for standard VAR models.