Vector autoregression


Vector autoregression (VAR) is a stochastic process model used to capture the linear interdependencies among multiple time series. VAR models generalize the univariate autoregressive model by allowing for more than one evolving variable. All variables in a VAR enter the model in the same way: each variable has an equation explaining its evolution based on its own lagged values, the lagged values of the other model variables, and an error term. VAR modeling does not require as much knowledge about the forces influencing a variable as do structural models with simultaneous equations: the only prior knowledge required is a list of variables which can be hypothesized to affect each other intertemporally.

Specification

Definition

A VAR model describes the evolution of a set of k variables over the same sample period as a linear function of only their past values. The variables are collected in a k-vector $y_t$, whose i-th element, $y_{i,t}$, is the observation at time t of the i-th variable. For example, if the i-th variable is GDP, then $y_{i,t}$ is the value of GDP at time t.
A p-th order VAR, denoted VAR(p), is
$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t,$$
where the observation $y_{t-i}$ is called the i-th lag of y, c is a k-vector of constants, $A_i$ is a time-invariant (k × k)-matrix and $e_t$ is a k-vector of error terms satisfying
  1. $\mathrm{E}(e_t) = 0$: every error term has mean zero;
  2. $\mathrm{E}(e_t e_t^{\top}) = \Omega$: the contemporaneous covariance matrix of error terms is Ω;
  3. $\mathrm{E}(e_t e_{t-h}^{\top}) = 0$ for any non-zero h: there is no correlation across time; in particular, no serial correlation in individual error terms.
A pth-order VAR is also called a VAR with p lags. The process of choosing the maximum lag p in the VAR model requires special attention because inference is dependent on correctness of the selected lag order.
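The recursion y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t can be illustrated by simulating it. The following is a minimal sketch, not a reference implementation: the function name `simulate_var`, the Gaussian error distribution, the zero start-up values, the burn-in length and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def simulate_var(c, A_list, omega, T, burn_in=100, seed=0):
    """Simulate T observations of y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t."""
    rng = np.random.default_rng(seed)
    c = np.asarray(c, dtype=float)
    k, p = c.shape[0], len(A_list)
    # Assumption: Gaussian errors with mean zero and covariance omega,
    # drawn independently over time (so there is no serial correlation).
    e = rng.multivariate_normal(np.zeros(k), omega, size=T + burn_in)
    y = np.zeros((T + burn_in + p, k))  # the first p rows are zero start-up values
    for t in range(p, T + burn_in + p):
        y[t] = c + e[t - p]
        for i, A in enumerate(A_list, start=1):
            y[t] += A @ y[t - i]        # contribution of the i-th lag
    return y[-T:]                       # drop start-up values and burn-in

# Hypothetical VAR(2) in k = 2 variables
c = [0.1, -0.2]
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])
A2 = np.array([[0.1, 0.0], [0.0, 0.1]])
omega = np.array([[1.0, 0.3], [0.3, 0.5]])
y = simulate_var(c, [A1, A2], omega, T=500)
print(y.shape)
```

Each row of `y` is one observation of the k-vector; the burn-in is discarded so that the arbitrary start-up values have little influence on the returned sample.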

Order of integration of the variables

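Note that all variables have to be of the same order of integration. The following cases are distinct:
  1. All the variables are I(0) (stationary): this is the standard case, a VAR in levels.
  2. All the variables are I(d) (non-stationary) with d > 0 and cointegrated: the error correction term has to be included in the VAR, which becomes a vector error correction model (VECM), a restricted form of VAR.
  3. All the variables are I(d) with d > 0 and not cointegrated: the variables first have to be differenced d times, giving a VAR in differences.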
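One can stack the vectors in order to write a VAR(p) as a stochastic matrix difference equation, with a concise matrix notation:
$$Y = BZ + U,$$
where Y collects the observations $y_t$ as columns, $B = [c \;\; A_1 \;\; \cdots \;\; A_p]$, Z collects the regressor vectors $(1, y_{t-1}^{\top}, \ldots, y_{t-p}^{\top})^{\top}$ as columns, and U collects the error terms $e_t$ as columns.
Details of the matrices are in a separate page.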

Example

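For a general example of a VAR(p) with k variables, see General matrix notation of a VAR(p).
A VAR(1) in two variables can be written in matrix form as
$$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} + \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} e_{1,t} \\ e_{2,t} \end{bmatrix},$$
or, equivalently, as the following system of two equations
$$y_{1,t} = c_1 + a_{1,1} y_{1,t-1} + a_{1,2} y_{2,t-1} + e_{1,t}$$
$$y_{2,t} = c_2 + a_{2,1} y_{1,t-1} + a_{2,2} y_{2,t-1} + e_{2,t}$$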
Each variable in the model has one equation. The current observation of each variable depends on its own lagged values as well as on the lagged values of each other variable in the VAR.

Writing VAR(p) as VAR(1)

A VAR with p lags can always be equivalently rewritten as a VAR with only one lag by appropriately redefining the dependent variable. The transformation amounts to stacking the lags of the VAR variable in the new VAR dependent variable and appending identities to complete the number of equations.
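For example, the VAR(2) model
$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + e_t$$
can be recast as the VAR(1) model
$$\begin{bmatrix} y_t \\ y_{t-1} \end{bmatrix} = \begin{bmatrix} c \\ 0 \end{bmatrix} + \begin{bmatrix} A_1 & A_2 \\ I & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ y_{t-2} \end{bmatrix} + \begin{bmatrix} e_t \\ 0 \end{bmatrix},$$
where I is the identity matrix.
The equivalent VAR(1) form is more convenient for analytical derivations and allows more compact statements.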
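The stacking described above can be sketched in a few lines. The helper name `companion_matrix` and the numerical values are hypothetical; the block only checks, for a VAR with two lags and the error set to zero, that one step of the stacked one-lag system reproduces one step of the original model and that the appended identity rows simply carry the previous observation forward.

```python
import numpy as np

def companion_matrix(A_list):
    """Stack A_1..A_p (each k x k) into the (k*p x k*p) one-lag coefficient matrix."""
    p = len(A_list)
    k = A_list[0].shape[0]
    top = np.hstack(A_list)                       # [A_1 A_2 ... A_p]
    if p == 1:
        return top
    # Identity rows: copy y_{t-1}, ..., y_{t-p+1} into the new stacked vector.
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

# Hypothetical VAR(2) coefficients
c = np.array([0.1, -0.2])
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])
A2 = np.array([[0.1, 0.0], [0.0, 0.1]])
F = companion_matrix([A1, A2])

# One step of both representations, with the error term set to zero
y_lag1, y_lag2 = np.array([1.0, 2.0]), np.array([0.5, -1.0])
direct = c + A1 @ y_lag1 + A2 @ y_lag2
stacked = np.concatenate([c, np.zeros(2)]) + F @ np.concatenate([y_lag1, y_lag2])

print(np.allclose(stacked[:2], direct))   # first block equals the VAR(2) step
print(np.allclose(stacked[2:], y_lag1))   # second block is the appended identity
```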

Structural vs. reduced form

Structural VAR

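A structural VAR with p lags is
$$B_0 y_t = c_0 + B_1 y_{t-1} + B_2 y_{t-2} + \cdots + B_p y_{t-p} + \varepsilon_t,$$
where $c_0$ is a k × 1 vector of constants, $B_i$ is a k × k matrix (for every i = 0, ..., p) and $\varepsilon_t$ is a k × 1 vector of error terms. The main diagonal terms of the $B_0$ matrix are scaled to 1.
The error terms $\varepsilon_t$ (the structural shocks) satisfy conditions 1 to 3 in the definition above, with the particularity that all the off-diagonal elements of the covariance matrix $\mathrm{E}(\varepsilon_t \varepsilon_t^{\top}) = \Sigma$ are zero. That is, the structural shocks are uncorrelated.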
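For example, a two variable structural VAR(1) is:
$$\begin{bmatrix} 1 & B_{0;1,2} \\ B_{0;2,1} & 1 \end{bmatrix} \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} c_{0;1} \\ c_{0;2} \end{bmatrix} + \begin{bmatrix} B_{1;1,1} & B_{1;1,2} \\ B_{1;2,1} & B_{1;2,2} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{bmatrix},$$
where
$$\Sigma = \mathrm{E}(\varepsilon_t \varepsilon_t^{\top}) = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix};$$
that is, the variances of the structural shocks are denoted $\operatorname{var}(\varepsilon_{i,t}) = \sigma_i^2$ (i = 1, 2) and the covariance is $\operatorname{cov}(\varepsilon_{1,t}, \varepsilon_{2,t}) = 0$.
Writing the first equation explicitly and passing $y_{2,t}$ to the right hand side one obtains
$$y_{1,t} = c_{0;1} - B_{0;1,2} y_{2,t} + B_{1;1,1} y_{1,t-1} + B_{1;1,2} y_{2,t-1} + \varepsilon_{1,t}.$$
Note that $y_{2,t}$ can have a contemporaneous effect on $y_{1,t}$ if $B_{0;1,2}$ is not zero. This is different from the case when $B_0$ is the identity matrix, when $y_{2,t}$ can directly impact $y_{1,t+1}$ and subsequent future values, but not $y_{1,t}$.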
Because of the parameter identification problem, ordinary least squares estimation of the structural VAR would yield inconsistent parameter estimates. This problem can be overcome by rewriting the VAR in reduced form.
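From an economic point of view, if the joint dynamics of a set of variables can be represented by a VAR model, then the structural form is a depiction of the underlying, "structural", economic relationships. Two features of the structural form make it the preferred candidate to represent the underlying relations:
  1. Error terms are not correlated. The structural shocks that drive the dynamics of the variables are assumed to be uncorrelated, so the off-diagonal elements of their covariance matrix are zero.
  2. Variables can have a contemporaneous impact on other variables, through the off-diagonal elements of $B_0$.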

Reduced-form VAR

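By premultiplying the structural VAR with the inverse of $B_0$
$$y_t = B_0^{-1} c_0 + B_0^{-1} B_1 y_{t-1} + B_0^{-1} B_2 y_{t-2} + \cdots + B_0^{-1} B_p y_{t-p} + B_0^{-1} \varepsilon_t$$
and denoting
$$B_0^{-1} c_0 = c, \quad B_0^{-1} B_i = A_i \ \text{for } i = 1, \dots, p, \quad B_0^{-1} \varepsilon_t = e_t,$$
one obtains the p-th order reduced VAR
$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t.$$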
Note that in the reduced form all right hand side variables are predetermined at time t. As there are no time t endogenous variables on the right hand side, no variable has a direct contemporaneous effect on other variables in the model.
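However, the error terms in the reduced VAR are composites of the structural shocks: $e_t = B_0^{-1} \varepsilon_t$. Thus, the occurrence of one structural shock $\varepsilon_{i,t}$ can potentially lead to the occurrence of shocks in all error terms $e_{j,t}$, thus creating contemporaneous movement in all endogenous variables. Consequently, the covariance matrix of the reduced VAR
$$\Omega = \mathrm{E}(e_t e_t^{\top}) = \mathrm{E}\left(B_0^{-1} \varepsilon_t \varepsilon_t^{\top} (B_0^{-1})^{\top}\right) = B_0^{-1} \Sigma (B_0^{-1})^{\top},$$
where Σ is the diagonal covariance matrix of the structural shocks, can have non-zero off-diagonal elements, thus allowing non-zero correlation between error terms.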
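The passage from the structural to the reduced form is a handful of matrix products, which the following sketch makes concrete for one lag. All numerical values are hypothetical and chosen only so that the contemporaneous matrix has a unit diagonal and the structural shock covariance is diagonal.

```python
import numpy as np

# Hypothetical structural VAR(1): B0 y_t = c0 + B1 y_{t-1} + eps_t
B0 = np.array([[1.0, 0.4], [-0.3, 1.0]])   # unit main diagonal
B1 = np.array([[0.5, 0.1], [0.2, 0.3]])
c0 = np.array([0.1, -0.2])
sigma = np.diag([1.0, 0.5])                # uncorrelated structural shocks

B0_inv = np.linalg.inv(B0)
c = B0_inv @ c0                            # reduced-form constants
A1 = B0_inv @ B1                           # reduced-form lag matrix
omega = B0_inv @ sigma @ B0_inv.T          # reduced-form error covariance

print(np.round(omega, 3))
```

Although `sigma` is diagonal, `omega` generally has non-zero off-diagonal entries whenever `B0` is not the identity, which is the correlation between reduced-form errors described above.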

Estimation

Estimation of the regression parameters

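Starting from the concise matrix notation
$$Y = BZ + U,$$
the multivariate least squares approach for estimating B yields
$$\hat{B} = Y Z^{\top} (Z Z^{\top})^{-1}.$$
This can be written alternatively as:
$$\operatorname{Vec}(\hat{B}) = \left((Z Z^{\top})^{-1} Z \otimes I_k\right) \operatorname{Vec}(Y),$$
where $\otimes$ denotes the Kronecker product and Vec the vectorization of the indicated matrix.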
This estimator is consistent and asymptotically efficient. It is furthermore equal to the conditional maximum likelihood estimator.
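As in the standard case, the maximum likelihood estimator (MLE) of the covariance matrix differs from the ordinary least squares (OLS) estimator.
ML estimator:
$$\hat{\Omega} = \frac{1}{T} \sum_{t=1}^{T} \hat{e}_t \hat{e}_t^{\top}$$
OLS estimator:
$$\hat{\Omega} = \frac{1}{T - kp - 1} \sum_{t=1}^{T} \hat{e}_t \hat{e}_t^{\top}$$
for a model with a constant, k variables and p lags, where $\hat{e}_t$ are the residuals and T is the number of observations used in the estimation.
In a matrix notation, this gives:
$$\hat{\Omega} = \frac{1}{T - kp - 1} (Y - \hat{B} Z)(Y - \hat{B} Z)^{\top}.$$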
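A minimal NumPy sketch of least squares estimation in the stacked form Y = BZ + U follows. The function names `build_YZ` and `fit_var`, the simulated data and the true parameter values are hypothetical; the layout assumed here places observations in columns, with a leading row of ones in Z for the constant, and computes the error covariance with both the 1/T and the 1/(T - kp - 1) scaling.

```python
import numpy as np

def build_YZ(y, p):
    """y has shape (n_obs, k). Returns Y (k x T) and Z ((k*p+1) x T), with T = n_obs - p."""
    n_obs, k = y.shape
    Y = y[p:].T
    rows = [np.ones(n_obs - p)]                 # constant term
    for i in range(1, p + 1):
        rows.append(y[p - i:n_obs - i].T)       # i-th lag of every variable
    return Y, np.vstack(rows)

def fit_var(y, p):
    Y, Z = build_YZ(y, p)
    k, T = Y.shape
    B_hat = Y @ Z.T @ np.linalg.inv(Z @ Z.T)    # [c, A_1, ..., A_p]
    resid = Y - B_hat @ Z
    omega_mle = resid @ resid.T / T             # maximum likelihood scaling
    omega_ols = resid @ resid.T / (T - k * p - 1)  # degrees-of-freedom scaling
    return B_hat, omega_mle, omega_ols

# Hypothetical data: a simulated VAR(1) with known coefficients
rng = np.random.default_rng(0)
c_true = np.array([0.1, -0.2])
A_true = np.array([[0.5, 0.1], [0.2, 0.3]])
y = np.zeros((2000, 2))
for t in range(1, 2000):
    y[t] = c_true + A_true @ y[t - 1] + rng.normal(size=2)

B_hat, omega_mle, omega_ols = fit_var(y, p=1)
print(np.round(B_hat, 2))   # first column estimates c, the rest estimates A_1
```

The explicit inverse keeps the code close to the formula; a numerically more careful version would typically solve the least squares problem with `np.linalg.lstsq` instead.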

Estimation of the estimator's covariance matrix

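The covariance matrix of the parameters can be estimated as
$$\widehat{\operatorname{Cov}}\left(\operatorname{Vec}(\hat{B})\right) = (Z Z^{\top})^{-1} \otimes \hat{\Omega},$$
where Z is the regressor matrix of the concise matrix notation and $\hat{\Omega}$ is the estimated covariance matrix of the error terms.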
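As an illustration, the Kronecker-product form of the parameter covariance, (Z Z')^{-1} ⊗ Ω̂, can be turned into standard errors for each coefficient. This is a sketch under assumptions: the data are a simulated one-lag model with made-up coefficients, and the error covariance uses the degrees-of-freedom scaling.

```python
import numpy as np

# Hypothetical data: simulated VAR(1) in k = 2 variables
rng = np.random.default_rng(1)
n_obs, k, p = 400, 2, 1
A_true = np.array([[0.5, 0.1], [0.2, 0.3]])
y = np.zeros((n_obs, k))
for t in range(1, n_obs):
    y[t] = A_true @ y[t - 1] + rng.normal(size=k)

Y = y[1:].T                                    # k x T
Z = np.vstack([np.ones(n_obs - 1), y[:-1].T])  # (k*p+1) x T, constant first
T = Y.shape[1]

ZZt_inv = np.linalg.inv(Z @ Z.T)
B_hat = Y @ Z.T @ ZZt_inv
resid = Y - B_hat @ Z
omega_hat = resid @ resid.T / (T - k * p - 1)

# Covariance of Vec(B_hat); Vec stacks the columns of B_hat
cov_vecB = np.kron(ZZt_inv, omega_hat)
# Standard errors arranged like B_hat (column-major order matches Vec)
std_err = np.sqrt(np.diag(cov_vecB)).reshape(B_hat.shape, order="F")
print(np.round(std_err, 3))
```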

Degrees of freedom

Vector autoregression models often involve the estimation of many parameters. For example, with seven variables and four lags, each matrix of coefficients for a given lag length is 7 by 7, and the vector of constants has 7 elements, so a total of 49×4 + 7 = 203 parameters are estimated, substantially lowering the degrees of freedom of the regression. This can hurt the accuracy of the parameter estimates and hence of the forecasts given by the model.
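The parameter count in this example follows from k × k coefficients per lag plus k constants. A small helper (the name `var_param_count` is hypothetical) makes the growth with k and p explicit:

```python
def var_param_count(k, p, constant=True):
    """Number of regression parameters in a VAR with k variables and p lags."""
    return k * k * p + (k if constant else 0)

print(var_param_count(7, 4))   # 49*4 + 7 = 203
print(var_param_count(7, 8))   # doubling the lag length adds another 196 coefficients
```

Because the count grows with the square of k, adding a variable is considerably more costly in degrees of freedom than adding a lag.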

Interpretation of estimated model

Properties of the VAR model are usually summarized through structural analysis: Granger causality tests, impulse responses, and forecast error variance decompositions.

Impulse response

Consider the first-order case (i.e., with only one lag), with equation of evolution
$$y_t = A y_{t-1} + e_t$$
for evolving (state) vector $y$ and vector $e$ of shocks. To find, say, the effect of the j-th element of the vector of shocks upon the i-th element of the state vector 2 periods later, which is a particular impulse response, first write the above equation of evolution one period lagged:
$$y_{t-1} = A y_{t-2} + e_{t-1}.$$
Use this in the original equation of evolution to obtain
$$y_t = A^2 y_{t-2} + A e_{t-1} + e_t;$$
then repeat using the twice lagged equation of evolution, to obtain
$$y_t = A^3 y_{t-3} + A^2 e_{t-2} + A e_{t-1} + e_t.$$
From this, the effect of the j-th component of $e_{t-2}$ upon the i-th component of $y_t$ is the i, j element of the matrix $A^2$.
It can be seen from this induction process that any shock will have an effect on the elements of y infinitely far forward in time, although the effect will become smaller and smaller over time assuming that the AR process is stable, that is, that all the eigenvalues of the matrix A are less than 1 in absolute value.
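The induction above says that the response after h periods to a shock in the first-order model y_t = A y_{t-1} + e_t is given by the h-th power of A. The sketch below computes these matrices and checks the eigenvalue condition for stability; the function names and the matrix values are hypothetical, and the responses are to the error terms as written, without any orthogonalization.

```python
import numpy as np

def impulse_responses(A, horizon):
    """Return [A^0, A^1, ..., A^horizon]; entry (i, j) of A^h is the effect
    of shock j on variable i after h periods."""
    return [np.linalg.matrix_power(A, h) for h in range(horizon + 1)]

def is_stable(A):
    """True if all eigenvalues of A are less than 1 in absolute value."""
    return bool(np.max(np.abs(np.linalg.eigvals(A))) < 1)

A = np.array([[0.5, 0.1], [0.2, 0.3]])   # hypothetical coefficient matrix
irf = impulse_responses(A, horizon=10)

print(is_stable(A))
print(np.round(irf[2], 3))    # two-period responses, equal to A @ A
print(np.round(irf[10], 5))   # responses have largely died out
```

For a model with more than one lag, the same computation can be applied to the stacked one-lag form, reading off the upper-left k × k block of each matrix power.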

Forecasting using an estimated VAR model

An estimated VAR model can be used for forecasting, and the quality of the forecasts can be judged, in ways that are completely analogous to the methods used in univariate autoregressive modelling.
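One common way to produce forecasts, sketched here under assumptions rather than taken from the text, is to iterate the estimated equation forward while setting future error terms to their mean of zero. The function name `forecast_var` and the coefficient values are hypothetical stand-ins for estimated quantities.

```python
import numpy as np

def forecast_var(y_hist, c, A_list, steps):
    """Iterated forecasts from y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p},
    with future errors replaced by zero. y_hist needs at least p rows."""
    p = len(A_list)
    history = [np.asarray(row, dtype=float) for row in y_hist[-p:]]
    forecasts = []
    for _ in range(steps):
        y_next = np.asarray(c, dtype=float).copy()
        for i, A in enumerate(A_list, start=1):
            y_next = y_next + A @ history[-i]   # history[-1] is the latest value
        history.append(y_next)
        forecasts.append(y_next)
    return np.array(forecasts)

# Hypothetical estimated VAR(1) and the last observed values
c = np.array([0.1, -0.2])
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])
y_hist = np.array([[0.0, 0.0], [1.0, 2.0]])

forecasts = forecast_var(y_hist, c, [A1], steps=3)
print(np.round(forecasts, 3))
```

For a stable model the iterated forecasts converge to the unconditional mean of the process as the horizon grows.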

Applications

Christopher Sims has advocated VAR models, criticizing the claims and performance of earlier modeling in macroeconomic econometrics. He recommended VAR models, which had previously appeared in time series statistics and in system identification, a statistical specialty in control theory. Sims advocated VAR models as providing a theory-free method to estimate economic relationships, thus being an alternative to the "incredible identification restrictions" in structural models. VAR models are also increasingly used in health research for automatic analyses of diary data or sensor data.

Software