Fixed effects model


In statistics, a fixed effects model is a statistical model in which the model parameters are fixed or non-random quantities. This is in contrast to random effects models and mixed models, in which all or some of the model parameters are random variables. In many applications, including econometrics and biostatistics, a fixed effects model refers to a regression model in which the group means are fixed (non-random), as opposed to a random effects model, in which the group means are a random sample from a population. Generally, data can be grouped according to several observed factors, and the group means can be modeled as fixed or random effects for each grouping. In a fixed effects model each group mean is a group-specific fixed quantity.
In panel data where longitudinal observations exist for the same subject, fixed effects represent the subject-specific means. In panel data analysis the term fixed effects estimator is used to refer to an estimator for the coefficients in the regression model including those fixed effects.

Qualitative description

Such models assist in controlling for omitted variable bias due to unobserved heterogeneity when this heterogeneity is constant over time. This heterogeneity can be removed from the data through differencing, for example by subtracting the group-level average over time, or by taking a first difference, either of which removes any time-invariant components of the model.
There are two common assumptions made about the individual-specific effect: the random effects assumption and the fixed effects assumption. The random effects assumption is that the individual-specific effects are uncorrelated with the independent variables. The fixed effects assumption is that the individual-specific effects may be correlated with the independent variables. If the random effects assumption holds, the random effects estimator is more efficient than the fixed effects estimator. However, if this assumption does not hold, the random effects estimator is not consistent. The Durbin–Wu–Hausman test is often used to discriminate between the fixed and the random effects models.

Formal model and assumptions

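Consider the linear unobserved effects model for $N$ observations and $T$ time periods:
$y_{it} = X_{it}\beta + \alpha_i + u_{it}$ for $t = 1, \dots, T$ and $i = 1, \dots, N$
Where:
$y_{it}$ is the dependent variable observed for individual $i$ at time $t$.
$X_{it}$ is the time-variant $1 \times k$ regressor vector, where $k$ is the number of independent variables.
$\beta$ is the $k \times 1$ vector of parameters.
$\alpha_i$ is the unobserved time-invariant individual effect.
$u_{it}$ is the error term.
Unlike $X_{it}$, $\alpha_i$ cannot be directly observed.
Unlike the random effects model, where the unobserved $\alpha_i$ is independent of $X_{it}$ for all $t = 1, \dots, T$, the fixed effects (FE) model allows $\alpha_i$ to be correlated with the regressor matrix $X_{it}$. Strict exogeneity with respect to the idiosyncratic error term $u_{it}$ is still required.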
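To make the role of the unobserved effect concrete, the following minimal sketch simulates data from a linear unobserved effects model with a single regressor and fits a pooled OLS regression that ignores the panel structure. The data-generating design (standard normal draws, a regressor built as the individual effect plus noise) and the names `simulate_panel` and `pooled_ols_slope` are illustrative assumptions, not part of any library. Under this particular design the pooled slope is biased upward by roughly 0.5, because the individual effect is correlated with the regressor.

```python
import numpy as np

def simulate_panel(n_units=500, n_periods=5, beta=2.0, seed=0):
    """Draw y_it = beta * x_it + alpha_i + u_it with alpha_i correlated with x_it."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=(n_units, 1))               # unobserved individual effect
    x = alpha + rng.normal(size=(n_units, n_periods))   # regressor depends on alpha_i
    u = rng.normal(size=(n_units, n_periods))           # idiosyncratic error
    y = beta * x + alpha + u
    return y, x

def pooled_ols_slope(y, x):
    """Slope from regressing y on x and a constant, ignoring the panel structure."""
    xc = x.ravel() - x.mean()
    yc = y.ravel() - y.mean()
    return float(xc @ yc / (xc @ xc))

y, x = simulate_panel()
b_pooled = pooled_ols_slope(y, x)
print(f"true beta = 2.0, pooled OLS slope = {b_pooled:.3f}")
```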

Statistical estimation

Fixed effects estimator

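Since $\alpha_i$ is not observable, it cannot be directly controlled for. The FE model eliminates $\alpha_i$ by demeaning the variables using the within transformation:
$y_{it} - \bar{y}_i = (X_{it} - \bar{X}_i)\beta + (\alpha_i - \bar{\alpha}_i) + (u_{it} - \bar{u}_i) \implies \ddot{y}_{it} = \ddot{X}_{it}\beta + \ddot{u}_{it}$
where $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$, $\bar{X}_i = \frac{1}{T}\sum_{t=1}^{T} X_{it}$, and $\bar{u}_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}$.
Since $\alpha_i$ is constant, $\bar{\alpha}_i = \alpha_i$ and hence the effect is eliminated. The FE estimator $\hat{\beta}_{FE}$ is then obtained by an OLS regression of $\ddot{y}$ on $\ddot{X}$.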
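The within transformation can be sketched in a few lines of NumPy. This is a minimal illustration under assumed conventions (a balanced panel stored as arrays of shape (N, T) and (N, T, k)); the function name `within_estimator` and the simulated data are hypothetical.

```python
import numpy as np

def within_estimator(y, X):
    """FE estimate of beta for a balanced panel. y: (N, T); X: (N, T, k)."""
    y_dd = y - y.mean(axis=1, keepdims=True)   # y_it minus the individual mean
    X_dd = X - X.mean(axis=1, keepdims=True)   # X_it minus the individual mean
    k = X.shape[2]
    # OLS of the demeaned outcome on the demeaned regressors (no intercept).
    beta_hat, *_ = np.linalg.lstsq(X_dd.reshape(-1, k), y_dd.ravel(), rcond=None)
    return beta_hat

# Simulated example: regressors are correlated with the individual effect.
rng = np.random.default_rng(0)
N, T = 500, 5
beta = np.array([2.0, -1.0])
alpha = rng.normal(size=(N, 1))
X = alpha[:, :, None] + rng.normal(size=(N, T, 2))
y = X @ beta + alpha + rng.normal(size=(N, T))

beta_fe = within_estimator(y, X)
print(beta_fe)
```

Because demeaning removes anything constant within an individual, adding an arbitrary individual-specific constant to `y` leaves the estimate unchanged.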
At least three alternatives to the within transformation exist, each with variations.
The first is to add a dummy variable for each individual. This is numerically, but not computationally, equivalent to the fixed effect model, and it only works if the sum of the number of series and the number of global parameters is smaller than the number of observations. The dummy variable approach is particularly demanding of computer memory and is not recommended for problems larger than the available RAM, or the compiled program in use, can accommodate.
The second alternative is to iterate repeatedly between the local (series-specific) and global estimations. This approach is well suited to low-memory systems, on which it is much more computationally efficient than the dummy variable approach.
The third approach is a nested estimation in which the local estimation for each individual series is programmed as part of the model definition. This approach is the most computationally and memory efficient, but it requires proficient programming skills and access to the model programming code; it can, however, be programmed even in SAS.
Finally, each of the above alternatives can be improved if the series-specific estimation is linear (within a nonlinear model), in which case the direct linear solution for each individual series can be programmed as part of the nonlinear model definition.
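The numerical equivalence of the dummy variable approach and the within transformation can be checked on a small simulated panel. The sketch below is illustrative only: it assumes a balanced panel, uses one indicator column per individual with no separate intercept, and all variable names are hypothetical. It also makes the memory cost visible, since the design matrix gains one column per individual.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, k = 30, 4, 2
alpha = rng.normal(size=(N, 1))
X = alpha[:, :, None] + rng.normal(size=(N, T, k))
y = X @ np.array([1.5, -0.5]) + alpha + rng.normal(size=(N, T))

# Dummy variable regression: k slopes plus one indicator column per individual.
# Rows are ordered individual by individual, matching X.reshape(-1, k).
dummies = np.kron(np.eye(N), np.ones((T, 1)))        # shape (N*T, N)
design = np.hstack([X.reshape(-1, k), dummies])      # shape (N*T, k + N)
coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
beta_lsdv, alpha_hat = coef[:k], coef[k:]

# Within regression on demeaned data for comparison.
X_dd = (X - X.mean(axis=1, keepdims=True)).reshape(-1, k)
y_dd = (y - y.mean(axis=1, keepdims=True)).ravel()
beta_within, *_ = np.linalg.lstsq(X_dd, y_dd, rcond=None)

print(beta_lsdv, beta_within)
```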

First difference estimator

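An alternative to the within transformation is the first difference transformation, which produces a different estimator. For $t = 2, \dots, T$:
$y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})\beta + (\alpha_i - \alpha_i) + (u_{it} - u_{i,t-1}) \implies \Delta y_{it} = \Delta X_{it}\beta + \Delta u_{it}$
The first difference (FD) estimator $\hat{\beta}_{FD}$ is then obtained by an OLS regression of $\Delta y_{it}$ on $\Delta X_{it}$.
When $T = 2$, the first difference and fixed effects estimators are numerically equivalent. For $T > 2$, they are not. If the error terms $u_{it}$ are homoskedastic with no serial correlation, the fixed effects estimator is more efficient than the first difference estimator. If $u_{it}$ follows a random walk, however, the first difference estimator is more efficient.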
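A minimal sketch of the first difference estimator, assuming a balanced panel stored as arrays of shape (N, T) and (N, T, k) with periods in chronological order. The function name `first_difference_estimator` and the simulated data are hypothetical.

```python
import numpy as np

def first_difference_estimator(y, X):
    """FD estimate of beta for a balanced panel. y: (N, T); X: (N, T, k)."""
    dy = np.diff(y, axis=1)    # y_it - y_i,t-1, shape (N, T-1)
    dX = np.diff(X, axis=1)    # X_it - X_i,t-1, shape (N, T-1, k)
    k = X.shape[2]
    # OLS of the differenced outcome on the differenced regressors (no intercept).
    beta_hat, *_ = np.linalg.lstsq(dX.reshape(-1, k), dy.ravel(), rcond=None)
    return beta_hat

rng = np.random.default_rng(2)
N, T = 500, 5
beta_true = np.array([2.0, -1.0])
alpha = rng.normal(size=(N, 1))
X = alpha[:, :, None] + rng.normal(size=(N, T, 2))
y = X @ beta_true + alpha + rng.normal(size=(N, T))

beta_fd = first_difference_estimator(y, X)
print(beta_fd)
```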

Equality of fixed effects and first difference estimators when T=2

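For the special two-period case ($T = 2$), the fixed effects estimator and the first difference estimator are numerically equivalent. This is because the FE estimator effectively "doubles the data set" used in the FD estimator. To see this, establish that the fixed effects estimator is:
$\hat{\beta}_{FE} = \left[\sum_{i=1}^{N} (X_{i1} - \bar{X}_i)'(X_{i1} - \bar{X}_i) + (X_{i2} - \bar{X}_i)'(X_{i2} - \bar{X}_i)\right]^{-1} \left[\sum_{i=1}^{N} (X_{i1} - \bar{X}_i)'(y_{i1} - \bar{y}_i) + (X_{i2} - \bar{X}_i)'(y_{i2} - \bar{y}_i)\right]$
Since each $(X_{i1} - \bar{X}_i)$ can be rewritten as $\left(X_{i1} - \frac{X_{i1} + X_{i2}}{2}\right) = \frac{X_{i1} - X_{i2}}{2}$, and similarly for the other demeaned terms, the line can be rewritten as:
$\hat{\beta}_{FE} = \left[\sum_{i=1}^{N} \left(\frac{X_{i1} - X_{i2}}{2}\right)'\frac{X_{i1} - X_{i2}}{2} + \left(\frac{X_{i2} - X_{i1}}{2}\right)'\frac{X_{i2} - X_{i1}}{2}\right]^{-1} \left[\sum_{i=1}^{N} \left(\frac{X_{i1} - X_{i2}}{2}\right)'\frac{y_{i1} - y_{i2}}{2} + \left(\frac{X_{i2} - X_{i1}}{2}\right)'\frac{y_{i2} - y_{i1}}{2}\right]$
$= \left[\sum_{i=1}^{N} 2\left(\frac{X_{i2} - X_{i1}}{2}\right)'\frac{X_{i2} - X_{i1}}{2}\right]^{-1} \left[\sum_{i=1}^{N} 2\left(\frac{X_{i2} - X_{i1}}{2}\right)'\frac{y_{i2} - y_{i1}}{2}\right]$
$= 2\left[\sum_{i=1}^{N} (X_{i2} - X_{i1})'(X_{i2} - X_{i1})\right]^{-1} \left[\frac{1}{2}\sum_{i=1}^{N} (X_{i2} - X_{i1})'(y_{i2} - y_{i1})\right]$
$= \left[\sum_{i=1}^{N} (X_{i2} - X_{i1})'(X_{i2} - X_{i1})\right]^{-1} \sum_{i=1}^{N} (X_{i2} - X_{i1})'(y_{i2} - y_{i1}) = \hat{\beta}_{FD}$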
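The two-period equivalence can also be checked numerically. The following sketch, with hypothetical helper names `fe`, `fd`, and `draw_panel` and an assumed simulated design, computes both estimators on a panel with two periods and on one with three periods.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients without an intercept."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fe(y, X):
    """Within (fixed effects) estimator for a balanced panel."""
    k = X.shape[2]
    return ols((X - X.mean(axis=1, keepdims=True)).reshape(-1, k),
               (y - y.mean(axis=1, keepdims=True)).ravel())

def fd(y, X):
    """First difference estimator for a balanced panel."""
    k = X.shape[2]
    return ols(np.diff(X, axis=1).reshape(-1, k), np.diff(y, axis=1).ravel())

def draw_panel(N, T, rng, beta=(1.0, -2.0)):
    """Simulate y_it = X_it beta + alpha_i + u_it with X correlated with alpha_i."""
    alpha = rng.normal(size=(N, 1))
    X = alpha[:, :, None] + rng.normal(size=(N, T, len(beta)))
    y = X @ np.array(beta) + alpha + rng.normal(size=(N, T))
    return y, X

rng = np.random.default_rng(4)
y2, X2 = draw_panel(50, 2, rng)   # two periods
y3, X3 = draw_panel(50, 3, rng)   # three periods

print("T=2:", fe(y2, X2), fd(y2, X2))   # agree up to floating-point error
print("T=3:", fe(y3, X3), fd(y3, X3))   # generally differ
```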

Chamberlain method

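Chamberlain's method, a generalization of the within estimator, replaces $\alpha_i$ with its linear projection onto the explanatory variables. Writing the linear projection as:
$\alpha_i = \lambda_0 + X_{i1}\lambda_1 + X_{i2}\lambda_2 + \dots + X_{iT}\lambda_T + e_i$
this results in the following equation:
$y_{it} = \lambda_0 + X_{i1}\lambda_1 + \dots + X_{it}(\lambda_t + \beta) + \dots + X_{iT}\lambda_T + e_i + u_{it}$
which can be estimated by minimum distance estimation.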

Hausman–Taylor method

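The method requires more than one time-variant regressor ($X$) and time-invariant regressor ($Z$), and at least one $X$ and one $Z$ that are uncorrelated with $\alpha_i$.
Partition the $X$ and $Z$ variables such that $X = [X_1 \; X_2]$ and $Z = [Z_1 \; Z_2]$, where $X_1$ ($TN \times K_1$) and $Z_1$ ($TN \times G_1$) are uncorrelated with $\alpha_i$, while $X_2$ ($TN \times K_2$) and $Z_2$ ($TN \times G_2$) may be correlated with it. Identification requires $K_1 \geq G_2$.
Estimating $\gamma$ in $\hat{d}_i = Z_i\gamma + \varphi_{it}$, where $\hat{d}_i$ is the individual mean of the residuals from the within regression, using $X_1$ and $Z_1$ as instruments yields a consistent estimate.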

Generalization with input uncertainty

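When there is input uncertainty $\delta y$ for the $y$ data, the $\chi^2$ value, rather than the sum of squared residuals, should be minimized. This can be achieved directly by the substitution:
$\frac{y_{it}}{\delta y_{it}} = \frac{X_{it}}{\delta y_{it}}\beta + \alpha_i\frac{1}{\delta y_{it}} + \frac{u_{it}}{\delta y_{it}}$
The values and standard deviations for $\beta$ and $\alpha_i$ can then be determined via classical ordinary least squares analysis and the variance-covariance matrix.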
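The substitution amounts to dividing every row of the regression (outcome, regressors, and individual indicators) by the uncertainty of that observation and then running ordinary least squares. The sketch below assumes the uncertainties are known standard deviations, here called `dy`; this name, the single-regressor design, and the simulated values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 20, 6
beta_true = 2.0
alpha_true = rng.normal(size=N)
x = alpha_true[:, None] + rng.normal(size=(N, T))
dy = rng.uniform(0.2, 1.0, size=(N, T))          # assumed known std dev of each y_it
y = beta_true * x + alpha_true[:, None] + dy * rng.normal(size=(N, T))

# Columns: x, then one indicator per individual (rows ordered by individual).
design = np.hstack([x.reshape(-1, 1), np.kron(np.eye(N), np.ones((T, 1)))])

# Scale each row by 1 / dy so that ordinary least squares minimizes chi-squared.
w = 1.0 / dy.ravel()
A = design * w[:, None]
b = y.ravel() * w

coef, *_ = np.linalg.lstsq(A, b, rcond=None)     # coef[0] is beta, coef[1:] are alpha_i
cov = np.linalg.inv(A.T @ A)                     # covariance when dy are true std devs
std_err = np.sqrt(np.diag(cov))
chi2 = float(np.sum((b - A @ coef) ** 2))        # minimized chi-squared value

print(f"beta = {coef[0]:.3f} +/- {std_err[0]:.3f}, chi2 = {chi2:.1f}")
```

If the stated uncertainties are realistic, the minimized $\chi^2$ should be of the same order as the number of observations minus the number of fitted parameters.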

Testing fixed effects (FE) vs. random effects (RE)

We can test whether a fixed or random effects model is appropriate using a Durbin–Wu–Hausman test.
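The null hypothesis $H_0$ is that $\alpha_i$ is uncorrelated with the regressors; the alternative $H_a$ is that it is correlated with them.
If $H_0$ is true, both $\hat{\beta}_{RE}$ and $\hat{\beta}_{FE}$ are consistent, but only $\hat{\beta}_{RE}$ is efficient. If $H_a$ is true, $\hat{\beta}_{FE}$ is consistent and $\hat{\beta}_{RE}$ is not.
With $\hat{Q} = \hat{\beta}_{RE} - \hat{\beta}_{FE}$, the test statistic in its standard form is $\hat{Q}'\left[\operatorname{Var}(\hat{\beta}_{FE}) - \operatorname{Var}(\hat{\beta}_{RE})\right]^{-1}\hat{Q}$, which is asymptotically $\chi^2_K$ distributed under $H_0$, where $K = \dim(\hat{Q})$.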
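The Hausman test is a specification test, so a large test statistic might indicate errors-in-variables (EIV) or a misspecified model. If the FE assumption is true, we should find that $\hat{\beta}_{LD} \approx \hat{\beta}_{FD} \approx \hat{\beta}_{FE}$, where LD denotes the long-difference estimator.
A simple heuristic is that if $|\hat{\beta}_{LD}| > |\hat{\beta}_{FE}| > |\hat{\beta}_{FD}|$ there could be EIV.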
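As an illustration, the textbook form of the Hausman statistic, the quadratic form of the difference between the FE and RE coefficient vectors weighted by the inverse of the difference of their covariance matrices, can be computed as follows. The function name `hausman_statistic` and the numbers in the example are made up for illustration; in practice the estimates and covariance matrices would come from fitted FE and RE models.

```python
import numpy as np

def hausman_statistic(b_fe, b_re, v_fe, v_re):
    """Return (statistic, degrees of freedom) for the FE-versus-RE comparison.

    b_fe, b_re: coefficient vectors on the same K regressors.
    v_fe, v_re: their estimated K x K covariance matrices.
    """
    q = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v_diff = np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float)
    # Quadratic form q' (V_FE - V_RE)^{-1} q, computed without an explicit inverse.
    stat = float(q @ np.linalg.solve(v_diff, q))
    return stat, q.size

# Hypothetical estimates, chosen only to illustrate the arithmetic.
b_fe = [1.0, 2.0]
b_re = [0.9, 2.1]
v_fe = np.diag([0.02, 0.03])
v_re = np.diag([0.01, 0.02])

stat, dof = hausman_statistic(b_fe, b_re, v_fe, v_re)
critical_5pct = 5.991   # 5% critical value of chi-squared with 2 degrees of freedom
print(stat, dof, "reject H0" if stat > critical_5pct else "do not reject H0")
```

In finite samples the estimated covariance difference need not be positive definite, in which case the statistic computed this way can be negative and should not be interpreted.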