Overidentifying restrictions

What Are Overidentifying Restrictions?

Overidentifying restrictions are a concept in econometrics that arise when there are more valid instrumental variables (instruments) available than strictly necessary to estimate the parameters of a statistical or structural equation model. In such scenarios, the number of moment conditions, which are mathematical expressions reflecting known population characteristics, exceeds the number of unknown parameters to be estimated. This surplus of information provides a way to test the validity of the chosen instruments and the overall model specification.

This concept is particularly central to advanced estimation techniques like the Generalized Method of Moments (GMM) and Instrumental Variables (IV) estimation. When overidentifying restrictions are present, they impose testable conditions on the data that, if violated, suggest either that the instruments are not truly exogenous (i.e., they are correlated with the error term) or that the underlying model is misspecified.

History and Origin

The foundation for dealing with overidentifying restrictions was laid as econometricians sought robust methods to estimate models where direct relationships between variables were complicated by endogeneity. Early work on instrumental variables provided a solution for situations where explanatory variables were correlated with the error term. J.D. Sargan, in his seminal 1958 paper, "The Estimation of Economic Relationships Using Instrumental Variables," developed a test for overidentifying restrictions within the context of instrumental variable estimation, often referred to as the Sargan test.¹¹

Later, in 1982, Lars Peter Hansen significantly generalized the instrumental variables approach with his development of the Generalized Method of Moments (GMM). His paper, "Large Sample Properties of Generalized Method of Moments Estimators," provided a comprehensive framework for parameter estimation that relies on moment conditions and is applicable to a wide range of linear and nonlinear econometric models.¹⁰,⁹,⁸ The Hansen J-test, a cornerstone of GMM, extends Sargan's idea, allowing for testing these overidentifying restrictions in more general settings, including those with heteroskedasticity and serially correlated errors. These developments were crucial for establishing the asymptotic properties of estimators derived from such complex models.

Key Takeaways

Overidentifying restrictions exist when there are more instruments (moment conditions) than parameters being estimated in an econometric model.
They provide testable implications for the validity of instruments and the correctness of the model specification.
The Sargan-Hansen J-test is the primary statistical test used to evaluate overidentifying restrictions.
A rejection of the null hypothesis in the J-test indicates potential problems with instrument exogeneity or model specification.
These restrictions are fundamental in Generalized Method of Moments (GMM) and Instrumental Variables (IV) estimation.

Formula and Calculation

In the context of the Generalized Method of Moments (GMM), the test statistic for overidentifying restrictions, commonly known as the Sargan-Hansen J-test, is based on the minimized value of the GMM objective function.

Let the set of sample moment conditions be denoted by (g_T(\beta)), where (T) is the sample size and (\beta) is the vector of parameters to be estimated. The GMM estimator chooses (\beta) to minimize a quadratic form:

\min_{\beta} g_T(\beta)' W_T g_T(\beta)

where (W_T) is a positive semi-definite weighting matrix that converges in probability to a positive definite matrix (W_0).
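As a concrete instance of these definitions, the linear instrumental-variables case can be written out. The notation below ((y_t), (x_t), (z_t), (u_t), (m), (k)) is introduced here for illustration only and is an assumption, not part of the general setup above.

```latex
% Illustrative linear IV model (assumed notation): k regressors x_t, m instruments z_t
y_t = x_t'\beta + u_t, \qquad E[z_t u_t] = 0

% Sample moment conditions: an m-vector, one entry per instrument
g_T(\beta) = \frac{1}{T} \sum_{t=1}^{T} z_t \, (y_t - x_t'\beta)

% Overidentification: more moment conditions than parameters
m > k
```

With (m > k) there is generally no (\beta) that sets all (m) sample moments to zero at once, which is why the weighting matrix (W_T) is needed to trade them off against one another.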

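If the model is correctly specified and the instruments are valid, the population moment conditions equal zero at the true parameter value, so the sample moments evaluated at the estimate should be close to zero. The J-test statistic, (J), measures how far they are from zero and is given by: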

J = T \cdot g_T(\hat{\beta})' \hat{W} g_T(\hat{\beta})

Here, (\hat{\beta}) is the GMM estimator and (\hat{W}) is a consistent estimator of the optimal weighting matrix, namely the inverse of the estimated covariance matrix of the moment conditions. Under the null hypothesis that the model specification is correct and the instruments are valid, the J-statistic asymptotically follows a chi-squared distribution with degrees of freedom equal to the number of overidentifying restrictions. The number of overidentifying restrictions is calculated as the number of moment conditions (or instruments) minus the number of estimated parameters.
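The calculation can be sketched in Python for one simple special case. The sketch assumes a linear model with a single endogenous regressor, no intercept, and homoskedastic errors, in which case the statistic takes the Sargan form J = u'Z(Z'Z)^{-1}Z'u / sigma^2 with two-stage least squares residuals u. The function names (`solve`, `sargan_j`, `simulate`) and the simulated data are hypothetical, and only the standard library is used to keep the example self-contained; in practice a numerical library would handle the linear algebra.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(Z, v):
    """Return Z'v for a list of instrument rows Z and a data vector v."""
    return [sum(row[j] * vi for row, vi in zip(Z, v)) for j in range(len(Z[0]))]


def sargan_j(y, x, Z):
    """2SLS estimate and Sargan J statistic for y = beta * x + u.

    x is a single endogenous regressor; each row of Z holds m instruments.
    Returns (beta_hat, J, degrees_of_freedom).
    """
    T, m = len(y), len(Z[0])
    ZtZ = [[sum(row[a] * row[b] for row in Z) for b in range(m)] for a in range(m)]
    Ztx, Zty = cross(Z, x), cross(Z, y)
    first_stage = solve(ZtZ, Ztx)  # (Z'Z)^{-1} Z'x
    beta_hat = dot(first_stage, Zty) / dot(first_stage, Ztx)
    resid = [yi - beta_hat * xi for yi, xi in zip(y, x)]
    Ztu = cross(Z, resid)
    sigma2 = dot(resid, resid) / T
    # J = T * g' W g with g = Z'u / T and W = (sigma2 * Z'Z / T)^{-1}
    J = dot(Ztu, solve(ZtZ, Ztu)) / sigma2
    return beta_hat, J, m - 1  # one estimated parameter, so df = m - 1


def simulate(T, m, seed):
    """Hypothetical data: m valid instruments, true beta = 1, endogenous x."""
    rng = random.Random(seed)
    y, x, Z = [], [], []
    for _ in range(T):
        z = [rng.gauss(0, 1) for _ in range(m)]
        v, e = rng.gauss(0, 1), rng.gauss(0, 1)
        xi = sum(z) + v   # the instruments drive x
        ui = 0.5 * v + e  # the error shares v with x, so x is endogenous
        Z.append(z)
        x.append(xi)
        y.append(1.0 * xi + ui)
    return y, x, Z


beta_hat, J, df = sargan_j(*simulate(T=2000, m=3, seed=0))
print(f"beta_hat = {beta_hat:.3f}, J = {J:.3f}, df = {df}")
```

The returned J would then be compared with a chi-squared distribution with `df` degrees of freedom. Under heteroskedasticity or serial correlation, the weighting matrix would need a robust covariance estimate instead of the simple sigma^2 * Z'Z / T used here.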

Interpreting the Overidentifying Restrictions

The interpretation of the J-test statistic for overidentifying restrictions is crucial for statistical inference in econometric models. Under the null hypothesis, the overidentifying restrictions are valid, implying that the specified moment conditions hold true and, consequently, the chosen instrumental variables are exogenous and correctly excluded from the structural equation. A high p-value (typically above a chosen significance level like 0.05) suggests that one cannot reject the null hypothesis, which is generally taken as evidence in favor of the model's validity and the instruments' exogeneity.

Conversely, a low p-value (below the significance level) indicates a rejection of the null hypothesis. This rejection signals that there is evidence against the validity of the overidentifying restrictions. Such a result can imply one or both of the following:

Invalid Instruments: At least one of the instrumental variables is not truly exogenous, meaning it is correlated with the error term of the structural equation.
Model Misspecification: The econometric model itself is incorrectly specified, perhaps due to omitted variables, incorrect functional form, or other structural errors.

Therefore, the J-test serves as a diagnostic tool for researchers engaging in hypothesis testing to assess the internal consistency of their model and the credibility of their instruments.

Hypothetical Example

Consider an econometrician studying the impact of education on income. They postulate a model where income (dependent variable) is a function of education (explanatory variable) and other factors. However, education might be an endogenous variable because unobserved factors like innate ability can influence both income and the level of education attained. To address this, the econometrician employs instrumental variables.

Suppose they propose three potential instruments for education:

Proximity to a college in adolescence: This is argued to influence educational attainment but not directly affect later income, making it a candidate instrument.
Parental education level: Similar reasoning.
Local average tuition costs during college-going age: Again, assumed to influence education decisions but not directly income.

If the structural equation for income requires only one instrument to identify the effect of education, but the researcher has three candidate instruments, then there are two overidentifying restrictions (3 instruments - 1 endogenous variable = 2 overidentifying restrictions). The econometrician can estimate the model using GMM and then perform the J-test.

If the J-test yields a high p-value (e.g., 0.35), it suggests that the data are consistent with the hypothesis that all three instruments are valid and that the model is correctly specified. If the p-value is low (e.g., 0.01), it indicates a problem: either one or more of the proposed instruments are not truly exogenous (e.g., parental education has a direct effect on income beyond its influence on the child's education), or there's an issue with the underlying income determination model.
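The arithmetic behind this example can be sketched as follows. The J values 2.10 and 9.21 are hypothetical, chosen only because they correspond to the two p-values mentioned above; the helper names are likewise illustrative. With exactly two degrees of freedom, the chi-squared upper-tail probability has the closed form exp(-J/2), so no statistics library is needed for this particular case.

```python
import math


def overid_df(n_instruments, n_endogenous):
    """Number of overidentifying restrictions in the example's setup."""
    return n_instruments - n_endogenous


def chi2_sf_df2(j_stat):
    """Upper-tail probability of a chi-squared variable; exact for 2 degrees of freedom only."""
    return math.exp(-j_stat / 2.0)


df = overid_df(n_instruments=3, n_endogenous=1)
for j_stat in (2.10, 9.21):  # hypothetical J statistics
    p_value = chi2_sf_df2(j_stat)
    verdict = "reject" if p_value < 0.05 else "fail to reject"
    print(f"J = {j_stat:.2f}, df = {df}, p = {p_value:.3f} -> {verdict} H0 at 5%")
# J = 2.10, df = 2, p = 0.350 -> fail to reject H0 at 5%
# J = 9.21, df = 2, p = 0.010 -> reject H0 at 5%
```

For other degrees of freedom the closed form no longer applies, and a general chi-squared distribution function would be needed.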

Practical Applications

Overidentifying restrictions and their associated tests are widely applied across various fields of economics and finance where endogeneity is a concern. In financial models, they are routinely used to evaluate the validity of asset pricing models, such as the consumption-based capital asset pricing model (CCAPM), where consumption growth serves as a key driver of asset returns.⁷,⁶ Researchers often use GMM to estimate the parameters of these models, and the J-test is then employed to check if the implied moment conditions are consistent with the observed financial data.

Beyond asset pricing, overidentifying restrictions are critical in:

Corporate Finance: Analyzing the impact of capital structure decisions on firm value, controlling for the endogeneity of financing choices.
Macroeconomics: Estimating dynamic stochastic general equilibrium (DSGE) models where agents' expectations and policy rules introduce complex endogenous relationships.
Labor Economics: Studying the causal effects of policy interventions or educational attainment on wages, where individual characteristics often lead to endogenous participation.

The ability to test overidentifying restrictions provides empirical researchers with a vital diagnostic tool, enhancing the credibility of their causal inferences in settings where controlled experiments are not feasible.⁵,⁴

Limitations and Criticisms

Despite their utility, tests of overidentifying restrictions, such as the Sargan-Hansen J-test, have important limitations. One significant issue arises in the presence of weak instruments. If instruments are only weakly correlated with the endogenous variables, the J-test can have poor size properties (i.e., it may reject the null hypothesis too often or too rarely even when it is true), leading to unreliable conclusions.³

Another challenge occurs when there are "many instruments," meaning the number of instruments is a large fraction of the sample size. In such cases, the chi-squared approximation to the distribution of the J-test statistic can be poor in finite samples, so the test's actual rejection rate may differ substantially from its nominal level and its ability to detect invalid instruments can be weakened. Researchers have proposed various adjustments and alternative tests to address this "many instruments" problem.²,¹

Furthermore, a rejection of the null hypothesis by the J-test is an omnibus result; it signals a problem but does not pinpoint whether the issue lies with specific instruments or with the overall model specification. It's also important to note that a high p-value (failure to reject) does not definitively "prove" the validity of the instruments or the model, but rather indicates that the data do not contain sufficient evidence to contradict the specified restrictions. Researchers must therefore use these tests in conjunction with other diagnostic methods and economic theory to build a compelling case for their findings.
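A small simulation can illustrate both caveats. The sketch below is a hypothetical design, not an established benchmark: one endogenous regressor with true coefficient 1, two instruments `z1` and `z2`, and direct effects `gamma1` and `gamma2` of the instruments on the outcome (zero means valid). It uses the homoskedastic Sargan form of the statistic and the 5% critical value 3.841 of a chi-squared distribution with one degree of freedom. All function and variable names are illustrative.

```python
import random


def sargan_j_2iv(y, x, z1, z2):
    """2SLS estimate and Sargan J for y = beta * x + u with two instruments."""
    T = len(y)
    s11 = sum(a * a for a in z1)
    s22 = sum(a * a for a in z2)
    s12 = sum(a * b for a, b in zip(z1, z2))
    det = s11 * s22 - s12 * s12

    def moments(v):
        # Z'v for the two instruments
        return (sum(a * b for a, b in zip(z1, v)), sum(a * b for a, b in zip(z2, v)))

    def quad(a, b):
        # a' (Z'Z)^{-1} b using the closed-form 2x2 inverse
        return (a[0] * (s22 * b[0] - s12 * b[1]) + a[1] * (s11 * b[1] - s12 * b[0])) / det

    zx, zy = moments(x), moments(y)
    beta_hat = quad(zx, zy) / quad(zx, zx)
    resid = [yi - beta_hat * xi for yi, xi in zip(y, x)]
    zu = moments(resid)
    sigma2 = sum(r * r for r in resid) / T
    return beta_hat, quad(zu, zu) / sigma2


def experiment(gamma1, gamma2, reps=200, T=300, seed=0):
    """Return (rejection rate at 5%, mean beta_hat) over simulated samples."""
    rng = random.Random(seed)
    critical_value = 3.841  # 5% critical value, chi-squared with 1 degree of freedom
    rejections, beta_sum = 0, 0.0
    for _ in range(reps):
        z1 = [rng.gauss(0, 1) for _ in range(T)]
        z2 = [rng.gauss(0, 1) for _ in range(T)]
        v = [rng.gauss(0, 1) for _ in range(T)]
        e = [rng.gauss(0, 1) for _ in range(T)]
        x = [a + b + c for a, b, c in zip(z1, z2, v)]
        # gamma1 and gamma2 are direct effects of the instruments on y
        y = [xi + gamma1 * a + gamma2 * b + 0.5 * c + d
             for xi, a, b, c, d in zip(x, z1, z2, v, e)]
        beta_hat, j_stat = sargan_j_2iv(y, x, z1, z2)
        if j_stat > critical_value:
            rejections += 1
        beta_sum += beta_hat
    return rejections / reps, beta_sum / reps


scenarios = [
    ("both instruments valid", 0.0, 0.0),
    ("one instrument invalid", 0.0, 0.5),
    ("both invalid in the same proportion", 0.5, 0.5),
]
for label, g1, g2 in scenarios:
    rate, mean_beta = experiment(g1, g2)
    print(f"{label}: rejection rate = {rate:.2f}, mean beta_hat = {mean_beta:.2f}")
```

In this design, the third scenario is constructed so that both instruments point to the same wrong coefficient (1.5 rather than 1). The moment conditions then agree with one another, and the test should reject only about as often as it does when the instruments are valid, even though the estimate is biased. In the second scenario a rejection is likely, but the statistic alone does not say which instrument is at fault.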

Overidentifying Restrictions vs. Just-identified Model

The concept of overidentifying restrictions is best understood in contrast to a just-identified model. The key distinction lies in the number of available instrumental variables relative to the number of endogenous variables in the model.

Feature	Overidentifying Restrictions	Just-Identified Model
Instrument Count	More instrumental variables than endogenous variables needing instruments.	Exactly the same number of instrumental variables as endogenous variables.
Testability	Implies testable restrictions on the model's parameters and instrument validity (e.g., via the J-test).	No testable overidentifying restrictions; the model is identified, but its internal consistency cannot be tested in this manner.
Information Content	Contains "surplus" information from extra instruments, allowing for diagnostic checks.	Uses precisely the information needed for identification, with no redundancy for internal consistency checks.
Estimation	Typically estimated using 2SLS or GMM; efficient GMM weights the surplus moment conditions optimally.	IV, 2SLS, and GMM all yield the same estimate, because the weighting matrix does not affect the solution when the sample moments can be set exactly to zero.

In an overidentified model, the existence of more instruments than necessary provides a valuable opportunity to assess the plausibility of the assumptions underlying the chosen instruments. In contrast, a just-identified model, while estimable, offers no such internal mechanism to verify the validity of its instruments through statistical tests of overidentifying restrictions. This highlights a key advantage of having overidentifying restrictions: they empower researchers to perform crucial diagnostic checks on their econometric models.
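Why a just-identified model leaves nothing to test can be seen in the linear IV case. The notation is introduced here for illustration and is an assumption: the model is y = X\beta + u, with X a T-by-k matrix of regressors and Z a T-by-m matrix of instruments. When m = k, the matrix Z'X is square, and if it is invertible the sample moments can be set exactly to zero:

```latex
% Just-identified linear IV (m = k), assuming Z'X is invertible
\hat{\beta}_{IV} = (Z'X)^{-1} Z'y

% Sample moments evaluated at the estimate vanish identically
g_T(\hat{\beta}_{IV})
  = \frac{1}{T} Z'\left(y - X\hat{\beta}_{IV}\right)
  = \frac{1}{T} \left( Z'y - Z'X (Z'X)^{-1} Z'y \right)
  = 0

% Hence the J statistic is zero for any weighting matrix
J = T \cdot g_T(\hat{\beta}_{IV})' \hat{W} g_T(\hat{\beta}_{IV}) = 0
```

The statistic is zero whatever the data look like, so it carries no information about instrument validity. Only when m > k can the sample moments fail to vanish, and the size of that failure is what the J-test measures.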

FAQs

What does it mean for a model to be "overidentified"?

A model is overidentified when you have more pieces of independent information (typically from instrumental variables or moment conditions) than you need to uniquely determine all the unknown parameters in your econometric model. This "extra" information allows you to perform a statistical test to check the model's consistency.

What is the purpose of testing overidentifying restrictions?

The main purpose is to test the validity of the instrumental variables you are using and, by extension, the overall model specification. If the test suggests the restrictions are violated, it means your instruments might not be truly exogenous or your model has some fundamental flaw.

Can a model be overidentified but still incorrect?

Yes. The test of overidentifying restrictions (like the Sargan-Hansen J-test) checks for consistency between the data and the model's assumptions given the chosen instruments. A high p-value suggests this consistency holds, but it doesn't guarantee the model is "correct" in every sense. For example, if all your instruments are invalid in a way that pushes the estimate in the same direction, they can still agree with one another, and the test will not catch it. Passing the test is a useful check, but it is not sufficient to establish that the model is right.

Is it always better to have more instruments to create overidentifying restrictions?

Not necessarily. While having overidentifying restrictions allows for testing instrument validity, using too many instruments, especially "weak" ones (instruments that are only weakly correlated with the endogenous variable), can lead to problems. It can bias the estimates and distort the properties of the J-test itself. The quality of instruments is often more important than the sheer quantity.