
Empirical likelihood

What Is Empirical Likelihood?

Empirical likelihood (EL) is a powerful, nonparametric method of statistical inference that offers an alternative to traditional parametric approaches. Unlike conventional methods that often rely on the assumption of a specific probability distribution for the underlying data, empirical likelihood constructs a likelihood function directly from the observed data points. This flexibility makes empirical likelihood particularly robust, especially when dealing with complex datasets or situations where strong distributional assumptions may be unwarranted or difficult to verify. It is widely used in various quantitative fields, including econometrics, biostatistics, and general nonparametric statistics.

History and Origin

The concept of empirical likelihood was primarily formalized and significantly developed by Art B. Owen in the late 1980s and early 1990s. His seminal work, including a key 1988 paper and his comprehensive 2001 book "Empirical Likelihood," laid the theoretical foundation for this innovative statistical method.39, 40 Owen's contributions bridged the gap between classical maximum likelihood estimation (MLE), which relies on a specified parametric model, and the need for more flexible, data-driven approaches. The method allows researchers to leverage the desirable properties of likelihood-based inference without being constrained by rigid parametric assumptions.38

Key Takeaways

  • Nonparametric Approach: Empirical likelihood does not require assuming a specific parametric distribution for the data, offering greater flexibility and robustness.37
  • Data-Driven Inference: It uses the observed data to construct a likelihood function and determine the shape of confidence intervals and regions, which can adapt to skewness or other non-normal characteristics of the data.35, 36
  • Asymptotic Properties: Under certain conditions, empirical likelihood ratio statistics asymptotically follow a chi-squared distribution, providing a strong theoretical basis for hypothesis testing.32, 33, 34
  • Incorporates Constraints: The method can easily incorporate side information or known moment conditions on parameters, making it versatile for various estimation problems.30, 31
  • Computational Aspects: While powerful, applying empirical likelihood often involves solving constrained optimization problems, which can be computationally intensive, especially for complex models or large datasets.28, 29

Formula and Calculation

The empirical likelihood for a parameter (\theta) is constructed by assigning probabilities (p_i) to each observation (X_i) in a sample of size (n). These probabilities are chosen to maximize their product, (\prod_{i=1}^{n} p_i), subject to two key constraints:

  1. The probabilities must sum to one, like any probability distribution:
    \sum_{i=1}^{n} p_i = 1

  2. A moment conditions constraint, typically stating that the expected value of some function (g(X_i, \theta)) is zero. This function (g(X_i, \theta)) links the observed random variables (X_i) to the parameter of interest (\theta):
    \sum_{i=1}^{n} p_i g(X_i, \theta) = 0
    Here, (g(X_i, \theta)) is a vector-valued function representing the estimating equations for the parameter (\theta).

The empirical likelihood ratio function is then formed by comparing this maximized likelihood to a baseline where all (p_i = 1/n). The value of (\theta) that maximizes the empirical likelihood function is known as the maximum empirical likelihood estimator (MELE).26, 27
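Written out, the profile empirical likelihood ratio for (\theta) compares the constrained maximum to the equal-weight baseline ((1/n)^n):

R(\theta) = \max\left\{ \prod_{i=1}^{n} n p_i \;:\; p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i\, g(X_i, \theta) = 0 \right\}

A standard Lagrange-multiplier argument shows that the maximizing weights take the form

p_i = \frac{1}{n\left(1 + \lambda^{\top} g(X_i, \theta)\right)}

where the multiplier (\lambda = \lambda(\theta)) is chosen so that the moment constraint is satisfied; solving for (\lambda) is the main computational step in practice.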

Interpreting the Empirical Likelihood

Interpreting results from empirical likelihood involves examining the shape and boundaries of the estimated confidence intervals or regions for a parameter. Unlike methods based on normal approximations, empirical likelihood confidence regions are data-driven; their shape and orientation are determined by the sample itself. This means they can naturally adapt to asymmetries or irregularities in the data, rather than imposing a symmetric structure.23, 24, 25

For instance, if you are estimating a parameter that must lie within a certain range (e.g., a probability between 0 and 1, or a variance that must be non-negative), empirical likelihood confidence regions will respect these natural boundaries. This is a significant advantage over methods that might produce confidence intervals extending into impossible values. The empirical likelihood ratio statistic is often transformed (e.g., -2 times the log-likelihood ratio) to obtain a quantity that asymptotically follows a chi-squared distribution, which is then used for hypothesis testing to determine the statistical significance of parameter estimates.21, 22
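In symbols, with (R(\theta)) as defined in the formula section and (q) denoting the number of constraints (the dimension of (g)), the empirical likelihood theorem states that, at the true parameter value (\theta_0),

-2 \log R(\theta_0) \xrightarrow{\;d\;} \chi^2_{q}

so an approximate (100(1-\alpha)\%) confidence region is the set

\left\{ \theta : -2 \log R(\theta) \le \chi^2_{q,\,1-\alpha} \right\}

where (\chi^2_{q,\,1-\alpha}) is the (1-\alpha) quantile of the chi-squared distribution with (q) degrees of freedom.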

Hypothetical Example

Imagine a portfolio manager wants to estimate the true mean return of a new investment strategy using a small sample of historical daily returns. Traditional methods might assume these returns follow a normal distribution, but the manager suspects the returns are skewed.

Here's how empirical likelihood could be applied:

  1. Data Collection: Suppose the manager collects five daily returns for the strategy: (X_1 = 0.01), (X_2 = -0.005), (X_3 = 0.02), (X_4 = -0.01), (X_5 = 0.008). These are the data points.
  2. Define Parameter: The parameter of interest is the true mean return, denoted as (\mu).
  3. Formulate Moment Condition: The natural moment condition for a mean is (g(X_i, \mu) = X_i - \mu). The constraint then becomes (\sum_{i=1}^{n} p_i (X_i - \mu) = 0), which simplifies to (\sum_{i=1}^{n} p_i X_i = \mu).
  4. Optimization: The manager would then find the probabilities (p_1, \dots, p_5) that maximize (p_1 p_2 p_3 p_4 p_5) subject to (\sum p_i = 1) and (\sum p_i X_i = \mu) for a given candidate value of (\mu).
  5. Construct Likelihood Ratio: By performing this maximization for various values of (\mu), an empirical likelihood profile for the mean is constructed. The value of (\mu) that yields the highest empirical likelihood (i.e., when (p_i = 1/n) for all (i), which means (\mu) is the sample mean) is the maximum empirical likelihood estimate.
  6. Form Confidence Interval: To find a confidence interval, the manager would identify a range of (\mu) values where the empirical likelihood ratio (typically, -2 times the log-likelihood ratio) falls below a critical value from the chi-squared distribution. This interval, unlike a standard normal-theory interval, would reflect any skewness present in the sample of returns for these random variables.

This approach allows the manager to make inferences about the true mean return without arbitrarily assuming that the daily returns are normally distributed, which is often not the case in financial markets.
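The profile construction in steps 4 through 6 is straightforward to sketch numerically. The snippet below is a minimal illustration, assuming NumPy and SciPy are available; the function name log_el_ratio and the grid-search approach are illustrative choices, not taken from any particular package. It uses the Lagrange-multiplier form of the weights from the formula section, profiles the log empirical likelihood ratio over candidate means, and inverts the chi-squared calibration to obtain a 95% interval.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

# Hypothetical daily returns from the worked example above.
returns = np.array([0.010, -0.005, 0.020, -0.010, 0.008])
n = len(returns)

def log_el_ratio(mu, x):
    """Log empirical likelihood ratio, log R(mu), for the mean,
    computed via the scalar Lagrange multiplier."""
    d = x - mu
    # The constraint sum(p_i * (x_i - mu)) = 0 is only feasible when mu
    # lies strictly inside the range of the observed data.
    if d.max() <= 0 or d.min() >= 0:
        return -np.inf
    # The multiplier must keep every weight positive: 1 + lam * d_i > 0.
    lo = (-1.0 / d.max()) * (1.0 - 1e-8)
    hi = (-1.0 / d.min()) * (1.0 - 1e-8)
    lam = brentq(lambda t: np.sum(d / (1.0 + t * d)), lo, hi)
    weights = 1.0 / (n * (1.0 + lam * d))      # the maximizing p_i
    return float(np.sum(np.log(n * weights)))  # log R(mu) = sum log(n p_i)

# Steps 4-5: profile the log EL ratio over candidate mean values.
grid = np.linspace(returns.min() + 1e-6, returns.max() - 1e-6, 2001)
log_ratio = np.array([log_el_ratio(m, returns) for m in grid])
mele = grid[np.argmax(log_ratio)]   # essentially the sample mean (0.0046 here)

# Step 6: invert the chi-squared calibration to get a 95% interval.
cutoff = chi2.ppf(0.95, df=1)       # approximately 3.84
kept = grid[-2.0 * log_ratio <= cutoff]
print(f"Maximum empirical likelihood estimate: {mele:.5f}")
print(f"Approximate 95% EL confidence interval: [{kept.min():.5f}, {kept.max():.5f}]")
```

Because the data are skewed, the resulting interval need not be symmetric around the estimate, which is exactly the behavior described in step 6.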

Practical Applications

Empirical likelihood finds extensive practical applications across various quantitative disciplines, particularly in areas where data may not conform to standard parametric assumptions. In econometrics, it is used for estimating parameters in linear regression analysis without assuming normally distributed errors. It also serves as an enhancement or alternative to the Generalized Method of Moments (GMM), often offering improved finite-sample properties.20

For instance, in financial econometrics, empirical likelihood is applied to analyze complex dynamic interactions and spillovers across multiple financial time series, such as in multivariate volatility models.19 Researchers can utilize specialized software packages, like the emplik package available for R, to perform empirical likelihood computations for various analyses, including those involving censored data in survival analysis or for testing hypotheses about means and other descriptive statistics.17, 18 The method's ability to handle complex dependence structures and high-dimensional data makes it a valuable tool for modern economic and financial modeling.15, 16

Limitations and Criticisms

Despite its numerous advantages, empirical likelihood is not without its limitations and criticisms. One of the primary drawbacks is its computational intensity. Maximizing the empirical likelihood function often requires solving complex constrained optimization problems, which can be numerically challenging and time-consuming, especially with large datasets or intricate moment conditions.13, 14 This computational burden can be more significant than for some alternative methods like the bootstrap or Wald-type tests.12

Another limitation relates to its sensitivity to sample size. While empirical likelihood possesses desirable asymptotic properties (meaning its properties hold true as the sample size approaches infinity), it may require a sufficiently large sample for those asymptotic approximations to be accurate.10, 11 In small samples, the coverage probability of empirical likelihood confidence regions can sometimes fall below the nominal level.9 Furthermore, "boundary issues" can arise where parameter estimates may approach the limits of feasible values, complicating numerical stability.8 Developing robust methods to address these issues, such as improving the precision of adjusted empirical likelihood confidence regions for smaller samples, remains an area of ongoing research.7

Empirical Likelihood vs. Parametric Likelihood

Empirical likelihood and parametric likelihood are both fundamental approaches to statistical inference, but they differ significantly in their underlying assumptions and flexibility.

| Feature | Empirical Likelihood | Parametric Likelihood |
| --- | --- | --- |
| Distributional assumption | Nonparametric; does not assume a specific parametric form for the data. Builds the likelihood directly from the data. | Parametric; requires the assumption that data follow a known distribution (e.g., normal, Poisson, binomial). |
| Flexibility | Highly flexible; adapts to the shape and characteristics of the data (e.g., skewness, heavy tails). | Less flexible; inferences are tied to the assumed distribution. Model misspecification can lead to biased results. |
| Confidence regions | Data-driven shape; respects natural boundaries of parameters. | Often symmetric (e.g., based on a normal approximation); may produce confidence intervals extending beyond feasible ranges if not handled. |
| Robustness | More robust to model misspecification since it relies on weaker assumptions. | Less robust; sensitive to incorrect distributional assumptions. |
| Computational complexity | Can be computationally intensive due to constrained optimization. | Generally less computationally demanding if the likelihood function has a closed form. |

The main point of confusion often arises because both are "likelihood-based" methods. However, the "empirical" in empirical likelihood signifies its direct reliance on the observed data's empirical distribution, rather than on a pre-specified probability distribution, which is the hallmark of parametric likelihood.

FAQs

What kind of data is empirical likelihood best suited for?

Empirical likelihood is particularly well-suited for data where the underlying probability distribution is unknown, complex, or does not conform to common parametric forms (e.g., normal distribution). This makes it valuable in fields like econometrics, where financial and economic data often exhibit skewness or heavy tails.5, 6

Can empirical likelihood handle dependent data, such as time series?

Yes, extensions of empirical likelihood methods have been developed to handle dependent data, including time series and other complex structures. While the core theory was initially developed for independent and identically distributed (i.i.d.) observations, researchers have adapted the framework to account for temporal dependence, making it applicable to a broader range of financial and economic models.3, 4

How does empirical likelihood compare to bootstrapping?

Both empirical likelihood and bootstrapping are nonparametric statistics techniques that allow for statistical inference without strong distributional assumptions. However, they differ in their approach. Bootstrapping typically involves resampling the original data to create many simulated datasets, then calculating statistics from these resamples. Empirical likelihood, on the other hand, constructs a likelihood function directly from the data by assigning probabilities to each observation under specific moment conditions. Empirical likelihood confidence regions have data-determined shapes and automatically respect parameter range constraints, which can be an advantage over some bootstrap methods.1, 2
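As a concrete point of contrast, here is a minimal percentile-bootstrap sketch for the same hypothetical returns used in the example above (assuming NumPy); with only five observations it is purely illustrative. Where empirical likelihood reweights the original data points subject to a moment constraint, the bootstrap resamples them with replacement and reads the interval off the resampling distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = np.array([0.010, -0.005, 0.020, -0.010, 0.008])  # same toy returns as above

# Percentile bootstrap: resample the observed data with replacement many times,
# recompute the statistic each time, and take percentiles of the resampled means.
boot_means = np.array([
    rng.choice(returns, size=returns.size, replace=True).mean()
    for _ in range(10_000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"Approximate 95% percentile-bootstrap CI for the mean: [{lo:.5f}, {hi:.5f}]")
```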