Credible interval

What Is a Credible Interval?

A credible interval in statistical inference is a range of values that, within the framework of Bayesian statistics, is believed to contain the true value of an unobserved parameter with a specified probability. Unlike a traditional confidence interval, a credible interval directly quantifies the probability that the true parameter lies within the given range, considering both the observed data and any prior beliefs about the parameter's distribution. This makes the credible interval a more intuitive measure of uncertainty for many practitioners, as it represents a direct probability statement about the parameter itself. Its calculation relies on the posterior distribution, which is the updated probability distribution of the parameter after incorporating new data.

History and Origin

The concept of the credible interval is intrinsically linked to the development of Bayesian inference, a statistical framework that has roots in the work of Thomas Bayes from the 18th century. While Bayes' Theorem provided the mathematical foundation for updating beliefs based on evidence, the widespread adoption and practical application of Bayesian methods, including the use of credible intervals, lagged for centuries because posterior distributions were hard to compute. In the second half of the 20th century, advances in computing power and the development of Markov Chain Monte Carlo (MCMC) methods made it practical to compute complex posterior distributions, and with them credible intervals. This allowed statisticians and researchers to incorporate prior distributions into their analyses and to interpret directly the probability that an unknown parameter falls within a specific range, a key advantage over frequentist approaches. The evolution of Bayesian inference has thus paved the way for the prominence of the credible interval in modern statistical data analysis.¹⁰

Key Takeaways

A credible interval directly states the probability that a parameter's true value falls within a given range, based on observed data and prior beliefs.
It is a fundamental concept in Bayesian statistics, derived from the posterior distribution of a parameter.
Credible intervals offer a more intuitive interpretation than frequentist confidence intervals by providing a direct probability statement about the parameter.
The choice of prior distribution can influence the credible interval, necessitating careful consideration in its application.
Credible intervals are widely used in fields requiring parameter estimation under uncertainty, including various aspects of finance.

Formula and Calculation

The credible interval is not represented by a single, universal algebraic formula in the way some statistical measures are. Instead, its calculation is based on the posterior distribution of a parameter. If $\theta$ represents the parameter of interest and $p(\theta \mid \text{data})$ is its posterior probability density function (PDF), a $(1-\alpha) \times 100\%$ credible interval can be defined in a few ways:

Equal-Tailed Interval (ETI): This is the most common form, where the interval is defined by percentiles of the posterior distribution, such that $\alpha/2$ of the probability mass lies below the lower bound and $\alpha/2$ lies above the upper bound. For a 95% credible interval, this would be the range from the 2.5th percentile to the 97.5th percentile of the posterior distribution.
$P(L \le \theta \le U \mid \text{data}) = 1 - \alpha$
where $L$ and $U$ are the lower and upper bounds, respectively, chosen such that:
$P(\theta \le L \mid \text{data}) = \alpha/2$
$P(\theta \ge U \mid \text{data}) = \alpha/2$
Highest Posterior Density (HPD) Interval: This interval contains the specified percentage of the posterior probability mass, and every point inside it has a posterior density at least as high as any point outside it. For a unimodal posterior, the HPD interval is the shortest interval with the specified probability, so it is never wider than the equal-tailed interval, and the two coincide when the posterior is also symmetric.
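
As a worked illustration, consider one case where the bounds do have a closed form. The model here is an assumption chosen for convenience and is not taken from the discussion above: observations $x_1, \dots, x_n$ are normal with unknown mean $\theta$ and known variance $\sigma^2$, and the prior is $\theta \sim \mathcal{N}(\mu_0, \tau_0^2)$. The symbols $\mu_0$, $\tau_0$, $\mu_n$, $\tau_n$, and $\bar{x}$ (the sample mean) are introduced only for this sketch; $z_{1-\alpha/2}$ is the standard normal quantile.

```latex
\begin{aligned}
\theta \mid \text{data} &\sim \mathcal{N}(\mu_n, \tau_n^2), \\
\tau_n^2 &= \left( \frac{1}{\tau_0^2} + \frac{n}{\sigma^2} \right)^{-1}, \\
\mu_n &= \tau_n^2 \left( \frac{\mu_0}{\tau_0^2} + \frac{n \bar{x}}{\sigma^2} \right), \\
[L, U] &= \left[ \mu_n - z_{1-\alpha/2}\, \tau_n,\; \mu_n + z_{1-\alpha/2}\, \tau_n \right].
\end{aligned}
```

Because this posterior is symmetric and unimodal, the equal-tailed and HPD intervals are the same interval in this particular case.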

In practice, particularly for complex models, the posterior distribution is often derived computationally through methods like Markov Chain Monte Carlo (MCMC) simulations. The credible interval is then constructed by taking the relevant quantiles or identifying the highest density region from the simulated samples of the posterior distribution. This process inherently incorporates the likelihood function and the prior distribution that shaped the posterior.
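
A minimal sketch of both constructions from posterior samples follows. It uses only the Python standard library, and the function names `equal_tailed_interval` and `hpd_interval` are hypothetical helpers written for this illustration. Since no MCMC output is available here, draws from a right-skewed Gamma(2, 1) distribution stand in for the simulated posterior; that choice is an assumption made purely to show how the two intervals can differ.

```python
import random


def equal_tailed_interval(samples, prob=0.95):
    """Interval between the alpha/2 and 1 - alpha/2 empirical quantiles."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    tail = (1.0 - prob) / 2.0
    return ordered[int(tail * last)], ordered[round((1.0 - tail) * last)]


def hpd_interval(samples, prob=0.95):
    """Shortest interval containing a fraction `prob` of the samples.

    This matches the HPD interval only when the posterior is unimodal.
    """
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, round(prob * n))  # number of samples kept inside
    # Slide a window of that many sorted samples and keep the narrowest one.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]


# Stand-in for MCMC output (an assumption): skewed Gamma(2, 1) draws.
rng = random.Random(0)
posterior_samples = [rng.gammavariate(2.0, 1.0) for _ in range(20000)]

eti = equal_tailed_interval(posterior_samples)
hpd = hpd_interval(posterior_samples)
print("95% ETI:", eti)
print("95% HPD:", hpd)
```

For a skewed posterior like this one, the shortest-window interval shifts toward the mode and is narrower than the equal-tailed interval, while both hold roughly 95% of the samples.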

Interpreting the Credible Interval

Interpreting a credible interval is generally straightforward and aligns with an intuitive understanding of probability. A 95% credible interval for a parameter indicates that, given the observed data and the specified prior beliefs, there is a 95% probability that the true value of the parameter falls within that interval. This differs significantly from the interpretation of a confidence interval, which refers to the long-run frequency of intervals containing the true parameter over many hypothetical repetitions of an experiment.

For example, if a 95% credible interval for the expected return of a stock is [5%, 10%], it means there is a 95% probability that the actual expected return lies between 5% and 10%, given the data used and the prior assumptions made. This direct probabilistic statement is highly valuable for decision making because it quantifies the uncertainty about the parameter's true value directly, rather than relying on a frequentist interpretation of repeated experiments.

Hypothetical Example

Imagine a quantitative analyst wants to estimate the volatility of a specific stock, which is currently unknown. Instead of relying solely on historical data (as a frequentist approach might), the analyst decides to use a Bayesian approach, incorporating her prior belief about the stock's volatility.

Scenario: The analyst believes, based on industry averages and historical context, that the stock's annualized volatility is likely around 20%, but it could reasonably range between 15% and 25%. She collects a month of daily stock price data.

Steps:

Define Prior: The analyst sets up a prior distribution for the volatility, centered around 20%, reflecting her initial beliefs.
Collect Data: She gathers the stock's daily returns for the past month.
Calculate Likelihood: Using the collected data, she calculates the likelihood of observing that data for different possible volatility values.
Derive Posterior: She combines her prior belief with the data's likelihood using Bayes' Theorem to obtain the posterior distribution of the stock's volatility. This distribution represents her updated belief, combining initial expectations with new evidence.
Construct Credible Interval: From this posterior distribution, she calculates a 90% credible interval. Let's say the calculated 90% credible interval for the stock's annualized volatility is [18.5%, 23.0%].

Interpretation: The analyst can now state that, based on her prior knowledge and the month of observed daily data, there is a 90% probability that the true annualized volatility of the stock lies between 18.5% and 23.0%. This probabilistic range aids in risk management and portfolio allocation decisions.
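
The steps in this example can be sketched with a simple grid approximation. Everything specific in the code is an assumption made for illustration: the normal prior with mean 20% and standard deviation 3% (which puts roughly 90% of its mass between 15% and 25%), the zero-mean normal model for daily returns, the 252-day annualization, and the simulated 21 daily returns that stand in for real price data. The resulting interval will therefore not reproduce the [18.5%, 23.0%] figures quoted above.

```python
import math
import random

TRADING_DAYS = 252  # assumed annualization convention


def log_prior(sigma, mean=0.20, sd=0.03):
    """Log density (up to a constant) of a normal prior on annualized volatility."""
    return -0.5 * ((sigma - mean) / sd) ** 2


def log_likelihood(sigma, daily_returns):
    """Log likelihood of zero-mean normal daily returns given annualized volatility."""
    daily_sd = sigma / math.sqrt(TRADING_DAYS)
    sum_sq = sum(r * r for r in daily_returns)
    return -len(daily_returns) * math.log(daily_sd) - sum_sq / (2 * daily_sd ** 2)


def credible_interval_on_grid(daily_returns, prob=0.90, grid_size=2000):
    """Equal-tailed credible interval from a posterior evaluated on a grid."""
    low, high = 0.05, 0.60  # range of annualized volatilities considered
    grid = [low + i * (high - low) / (grid_size - 1) for i in range(grid_size)]
    # Bayes' theorem on the log scale: posterior is prior times likelihood.
    log_post = [log_prior(s) + log_likelihood(s, daily_returns) for s in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    tail = (1 - prob) / 2
    cumulative, lower, upper = 0.0, None, None
    for s, w in zip(grid, weights):
        cumulative += w / total
        if lower is None and cumulative >= tail:
            lower = s
        if cumulative >= 1 - tail:
            upper = s
            break
    return lower, upper


# Simulated month of daily returns (hypothetical data, true volatility 22%).
rng = random.Random(42)
returns = [rng.gauss(0.0, 0.22 / math.sqrt(TRADING_DAYS)) for _ in range(21)]

lower, upper = credible_interval_on_grid(returns)
print(f"90% credible interval for annualized volatility: [{lower:.1%}, {upper:.1%}]")
```

Calling `credible_interval_on_grid([])` with no data returns the prior's own 90% interval, which is a quick way to check that the prior encodes the intended 15% to 25% range.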

Practical Applications

Credible intervals find diverse applications across quantitative finance and beyond, especially in scenarios where incorporating prior knowledge and directly quantifying parameter uncertainty is beneficial.

Financial Modeling and Forecasting: In areas like asset pricing and time series analysis, credible intervals can be used to estimate parameters such as expected returns, volatilities, and correlations. This allows financial professionals to express the uncertainty around their forecasts directly, rather than relying on point estimates. For instance, a credible interval for a future stock return provides a probabilistic range, aiding portfolio managers in making more informed decisions.⁹
Risk Management: Quantifying market risk, credit risk, or operational risk often involves estimating parameters of underlying distributions. Credible intervals provide a direct measure of the range of potential losses or exposures with a given probability, enhancing risk management frameworks, including methods like Value-at-Risk (VaR) or Conditional Value-at-Risk (CVaR).⁸
Portfolio Optimization: When constructing portfolios, investors must estimate parameters like expected returns and covariance matrices. Using credible intervals allows for portfolio optimization under uncertainty by considering the full range of plausible parameter values, leading to more robust portfolio allocations.
Economic Research and Model Selection: Researchers frequently use Bayesian methods to test economic theories and select among competing models. Credible intervals help in assessing the range of plausible values for economic parameters, providing a richer understanding than single point estimates. Bayesian hypothesis testing also draws directly on the posterior distribution, which is the basis of credible intervals.
Regulatory Capital Calculation: Financial institutions may use advanced statistical models to calculate regulatory capital requirements. The explicit quantification of uncertainty provided by credible intervals can offer a more comprehensive view of potential shortfalls than point estimates alone.

Limitations and Criticisms

While credible intervals offer a powerful and intuitive way to express uncertainty in statistical inference, they are not without limitations and criticisms, primarily stemming from the Bayesian framework itself.

One of the most common critiques centers on the subjectivity of the prior distribution.⁷ The choice of prior can significantly influence the resulting posterior distribution and, consequently, the credible interval. While Bayesians argue that explicitly stating one's prior beliefs is a strength, critics contend that an ill-chosen or overly informative prior can unduly bias the results, especially when data is scarce. This subjectivity can make results appear less "objective" to some traditional statisticians.⁶

Another practical limitation is computational intensity. For complex models or large datasets, deriving the posterior distribution and thus the credible interval often requires sophisticated computational methods like Markov Chain Monte Carlo (MCMC). These methods can be time-consuming and computationally demanding, requiring specialized software and expertise.⁵

Furthermore, the interpretation of "non-informative" priors can be complex. While such priors are intended to have minimal influence on the posterior, they can sometimes lead to results that are still influenced by implicit assumptions or may not be truly "flat" across all relevant scales. The selection of a suitable prior requires careful consideration and, at times, sensitivity analysis to assess its impact on the credible interval.⁴
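
A sensitivity analysis of this kind can be sketched by recomputing the interval under several priors. The example below assumes a conjugate normal model with known data standard deviation, chosen only because its posterior has a closed form; the function name `posterior_interval`, the three priors, and the summary statistics are all hypothetical values for illustration.

```python
from statistics import NormalDist


def posterior_interval(data_mean, n, data_sd, prior_mean, prior_sd, prob=0.95):
    """Equal-tailed credible interval for a normal mean with known data_sd."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / data_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    posterior = NormalDist(post_mean, post_var ** 0.5)
    tail = (1 - prob) / 2
    return posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail)


# Hypothetical priors given as (mean, standard deviation).
priors = {
    "tight prior near 0": (0.0, 0.5),
    "tight prior near 5": (5.0, 0.5),
    "diffuse prior": (0.0, 100.0),
}

for n in (5, 5000):
    print(f"sample size n = {n}")
    for label, (mean, sd) in priors.items():
        lo, hi = posterior_interval(data_mean=2.0, n=n, data_sd=3.0,
                                    prior_mean=mean, prior_sd=sd)
        print(f"  {label}: [{lo:.3f}, {hi:.3f}]")
```

With only five observations the two tight priors produce very different intervals, whereas with thousands of observations the intervals nearly agree, which is the pattern a sensitivity analysis is meant to reveal.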

Despite these criticisms, proponents argue that the transparency required in defining priors, coupled with the direct probabilistic interpretation of credible intervals, outweighs the drawbacks, especially in contexts where incorporating existing knowledge is valuable for robust decision making.

Credible Interval vs. Confidence Interval

The terms credible interval and confidence interval are often confused due to their similar names and apparent purpose of providing a range for an unknown parameter. However, they stem from fundamentally different statistical philosophies—Bayesian and frequentist, respectively—leading to distinct interpretations.

Feature	Credible Interval	Confidence Interval
Statistical Basis	Bayesian statistics. Parameters are treated as random variables with probability distributions.	Frequentist statistics. Parameters are fixed but unknown constants. The interval itself is considered random.
Interpretation	A 95% credible interval means there is a 95% probability that the true parameter value lies within this specific interval, given the observed data and prior beliefs. It is a direct statement about the parameter. ³	A 95% confidence interval means that if the experiment were repeated many times, 95% of the calculated intervals would contain the true parameter value. It is a statement about the procedure's long-run performance, not a direct probability about the parameter for a single calculated interval. ²
Prior Information	Explicitly incorporates prior beliefs (prior distribution) about the parameter, which are combined with the data's likelihood to form the posterior distribution.	Does not incorporate prior beliefs. It relies on the observed data and the assumed sampling model.
Ease of Interpretation	Often more intuitive for non-statisticians, and useful when incorporating existing knowledge or expert opinion. ¹	Can be less intuitive to interpret, and is often misinterpreted as a direct probability statement about the parameter.

The core distinction lies in what is considered "random." For a credible interval, the parameter is a random variable, and the interval's bounds are fixed given the posterior distribution. For a confidence interval, the parameter is fixed, and the interval's bounds are random, varying with each hypothetical sample.
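
The two notions of randomness can be illustrated with a small simulation. This is a sketch under stated assumptions: normal data with known standard deviation, a conjugate normal prior, and hypothetical helper names (`confidence_interval`, `coverage`, `credible_interval`). The first part repeats an experiment many times with a fixed true mean and counts how often the confidence interval covers it. The second part takes a single dataset and measures the posterior probability inside its credible interval.

```python
import random
from statistics import NormalDist, fmean

Z95 = NormalDist().inv_cdf(0.975)  # standard normal quantile for 95% intervals


def confidence_interval(sample, known_sd):
    """95% confidence interval for a normal mean with known standard deviation."""
    centre = fmean(sample)
    half_width = Z95 * known_sd / len(sample) ** 0.5
    return centre - half_width, centre + half_width


def coverage(true_mean, known_sd, n, repetitions, seed=0):
    """Fraction of repeated experiments whose interval covers the fixed true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, known_sd) for _ in range(n)]
        lo, hi = confidence_interval(sample, known_sd)
        hits += lo <= true_mean <= hi
    return hits / repetitions


def credible_interval(sample, known_sd, prior_mean, prior_sd):
    """95% equal-tailed interval and posterior from a conjugate normal model."""
    n = len(sample)
    post_var = 1.0 / (1.0 / prior_sd ** 2 + n / known_sd ** 2)
    post_mean = post_var * (prior_mean / prior_sd ** 2 + sum(sample) / known_sd ** 2)
    posterior = NormalDist(post_mean, post_var ** 0.5)
    return posterior.inv_cdf(0.025), posterior.inv_cdf(0.975), posterior


# Frequentist view: the parameter is fixed and the interval varies by sample.
print("long-run coverage:", coverage(true_mean=1.0, known_sd=2.0, n=20, repetitions=4000))

# Bayesian view: one observed dataset, a fixed interval, a distribution over the parameter.
observed = [0.5, 1.5, 2.0, 1.0]
lo, hi, posterior = credible_interval(observed, known_sd=2.0, prior_mean=0.0, prior_sd=1.0)
print("credible interval:", (lo, hi))
print("posterior mass inside:", posterior.cdf(hi) - posterior.cdf(lo))
```

The coverage figure is a property of the procedure across many samples, while the posterior mass is a property of one interval given one dataset and one prior.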

FAQs

What does a 95% credible interval mean?

A 95% credible interval means that, based on your observed data and your initial beliefs about the parameter (your prior), there is a 95% probability that the true value of the parameter falls within that specific range. It’s a direct statement of probability about the parameter itself.

How is a credible interval different from a confidence interval?

The main difference lies in their interpretation. A credible interval states the probability that the parameter's true value is within the interval. A confidence interval indicates that if you were to repeat your study many times, a certain percentage of the intervals calculated from those studies would contain the true parameter value. Credible intervals also integrate prior knowledge, while confidence intervals do not.

Can a credible interval be used for any type of data?

Yes, credible intervals can be applied to estimate parameters for various types of data and models, from simple averages to complex financial models. Their versatility comes from their foundation in Bayesian statistics, which provides a coherent framework for updating beliefs with new evidence across diverse data structures.

Why would I choose a credible interval over a confidence interval?

You might choose a credible interval if you want a direct probabilistic statement about the parameter of interest or if you have meaningful prior information that you wish to incorporate into your analysis. Its interpretation is often seen as more intuitive, especially for decision making in fields like finance and risk management, where quantifying uncertainty precisely is paramount.