Difference in differences

What Is Difference in Differences?

Difference in differences (DiD) is a statistical technique employed within econometrics and quantitative research to estimate the causal effect of an intervention or policy. It achieves this by comparing the changes in outcomes over time between a population that receives a "treatment" (the treatment group) and a similar population that does not (the control group). This method is particularly useful for achieving causal inference in observational study settings, where random assignment to groups is not feasible. The core idea behind Difference in Differences is to isolate the impact of the intervention by accounting for pre-existing differences between groups and general trends over time that would have occurred regardless of the intervention.³⁷

History and Origin

The conceptual underpinnings of the difference in differences method can be traced back to the mid-19th century in the social sciences, where the design is often referred to as a "controlled before-and-after study."³⁶,³⁵ However, its prominence as a formal econometric technique surged in the 1990s. A seminal application that popularized DiD was the 1994 study by economists David Card and Alan Krueger. They used the technique to analyze the effect of a minimum wage increase in New Jersey on employment in the fast-food industry, comparing it to employment trends in neighboring Pennsylvania, where the minimum wage remained constant. Their findings challenged the conventional wisdom of the time regarding minimum wage impacts.³⁴,³³ This influential work cemented Difference in Differences as a powerful tool for policy evaluation and empirical economic research.³²

Key Takeaways

Difference in differences is a quasi-experimental method used to estimate causal effects in non-experimental settings.
It compares the change in an outcome over time for a treated group to the change in the same outcome for an untreated control group.
The method controls for time-invariant unobserved differences between groups and common time trends.
A critical assumption is the "parallel trends assumption," which posits that, in the absence of treatment, both groups would have followed similar outcome trajectories.
DiD is widely applied in economics, public health, and social sciences for policy evaluation.

Formula and Calculation

The basic Difference in Differences estimator calculates the "difference of differences" in average outcomes. Consider two groups (treatment and control) and two time periods (before and after intervention).

Let:

$Y_{jt}$ = Average outcome for group $j$ at time $t$
$j = 1$ for the treatment group, $j = 0$ for the control group
$t = 1$ for the post-treatment period, $t = 0$ for the pre-treatment period

The DiD estimator ($\delta$) is calculated as follows:

$$\delta = (Y_{11} - Y_{10}) - (Y_{01} - Y_{00})$$

Where:

$(Y_{11} - Y_{10})$ is the change in the outcome for the treatment group from the pre-treatment to the post-treatment period.
$(Y_{01} - Y_{00})$ is the change in the outcome for the control group from the pre-treatment to the post-treatment period.

This formula effectively subtracts out the changes in the control group from the changes in the treatment group, isolating the effect attributable to the intervention.³¹,³⁰ This approach typically uses panel data or repeated cross-sectional data.²⁹
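
To see why the subtraction isolates the intervention, here is a short sketch under an assumed additive model. The symbols $\alpha_j$ (a fixed level for group $j$) and $\lambda_t$ (a period effect shared by both groups) are introduced here for illustration only and are not part of the notation above. Suppose the group averages satisfy:

$$Y_{jt} = \alpha_j + \lambda_t + \delta \cdot \mathbf{1}[j = 1,\ t = 1]$$

Then the two changes and their difference are:

$$
\begin{aligned}
Y_{11} - Y_{10} &= (\alpha_1 + \lambda_1 + \delta) - (\alpha_1 + \lambda_0) = (\lambda_1 - \lambda_0) + \delta \\
Y_{01} - Y_{00} &= (\alpha_0 + \lambda_1) - (\alpha_0 + \lambda_0) = \lambda_1 - \lambda_0 \\
(Y_{11} - Y_{10}) - (Y_{01} - Y_{00}) &= \delta
\end{aligned}
$$

The first difference within each group removes the fixed group level $\alpha_j$, and the second difference across groups removes the common period change $\lambda_1 - \lambda_0$. The result holds only if the period effect really is common to both groups, which is the parallel trends assumption.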

In regression analysis, Difference in Differences is often implemented using a linear model with dummy variables and an interaction term:

$$Y_{it} = \beta_0 + \beta_1 \text{Treatment}_i + \beta_2 \text{Post}_t + \beta_3 (\text{Treatment}_i \times \text{Post}_t) + \epsilon_{it}$$

Where:

$Y_{it}$ is the outcome for individual $i$ at time $t$.
$\text{Treatment}_i$ is a dummy variable equal to 1 if individual $i$ is in the treatment group, 0 otherwise.
$\text{Post}_t$ is a dummy variable equal to 1 if the observation is in the post-treatment period, 0 otherwise.
$(\text{Treatment}_i \times \text{Post}_t)$ is the interaction term, and its coefficient $\beta_3$ is the Difference in Differences estimator, which, under the parallel trends assumption, estimates the causal effect of the treatment.²⁸,²⁷
$\beta_0$ represents the average outcome for the control group in the pre-treatment period.
$\beta_1$ accounts for permanent average differences between the treatment and control groups before the intervention.²⁶
$\beta_2$ captures the general time trend affecting both groups.²⁵
$\epsilon_{it}$ is the error term.
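
The regression form can be illustrated with a minimal sketch in plain Python. The eight observations below are hypothetical values invented for illustration, and the helper names `solve` and `did_ols` are assumptions of this sketch rather than functions from any library. The code fits the model by solving the ordinary least squares normal equations and returns point estimates only. In applied work one would normally use a statistics package that also reports standard errors.

```python
# Hypothetical micro-data (assumed values): (outcome, Treatment, Post).
rows = [
    (98.0, 1, 0), (102.0, 1, 0),   # treated, pre-period  (mean 100)
    (123.0, 1, 1), (127.0, 1, 1),  # treated, post-period (mean 125)
    (88.0, 0, 0), (92.0, 0, 0),    # control, pre-period  (mean 90)
    (104.0, 0, 1), (106.0, 0, 1),  # control, post-period (mean 105)
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def did_ols(data):
    """OLS of Y on [1, Treatment, Post, Treatment*Post] via the normal equations."""
    X = [[1.0, float(d), float(p), float(d * p)] for _, d, p in data]
    y = [obs[0] for obs in data]
    k = 4
    xtx = [[sum(xi[a] * xi[b] for xi in X) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(k)]
    return solve(xtx, xty)


beta0, beta1, beta2, beta3 = did_ols(rows)
print(round(beta0, 6), round(beta1, 6), round(beta2, 6), round(beta3, 6))
```

With these assumed data, the fitted coefficients line up with the group averages: $\beta_0$ is the control group's pre-period mean, $\beta_1$ the pre-period gap between groups, $\beta_2$ the control group's change, and $\beta_3$ the difference of the two changes.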

Interpreting the Difference in Differences

Interpreting the Difference in Differences estimate involves understanding that the calculated value represents the average impact of the intervention or "treatment" on the outcome variable, over and above any changes that would have occurred naturally or due to pre-existing differences. For instance, if a DiD analysis of a new training program for financial advisors yields a positive and statistically significant coefficient, it suggests that the program led to an increase in advisor productivity relative to what would have been observed without the program, and relative to the general trend observed in a similar, untreated group.²⁴

The validity of this interpretation hinges critically on the "parallel trends assumption." This assumption dictates that, in the absence of the intervention, the outcome trends for both the treatment and control groups would have moved in parallel.²³,²² While this assumption cannot be directly tested since the counterfactual (what would have happened without the treatment) is unobservable, researchers often examine pre-treatment trends visually and statistically to assess its plausibility. If pre-treatment trends are not parallel, the DiD estimate may be biased, and the interpretation of the causal effect becomes problematic.²¹ Robustness checks, such as examining shorter pre-treatment periods or incorporating additional covariates, can help bolster confidence in the interpretation of the Difference in Differences results.²⁰
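
One simple way to examine pre-treatment trends is a "placebo" comparison: compute the difference in differences between adjacent periods before the intervention, where the true effect must be zero. The sketch below uses hypothetical pre-period group averages (assumed values), and the function name `pre_trend_gaps` is an invention of this example. Gaps close to zero are consistent with parallel pre-trends, though they cannot prove that trends would have stayed parallel afterward.

```python
# Hypothetical pre-treatment group averages by period (assumed values).
treated_pre = [94.0, 96.0, 98.0, 100.0]
control_pre = [84.0, 86.0, 88.0, 90.0]


def pre_trend_gaps(treated, control):
    """Return the placebo DiD for each pair of adjacent pre-treatment periods."""
    gaps = []
    for t in range(1, len(treated)):
        treated_change = treated[t] - treated[t - 1]
        control_change = control[t] - control[t - 1]
        gaps.append(treated_change - control_change)
    return gaps


gaps = pre_trend_gaps(treated_pre, control_pre)
print(gaps)  # [0.0, 0.0, 0.0]
```

Here both groups rise by the same amount each period, so every placebo gap is zero. Gaps that are consistently positive or negative would suggest diverging pre-trends and a potentially biased estimate.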

Hypothetical Example

Imagine a regional bank, Diversified Savings, wants to assess the effectiveness of a new digital financial literacy program on increasing the average savings rate among its checking account holders. The bank rolls out the program in one specific state (the treatment group) but not in a neighboring, similar state (the control group). It collects repeated cross-sectional data on the average monthly savings rate per account holder in both states for the six months before the program launch and the six months after it.

Here's the hypothetical data:

Group	Pre-Program (Average Monthly Savings Rate)	Post-Program (Average Monthly Savings Rate)	Change (Post - Pre)
Treatment State	$100	$125	$25
Control State	$90	$105	$15

To calculate the Difference in Differences:

Calculate the change for the Treatment Group:
$125 - 100 = 25$
Calculate the change for the Control Group:
$105 - 90 = 15$
Calculate the Difference in Differences:
DiD = (Change in Treatment Group) - (Change in Control Group)
DiD = $25 - $15 = $10

In this hypothetical example, the Difference in Differences estimate is $10. This suggests that the digital financial literacy program led to an average increase of $10 in monthly savings per account holder in the treatment state, over and above the change observed in the control state. Subtracting the control state's change nets out general economic trends over that period that might have influenced savings rates in both states, provided the two states would otherwise have followed parallel trends.
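
The arithmetic above can be sketched in a few lines of Python using the figures from the table. The function name `did_estimate` is an assumption of this sketch. The counterfactual line shows the savings level the treatment state would have reached had it simply followed the control state's change.

```python
def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Two-group, two-period difference in differences of average outcomes."""
    treat_change = treat_post - treat_pre
    control_change = control_post - control_pre
    return treat_change - control_change


# Figures from the hypothetical table above.
effect = did_estimate(treat_pre=100, treat_post=125, control_pre=90, control_post=105)

# Implied counterfactual for the treatment state: its pre-program level
# plus the change observed in the control state.
counterfactual = 100 + (105 - 90)

print(effect)          # 10
print(counterfactual)  # 115
```

The observed post-program average of $125 exceeds the implied counterfactual of $115 by $10, which matches the estimate.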

Practical Applications

Difference in Differences is a versatile technique with widespread applications across various fields, particularly in areas requiring rigorous policy evaluation and impact assessment. Its ability to mimic a natural experiment makes it invaluable when true randomization is not possible.

In finance and economics, DiD is frequently used to:

Evaluate regulatory changes: Assessing the impact of new financial regulations on market behavior, bank performance, or consumer lending. For example, a study might use DiD to analyze how a change in state-level banking laws affects loan approval rates, comparing states with the new law to those without it, both before and after implementation.
Analyze market interventions: Determining the effect of government subsidies, tax reforms, or monetary policy shifts on economic indicators like employment, investment, or inflation. The World Bank notes that DiD is a useful tool for data analysis in program