| Title: | Synthetic Control Changes-in-Changes Estimator |
|---|---|
| Description: | Implements the Changes-in-Changes (CIC) estimator of Athey and Imbens (2006) <doi:10.1111/j.1468-0262.2006.00668.x> combined with synthetic control methods. Provides nonparametric estimation of the entire counterfactual distribution of outcomes for a treated group, allowing evaluation of average, quantile, and distributional treatment effects. Synthetic control weights are constructed via elastic net regularization to handle settings with many potential control units. |
| Authors: | Neil Hwang [aut, cre] |
| Maintainer: | Neil Hwang <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.0 |
| Built: | 2026-05-10 06:24:30 UTC |
| Source: | https://github.com/neilhwang/sccic |
Checks for common data problems and emits informative warnings. Called internally by cic when the input is potentially problematic, and also available for manual use.
check_data(y_00, y_01, y_10, y_11)
| y_00, y_01, y_10, y_11 | Numeric vectors of outcomes. |
|---|---|
Invisible TRUE. Produces warnings for potential issues.
Warns if the treated unit's pre-treatment outcomes fall outside the range of the synthetic control's pre-treatment outcomes, which can cause the CIC distributional transport to extrapolate.
check_support(x)
| x | An object of class |
|---|---|
Logical. TRUE if support condition is satisfied.
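The range check described above can be sketched in base R. This is only an illustration, assuming the treated and synthetic pre-treatment series are available as plain numeric vectors; the helper name `support_ok` and its arguments are hypothetical and are not part of the package.

```r
# Hypothetical helper: TRUE when every treated pre-treatment outcome lies
# inside the range of the synthetic control's pre-treatment outcomes.
support_ok <- function(treated_pre, synth_pre) {
  rng <- range(synth_pre)
  inside <- treated_pre >= rng[1] & treated_pre <= rng[2]
  if (!all(inside)) {
    warning(sum(!inside), " treated pre-treatment value(s) fall outside the ",
            "synthetic control's range; the CIC transport would extrapolate.")
  }
  all(inside)
}

# All treated values lie within [1, 5], so no warning is raised.
ok <- support_ok(treated_pre = c(2, 3, 4), synth_pre = c(1, 2.5, 5))
```

Values outside the range matter because the empirical CDF of the comparison series is flat (0 or 1) there, so the transformed outcomes pile up at the minimum or maximum.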
Implements the Changes-in-Changes (CIC) estimator of Athey and Imbens (2006) for the average treatment effect on the treated in a two-group, two-period difference-in-differences setting.
cic( y_00, y_01, y_10, y_11, se = TRUE, boot = FALSE, boot_iters = 500L, seed = NULL )
| y_00 | Numeric vector. Outcomes for the control group in the pre-treatment period. |
|---|---|
| y_01 | Numeric vector. Outcomes for the control group in the post-treatment period. |
| y_10 | Numeric vector. Outcomes for the treated group in the pre-treatment period. |
| y_11 | Numeric vector. Outcomes for the treated group in the post-treatment period. |
| se | Logical. If TRUE, compute the analytic standard error. Default TRUE. |
| boot | Logical. If TRUE, compute a bootstrap standard error. Default FALSE. |
| boot_iters | Integer. Number of bootstrap iterations. Default 500. |
| seed | Integer or NULL. Random seed. |
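The CIC estimator constructs a counterfactual distribution for the treated group in the post-treatment period by applying the transformation (in the notation of Athey and Imbens, 2006):
$$F_{Y^N_{11}}(y) = F_{10}\left(F_{00}^{-1}\left(F_{01}(y)\right)\right),$$
where $F_{gt}$ is the outcome distribution function for group $g$ (0 = control, 1 = treated) in period $t$ (0 = pre-treatment, 1 = post-treatment).
The average treatment effect is then:
$$\tau^{CIC} = E[Y_{11}] - E\left[F_{01}^{-1}\left(F_{00}(Y_{10})\right)\right].$$
The analytic variance follows Theorem 5.1 of Athey and Imbens (2006).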
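A minimal base-R sketch of the point estimate may help make the transformation concrete. It is not the package's implementation: the function name `cic_sketch` is hypothetical, the inverse CDF is taken to be `quantile(..., type = 1)`, and no standard errors are computed.

```r
# Hypothetical sketch of the CIC point estimate using empirical CDFs.
cic_sketch <- function(y_00, y_01, y_10, y_11) {
  F_00 <- ecdf(y_00)                      # control group, pre-treatment CDF
  # Rank of each treated pre-treatment outcome in the control pre-treatment
  # distribution.
  u <- F_00(y_10)
  # Send those ranks through the control post-treatment quantile function;
  # type = 1 is the inverse of the empirical CDF.
  y_cf <- quantile(y_01, probs = u, type = 1, names = FALSE)
  c(tau_cic = mean(y_11) - mean(y_cf),
    tau_did = (mean(y_11) - mean(y_10)) - (mean(y_01) - mean(y_00)))
}

# Toy data: control outcomes double between periods.
est <- cic_sketch(y_00 = c(1, 2, 3, 4), y_01 = c(2, 4, 6, 8),
                  y_10 = c(2, 3),       y_11 = c(5, 7))
```

The standard DID contrast is returned alongside the CIC estimate because the two coincide when the change over time is a pure location shift and can differ otherwise.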
An object of class "cic" containing:
| tau | The CIC average treatment effect estimate. |
|---|---|
| se | Analytic standard error (if se = TRUE). |
| z | z-statistic. |
| pval | Two-sided p-value. |
| counterfactual_mean | Mean of the counterfactual distribution. |
| tau_did | The standard DID estimate for comparison. |
| N | Total sample size. |
| n | Named vector of group sample sizes. |
| boot_se | Bootstrap standard error (if boot = TRUE). |
| ecdfs | List of empirical CDF objects for each group. |
Athey, S. and Imbens, G. W. (2006). Identification and Inference in Nonlinear Difference-in-Differences Models. Econometrica, 74(2), 431–497. doi:10.1111/j.1468-0262.2006.00668.x
# Workers' compensation example (Meyer, Viscusi, and Durbin 1995)
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("injury", package = "wooldridge")
  result <- cic(
    y_00 = injury$ldurat[injury$highearn == 0 & injury$afchnge == 0],
    y_01 = injury$ldurat[injury$highearn == 0 & injury$afchnge == 1],
    y_10 = injury$ldurat[injury$highearn == 1 & injury$afchnge == 0],
    y_11 = injury$ldurat[injury$highearn == 1 & injury$afchnge == 1]
  )
  print(result)
}
Re-estimates SC-CIC dropping one donor at a time to assess sensitivity to individual donors.
loo_donors(y_treated, y_donors, treatment_period, alpha = 1, seed = 42)
| y_treated | Numeric vector. Treated unit outcomes. |
|---|---|
| y_donors | Numeric matrix. Donor unit outcomes. |
| treatment_period | Integer. First treatment period index. |
| alpha | Elastic net mixing parameter. |
| seed | Integer or NULL. Random seed. |
A data frame with one row per donor, showing the SC-CIC estimate when that donor is excluded.
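The leave-one-out loop can be sketched generically in base R. Everything here is illustrative: `loo_sketch` and `gap_estimate` are hypothetical names, and the stand-in estimator (OLS weights on pre-treatment data followed by the mean post-treatment gap) is a deliberate simplification, not SC-CIC.

```r
# Stand-in estimator (an assumption, not SC-CIC): OLS weights fitted on the
# pre-treatment periods, then the mean post-treatment gap.
gap_estimate <- function(y_treated, y_donors, treatment_period) {
  pre <- seq_len(treatment_period - 1)
  fit <- lm.fit(cbind(1, y_donors[pre, , drop = FALSE]), y_treated[pre])
  synth <- drop(cbind(1, y_donors) %*% fit$coefficients)
  mean((y_treated - synth)[-pre])
}

# Re-run an estimator once per donor, each time with that donor removed.
loo_sketch <- function(y_treated, y_donors, treatment_period, estimator) {
  donor_names <- colnames(y_donors)
  est <- vapply(seq_along(donor_names), function(j) {
    estimator(y_treated, y_donors[, -j, drop = FALSE], treatment_period)
  }, numeric(1))
  data.frame(dropped = donor_names, estimate = est)
}

set.seed(1)
donors <- matrix(rnorm(20 * 4), nrow = 20,
                 dimnames = list(NULL, paste0("d", 1:4)))
y <- 0.5 * donors[, 1] + 0.5 * donors[, 2] + rnorm(20, sd = 0.1)
y[16:20] <- y[16:20] + 2   # simulated effect of 2 from period 16 onward
loo <- loo_sketch(y, donors, treatment_period = 16, estimator = gap_estimate)
```

A large swing in the estimate when one donor is removed suggests that the result leans heavily on that donor.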
Plots the empirical CDFs of the four group-period cells used in the CIC estimator, illustrating the distributional transport.
plot_distributions(x, ...)
| x | An object of class |
|---|---|
| ... | Additional arguments (currently unused). |
Invisible. Called for its side effect of producing a plot.
Produces a quantile-quantile plot comparing the pre-treatment distributions of the treated unit and the synthetic control. This assesses whether the synthetic control tracks the treated unit's distributional dynamics, not just its mean—a necessary condition for CIC validity. Points on the 45-degree line indicate identical distributions.
plot_qq(x, ...)
| x | An object of class |
|---|---|
| ... | Additional arguments passed to |
Invisible. Called for its side effect of producing a plot.
Plot Quantile Treatment Effects
plot_qte(x, probs = seq(0.05, 0.95, 0.05), ...)
| x | An object of class |
|---|---|
| probs | Numeric vector of quantiles. |
| ... | Additional arguments passed to |
Invisible. Called for its side effect of producing a plot.
Plots the treated unit against the synthetic control over time, with a vertical line at the treatment period.
## S3 method for class 'sc_cic'
plot(x, ...)
| x | An object of class "sc_cic". |
|---|---|
| ... | Additional arguments passed to |
The plot shows the treated unit (solid line) and synthetic control (dashed line) over all time periods, with a vertical dashed line marking the start of treatment. Good pre-treatment fit is a necessary (but not sufficient) condition for valid SC-CIC inference.
Invisible. Called for its side effect of producing a plot.
Estimates quantile treatment effects from a CIC fit by comparing quantiles of the actual post-treatment treated distribution with quantiles of the counterfactual distribution.
quantile_te(x, probs = seq(0.05, 0.95, 0.05))
| x | An object of class |
|---|---|
| probs | Numeric vector of quantiles at which to compute effects. Default is seq(0.05, 0.95, 0.05). |
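The quantile treatment effect at quantile $q$ is:
$$\tau_q = F_{Y_{11}}^{-1}(q) - F_{Y^N_{11}}^{-1}(q),$$
where $F_{Y_{11}}$ is the observed post-treatment distribution of the treated group and $F_{Y^N_{11}}$ is the CIC counterfactual distribution.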
A data frame with columns quantile, actual, counterfactual, and qte (quantile treatment effect).
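As a rough illustration of the quantile comparison, the following base-R sketch computes quantile differences between two samples. The helper `qte_sketch` and its argument `y_cf` (a sample from the counterfactual distribution) are hypothetical; how the package represents the counterfactual internally is not specified here.

```r
# Hypothetical helper: quantile treatment effects from two samples.
# y_11: observed post-treatment outcomes of the treated group.
# y_cf: a sample from the counterfactual distribution.
qte_sketch <- function(y_11, y_cf, probs = seq(0.05, 0.95, 0.05)) {
  actual <- quantile(y_11, probs = probs, type = 1, names = FALSE)
  counterfactual <- quantile(y_cf, probs = probs, type = 1, names = FALSE)
  data.frame(quantile = probs,
             actual = actual,
             counterfactual = counterfactual,
             qte = actual - counterfactual)
}

# Toy data: the treated distribution is the counterfactual shifted up by 1.
qte <- qte_sketch(y_11 = c(3, 5, 7, 9), y_cf = c(2, 4, 6, 8),
                  probs = c(0.25, 0.5, 0.75))
```

With a pure location shift the effect is the same at every quantile; effects that vary across quantiles indicate that the treatment changes the shape of the distribution.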
Combines synthetic control methods with the Changes-in-Changes estimator. First constructs a synthetic control unit from donor units using elastic net regularization, then applies the CIC estimator using the synthetic control as the comparison group.
sc_cic( y_treated, y_donors, treatment_period, alpha = 1, boot = TRUE, boot_iters = 500L, seed = NULL )
| y_treated | Numeric vector. Outcome for the treated unit across all time periods (pre and post). |
|---|---|
| y_donors | Numeric matrix. Outcomes for donor units, with rows as time periods (matching y_treated) and columns as donor units. |
| treatment_period | Integer. The index (row number) of the first treatment period. Periods 1 to treatment_period - 1 are pre-treatment. |
| alpha | Elastic net mixing parameter. |
| boot | Logical. Compute bootstrap standard errors. Default TRUE. |
| boot_iters | Integer. Number of bootstrap iterations. Default 500. |
| seed | Integer or NULL. Random seed. |
The procedure works in two steps:
Step 1: Synthetic Control Construction.
In the pre-treatment period, the treated unit's outcome is regressed on
the donor units' outcomes using elastic net (via cv.glmnet).
This yields a sparse set of weights that construct a synthetic control unit
as a weighted combination of donors.
Step 2: CIC Estimation. The CIC estimator is applied with the synthetic control as the "control group" and the treated unit as the "treatment group."
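Step 1 can be sketched with glmnet as follows. This is an illustration under stated assumptions, not the package's code: the function name `sc_weights_sketch` is hypothetical, the glmnet package must be installed, and the choice of `lambda.min` as the penalty is an assumption.

```r
# Hypothetical sketch of Step 1. Rows of y_donors are time periods and
# columns are donor units.
sc_weights_sketch <- function(y_treated, y_donors, treatment_period, alpha = 1) {
  pre <- seq_len(treatment_period - 1)
  fit <- glmnet::cv.glmnet(x = y_donors[pre, , drop = FALSE],
                           y = y_treated[pre], alpha = alpha)
  # Assumption: use the cross-validated lambda.min.
  w <- as.matrix(coef(fit, s = "lambda.min"))[, 1]
  # Synthetic control for all periods, pre and post.
  synth <- as.numeric(predict(fit, newx = y_donors, s = "lambda.min"))
  list(weights = w, synthetic = synth)
}

if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(42)
  donors <- matrix(rnorm(50 * 5), nrow = 50,
                   dimnames = list(NULL, paste0("donor", 1:5)))
  y_treated <- 0.6 * donors[, 1] + 0.4 * donors[, 2] + rnorm(50, sd = 0.1)
  sc <- sc_weights_sketch(y_treated, donors, treatment_period = 41)
}
```

For Step 2, the pre- and post-treatment values of the synthetic series would play the role of the control-group outcomes, and the treated unit's values the role of the treated-group outcomes.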
Inference.
Because the synthetic control is an estimated object, the analytic
asymptotic variance of Athey and Imbens (2006) does not directly apply.
Instead, sc_cic provides bootstrap standard errors that
re-estimate the elastic net weights in each bootstrap iteration,
thereby accounting for first-stage estimation uncertainty. The bootstrap
resamples time periods (with replacement) within the pre-treatment and
post-treatment windows separately, preserving the panel structure.
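The resampling scheme described above can be sketched in base R. The helper `resample_periods` is hypothetical and shows only how the time indices might be drawn; the re-estimation of weights and of the treatment effect on each draw is omitted.

```r
# Hypothetical helper: draw one bootstrap sample of time indices, resampling
# with replacement separately within the pre- and post-treatment windows.
resample_periods <- function(n_periods, treatment_period) {
  pre  <- seq_len(treatment_period - 1)
  post <- treatment_period:n_periods
  c(pre[sample.int(length(pre), replace = TRUE)],
    post[sample.int(length(post), replace = TRUE)])
}

set.seed(1)
idx <- resample_periods(n_periods = 10, treatment_period = 7)
# The first 6 entries are pre-treatment periods and the last 4 are
# post-treatment periods, so the window sizes are unchanged.
```

In each iteration the rows `idx` of the treated series and the donor matrix would be passed through both steps again with the same `treatment_period`, and the standard deviation of the resulting estimates would serve as the bootstrap standard error.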
An object of class "sc_cic" inheriting from "cic",
with components:
| tau | The SC-CIC average treatment effect estimate. |
|---|---|
| se | Bootstrap standard error (if boot = TRUE). |
| z | z-statistic (bootstrap-based). |
| pval | Two-sided p-value (bootstrap-based). |
| boot_se | Same as se. |
| tau_did | The SC-DID estimate for comparison. |
| sc_weights | Named vector of synthetic control weights (including intercept). |
| sc_fitted | Synthetic control outcome across all time periods. |
| donors_selected | Names of donor units with nonzero weights. |
| pre_fit_rmse | Root mean squared error of pre-treatment fit. |
Athey, S. and Imbens, G. W. (2006). Identification and Inference in Nonlinear Difference-in-Differences Models. Econometrica, 74(2), 431–497.
Abadie, A., Diamond, A., and Hainmueller, J. (2010). Synthetic Control Methods for Comparative Case Studies. Journal of the American Statistical Association, 105(490), 493–505.
# Basque Country example
if (requireNamespace("Synth", quietly = TRUE)) {
  data("basque", package = "Synth")
  gdp <- reshape(basque[, c("regionno", "year", "gdpcap")],
                 idvar = "year", timevar = "regionno", direction = "wide")
  y_treated <- gdp[, "gdpcap.17"]
  donors <- as.matrix(gdp[, grep("gdpcap\\.", names(gdp))])
  donors <- donors[, !colnames(donors) %in% c("gdpcap.17", "gdpcap.1")]
  valid <- complete.cases(y_treated, donors)
  result <- sc_cic(y_treated[valid], donors[valid, ],
                   treatment_period = 16, seed = 42)
  print(result)
}
Returns a data frame of donor weights from an SC-CIC fit, sorted by absolute weight. Useful for inspecting which donors contribute to the synthetic control.
sc_weights(x, nonzero_only = TRUE)
| x | An object of class "sc_cic". |
|---|---|
| nonzero_only | Logical. If TRUE, return only donors with nonzero weights. Default TRUE. |
A data frame with columns donor and weight.
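The described output can be mimicked from a named weight vector in base R. The helper `weights_table` is hypothetical, and dropping an element named "(Intercept)" is an assumption about how the intercept is labelled.

```r
# Hypothetical helper: turn a named weight vector into a donor/weight table
# sorted by absolute weight, largest first.
weights_table <- function(w, nonzero_only = TRUE) {
  w <- w[names(w) != "(Intercept)"]   # assumption: intercept carries this name
  if (nonzero_only) w <- w[w != 0]
  out <- data.frame(donor = names(w), weight = unname(w),
                    stringsAsFactors = FALSE)
  out <- out[order(abs(out$weight), decreasing = TRUE), ]
  rownames(out) <- NULL
  out
}

w <- c("(Intercept)" = 0.3, A = 0.1, B = 0, C = -0.6, D = 0.25)
tab <- weights_table(w)
```

Sorting on the absolute value keeps large negative weights, which elastic net permits, near the top of the table.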
Re-estimates the SC-CIC treatment effect over a grid of elastic net mixing parameters (alpha, where 0 is ridge and 1 is lasso), showing sensitivity to the regularization choice.
sensitivity_alpha( y_treated, y_donors, treatment_period, alphas = seq(0, 1, 0.2), seed = 42 )
| y_treated | Numeric vector. Treated unit outcomes. |
|---|---|
| y_donors | Numeric matrix. Donor unit outcomes. |
| treatment_period | Integer. First treatment period index. |
| alphas | Numeric vector. Grid of alpha values to evaluate. Default is seq(0, 1, 0.2). |
| seed | Integer or NULL. Random seed. |
A data frame with columns alpha, tau_cic, tau_did, n_donors, and pre_rmse.
Generates data under controlled DGPs and evaluates SC-CIC performance.
simulate_sccic( n_sims = 500, T_pre = 25, T_post = 15, J = 15, tau_true = 1, dgp = c("linear", "nonlinear", "sc_good", "sc_bad"), alpha = 1, boot_iters = 200, seed = 42, verbose = TRUE )
| n_sims | Integer. Number of simulation replications. |
|---|---|
| T_pre | Integer. Number of pre-treatment periods. |
| T_post | Integer. Number of post-treatment periods. |
| J | Integer. Number of donor units. |
| tau_true | Numeric. True average treatment effect. |
| dgp | Character. Data generating process. See Details. |
| alpha | Elastic net mixing parameter for SC construction. |
| boot_iters | Integer. Bootstrap iterations per simulation. |
| seed | Integer. Random seed. |
| verbose | Logical. Print progress. |
Four DGPs are available, designed to test different aspects of SC-CIC:
DGP 1: "linear" — Baseline.
Outcomes are linear in a common factor and unit-specific loadings.
DID is correctly specified. CIC matches DID. SC fits well.
Purpose: verify the method works in the easy case.
DGP 2: "nonlinear" — CIC advantage.
Cross-sectional DGP (not SC). N observations per cell.
Control and treated have different distributions of unobservables.
The production function is nonlinear and changes over time.
DID is biased due to the nonlinear distributional shift; CIC is correct.
Purpose: demonstrate the advantage of CIC over DID.
Note: this tests cic(), not sc_cic().
DGP 3: "sc_good" — SC with good distributional fit.
The treated unit is a true sparse combination of donors plus noise.
SC recovers the weights well; the distributional dynamics are similar.
Purpose: show SC-CIC works when SC fit is good.
DGP 4: "sc_bad" — SC with mean-only fit.
The SC matches the treated mean, but donors have much lower variance
than the treated unit. The distributional transport is wrong.
Purpose: show SC-CIC fails when distributional assumptions are violated.
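To give a feel for what a factor-model panel of this kind looks like, here is a hypothetical generator in the spirit of the "linear" DGP. The function name, the random-walk factor, the loading range, and the noise level are all assumptions for illustration; the package's actual DGPs may be parameterized differently.

```r
# Hypothetical generator: outcomes are a common factor times a unit-specific
# loading plus noise, with a constant effect added to the treated unit after
# the pre-treatment window.
simulate_linear_panel <- function(T_pre = 25, T_post = 15, J = 15, tau_true = 1) {
  n_periods <- T_pre + T_post
  f <- cumsum(rnorm(n_periods))           # common factor (random walk)
  loadings <- runif(J + 1, 0.5, 1.5)      # last loading is the treated unit's
  noise <- matrix(rnorm(n_periods * (J + 1), sd = 0.5), nrow = n_periods)
  y <- outer(f, loadings) + noise
  y_treated <- y[, J + 1]
  post <- (T_pre + 1):n_periods
  y_treated[post] <- y_treated[post] + tau_true
  list(y_treated = y_treated,
       y_donors = y[, seq_len(J), drop = FALSE],
       treatment_period = T_pre + 1)
}

set.seed(42)
sim <- simulate_linear_panel()
```

The returned list matches the argument layout of sc_cic (treated series, donor matrix with periods in rows, first treatment period index).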
A data frame with simulation results.
## Not run:
r <- simulate_sccic(n_sims = 200, dgp = "nonlinear", tau_true = 1)
summarize_simulation(r, tau_true = 1)
## End(Not run)
Summarize simulation results
summarize_simulation(results, tau_true)
| results | Data frame from simulate_sccic(). |
|---|---|
| tau_true | True treatment effect. |
Prints summary statistics and returns them invisibly.