pycausalarima

A Python implementation of the C-ARIMA methodology for estimating causal effects of interventions on time series data.

Disclaimer: This package is provided for educational and didactic purposes only. It comes with no warranties of any kind, express or implied. Use at your own risk. This is not production-grade software and should not be used for critical applications without thorough independent validation.

Overview

pycausalarima estimates the causal effect of an intervention on a univariate time series using ARIMA models. It implements the methodology described in:

Menchetti, F., Cipollini, F., & Mealli, F. (2023). "Combining counterfactual outcomes and ARIMA models for policy evaluation." The Econometrics Journal.

This package is a Python port of the CausalArima R package. Results have been validated against the R implementation across 30 test scenarios covering ARIMA and SARIMA models. See VALIDATION.md for details.

Installation

From source

git clone https://github.com/RobsonTigre/pycausalarima.git
cd pycausalarima
pip install -e .

For development

pip install -e ".[dev]"

Dependencies

Requires Python 3.9+ with:

numpy, pandas, scipy
statsmodels, pmdarima
matplotlib

Quick Start

import numpy as np
import pandas as pd
from pycausalarima import CausalArima

# Create sample data
n = 100
np.random.seed(1)
y = np.cumsum(np.random.normal(0, 1, n)) + 100
y[70:] += 10  # Add intervention effect at day 70

dates = pd.date_range('2020-01-01', periods=n, freq='D')
intervention_date = dates[70]

# Fit model
ca = CausalArima(
    y=y,
    dates=dates,
    intervention_date=intervention_date,
)
result = ca.fit()

# View results
print(ca.summary())

# Visualize
ca.plot(type='forecast')

Complete Example: Replicating the R Package Demo

This example replicates the canonical example from the original R CausalArima package, demonstrating detection of a +10 unit intervention effect with an exogenous regressor.

import numpy as np
import pandas as pd
from pycausalarima import CausalArima

# Simulate data (matches R package example)
n = 100
np.random.seed(1)

# Generate AR(1)-like covariate
x1 = np.zeros(n)
x1[0] = 100
for t in range(1, n):
    x1[t] = 100 + 0.999 * (x1[t-1] - 100) + np.random.normal(0, 1)

# Response with relationship to x1
y = 1.2 * x1 + np.random.normal(0, 1, n)

# Add intervention effect (+10) starting at 71% of the series
intervention_start = int(n * 0.71)
y[intervention_start:] += 10

# Create dates
dates = pd.date_range(start='2014-01-05', periods=n, freq='D')
intervention_date = pd.Timestamp('2014-03-16')

# Fit model with exogenous regressor and bootstrap inference
ca = CausalArima(
    y=y,
    dates=dates,
    intervention_date=intervention_date,
    xreg=x1.reshape(-1, 1),  # Include covariate
    n_boot=1000              # Enable bootstrap inference
)
result = ca.fit()

# View summary
print(ca.summary())

Expected Output:

                              Estimate     SD    P-value (2-sided)
Point causal effect            12.257  1.211                 0.000
Cumulative causal effect      310.709  6.634                 0.000
Temporal average effect        10.357  0.221                 0.000

Note: Output format shown is for illustration. Actual summary() returns a pandas DataFrame with full inference statistics including left-sided, bidirectional, and right-sided p-values. Values are from the R reference implementation; your results may differ slightly due to bootstrap sampling and numerical precision.

Interpretation:

True intervention effect: +10 units
Estimated temporal average: 10.357 (within 4% of true value)
P-value ≈ 0: Strong evidence of a significant positive effect
Cumulative effect (310.7): Total impact over 30 post-intervention days

Visualizations

# Forecast plot: observed vs counterfactual
ca.plot(type='forecast')

# Impact plot: point effects and cumulative effects
ca.plot(type='impact')

# Residual diagnostics: ACF, PACF, Q-Q plot
ca.plot(type='residuals')

Main Features

Automatic ARIMA selection via pmdarima (similar to R's auto.arima)
Normal-based and bootstrap inference for uncertainty quantification
Three causal estimands: point effect, cumulative effect, temporal average
Visualization: forecast plots, impact plots, residual diagnostics
Exogenous regressors support
Full SARIMA support: Tested with 20+ seasonal model configurations

API

CausalArima Constructor

CausalArima(
    y,                          # Time series (array-like)
    dates,                      # Dates (DatetimeIndex)
    intervention_date,          # Intervention date (Timestamp)
    auto=True,                  # Auto-select ARIMA order
    order=(0, 0, 0),            # Manual ARIMA order (p, d, q)
    seasonal_order=(0, 0, 0, 1),# Seasonal order (P, D, Q, s)
    xreg=None,                  # Exogenous regressors
    ic='aic',                   # Information criterion
    n_boot=None,                # Bootstrap iterations
    alpha=0.05                  # Significance level
)

Methods

Method	Description
`fit()`	Fit model and compute effects
`summary()`	Summary table of effects
`plot(type=...)`	Visualizations (`'forecast'`, `'impact'`, `'residuals'`). Returns `matplotlib.figure.Figure` or dict of figures
`impact()`	Detailed impact tables
`get_residuals(standardized=False)`	Model residuals for diagnostics

Methodology

The C-ARIMA approach:

Fits an ARIMA model on pre-intervention data
Forecasts the counterfactual scenario (what would have happened without intervention)
Computes causal effects as observed minus counterfactual
Provides inference via analytical variance formulas or bootstrap

Causal Assumptions

The C-ARIMA method produces valid causal estimates under three key assumptions:

Assumption	Description	How to Check
Continuation	Pre-intervention dynamics would have continued absent intervention	Examine residual ACF; verify stable pre-trend
No Anticipation	Intervention was not anticipated or acted upon early	Verify timing; check for pre-intervention drift
Persistence	Single, sustained intervention (not multiple or time-varying)	Review study design
Exogeneity	Exogenous regressors (`xreg`) are not affected by the intervention	Verify regressors are determined outside the causal pathway

State-Space Consistency

The method assumes the ARIMA model correctly captures pre-intervention dynamics:

Residuals should be white noise (no autocorrelation)
Model order should be appropriate (use auto=True or validate with AIC/BIC)
Seasonal patterns must be explicitly modeled if present

Check with: ca.plot(type='residuals') to examine ACF/PACF of residuals.

What Happens When Assumptions Fail

Violation	Consequence	Symptom
Pre-trend instability	Biased effect estimates	Poor fit in pre-intervention period
Anticipation effects	Effect appears before intervention	Gradual drift near intervention date
Multiple interventions	Confounded estimates	Large residuals after first intervention
Wrong ARIMA order	Invalid inference	Autocorrelated residuals
Unmodeled seasonality	Periodic bias	Seasonal patterns in residuals

When NOT to Use C-ARIMA

This method may not be appropriate if:

Multiple interventions occur at different times
The intervention effect varies over time (use dynamic regression instead)
There's a control group available (use difference-in-differences or synthetic control)
Pre-intervention series is very short (< 30 observations recommended)
Strong non-stationarity that differencing cannot address

Recommended Diagnostics

result = ca.fit()

# 1. Residual diagnostics - should show no significant ACF/PACF
ca.plot(type='residuals')

# 2. Visual inspection - does the counterfactual look reasonable?
ca.plot(type='forecast')

# 3. Check model order (if auto=True)
print(f"Selected order: {result.order}")

Testing

pip install -e ".[dev]"
pytest tests/ -v

Current test suite includes:

Unit tests: Core functionality, edge cases, reporting, visualization
R comparison tests: Validates results match R implementation
Cross-validation: 211 parametrized pytest tests across 30 DGPs (7 metrics each)

# Run cross-validation only
pytest tests/test_dgp_cross_validation.py -v

# Filter by suite
pytest -m dgp_main -v        # DGPs 1-8 (basic ARIMA)
pytest -m dgp_sarima -v      # DGPs 9-18 (seasonal differencing)
pytest -m dgp_extended -v    # DGPs 19-30 (edge cases)

See VALIDATION.md for methodology and comparison/ for full reports.

Validation Summary

Cross-validation against the R CausalArima package across 30 Data Generating Processes:

Metric	Tested	Pass	Status
Point causal effects	30 DGPs	30	All match (max diff < 1.8%)
Cumulative effects	30 DGPs	30	All match
Temporal average effects	30 DGPs	30	All match
Bidirectional p-values	30 DGPs	30	All match (< 0.01 abs diff)
Tau time series correlation	30 DGPs	30	All > 0.99 (29/30 = 1.0000)
Cumulative series correlation	30 DGPs	30	All = 1.0000
Standard deviations	30 DGPs	30	All within 20% relative tolerance

Note on SD tolerance: Standard deviations use a wider tolerance (20%) because Python (statsmodels state-space MLE) and R (CSS-ML) estimate the innovation variance (sigma2) differently. This is a known cross-library numerical characteristic, especially for models with seasonal differencing (D > 0) or high regular differencing (d >= 2). Point estimates and statistical significance are unaffected.

Known Limitations

SD tolerance for seasonal models: Standard deviations for seasonal and differenced ARIMA models may differ from R by up to 20% due to sigma2 estimation method differences (Python state-space MLE vs R CSS-ML). Point estimates and p-values match closely. See VALIDATION.md for details.
Bootstrap compatibility: Bootstrap simulation uses the statsmodels simulate() API, which may change in future statsmodels versions. Normal-based inference is unaffected.
Very short time series (< 20 observations) may produce unstable estimates

Next Steps

Exogenous regressor (xreg) cross-validation: Current 30 DGPs test ARIMA-only models. Add DGPs with exogenous regressors to validate ARIMAX behavior against R.
Bootstrap inference cross-validation: Add DGPs comparing bootstrap CIs and p-values between R and Python (current tests validate normal-based inference only).
Increase coverage for edge cases: Near-unit-root with seasonal components, very long series (n > 500), multiple exogenous regressors.
CI/CD integration: Add the cross-validation suite to CI pipeline (runs without R using pre-computed reference data).

License

GPL-3.0 (same as original R package)

References

@article{menchetti2023combining,
  title={Combining counterfactual outcomes and ARIMA models for policy evaluation},
  author={Menchetti, Fiammetta and Cipollini, Fabrizio and Mealli, Fabrizia},
  journal={The Econometrics Journal},
  year={2023}
}

Acknowledgments

This is a Python port of the CausalArima R package by Fabrizio Cipollini, Fiammetta Menchetti, and Eugenio Palmieri.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
comparison		comparison
pycausalarima		pycausalarima
reference_r_package		reference_r_package
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
REVIEW_REPORT.md		REVIEW_REPORT.md
VALIDATION.md		VALIDATION.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pycausalarima

Overview

Installation

From source

For development

Dependencies

Quick Start

Complete Example: Replicating the R Package Demo

Visualizations

Main Features

API

CausalArima Constructor

Methods

Methodology

Causal Assumptions

State-Space Consistency

What Happens When Assumptions Fail

When NOT to Use C-ARIMA

Recommended Diagnostics

Testing

Validation Summary

Known Limitations

Next Steps

License

References

Acknowledgments

See Also

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

pycausalarima

Overview

Installation

From source

For development

Dependencies

Quick Start

Complete Example: Replicating the R Package Demo

Visualizations

Main Features

API

CausalArima Constructor

Methods

Methodology

Causal Assumptions

State-Space Consistency

What Happens When Assumptions Fail

When NOT to Use C-ARIMA

Recommended Diagnostics

Testing

Validation Summary

Known Limitations

Next Steps

License

References

Acknowledgments

See Also

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages