CausalDGP: A Python Engine for Structural Causal Models

A Python library for creating and manipulating Structural Causal Models (SCMs), generating data from them, and analyzing causal graphs.

📖 Overview

CausalDGP is a Python engine for causal inference research and experimentation. It provides a robust framework for defining complex causal systems, generating data from them, and analyzing their underlying graphical properties.

While the engine is a general-purpose tool, it also serves as a comprehensive benchmark, providing a standardized collection of data generating processes (DGPs) from the causal inference literature. It's designed to help researchers evaluate and compare estimators under challenging scenarios, including:

Back-door and front-door adjustment
Unobserved confounding and complex "napkin" graphs
Longitudinal models with time-varying treatments
Advanced identification strategies beyond simple adjustments
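
For reference, the first scenario in the list above corresponds to the two standard adjustment formulas sketched below. The notation is illustrative only and is not taken from the library: T is the treatment, Y the outcome, Z a set of observed covariates satisfying the back-door criterion, and M a mediator satisfying the front-door criterion.

```latex
% Back-door adjustment: Z blocks every back-door path from T to Y.
P(y \mid \mathrm{do}(t)) = \sum_{z} P(y \mid t, z)\, P(z)

% Front-door adjustment: M intercepts every directed path from T to Y.
P(y \mid \mathrm{do}(t)) = \sum_{m} P(m \mid t) \sum_{t'} P(y \mid m, t')\, P(t')
```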

✨ Key Features

The power of CausalDGP comes from its modular design:

Flexible SCM Engine (scm.py): Define any SCM by specifying variables, their causal parents, and their functional relationships. The engine handles the recursive data generation process automatically.
Rich Graph Toolkit (graph.py): A suite of utility functions for causal graph analysis, including d-separation checks, finding ancestors, implementing do-calculus rules, and generating interactive visualizations.
Benchmark Suite (generator.py): A ready-to-use collection of famous and challenging SCMs from the literature, perfect for evaluating new methods.
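
To make the idea of "variables, parents, and functional relationships" concrete, here is a minimal, self-contained sketch of recursive data generation and an ancestor lookup. It is not CausalDGP's API: the names sample_scm, ancestors, and toy_scm, and the dictionary layout, are hypothetical and chosen only for illustration. It uses the standard library alone.

```python
import random


def sample_scm(equations, n, seed=0):
    """Draw n samples from an SCM given as {name: (parents, fn)}.

    Hypothetical layout (not CausalDGP's): fn(parent_values, rng) -> value.
    Each variable is generated recursively, after all of its parents.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        values = {}

        def resolve(name, stack=()):
            if name in values:
                return values[name]
            if name in stack:
                raise ValueError(f"cycle detected at {name!r}")
            parents, fn = equations[name]
            parent_values = {p: resolve(p, stack + (name,)) for p in parents}
            values[name] = fn(parent_values, rng)
            return values[name]

        for name in equations:
            resolve(name)
        rows.append(values)
    return rows


def ancestors(equations, name):
    """Return the set of all ancestors of `name` in the causal graph."""
    found = set()
    frontier = list(equations[name][0])
    while frontier:
        node = frontier.pop()
        if node not in found:
            found.add(node)
            frontier.extend(equations[node][0])
    return found


# Toy back-door model: Z -> T, Z -> Y, T -> Y.
# Variables may be declared in any order; parents are resolved on demand.
toy_scm = {
    "Y": (["T", "Z"], lambda pa, rng: 2.0 * pa["T"] + pa["Z"] + rng.gauss(0, 1)),
    "T": (["Z"], lambda pa, rng: 1 if pa["Z"] + rng.gauss(0, 1) > 0 else 0),
    "Z": ([], lambda pa, rng: rng.gauss(0, 1)),
}

rows = sample_scm(toy_scm, n=5, seed=42)
print(rows[0])
print(sorted(ancestors(toy_scm, "Y")))  # ['T', 'Z']
```

Storing each structural equation next to its parent list keeps the graph and the functional form in one place, so graph queries such as ancestors can reuse the same definition that drives sampling.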

For detailed API documentation, please see our docs/ folder (coming soon).

🚀 Quick Start: The Benchmark Application

The easiest way to see CausalDGP in action is through the included benchmark generators (generator.py). These pre-built SCMs from the literature demonstrate the capabilities of the engine.

Setup

Make sure you have the necessary libraries installed:

pip install networkx scipy matplotlib numpy pandas pyvis

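Then install CausalDGP as an editable package so that imports such as CausalDGP.generator resolve correctly. With pip this is typically done by running pip install -e . from the repository root, assuming the repository provides packaging metadata (setup.py or pyproject.toml).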

Example: Generating Data

You can import any SCM from the generator module and use it to create a dataset. Here’s how to generate 1,000 samples from the Kang & Schafer (2007) simulation:

# 1. Import a pre-built SCM generator from the CausalDGP package
from CausalDGP.generator import Kang_Schafer

# 2. Get the SCM object, treatment, and outcome variable names
scm, treatments, outcomes = Kang_Schafer(seednum=42)

# 3. Use the SCM object to generate a pandas DataFrame
sample_data = scm.generate_samples(num_samples=1000)

# 4. Display the first few rows of your new dataset
print(sample_data.head())
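
To show why this benchmark is considered challenging, the sketch below simulates the Kang & Schafer (2007) design directly and compares a naive mean with an inverse-probability-weighted (IPW) mean. It is an independent illustration using only the standard library, not CausalDGP's implementation: the coefficients follow the design as commonly reported from the paper, and the function name simulate_kang_schafer and the variable layout are assumptions. The column names and exact parameterization in CausalDGP may differ.

```python
import math
import random


def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_kang_schafer(n, seed=0):
    """Simulate (outcome, indicator, true propensity) triples.

    Assumed design (as commonly reported from Kang & Schafer, 2007):
    four standard-normal covariates, a linear outcome with mean 210,
    and a logistic selection/treatment model.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = [rng.gauss(0, 1) for _ in range(4)]
        y = 210 + 27.4 * z[0] + 13.7 * (z[1] + z[2] + z[3]) + rng.gauss(0, 1)
        p = expit(-z[0] + 0.5 * z[1] - 0.25 * z[2] - 0.1 * z[3])
        t = 1 if rng.random() < p else 0
        data.append((y, t, p))
    return data


data = simulate_kang_schafer(20000, seed=42)

# Keep only units with indicator 1, as if the other outcomes were unobserved.
observed = [(y, p) for y, t, p in data if t == 1]

# Naive estimate: plain mean of the observed outcomes (confounded by Z).
naive = sum(y for y, _ in observed) / len(observed)

# Normalized IPW estimate using the true propensity scores.
ipw = sum(y / p for y, p in observed) / sum(1 / p for _, p in observed)

print(f"naive mean of observed outcomes: {naive:.2f}")
print(f"IPW mean (true propensity):      {ipw:.2f}")
```

Under this assumed design the population mean of the outcome is 210. The naive mean is biased downward because units with large Z1 are both less likely to be selected and have larger outcomes, while the weighted mean corrects for this when the propensity model is right.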

📊 Available Benchmark Models

CausalDGP includes a variety of pre-built SCMs from the causal inference literature. Below is a summary. For detailed mathematical descriptions and graph visualizations for each model, please see our Benchmark Details Documentation.

| Model Function | Brief Description | Key Features & Notes |
| --- | --- | --- |
| `BD_SCM` | Standard back-door adjustment. | User-defined covariate dimensions. |
| `Kang_Schafer` | Classic simulation from Kang & Schafer (2007). | - |
| `CCDDHNR2018_IRM` | Interactive Regression Model (IRM) from Chernozhukov et al. (2018). | High-dimensional confounders. |
| `CCDDHNR2018_PLR` | Partially Linear Regression (PLR) model from Chernozhukov et al. (2018). | Continuous treatment, high-dimensional confounders. |
| `SBD_SCM` | Standard sequential back-door adjustment. | Time-varying covariates and treatments. |
| `mSBD_SCM` | Multi-outcome sequential back-door adjustment. | Multiple sequential outcomes. |
| `luedtke_2017_sim1_scm` | Longitudinal model from Luedtke et al. (2017). | Sequential data with time-varying treatments. |
| `Canonical_FD_SCM` | Canonical front-door graph. | No observed confounders of T and Y. |
| `Fulcher_FD` | Front-door model used in Fulcher et al. (2020). | Front-door with observed confounders. |
| `FD_SCM` | Front-door model with covariates. | Front-door with high-dimensional confounders. |
| `Bhattacharya2022_Fig2b_SCM` | Model from Figure 2b of Bhattacharya et al. (2022). | Solvable by advanced identification, not front-door or back-door. |
| `Bhattacharya2022_Fig3_SCM` | Model from Figure 3 of Bhattacharya et al. (2022). | - |
| `Bhattacharya2022_Fig5_SCM` | Model from Figure 5 of Bhattacharya et al. (2022). | - |
| `ConeCloud_15_SCM` | 15-node Cone Cloud graph from Figure 3b of Raichev et al. (2024). | Large graph with categorical variables. |
| `Napkin_SCM` | The minimal "Napkin" graph. | Irreducible ratio-of-probabilities estimand. |
| `Napkin_SCM_dim` | Napkin graph with high-dimensional covariates. | Irreducible ratio with high-dimensional covariates. |
| `Napkin_FD_SCM` | Combines Napkin and front-door structures. | Irreducible ratio with front-door adjustment. |
| `Nested_Napkin_SCM` | Nested graph from Tian's (2002) thesis. | Estimand is a ratio of two sequential back-door formulas. |
| `Double_Napkin_SCM` | Two stacked Napkin graphs. | Estimand is a ratio of ratios. |
| `Plan_ID_SCM` | Model from Figure 1b of Jung et al. (2021). | Not solvable by simple back-door or front-door adjustment. |
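
As a reminder of what "ratio of probabilities estimand" means for the Napkin entries, the standard Napkin graph and its identification formula are sketched below. The variable names (W, R, X, Y) follow common textbook notation and are an assumption here; the names used inside `Napkin_SCM` may differ.

```latex
% Napkin graph: W -> R -> X -> Y, with unobserved confounding W <-> X and W <-> Y.
% The effect of X on Y is identified by a ratio, for any fixed value r of R:
P(y \mid \mathrm{do}(x)) =
  \frac{\sum_{w} P(x, y \mid r, w)\, P(w)}
       {\sum_{w} P(x \mid r, w)\, P(w)}
```

Neither the numerator nor the denominator is itself a back-door or front-door expression for the effect of X on Y, which is why estimators built only on those two criteria do not apply to this graph.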

📜 Citation

If you use CausalDGP in your research, please consider citing it:

@software{CausalDGP_2025,
  author = {Jung, Yonghan},
  title = {{CausalDGP: A Python Engine for Structural Causal Models}},
  url = {https://github.com/yonghanjung/CausalDGP},
  version = {0.1.0},
  year = {2025}
}
