Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions .claude/settings.local.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{
"permissions": {
"allow": [
"Bash(uv run:*)"
],
"deny": []
}
}
43 changes: 43 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased] - 2025-06-18

### Added
- Modern development setup using `uv` for dependency management
- Support for Python 3.13
- Comprehensive installation documentation in README

### Changed
- **BREAKING**: Minimum Python version updated from 3.8 to 3.13
- **NumPy**: Updated from 1.23.5 to 2.3.0 (major version upgrade)
- **Pandas**: Updated from 1.5.3 to 2.3.0 (major version upgrade)
- **Matplotlib**: Updated from 3.7.1 to 3.10.3
- **SciPy**: Updated from 1.10.1 to 1.15.3
- **Statsmodels**: Updated from 0.13.5 to 0.14.4
- Git repository moved from `d2cml-ai/synthdid.py` to `workhelix/synthdid.py`
- Simplified dependency management in `setup.py` - removed unnecessary development dependencies
- Updated `Pipfile` to use Python 3.13 and latest package versions
- Streamlined `requirements.txt` to include only core scientific packages

### Technical Details
- All dependencies now use minimum version constraints (`>=`) instead of exact pinning for better compatibility
- Virtual environment creation updated to use `uv venv --python 3.13`
- Removed obsolete dependencies like `black==19.3b0`, `click==7.0`, and other development tools from install_requires

### Testing
- Verified compatibility with updated dependencies
- Core functionality tested with California Prop 99 example dataset
- All package imports working correctly with new versions

### Migration Notes
- Existing installations will need to upgrade to Python 3.13+
- Some deprecation warnings may appear with Pandas 2.3.0 due to updated groupby behavior
- NumPy 2.0+ includes breaking changes from NumPy 1.x - see [NumPy 2.0 migration guide](https://numpy.org/devdocs/numpy_2_0_migration_guide.html)

## [0.10.1] - Previous Release
- Previous stable version with Python 3.8 support and older dependencies
122 changes: 122 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Python implementation of Synthetic Difference-in-Differences (SDID), a causal inference method for policy evaluation and treatment effect estimation in panel data. The library combines synthetic control and difference-in-differences approaches.

## Development Commands

### Package Management
- **Install dependencies**: `pipenv install` (uses Pipfile for dependency management)
- **Install dev dependencies**: `pipenv install --dev`
- **Activate environment**: `pipenv shell`
- **Install package in development mode**: `pip install -e .`

### Running Code
- **Run examples**: `jupyter notebook examples/examples.ipynb`
- **Import main class**: `from synthdid.synthdid import Synthdid`
- **Load sample data**: `from synthdid.get_data import california_prop99`

### Testing
⚠️ **No test suite exists** - this is a critical gap that should be addressed. Consider adding:
- `pytest` for testing framework
- `tests/` directory with unit tests for each module
- Test coverage for core estimation methods

## Code Architecture

### Core Design Pattern
The library uses a **method chaining pattern** where users:
1. Initialize `Synthdid` with data and column specifications
2. Call `.fit()` to estimate the model
3. Call `.vcov()` to compute standard errors
4. Call `.summary()` to format results

### Key Modules

**`synthdid/synthdid.py`**: Main entry point that inherits from multiple mixins:
- Combines functionality from SDID, Variance, Plots, and Summary classes
- Provides unified interface for all estimation methods

**`synthdid/sdid.py`**: Core estimation logic
- `SDID` class with `fit()` method
- Supports three modes: 'sdid', 'sc' (synthetic control), 'did' (difference-in-differences)
- Handles staggered adoption designs

**`synthdid/solver.py`**: Optimization algorithms
- Frank-Wolfe method implementation for weight estimation
- Covariate adjustment with "optimized" and "projected" methods

**`synthdid/utils.py`**: Data preprocessing
- Panel data matrix construction
- Data validation and transformation utilities

**`synthdid/vcov.py`**: Variance estimation
- Multiple standard error methods: placebo, bootstrap, jackknife
- `Variance` class with flexible inference options

**`synthdid/plots.py`**: Visualization capabilities
- Outcome trajectory plots
- Weight visualization for units and time periods

### Data Flow
1. Raw panel data → `utils.py` preprocessing → structured matrices
2. Matrices → `solver.py` optimization → estimated weights
3. Weights + data → `sdid.py` estimation → treatment effects
4. Results → `vcov.py` inference → standard errors
5. Final results → `summary.py` + `plots.py` → formatted output

## Common Usage Patterns

### Basic Estimation
```python
from synthdid.synthdid import Synthdid
from synthdid.get_data import california_prop99

df = california_prop99()
result = Synthdid(df, "State", "Year", "treated", "PacksPerCapita").fit().vcov().summary()
```

### Staggered Adoption
When treatment occurs at different times across units, the library automatically handles staggered designs.

### Covariate Adjustment
Use `covariates` parameter with method "optimized" or "projected" for incorporating additional control variables.

## Data Requirements

Input data must be a pandas DataFrame with:
- **Unit column**: Identifier for panel units (e.g., states, countries)
- **Time column**: Time period identifier
- **Treatment column**: Binary treatment indicator
- **Outcome column**: Dependent variable
- Optional covariate columns

## Dependencies

Core dependencies (pinned versions in requirements.txt):
- numpy (1.23.5)
- pandas (1.5.3)
- matplotlib (3.7.1)
- scipy (1.10.1)
- statsmodels (0.13.5)

Development environment targets Python 3.8+ with Pipenv for dependency management.

## Known Issues and Limitations

- **No test suite**: Critical gap in code quality assurance
- **Empty `__init__.py`**: Main classes not exposed at package level
- **Inconsistent error handling**: Mix of print statements and exceptions
- **Missing type hints**: Reduces code maintainability
- **Rigid dependency pinning**: May cause compatibility issues

## Development Priorities

1. Add comprehensive test suite with pytest
2. Implement proper package-level imports in `__init__.py`
3. Add type hints throughout codebase
4. Standardize error handling with custom exceptions
5. Add code formatting and linting tools
25 changes: 6 additions & 19 deletions Pipfile
Original file line number Diff line number Diff line change
Expand Up @@ -4,27 +4,14 @@ verify_ssl = true
name = "pypi"

[packages]
numpy = "==1.23.5"
pandas = "==1.5.3"
matplotlib = "==3.7.1"
contourpy = "==1.0.7"
cycler = "==0.11.0"
fonttools = "==4.39.0"
importlib-resources = "==5.12.0"
kiwisolver = "==1.4.4"
packaging = "==23.0"
pillow = "==9.4.0"
pyparsing = "==3.0.9"
python-dateutil = "==2.8.2"
pytz = "==2022.7.1"
six = "==1.16.0"
zipp = "==3.15.0"
statsmodels = "*"
pipenv-setup = "*"
scipy = "*"
numpy = "==2.3.0"
pandas = "==2.3.0"
matplotlib = "==3.10.3"
scipy = "==1.15.3"
statsmodels = "==0.14.4"

[dev-packages]
ipykernel = "*"

[requires]
python_version = "3.8"
python_version = "3.13"
37 changes: 34 additions & 3 deletions Readme.md
Original file line number Diff line number Diff line change
@@ -1,15 +1,46 @@
# sythdid: Synthetic Difference in Difference Estimation
# synthdid: Synthetic Difference in Difference Estimation

This package implements the synthetic difference-in-differences estimation procedure, along with a range of inference and graphing procedures, following the work of the author. The package draws on [R](https://github.com/synth-inference/synthdid) and [Julia](https://github.com/d2cml-ai/Synthdid.jl) code for optimization and [Stata](https://github.com/Daniel-Pailanir/sdid) code for implementation in contexts with staggered adoption over multiple treatment periods (as well as in a single adoption period as in the original code). The package extends the functionality of the original code, allowing for estimation in a wider range of contexts. Overall, this package provides a comprehensive toolkit for researchers interested in using the synthetic difference-in-differences estimator in their work.

## Requirements

- **Python 3.13+** (updated from 3.8 to support latest dependencies)
- **NumPy 2.3.0+** (major version upgrade for improved performance)
- **Pandas 2.3.0+** (latest version with enhanced functionality)
- **Matplotlib 3.10.3+** (updated plotting capabilities)
- **SciPy 1.15.3+** (latest scientific computing features)
- **Statsmodels 0.14.4+** (updated statistical modeling)

## Instalation
## Installation

```py
### Using pip (PyPI)
```bash
pip install synthdid
```

### Using uv (recommended for development)
```bash
# Install with Python 3.13
uv venv --python 3.13
uv pip install -r requirements.txt
```

### Development Installation
```bash
# Clone the repository
git clone https://github.com/workhelix/synthdid.py.git
cd synthdid.py

# Create virtual environment with Python 3.13
uv venv --python 3.13

# Install dependencies
uv pip install -r requirements.txt

# Install in development mode
uv pip install -e .
```

## Usage

### Input class `Synthdid`
Expand Down
20 changes: 5 additions & 15 deletions requirements.txt
Original file line number Diff line number Diff line change
@@ -1,15 +1,5 @@
contourpy==1.0.7
cycler==0.11.0
fonttools==4.39.0
importlib-resources==5.12.0
kiwisolver==1.4.4
matplotlib==3.7.1
numpy==1.23.5
packaging==23.0
pandas==1.5.3
Pillow==9.4.0
pyparsing==3.0.9
python-dateutil==2.8.2
pytz==2022.7.1
six==1.16.0
zipp==3.15.0
numpy==2.3.0
pandas==2.3.0
matplotlib==3.10.3
scipy==1.15.3
statsmodels==0.14.4
30 changes: 5 additions & 25 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,31 +10,11 @@
setup(
dependency_links=[],
install_requires=[
"appdirs==1.4.3",
"attrs==19.1.0; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'",
"black==19.3b0; python_version >= '3.6'",
"click==7.0; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'",
"contourpy==1.0.7",
"cycler==0.11.0",
"fonttools==4.39.0",
"importlib-resources==5.12.0",
"kiwisolver==1.4.4",
"matplotlib==3.7.1",
"numpy==1.23.5",
"packaging==23.0",
"pandas==1.5.3",
"patsy==0.5.3",
"pillow==9.4.0",
"pipenv-setup==2.0.0",
"pipfile==0.0.2",
"pyparsing==3.0.9",
"python-dateutil==2.8.2",
"pytz==2022.7.1",
"scipy==1.10.1; python_version >= '3.10' and python_version < '3.12' and platform_system != 'Windows' or platform_machine != 'x86'",
"six==1.16.0",
"statsmodels==0.13.5",
"toml==0.10.0",
"zipp==3.15.0",
"numpy>=2.3.0",
"pandas>=2.3.0",
"matplotlib>=3.10.3",
"scipy>=1.15.3",
"statsmodels>=0.14.4",
],
name="synthdid",
author="D2CML Team, Alexander Quispe, Rodrigo Grijalba, Jhon Flores, Franco Caceres",
Expand Down