SynEval is a comprehensive evaluation framework for assessing the quality of synthetic data. The framework provides quantitative scoring across four key dimensions:
- Fidelity: Measures how well the synthetic data preserves the statistical properties and patterns of the original data
- Utility: Evaluates the usefulness of synthetic data for downstream tasks
- Diversity: Assesses the variety and uniqueness of the generated data
- Privacy: Analyzes the privacy protection level of the synthetic data
- Clone the repository:

  ```bash
  git clone https://github.com/privacy-enhancing-technologies/SynEval.git
  cd SynEval
  ```

- Create and activate a conda virtual environment:

  ```bash
  conda create -n syneval python=3.10
  conda activate syneval
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download NLTK data (required for text processing):

  ```bash
  python -m nltk.downloader punkt punkt_tab stopwords
  ```

Note: You may see dependency conflict warnings during installation. This is normal in environments like Google Colab or when other packages are already installed. For a clean installation without conflicts, consider using a virtual environment.
After installation, you can use SynEval from the command line. The main entry point for the framework is `run.py`. This script allows you to evaluate synthetic data against original data using various metrics.
The general command format is:
```bash
python run.py --synthetic <synthetic_data.csv> --original <original_data.csv> --metadata <metadata.json> [evaluation_flags] [--output <results.json>]
```

- `--synthetic`: Path to the synthetic data CSV file
- `--original`: Path to the original data CSV file
- `--metadata`: Path to the metadata JSON file
You can select one or more evaluation dimensions to run:
- `--fidelity`: Run fidelity evaluation
- `--utility`: Run utility evaluation
- `--diversity`: Run diversity evaluation
- `--privacy`: Run privacy evaluation
- `--output`: Path to save evaluation results in JSON format. If not specified, results are printed to stdout. (Default: `artifacts/reports/evaluation_results.json`)
- `--plot`: Generate plots for all evaluation metrics and save them to the `artifacts/plots` directory. Plots visualize key metrics from the fidelity, utility, diversity, and privacy evaluations.
- `--html`: Generate an HTML dashboard summarizing each evaluation dimension (saved to `artifacts/html/syneval_dashboard.html` by default).
- `--device`: Device to use for computation (`auto`, `cpu`, `cuda`). Default: `auto` (automatically detect the best available device)
- `--force-cpu`: Force CPU usage even if a GPU is available (overrides `--device`)
- `--gpu-memory-fraction`: Fraction of GPU memory to use (0.0-1.0, default: 0.8)
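The precedence among these flags can be sketched as follows. This is an illustrative model of the behavior described above (`--force-cpu` overriding `--device`, and `auto` falling back to CPU when no GPU is detected), not SynEval's actual implementation; the function name and signature are assumptions.

```python
def resolve_device(device="auto", force_cpu=False, cuda_available=False):
    """Sketch of the documented flag precedence:
    --force-cpu wins over everything; --device auto picks CUDA
    only when a GPU is actually available."""
    if force_cpu:
        return "cpu"
    if device == "auto":
        return "cuda" if cuda_available else "cpu"
    return device

print(resolve_device("auto", cuda_available=True))   # → cuda
print(resolve_device("cuda", force_cpu=True))        # → cpu
```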
Example:

```bash
python run.py \
    --synthetic data/gpt4.1_synthetic.csv \
    --original data/real_10k.csv \
    --metadata data/metadata.json \
    --dimensions fidelity utility diversity privacy \
    --utility-input text \
    --utility-output rating \
    --output artifacts/reports/results.json \
    --plot \
    --html \
    --device auto
```

For GPU acceleration (if available):

```bash
python run.py \
    --synthetic data/gpt4.1_synthetic.csv \
    --original data/real_10k.csv \
    --metadata data/metadata.json \
    --dimensions fidelity utility diversity privacy \
    --utility-input text \
    --utility-output rating \
    --output artifacts/reports/results.json \
    --plot \
    --html \
    --device cuda \
    --gpu-memory-fraction 0.8
```

For CPU-only processing:

```bash
python run.py \
    --synthetic data/gpt4.1_synthetic.csv \
    --original data/real_10k.csv \
    --metadata data/metadata.json \
    --dimensions fidelity utility diversity privacy \
    --utility-input text \
    --utility-output rating \
    --output artifacts/reports/results.json \
    --plot \
    --html \
    --device cpu
```

All artifacts generated by `run.py` are stored under the `artifacts/` directory:
- `artifacts/cache`: cached intermediate computations
- `artifacts/plots`: matplotlib/seaborn plots
- `artifacts/html`: HTML dashboards (including privacy visualizations)
- `artifacts/reports`: JSON summaries and other text outputs
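Once a run finishes, the JSON report can be inspected programmatically. The helper below only assumes the report is a JSON object; the section names shown in the example are assumptions based on the four evaluation dimensions, not a documented schema.

```python
import json
import tempfile

def summarize_results(path):
    """Load a SynEval results JSON and return its top-level sections.
    Which sections appear depends on the evaluation flags that were run."""
    with open(path) as f:
        results = json.load(f)
    return sorted(results.keys())

# Illustrative only: stand in for artifacts/reports/results.json with a
# temporary file containing hypothetical section names.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"fidelity": {}, "utility": {}, "diversity": {}, "privacy": {}}, f)
    report_path = f.name

print(summarize_results(report_path))  # → ['diversity', 'fidelity', 'privacy', 'utility']
```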
The metadata file should be a JSON file that describes the structure of your data. It should include column names, types, dataset name, and primary key information. See data/metadata.json for a concrete example.
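As a sketch of what such a file might contain, the snippet below builds a hypothetical metadata object covering the fields the description mentions (column names and types, dataset name, primary key) and runs a minimal sanity check. All field names and values here are illustrative assumptions; `data/metadata.json` in the repository remains the authoritative example.

```python
import json

# Hypothetical metadata: every field name below is an assumption for
# illustration, not SynEval's documented schema.
metadata = {
    "dataset_name": "product_reviews",
    "primary_key": "review_id",
    "columns": {
        "review_id": {"type": "id"},
        "text": {"type": "text"},
        "rating": {"type": "numerical"},
    },
}

def check_metadata(meta):
    """Minimal sanity check: required top-level fields exist and the
    primary key is one of the declared columns."""
    for key in ("dataset_name", "primary_key", "columns"):
        if key not in meta:
            raise ValueError(f"missing metadata field: {key}")
    if meta["primary_key"] not in meta["columns"]:
        raise ValueError("primary_key must be a declared column")
    return True

check_metadata(metadata)
print(json.dumps(metadata, indent=2))
```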
Detailed documentation for each evaluation module is located under `evaluation/descriptions/`.
The `use_cases/` directory contains scenario-focused extensions that demonstrate how SynEval can be adapted to solve concrete problems beyond the core evaluation CLI (e.g., NER analysis, differential privacy dashboards).
As we implement more evaluation metrics, this README will be updated with additional documentation for each component.
This project is licensed under the MIT License - see the LICENSE file for details.