Financial Entity Extraction with DSPy + GEPA

Experiments with DSPy and GEPA optimization for structured information extraction. TL;DR: automatic prompt optimization raises exact-match accuracy by about 22 percentage points over vanilla OpenAI API calls (32.07% to 54.43%).

Results

Method	Exact Match	Mean Field
OpenAI Baseline	32.07%	84.34%
DSPy Baseline	39.79%	87.06%
DSPy + BAML	42.74%	87.86%
DSPy + GEPA	53.84%	91.64%
DSPy + BAML + GEPA	54.43%	91.62%
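
The percentage-point (pp) gains quoted in this README follow directly from the exact-match column. A small sketch that recomputes them from the table values (the helper name `pp_gain` is made up for illustration and is not part of the repo):

```python
# Exact-match accuracy (%) copied from the results table above.
EXACT_MATCH_PCT = {
    "OpenAI Baseline": 32.07,
    "DSPy Baseline": 39.79,
    "DSPy + BAML": 42.74,
    "DSPy + GEPA": 53.84,
    "DSPy + BAML + GEPA": 54.43,
}


def pp_gain(method: str, reference: str) -> float:
    """Percentage-point difference in exact match between two methods."""
    return round(EXACT_MATCH_PCT[method] - EXACT_MATCH_PCT[reference], 2)


# Print every method's gain relative to the OpenAI baseline.
for method in EXACT_MATCH_PCT:
    gain = pp_gain(method, "OpenAI Baseline")
    print(f"{method:<20} {gain:+6.2f} pp vs OpenAI Baseline")
```

The best configuration (DSPy + BAML + GEPA) is 22.36pp above the OpenAI baseline, and DSPy + GEPA is 14.05pp above the DSPy baseline.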

Dataset

Financial NER extraction from Cleanlab's structured output benchmark - 2,117 samples with 7 entity types: Company, Date, Location, Money, Person, Product, Quantity.
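
This README does not define the two metrics, so the following is only a sketch under assumptions: "Exact Match" is taken to mean that all 7 entity fields of a sample are correct, and "Mean Field" the fraction of fields that are correct. The set-based, case-insensitive comparison in `normalize` is also an assumption; the notebooks may score differently.

```python
from typing import Dict, List

# The 7 entity types listed in the Dataset section.
ENTITY_TYPES = ["Company", "Date", "Location", "Money", "Person", "Product", "Quantity"]

Entities = Dict[str, List[str]]


def normalize(values: List[str]) -> set:
    """Assumed comparison rule: order-insensitive, case-insensitive, trimmed."""
    return {v.strip().lower() for v in values}


def field_matches(gold: Entities, pred: Entities) -> Dict[str, bool]:
    """Per-field correctness; a missing field counts as an empty list."""
    return {
        t: normalize(gold.get(t, [])) == normalize(pred.get(t, []))
        for t in ENTITY_TYPES
    }


def exact_match(gold: Entities, pred: Entities) -> bool:
    """True only if every entity field matches."""
    return all(field_matches(gold, pred).values())


def mean_field_accuracy(gold: Entities, pred: Entities) -> float:
    """Fraction of the 7 entity fields that match."""
    return sum(field_matches(gold, pred).values()) / len(ENTITY_TYPES)


# Hypothetical sample: the prediction gets the Date field wrong.
gold = {"Company": ["Acme Corp"], "Money": ["$5 million"], "Date": ["Q3 2023"]}
pred = {"Company": ["acme corp"], "Money": ["$5 million"], "Date": ["2023"]}
print(exact_match(gold, pred), mean_field_accuracy(gold, pred))
```

In the sample, six of seven fields match, so the sample fails exact match while still scoring 6/7 on the field-level metric. This is why the Mean Field column sits far above the Exact Match column for every method.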

Files

get_responses.ipynb - Original OpenAI baseline
get_responses_dspy.ipynb - DSPy + GEPA experiments
get_responses_dspy_baml.ipynb - DSPy + BAML experiments
generate_comparison_charts.py - Regenerate comparison charts
schema.py - Pydantic schema for extracted entities

Quick Start

pip install dspy pandas numpy matplotlib scikit-learn jupyter

# Run the DSPy notebook
jupyter notebook get_responses_dspy.ipynb

Key Takeaways

DSPy alone doesn't always beat hand-crafted prompts
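GEPA optimization is where the real gains are (~14pp exact match over the DSPy baseline, 39.79% to 53.84%)
Cost-effective: ~$2-3 in API costs for a one-time optimization run whose accuracy gains persist as long as the optimized prompt is reused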
Biggest gains in Product (+12pp) and Date (+9pp) extraction
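
GEPA is a reflective optimizer: besides a numeric score, it can use textual feedback about what went wrong to propose better prompts. The sketch below shows one way such a per-sample score plus feedback string could be built for this task. The function name and feedback wording are hypothetical, the comparison is a plain exact string match, and the wiring into DSPy's GEPA optimizer is deliberately left out; the notebook's actual metric may look different.

```python
from typing import Dict, List, Tuple

ENTITY_TYPES = ["Company", "Date", "Location", "Money", "Person", "Product", "Quantity"]


def score_with_feedback(
    gold: Dict[str, List[str]], pred: Dict[str, List[str]]
) -> Tuple[float, str]:
    """Return (fraction of correct fields, human-readable error summary)."""
    notes = []
    correct = 0
    for entity_type in ENTITY_TYPES:
        expected = set(gold.get(entity_type, []))
        got = set(pred.get(entity_type, []))
        if expected == got:
            correct += 1
            continue
        # Describe the error so a reflection step has something concrete to act on.
        missing = sorted(expected - got)
        unexpected = sorted(got - expected)
        notes.append(f"{entity_type}: missing {missing}, unexpected {unexpected}")
    score = correct / len(ENTITY_TYPES)
    feedback = "All fields correct." if not notes else "; ".join(notes)
    return score, feedback


# Hypothetical sample with a wrong Product extraction.
score, feedback = score_with_feedback(
    {"Company": ["Apple"], "Product": ["iPhone"]},
    {"Company": ["Apple"], "Product": ["iPad"]},
)
print(score, feedback)
```

Field-level feedback of this kind names the fields where a prompt is failing, which fits the observation that the gains concentrate in specific fields such as Product and Date.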

Blog Post

Full writeup: Optimizing Structured Output with DSPy and GEPA

Repository: kmad/dspy-optimizer-experiment
