feat(python): expose path operations in extract_paths() dict #261

Open
willywg wants to merge 1 commit into yfedoseev:main from willywg:feat/expose-path-operations-python

@willywg willywg commented Mar 18, 2026

Description

Add an operations field to the path dict returned by the extract_paths(), extract_lines(), and extract_rects() Python bindings.

Currently the Python API exposes only operations_count (the number of path operations), not the operations themselves or their coordinates. As a result, Python users cannot access the precise vector geometry and get only the bounding box.

Each operation is a dict with an op key and the relevant coordinates:

  • move_to: {op, x, y}
  • line_to: {op, x, y}
  • curve_to: {op, cx1, cy1, cx2, cy2, x, y}
  • rectangle: {op, x, y, width, height}
  • close_path: {op}
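
To illustrate how the operation dicts listed above could be consumed, here is a minimal sketch that turns an operations list into an SVG path data string. The helper name `operations_to_svg_path` and the sample data are hypothetical and hand-written; they are not part of pdf_oxide or of this PR. The sketch also ignores the difference between PDF and SVG y-axis direction.

```python
from typing import Any


def operations_to_svg_path(operations: list[dict[str, Any]]) -> str:
    """Build an SVG path `d` string from a list of operation dicts.

    Assumes each dict has the keys described in this PR for its `op` type.
    """
    parts: list[str] = []
    for op in operations:
        kind = op["op"]
        if kind == "move_to":
            parts.append(f"M {op['x']:g} {op['y']:g}")
        elif kind == "line_to":
            parts.append(f"L {op['x']:g} {op['y']:g}")
        elif kind == "curve_to":
            parts.append(
                f"C {op['cx1']:g} {op['cy1']:g} "
                f"{op['cx2']:g} {op['cy2']:g} "
                f"{op['x']:g} {op['y']:g}"
            )
        elif kind == "rectangle":
            x, y, w, h = op["x"], op["y"], op["width"], op["height"]
            # A rectangle is a closed subpath: start corner, then three relative edges.
            parts.append(f"M {x:g} {y:g} h {w:g} v {h:g} h {-w:g} Z")
        elif kind == "close_path":
            parts.append("Z")
        else:
            raise ValueError(f"unknown path operation: {kind!r}")
    return " ".join(parts)


# Hand-written sample in the shape described above (not real library output).
sample = [
    {"op": "move_to", "x": 0.0, "y": 0.0},
    {"op": "line_to", "x": 10.0, "y": 0.0},
    {"op": "curve_to", "cx1": 10.0, "cy1": 5.0, "cx2": 5.0, "cy2": 10.0, "x": 0.0, "y": 10.0},
    {"op": "close_path"},
]
print(operations_to_svg_path(sample))
```

In real use the list would come from the `operations` key of a path dict; the exact call needed to open a document and reach that dict is not shown here.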

Use case: We're building a construction document processing platform that extracts vector geometry from architectural PDFs for snap-to-geometry and CAD overlay features. pdf_oxide is the fastest library we've tested, and this change makes it fully usable for our vector extraction pipeline.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Tests
  • CI/CD changes

Related Issues

Relates to #147 — This PR exposes the raw path operations to Python, which is a prerequisite for the PathContent::to_points(tolerance) API discussed in that issue.
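
Until something like to_points(tolerance) exists, a caller could approximate it in Python from the exposed operations. The sketch below is an assumption about how that might look, not the API proposed in #147: `flatten_cubic` and `operations_to_points` are hypothetical names, and the flattening uses recursive de Casteljau subdivision with a simple control-point-to-chord flatness test.

```python
import math
from typing import Any

Point = tuple[float, float]


def _dist_to_chord(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the line through a and b (or to a if a == b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length


def _mid(a: Point, b: Point) -> Point:
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def flatten_cubic(p0: Point, p1: Point, p2: Point, p3: Point,
                  tolerance: float, depth: int = 0) -> list[Point]:
    """Return the points after p0 that approximate the cubic, ending at p3."""
    flat = max(_dist_to_chord(p1, p0, p3), _dist_to_chord(p2, p0, p3)) <= tolerance
    if flat or depth >= 16:  # depth cap guards against runaway recursion
        return [p3]
    # de Casteljau split at t = 0.5
    p01, p12, p23 = _mid(p0, p1), _mid(p1, p2), _mid(p2, p3)
    p012, p123 = _mid(p01, p12), _mid(p12, p23)
    m = _mid(p012, p123)
    return (flatten_cubic(p0, p01, p012, m, tolerance, depth + 1)
            + flatten_cubic(m, p123, p23, p3, tolerance, depth + 1))


def operations_to_points(operations: list[dict[str, Any]],
                         tolerance: float = 0.1) -> list[list[Point]]:
    """Convert operation dicts into a list of subpaths, each a list of points."""
    subpaths: list[list[Point]] = []
    current: list[Point] = []
    for op in operations:
        kind = op["op"]
        if kind == "move_to":
            if current:
                subpaths.append(current)
            current = [(op["x"], op["y"])]
        elif kind == "line_to":
            current.append((op["x"], op["y"]))
        elif kind == "curve_to":
            if not current:
                raise ValueError("curve_to without a current point")
            current.extend(flatten_cubic(
                current[-1], (op["cx1"], op["cy1"]), (op["cx2"], op["cy2"]),
                (op["x"], op["y"]), tolerance))
        elif kind == "rectangle":
            if current:
                subpaths.append(current)
            x, y, w, h = op["x"], op["y"], op["width"], op["height"]
            subpaths.append([(x, y), (x + w, y), (x + w, y + h), (x, y + h), (x, y)])
            current = []
        elif kind == "close_path":
            if current:
                current.append(current[0])  # close back to the subpath start
    if current:
        subpaths.append(current)
    return subpaths
```

The flatness test measures distance to the infinite chord line, so control points that overshoot the endpoints along the chord are not detected. A production implementation would likely need a stricter test.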

Changes Made

  • Modified path_to_py_dict() in src/python.rs to include an operations list alongside the existing operations_count
  • Each PathOperation variant is converted to a Python dict with the op type and coordinates
  • Added test_extract_paths_operations test in tests/test_python.py
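
As a rough illustration of the shape these changes produce, the sketch below checks a path dict against the per-operation keys listed in the description. It is not the test added in this PR: `validate_path` and `REQUIRED_KEYS` are hypothetical names, and the check that operations_count equals the length of operations is an assumption about how the two fields relate.

```python
from typing import Any

# Coordinate keys expected for each op type, as listed in the PR description.
REQUIRED_KEYS: dict[str, set[str]] = {
    "move_to": {"x", "y"},
    "line_to": {"x", "y"},
    "curve_to": {"cx1", "cy1", "cx2", "cy2", "x", "y"},
    "rectangle": {"x", "y", "width", "height"},
    "close_path": set(),
}


def validate_path(path: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the dict looks fine."""
    operations = path.get("operations")
    if not isinstance(operations, list):
        return ["'operations' is missing or not a list"]

    problems: list[str] = []
    # Assumption: operations_count is the length of the operations list.
    if path.get("operations_count") != len(operations):
        problems.append("operations_count does not match len(operations)")

    for index, op in enumerate(operations):
        kind = op.get("op")
        if kind not in REQUIRED_KEYS:
            problems.append(f"operation {index}: unknown op {kind!r}")
            continue
        missing = REQUIRED_KEYS[kind] - set(op)
        if missing:
            problems.append(f"operation {index}: missing keys {sorted(missing)}")
        for key in sorted(REQUIRED_KEYS[kind] & set(op)):
            if not isinstance(op[key], float):
                problems.append(f"operation {index}: {key} is not a float")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to see every mismatch in a single run.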

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • All existing tests pass locally (cargo test — 128 passed, 0 failed)
  • cargo clippy -- -D warnings passes with no warnings
  • cargo fmt --check passes

Python Bindings (if applicable)

  • Python bindings have been updated
  • Python tests pass
  • ruff format — N/A (only added test, follows existing style)
  • ruff check — N/A

Documentation

  • README.md or documentation site has been updated
  • Code examples have been updated
  • CHANGELOG.md has been updated

Note: I didn't update CHANGELOG.md as I'm not sure which version this would target. Happy to add it if you let me know.

Checklist

  • My code follows the coding guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code where necessary
  • My changes generate no new warnings
  • I have checked my code for spelling errors
  • PR title follows conventional commits format

Additional Notes

  • This is a non-breaking, additive change — the existing operations_count field is preserved, and the new operations field is simply added to the same dict
  • The Rust PathOperation enum already had all the data; only the Python serialization in path_to_py_dict() needed updating (~40 lines)
  • Coordinate values are f32 as they come from the Rust struct, matching the existing bbox and stroke_width precision
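
One practical consequence of f32 coordinates: once widened to a Python float, a value such as 0.1 no longer compares equal to the literal 0.1, so geometry comparisons are safer with a tolerance. The sketch below simulates the f32 round trip with the standard library; it assumes the binding widens f32 to a Python float without further rounding, and `as_f32` and `same_coord` are hypothetical helper names.

```python
import math
import struct


def as_f32(value: float) -> float:
    """Round a Python float to the nearest f32, then widen it back to a float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def same_coord(a: float, b: float, tol: float = 1e-6) -> bool:
    """Compare two coordinates with a tolerance suited to f32 precision."""
    return math.isclose(a, b, rel_tol=tol, abs_tol=tol)


x = as_f32(0.1)
print(x == 0.1)            # False: the f32 value differs from the f64 literal
print(same_coord(x, 0.1))  # True: equal within tolerance
print(as_f32(0.5) == 0.5)  # True: 0.5 is exactly representable in f32
```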

Add 'operations' field to the path dict returned by extract_paths(),
extract_lines(), and extract_rects() Python bindings.

Each operation is a dict with 'op' key and coordinates:
- move_to: {op, x, y}
- line_to: {op, x, y}
- curve_to: {op, cx1, cy1, cx2, cy2, x, y}
- rectangle: {op, x, y, width, height}
- close_path: {op}

This enables vector extraction use cases (snap-to-geometry, CAD overlays)
that need actual path coordinates, not just bounding boxes.

Relates to yfedoseev#147