@nwoodruff-co
Contributor

Summary

This PR enhances the entity mapping system to support custom value arrays and multiple aggregation methods, making it more flexible for complex analysis workflows.

Changes

Core enhancements:

  • Added values parameter to map custom value arrays instead of requiring existing columns
  • Extended how parameter with new aggregation methods:
    • Person → Group: sum (default), first
    • Group → Person: project (default), divide
    • Group → Group: sum, first, project, divide
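To make the four method names concrete, here is a minimal sketch of what each one computes on toy data. It is not the implementation in this PR: `person_to_group`, `group_to_person`, and the sample households are hypothetical names for illustration only, and the sketch uses plain Python rather than the package's data structures.

```python
from collections import Counter


def person_to_group(values, group_ids, how="sum"):
    """Aggregate per-person values to one value per group (illustrative only)."""
    if how not in ("sum", "first"):
        raise ValueError(f"Unsupported how={how!r} for person -> group")
    result = {}
    for value, group in zip(values, group_ids):
        if how == "sum":
            # Add every member's value to the group's running total.
            result[group] = result.get(group, 0) + value
        else:
            # Keep only the value of the first member seen in each group.
            result.setdefault(group, value)
    return result


def group_to_person(group_values, group_ids, how="project"):
    """Expand one value per group to one value per person (illustrative only)."""
    if how not in ("project", "divide"):
        raise ValueError(f"Unsupported how={how!r} for group -> person")
    sizes = Counter(group_ids)  # number of persons in each group
    if how == "project":
        # Every member receives the full group value.
        return [group_values[g] for g in group_ids]
    # Every member receives an equal share of the group value.
    return [group_values[g] / sizes[g] for g in group_ids]


# Hypothetical data: five persons in three households.
person_household = ["h1", "h1", "h2", "h3", "h3"]
person_income = [10, 20, 5, 7, 3]
household_rent = {"h1": 900, "h2": 400, "h3": 600}

income_sum = person_to_group(person_income, person_household, how="sum")
income_first = person_to_group(person_income, person_household, how="first")
rent_projected = group_to_person(household_rent, person_household, how="project")
rent_divided = group_to_person(household_rent, person_household, how="divide")
```

Note that `project` repeats the group value for each member, so summing the projected values back up double counts, while `divide` preserves the group total.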

Refactoring:

  • Created base YearData class in src/policyengine/core/dataset.py
  • UKYearData and USYearData now inherit from base class
  • Eliminated duplicate map_to_entity implementations

Documentation:

  • Added comprehensive entity mapping section to docs/core-concepts.md
  • Added examples to UK and US model documentation
  • Documented all aggregation methods with practical use cases

Testing

All existing tests pass, confirming backward compatibility:

  • tests/test_entity_mapping.py: 9 tests passing
  • tests/test_us_entity_mapping.py: 5 tests passing

Use cases

  1. Custom value mapping: Map imputed or calculated values that don't exist as columns
  2. Equal splitting: Divide household-level values equally among persons
  3. Representative values: Take the first person's value as the group's representative
  4. Flexible aggregation: Control how values combine across entity levels

Example usage

# Map custom per-person values to household level
household_totals = dataset.data.map_to_entity(
    source_entity="person",
    target_entity="household",
    values=custom_values_array,
    how="sum"
)

# Split household rent equally among persons
person_rent_share = dataset.data.map_to_entity(
    source_entity="household",
    target_entity="person",
    columns=["rent"],
    how="divide"
)

Closes #183

nwoodruff-co and others added 6 commits November 17, 2025 11:59
…alues

Add support for custom values and multiple aggregation methods to the entity
mapping system, making it more flexible for complex analysis workflows.

Features added:
- values parameter: Map custom value arrays instead of existing columns
- Extended how parameter with new aggregation methods:
  * Person → Group: 'sum' (default), 'first'
  * Group → Person: 'project' (default), 'divide'
  * Group → Group: 'sum', 'first', 'project', 'divide'

Refactoring:
- Created base YearData class to eliminate code duplication
- UKYearData and USYearData now inherit from base class
- Removed duplicate map_to_entity implementations

Documentation:
- Added comprehensive entity mapping section to core-concepts.md
- Added examples to UK and US model documentation
- Documented all aggregation methods with use cases

All existing tests pass, confirming backward compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add comprehensive guides for AI assistants to use policyengine.py:
- .claude/policyengine-guide.md: Detailed patterns and examples
- .claude/quick-reference.md: Quick lookup for common operations

Includes:
- 7 common workflow patterns (synthetic scenarios, parameter sweeps, reforms)
- Minimal working examples for UK and US
- Entity mapping examples with all aggregation methods
- Critical fields reference
- Common parameters cheat sheet
- Troubleshooting guide

These guides help AI assistants quickly understand and use the package
for tax-benefit microsimulation analysis.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add convenience methods to look up parameters and variables by name:
- get_parameter(name): Returns Parameter object by name
- get_variable(name): Returns Variable object by name
- Both raise ValueError with a helpful error message if the name is not found

Tests added (12 tests, all passing):
- UK and US variable lookup tests
- UK and US parameter lookup tests
- Error handling tests for non-existent parameters/variables
- Multiple parameter/variable lookup tests

Usage:
  var = uk_latest.get_variable('income_tax')
  param = uk_latest.get_parameter('gov.hmrc.income_tax.allowances.personal_allowance.amount')
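
One way a name lookup with a helpful error could look is sketched below. This is not the package's code: the `Variable` and `ModelVersion` classes here are stand-ins, and the "did you mean" suggestion via `difflib` is an assumption about what "helpful" might mean, not something the commit confirms.

```python
from dataclasses import dataclass, field
from difflib import get_close_matches


@dataclass
class Variable:
    """Stand-in for the package's Variable object (hypothetical)."""
    name: str


@dataclass
class ModelVersion:
    """Stand-in for a model version such as uk_latest (hypothetical)."""
    variables: list = field(default_factory=list)

    def get_variable(self, name):
        # Return the first variable whose name matches exactly.
        for variable in self.variables:
            if variable.name == name:
                return variable
        # Otherwise fail loudly and suggest near matches, if any exist.
        known = [variable.name for variable in self.variables]
        suggestions = get_close_matches(name, known, n=3)
        hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
        raise ValueError(f"Variable {name!r} not found.{hint}")


model = ModelVersion([Variable("income_tax"), Variable("national_insurance")])
found = model.get_variable("income_tax")

try:
    model.get_variable("income_tx")
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

A `get_parameter` method would follow the same pattern over the parameter list.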

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@nwoodruff-co nwoodruff-co merged commit de2d3ce into main Nov 17, 2025
3 checks passed
@nwoodruff-co nwoodruff-co deleted the enhance-entity-mapping-flexibility branch November 17, 2025 13:03

Development

Successfully merging this pull request may close these issues.

Enhance entity mapping with flexible aggregation methods

2 participants