Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
53 changes: 53 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# PolicyEngine Core - Development Guide 📚

## Common Commands
- `make all`: Full development cycle (install, format, test, build)
- `make install`: Install package in dev mode with dependencies
- `make format`: Format code with Black (79 char line limit)
- `make test`: Run all tests with coverage
- `make mypy`: Run type checking
- `pytest tests/path/to/test_file.py::test_function_name -v`: Run single test

## Code Style Guidelines
- **Formatting**: Black with 79 character line length
- **Imports**: Standard library, third-party, local modules (standard Python grouping)
- **Types**: Use type annotations, checked with mypy
- **Naming**: snake_case for functions/variables, CamelCase for classes
- **Error Handling**: Use custom error classes from policyengine_core/errors/
- **Testing**: All new features need tests, both unit and YAML-based tests supported
- **PRs**: Include "Fixes #{issue}" in description, add changelog entry, pass all checks
- **Changelog**: Update changelog_entry.yaml with proper bump type and change description
- **Dependencies**: All versions must be capped to avoid breaking changes
- **Python**: 3.10+ required

## Git Workflow
- First push to a new branch: `git push -u origin feature/branch-name` to set up tracking
- Subsequent pushes: just use `git push` to update the same PR
- Always run `make format` before committing to ensure code passes style checks

## PR Reviews
- Check PR comments with: `gh pr view <PR_NUMBER> --repo PolicyEngine/policyengine-core`
- Get review comments with: `gh api repos/PolicyEngine/policyengine-core/pulls/<PR_NUMBER>/comments`
- Address all reviewer feedback before merging
- Add "Claude Code wrote this comment" to GitHub comments written by Claude
- Follow Clean Code principles when refactoring:
- Keep functions small and focused on a single task
- Avoid deeply nested conditionals and exceptions
- Extract complex logic into well-named helper functions
- Minimize duplication and optimize for readability
- Use consistent error handling patterns
- Avoid too many levels of indentation
- Prefer clear, well-named functions over comments
- Ensure 100% test coverage for error handling code
- Separate concerns with dedicated helper functions

## Package Architecture
- **Parameter System**: Core framework for tax-benefit system parameters
- Parameters organized in a hierarchical tree (accessible via dot notation)
- Parameters can be scalar values or bracket-based scales
- Supports time-varying values with date-based lookup
- **Errors**: Custom error classes in `policyengine_core/errors/` for specific failures
- **Entities**: Represents different units (person, household, etc.) in microsimulations
- **Variables**: Calculations that can be performed on entities
- **Testing**: Supports both Python tests and YAML-based test files
- **Country Template**: Reference implementation of a tax-benefit system
9 changes: 9 additions & 0 deletions changelog_entry.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
- bump: minor
changes:
added:
- Added ParameterPathError class for specific error handling in parameter operations
changed:
- Completely refactored get_parameter implementation for better readability and maintainability
- Improved error messages with specific details and suggestions for parameter path resolution issues
- Separated parameter path handling into focused helper functions
- Added comprehensive test cases for all error scenarios
1 change: 1 addition & 0 deletions policyengine_core/errors/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
from .nan_creation_error import NaNCreationError
from .parameter_not_found_error import ParameterNotFoundError
from .parameter_parsing_error import ParameterParsingError
from .parameter_path_error import ParameterPathError
from .period_mismatch_error import PeriodMismatchError
from .situation_parsing_error import SituationParsingError
from .spiral_error import SpiralError
Expand Down
21 changes: 21 additions & 0 deletions policyengine_core/errors/parameter_path_error.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
class ParameterPathError(ValueError):
"""
Exception raised when there's an error in parameter path resolution.

This includes:
- Parameter not found errors
- Invalid bracket syntax errors
- Invalid bracket index errors
- Attempted bracket access on non-bracket parameters
"""

def __init__(self, message, parameter_path=None, failed_at=None):
"""
Args:
message (str): The error message.
parameter_path (str, optional): The full parameter path that was being accessed.
failed_at (str, optional): The specific component in the path where the failure occurred.
"""
self.parameter_path = parameter_path
self.failed_at = failed_at
super(ParameterPathError, self).__init__(message)
204 changes: 188 additions & 16 deletions policyengine_core/parameters/operations/get_parameter.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
from policyengine_core.parameters.parameter import Parameter
from policyengine_core.parameters.parameter_node import ParameterNode
from policyengine_core.errors.parameter_path_error import ParameterPathError


def get_parameter(root: ParameterNode, parameter: str) -> Parameter:
Expand All @@ -13,21 +14,192 @@ def get_parameter(root: ParameterNode, parameter: str) -> Parameter:
Parameter: The parameter.
"""
node = root
for name in parameter.split("."):

for path_component in parameter.split("."):
node = _navigate_to_node(node, path_component, parameter)

return node


def _navigate_to_node(node, path_component, full_path):
Copy link
Collaborator

@mikesmit mikesmit Mar 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion I was expecting the following logic, this is pretty different and I think has some bugs:

  1. split the parameter into components by "."
  2. for each component attempt to find the child node matching that component
    • if the current node has no children we can't possibly find any child with the specified name so error and exit.
    • use a regex to confirm it is a variable name followed by an optional index in brackets and to parse out the name and the index as separate groups.
    • now you should have a variable name and an optional integer index.
    • now look to see if the current node has a child with the appropriate name
      • if not, generate an error that lists possible matches based on the text of the variable name and what children do exist.
    • if the optional index is None, then return the child directly.
    • otherwise confirm the matching child actually is an array
      • if not again generate an error indicating that the parameter exists but does not support indexing
      • if the index is out of range generate an error ....
      • otherwise return the indexed child.

As per clean code this code should be decomposed into relatively small, logical, clearly named methods.

For example:

def _find_child(node, path_component,...):
   name, index = self._parse_name(path_component)
   child = self._find_matching_child(node, name)
   if index is None:
      return child
   return self._get_indexed_value(child, index)

"""Navigate to the next node in the parameter tree."""
if hasattr(node, "brackets") and "[" in path_component:
# Handle case where we're already at a scale parameter and need to access a bracket
return _handle_bracket_access(node, path_component, full_path)

Comment on lines +26 to +29
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking, unclear maybe bug This doesn't make sense. If I parse a path name "parent[1].child[2]" and parent[1] resolves to something, when I try to resolve 'child[2]'

  • 'node' would be the parent and it would have brackets.
  • the path_component would be 'child[2]'
  • we'd go into _handle_bracket_access
  • which .. would then fail because it expects the path to start and end with brackets.

What case would make sense for a segment to just be an index in brackets? Segments have to be separated by '.' so wouldn't that be "parent[1].[2]"?

Either way I didn't see this supported in the original code you are refactoring. What functionality are you trying to replicate or add here?

# For regular node navigation (no brackets)
if not hasattr(node, "children"):
raise ParameterPathError(
f"Cannot access '{path_component}' in '{full_path}'. "
"The parent is not a parameter node with children.",
parameter_path=full_path,
failed_at=path_component,
)

# Find the child node using the path component
return _find_child(node, path_component, full_path)


def _find_child(node, path_component, full_path):
"""
Process a path component to find a child node, including bracket access if needed.

Args:
node: The parent node to search in
path_component: The path component to process
full_path: The full parameter path being accessed (for error reporting)

Returns:
The found node after processing the path component

Raises:
ParameterPathError: When the child cannot be found or other errors occur
"""
# Parse the path component into name part and optional bracket part
name_part, index = _parse_path_component(path_component, full_path)

# Find the matching child by name
child_node = _find_matching_child(node, name_part, full_path)

# If we have a bracket index, access the brackets property
if index is not None:
return _access_indexed_value(
child_node, name_part, index, path_component, full_path
)

return child_node


def _find_matching_child(node, name_part, full_path):
"""
Find a child node by name, providing helpful error messages when not found.

Args:
node: The parent node to search in
name_part: The name of the child to find
full_path: The full parameter path being accessed (for error reporting)

Returns:
The found child node

Raises:
ParameterPathError: When the child cannot be found, with helpful suggestions
"""
try:
return node.children[name_part]
except KeyError:
suggestions = _find_similar_parameters(node, name_part)
suggestion_text = (
f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
)
raise ParameterPathError(
f"Parameter '{name_part}' not found in path '{full_path}'.{suggestion_text}",
parameter_path=full_path,
failed_at=name_part,
)


def _handle_bracket_access(node, path_component, full_path):
"""Handle bracket access on an existing ParameterScale object."""
# Extract the bracket index from the path component
if not path_component.startswith("[") or not path_component.endswith("]"):
raise ParameterPathError(
f"Invalid bracket syntax at '{path_component}' in parameter path '{full_path}'. "
"Use format: [index], e.g., [0]",
parameter_path=full_path,
failed_at=path_component,
)

try:
index = int(path_component[1:-1])
except ValueError:
raise ParameterPathError(
f"Invalid bracket index in '{path_component}' of parameter path '{full_path}'. "
"Index must be an integer.",
parameter_path=full_path,
failed_at=path_component,
)

try:
return node.brackets[index]
except IndexError:
max_index = len(node.brackets) - 1
raise ParameterPathError(
f"Bracket index out of range in '{path_component}' of parameter path '{full_path}'. "
f"Valid indices are 0 to {max_index}.",
parameter_path=full_path,
failed_at=path_component,
)


def _parse_path_component(path_component, full_path):
"""Parse a path component into name and optional bracket index."""
if "[" not in path_component:
return path_component, None

try:
parts = path_component.split(
"[", 1
) # Split only on the first occurrence of "["
name_part = parts[0]
bracket_part = (
"[" + parts[1]
) # Include the "[" for consistent reporting

if not bracket_part.endswith("]"):
raise ParameterPathError(
f"Invalid bracket syntax at '{path_component}' in parameter path '{full_path}'. "
"Use format: parameter_name[index], e.g., brackets[0].rate",
parameter_path=full_path,
failed_at=name_part
+ bracket_part, # Use the original bracket part for error reporting
)

try:
if "[" not in name:
node = node.children[name]
else:
try:
name, index = name.split("[")
index = int(index[:-1])
node = node.children[name].brackets[index]
except:
raise ValueError(
"Invalid bracket syntax (should be e.g. tax.brackets[3].rate"
)
except:
raise ValueError(
f"Could not find the parameter {parameter} (failed at {name})."
index = int(bracket_part[1:-1]) # Remove both "[" and "]"
return name_part, index
except ValueError:
raise ParameterPathError(
f"Invalid bracket index in '{path_component}' of parameter path '{full_path}'. "
"Index must be an integer.",
parameter_path=full_path,
failed_at=name_part + bracket_part,
)
return node

except ValueError: # More than one "[" found
raise ParameterPathError(
f"Invalid bracket syntax at '{path_component}' in parameter path '{full_path}'. "
"Use format: parameter_name[index], e.g., brackets[0].rate",
parameter_path=full_path,
failed_at=path_component,
)


def _access_indexed_value(node, name_part, index, path_component, full_path):
"""Access a bracket by index on a node."""
if not hasattr(node, "brackets"):
raise ParameterPathError(
f"'{name_part}' in parameter path '{full_path}' does not support bracket indexing. "
"Only scale parameters (like brackets) support indexing.",
parameter_path=full_path,
failed_at=path_component,
)

try:
return node.brackets[index]
except IndexError:
max_index = len(node.brackets) - 1
raise ParameterPathError(
f"Bracket index out of range in '{path_component}' of parameter path '{full_path}'. "
f"Valid indices are 0 to {max_index}.",
parameter_path=full_path,
failed_at=path_component,
)


def _find_similar_parameters(node, name):
"""Find parameter names similar to the requested one."""
if not hasattr(node, "children"):
return []

return [key for key in node.children.keys() if name.lower() in key.lower()]
Loading