[Shared] Create Q-Learning config & types (Pydantic v2) #25

@saviornt

Description

Implement the foundational configuration and data models for the Q-Learning component using Pydantic v2. This includes hyperparameter models (general + tabular-specific), transition tuples, episode statistics, and checkpoint metadata stubs. All models must be fully typed, documented with Google-style docstrings, and support validation + serialization.

Why: Establishes strong typing and config safety across shared, appliance, and assistant code to prevent drift.
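As a starting point for discussion, the data models named above might look roughly like the following in types.py. This is only a sketch: every field name (`state`, `next_state`, `total_reward`, `schema_version`, and so on) is an assumption, since the issue does not specify them, and integer state/action indices are assumed because the component is tabular.

```python
from datetime import datetime, timezone

from pydantic import BaseModel, ConfigDict, Field


class Transition(BaseModel):
    """One environment step, (s, a, r, s', done), for a tabular agent.

    Attributes:
        state: Index of the state the action was taken in.
        action: Index of the action taken.
        reward: Scalar reward received for the step.
        next_state: Index of the resulting state.
        done: Whether the episode ended on this step.
    """

    # Immutable, and unknown keys are rejected instead of silently ignored.
    model_config = ConfigDict(frozen=True, extra="forbid")

    state: int = Field(ge=0)
    action: int = Field(ge=0)
    reward: float
    next_state: int = Field(ge=0)
    done: bool = False


class EpisodeStats(BaseModel):
    """Summary statistics for one finished episode (hypothetical fields)."""

    model_config = ConfigDict(extra="forbid")

    episode: int = Field(ge=0)
    total_reward: float
    steps: int = Field(ge=0)
    epsilon: float = Field(ge=0.0, le=1.0)


class CheckpointMetadata(BaseModel):
    """Stub describing a saved Q-table; fields are placeholders."""

    model_config = ConfigDict(extra="forbid")

    schema_version: int = Field(default=1, ge=1)
    created_at: datetime = Field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    episodes_trained: int = Field(default=0, ge=0)


# JSON round-trip: serialize, then validate the JSON back into a model.
step = Transition(state=3, action=1, reward=-1.0, next_state=4)
restored = Transition.model_validate_json(step.model_dump_json())
```

Freezing `Transition` is a design choice rather than a requirement of the issue; it keeps stored experience from being mutated after the fact.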

Type

  • Task

Focus Area (pick one)

  • Shared Utils & Models

Priority

  • Critical

Acceptance Criteria

  • QLearningHyperparams BaseModel with all core hyperparameters (alpha, gamma, epsilon schedule, etc.)
  • TabularQConfig extending it with n_states, n_actions
  • Transition and EpisodeStats models with proper typing (replace Any with constrained types where possible)
  • All models pass mypy in strict mode (no Any leaks, full annotation coverage)
  • Unit tests cover creation, validation errors, JSON round-trip, and field constraints
  • Google-style docstrings on every class and important field
  • Files created: shared/src/learning/q_learning/config.py and types.py
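To make the criteria above concrete, here is a minimal sketch of what config.py could contain, followed by the kind of checks the unit tests might make. The default values, the exponential-decay epsilon fields (`epsilon_start`, `epsilon_end`, `epsilon_decay`), and the cross-field validator are assumptions for illustration; the issue only names alpha, gamma, an epsilon schedule, n_states, and n_actions.

```python
from pydantic import (
    BaseModel,
    ConfigDict,
    Field,
    ValidationError,
    model_validator,
)
from typing_extensions import Self


class QLearningHyperparams(BaseModel):
    """Core Q-Learning hyperparameters.

    Attributes:
        alpha: Learning rate, in (0, 1].
        gamma: Discount factor, in [0, 1].
        epsilon_start: Initial exploration rate (assumed schedule field).
        epsilon_end: Final exploration rate (assumed schedule field).
        epsilon_decay: Per-episode multiplicative decay (assumed).
    """

    model_config = ConfigDict(frozen=True, extra="forbid")

    alpha: float = Field(default=0.1, gt=0.0, le=1.0)
    gamma: float = Field(default=0.99, ge=0.0, le=1.0)
    epsilon_start: float = Field(default=1.0, ge=0.0, le=1.0)
    epsilon_end: float = Field(default=0.05, ge=0.0, le=1.0)
    epsilon_decay: float = Field(default=0.995, gt=0.0, le=1.0)

    @model_validator(mode="after")
    def _check_epsilon_schedule(self) -> Self:
        """Reject schedules whose floor is above their starting value."""
        if self.epsilon_end > self.epsilon_start:
            raise ValueError("epsilon_end must not exceed epsilon_start")
        return self


class TabularQConfig(QLearningHyperparams):
    """Hyperparameters plus the table dimensions for tabular Q-Learning.

    Attributes:
        n_states: Number of discrete states (rows of the Q-table).
        n_actions: Number of discrete actions (columns of the Q-table).
    """

    n_states: int = Field(gt=0)
    n_actions: int = Field(gt=0)


# Creation and JSON round-trip.
config = TabularQConfig(n_states=16, n_actions=4, alpha=0.5)
restored = TabularQConfig.model_validate_json(config.model_dump_json())

# A field constraint violation surfaces as a ValidationError.
try:
    TabularQConfig(n_states=0, n_actions=4)
    rejected_zero_states = False
except ValidationError:
    rejected_zero_states = True
```

In a real test module these checks would more likely be written with `pytest.raises(ValidationError)`; the plain try/except here just keeps the sketch free of a test-framework dependency.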

Blocker / Dependencies

None

Notes / Links

  • Related files: shared/src/learning/q_learning/
  • Pydantic v2 best practices (use Field for defaults/constraints)
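One way to read the `Field` note above is to keep shared constraints in reusable `Annotated` aliases and attach defaults and descriptions per field. The sketch below assumes that approach; the alias names `UnitInterval` and `PositiveCount` and the `ExampleConfig` model are hypothetical and not part of the issue.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError

# Hypothetical reusable constrained types.
UnitInterval = Annotated[float, Field(ge=0.0, le=1.0)]
PositiveCount = Annotated[int, Field(gt=0)]


class ExampleConfig(BaseModel):
    """Illustrative model combining aliases with per-field metadata.

    Attributes:
        gamma: Discount factor, constrained to [0, 1].
        n_actions: Size of the discrete action space; required.
    """

    gamma: UnitInterval = Field(default=0.99, description="Discount factor.")
    n_actions: PositiveCount = Field(
        description="Size of the discrete action space."
    )


# Constraints and descriptions are carried into the generated JSON Schema.
schema = ExampleConfig.model_json_schema()
gamma_schema = schema["properties"]["gamma"]

# Out-of-range values are rejected at construction time.
try:
    ExampleConfig(gamma=1.5, n_actions=2)
    out_of_range_rejected = False
except ValidationError:
    out_of_range_rejected = True
```

Defining the bounds once in an alias means `alpha`, `gamma`, and the epsilon fields could share the same range check without repeating `ge`/`le` on each field.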

Metadata

Assignees

Labels

  • needs-triage: New issue that hasn't been reviewed/prioritized yet
  • task: General work item (implementation, setup, cleanup); most common label

Projects

Status

Manual QA Testing

Relationships

None yet

Development

No branches or pull requests
