[Shared Utils & Models] Add basic uniform replay buffer & checkpoint serialization #27

@saviornt

Description

Implement a simple uniform/circular replay buffer and basic checkpoint save/load for Q-table + metadata. This enables experience sharing between edge episodes and central training.

Why: Replay is essential for off-policy learning; checkpoints allow policy transfer edge ↔ central.
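A minimal sketch of the uniform/circular buffer described above, not a final implementation. The method names follow the acceptance criteria in this issue; the `seed` parameter, the private attribute names, and sampling without replacement are assumptions made for illustration.

```python
from __future__ import annotations

import random
from typing import Generic, TypeVar

T = TypeVar("T")


class UniformReplay(Generic[T]):
    """Fixed-capacity circular buffer with uniform random sampling.

    Args:
        max_size: Maximum number of items kept; the oldest are overwritten.
        seed: Optional seed for reproducible sampling (assumed parameter).
    """

    def __init__(self, max_size: int, seed: int | None = None) -> None:
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self.max_size = max_size
        self._items: list[T] = []
        self._next = 0  # slot to overwrite once the buffer is full
        self._rng = random.Random(seed)

    def append(self, item: T) -> None:
        """Adds an item, overwriting the oldest one when the buffer is full."""
        if len(self._items) < self.max_size:
            self._items.append(item)
        else:
            self._items[self._next] = item
        self._next = (self._next + 1) % self.max_size

    def sample(self, batch_size: int) -> list[T]:
        """Returns batch_size items drawn uniformly without replacement."""
        if batch_size > len(self._items):
            raise ValueError("batch_size exceeds number of stored items")
        return self._rng.sample(self._items, batch_size)

    def __len__(self) -> int:
        return len(self._items)
```

In practice the stored items would probably be transition tuples such as `(state, action, reward, next_state, done)`, but the exact type depends on the Q-Learning config & types issue listed under dependencies.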

Type

  • Task

Focus Area (pick one)

  • Shared Utils & Models

Priority

  • High

Acceptance Criteria

  • UniformReplay class (append, sample(batch_size), len, max_size control)
  • Checkpoint serialization: save/load Q-table (numpy .npz) + config + episode stats
  • Save → load round-trip preserves Q-values (float32 tolerance)
  • Tests: buffer overflow behavior, sampling uniformity, checkpoint integrity
  • Google-style docstrings on public API
  • Located in shared/src/learning/q_learning/replay/ and io/serialization.py
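One possible shape for the checkpoint round-trip in the criteria above, assuming the config and episode stats are JSON-serializable dicts. The function names `save_checkpoint` and `load_checkpoint` and the array keys are hypothetical; the real config is expected to be a Pydantic v2 model, which could be converted with `model_dump()` before saving.

```python
import json

import numpy as np


def save_checkpoint(path, q_table, config, episode_stats):
    """Saves a Q-table plus JSON-encoded metadata to a single .npz file.

    Args:
        path: Target file path, ending in ".npz".
        q_table: Array-like Q-values; stored as float32.
        config: JSON-serializable dict of hyperparameters (assumed shape).
        episode_stats: JSON-serializable dict of episode statistics.
    """
    np.savez(
        path,
        q_table=np.asarray(q_table, dtype=np.float32),
        # Metadata is stored as 0-d string arrays so no pickling is needed.
        config_json=np.array(json.dumps(config)),
        stats_json=np.array(json.dumps(episode_stats)),
    )


def load_checkpoint(path):
    """Loads a checkpoint written by save_checkpoint.

    Returns:
        A (q_table, config, episode_stats) tuple.
    """
    with np.load(path, allow_pickle=False) as data:
        q_table = data["q_table"]
        config = json.loads(data["config_json"].item())
        episode_stats = json.loads(data["stats_json"].item())
    return q_table, config, episode_stats
```

A round-trip test could then compare the loaded table against the original with `np.allclose` at a float32-appropriate tolerance. Keeping `allow_pickle=False` on load avoids executing arbitrary pickled objects from a checkpoint received from another node.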

Blocker / Dependencies

  • [Shared Utils & Models] Create Q-Learning config & types (Pydantic v2)

Notes / Links

  • Future extension: prioritized replay in same module

Metadata

Assignees

Labels

  • needs-triage: New issue that hasn't been reviewed/prioritized yet
  • task: General work item (implementation, setup, cleanup) – most common label

Projects

Status

Manual QA Testing

Relationships

None yet

Development

No branches or pull requests
