Open
Labels
- needs-triage: New issue that hasn't been reviewed/prioritized yet
- task: General work item (implementation, setup, cleanup) – most common label
Milestone
Description
Implement a simple uniform/circular replay buffer and basic checkpoint save/load for Q-table + metadata. This enables experience sharing between edge episodes and central training.
Why: Replay is essential for off-policy learning; checkpoints allow policy transfer edge ↔ central.
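One possible shape for the buffer, as a minimal sketch rather than the final design. The method names follow the acceptance criteria in this issue; the `seed` argument, the generic item type, and sampling without replacement are assumptions made for illustration.

```python
import random
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class UniformReplay(Generic[T]):
    """Fixed-capacity circular buffer with uniform random sampling.

    Args:
        max_size: Maximum number of items kept; the oldest are overwritten.
        seed: Optional seed for reproducible sampling (assumed parameter).
    """

    def __init__(self, max_size: int, seed: Optional[int] = None) -> None:
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self.max_size = max_size
        self._items: List[T] = []
        self._next = 0  # index the next append will write to
        self._rng = random.Random(seed)

    def append(self, item: T) -> None:
        """Adds an item, overwriting the oldest one once the buffer is full."""
        if len(self._items) < self.max_size:
            self._items.append(item)
        else:
            self._items[self._next] = item
        self._next = (self._next + 1) % self.max_size

    def sample(self, batch_size: int) -> List[T]:
        """Returns batch_size distinct items drawn uniformly at random."""
        if batch_size > len(self._items):
            raise ValueError("batch_size exceeds number of stored items")
        return self._rng.sample(self._items, batch_size)

    def __len__(self) -> int:
        return len(self._items)


# Usage: capacity 3, five appends, so the two oldest items are overwritten.
buf = UniformReplay(max_size=3, seed=0)
for t in range(5):
    buf.append(t)  # a real transition would be e.g. (state, action, reward, next_state)
batch = buf.sample(2)
```

Raising on an oversized `batch_size` is one choice; returning a smaller batch would also be defensible and should be settled before the tests are written.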
Type
- Task
Focus Area (pick one)
- Shared Utils & Models
Priority
- High
Acceptance Criteria
- UniformReplay class (append, sample(batch_size), len, max_size control)
- Checkpoint serialization: save/load Q-table (numpy .npz) + config + episode stats
- Save → load round-trip preserves Q-values (float32 tolerance)
- Tests: buffer overflow behavior, sampling uniformity, checkpoint integrity
- Google-style docstrings on public API
- Located in shared/src/learning/q_learning/replay/ and io/serialization.py
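The checkpoint criteria above could be met along these lines. This is a sketch under assumptions: `save_checkpoint` and `load_checkpoint` are hypothetical names, and config and episode stats are assumed to arrive as JSON-serializable dicts (for a Pydantic v2 model, the output of `model_dump()`). Metadata is stored as JSON strings inside the .npz so loading does not require pickle.

```python
import json
from typing import Any, Dict, Tuple

import numpy as np


def save_checkpoint(
    path: str,
    q_table: np.ndarray,
    config: Dict[str, Any],
    episode_stats: Dict[str, Any],
) -> None:
    """Writes Q-table plus metadata to a single .npz file.

    Args:
        path: Target file path, expected to end in ".npz".
        q_table: Q-values; stored as float32.
        config: JSON-serializable configuration dict (assumed shape).
        episode_stats: JSON-serializable episode statistics (assumed shape).
    """
    np.savez(
        path,
        q_table=np.asarray(q_table, dtype=np.float32),
        # 0-d string arrays keep the archive loadable with allow_pickle=False.
        config_json=np.array(json.dumps(config)),
        episode_stats_json=np.array(json.dumps(episode_stats)),
    )


def load_checkpoint(path: str) -> Tuple[np.ndarray, Dict[str, Any], Dict[str, Any]]:
    """Reads a checkpoint written by save_checkpoint.

    Returns:
        A (q_table, config, episode_stats) tuple.
    """
    with np.load(path, allow_pickle=False) as data:
        q_table = data["q_table"]
        config = json.loads(data["config_json"].item())
        episode_stats = json.loads(data["episode_stats_json"].item())
    return q_table, config, episode_stats
```

A round-trip test can then compare the loaded table against the original with `np.allclose` at a float32-appropriate tolerance.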
Blocker / Dependencies
- [Shared Utils & Models] Create Q-Learning config & types (Pydantic v2)
Notes / Links
- Future extension: prioritized replay in same module
Metadata
Assignees
Projects
Status
Manual QA Testing