[Appliance Core] Build async Q-Learning inference/learning loop with MQTT reporting #29

## Description

Create the main asynchronous event loop for the edge Q-Learning agent. Each iteration runs: observe state → select action → execute (dummy for now) → store transition. Learning runs periodically on stored transitions, and episode stats are published via MQTT.

Why: This loop is the core of edge autonomy, so it must be non-blocking and observable.
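The loop described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the intended implementation: `TabularAgent`, `observe`, `execute`, `run_loop`, the state/action sizes, and all hyperparameters are hypothetical placeholders for the shared agent, replay, and environment stubs.

```python
import asyncio
import random
from collections import deque

N_STATES, N_ACTIONS = 4, 2  # assumed sizes for the dummy environment


class TabularAgent:
    """Hypothetical epsilon-greedy tabular Q-Learning agent."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=1.0,
                 epsilon_min=0.05, epsilon_decay=0.99):
        self.q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.epsilon_min = epsilon, epsilon_min
        self.epsilon_decay = epsilon_decay

    def select_action(self, state):
        if random.random() < self.epsilon:
            return random.randrange(N_ACTIONS)  # explore
        row = self.q[state]
        return row.index(max(row))  # exploit

    def learn(self, batch):
        for s, a, r, s2 in batch:
            target = r + self.gamma * max(self.q[s2])
            self.q[s][a] += self.alpha * (target - self.q[s][a])
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)


async def observe():
    await asyncio.sleep(0)  # placeholder for real telemetry; yields to the loop
    return random.randrange(N_STATES)


async def execute(state, action):
    await asyncio.sleep(0)  # placeholder for a real device action
    reward = 1.0 if action == state % N_ACTIONS else 0.0
    return reward, random.randrange(N_STATES)


async def run_loop(agent, stop, stats_out, steps_per_episode=20,
                   batch_size=8, learn_every=5):
    replay = deque(maxlen=1000)
    state = await observe()
    step, episode, episode_reward = 0, 0, 0.0
    while not stop.is_set():
        action = agent.select_action(state)
        reward, next_state = await execute(state, action)
        replay.append((state, action, reward, next_state))
        state = next_state
        step += 1
        episode_reward += reward
        if step % learn_every == 0 and len(replay) >= batch_size:
            agent.learn(random.sample(list(replay), batch_size))
        if step % steps_per_episode == 0:
            episode += 1
            stats_out.put_nowait({
                "episode": episode,
                "steps": steps_per_episode,
                "avg_reward": episode_reward / steps_per_episode,
                "epsilon": agent.epsilon,
            })
            episode_reward = 0.0


async def main(run_for=0.1):
    stop = asyncio.Event()
    stats = asyncio.Queue()
    agent = TabularAgent()
    task = asyncio.create_task(run_loop(agent, stop, stats))
    await asyncio.sleep(run_for)  # a real service would wait for a signal instead
    stop.set()                    # graceful shutdown: loop exits after current step
    await task
    return agent, stats


if __name__ == "__main__":
    asyncio.run(main())
```

In a long-running service, one option on Linux is to register `stop.set` with `loop.add_signal_handler` for `SIGINT` and `SIGTERM` instead of using a timed run.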

## Type

- [x] Feature

## Focus Area (pick one)

- [x] Appliance Core (Pi edge) / MQTT & Comms

## Priority

- [x] Critical

## Acceptance Criteria

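- [ ] Async loop built on `asyncio` that runs indefinitely and shuts down gracefully
- [ ] Publishes episode summaries to MQTT (e.g. topic `network-chan/edge/qlearn/stats`)
- [ ] Integrates the shared replay, agent, and safety stubs
- [ ] Logs key metrics (epsilon, average reward, steps)
- [ ] No blocking I/O in the main loop
- [ ] Uses paho-mqtt without blocking the event loop (its client is callback/thread based rather than asyncio-native, so use the threaded network loop via `loop_start()` or an asyncio wrapper such as aiomqtt)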
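One way to satisfy the publishing, logging, and non-blocking criteria together is a separate publisher task that drains a queue of episode summaries. The sketch below is an assumption-laden illustration: `publish_stats`, `demo`, and the `publish_fn` parameter are hypothetical names, and the `None` shutdown sentinel is a design choice made here, not something the issue prescribes. In practice `publish_fn` might wrap `client.publish(topic, payload)` on a paho-mqtt client whose network loop was started with `loop_start()`.

```python
import asyncio
import json
import logging
from typing import Callable

log = logging.getLogger("qlearn.stats")

STATS_TOPIC = "network-chan/edge/qlearn/stats"  # example topic from the criteria


async def publish_stats(queue: asyncio.Queue,
                        publish_fn: Callable[[str, str], object]) -> None:
    """Drain episode summaries and publish each one off the event loop."""
    while True:
        summary = await queue.get()
        try:
            if summary is None:  # assumed sentinel meaning "shut down"
                return
            log.info("episode=%s steps=%s avg_reward=%.3f epsilon=%.3f",
                     summary["episode"], summary["steps"],
                     summary["avg_reward"], summary["epsilon"])
            payload = json.dumps(summary)
            # Run the possibly blocking publish call in a worker thread.
            await asyncio.to_thread(publish_fn, STATS_TOPIC, payload)
        except Exception:
            # A failed publish should not take down the learning loop.
            log.exception("failed to publish episode summary")
        finally:
            queue.task_done()


async def demo():
    sent = []  # stands in for a real MQTT client in this sketch
    queue: asyncio.Queue = asyncio.Queue()
    publisher = asyncio.create_task(
        publish_stats(queue, lambda topic, payload: sent.append((topic, payload)))
    )
    queue.put_nowait({"episode": 1, "steps": 20, "avg_reward": 0.5, "epsilon": 0.9})
    queue.put_nowait(None)
    await publisher
    return sent


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the publisher behind a queue means a slow or unreachable broker delays only the stats, not the observe/act/learn cycle.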

## Blocker / Dependencies

- [Appliance Core] Implement dummy environment & basic Q-Learning agent

## Notes / Links

- Later: replace dummy observe/execute with real telemetry/Netmiko
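
To make the later swap easier, the loop could depend on a small environment interface rather than on the dummy functions directly. The following is only a sketch: `Environment`, `DummyEnvironment`, `BlockingBackendEnvironment`, and `one_step` are hypothetical names, and the reward rule is arbitrary. The wrapper assumes the real backend exposes blocking calls (as an SSH-based library typically does) and moves them off the event loop with `asyncio.to_thread`.

```python
import asyncio
import random
from typing import Callable, Protocol, Tuple


class Environment(Protocol):
    """Interface the loop would depend on (hypothetical)."""

    async def observe(self) -> int: ...
    async def execute(self, action: int) -> Tuple[float, int]: ...


class DummyEnvironment:
    """Random-transition stand-in used until real telemetry exists."""

    def __init__(self, n_states: int = 4, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._n_states = n_states
        self._state = 0

    async def observe(self) -> int:
        return self._state

    async def execute(self, action: int) -> Tuple[float, int]:
        reward = 1.0 if action == self._state % 2 else 0.0  # arbitrary rule
        self._state = self._rng.randrange(self._n_states)
        return reward, self._state


class BlockingBackendEnvironment:
    """Wraps blocking read/apply callables so they run in worker threads."""

    def __init__(self, read_state: Callable[[], int],
                 apply_action: Callable[[int], Tuple[float, int]]) -> None:
        self._read_state = read_state
        self._apply_action = apply_action

    async def observe(self) -> int:
        return await asyncio.to_thread(self._read_state)

    async def execute(self, action: int) -> Tuple[float, int]:
        return await asyncio.to_thread(self._apply_action, action)


async def one_step(env: Environment, action: int):
    state = await env.observe()
    reward, next_state = await env.execute(action)
    return state, reward, next_state


if __name__ == "__main__":
    print(asyncio.run(one_step(DummyEnvironment(), 0)))
```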
