Description
Epic: #87 — Architectural Simplification v0.6.0
Goal
Enable multiple CLI/MCP processes to coordinate worker spawning through a shared database table, and persist task output for cross-process visibility.
Design Constraints
- Dual tracking: Keep `Map<WorkerId, WorkerState>` in-memory for the single-process fast path (O(1) lookups). Mirror to the `workers` table for cross-process coordination. Same-process operations stay fast; cross-process queries use the DB.
- Orphan row recovery: If a process crashes mid-kill, the worker is dead but the DB row remains. Recovery on startup handles this: check each `owner_pid` with `process.kill(pid, 0)`, DELETE dead rows.
- In-memory/DB boundary: `ChildProcess` handles, timeout timers, and output streams are ephemeral per-process state, so they stay in-memory. The `workers` table stores only serializable coordination data (id, task_id, pid, owner_pid, agent, started_at).
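The orphan-row check described above could look roughly like the sketch below. The `WorkerRow` shape and the `partitionByOwnerLiveness` helper are hypothetical names for illustration; only `process.kill(pid, 0)` comes from the issue text.

```typescript
// Hypothetical row shape; only the fields needed for recovery are shown.
interface WorkerRow {
  id: string;
  taskId: string;
  ownerPid: number;
}

// Signal 0 performs an existence check without delivering a signal.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it: treat as alive.
    // Any other error (typically ESRCH) means no such process.
    return (err as { code?: string }).code === "EPERM";
  }
}

// Splits rows into those whose owner is still running and those that are orphaned.
// The liveness probe is injectable so the logic can be tested without real PIDs.
function partitionByOwnerLiveness(
  rows: WorkerRow[],
  alive: (pid: number) => boolean = isProcessAlive,
): { live: WorkerRow[]; orphaned: WorkerRow[] } {
  const live: WorkerRow[] = [];
  const orphaned: WorkerRow[] = [];
  for (const row of rows) {
    (alive(row.ownerPid) ? live : orphaned).push(row);
  }
  return { live, orphaned };
}
```

One caveat with this approach: the operating system can recycle PIDs, so a row whose `owner_pid` was reused by an unrelated process will look alive until that process exits.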
Sub-tasks
2a. Add workers Table
Schema (migration v9):
CREATE TABLE workers (
  id TEXT PRIMARY KEY,
  task_id TEXT NOT NULL UNIQUE,
  pid INTEGER NOT NULL,
  owner_pid INTEGER NOT NULL, -- PID of the process that spawned this worker
  agent TEXT NOT NULL DEFAULT 'claude',
  started_at INTEGER NOT NULL,
  FOREIGN KEY (task_id) REFERENCES tasks(id) ON DELETE CASCADE
);
CREATE INDEX idx_workers_owner ON workers(owner_pid);

Design principle: This is a coordination registry, not full worker state. The table answers one question: "how many workers exist across all processes?"
File: src/implementations/database.ts — Add migration v9.
2b. Worker Lifecycle Writes
On spawn: INSERT into workers table.
On completion/kill: DELETE from workers table.
On startup (recovery): DELETE stale rows where owner_pid is no longer running.
Files:
- `src/implementations/event-driven-worker-pool.ts`: Add `Database` injection. INSERT after successful spawn, DELETE on completion/kill. Keep the in-memory `Map<WorkerId, WorkerState>` for the single-process fast path.
- `src/services/recovery-manager.ts`: On startup, query the `workers` table and check each `owner_pid` with `process.kill(pid, 0)`. DELETE rows for dead processes. Mark their tasks as FAILED.
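A minimal sketch of the dual-tracking lifecycle, assuming a hypothetical `WorkerStore` port in place of the injected `Database` (the class and method names here are illustrative, not the project's actual API):

```typescript
type WorkerId = string;

// Simplified in-memory state; the real WorkerState also holds ChildProcess
// handles and timers, which never go to the database.
interface WorkerState {
  taskId: string;
  pid: number;
}

// Hypothetical persistence port standing in for the injected Database.
interface WorkerStore {
  insert(row: { id: WorkerId; taskId: string; pid: number; ownerPid: number; startedAt: number }): void;
  delete(id: WorkerId): void;
}

class DualTrackedWorkers {
  private readonly workers = new Map<WorkerId, WorkerState>();
  private readonly store: WorkerStore;
  private readonly ownerPid: number;

  constructor(store: WorkerStore, ownerPid: number) {
    this.store = store;
    this.ownerPid = ownerPid;
  }

  // Call only after the child process has spawned successfully.
  register(id: WorkerId, state: WorkerState): void {
    this.store.insert({
      id,
      taskId: state.taskId,
      pid: state.pid,
      ownerPid: this.ownerPid,
      startedAt: Date.now(),
    });
    this.workers.set(id, state);
  }

  // Call on completion or kill; removes both the Map entry and the DB row.
  unregister(id: WorkerId): void {
    this.workers.delete(id);
    this.store.delete(id);
  }

  // Same-process fast path: no database round trip.
  get(id: WorkerId): WorkerState | undefined {
    return this.workers.get(id);
  }
}
```

Writing the row before updating the `Map` means a failed INSERT leaves no in-memory entry behind; whether the real pool should order the writes this way is a design choice the issue leaves open.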
2c. Cross-Process Resource Checks
Change: ResourceMonitor.canSpawnWorker() queries the workers table for global worker count instead of relying on in-memory workerCount.
Files:
- `src/implementations/resource-monitor.ts`: Add `Database` injection. `canSpawnWorker()` runs `SELECT COUNT(*) FROM workers` for the global count. Keep the `settlingWorkers` array in-memory (it's per-process and still relevant).
- `src/core/interfaces.ts`: Update the `ResourceMonitor` interface if method signatures change.
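As an illustration of the worker-count portion of that check only, here is a sketch with a hypothetical `WorkerCounter` port and an assumed `maxWorkers` limit; the real `canSpawnWorker()` presumably also weighs `settlingWorkers` and other resource signals, which are left out here.

```typescript
// Hypothetical port: a real implementation would run SELECT COUNT(*) FROM workers.
interface WorkerCounter {
  countWorkers(): number;
}

// Returns true while the global worker count is below the limit.
// maxWorkers is an assumed parameter name, not taken from the source.
function hasWorkerCapacity(counter: WorkerCounter, maxWorkers: number): boolean {
  return counter.countWorkers() < maxWorkers;
}
```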
2d. Spawn Serialization Across Processes
Change: Use SQLite's built-in locking for cross-process spawn serialization. The existing in-process mutex (spawnLock in WorkerHandler) handles within-process serialization. For cross-process, use a BEGIN IMMEDIATE transaction around the spawn-decision + INSERT.
File: src/services/handlers/worker-handler.ts — Wrap the spawn decision (resource check + dequeue + spawn + INSERT) in a database.runInTransaction().
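The cross-process half of this could be sketched as follows. `SqlExecutor` and `withImmediateTransaction` are hypothetical names; whether `database.runInTransaction()` already issues `BEGIN IMMEDIATE` is not stated in the issue, so this shows the intended transaction shape rather than the existing helper. The sketch is synchronous for brevity.

```typescript
// Hypothetical minimal handle for executing raw SQL.
interface SqlExecutor {
  exec(sql: string): void;
}

// Runs fn inside a transaction that takes SQLite's write lock up front.
function withImmediateTransaction<T>(db: SqlExecutor, fn: () => T): T {
  // BEGIN IMMEDIATE starts a write transaction right away, so a second process
  // waits (or gets SQLITE_BUSY) here instead of after it has read a stale count.
  db.exec("BEGIN IMMEDIATE");
  try {
    const result = fn();
    db.exec("COMMIT");
    return result;
  } catch (err) {
    // Undo any partial writes, then surface the original error.
    db.exec("ROLLBACK");
    throw err;
  }
}
```

The callback would hold the resource check and the INSERT, so the count a process reads cannot change before its own row is written. The in-process `spawnLock` still serializes callers within one process before they reach this point.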
2e. Output Persistence for Cross-Process Visibility
Problem: OutputCapture stores task output in a process-local Map<TaskId, OutputBuffer>, so running `beat task logs` from a different process returns nothing. An SQLiteOutputRepository exists (src/implementations/output-repository.ts) with a task_output table and file-based fallback for large outputs, but it is never wired into the live capture path.
Change: Wire OutputRepository into ProcessConnector so output is persisted to SQLite during capture. Same-process callers can still use in-memory OutputCapture for speed. Cross-process callers (including the lightweight CLI) read from OutputRepository.
Files:
- `src/services/process-connector.ts`: Inject `OutputRepository`. After capturing to the in-memory buffer, also call `outputRepository.append(taskId, stream, data)`. Batch writes to reduce DB contention (e.g., flush every 100ms or on buffer threshold).
- `src/bootstrap.ts`: Create `SQLiteOutputRepository`, pass to `ProcessConnector`.
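One way the batching could work is sketched below. `BatchedOutputWriter`, `OutputSink`, and the `maxChars` threshold are hypothetical; only the `append(taskId, stream, data)` call shape and the 100ms interval come from the issue, and the real `append` may well be asynchronous.

```typescript
type Stream = "stdout" | "stderr";

interface OutputChunk {
  taskId: string;
  stream: Stream;
  data: string;
}

// Hypothetical sink mirroring outputRepository.append(taskId, stream, data).
interface OutputSink {
  append(taskId: string, stream: Stream, data: string): void;
}

class BatchedOutputWriter {
  private pending: OutputChunk[] = [];
  private pendingChars = 0;
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly sink: OutputSink;
  private readonly flushMs: number;
  private readonly maxChars: number;

  constructor(sink: OutputSink, flushMs = 100, maxChars = 64 * 1024) {
    this.sink = sink;
    this.flushMs = flushMs;
    this.maxChars = maxChars;
  }

  // Buffers a chunk; flushes at once if the threshold is reached,
  // otherwise arms a timer (if none is pending) to flush later.
  write(chunk: OutputChunk): void {
    this.pending.push(chunk);
    this.pendingChars += chunk.data.length;
    if (this.pendingChars >= this.maxChars) {
      this.flush();
    } else if (this.timer === undefined) {
      this.timer = setTimeout(() => this.flush(), this.flushMs);
    }
  }

  // Writes all buffered chunks to the sink in arrival order.
  flush(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    const batch = this.pending;
    this.pending = [];
    this.pendingChars = 0;
    for (const chunk of batch) {
      this.sink.append(chunk.taskId, chunk.stream, chunk.data);
    }
  }
}
```

A caller would also want to invoke `flush()` when a task completes or the process shuts down, so the last partial batch is not lost.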
Files Changed
Modified
- `src/implementations/database.ts`: migration v9
- `src/implementations/event-driven-worker-pool.ts`: INSERT/DELETE on spawn/completion
- `src/services/recovery-manager.ts`: stale worker cleanup on startup
- `src/implementations/resource-monitor.ts`: global worker count from DB
- `src/core/interfaces.ts`: ResourceMonitor interface update
- `src/services/handlers/worker-handler.ts`: cross-process spawn serialization
- `src/services/process-connector.ts`: wire OutputRepository for persistence
- `src/bootstrap.ts`: create SQLiteOutputRepository, pass to ProcessConnector
Risk
Medium — new table, cross-process logic. No existing behavior should break since the workers table is additive. Output persistence wires an existing implementation.
| Sub-task | Risk | Notes |
|---|---|---|
| 2a-d | Medium | New table, cross-process coordination logic |
| 2e | Low | Wiring existing OutputRepository implementation |
Verification
- `npm run build`: clean compilation
- `npx biome check src/ tests/`: no lint issues
- All test groups pass
- Two concurrent `beat run` processes: workers table shows both, resource checks account for both
- Kill process mid-task → restart → stale workers cleaned, tasks marked failed
- `beat task logs` from a different process: returns output persisted via OutputRepository