Add Tier 6: Self-Improving Fleet (Engineering Agent) to roadmap #150

Open
bicced wants to merge 1 commit into main from worktree-feat/engineering-agent-roadmap

bicced commented Feb 25, 2026

Summary

  • Adds Tier 6: Self-Improving Fleet to ROADMAP.md — a comprehensive 5-phase plan for a containerized engineering agent that observes fleet performance and applies improvements
  • Updates competitive positioning table, "Our Moat", and "Next Differentiators" sections

Architecture Decision

The engineering agent runs in a standard container with the same hardening as every other agent — no host access, no Docker socket, no special privileges. This was a deliberate choice over a host-side tool:

| Host-side tool | Containerized agent |
| --- | --- |
| Full .env access (all API keys) | Vault-mediated, agent-tier only |
| Docker socket = root on host | No Docker access |
| Direct config file write | Validated admin API with security guards |
| No audit trail | All operations traced + permission-checked |

Phases

| Phase | What | Why |
| --- | --- | --- |
| 6.1 Fleet Observability API | New mesh endpoints aggregating traces, costs, health, model status | Foundation: gives the engineering agent eyes |
| 6.2 Config Management API | Validated read/write of agents.yaml, permissions.json, mesh.yaml | Hands: agent can tune deployment config safely |
| 6.3 Engineering Agent Definition | Agent config + ODIT system prompt + permissions + GitHub PAT in vault | The agent itself: observe, diagnose, improve, track |
| 6.4 Heartbeat Diagnostics Publishing | Agents self-report tool success rates, error counts, compaction stats to blackboard | Cross-fleet performance data from inside agents |
| 6.5 Improvement Tracking & Feedback Loop | Before/after metrics, auto-rollback on regression, learning from outcomes | Closes the loop: improvements become data-driven |
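The before/after comparison in phase 6.5 could look something like the following minimal sketch. The metric names, the `MetricWindow` shape, and both thresholds are assumptions made for illustration; the roadmap does not specify which metrics or tolerances trigger a rollback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricWindow:
    """Aggregated fleet metrics over one observation window (hypothetical shape)."""
    success_rate: float   # fraction of tool calls that succeeded, 0.0 to 1.0
    cost_per_task: float  # average spend per completed task, arbitrary units


def should_rollback(before: MetricWindow, after: MetricWindow,
                    max_success_drop: float = 0.05,
                    max_cost_increase: float = 0.20) -> bool:
    """Return True when the 'after' window regressed beyond the assumed tolerances."""
    # Absolute drop in success rate.
    if before.success_rate - after.success_rate > max_success_drop:
        return True
    # Relative increase in cost; skipped when there is no baseline cost.
    if before.cost_per_task > 0:
        increase = (after.cost_per_task - before.cost_per_task) / before.cost_per_task
        if increase > max_cost_increase:
            return True
    return False
```

For example, a change that moves the success rate from 0.95 to 0.80 would be flagged for rollback under these defaults, while a move from 0.95 to 0.94 with a 5% cost increase would not.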

Security

The Security Summary table in the roadmap lists each threat that was analyzed and its mitigation. Key properties: no privilege escalation (dead man's switch on the admin API), human review for code PRs, auto-rollback for config regressions, and workspace file access restricted to an allowlist.
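One possible shape for the allowlisted workspace file access is sketched below. The function name `is_allowed`, the workspace path, and the allowlist entries are hypothetical; the roadmap's actual check may differ.

```python
from pathlib import Path


def is_allowed(workspace: Path, requested: str, allowlist: frozenset) -> bool:
    """Allow a path only if it stays inside the workspace and is explicitly listed."""
    root = workspace.resolve()
    # Resolve "..", symlinks, and absolute inputs before comparing anything.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):  # Python 3.9+
        return False
    # Compare the normalized workspace-relative path against the allowlist.
    return target.relative_to(root).as_posix() in allowlist
```

Resolving the path first means traversal attempts such as `../secrets.env` or an absolute `/etc/passwd` are rejected before the allowlist is even consulted.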

Test plan

  • Review Tier 6 section for consistency with existing roadmap format
  • Verify security model holds — no new attack surface
  • Confirm competitive positioning table is accurate

Adds a comprehensive 5-phase plan for a containerized engineering agent
that observes fleet performance and applies improvements — without
breaking the security model.

Key design decision: the engineering agent runs in a standard container
with the same hardening as every other agent (UID 1000, no-new-privileges,
512MB mem, no Docker socket). Fleet data flows through new mesh API
endpoints. Config changes go through validated admin endpoints with
schema checks, security guards, and automatic backups. Code improvements
go through git clone → test → PR with human review.

Phases:
- 6.1: Fleet Observability API (mesh endpoints for fleet-wide data)
- 6.2: Config Management API (validated read/write of deployment config)
- 6.3: Engineering Agent Definition (agent config + ODIT system prompt)
- 6.4: Heartbeat Diagnostics Publishing (agents self-report performance)
- 6.5: Improvement Tracking & Feedback Loop (before/after metrics, auto-rollback)

Also updates competitive positioning table (no competitor has this) and
"Our Moat" / "Next Differentiators" sections.